As described in the user documentation, there are four types of formatters:
- formats
- summaries
- filters
- synthetic children
Architecturally, these are implemented by classes in the source/DataFormatters/ folder.
Formatters have descriptor classes, Type*Impl, each of which contains at least a nested "Flags" object. Flags holds both rules used
by the matching algorithm (e.g. should the formatter for type Foo apply to a Foo*?) and rules used
by the formatter itself (e.g. is this summary a one-liner?).
Individual formatter descriptor classes also contain the data items they need to perform their function.
For instance, TypeFormatImpl (backing formats) contains an lldb::Format, which is the format to apply
if this formatter is selected. Upon issuing a "type format add", a new TypeFormatImpl is created that wraps
the user-specified format and matching options:
entry.reset(new TypeFormatImpl(format,
                               TypeFormatImpl::Flags().SetCascades(m_command_options.m_cascade).
                               SetSkipPointers(m_command_options.m_skip_pointers).
                               SetSkipReferences(m_command_options.m_skip_references)));
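The chained calls in the snippet above work because each setter returns a reference to the Flags object it was called on. The following is a minimal sketch of that pattern, assuming a bit-mask representation. The class below is a hypothetical stand-in, and its enumerator names and bit values are illustrative rather than LLDB's actual definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for TypeFormatImpl::Flags; LLDB's real class differs.
class Flags {
public:
  // Illustrative bit values, not taken from LLDB.
  enum : uint32_t {
    eCascades = 1u << 0,
    eSkipPointers = 1u << 1,
    eSkipReferences = 1u << 2
  };

  Flags &SetCascades(bool value) { return Set(eCascades, value); }
  Flags &SetSkipPointers(bool value) { return Set(eSkipPointers, value); }
  Flags &SetSkipReferences(bool value) { return Set(eSkipReferences, value); }

  bool GetCascades() const { return (m_bits & eCascades) != 0; }
  bool GetSkipPointers() const { return (m_bits & eSkipPointers) != 0; }
  bool GetSkipReferences() const { return (m_bits & eSkipReferences) != 0; }

private:
  // Sets or clears one bit and returns *this so that calls can be chained.
  Flags &Set(uint32_t bit, bool value) {
    assert(bit != 0 && "expected a flag bit");
    if (value)
      m_bits |= bit;
    else
      m_bits &= ~bit;
    return *this;
  }

  uint32_t m_bits = 0;
};
```

With this shape, an expression such as Flags().SetCascades(true).SetSkipPointers(false) builds a complete set of matching options in one statement, which is what lets the command code pass the options inline to the descriptor's constructor.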
While formats are fairly simple and implemented by a single class, the other formatter types are backed by a class hierarchy.
Summaries, for instance, can exist in one of three "flavors":
- summary strings
- Python script
- native C++
The base class for summaries, TypeSummaryImpl, is an abstract class that again wraps the Flags and exports, among others, a
virtual bool
FormatObject (ValueObject *valobj,
std::string& dest) = 0;
This is the core entry point, which allows subclasses to specify their mode of operation.
StringSummaryFormat, the class that implements summary strings, checks whether
the summary is a one-liner; if it is not, it uses its stored summary string to call into
Debugger::FormatPrompt and obtains a string back, which it returns in dest as the resulting summary.
For a Python summary, implemented in ScriptSummaryFormat, FormatObject() calls into the ScriptInterpreter,
which is supposed to hold the knowledge of how to bridge back and forth with the scripting language
(Python in the case of LLDB) in order to produce a valid string. Implementors of new ScriptInterpreters for other
languages are expected to provide a GetScriptedSummary() entry point for this purpose if they want to allow
users to provide formatters in the new language.
Lastly, C++ summaries (CXXFunctionSummaryFormat) wrap a function pointer and call into it to do their job.
It should be noted that there are no facilities for users to interact with C++ formatters, and as such they are extremely
opaque, effectively being a thin wrapper between plain function pointers and the LLDB formatters subsystem.
Also, dynamic loading of C++ formatters in LLDB is currently not implemented, and as such it is safe and reasonable
for these formatters to deal with internal ValueObject instances instead of public SBValue objects.
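The relationship between the abstract base and two of its flavors can be sketched as follows. This is a simplified illustration under stated assumptions, not LLDB's code: MockValueObject, SummaryBase, StringSummary, FunctionSummary and DescribeParity are hypothetical names, and the string flavor expands only a "${var}" token instead of calling Debugger::FormatPrompt.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for lldb_private::ValueObject.
struct MockValueObject {
  std::string name;
  int value;
};

// Plays the role of TypeSummaryImpl: an abstract base with one entry point.
class SummaryBase {
public:
  virtual ~SummaryBase() = default;
  virtual bool FormatObject(MockValueObject *valobj, std::string &dest) = 0;
};

// Plays the role of StringSummaryFormat. Only "${var}" is expanded here,
// which is an assumption made to keep the sketch self-contained.
class StringSummary : public SummaryBase {
public:
  explicit StringSummary(std::string format) : m_format(std::move(format)) {}

  bool FormatObject(MockValueObject *valobj, std::string &dest) override {
    if (!valobj)
      return false;
    dest = m_format;
    const std::string token = "${var}";
    const std::string replacement = std::to_string(valobj->value);
    std::string::size_type pos = 0;
    while ((pos = dest.find(token, pos)) != std::string::npos) {
      dest.replace(pos, token.size(), replacement);
      pos += replacement.size();
    }
    return true;
  }

private:
  std::string m_format;
};

// Plays the role of CXXFunctionSummaryFormat: a thin wrapper over a
// plain function pointer.
class FunctionSummary : public SummaryBase {
public:
  using Callback = bool (*)(MockValueObject &, std::string &);

  explicit FunctionSummary(Callback callback) : m_callback(callback) {
    assert(m_callback != nullptr && "a native summary needs a callback");
  }

  bool FormatObject(MockValueObject *valobj, std::string &dest) override {
    if (!valobj)
      return false;
    return m_callback(*valobj, dest);
  }

private:
  Callback m_callback;
};

// Example native callback; it works directly on the internal object type.
inline bool DescribeParity(MockValueObject &valobj, std::string &dest) {
  dest = valobj.name + (valobj.value % 2 == 0 ? " is even" : " is odd");
  return true;
}
```

Neither subclass keeps a pointer to the object it formats, which mirrors the expectation that summaries hold no per-value state.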
An interesting data point is that summaries are expected to be stateless. While at the Python layer they are handed
an SBValue (since nothing else could be visible to scripts), the SBValue is not expected to be cached
and reused: any and all caching occurs on the LLDB side, completely transparently to the formatter itself.
The design of synthetic children is somewhat more intricate, due to them being stateful objects.
The core idea of the design is that synthetic children act like a two-tier model, in which there is a backend
dataset (the underlying unformatted ValueObject) and a higher-level view (the frontend), which vends the computed
representation.
To implement a new type of synthetic children, one would implement a subclass of SyntheticChildren, which, akin to TypeFormatImpl,
contains Flags for matching and data items to be used for formatting. For instance, TypeFilterImpl (which implements filters)
stores the list of expression paths of the children to be displayed.
Filters are themselves synthetic children. Since all they
do is provide child values for a ValueObject, it does not truly matter whether these come from the real set of children or are
crafted through some intricate algorithm. As such, they perfectly fit within the realm of synthetic children and are only
shown as separate entities for user friendliness (to a user, picking a subset of elements to be shown with relative ease is a
valuable task, and they should not be concerned with writing scripts to do so).
Once the descriptor of the synthetic children has been coded, in order to hook it up, one has to implement a subclass of
SyntheticChildrenFrontEnd. For a given type of synthetic children, there is a deep coupling with the matching front-end class,
given that the front-end usually needs data stored in the descriptor (e.g. a filter needs the list of child elements).
The front-end answers the interesting questions that are the true raison d'être of synthetic children:
- virtual size_t
  CalculateNumChildren () = 0;
- virtual lldb::ValueObjectSP
  GetChildAtIndex (size_t idx) = 0;
- virtual size_t
  GetIndexOfChildWithName (const ConstString &name) = 0;
- virtual bool
  Update () = 0;
- virtual bool
  MightHaveChildren () = 0;
Synthetic children providers (their front-ends) will be queried by LLDB for the number of children and then, for each child
as necessary, they should be prepared to return a ValueObject describing it. They might also be asked to provide a
name-to-index mapping (e.g. to allow LLDB to resolve queries like myFoo.myChild).
Update() and MightHaveChildren() are described in the user documentation, and they mostly serve bookkeeping purposes.
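To make the five queries concrete, here is a minimal sketch of a filter-like front-end over a mock backend. It assumes simplified types: MockValue, FrontEndBase, FilterFrontEnd and MakePoint are hypothetical, std::string stands in for ConstString, and the SIZE_MAX "not found" value is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a ValueObject: a named node with children.
struct MockValue {
  std::string name;
  std::vector<std::shared_ptr<MockValue>> children;
};
using MockValueSP = std::shared_ptr<MockValue>;

// Mirrors the five front-end queries, using simplified types.
class FrontEndBase {
public:
  virtual ~FrontEndBase() = default;
  virtual size_t CalculateNumChildren() = 0;
  virtual MockValueSP GetChildAtIndex(size_t idx) = 0;
  virtual size_t GetIndexOfChildWithName(const std::string &name) = 0;
  virtual bool Update() = 0;
  virtual bool MightHaveChildren() = 0;
};

// A filter-like front-end: it exposes only the backend children whose
// names appear in the list handed over by its descriptor.
class FilterFrontEnd : public FrontEndBase {
public:
  FilterFrontEnd(MockValueSP backend, std::vector<std::string> names)
      : m_backend(std::move(backend)), m_names(std::move(names)) {
    assert(m_backend && "a front-end needs a backend value");
  }

  size_t CalculateNumChildren() override { return m_names.size(); }

  MockValueSP GetChildAtIndex(size_t idx) override {
    if (idx >= m_names.size())
      return nullptr;
    for (const MockValueSP &child : m_backend->children)
      if (child && child->name == m_names[idx])
        return child;
    return nullptr;
  }

  size_t GetIndexOfChildWithName(const std::string &name) override {
    for (size_t i = 0; i < m_names.size(); ++i)
      if (m_names[i] == name)
        return i;
    return SIZE_MAX; // "not found" sentinel chosen for this sketch
  }

  // The meaning of the return value is covered by the user documentation;
  // this sketch caches nothing and returns false unconditionally.
  bool Update() override { return false; }

  bool MightHaveChildren() override { return !m_names.empty(); }

private:
  MockValueSP m_backend;
  std::vector<std::string> m_names;
};

// Builds a backend named "point" with children x, y and z.
inline MockValueSP MakePoint() {
  auto point = std::make_shared<MockValue>();
  point->name = "point";
  for (const char *child_name : {"x", "y", "z"}) {
    auto child = std::make_shared<MockValue>();
    child->name = child_name;
    point->children.push_back(child);
  }
  return point;
}
```

A front-end built with the names z and x reports two children, returns them in that order, and maps the name x to index 1, which is the kind of lookup a query such as myFoo.myChild relies on.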
LLDB provides three kinds of synthetic children: filters, scripted synthetics, and the native C++ providers.
Filters are implemented by TypeFilterImpl/TypeFilterImpl::FrontEnd.
Scripted synthetics are implemented by ScriptedSyntheticChildren/ScriptedSyntheticChildren::FrontEnd, plus
a set of callbacks provided by the ScriptInterpreter infrastructure to allow LLDB to pass the front-end queries
down to the scripting languages.
As for C++ native synthetics, there is a CXXSyntheticChildren, but no corresponding FrontEnd class. The reason for this design is
that CXXSyntheticChildren store a callback to a creator function, which is responsible for providing a FrontEnd.
Each individual formatter (e.g. LibstdcppMapIteratorSyntheticFrontEnd, NSDictionaryMSyntheticFrontEnd, ...) is a standalone
frontend, and once created retains no relation to its underlying SyntheticChildren object.
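The creator-callback arrangement can be sketched as follows. This is an illustration under assumptions rather than LLDB's implementation: MockValue, FrontEnd, CallbackSyntheticChildren, MapIteratorFrontEnd and MapIteratorCreator are hypothetical names, and the callback signature is simplified.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the backend ValueObject.
struct MockValue {
  std::string type_name;
};

// Minimal front-end interface for this sketch.
class FrontEnd {
public:
  virtual ~FrontEnd() = default;
  virtual std::string Describe() const = 0;
};

// Plays the role of CXXSyntheticChildren: it has no FrontEnd class of its
// own and stores only a callback that knows how to create one.
class CallbackSyntheticChildren {
public:
  using CreateFrontEndCallback =
      std::function<std::unique_ptr<FrontEnd>(const MockValue &)>;

  explicit CallbackSyntheticChildren(CreateFrontEndCallback creator)
      : m_creator(std::move(creator)) {
    assert(m_creator && "a creator callback is required");
  }

  std::unique_ptr<FrontEnd> GetFrontEnd(const MockValue &backend) const {
    return m_creator(backend);
  }

private:
  CreateFrontEndCallback m_creator;
};

// A standalone front-end: it copies what it needs from the backend and
// keeps no pointer back to the object that created it.
class MapIteratorFrontEnd : public FrontEnd {
public:
  explicit MapIteratorFrontEnd(const MockValue &backend)
      : m_type_name(backend.type_name) {}

  std::string Describe() const override {
    return "front-end for " + m_type_name;
  }

private:
  std::string m_type_name;
};

// The creator function that would be registered with the descriptor.
inline std::unique_ptr<FrontEnd> MapIteratorCreator(const MockValue &backend) {
  return std::unique_ptr<FrontEnd>(new MapIteratorFrontEnd(backend));
}
```

Keeping the descriptor down to a single callback means that adding a new native formatter only requires writing a front-end class and a creator function, with no matching descriptor subclass.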
At the ValueObject level, upon being asked to generate synthetic children for a ValueObject, LLDB spawns a ValueObjectSynthetic object,
which is a subclass of ValueObject. Building upon the ValueObject infrastructure, it stores a backend and a shared pointer to
the SyntheticChildren.
Upon being asked queries about children, it will use the SyntheticChildren to generate a front-end for itself
and will let the front-end answer questions. The reason for not storing the FrontEnd itself is that there is no guarantee that, across
updates, the same FrontEnd will be used over and over (e.g. a SyntheticChildren object could serve an entire class hierarchy
and vend different frontends for different subclasses).
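The reasoning above can be sketched with a provider that vends a different front-end depending on the backend's dynamic type. All names below (MockBackend, FrontEnd, EmptyFrontEnd, CountingFrontEnd, Provider, SyntheticValue) are hypothetical, and requesting a fresh front-end on every query is a simplification made for this sketch rather than a claim about how ValueObjectSynthetic manages front-end lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend whose dynamic type can change between updates.
struct MockBackend {
  std::string dynamic_type;
  size_t element_count;
};

class FrontEnd {
public:
  virtual ~FrontEnd() = default;
  virtual size_t CalculateNumChildren() = 0;
};

// Front-end used for types that expose no synthetic children.
class EmptyFrontEnd : public FrontEnd {
public:
  size_t CalculateNumChildren() override { return 0; }
};

// Front-end used for the "Derived" type in this sketch.
class CountingFrontEnd : public FrontEnd {
public:
  explicit CountingFrontEnd(size_t count) : m_count(count) {}
  size_t CalculateNumChildren() override { return m_count; }

private:
  size_t m_count;
};

// Plays the role of a SyntheticChildren object serving a class hierarchy.
class Provider {
public:
  std::unique_ptr<FrontEnd> GetFrontEnd(const MockBackend &backend) const {
    if (backend.dynamic_type == "Derived")
      return std::unique_ptr<FrontEnd>(
          new CountingFrontEnd(backend.element_count));
    return std::unique_ptr<FrontEnd>(new EmptyFrontEnd());
  }
};

// Plays the role of ValueObjectSynthetic: it stores the backend and a
// shared pointer to the provider, not a front-end.
class SyntheticValue {
public:
  SyntheticValue(std::shared_ptr<MockBackend> backend,
                 std::shared_ptr<const Provider> provider)
      : m_backend(std::move(backend)), m_provider(std::move(provider)) {
    assert(m_backend && m_provider && "backend and provider are required");
  }

  size_t GetNumChildren() {
    // Ask the provider each time, since the right front-end may differ
    // from the one used before the backend changed.
    std::unique_ptr<FrontEnd> front_end = m_provider->GetFrontEnd(*m_backend);
    return front_end->CalculateNumChildren();
  }

private:
  std::shared_ptr<MockBackend> m_backend;
  std::shared_ptr<const Provider> m_provider;
};
```

If the backend's dynamic type changes from Base to Derived between two queries, the same SyntheticValue reports a different number of children without any change to the provider it holds.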