=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================

Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like optimization, lazy compilation and remote
execution.

The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and teach you how to recombine
them to build a custom JIT that is suited to your use-case.

The structure of the tutorial is:

- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
  introduce some of the basic concepts of the ORC JIT APIs, including the
  idea of an ORC *Layer*.

- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
  a new layer that will optimize IR and generated code.

- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
  Compile-On-Demand layer to lazily compile IR.

- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
  replacing the Compile-On-Demand layer with a custom layer that uses the ORC
  Compile Callbacks API directly to defer IR-generation until functions are
  called.

- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
  a remote process with reduced privileges using the JIT Remote APIs.

To provide input for our JIT we will use the Kaleidoscope REPL from
`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language with LLVM"
tutorial, with one minor modification: we will remove the FunctionPassManager
from the code for that chapter and replace it with optimization support in our
JIT class in Chapter #2.

Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
These tutorials don't assume any experience with these earlier APIs, but
readers acquainted with them will see many familiar elements. Where appropriate
we will make the connection with the earlier APIs explicit to help people who
are transitioning from them to ORC.

JIT API Basics
==============

The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will be:

1. ``Handle addModule(Module &M)`` -- Make the given IR module available for
   execution.
2. ``JITSymbol findSymbol(const std::string &Name)`` -- Search for pointers to
   symbols (functions or variables) that have been added to the JIT.
3. ``void removeModule(Handle H)`` -- Remove a module from the JIT, releasing
   any memory that had been used for the compiled code.

A basic use-case for this API, executing the 'main' function from a module,
will look like:

.. code-block:: c++

  std::unique_ptr<Module> M = buildModule();
  JIT J;
  Handle H = J.addModule(*M);
  // Look up the compiled 'main' and cast its address to a function pointer.
  int (*Main)(int, char*[]) = (int(*)(int, char*[]))J.getSymbolAddress("main");
  // argc and argv are assumed to be the host program's own arguments.
  int Result = Main(argc, argv);
  J.removeModule(H);

The APIs that we build in these tutorials will all be variations on this simple
theme. Behind the API we will refine the implementation of the JIT to add
support for optimization and lazy compilation. Eventually we will extend the
API itself to allow higher-level program representations (e.g. ASTs) to be
added to the JIT.

KaleidoscopeJIT
===============

In the previous section we described our API. Now we examine a simple
implementation of it: the KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the findSymbol method of our JIT class to find and execute the code for the
expression, and then use the removeModule method to remove the code again
(since there's no way to re-invoke an anonymous expression). In later chapters
of this tutorial we'll modify the REPL to enable new interactions with our JIT
class, but for now we will take this setup for granted and focus our attention
on the implementation of our JIT itself.

Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:

.. code-block:: c++

  #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
  #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ExecutionEngine/ExecutionEngine.h"
  #include "llvm/ExecutionEngine/JITSymbol.h"
  #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
  #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
  #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/Mangler.h"
  #include "llvm/Support/DynamicLibrary.h"
  #include "llvm/Support/raw_ostream.h"
  #include "llvm/Target/TargetMachine.h"
  #include <algorithm>
  #include <memory>
  #include <string>
  #include <vector>

  namespace llvm {
  namespace orc {

  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;

  public:
    using ModuleHandle = decltype(CompileLayer)::ModuleHandleT;

Our class begins with four members: a TargetMachine, TM, which will be used to
build our LLVM compiler instance; a DataLayout, DL, which will be used for
symbol mangling (more on that later); and two ORC *layers*: an
RTDyldObjectLinkingLayer and a CompileLayer. We'll be talking more about layers
in the next chapter, but for now you can think of them as analogous to LLVM
Passes: they wrap up useful JIT utilities behind an easy-to-compose interface.
The first layer, ObjectLayer, is the foundation of our JIT: it takes in-memory
object files produced by a compiler and links them on the fly to make them
executable. This JIT-on-top-of-a-linker design was introduced in MCJIT, but
there the linker was hidden inside the MCJIT class. In ORC we expose the linker
so that clients can access and configure it directly if they need to. In this
tutorial our ObjectLayer will just be used to support the next layer in our
stack: the CompileLayer, which will be responsible for taking LLVM IR, compiling
it, and passing the resulting in-memory object files down to the object linking
layer below.
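
The layering idea can be illustrated without any LLVM types. The following is a
toy model under stated assumptions, not the ORC API: ``ToyObjectLayer``,
``ToyCompileLayer`` and ``UppercaseCompiler`` are hypothetical names, and
"compiling" is reduced to upper-casing a string. It shows only the shape of the
composition: the upper layer is parameterised on the layer below it and on a
compiler functor, transforms its input, and forwards the result downwards.

.. code-block:: c++

  #include <cassert>
  #include <cctype>
  #include <cstddef>
  #include <string>
  #include <utility>
  #include <vector>

  // Bottom layer (hypothetical): stores finished "objects" and hands back an
  // index as a handle.
  class ToyObjectLayer {
  public:
    using ObjHandleT = std::size_t;

    ObjHandleT addObject(std::string Obj) {
      Objects.push_back(std::move(Obj));
      return Objects.size() - 1;
    }

    const std::string &getObject(ObjHandleT H) const {
      assert(H < Objects.size() && "invalid handle");
      return Objects[H];
    }

  private:
    std::vector<std::string> Objects;
  };

  // Stand-in for a compiler: turns "source" text into an "object".
  struct UppercaseCompiler {
    std::string operator()(const std::string &Src) const {
      std::string Out = Src;
      for (char &C : Out)
        C = static_cast<char>(std::toupper(static_cast<unsigned char>(C)));
      return Out;
    }
  };

  // Upper layer (hypothetical): compiles its input, then forwards the result
  // to the base layer and returns the base layer's handle.
  template <typename BaseLayerT, typename CompileFtor> class ToyCompileLayer {
  public:
    using ModuleHandleT = typename BaseLayerT::ObjHandleT;

    ToyCompileLayer(BaseLayerT &Base, CompileFtor Compiler)
        : BaseLayer(Base), Compile(std::move(Compiler)) {}

    ModuleHandleT addModule(const std::string &Src) {
      return BaseLayer.addObject(Compile(Src));
    }

  private:
    BaseLayerT &BaseLayer;
    CompileFtor Compile;
  };

Declaring ``ToyCompileLayer<decltype(ObjectLayer), UppercaseCompiler>`` over a
``ToyObjectLayer`` member mirrors the member declarations above: the compile
layer never needs to know how the layer beneath it stores or links its objects.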

That's it for member variables. After that we have a single type alias:
ModuleHandle. This is the handle type that will be returned from our JIT's
addModule method, and can be passed to the removeModule method to remove a
module. The IRCompileLayer class already provides a convenient handle type
(``IRCompileLayer::ModuleHandleT``), so we just alias our ModuleHandle to this.

.. code-block:: c++

  KaleidoscopeJIT()
      : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
        ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
        CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
    llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
  }

  TargetMachine &getTargetMachine() { return *TM; }

Next up we have our class constructor. We begin by initializing TM using the
EngineBuilder::selectTarget helper method which constructs a TargetMachine for
the current process. Then we use our newly created TargetMachine to initialize
DL, our DataLayout. After that we need to initialize our ObjectLayer. The
ObjectLayer requires a function object that will build a JIT memory manager for
each module that is added (a JIT memory manager manages memory allocations,
memory permissions, and registration of exception handlers for JIT'd code). For
this we use a lambda that returns a SectionMemoryManager, an off-the-shelf
utility that provides all the basic memory management functionality required for
this chapter. Next we initialize our CompileLayer. The CompileLayer needs two
things: (1) A reference to our object layer, and (2) a compiler instance to use
to perform the actual compilation from IR to object files. We use the
off-the-shelf SimpleCompiler instance for now. Finally, in the body of the
constructor, we call the DynamicLibrary::LoadLibraryPermanently method with a
nullptr argument. Normally the LoadLibraryPermanently method is called with the
path of a dynamic library to load, but when passed a null pointer it will 'load'
the host process itself, making its exported symbols available for execution.

.. code-block:: c++

  ModuleHandle addModule(std::unique_ptr<Module> M) {
    // Build our symbol resolver:
    // Lambda 1: Look back into the JIT itself to find symbols that are part of
    //           the same "logical dylib".
    // Lambda 2: Search for external symbols in the host process.
    auto Resolver = createLambdaResolver(
        [&](const std::string &Name) {
          if (auto Sym = CompileLayer.findSymbol(Name, false))
            return Sym;
          return JITSymbol(nullptr);
        },
        [](const std::string &Name) {
          if (auto SymAddr =
                RTDyldMemoryManager::getSymbolAddressInProcess(Name))
            return JITSymbol(SymAddr, JITSymbolFlags::Exported);
          return JITSymbol(nullptr);
        });

    // Add the module to the JIT with the resolver we created above.
    return cantFail(CompileLayer.addModule(std::move(M),
                                           std::move(Resolver)));
  }

Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution. In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them straight to the CompileLayer, which will immediately
compile them. In later chapters we will teach our JIT to defer compilation
of individual functions until they're actually called.

To add our module to the CompileLayer we need to supply both the module and a
symbol resolver. The symbol resolver is responsible for supplying the JIT with
an address for each *external symbol* in the module we are adding. An external
symbol is any symbol not defined within the module itself. This includes
functions outside the JIT and functions defined in other modules that have
already been added to the JIT. (It may seem as though modules added to the
JIT should know about one another by default, but since we would still have to
supply a symbol resolver for references to code outside the JIT it turns out to
be easier to re-use this one mechanism for all symbol resolution.) This has the
added benefit that the user has full control over the symbol resolution
process. Should we search for definitions within the JIT first, then fall back
on external definitions? Or should we prefer external definitions where
available and only JIT code if we don't already have an available
implementation? By using a single symbol resolution scheme we are free to choose
whatever makes the most sense for any given use case.

Building a symbol resolver is made especially easy by the *createLambdaResolver*
function. This function takes two lambdas [3]_ and returns a JITSymbolResolver
instance. The first lambda is used as the implementation of the resolver's
findSymbolInLogicalDylib method, which searches for symbol definitions that
should be thought of as being part of the same "logical" dynamic library as this
Module. If you are familiar with static linking: this means that
findSymbolInLogicalDylib should expose symbols with common linkage and hidden
visibility. If all this sounds foreign you can ignore the details and just
remember that this is the first method that the linker will use to try to find a
symbol definition. If the findSymbolInLogicalDylib method returns a null result
then the linker will call the second symbol resolver method, called findSymbol,
which searches for symbols that should be thought of as external to (but
visible from) the module and its logical dylib. In this tutorial we will adopt
the following simple scheme: all modules added to the JIT will behave as if they
were linked into a single, ever-growing logical dylib. To implement this our
first lambda (the one defining findSymbolInLogicalDylib) will just search for
JIT'd code by calling the CompileLayer's findSymbol method. If we don't find a
symbol in the JIT itself we'll fall back to our second lambda, which implements
findSymbol. This will use the RTDyldMemoryManager::getSymbolAddressInProcess
method to search for the symbol within the program itself. If we can't find a
symbol definition via either of these paths, the JIT will refuse to accept our
module, returning a "symbol not found" error.
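
The two-stage lookup order described above can be sketched in isolation. This
is a simplified model rather than the real resolver interface: ``ToyResolver``,
``ToyLookupFn``, ``ToyAddress`` and ``lookupIn`` are hypothetical names, plain
``std::map`` tables stand in for the JIT and the host process, and an empty
``std::optional`` (C++17) stands in for a null JITSymbol.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <functional>
  #include <map>
  #include <optional>
  #include <string>
  #include <utility>

  using ToyAddress = std::uint64_t;
  using ToyLookupFn =
      std::function<std::optional<ToyAddress>(const std::string &)>;

  // Two-stage resolver (hypothetical): consult the "logical dylib" lookup
  // first and fall back to the external lookup only if that finds nothing.
  class ToyResolver {
  public:
    ToyResolver(ToyLookupFn InDylib, ToyLookupFn External)
        : FindInLogicalDylib(std::move(InDylib)),
          FindExternal(std::move(External)) {
      assert(FindInLogicalDylib && FindExternal && "both lookups are required");
    }

    // Returns an empty optional if neither lookup knows the symbol; a real
    // linker would report this as a "symbol not found" error.
    std::optional<ToyAddress> resolve(const std::string &Name) const {
      if (auto Addr = FindInLogicalDylib(Name))
        return Addr;
      return FindExternal(Name);
    }

  private:
    ToyLookupFn FindInLogicalDylib;
    ToyLookupFn FindExternal;
  };

  // Helper that turns a symbol table into a lookup function. The table is
  // captured by reference, so it must outlive the returned function.
  inline ToyLookupFn lookupIn(const std::map<std::string, ToyAddress> &Table) {
    return [&Table](const std::string &Name) -> std::optional<ToyAddress> {
      auto I = Table.find(Name);
      if (I == Table.end())
        return std::nullopt;
      return I->second;
    };
  }

With this ordering, a name defined in both tables resolves to the JIT's
definition. Swapping the two arguments to the constructor gives the opposite
policy, preferring external definitions, which is the kind of choice the single
resolution mechanism leaves to the client.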

Now that we've built our symbol resolver, we're ready to add our module to the
JIT. We do this by calling the CompileLayer's addModule method. The addModule
method returns an ``Expected<ModuleHandleT>``, since in more advanced JIT
configurations it could fail. In our basic configuration we know that it will
always succeed, so we use the cantFail utility to assert that no error occurred
and to extract the handle value. Since we have already aliased our ModuleHandle
type to the CompileLayer's handle type, we can return the unwrapped handle
directly.

.. code-block:: c++

  JITSymbol findSymbol(const std::string Name) {
    std::string MangledName;
    raw_string_ostream MangledNameStream(MangledName);
    Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
    return CompileLayer.findSymbol(MangledNameStream.str(), true);
  }

  JITTargetAddress getSymbolAddress(const std::string Name) {
    return cantFail(findSymbol(Name).getAddress());
  }

  void removeModule(ModuleHandle H) {
    cantFail(CompileLayer.removeModule(H));
  }

Now that we can add code to our JIT, we need a way to find the symbols we've
added to it. To do that we call the findSymbol method on our CompileLayer, but
with a twist: we have to *mangle* the name of the symbol we're searching for
first. The ORC JIT components use mangled symbols internally the same way a
static compiler and linker would, rather than using plain IR symbol names. This
allows JIT'd code to interoperate easily with precompiled code in the
application or shared libraries. The kind of mangling will depend on the
DataLayout, which in turn depends on the target platform. To allow us to remain
portable and search based on the un-mangled name, we just reproduce this
mangling ourselves.
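
As a rough illustration of what this mangling amounts to in the common case:
Mach-O targets prefix global symbol names with an underscore, while ELF targets
use no prefix. The helper below is a deliberately simplified stand-in, not the
Mangler implementation: ``toyMangle`` is a hypothetical name, the prefix is
passed in directly instead of being read from a DataLayout, and the other cases
the real Mangler handles are ignored.

.. code-block:: c++

  #include <cassert>
  #include <string>

  // Toy stand-in for name mangling: prepend the platform's global prefix.
  // A GlobalPrefix of '\0' means the platform uses no prefix.
  inline std::string toyMangle(const std::string &Name, char GlobalPrefix) {
    assert(!Name.empty() && "cannot mangle an empty name");
    if (GlobalPrefix == '\0')
      return Name;
    return std::string(1, GlobalPrefix) + Name;
  }

Under these assumptions ``toyMangle("main", '_')`` yields ``_main`` (the
Mach-O-style result) and ``toyMangle("main", '\0')`` leaves the name unchanged
(the ELF-style result), which is why a lookup by plain IR name would miss on
some platforms.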

Next we have a convenience function, getSymbolAddress, which returns the address
of a given symbol. Like CompileLayer's addModule function, JITSymbol's
getAddress function is allowed to fail [4]_. However, we know that it will not
fail in our simple example, so we wrap it in a call to cantFail.

We now come to the last method in our JIT API: removeModule. This method is
responsible for destroying the MemoryManager and SymbolResolver that were
added with a given module, freeing any resources they were using in the
process. In our Kaleidoscope demo we rely on this method to remove the module
representing the most recent top-level expression, preventing it from being
treated as a duplicate definition when the next top-level expression is
entered. It is generally good to free any module that you know you won't need
to call again, just to free up the resources dedicated to it. However, you
don't strictly need to do this: all resources will be cleaned up when your
JIT class is destroyed, if they haven't been freed before then. Like
``CompileLayer::addModule`` and ``JITSymbol::getAddress``, removeModule may
fail in general but will never fail in our example, so we wrap it in a call to
cantFail.
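
The handle-based add/find/remove cycle can be modelled with standard containers.
This is a sketch of one plausible design, not a description of how the ORC
layers are implemented: ``ToyModuleSet`` is a hypothetical class, a "module" is
reduced to a table of symbol addresses, and a ``std::list`` iterator serves as
the handle because it stays valid while other elements are inserted or erased.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <list>
  #include <map>
  #include <optional>
  #include <string>
  #include <utility>

  class ToyModuleSet {
  public:
    using SymbolTable = std::map<std::string, std::uint64_t>;
    // List iterators are not invalidated by changes to other elements.
    using ModuleHandle = std::list<SymbolTable>::iterator;

    // Store the module and return a handle that identifies it later.
    ModuleHandle addModule(SymbolTable Symbols) {
      return Modules.insert(Modules.end(), std::move(Symbols));
    }

    // Search every module that is still present, oldest first.
    std::optional<std::uint64_t> findSymbol(const std::string &Name) const {
      for (const SymbolTable &M : Modules) {
        auto I = M.find(Name);
        if (I != M.end())
          return I->second;
      }
      return std::nullopt;
    }

    // Drop the module; its symbols can no longer be found afterwards.
    void removeModule(ModuleHandle H) {
      assert(!Modules.empty() && "no modules to remove");
      Modules.erase(H);
    }

  private:
    std::list<SymbolTable> Modules;
  };

Adding a module that defines an anonymous-expression symbol, looking it up,
removing the module and then adding a new module with the same symbol name
follows the REPL pattern described above: because the first module is gone, the
second definition is found without any clash.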

This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
executable within the context of your JIT process. In the next chapter we'll
look at how to extend this JIT to produce better quality code, and in the
process take a deeper look at the ORC layer concept.

`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_

Full Code Listing
=================

Here is the complete code listing for our running example. To build this
example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
   :language: c++

.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
       simplifying assumption: symbols cannot be re-defined. This will make it
       impossible to re-define symbols in the REPL, but will make our symbol
       lookup logic simpler. Re-introducing support for symbol redefinition is
       left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
       original tutorials will be a helpful reference).

.. [2] +-----------------------------+------------------------------------------------+
       | File                        | Reason for inclusion                           |
       +=============================+================================================+
       | STLExtras.h                 | LLVM utilities that are useful when working    |
       |                             | with the STL.                                  |
       +-----------------------------+------------------------------------------------+
       | ExecutionEngine.h           | Access to the EngineBuilder::selectTarget      |
       |                             | method.                                        |
       +-----------------------------+------------------------------------------------+
       |                             | Access to the                                  |
       | RTDyldMemoryManager.h       | RTDyldMemoryManager::getSymbolAddressInProcess |
       |                             | method.                                        |
       +-----------------------------+------------------------------------------------+
       | CompileUtils.h              | Provides the SimpleCompiler class.             |
       +-----------------------------+------------------------------------------------+
       | IRCompileLayer.h            | Provides the IRCompileLayer class.             |
       +-----------------------------+------------------------------------------------+
       |                             | Access to the createLambdaResolver function,   |
       | LambdaResolver.h            | which provides easy construction of symbol     |
       |                             | resolvers.                                     |
       +-----------------------------+------------------------------------------------+
       | RTDyldObjectLinkingLayer.h  | Provides the RTDyldObjectLinkingLayer class.   |
       +-----------------------------+------------------------------------------------+
       | Mangler.h                   | Provides the Mangler class for                 |
       |                             | platform-specific name-mangling.               |
       +-----------------------------+------------------------------------------------+
       | DynamicLibrary.h            | Provides the DynamicLibrary class, which       |
       |                             | makes symbols in the host process searchable.  |
       +-----------------------------+------------------------------------------------+
       |                             | A fast output stream class. We use the         |
       | raw_ostream.h               | raw_string_ostream subclass for symbol         |
       |                             | mangling.                                      |
       +-----------------------------+------------------------------------------------+
       | TargetMachine.h             | LLVM target machine description class.         |
       +-----------------------------+------------------------------------------------+

.. [3] Actually they don't have to be lambdas: any object with a call operator
       will do, including plain old functions or std::functions.

.. [4] ``JITSymbol::getAddress`` will force the JIT to compile the definition of
       the symbol if it hasn't already been compiled, and since the compilation
       process could fail, getAddress must be able to return this failure.