Imported Upstream version 5.18.0.167
Former-commit-id: 289509151e0fee68a1b591a20c9f109c3c789d3a
Parent: e19d552987
Commit: b084638f15
@ -1 +0,0 @@
|
||||
439089348fffb8add0314fe259e5608250a49415
|
BIN
external/llvm/docs/ARM-BE-bitcastfail.png
vendored
Binary file not shown.
Before Width: | Height: | Size: 29 KiB |
BIN
external/llvm/docs/ARM-BE-bitcastsuccess.png
vendored
Binary file not shown.
Before Width: | Height: | Size: 40 KiB |
BIN
external/llvm/docs/ARM-BE-ld1.png
vendored
Binary file not shown.
Before Width: | Height: | Size: 22 KiB |
BIN
external/llvm/docs/ARM-BE-ldr.png
vendored
Binary file not shown.
Before Width: | Height: | Size: 16 KiB |
174
external/llvm/docs/AdvancedBuilds.rst
vendored
@ -1,174 +0,0 @@
|
||||
=============================
|
||||
Advanced Build Configurations
|
||||
=============================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
|
||||
does not build the project; it generates the files needed by your build tool
|
||||
(GNU make, Visual Studio, etc.) for building LLVM.
|
||||
|
||||
If **you are a new contributor**, please start with the :doc:`GettingStarted` or
|
||||
:doc:`CMake` pages. This page is intended for users doing more complex builds.
|
||||
|
||||
Many of the examples below are written assuming specific CMake Generators.
|
||||
Unless otherwise explicitly called out these commands should work with any CMake
|
||||
generator.
|
||||
|
||||
Bootstrap Builds
|
||||
================
|
||||
|
||||
The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a
|
||||
high level a multi-stage build is a chain of builds that pass data from one
|
||||
stage into the next. The most common and simple version of this is a traditional
|
||||
bootstrap build.
|
||||
|
||||
In a simple two-stage bootstrap build, we build clang using the system compiler,
|
||||
then use that just-built clang to build clang again. In CMake this simplest form
|
||||
of a bootstrap build can be configured with a single option,
|
||||
CLANG_ENABLE_BOOTSTRAP.
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ cmake -G Ninja -DCLANG_ENABLE_BOOTSTRAP=On <path to source>
|
||||
$ ninja stage2
|
||||
|
||||
This command itself isn't terribly useful because it assumes default
|
||||
configurations for each stage. The next series of examples utilize CMake cache
|
||||
scripts to provide more complex options.
|
||||
|
||||
The clang build system refers to builds as stages. A stage1 build is a standard
|
||||
build using the compiler installed on the host, and a stage2 build is built
|
||||
using the stage1 compiler. This nomenclature holds up to more stages too. In
|
||||
general a stage*n* build is built using the output from stage*n-1*.
|
||||
|
||||
Apple Clang Builds (A More Complex Bootstrap)
|
||||
=============================================
|
||||
|
||||
Apple's Clang builds are a slightly more complicated example of the simple
|
||||
bootstrapping scenario. Apple Clang is built using a 2-stage build.
|
||||
|
||||
The stage1 compiler is a host-only compiler with some options set. The stage1
|
||||
compiler is a balance of optimization vs build time because it is a throwaway.
|
||||
The stage2 compiler is the fully optimized compiler intended to ship to users.
|
||||
|
||||
Setting up these compilers requires a lot of options. To simplify the
|
||||
configuration the Apple Clang build settings are contained in CMake Cache files.
|
||||
You can build an Apple Clang compiler using the following commands:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ cmake -G Ninja -C <path to clang>/cmake/caches/Apple-stage1.cmake <path to source>
|
||||
$ ninja stage2-distribution
|
||||
|
||||
This CMake invocation configures the stage1 host compiler, and sets
|
||||
CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the
|
||||
stage2 configuration step.
|
||||
|
||||
When you build the stage2-distribution target it builds the minimal stage1
|
||||
compiler and required tools, then configures and builds the stage2 compiler
|
||||
based on the settings in Apple-stage2.cmake.
|
||||
|
||||
This pattern of using cache scripts to set complex settings, and specifically to
|
||||
make later stage builds include cache scripts is common in our more advanced
|
||||
build configurations.
|
||||
|
||||
Multi-stage PGO
|
||||
===============
|
||||
|
||||
Profile-Guided Optimization (PGO) is an effective way to optimize the code
|
||||
clang generates. Our multi-stage PGO builds are a workflow for generating PGO
|
||||
profiles that can be used to optimize clang.
|
||||
|
||||
At a high level, the way PGO works is that you build an instrumented compiler,
|
||||
then you run the instrumented compiler against sample source files. While the
|
||||
instrumented compiler runs it will output a bunch of files containing
|
||||
performance counters (.profraw files). After generating all the profraw files
|
||||
you use llvm-profdata to merge the files into a single profdata file that you
|
||||
can feed into the LLVM_PROFDATA_FILE option.
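The manual merge step described above can be sketched as follows. This is a minimal illustration, not part of the LLVM build system: the helper name ``profdata_merge_command`` and the default output file name are hypothetical, and the function only assembles an ``llvm-profdata merge`` command line rather than running it.

```python
from pathlib import Path


def profdata_merge_command(profraw_dir, output="clang.profdata"):
    """Build the argv for merging every .profraw file under profraw_dir.

    Hypothetical helper for illustration; it does not run the tool.
    """
    inputs = sorted(str(p) for p in Path(profraw_dir).rglob("*.profraw"))
    if not inputs:
        raise FileNotFoundError(f"no .profraw files under {profraw_dir}")
    # `llvm-profdata merge -output=<file> <inputs...>` writes one merged profile.
    return ["llvm-profdata", "merge", f"-output={output}", *inputs]
```

The resulting list could be handed to ``subprocess.run``; the merged file is what the LLVM_PROFDATA_FILE option expects.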
|
||||
|
||||
Our PGO.cmake cache script automates that whole process. You can use it by
|
||||
running:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ cmake -G Ninja -C <path_to_clang>/cmake/caches/PGO.cmake <source dir>
|
||||
$ ninja stage2-instrumented-generate-profdata
|
||||
|
||||
If you let that run for a few hours or so, it will place a profdata file in your
|
||||
build directory. This takes a really long time because it builds clang twice,
|
||||
and you *must* have compiler-rt in your build tree.
|
||||
|
||||
This process uses any source files under the perf-training directory as training
|
||||
data as long as the source files are marked up with LIT-style RUN lines.
|
||||
|
||||
After it finishes you can use ``find . -name clang.profdata`` to find it, but it
|
||||
should be at a path something like:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
<build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata
|
||||
|
||||
You can feed that file into the LLVM_PROFDATA_FILE option when you build your
|
||||
optimized compiler.
|
||||
|
||||
The PGO CMake cache has a slightly different stage naming scheme than other
|
||||
multi-stage builds. It generates three stages: stage1, stage2-instrumented, and
|
||||
stage2. Both of the stage2 builds are built using the stage1 compiler.
|
||||
|
||||
The PGO CMake cache generates the following additional targets:
|
||||
|
||||
**stage2-instrumented**
|
||||
Builds a stage1 x86 compiler, runtime, and required tools (llvm-config,
|
||||
llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.
|
||||
|
||||
**stage2-instrumented-generate-profdata**
|
||||
Depends on "stage2-instrumented" and will use the instrumented compiler to
|
||||
generate profdata based on the training files in <clang>/utils/perf-training
|
||||
|
||||
**stage2**
|
||||
Depends on "stage2-instrumented-generate-profdata" and will use the stage1
|
||||
compiler with the stage2 profdata to build a PGO-optimized compiler.
|
||||
|
||||
**stage2-check-llvm**
|
||||
Depends on stage2 and runs check-llvm using the stage2 compiler.
|
||||
|
||||
**stage2-check-clang**
|
||||
Depends on stage2 and runs check-clang using the stage2 compiler.
|
||||
|
||||
**stage2-check-all**
|
||||
Depends on stage2 and runs check-all using the stage2 compiler.
|
||||
|
||||
**stage2-test-suite**
|
||||
Depends on stage2 and runs the test-suite using the stage2 compiler (requires
|
||||
in-tree test-suite).
|
||||
|
||||
3-Stage Non-Determinism
|
||||
=======================
|
||||
|
||||
In the ancient lore of compilers non-determinism is like the multi-headed hydra.
|
||||
Whenever its head pops up, terror and chaos ensue.
|
||||
|
||||
Historically one of the tests to verify that a compiler was deterministic would
|
||||
be a three stage build. The idea of a three stage build is you take your sources
|
||||
and build a compiler (stage1), then use that compiler to rebuild the sources
|
||||
(stage2), then you use that compiler to rebuild the sources a third time
|
||||
(stage3) with an identical configuration to the stage2 build. At the end of
|
||||
this, you have a stage2 and stage3 compiler that should be bit-for-bit
|
||||
identical.
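The bit-for-bit comparison at the end of a three-stage build can be sketched like this. The function names and the idea of comparing SHA-256 digests are assumptions of this example, not something the LLVM build system provides; any byte-exact comparison would serve the same purpose.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def bit_identical(stage2_binary, stage3_binary):
    """True when both files have identical contents."""
    return file_digest(stage2_binary) == file_digest(stage3_binary)
```

A caller would pass the paths of the stage2 and stage3 ``clang`` binaries from its own build tree; those paths are not fixed by this sketch.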
|
||||
|
||||
You can perform one of these 3-stage builds with LLVM & clang using the
|
||||
following commands:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ cmake -G Ninja -C <path_to_clang>/cmake/caches/3-stage.cmake <source dir>
|
||||
$ ninja stage3
|
||||
|
||||
After the build you can compare the stage2 & stage3 compilers. We have a bot
|
||||
set up `here <http://lab.llvm.org:8011/builders/clang-3stage-ubuntu>`_ that runs
|
||||
this build-and-compare configuration.
|
718
external/llvm/docs/AliasAnalysis.rst
vendored
File diff suppressed because it is too large
605
external/llvm/docs/Atomics.rst
vendored
File diff suppressed because it is too large
87
external/llvm/docs/Benchmarking.rst
vendored
@ -1,87 +0,0 @@
|
||||
==================================
|
||||
Benchmarking tips
|
||||
==================================
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
For benchmarking a patch we want to reduce all possible sources of
|
||||
noise as much as possible. How to do that is very OS dependent.
|
||||
|
||||
Note that low noise is required, but not sufficient. It does not
|
||||
exclude measurement bias. See
|
||||
https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf for
|
||||
example.
|
||||
|
||||
General
|
||||
================================
|
||||
|
||||
* Use a high-resolution timer, e.g. ``perf`` on Linux.
|
||||
|
||||
* Run the benchmark multiple times to be able to recognize noise.
|
||||
|
||||
* Disable as many processes or services as possible on the target system.
|
||||
|
||||
* Disable frequency scaling, turbo boost and address space
|
||||
randomization (see OS specific section).
|
||||
|
||||
* Static link if the OS supports it. That avoids any variation that
|
||||
might be introduced by loading dynamic libraries. This can be done
|
||||
by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.
|
||||
|
||||
* Try to avoid storage. On some systems you can use tmpfs. Putting the
|
||||
program, inputs and outputs on tmpfs avoids touching a real storage
|
||||
system, which can have a pretty big variability.
|
||||
|
||||
To mount it (on Linux and FreeBSD at least)::
|
||||
|
||||
mount -t tmpfs -o size=<XX>g none dir_to_mount
|
||||
|
||||
Linux
|
||||
=====
|
||||
|
||||
* Disable address space randomization::
|
||||
|
||||
echo 0 > /proc/sys/kernel/randomize_va_space
|
||||
|
||||
* Set scaling_governor to performance::
|
||||
|
||||
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
|
||||
do
|
||||
echo performance > "$i"
|
||||
done
|
||||
|
||||
* Use https://github.com/lpechacek/cpuset to reserve cpus for just the
|
||||
program you are benchmarking. If using perf, leave at least 2 cores
|
||||
so that perf runs in one and your program in another::
|
||||
|
||||
cset shield -c N1,N2 -k on
|
||||
|
||||
This will move all threads out of N1 and N2. The ``-k on`` means
|
||||
that even kernel threads are moved out.
|
||||
|
||||
* Disable the SMT pair of the cpus you will use for the benchmark. The
|
||||
pair of cpu N can be found in
|
||||
``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`` and
|
||||
disabled with::
|
||||
|
||||
echo 0 > /sys/devices/system/cpu/cpuX/online
|
||||
|
||||
|
||||
* Run the program with::
|
||||
|
||||
cset shield --exec -- perf stat -r 10 <cmd>
|
||||
|
||||
This will run the command after ``--`` in the isolated cpus. The
|
||||
particular perf command runs the ``<cmd>`` 10 times and reports
|
||||
statistics.
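For the SMT step in the list above, the contents of ``thread_siblings_list`` have to be turned into CPU numbers before the siblings can be taken offline. A minimal sketch, assuming the file uses the usual sysfs CPU list syntax (comma-separated numbers and ``a-b`` ranges); the function names are hypothetical:

```python
def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1,8-9" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_siblings(cpu, siblings_text):
    """Return the siblings of `cpu`, i.e. the CPUs to take offline."""
    return [c for c in parse_cpu_list(siblings_text) if c != cpu]
```

Each CPU number returned by ``smt_siblings`` corresponds to the ``X`` in the ``cpuX/online`` path shown above.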
|
||||
|
||||
With these in place you can expect perf variations of less than 0.1%.
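One rough way to check a set of repeated timings against a bound like this is to compare their spread with their mean. The sketch below is an assumption of this example (sample standard deviation divided by the mean) and is not necessarily the figure that ``perf stat`` itself reports:

```python
import statistics


def relative_spread(samples):
    """Sample standard deviation divided by the mean, as a fraction."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate noise")
    return statistics.stdev(samples) / statistics.mean(samples)
```

A result of ``0.001`` corresponds to a variation of 0.1%.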
|
||||
|
||||
Linux Intel
|
||||
-----------
|
||||
|
||||
* Disable turbo mode::
|
||||
|
||||
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
|
205
external/llvm/docs/BigEndianNEON.rst
vendored
@ -1,205 +0,0 @@
|
||||
==============================================
|
||||
Using ARM NEON instructions in big endian mode
|
||||
==============================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
Generating code for big endian ARM processors is for the most part straightforward. NEON loads and stores however have some interesting properties that make code generation decisions less obvious in big endian mode.
|
||||
|
||||
The aim of this document is to explain the problem with NEON loads and stores, and the solution that has been implemented in LLVM.
|
||||
|
||||
In this document the term "vector" refers to what the ARM ABI calls a "short vector", which is a sequence of items that can fit in a NEON register. This sequence can be 64 or 128 bits in length, and can consist of 8, 16, 32 or 64 bit items. This document refers to A64 instructions throughout, but is mostly applicable to the A32/ARMv7 instruction sets as well. The ABI format for passing vectors in A32 is slightly different from A64. Apart from that, the same concepts apply.
|
||||
|
||||
Example: C-level intrinsics -> assembly
|
||||
---------------------------------------
|
||||
|
||||
It may be helpful first to illustrate how C-level ARM NEON intrinsics are lowered to instructions.
|
||||
|
||||
This trivial C function takes a vector of four ints and sets the zero'th lane to the value "42"::
|
||||
|
||||
#include <arm_neon.h>
|
||||
int32x4_t f(int32x4_t p) {
|
||||
return vsetq_lane_s32(42, p, 0);
|
||||
}
|
||||
|
||||
arm_neon.h intrinsics generate "generic" IR where possible (that is, normal IR instructions not ``llvm.arm.neon.*`` intrinsic calls). The above generates::
|
||||
|
||||
define <4 x i32> @f(<4 x i32> %p) {
|
||||
%vset_lane = insertelement <4 x i32> %p, i32 42, i32 0
|
||||
ret <4 x i32> %vset_lane
|
||||
}
|
||||
|
||||
Which then becomes the following trivial assembly::
|
||||
|
||||
f: // @f
|
||||
movz w8, #0x2a
|
||||
ins v0.s[0], w8
|
||||
ret
|
||||
|
||||
Problem
|
||||
=======
|
||||
|
||||
The main problem is how vectors are represented in memory and in registers.
|
||||
|
||||
First, a recap. The "endianness" of an item affects its representation in memory only. In a register, a number is just a sequence of bits - 64 bits in the case of AArch64 general purpose registers. Memory, however, is a sequence of addressable units of 8 bits in size. Any number greater than 8 bits must therefore be split up into 8-bit chunks, and endianness describes the order in which these chunks are laid out in memory.
|
||||
|
||||
A "little endian" layout has the least significant byte first (lowest in memory address). A "big endian" layout has the *most* significant byte first. This means that when loading an item from big endian memory, the byte at the lowest memory address must go into the most significant 8 bits of the register, and so forth.
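The two layouts can be illustrated with a short sketch. The helper name ``memory_layout`` is hypothetical; it simply wraps Python's built-in ``int.to_bytes``:

```python
def memory_layout(value, size, byteorder):
    """Return the bytes of `value` in increasing address order."""
    return value.to_bytes(size, byteorder)


# The same 32-bit number laid out both ways; index 0 is the lowest address.
little = memory_layout(0x11223344, 4, "little")
big = memory_layout(0x11223344, 4, "big")
```

Here ``little[0]`` is ``0x44`` (the least significant byte) while ``big[0]`` is ``0x11`` (the most significant byte).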
|
||||
|
||||
``LDR`` and ``LD1``
|
||||
===================
|
||||
|
||||
.. figure:: ARM-BE-ldr.png
|
||||
:align: right
|
||||
|
||||
Big endian vector load using ``LDR``.
|
||||
|
||||
|
||||
A vector is a consecutive sequence of items that are operated on simultaneously. To load a 64-bit vector, 64 bits need to be read from memory. In little endian mode, we can do this by just performing a 64-bit load - ``LDR d0, [foo]``. However, if we try this in big endian mode, because of the byte swapping the lane indices end up being swapped! The zero'th item as laid out in memory becomes the n'th lane in the vector.
|
||||
|
||||
.. figure:: ARM-BE-ld1.png
|
||||
:align: right
|
||||
|
||||
Big endian vector load using ``LD1``. Note that the lanes retain the correct ordering.
|
||||
|
||||
|
||||
Because of this, the instruction ``LD1`` performs a vector load but performs byte swapping not on the entire 64 bits, but on the individual items within the vector. This means that the register content is the same as it would have been on a little endian system.
|
||||
|
||||
It may seem that ``LD1`` should suffice to perform vector loads on a big endian machine. However, there are pros and cons to the two approaches that make the choice of register format less than simple.
|
||||
|
||||
There are two options:
|
||||
|
||||
1. The content of a vector register is the same *as if* it had been loaded with an ``LDR`` instruction.
|
||||
2. The content of a vector register is the same *as if* it had been loaded with an ``LD1`` instruction.
|
||||
|
||||
Because ``LD1 == LDR + REV`` and similarly ``LDR == LD1 + REV`` (on a big endian system), we can simulate either type of load with the other type of load plus a ``REV`` instruction. So we're not deciding which instructions to use, but which format to use (which will then influence which instruction is best to use).
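The relationship ``LD1 == LDR + REV`` can be checked with a small model of a big endian 64-bit load. This is only a sketch: the function names are hypothetical, a register is modelled as a list of lanes with lane 0 least significant, and ``rev_lanes`` stands in for the appropriate ``REV`` instruction.

```python
def ldr_lanes(memory, lane_bytes):
    """Model a big-endian LDR: one whole-register load, then split into lanes.

    Lane 0 is the least significant lane of the register.
    """
    register = int.from_bytes(memory, "big")
    lane_bits = 8 * lane_bytes
    mask = (1 << lane_bits) - 1
    count = len(memory) // lane_bytes
    return [(register >> (lane_bits * i)) & mask for i in range(count)]


def ld1_lanes(memory, lane_bytes):
    """Model a big-endian LD1: each element is byte-swapped on its own."""
    return [
        int.from_bytes(memory[i:i + lane_bytes], "big")
        for i in range(0, len(memory), lane_bytes)
    ]


def rev_lanes(lanes):
    """Model REV: reverse the order of the lanes in the register."""
    return list(reversed(lanes))


# Four 16-bit elements 0x0001, 0x0203, 0x0405, 0x0607 stored big-endian.
memory = bytes(range(8))
```

With this input, ``ld1_lanes(memory, 2)`` keeps element 0 in lane 0, while ``ldr_lanes(memory, 2)`` places it in the last lane; applying ``rev_lanes`` to either result yields the other.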
|
||||
|
||||
.. The 'clearer' container is required to make the following section header come after the floated
|
||||
images above.
|
||||
.. container:: clearer
|
||||
|
||||
Note that throughout this section we only mention loads. Stores have exactly the same problems as their associated loads, so have been skipped for brevity.
|
||||
|
||||
|
||||
Considerations
|
||||
==============
|
||||
|
||||
LLVM IR Lane ordering
|
||||
---------------------
|
||||
|
||||
LLVM IR has first class vector types. In LLVM IR, the zero'th element of a vector resides at the lowest memory address. The optimizer relies on this property in certain areas, for example when concatenating vectors together. The intention is for arrays and vectors to have identical memory layouts - ``[4 x i8]`` and ``<4 x i8>`` should be represented the same in memory. Without this property there would be many special cases that the optimizer would have to cleverly handle.
|
||||
|
||||
Use of ``LDR`` would break this lane ordering property. This doesn't preclude the use of ``LDR``, but we would have to do one of two things:
|
||||
|
||||
1. Insert a ``REV`` instruction to reverse the lane order after every ``LDR``.
|
||||
2. Disable all optimizations that rely on lane layout, and for every access to an individual lane (``insertelement``/``extractelement``/``shufflevector``) reverse the lane index.
|
||||
|
||||
AAPCS
|
||||
-----
|
||||
|
||||
The ARM procedure call standard (AAPCS) defines the ABI for passing vectors between functions in registers. It states:
|
||||
|
||||
When a short vector is transferred between registers and memory it is treated as an opaque object. That is a short vector is stored in memory as if it were stored with a single ``STR`` of the entire register; a short vector is loaded from memory using the corresponding ``LDR`` instruction. On a little-endian system this means that element 0 will always contain the lowest addressed element of a short vector; on a big-endian system element 0 will contain the highest-addressed element of a short vector.
|
||||
|
||||
-- Procedure Call Standard for the ARM 64-bit Architecture (AArch64), 4.1.2 Short Vectors
|
||||
|
||||
The use of ``LDR`` and ``STR`` as the ABI defines has at least one advantage over ``LD1`` and ``ST1``. ``LDR`` and ``STR`` are oblivious to the size of the individual lanes of a vector. ``LD1`` and ``ST1`` are not - the lane size is encoded within them. This is important across an ABI boundary, because it would become necessary to know the lane width the callee expects. Consider the following code:
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
/* callee.c */
|
||||
void callee(uint32x2_t v) {
|
||||
...
|
||||
}
|
||||
|
||||
/* caller.c */
|
||||
extern void callee(uint32x2_t);
|
||||
void caller() {
|
||||
callee(...);
|
||||
}
|
||||
|
||||
If ``callee`` changed its signature to take ``uint16x4_t``, which is equivalent in register content, then passing vectors in the ``LD1`` format would break this code until ``caller`` was updated and recompiled.
|
||||
|
||||
There is an argument that if the signatures of the two functions are different then the behaviour should be undefined. But there may be functions that are agnostic to the lane layout of the vector, and treating the vector as an opaque value (just loading it and storing it) would be impossible without a common format across ABI boundaries.
|
||||
|
||||
So to preserve ABI compatibility, we need to use the ``LDR`` lane layout across function calls.
|
||||
|
||||
Alignment
|
||||
---------
|
||||
|
||||
In strict alignment mode, ``LDR qX`` requires its address to be 128-bit aligned, whereas ``LD1`` only requires it to be as aligned as the lane size. If we canonicalised on using ``LDR``, we'd still need to use ``LD1`` in some places to avoid alignment faults (the result of the ``LD1`` would then need to be reversed with ``REV``).
|
||||
|
||||
Most operating systems however do not run with alignment faults enabled, so this is often not an issue.
|
||||
|
||||
Summary
|
||||
-------
|
||||
|
||||
The following table summarises the instructions that are required to be emitted for each property mentioned above for each of the two solutions.
|
||||
|
||||
+-------------------------------+-------------------------------+---------------------+
|
||||
| | ``LDR`` layout | ``LD1`` layout |
|
||||
+===============================+===============================+=====================+
|
||||
| Lane ordering | ``LDR + REV`` | ``LD1`` |
|
||||
+-------------------------------+-------------------------------+---------------------+
|
||||
| AAPCS | ``LDR`` | ``LD1 + REV`` |
|
||||
+-------------------------------+-------------------------------+---------------------+
|
||||
| Alignment for strict mode | ``LDR`` / ``LD1 + REV`` | ``LD1`` |
|
||||
+-------------------------------+-------------------------------+---------------------+
|
||||
|
||||
Neither approach is perfect, and choosing one boils down to choosing the lesser of two evils. It was decided that handling the lane ordering issue in the ``LDR`` layout would require changing target-agnostic compiler passes and would result in a strange IR in which lane indices were reversed. This was judged to be worse than the changes needed to support ``LD1``, so ``LD1`` was chosen as the canonical vector load instruction (and by inference, ``ST1`` for vector stores).
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
There are 3 parts to the implementation:
|
||||
|
||||
1. Predicate ``LDR`` and ``STR`` instructions so that they are never allowed to be selected to generate vector loads and stores. The exception is one-lane vectors [1]_ - these by definition cannot have lane ordering problems so are fine to use ``LDR``/``STR``.
|
||||
|
||||
2. Create code generation patterns for bitconverts that create ``REV`` instructions.
|
||||
|
||||
3. Make sure appropriate bitconverts are created so that vector values get passed over call boundaries as 1-element vectors (which is the same as if they were loaded with ``LDR``).
|
||||
|
||||
Bitconverts
|
||||
-----------
|
||||
|
||||
.. image:: ARM-BE-bitcastfail.png
|
||||
:align: right
|
||||
|
||||
The main problem with the ``LD1`` solution is dealing with bitconverts (or bitcasts, or reinterpret casts). These are pseudo instructions that only change the compiler's interpretation of data, not the underlying data itself. A requirement is that if data is loaded and then saved again (called a "round trip"), the memory contents should be the same after the store as before the load. If a vector is loaded and is then bitconverted to a different vector type before storing, the round trip will currently be broken.
|
||||
|
||||
Take for example this code sequence::
|
||||
|
||||
%0 = load <4 x i32>* %x
|
||||
%1 = bitcast <4 x i32> %0 to <2 x i64>
|
||||
store <2 x i64> %1, <2 x i64>* %y
|
||||
|
||||
This would produce a code sequence such as that in the figure on the right. The mismatched ``LD1`` and ``ST1`` cause the stored data to differ from the loaded data.
|
||||
|
||||
.. container:: clearer
|
||||
|
||||
When we see a bitcast from type ``X`` to type ``Y``, what we need to do is to change the in-register representation of the data to be *as if* it had just been loaded by a ``LD1`` of type ``Y``.
|
||||
|
||||
.. image:: ARM-BE-bitcastsuccess.png
|
||||
:align: right
|
||||
|
||||
Conceptually this is simple - we can insert a ``REV`` undoing the ``LD1`` of type ``X`` (converting the in-register representation to the same as if it had been loaded by ``LDR``) and then insert another ``REV`` to change the representation to be as if it had been loaded by an ``LD1`` of type ``Y``.
|
||||
|
||||
For the previous example, this would be::
|
||||
|
||||
LD1 v0.4s, [x]
|
||||
|
||||
REV64 v0.4s, v0.4s // There is no REV128 instruction, so it must be synthesized
|
||||
EXT v0.16b, v0.16b, v0.16b, #8 // with a REV64 then an EXT to swap the two 64-bit elements.
|
||||
|
||||
REV64 v0.2d, v0.2d
|
||||
EXT v0.16b, v0.16b, v0.16b, #8
|
||||
|
||||
ST1 v0.2d, [y]
|
||||
|
||||
It turns out that these ``REV`` pairs can, in almost all cases, be squashed together into a single ``REV``. For the example above, a ``REV128 4s`` + ``REV128 2d`` is actually a ``REV64 4s``, as shown in the figure on the right.
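The claim that the two reversals collapse into one can be checked with a small model. In this sketch a 128-bit register is a ``bytes`` object of length 16, and the hypothetical function ``rev`` reverses the order of fixed-size elements inside each container; ``REV128`` is modelled as a 16-byte container even though, as noted above, no such instruction exists.

```python
def rev(register, container_bytes, element_bytes):
    """Reverse the order of elements within each container of the register."""
    out = bytearray()
    for start in range(0, len(register), container_bytes):
        chunk = register[start:start + container_bytes]
        elements = [
            chunk[i:i + element_bytes]
            for i in range(0, container_bytes, element_bytes)
        ]
        for element in reversed(elements):
            out += element
    return bytes(out)


register = bytes(range(16))
# "REV128 4s" followed by "REV128 2d" ...
two_step = rev(rev(register, 16, 4), 16, 8)
# ... compared with a single "REV64 4s".
one_step = rev(register, 8, 4)
```

For this input ``two_step`` and ``one_step`` are equal, which matches the squashing described above.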
|
||||
|
||||
.. [1] One lane vectors may seem useless as a concept but they serve to distinguish between values held in general purpose registers and values held in NEON/VFP registers. For example, an ``i64`` would live in an ``x`` register, but ``<1 x i64>`` would live in a ``d`` register.
|
||||
|
1352
external/llvm/docs/BitCodeFormat.rst
vendored
File diff suppressed because it is too large
130
external/llvm/docs/BlockFrequencyTerminology.rst
vendored
@ -1,130 +0,0 @@
|
||||
================================
|
||||
LLVM Block Frequency Terminology
|
||||
================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
Block Frequency is a metric for estimating the relative frequency of different
|
||||
basic blocks. This document describes the terminology that the
|
||||
``BlockFrequencyInfo`` and ``MachineBlockFrequencyInfo`` analysis passes use.
|
||||
|
||||
Branch Probability
|
||||
==================
|
||||
|
||||
Blocks with multiple successors have probabilities associated with each
|
||||
outgoing edge. These are called branch probabilities. For a given block, the
|
||||
sum of its outgoing branch probabilities should be 1.0.
|
||||
|
||||
Branch Weight
|
||||
=============
|
||||
|
||||
Rather than storing fractions on each edge, we store an integer weight.
|
||||
Weights are relative to the other edges of a given predecessor block. The
|
||||
branch probability associated with a given edge is its own weight divided by
|
||||
the sum of the weights on the predecessor's outgoing edges.
|
||||
|
||||
For example, consider this IR:
|
||||
|
||||
.. code-block:: llvm
|
||||
|
||||
define void @foo() {
|
||||
; ...
|
||||
A:
|
||||
br i1 %cond, label %B, label %C, !prof !0
|
||||
; ...
|
||||
}
|
||||
!0 = metadata !{metadata !"branch_weights", i32 7, i32 8}
|
||||
|
||||
and this simple graph representation::
|
||||
|
||||
A -> B (edge-weight: 7)
|
||||
A -> C (edge-weight: 8)
|
||||
|
||||
The probability of branching from block A to block B is 7/15, and the
|
||||
probability of branching from block A to block C is 8/15.
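The conversion from weights to probabilities is simple enough to sketch directly. The function name is hypothetical and exact fractions are used only to keep the arithmetic visible:

```python
from fractions import Fraction


def branch_probabilities(weights):
    """Turn the edge weights of one block into branch probabilities."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return [Fraction(w, total) for w in weights]
```

For the weights 7 and 8 used above, this gives 7/15 and 8/15, and the results always sum to 1.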
|
||||
|
||||
See :doc:`BranchWeightMetadata` for details about the branch weight IR
|
||||
representation.
|
||||
|
||||
Block Frequency
|
||||
===============
|
||||
|
||||
Block frequency is a relative metric that represents the number of times a
|
||||
block executes. The ratio of a block frequency to the entry block frequency is
|
||||
the expected number of times the block will execute per entry to the function.
|
||||
|
||||
Block frequency is the main output of the ``BlockFrequencyInfo`` and
|
||||
``MachineBlockFrequencyInfo`` analysis passes.
|
||||
|
||||
Implementation: a series of DAGs
|
||||
================================
|
||||
|
||||
The implementation of the block frequency calculation analyses each loop,
|
||||
bottom-up, ignoring backedges; i.e., as a DAG. After each loop is processed,
|
||||
it's packaged up to act as a pseudo-node in its parent loop's (or the
|
||||
function's) DAG analysis.
|
||||
|
||||
Block Mass
|
||||
==========
|
||||
|
||||
For each DAG, the entry node is assigned a mass of ``UINT64_MAX`` and mass is
|
||||
distributed to successors according to branch weights. Block Mass uses a
|
||||
fixed-point representation where ``UINT64_MAX`` represents ``1.0`` and ``0``
|
||||
represents a number just above ``0.0``.
|
||||
|
||||
After mass is fully distributed, in any cut of the DAG that separates the exit
|
||||
nodes from the entry node, the sum of the block masses of the nodes succeeded
|
||||
by a cut edge should equal ``UINT64_MAX``. In other words, mass is conserved
|
||||
as it "falls" through the DAG.
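Mass distribution over a DAG can be sketched as follows. This is an illustration under stated assumptions, not LLVM's implementation: the function name is hypothetical, nodes must be supplied in topological order, weights are assumed to be positive, and handing the rounding remainder to the last successor is a choice made here so that integer mass is conserved.

```python
UINT64_MAX = 2**64 - 1


def distribute_mass(order, successors):
    """Distribute mass from the entry node through a DAG.

    order: nodes in topological order, entry node first.
    successors: maps a node to a list of (successor, weight) pairs.
    """
    mass = {node: 0 for node in order}
    mass[order[0]] = UINT64_MAX
    for node in order:
        edges = successors.get(node, [])
        total = sum(weight for _, weight in edges)
        remaining = mass[node]
        for i, (succ, weight) in enumerate(edges):
            if i == len(edges) - 1:
                share = remaining  # last edge takes the rounding remainder
            else:
                share = mass[node] * weight // total
            mass[succ] += share
            remaining -= share
    return mass
```

For a diamond ``A -> {B, C} -> D`` with weights 7 and 8 out of ``A``, the masses of ``B`` and ``C`` add up to ``UINT64_MAX``, and so does the mass of ``D``.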
|
||||
|
||||
If a function's basic block graph is a DAG, then block masses are valid block
|
||||
frequencies. This works poorly in practice though, since downstream users rely
|
||||
on adding block frequencies together without hitting the maximum.
|
||||
|
||||
Loop Scale
|
||||
==========
|
||||
|
||||
Loop scale is a metric that indicates how many times a loop iterates per entry.
|
||||
As mass is distributed through the loop's DAG, the (otherwise ignored) backedge
|
||||
mass is collected. This backedge mass is used to compute the exit frequency,
|
||||
and thus the loop scale.
|
||||
|
||||
Implementation: Getting from mass and scale to frequency
|
||||
========================================================
|
||||
|
||||
After analysing the complete series of DAGs, each block has a mass (local to
|
||||
its containing loop, if any), and each loop pseudo-node has a loop scale and
|
||||
its own mass (from its parent's DAG).
|
||||
|
||||
We can get an initial frequency assignment (with entry frequency of 1.0) by
|
||||
multiplying these masses and loop scales together. A given block's frequency
|
||||
is the product of its mass, the mass of containing loops' pseudo nodes, and the
|
||||
containing loops' loop scales.
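The product described above can be written out as a short sketch. Several things here are assumptions of the example rather than statements about LLVM's code: masses are expressed as fractions of 1.0 instead of the ``UINT64_MAX`` fixed-point form, the function names are hypothetical, and the loop scale is taken to be the reciprocal of the mass that leaves the loop (one minus the backedge mass).

```python
from fractions import Fraction


def loop_scale(backedge_mass):
    """Expected iterations per loop entry, from the backedge mass fraction."""
    exit_mass = 1 - backedge_mass
    if exit_mass <= 0:
        raise ValueError("loop never exits under these probabilities")
    return 1 / exit_mass


def block_frequency(block_mass, containing_loops):
    """Multiply a block's mass by each containing loop's mass and scale.

    containing_loops: (pseudo_node_mass, loop_scale) pairs, innermost first.
    """
    freq = Fraction(block_mass)
    for pseudo_mass, scale in containing_loops:
        freq *= pseudo_mass * scale
    return freq
```

For example, a block holding half of its loop's mass, inside a loop that is always entered and whose backedge carries 9/10 of the mass, gets a frequency of 5 relative to an entry frequency of 1.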
|
||||
|
||||
Since downstream users need integers (not floating point), this initial
|
||||
frequency assignment is shifted as necessary into the range of ``uint64_t``.
|
||||
|
||||
Block Bias
|
||||
==========
|
||||
|
||||
Block bias is a proposed *absolute* metric to indicate a bias toward or away
|
||||
from a given block during a function's execution. The idea is that bias can be
|
||||
used in isolation to indicate whether a block is relatively hot or cold, or to
|
||||
compare two blocks to indicate whether one is hotter or colder than the other.
|
||||
|
||||
The proposed calculation involves calculating a *reference* block frequency,
|
||||
where:
|
||||
|
||||
* every branch weight is assumed to be 1 (i.e., every branch probability
|
||||
distribution is even) and
|
||||
|
||||
* loop scales are ignored.
|
||||
|
||||
This reference frequency represents what the block frequency would be in an
|
||||
unbiased graph.
|
||||
|
||||
The bias is the ratio of the block frequency to this reference block frequency.
|
164
external/llvm/docs/BranchWeightMetadata.rst
vendored
@ -1,164 +0,0 @@
|
||||
===========================
|
||||
LLVM Branch Weight Metadata
|
||||
===========================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
Branch Weight Metadata represents each branch weight as the branch's likelihood
of being taken (see :doc:`BlockFrequencyTerminology`). Metadata is assigned to
the ``TerminatorInst`` as a ``MDNode`` of the ``MD_prof`` kind. The first operand
is always a ``MDString`` node with the string "branch_weights". The number of
operands depends on the terminator type.
|
||||
|
||||
Branch weights might be fetched from the profiling file, or generated based on
|
||||
`__builtin_expect`_ instruction.
|
||||
|
||||
All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of being taken.
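To make the relationship between weights and likelihood concrete, the sketch below normalizes a list of weights into probabilities by dividing each weight by the sum of all weights. The function name and the even split used when every weight is zero are assumptions of this example, not a description of LLVM's internal code.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Turn raw 32-bit branch weights into per-successor probabilities.
// Assumption of this sketch: if all weights are zero, the probability
// is split evenly between the successors.
std::vector<double>
weightsToProbabilities(const std::vector<std::uint32_t> &weights) {
  // Sum in 64 bits so that many large 32-bit weights cannot overflow.
  std::uint64_t total =
      std::accumulate(weights.begin(), weights.end(), std::uint64_t{0});
  std::vector<double> probabilities;
  probabilities.reserve(weights.size());
  for (std::uint32_t weight : weights) {
    if (total == 0)
      probabilities.push_back(1.0 / static_cast<double>(weights.size()));
    else
      probabilities.push_back(static_cast<double>(weight) /
                              static_cast<double>(total));
  }
  return probabilities;
}
```

For a conditional branch with weights 3 (true) and 1 (false), this yields 0.75 and 0.25.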
|
||||
|
||||
Supported Instructions
|
||||
======================
|
||||
|
||||
``BranchInst``
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Metadata is only assigned to the conditional branches. There are two extra
|
||||
operands for the true and the false branch.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
!0 = metadata !{
|
||||
metadata !"branch_weights",
|
||||
i32 <TRUE_BRANCH_WEIGHT>,
|
||||
i32 <FALSE_BRANCH_WEIGHT>
|
||||
}
|
||||
|
||||
``SwitchInst``
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Branch weights are assigned to every case (including the ``default`` case which
|
||||
is always case #0).
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
!0 = metadata !{
|
||||
metadata !"branch_weights",
|
||||
i32 <DEFAULT_BRANCH_WEIGHT>
|
||||
[ , i32 <CASE_BRANCH_WEIGHT> ... ]
|
||||
}
|
||||
|
||||
``IndirectBrInst``
|
||||
^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Branch weights are assigned to every destination.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
!0 = metadata !{
|
||||
metadata !"branch_weights",
|
||||
i32 <LABEL_BRANCH_WEIGHT>
|
||||
[ , i32 <LABEL_BRANCH_WEIGHT> ... ]
|
||||
}
|
||||
|
||||
``CallInst``
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Calls may have branch weight metadata, containing the execution count of
|
||||
the call. It is currently used in SamplePGO mode only, to augment the
|
||||
block and entry counts which may not be accurate with sampling.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
!0 = metadata !{
|
||||
metadata !"branch_weights",
|
||||
i32 <CALL_BRANCH_WEIGHT>
|
||||
}
|
||||
|
||||
Other
|
||||
^^^^^
|
||||
|
||||
Other terminator instructions are not allowed to contain Branch Weight Metadata.
|
||||
|
||||
.. _\__builtin_expect:
|
||||
|
||||
Built-in ``expect`` Instructions
|
||||
================================
|
||||
|
||||
The ``__builtin_expect(long exp, long c)`` instruction provides branch prediction
|
||||
information. The return value is the value of ``exp``.
|
||||
|
||||
It is especially useful in conditional statements. Currently Clang supports two
|
||||
conditional statements:
|
||||
|
||||
``if`` statement
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
The ``exp`` parameter is the condition. The ``c`` parameter is the expected
|
||||
comparison value. If it is equal to 1 (true), the condition is likely to be
|
||||
true; otherwise, the condition is likely to be false. For example:
|
||||
|
||||
.. code-block:: c++
|
||||
|
||||
if (__builtin_expect(x > 0, 1)) {
|
||||
// This block is likely to be taken.
|
||||
}
|
||||
|
||||
``switch`` statement
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The ``exp`` parameter is the value. The ``c`` parameter is the expected
|
||||
value. If the expected value doesn't appear in the case list, the ``default``
case is assumed to be the likely one.
|
||||
|
||||
.. code-block:: c++
|
||||
|
||||
switch (__builtin_expect(x, 5)) {
|
||||
default: break;
|
||||
case 0: // ...
|
||||
case 3: // ...
|
||||
case 5: // This case is likely to be taken.
|
||||
}
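The hint does not change what the program computes, only what the compiler assumes about the likely path. A minimal sketch, assuming a compiler that provides ``__builtin_expect`` (GCC and Clang do); the function names are hypothetical:

```cpp
// The builtin returns its first argument unchanged; the second argument is
// only a hint about the expected value.
long passThrough(long x) {
  return __builtin_expect(x, 5);
}

// Hint that the negative case is unlikely. The result is the same with or
// without the hint.
long clampNonNegative(long x) {
  if (__builtin_expect(x < 0, 0))
    return 0;
  return x;
}
```

Passing a value other than the expected one is still correct; it just means the hint was wrong for that call.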
|
||||
|
||||
CFG Modifications
|
||||
=================
|
||||
|
||||
Branch Weight Metadata is not proof against CFG changes. If a terminator's
operands are changed, some action should be taken. Otherwise, misoptimizations may
|
||||
occur due to incorrect branch prediction information.
|
||||
|
||||
Function Entry Counts
|
||||
=====================
|
||||
|
||||
To allow comparing different functions during inter-procedural analysis and
|
||||
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
|
||||
The first operand is a string indicating the name of the associated counter.
|
||||
|
||||
Currently, one counter is supported: "function_entry_count". The second operand
|
||||
is a 64-bit counter that indicates the number of times that this function was
|
||||
invoked (in the case of instrumentation-based profiles). In the case of
|
||||
sampling-based profiles, this operand is an approximation of how many times
|
||||
the function was invoked.
|
||||
|
||||
For example, in the code below, the instrumentation for function foo()
|
||||
indicates that it was called 2,590 times at runtime.
|
||||
|
||||
.. code-block:: llvm
|
||||
|
||||
define i32 @foo() !prof !1 {
|
||||
ret i32 0
|
||||
}
|
||||
!1 = !{!"function_entry_count", i64 2590}
|
||||
|
||||
If "function_entry_count" has more than 2 operands, the later operands are
|
||||
the GUIDs of the functions that need to be imported by ThinLTO. This is only
set by sampling-based profiles. It is needed because the sampling-based profile
|
||||
was collected on a binary that had already imported and inlined these functions,
|
||||
and we need to ensure the IR matches in the ThinLTO backends for profile
|
||||
annotation. The reason why we cannot annotate this on the callsite is that it
|
||||
can only go down 1 level in the call chain. For the cases where
|
||||
foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we will need to go down 2 levels
|
||||
in the call chain to import both bar_in_b_cc and baz_in_c_cc.
|
221
external/llvm/docs/Bugpoint.rst
vendored
@ -1,221 +0,0 @@
|
||||
====================================
|
||||
LLVM bugpoint tool: design and usage
|
||||
====================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Description
|
||||
===========
|
||||
|
||||
``bugpoint`` narrows down the source of problems in LLVM tools and passes. It
|
||||
can be used to debug three types of failures: optimizer crashes, miscompilations
|
||||
by optimizers, or bad native code generation (including problems in the static
|
||||
and JIT compilers). It aims to reduce large test cases to small, useful ones.
|
||||
For example, if ``opt`` crashes while optimizing a file, it will identify the
|
||||
optimization (or combination of optimizations) that causes the crash, and reduce
|
||||
the file down to a small example which triggers the crash.
|
||||
|
||||
For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code
|
||||
generators, see :doc:`HowToSubmitABug`.
|
||||
|
||||
Design Philosophy
|
||||
=================
|
||||
|
||||
``bugpoint`` is designed to be a useful tool without requiring any hooks into
|
||||
the LLVM infrastructure at all. It works with any and all LLVM passes and code
|
||||
generators, and does not need to "know" how they work. Because of this, it may
|
||||
appear to do stupid things or miss obvious simplifications. ``bugpoint`` is
|
||||
also designed to trade off programmer time for computer time in the
|
||||
compiler-debugging process; consequently, it may take a long period of
|
||||
(unattended) time to reduce a test case, but we feel it is still worth it. Note
|
||||
that ``bugpoint`` is generally very quick unless debugging a miscompilation
|
||||
where each test of the program (which requires executing it) takes a long time.
|
||||
|
||||
Automatic Debugger Selection
|
||||
----------------------------
|
||||
|
||||
``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line
|
||||
and links them together into a single module, called the test program. If any
|
||||
LLVM passes are specified on the command line, it runs these passes on the test
|
||||
program. If any of the passes crash, or if they produce malformed output (which
|
||||
causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_.
|
||||
|
||||
Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the
|
||||
test program with the "safe" backend (which is assumed to generate good code) to
|
||||
generate a reference output. Once ``bugpoint`` has a reference output for the
|
||||
test program, it tries executing it with the selected code generator. If the
|
||||
selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on
|
||||
the code generator. Otherwise, if the resulting output differs from the
|
||||
reference output, it assumes the difference resulted from a code generator
|
||||
failure, and starts the `code generator debugger`_.
|
||||
|
||||
Finally, if the output of the selected code generator matches the reference
|
||||
output, ``bugpoint`` runs the test program after all of the LLVM passes have
|
||||
been applied to it. If its output differs from the reference output, it assumes
|
||||
the difference resulted from a failure in one of the LLVM passes, and enters the
|
||||
`miscompilation debugger`_. Otherwise, there is no problem ``bugpoint`` can
|
||||
debug.
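The selection order described in this section can be summarized as a small decision function. This is only a sketch of the prose above: the ``Observations`` struct and all of its field names are hypothetical and do not correspond to ``bugpoint``'s source.

```cpp
enum class Debugger { Crash, CodeGenerator, Miscompilation, None };

// Hypothetical summary of what was observed while running the test program.
struct Observations {
  bool passesCrashOrMalformed; // a pass crashed or its output failed the verifier
  bool codegenCrashes;         // the selected code generator crashed
  bool codegenOutputDiffers;   // codegen output differs from the reference output
  bool optimizedOutputDiffers; // output after the passes differs from the reference
};

// Checks are made in the same order as in the text: pass crashes first,
// then code generator problems, then miscompilation by the passes.
Debugger selectDebugger(const Observations &seen) {
  if (seen.passesCrashOrMalformed || seen.codegenCrashes)
    return Debugger::Crash;
  if (seen.codegenOutputDiffers)
    return Debugger::CodeGenerator;
  if (seen.optimizedOutputDiffers)
    return Debugger::Miscompilation;
  return Debugger::None; // nothing that bugpoint can debug
}
```

Because the checks are ordered, a code generator mismatch takes precedence over a mismatch that only shows up after the passes have run.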
|
||||
|
||||
.. _crash debugger:
|
||||
|
||||
Crash debugger
|
||||
--------------
|
||||
|
||||
If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it
|
||||
can to reduce the list of passes (for optimizer crashes) and the size of the
|
||||
test program. First, ``bugpoint`` figures out which combination of optimizer
|
||||
passes triggers the bug. This is useful when debugging a problem exposed by
|
||||
``opt``, for example, because it runs over 38 passes.
|
||||
|
||||
Next, ``bugpoint`` tries removing functions from the test program, to reduce its
|
||||
size. Usually it is able to reduce a test program to a single function, when
|
||||
debugging intraprocedural optimizations. Once the number of functions has been
|
||||
reduced, it attempts to delete various edges in the control flow graph, to
|
||||
reduce the size of the function as much as possible. Finally, ``bugpoint``
|
||||
deletes any individual LLVM instructions whose absence does not eliminate the
|
||||
failure. At the end, ``bugpoint`` should tell you what passes crash, give you a
|
||||
bitcode file, and give you instructions on how to reproduce the failure with
|
||||
``opt`` or ``llc``.
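The idea of deleting anything "whose absence does not eliminate the failure" can be illustrated with a simple greedy loop. This is a generic sketch and not ``bugpoint``'s actual reduction strategy; ``reduceList`` and the ``stillFails`` callback are hypothetical names, and the items could stand for passes, functions, or instructions.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Greedy one-at-a-time reduction: try dropping each item in turn and keep
// the removal whenever the failure still reproduces without it.
std::vector<std::string> reduceList(
    std::vector<std::string> items,
    const std::function<bool(const std::vector<std::string> &)> &stillFails) {
  std::size_t index = 0;
  while (index < items.size()) {
    std::vector<std::string> candidate = items;
    candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(index));
    if (stillFails(candidate))
      items = candidate; // the item was not needed to trigger the failure
    else
      ++index;           // the item is required; keep it and move on
  }
  return items;
}
```

Each call to ``stillFails`` corresponds to re-running the failing tool, which is why reduction can take a long (unattended) time.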
|
||||
|
||||
.. _code generator debugger:
|
||||
|
||||
Code generator debugger
|
||||
-----------------------
|
||||
|
||||
The code generator debugger attempts to narrow down the amount of code that is
|
||||
being miscompiled by the selected code generator. To do this, it takes the test
|
||||
program and partitions it into two pieces: one piece which it compiles with the
|
||||
"safe" backend (into a shared object), and one piece which it runs with either
|
||||
the JIT or the static LLC compiler. It uses several techniques to reduce the
|
||||
amount of code pushed through the LLVM code generator, to reduce the potential
|
||||
scope of the problem. After it is finished, it emits two bitcode files (called
|
||||
"test" [to be compiled with the code generator] and "safe" [to be compiled with
|
||||
the "safe" backend], respectively), and instructions for reproducing the
|
||||
problem. The code generator debugger assumes that the "safe" backend produces
|
||||
good code.
|
||||
|
||||
.. _miscompilation debugger:
|
||||
|
||||
Miscompilation debugger
|
||||
-----------------------
|
||||
|
||||
The miscompilation debugger works similarly to the code generator debugger. It
|
||||
works by splitting the test program into two pieces, running the optimizations
|
||||
specified on one piece, linking the two pieces back together, and then executing
|
||||
the result. It attempts to narrow down the list of passes to the one (or few)
|
||||
which are causing the miscompilation, then reduce the portion of the test
|
||||
program which is being miscompiled. The miscompilation debugger assumes that
|
||||
the selected code generator is working properly.
|
||||
|
||||
Advice for using bugpoint
|
||||
=========================
|
||||
|
||||
``bugpoint`` can be a remarkably useful tool, but it sometimes works in
|
||||
non-obvious ways. Here are some hints and tips:
|
||||
|
||||
* In the code generator and miscompilation debuggers, ``bugpoint`` only works
|
||||
with programs that have deterministic output. Thus, if the program outputs
|
||||
``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may
|
||||
misinterpret differences in these data, when output, as the result of a
|
||||
miscompilation. Programs should be temporarily modified to disable outputs
|
||||
that are likely to vary from run to run.
|
||||
|
||||
* In the code generator and miscompilation debuggers, debugging will go faster
|
||||
if you manually modify the program or its inputs to reduce the runtime, but
|
||||
still exhibit the problem.
|
||||
|
||||
* ``bugpoint`` is extremely useful when working on a new optimization: it helps
|
||||
track down regressions quickly. To avoid having to relink ``bugpoint`` every
|
||||
time you change your optimization, however, have ``bugpoint`` dynamically load
|
||||
your optimization with the ``-load`` option.
|
||||
|
||||
* ``bugpoint`` can generate a lot of output and run for a long period of time.
|
||||
It is often useful to capture the output of the program to a file. For example,
|
||||
in the C shell, you can run:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ bugpoint ... |& tee bugpoint.log
|
||||
|
||||
to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well
|
||||
as on your terminal.
|
||||
|
||||
* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint``
|
||||
crashes before you see its "All input ok" message, you might try ``llvm-link
|
||||
-v`` on the same set of input files. If that also crashes, you may be
|
||||
experiencing a linker bug.
|
||||
|
||||
* ``bugpoint`` is useful for proactively finding bugs in LLVM. Invoking
|
||||
``bugpoint`` with the ``-find-bugs`` option will cause the list of specified
|
||||
optimizations to be randomized and applied to the program. This process will
|
||||
repeat until a bug is found or the user kills ``bugpoint``.
|
||||
|
||||
* ``bugpoint`` can produce IR which contains long names. Run ``opt
|
||||
-metarenamer`` over the IR to rename everything using easy-to-read,
|
||||
metasyntactic names. Alternatively, run ``opt -strip -instnamer`` to rename
|
||||
everything with very short (often purely numeric) names.
|
||||
|
||||
What to do when bugpoint isn't enough
|
||||
=====================================
|
||||
|
||||
Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and
|
||||
TargetLowering both have visitor-structured code with lots of potential
|
||||
transformations. If the process of using bugpoint has left you with still too
|
||||
much code to figure out and the problem seems to be in instcombine, the
|
||||
following steps may help. These same techniques are useful with TargetLowering
|
||||
as well.
|
||||
|
||||
Turn on ``-debug-only=instcombine`` and see which transformations within
|
||||
instcombine are firing by selecting out lines with "``IC``" in them.
|
||||
|
||||
At this point, you have a decision to make. Is the number of transformations
|
||||
small enough to step through them using a debugger? If so, then try that.
|
||||
|
||||
If there are too many transformations, then a source modification approach may
|
||||
be helpful. In this approach, you can modify the source code of instcombine to
|
||||
disable just those transformations that are being performed on your test input
|
||||
and perform a binary search over the set of transformations. One set of places
|
||||
to modify are the "``visit*``" methods of ``InstCombiner`` (*e.g.*
|
||||
``visitICmpInst``) by adding a "``return false``" as the first line of the
|
||||
method.
|
||||
|
||||
If that still doesn't remove enough, then change the caller of
|
||||
``InstCombiner::DoOneIteration``, ``InstCombiner::runOnFunction`` to limit the
|
||||
number of iterations.
|
||||
|
||||
You may also find it useful to use "``-stats``" now to see what parts of
|
||||
instcombine are firing. This can guide where to put additional reporting code.
|
||||
|
||||
At this point, if the number of transformations is still too large, then
|
||||
inserting code to limit whether or not to execute the body of the code in the
|
||||
visit function can be helpful. Add a static counter which is incremented on
|
||||
every invocation of the function. Then add code which simply returns false on
|
||||
desired ranges. For example:
|
||||
|
||||
.. code-block:: c++
|
||||
|
||||
|
||||
static int calledCount = 0;
|
||||
calledCount++;
|
||||
DEBUG(if (calledCount < 212) return false);
|
||||
DEBUG(if (calledCount > 217) return false);
|
||||
DEBUG(if (calledCount == 213) return false);
|
||||
DEBUG(if (calledCount == 214) return false);
|
||||
DEBUG(if (calledCount == 215) return false);
|
||||
DEBUG(if (calledCount == 216) return false);
|
||||
DEBUG(dbgs() << "visitXOR calledCount: " << calledCount << "\n");
|
||||
DEBUG(dbgs() << "I: "; I->dump());
|
||||
|
||||
could be added to ``visitXor`` to limit ``visitXor`` to being applied only to
|
||||
calls 212 and 217. This is from an actual test case and raises an important
|
||||
point---a simple binary search may not be sufficient, as transformations that
|
||||
interact may require isolating more than one call. In TargetLowering, use
|
||||
``return SDNode();`` instead of ``return false;``.
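The same gating idea can be tried outside of LLVM in a self-contained form. The helper below is a hypothetical sketch (``shouldTransform`` and its parameters are not LLVM APIs): it counts invocations and only lets calls inside a chosen window through, which makes it easy to bisect over call numbers.

```cpp
// Returns true when the transformation should run for this invocation.
// Calls are numbered from 1; only calls in [firstCall, lastCall] pass.
// The counter is shared by every caller of this function.
bool shouldTransform(int firstCall, int lastCall) {
  static int calledCount = 0;
  ++calledCount;
  return calledCount >= firstCall && calledCount <= lastCall;
}
```

A visit method would start with ``if (!shouldTransform(212, 217)) return false;`` and the window would then be narrowed between runs. As noted above, interacting transformations may need more than one window.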
|
||||
|
||||
Now that the number of transformations is down to a manageable number, try
|
||||
examining the output to see if you can figure out which transformations are
|
||||
being done. If that can be figured out, then do the usual debugging. If it isn't
obvious which code corresponds to the transformation being performed, set a
|
||||
breakpoint after the call count based disabling and step through the code.
|
||||
Alternatively, you can use "``printf``" style debugging to report waypoints.
|
91
external/llvm/docs/CFIVerify.rst
vendored
@ -1,91 +0,0 @@
|
||||
==============================================
|
||||
Control Flow Verification Tool Design Document
|
||||
==============================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
Objective
|
||||
=========
|
||||
|
||||
This document provides an overview of an external tool to verify the protection
|
||||
mechanisms implemented by Clang's *Control Flow Integrity* (CFI) schemes
|
||||
(``-fsanitize=cfi``). This tool, provided a binary or DSO, should infer whether
|
||||
indirect control flow operations are protected by CFI, and should output these
|
||||
results in a human-readable form.
|
||||
|
||||
This tool should also be added as part of Clang's continuous integration testing
|
||||
framework, where modifications to the compiler ensure that CFI protection
|
||||
schemes are still present in the final binary.
|
||||
|
||||
Location
|
||||
========
|
||||
|
||||
This tool will be present as a part of the LLVM toolchain, and will reside in
|
||||
the "/llvm/tools/llvm-cfi-verify" directory, relative to the LLVM trunk. It will
|
||||
be tested in two ways:
|
||||
|
||||
- Unit tests to validate code sections, present in
  "/llvm/unittests/llvm-cfi-verify".
|
||||
- Integration tests, present in "/llvm/tools/clang/test/LLVMCFIVerify". These
|
||||
integration tests are part of clang as part of a continuous integration
|
||||
framework, ensuring updates to the compiler that reduce CFI coverage on
|
||||
indirect control flow instructions are identified.
|
||||
|
||||
Background
|
||||
==========
|
||||
|
||||
This tool will continuously validate that CFI directives are properly
|
||||
implemented around all indirect control flows by analysing the output machine
|
||||
code. The analysis of machine code is important as it ensures that any bugs
|
||||
present in linker or compiler do not subvert CFI protections in the final
|
||||
shipped binary.
|
||||
|
||||
Unprotected indirect control flow instructions will be flagged for manual
|
||||
review. These unexpected control flows may simply have not been accounted for in
|
||||
the compiler implementation of CFI (e.g. indirect jumps to facilitate switch
|
||||
statements may not be fully protected).
|
||||
|
||||
It may be possible in the future to extend this tool to flag unnecessary CFI
|
||||
directives (e.g. CFI directives around a static call to a non-polymorphic base
|
||||
type). This type of directive has no security implications, but may present
|
||||
performance impacts.
|
||||
|
||||
Design Ideas
|
||||
============
|
||||
|
||||
This tool will disassemble binaries and DSOs from their machine code format and
|
||||
analyse the disassembled machine code. The tool will inspect virtual calls and
|
||||
indirect function calls. This tool will also inspect indirect jumps, as inlined
|
||||
functions and jump tables should also be subject to CFI protections. Non-virtual
|
||||
calls (``-fsanitize=cfi-nvcall``) and cast checks (``-fsanitize=cfi-*cast*``)
|
||||
are not implemented due to a lack of information provided by the bytecode.
|
||||
|
||||
The tool would operate by searching for indirect control flow instructions in
|
||||
the disassembly. A control flow graph would be generated from a small buffer of
|
||||
the instructions surrounding the 'target' control flow instruction. If the
|
||||
target instruction is branched-to, the fallthrough of the branch should be the
|
||||
CFI trap (on x86, this is a ``ud2`` instruction). If the target instruction is
|
||||
the fallthrough (i.e. immediately succeeds) of a conditional jump, the
|
||||
conditional jump target should be the CFI trap. If an indirect control flow
|
||||
instruction does not conform to one of these formats, the target will be noted
|
||||
as being CFI-unprotected.
|
||||
|
||||
Note that in the second case outlined above (where the target instruction is the
|
||||
fallthrough of a conditional jump), if the target represents a vcall that takes
|
||||
arguments, these arguments may be pushed to the stack after the branch but
|
||||
before the target instruction. In these cases, a secondary 'spill graph' is
|
||||
constructed, to ensure the register argument used by the indirect jump/call is
|
||||
not spilled from the stack at any point in the interim period. If there are no
|
||||
spills that affect the target register, the target is marked as CFI-protected.
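The classification rules in the two paragraphs above can be condensed into a small sketch. Everything here is hypothetical (the ``IndirectSite`` struct, its fields, and ``classifySite`` are invented for illustration); the real tool would derive these facts from the disassembly and the control flow graph.

```cpp
enum class Protection { Protected, Unprotected };

// Hypothetical facts gathered about one indirect control flow instruction.
struct IndirectSite {
  bool branchedTo;                 // the target is the destination of a branch
  bool branchFallthroughIsTrap;    // ...and that branch falls through to the CFI trap
  bool fallthroughOfCondJump;      // the target immediately follows a conditional jump
  bool condJumpTargetIsTrap;       // ...and that jump's destination is the CFI trap
  bool spillAffectsTargetRegister; // a spill touches the register used by the jump/call
};

Protection classifySite(const IndirectSite &site) {
  // First format: reached by a branch whose fallthrough is the trap.
  if (site.branchedTo && site.branchFallthroughIsTrap)
    return Protection::Protected;
  // Second format: fallthrough of a conditional jump to the trap, with the
  // additional requirement that no spill affects the target register.
  if (site.fallthroughOfCondJump && site.condJumpTargetIsTrap &&
      !site.spillAffectsTargetRegister)
    return Protection::Protected;
  return Protection::Unprotected;
}
```

Any site that matches neither format falls through to ``Unprotected`` and would be flagged for manual review.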
|
||||
|
||||
Other Design Notes
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Only machine code sections that are marked as executable will be subject to this
|
||||
analysis. Non-executable sections do not require analysis as any execution
|
||||
present in these sections has already violated the control flow integrity.
|
||||
|
||||
Suitable extensions may be made at a later date to include analysis for indirect
|
||||
control flow operations across DSO boundaries. Currently, these CFI features are
|
||||
only experimental with an unstable ABI, making them unsuitable for analysis.
|
793
external/llvm/docs/CMake.rst
vendored
File diff suppressed because it is too large
168
external/llvm/docs/CMakeLists.txt
vendored
@ -1,168 +0,0 @@
|
||||
|
||||
if (DOXYGEN_FOUND)
|
||||
if (LLVM_ENABLE_DOXYGEN)
|
||||
set(abs_top_srcdir ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
set(abs_top_builddir ${CMAKE_CURRENT_BINARY_DIR})
|
||||
|
||||
if (HAVE_DOT)
|
||||
set(DOT ${LLVM_PATH_DOT})
|
||||
endif()
|
||||
|
||||
if (LLVM_DOXYGEN_EXTERNAL_SEARCH)
|
||||
set(enable_searchengine "YES")
|
||||
set(searchengine_url "${LLVM_DOXYGEN_SEARCHENGINE_URL}")
|
||||
set(enable_server_based_search "YES")
|
||||
set(enable_external_search "YES")
|
||||
set(extra_search_mappings "${LLVM_DOXYGEN_SEARCH_MAPPINGS}")
|
||||
else()
|
||||
set(enable_searchengine "NO")
|
||||
set(searchengine_url "")
|
||||
set(enable_server_based_search "NO")
|
||||
set(enable_external_search "NO")
|
||||
set(extra_search_mappings "")
|
||||
endif()
|
||||
|
||||
# If asked, configure doxygen for the creation of a Qt Compressed Help file.
|
||||
option(LLVM_ENABLE_DOXYGEN_QT_HELP
|
||||
"Generate a Qt Compressed Help file." OFF)
|
||||
if (LLVM_ENABLE_DOXYGEN_QT_HELP)
|
||||
set(LLVM_DOXYGEN_QCH_FILENAME "org.llvm.qch" CACHE STRING
|
||||
"Filename of the Qt Compressed help file")
|
||||
set(LLVM_DOXYGEN_QHP_NAMESPACE "org.llvm" CACHE STRING
|
||||
"Namespace under which the intermediate Qt Help Project file lives")
|
||||
set(LLVM_DOXYGEN_QHP_CUST_FILTER_NAME "${PACKAGE_STRING}" CACHE STRING
|
||||
"See http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom-filters")
|
||||
set(LLVM_DOXYGEN_QHP_CUST_FILTER_ATTRS "${PACKAGE_NAME},${PACKAGE_VERSION}" CACHE STRING
|
||||
"See http://qt-project.org/doc/qt-4.8/qthelpproject.html#filter-attributes")
|
||||
find_program(LLVM_DOXYGEN_QHELPGENERATOR_PATH qhelpgenerator
|
||||
DOC "Path to the qhelpgenerator binary")
|
||||
if (NOT LLVM_DOXYGEN_QHELPGENERATOR_PATH)
|
||||
message(FATAL_ERROR "Failed to find qhelpgenerator binary")
|
||||
endif()
|
||||
|
||||
set(llvm_doxygen_generate_qhp "YES")
|
||||
set(llvm_doxygen_qch_filename "${LLVM_DOXYGEN_QCH_FILENAME}")
|
||||
set(llvm_doxygen_qhp_namespace "${LLVM_DOXYGEN_QHP_NAMESPACE}")
|
||||
set(llvm_doxygen_qhelpgenerator_path "${LLVM_DOXYGEN_QHELPGENERATOR_PATH}")
|
||||
set(llvm_doxygen_qhp_cust_filter_name "${LLVM_DOXYGEN_QHP_CUST_FILTER_NAME}")
|
||||
set(llvm_doxygen_qhp_cust_filter_attrs "${LLVM_DOXYGEN_QHP_CUST_FILTER_ATTRS}")
|
||||
|
||||
else()
|
||||
set(llvm_doxygen_generate_qhp "NO")
|
||||
set(llvm_doxygen_qch_filename "")
|
||||
set(llvm_doxygen_qhp_namespace "")
|
||||
set(llvm_doxygen_qhelpgenerator_path "")
|
||||
set(llvm_doxygen_qhp_cust_filter_name "")
|
||||
set(llvm_doxygen_qhp_cust_filter_attrs "")
|
||||
endif()
|
||||
|
||||
option(LLVM_DOXYGEN_SVG
|
||||
"Use svg instead of png files for doxygen graphs." OFF)
|
||||
if (LLVM_DOXYGEN_SVG)
|
||||
set(DOT_IMAGE_FORMAT "svg")
|
||||
else()
|
||||
set(DOT_IMAGE_FORMAT "png")
|
||||
endif()
|
||||
|
||||
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/doxygen.cfg.in
|
||||
${CMAKE_CURRENT_BINARY_DIR}/doxygen.cfg @ONLY)
|
||||
|
||||
set(abs_top_srcdir)
|
||||
set(abs_top_builddir)
|
||||
set(DOT)
|
||||
set(enable_searchengine)
|
||||
set(searchengine_url)
|
||||
set(enable_server_based_search)
|
||||
set(enable_external_search)
|
||||
set(extra_search_mappings)
|
||||
set(llvm_doxygen_generate_qhp)
|
||||
set(llvm_doxygen_qch_filename)
|
||||
set(llvm_doxygen_qhp_namespace)
|
||||
set(llvm_doxygen_qhelpgenerator_path)
|
||||
set(llvm_doxygen_qhp_cust_filter_name)
|
||||
set(llvm_doxygen_qhp_cust_filter_attrs)
|
||||
set(DOT_IMAGE_FORMAT)
|
||||
|
||||
add_custom_target(doxygen-llvm
|
||||
COMMAND ${DOXYGEN_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/doxygen.cfg
|
||||
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
|
||||
COMMENT "Generating llvm doxygen documentation." VERBATIM)
|
||||
|
||||
if (LLVM_BUILD_DOCS)
|
||||
add_dependencies(doxygen doxygen-llvm)
|
||||
endif()
|
||||
|
||||
if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
|
||||
# ./ suffix is needed to copy the contents of html directory without
|
||||
# appending html/ into LLVM_INSTALL_DOXYGEN_HTML_DIR.
|
||||
install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/doxygen/html/.
|
||||
COMPONENT doxygen-html
|
||||
DESTINATION "${LLVM_INSTALL_DOXYGEN_HTML_DIR}")
|
||||
endif()
|
||||
endif()
|
||||
endif()
|
||||
|
||||
if (LLVM_ENABLE_SPHINX)
|
||||
include(AddSphinxTarget)
|
||||
if (SPHINX_FOUND)
|
||||
if (${SPHINX_OUTPUT_HTML})
|
||||
add_sphinx_target(html llvm)
|
||||
endif()
|
||||
|
||||
|
||||
if (${SPHINX_OUTPUT_MAN})
|
||||
add_sphinx_target(man llvm)
|
||||
add_sphinx_target(man llvm-dwarfdump)
|
||||
add_sphinx_target(man dsymutil)
|
||||
endif()
|
||||
|
||||
endif()
|
||||
endif()
|
||||
|
||||
list(FIND LLVM_BINDINGS_LIST ocaml uses_ocaml)
|
||||
if( NOT uses_ocaml LESS 0 AND LLVM_ENABLE_OCAMLDOC )
|
||||
set(doc_targets
|
||||
ocaml_llvm
|
||||
ocaml_llvm_all_backends
|
||||
ocaml_llvm_analysis
|
||||
ocaml_llvm_bitreader
|
||||
ocaml_llvm_bitwriter
|
||||
ocaml_llvm_executionengine
|
||||
ocaml_llvm_irreader
|
||||
ocaml_llvm_linker
|
||||
ocaml_llvm_target
|
||||
ocaml_llvm_ipo
|
||||
ocaml_llvm_passmgr_builder
|
||||
ocaml_llvm_scalar_opts
|
||||
ocaml_llvm_transform_utils
|
||||
ocaml_llvm_vectorize
|
||||
)
|
||||
|
||||
foreach(llvm_target ${LLVM_TARGETS_TO_BUILD})
|
||||
list(APPEND doc_targets ocaml_llvm_${llvm_target})
|
||||
endforeach()
|
||||
|
||||
set(odoc_files)
|
||||
foreach( doc_target ${doc_targets} )
|
||||
get_target_property(odoc_file ${doc_target} OCAML_ODOC)
|
||||
list(APPEND odoc_files -load ${odoc_file})
|
||||
endforeach()
|
||||
|
||||
add_custom_target(ocaml_doc
|
||||
COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html
|
||||
COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html
|
||||
COMMAND ${OCAMLFIND} ocamldoc -d ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html
|
||||
-sort -colorize-code -html ${odoc_files}
|
||||
COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_CURRENT_SOURCE_DIR}/_ocamldoc/style.css
|
||||
${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html)
|
||||
|
||||
add_dependencies(ocaml_doc ${doc_targets})
|
||||
|
||||
if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
|
||||
# ./ suffix is needed to copy the contents of html directory without
|
||||
# appending html/ into LLVM_INSTALL_OCAMLDOC_HTML_DIR.
|
||||
install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html/.
|
||||
COMPONENT ocamldoc-html
|
||||
DESTINATION "${LLVM_INSTALL_OCAMLDOC_HTML_DIR}")
|
||||
endif()
|
||||
endif()
|
439
external/llvm/docs/CMakePrimer.rst
vendored
@ -1,439 +0,0 @@
|
||||
============
|
||||
CMake Primer
|
||||
============
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. warning::
|
||||
Disclaimer: This documentation is written by LLVM project contributors `not`
|
||||
anyone affiliated with the CMake project. This document may contain
|
||||
inaccurate terminology, phrasing, or technical details. It is provided with
|
||||
the best intentions.
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
The LLVM project and many of the core projects built on LLVM build using CMake.
|
||||
This document aims to provide a brief overview of CMake for developers modifying
|
||||
LLVM projects or building their own projects on top of LLVM.
|
||||
|
||||
The official CMake language reference is available in the cmake-language
|
||||
manpage and `cmake-language online documentation
|
||||
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.
|
||||
|
||||
10,000 ft View
|
||||
==============
|
||||
|
||||
CMake is a tool that reads script files in its own language that describe how a
|
||||
software project builds. As CMake evaluates the scripts it constructs an
|
||||
internal representation of the software project. Once the scripts have been
|
||||
fully processed, if there are no errors, CMake will generate build files to
|
||||
actually build the project. CMake supports generating build files for a variety
|
||||
of command line build tools as well as for popular IDEs.
|
||||
|
||||
When a user runs CMake it performs a variety of checks similar to how autoconf
|
||||
worked historically. During the checks and the evaluation of the build
|
||||
description scripts CMake caches values into the CMakeCache. This is useful
|
||||
because it allows the build system to skip long-running checks during
|
||||
incremental development. CMake caching also has some drawbacks, but that will be
|
||||
discussed later.
|
||||
|
||||
Scripting Overview
|
||||
==================
|
||||
|
||||
CMake's scripting language has a very simple grammar. Every language construct
|
||||
is a command that matches the pattern ``name(args)``. Commands come in three
|
||||
primary types: language-defined (commands implemented in C++ in CMake), defined
|
||||
functions, and defined macros. The CMake distribution also contains a suite of
|
||||
CMake modules that contain definitions for useful functionality.
|
||||
|
||||
The example below is the full CMake build for building a C++ "Hello World"
|
||||
program. The example uses only CMake language-defined commands.
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
cmake_minimum_required(VERSION 3.2)
|
||||
project(HelloWorld)
|
||||
add_executable(HelloWorld HelloWorld.cpp)
|
||||
|
||||
The CMake language provides control flow constructs in the form of foreach loops
|
||||
and if blocks. To make the example above more complicated you could add an if
|
||||
block to define "APPLE" when targeting Apple platforms:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
cmake_minimum_required(VERSION 3.2)
|
||||
project(HelloWorld)
|
||||
add_executable(HelloWorld HelloWorld.cpp)
|
||||
if(APPLE)
|
||||
target_compile_definitions(HelloWorld PUBLIC APPLE)
|
||||
endif()
|
||||
|
||||
Variables, Types, and Scope
|
||||
===========================
|
||||
|
||||
Dereferencing
|
||||
-------------
|
||||
|
||||
In CMake variables are "stringly" typed. All variables are represented as
|
||||
strings throughout evaluation. Wrapping a variable in ``${}`` dereferences it
|
||||
and results in a literal substitution of the name for the value. CMake refers to
|
||||
this as "variable evaluation" in their documentation. Dereferences are performed
|
||||
*before* the command being called receives the arguments. This means
|
||||
dereferencing a list results in multiple separate arguments being passed to the
|
||||
command.
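
As a minimal sketch of this behavior (the variable name ``my_list`` is only
illustrative), compare passing a list dereference unquoted and quoted:

.. code-block:: cmake

  set(my_list a b c)

  # Unquoted: the list expands into three separate arguments (a, b and c).
  # message() concatenates its arguments, so this prints "abc".
  message(${my_list})

  # Quoted: the expansion stays a single argument, so this prints "a;b;c".
  message("${my_list}")

Quoting the dereference is the usual way to keep a list together as one
argument.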
|
||||
|
||||
Variable dereferences can be nested and be used to model complex data. For
|
||||
example:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
set(var_name var1)
|
||||
set(${var_name} foo) # same as "set(var1 foo)"
|
||||
set(${${var_name}}_var bar) # same as "set(foo_var bar)"
|
||||
|
||||
Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set a variable only under some condition and then
dereference it unconditionally, relying on the empty expansion in code paths
where the variable was never set. There are examples of this throughout
the LLVM CMake build system.
|
||||
|
||||
An example of variable empty expansion is:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
if(APPLE)
|
||||
set(extra_sources Apple.cpp)
|
||||
endif()
|
||||
add_executable(HelloWorld HelloWorld.cpp ${extra_sources})
|
||||
|
||||
In this example the ``extra_sources`` variable is only defined if you're
|
||||
targeting an Apple platform. For all other targets the ``extra_sources`` will be
|
||||
evaluated as empty before add_executable is given its arguments.
|
||||
|
||||
Lists
|
||||
-----
|
||||
|
||||
In CMake lists are semi-colon delimited strings, and it is strongly advised that
you avoid using semi-colons inside list elements; it doesn't go smoothly. A few
examples of defining lists:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
# Creates a list with members a, b, c, and d
|
||||
set(my_list a b c d)
|
||||
set(my_list "a;b;c;d")
|
||||
|
||||
# Creates a string "a b c d"
|
||||
set(my_string "a b c d")
|
||||
|
||||
Lists of Lists
|
||||
--------------
|
||||
|
||||
One of the more complicated patterns in CMake is lists of lists. Because a list
|
||||
cannot contain an element with a semi-colon, to construct a list of lists you
|
||||
make a list of variable names that refer to other lists. For example:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
set(list_of_lists a b c)
|
||||
set(a 1 2 3)
|
||||
set(b 4 5 6)
|
||||
set(c 7 8 9)
|
||||
|
||||
With this layout you can iterate through the list of lists printing each value
|
||||
with the following code:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
foreach(list_name IN LISTS list_of_lists)
|
||||
foreach(value IN LISTS ${list_name})
|
||||
message(${value})
|
||||
endforeach()
|
||||
endforeach()
|
||||
|
||||
You'll notice that the inner foreach loop's list is doubly dereferenced. This is
|
||||
because the first dereference turns ``list_name`` into the name of the sub-list
|
||||
(a, b, or c in the example), then the second dereference is to get the value of
|
||||
the list.
|
||||
|
||||
This pattern is used throughout CMake; the most common example is the compiler
|
||||
flags options, which CMake refers to using the following variable expansions:
|
||||
CMAKE_${LANGUAGE}_FLAGS and CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}.
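
The sketch below shows how such a name could be assembled by nesting
dereferences. The helper variables ``lang`` and ``build_type`` are hypothetical,
and the example assumes the C++ language has already been enabled by a
``project()`` call. Note that the per-configuration flag variables use the
upper-case configuration name.

.. code-block:: cmake

  set(lang CXX)
  set(build_type RELEASE)

  # The inner dereferences are resolved first, producing the variable name
  # CMAKE_CXX_FLAGS_RELEASE, which is then dereferenced itself.
  message("${CMAKE_${lang}_FLAGS_${build_type}}")

The printed flags depend on the compiler and platform in use.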
|
||||
|
||||
Other Types
|
||||
-----------
|
||||
|
||||
Variables that are cached or specified on the command line can have types
|
||||
associated with them. The variable's type is used by CMake's UI tool to display
|
||||
the right input field. A variable's type generally doesn't impact evaluation,
|
||||
however CMake does have special handling for some types, such as PATH.
|
||||
You can read more about the special handling in `CMake's set documentation
|
||||
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
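
As an illustration, a few typed cache entries might be declared as follows. All
variable names, values, and descriptions here are made up for the example.

.. code-block:: cmake

  # PATH: cmake-gui offers a file dialog for this entry.
  set(EXAMPLE_DATA_DIR "/opt/example/data" CACHE PATH
      "Directory holding example data")

  # BOOL: cmake-gui offers a checkbox. option() is shorthand for a BOOL entry.
  set(EXAMPLE_ENABLE_TESTS ON CACHE BOOL "Build the example tests")
  option(EXAMPLE_ENABLE_DOCS "Build the example documentation" OFF)

  # STRING: a free-form text field.
  set(EXAMPLE_BACKEND "native" CACHE STRING "Which example backend to use")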
|
||||
|
||||
Scope
|
||||
-----
|
||||
|
||||
CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
|
||||
Variables set in a CMake module that is included in a CMakeLists file will be
|
||||
set in the scope they are included from, and all subdirectories.
|
||||
|
||||
When a variable that is already set is set again in a subdirectory it overrides
|
||||
the value in that scope and any deeper subdirectories.
|
||||
|
||||
The CMake set command provides two scope-related options. PARENT_SCOPE sets a
|
||||
variable into the parent scope, and not the current scope. The CACHE option sets
|
||||
the variable in the CMakeCache, which results in it being set in all scopes. The
|
||||
CACHE option will not set a variable that already exists in the CACHE unless the
|
||||
FORCE option is specified.
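
A small sketch of the CACHE and FORCE interaction follows. The variable name
``EXAMPLE_MODE`` is hypothetical, and the comments describe a first run in a
fresh build directory where ``EXAMPLE_MODE`` was not given on the command line.

.. code-block:: cmake

  # Creates the cache entry because it does not exist yet.
  set(EXAMPLE_MODE "default" CACHE STRING "An illustrative cache entry")

  # Has no effect: the entry already exists and FORCE was not given.
  set(EXAMPLE_MODE "ignored" CACHE STRING "An illustrative cache entry")
  message("${EXAMPLE_MODE}")  # prints: default

  # FORCE overwrites the existing cache entry.
  set(EXAMPLE_MODE "forced" CACHE STRING "An illustrative cache entry" FORCE)
  message("${EXAMPLE_MODE}")  # prints: forced

On later runs the cache already holds ``forced``, so the first ``message`` call
would print that value instead.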
|
||||
|
||||
In addition to directory-based scope, CMake functions also have their own scope.
|
||||
This means variables set inside functions do not bleed into the parent scope.
|
||||
This is not true of macros, and it is for this reason LLVM prefers functions
|
||||
over macros whenever reasonable.
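
A minimal sketch of function scope and ``PARENT_SCOPE`` is shown below. The
function and variable names are illustrative only.

.. code-block:: cmake

  function(compute_greeting out_var)
    # Visible only inside the function.
    set(local_value "hello")
    # Sets the variable named by out_var in the caller's scope.
    set(${out_var} "${local_value}" PARENT_SCOPE)
  endfunction()

  compute_greeting(greeting)
  message("${greeting}")                       # prints: hello
  message("local_value is '${local_value}'")   # prints: local_value is ''

Passing the name of the output variable as an argument is a common way to
return a value from a CMake function.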
|
||||
|
||||
.. note::
|
||||
Unlike C-based languages, CMake's loop and control flow blocks do not have
|
||||
their own scopes.
|
||||
|
||||
Control Flow
|
||||
============
|
||||
|
||||
CMake features the same basic control flow constructs you would expect in any
|
||||
scripting language, but there are a few quirks because, as with everything in
|
||||
CMake, control flow constructs are commands.
|
||||
|
||||
If, ElseIf, Else
|
||||
----------------
|
||||
|
||||
.. note::
|
||||
For the full documentation on the CMake if command go
|
||||
`here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
|
||||
far more complete.
|
||||
|
||||
In general CMake if blocks work the way you'd expect:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
if(<condition>)
|
||||
message("do stuff")
|
||||
elseif(<condition>)
|
||||
message("do other stuff")
|
||||
else()
|
||||
message("do other other stuff")
|
||||
endif()
|
||||
|
||||
The single most important thing to know about CMake's if blocks coming from a C
|
||||
background is that they do not have their own scope. Variables set inside
|
||||
conditional blocks persist after the ``endif()``.
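
For example, in this small sketch (with a made-up variable name) the variable
is still set after the block ends:

.. code-block:: cmake

  if(TRUE)
    set(set_inside_if "still visible")
  endif()

  # The if block did not introduce a new scope.
  message("${set_inside_if}")  # prints: still visible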
|
||||
|
||||
Loops
|
||||
-----
|
||||
|
||||
The most common form of the CMake ``foreach`` block is:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
foreach(var ...)
|
||||
message("do stuff")
|
||||
endforeach()
|
||||
|
||||
The variable argument portion of the ``foreach`` block can contain dereferenced
|
||||
lists, values to iterate, or a mix of both:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
foreach(var foo bar baz)
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# foo
|
||||
# bar
|
||||
# baz
|
||||
|
||||
set(my_list 1 2 3)
|
||||
foreach(var ${my_list})
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# 1
|
||||
# 2
|
||||
# 3
|
||||
|
||||
foreach(var ${my_list} out_of_bounds)
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# 1
|
||||
# 2
|
||||
# 3
|
||||
# out_of_bounds
|
||||
|
||||
There is also a more modern CMake foreach syntax. The code below is equivalent
|
||||
to the code above:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
foreach(var IN ITEMS foo bar baz)
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# foo
|
||||
# bar
|
||||
# baz
|
||||
|
||||
set(my_list 1 2 3)
|
||||
foreach(var IN LISTS my_list)
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# 1
|
||||
# 2
|
||||
# 3
|
||||
|
||||
foreach(var IN LISTS my_list ITEMS out_of_bounds)
|
||||
message(${var})
|
||||
endforeach()
|
||||
# prints:
|
||||
# 1
|
||||
# 2
|
||||
# 3
|
||||
# out_of_bounds
|
||||
|
||||
Similar to the conditional statements, these generally behave how you would
|
||||
expect, and they do not have their own scope.
|
||||
|
||||
CMake also supports ``while`` loops, although they are not widely used in LLVM.
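
A short sketch of a ``while`` loop, using an illustrative ``counter`` variable,
might look like this:

.. code-block:: cmake

  set(counter 0)
  # The condition uses the same syntax as if(), so the variable name is
  # evaluated without an explicit dereference.
  while(counter LESS 3)
    message("${counter}")
    math(EXPR counter "${counter} + 1")
  endwhile()
  # prints:
  # 0
  # 1
  # 2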
|
||||
|
||||
Modules, Functions and Macros
|
||||
=============================
|
||||
|
||||
Modules
|
||||
-------
|
||||
|
||||
Modules are CMake's vehicle for enabling code reuse. CMake modules are just
|
||||
CMake script files. They can contain code to execute on include as well as
|
||||
definitions for commands.
|
||||
|
||||
In CMake macros and functions are universally referred to as commands, and they
|
||||
are the primary method of defining code that can be called multiple times.
|
||||
|
||||
In LLVM we have several CMake modules that are included as part of our
|
||||
distribution for developers who don't build our project from source. Those
|
||||
modules are the fundamental pieces needed to build LLVM-based projects with
|
||||
CMake. We also rely on modules as a way of organizing the build system's
|
||||
functionality for maintainability and re-use within LLVM projects.
|
||||
|
||||
Argument Handling
|
||||
-----------------
|
||||
|
||||
When defining a CMake command, handling arguments is very useful. The examples
|
||||
in this section will all use the CMake ``function`` block, but this all applies
|
||||
to the ``macro`` block as well.
|
||||
|
||||
CMake commands can have named arguments that are required at every call site. In
|
||||
addition, all commands will implicitly accept a variable number of extra
|
||||
arguments (in C parlance, all commands are varargs functions). When a command is
|
||||
invoked with extra arguments (beyond the named ones) CMake will store the full
|
||||
list of arguments (both named and unnamed) in a list named ``ARGV``, and the
|
||||
sublist of unnamed arguments in ``ARGN``. Below is a trivial example of
|
||||
providing a wrapper function for CMake's built in function ``add_dependencies``.
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
function(add_deps target)
|
||||
add_dependencies(${target} ${ARGN})
|
||||
endfunction()
|
||||
|
||||
This example defines a new function named ``add_deps`` which takes a required first
|
||||
argument, and just calls another function passing through the first argument and
|
||||
all trailing arguments.
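
To make the difference between the named argument, ``ARGN`` and ``ARGV``
concrete, here is a small sketch using a hypothetical ``show_args`` function:

.. code-block:: cmake

  function(show_args first)
    message("first: ${first}")
    message("ARGN:  ${ARGN}")
    message("ARGV:  ${ARGV}")
  endfunction()

  show_args(a b c)
  # prints:
  # first: a
  # ARGN:  b;c
  # ARGV:  a;b;c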
|
||||
|
||||
CMake provides a module ``CMakeParseArguments`` which provides an implementation
|
||||
of advanced argument parsing. We use this all over LLVM, and it is recommended
|
||||
for any function that has complex argument-based behaviors or optional
|
||||
arguments. CMake's official documentation for the module is in the
|
||||
``cmake-modules`` manpage, and is also available at the
|
||||
`cmake-modules online documentation
|
||||
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.
|
||||
|
||||
.. note::
|
||||
As of CMake 3.5 the cmake_parse_arguments command has become a native command
|
||||
and the CMakeParseArguments module is empty and only left around for
|
||||
compatibility.
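
The following is a sketch of how ``cmake_parse_arguments`` could be used. The
function ``add_example_tool`` and its keywords are invented for illustration
and are not part of LLVM or CMake.

.. code-block:: cmake

  # Only needed before CMake 3.5; harmless on newer versions.
  include(CMakeParseArguments)

  function(add_example_tool name)
    # Prefix "ARG", one option, one single-value keyword, and two
    # multi-value keywords. Results land in ARG_<KEYWORD> variables.
    cmake_parse_arguments(ARG
      "INSTALL"
      "OUTPUT_NAME"
      "SOURCES;DEPENDS"
      ${ARGN})

    add_executable(${name} ${ARG_SOURCES})

    if(ARG_OUTPUT_NAME)
      set_target_properties(${name} PROPERTIES OUTPUT_NAME ${ARG_OUTPUT_NAME})
    endif()
    if(ARG_DEPENDS)
      add_dependencies(${name} ${ARG_DEPENDS})
    endif()
    if(ARG_INSTALL)
      install(TARGETS ${name} RUNTIME DESTINATION bin)
    endif()
  endfunction()

  add_example_tool(hello INSTALL OUTPUT_NAME hello-tool SOURCES HelloWorld.cpp)

Arguments that match none of the keywords are collected in
``ARG_UNPARSED_ARGUMENTS``, which can be checked to report misuse.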
|
||||
|
||||
Functions Vs Macros
|
||||
-------------------
|
||||
|
||||
Functions and Macros look very similar in how they are used, but there is one
|
||||
fundamental difference between the two. Functions have their own scope, and
|
||||
macros don't. This means variables set in macros will bleed out into the calling
|
||||
scope. That makes macros suitable for defining very small bits of functionality
|
||||
only.
|
||||
|
||||
The other difference between CMake functions and macros is how arguments are
|
||||
passed. Arguments to macros are not set as variables, instead dereferences to
|
||||
the parameters are resolved across the macro before executing it. This can
|
||||
result in some unexpected behavior if using non-dereferenced variables. For example:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
macro(print_list my_list)
|
||||
foreach(var IN LISTS my_list)
|
||||
message("${var}")
|
||||
endforeach()
|
||||
endmacro()
|
||||
|
||||
set(my_list a b c d)
|
||||
set(my_list_of_numbers 1 2 3 4)
|
||||
print_list(my_list_of_numbers)
|
||||
# prints:
|
||||
# a
|
||||
# b
|
||||
# c
|
||||
# d
|
||||
|
||||
Generally speaking this issue is uncommon because it requires using
|
||||
non-dereferenced variables with names that overlap in the parent scope, but it
|
||||
is important to be aware of because it can lead to subtle bugs.
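
One way to avoid the problem, sketched here with a hypothetical
``print_list_fn`` function, is to take the list name as an argument and
dereference it explicitly:

.. code-block:: cmake

  function(print_list_fn list_name)
    # ${list_name} expands to the name the caller passed in, and
    # IN LISTS then looks up the list with that name.
    foreach(var IN LISTS ${list_name})
      message("${var}")
    endforeach()
  endfunction()

  set(my_list a b c d)
  set(my_list_of_numbers 1 2 3 4)
  print_list_fn(my_list_of_numbers)
  # prints:
  # 1
  # 2
  # 3
  # 4

This still misbehaves if the caller's list happens to be named ``list_name``,
so an unusual parameter name is a reasonable precaution.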
|
||||
|
||||
LLVM Project Wrappers
|
||||
=====================
|
||||
|
||||
LLVM projects provide lots of wrappers around critical CMake built-in commands.
|
||||
We use these wrappers to provide consistent behaviors across LLVM components
|
||||
and to reduce code duplication.
|
||||
|
||||
We generally (but not always) follow the convention that commands prefaced with
|
||||
``llvm_`` are intended to be used only as building blocks for other commands.
|
||||
Wrapper commands that are intended for direct use are generally named
with the project in the middle of the command name (e.g. ``add_llvm_executable``
|
||||
is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
|
||||
all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
|
||||
distribution. It can be included and used by any LLVM sub-project that requires
|
||||
LLVM.
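
As a rough sketch, a project built against an installed LLVM might pull in the
wrappers as shown below. The project, target, and source file names are
placeholders, and the exact variables exported by the LLVM package
configuration may differ between LLVM versions.

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.4.3)
  project(ExampleTool)

  # Locate an installed LLVM through its CMake package configuration.
  find_package(LLVM REQUIRED CONFIG)

  # Make the installed LLVM modules, including AddLLVM.cmake, visible
  # to include().
  list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
  include(AddLLVM)

  include_directories(${LLVM_INCLUDE_DIRS})
  add_definitions(${LLVM_DEFINITIONS})

  # LLVM components the wrapper should link against.
  set(LLVM_LINK_COMPONENTS Support)
  add_llvm_executable(example-tool ExampleTool.cpp)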
|
||||
|
||||
.. note::
|
||||
|
||||
Not all LLVM projects require LLVM for all use cases. For example compiler-rt
|
||||
can be built without LLVM, and the compiler-rt sanitizer libraries are used
|
||||
with GCC.
|
||||
|
||||
Useful Built-in Commands
|
||||
========================
|
||||
|
||||
CMake has many useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. A few particularly useful commands are:
|
||||
|
||||
* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
|
||||
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
|
||||
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
|
||||
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
|
||||
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
|
||||
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_
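
As a brief taste of a few of these commands, consider the sketch below. The
variable names are illustrative only.

.. code-block:: cmake

  set(values c a b)
  list(SORT values)                  # values is now "a;b;c"
  list(LENGTH values count)          # count is 3
  math(EXPR doubled "${count} * 2")  # doubled is 6
  string(TOUPPER "llvm" upper)       # upper is "LLVM"

  message("${values} ${count} ${doubled} ${upper}")
  # prints: a;b;c 3 6 LLVM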
|
||||
|
||||
The full documentation for CMake commands is in the ``cmake-commands`` manpage
|
||||
and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
|
@ -1 +0,0 @@
|
||||
5c0fb064959ef9d7bf2ab1d0ba5eafc7547244e7
|
112
external/llvm/docs/CodeOfConduct.rst
vendored
@ -1,112 +0,0 @@
|
||||
==============================
|
||||
LLVM Community Code of Conduct
|
||||
==============================
|
||||
|
||||
.. note::
|
||||
|
||||
This document is currently a **DRAFT** document while it is being discussed
|
||||
by the community.
|
||||
|
||||
The LLVM community has always worked to be a welcoming and respectful
|
||||
community, and we want to ensure that doesn't change as we grow and evolve. To
|
||||
that end, we have a few ground rules that we ask people to adhere to:
|
||||
|
||||
* `be friendly and patient`_,
|
||||
* `be welcoming`_,
|
||||
* `be considerate`_,
|
||||
* `be respectful`_,
|
||||
* `be careful in the words that you choose and be kind to others`_, and
|
||||
* `when we disagree, try to understand why`_.
|
||||
|
||||
This isn't an exhaustive list of things that you can't do. Rather, take it in
|
||||
the spirit in which it's intended - a guide to make it easier to communicate
|
||||
and participate in the community.
|
||||
|
||||
This code of conduct applies to all spaces managed by the LLVM project or The
|
||||
LLVM Foundation. This includes IRC channels, mailing lists, bug trackers, LLVM
|
||||
events such as the developer meetings and socials, and any other forums created
|
||||
by the project that the community uses for communication. It applies to all of
|
||||
your communication and conduct in these spaces, including emails, chats, things
|
||||
you say, slides, videos, posters, signs, or even t-shirts you display in these
|
||||
spaces. In addition, violations of this code outside these spaces may, in rare
|
||||
cases, affect a person's ability to participate within them, when the conduct
|
||||
amounts to an egregious violation of this code.
|
||||
|
||||
If you believe someone is violating the code of conduct, we ask that you report
|
||||
it by emailing conduct@llvm.org. For more details please see our
|
||||
:doc:`Reporting Guide <ReportingGuide>`.
|
||||
|
||||
.. _be friendly and patient:
|
||||
|
||||
* **Be friendly and patient.**
|
||||
|
||||
.. _be welcoming:
|
||||
|
||||
* **Be welcoming.** We strive to be a community that welcomes and supports
|
||||
people of all backgrounds and identities. This includes, but is not limited
|
||||
to members of any race, ethnicity, culture, national origin, colour,
|
||||
immigration status, social and economic class, educational level, sex, sexual
|
||||
orientation, gender identity and expression, age, size, family status,
|
||||
political belief, religion or lack thereof, and mental and physical ability.
|
||||
|
||||
.. _be considerate:
|
||||
|
||||
* **Be considerate.** Your work will be used by other people, and you in turn
|
||||
will depend on the work of others. Any decision you take will affect users
|
||||
and colleagues, and you should take those consequences into account. Remember
|
||||
that we're a world-wide community, so you might not be communicating in
|
||||
someone else's primary language.
|
||||
|
||||
.. _be respectful:
|
||||
|
||||
* **Be respectful.** Not all of us will agree all the time, but disagreement is
|
||||
no excuse for poor behavior and poor manners. We might all experience some
|
||||
frustration now and then, but we cannot allow that frustration to turn into
|
||||
a personal attack. It's important to remember that a community where people
|
||||
feel uncomfortable or threatened is not a productive one. Members of the LLVM
|
||||
community should be respectful when dealing with other members as well as
|
||||
with people outside the LLVM community.
|
||||
|
||||
.. _be careful in the words that you choose and be kind to others:
|
||||
|
||||
* **Be careful in the words that you choose and be kind to others.** Do not
|
||||
insult or put down other participants. Harassment and other exclusionary
|
||||
behavior aren't acceptable. This includes, but is not limited to:
|
||||
|
||||
* Violent threats or language directed against another person.
|
||||
* Discriminatory jokes and language.
|
||||
* Posting sexually explicit or violent material.
|
||||
* Posting (or threatening to post) other people's personally identifying
|
||||
information ("doxing").
|
||||
* Personal insults, especially those using racist or sexist terms.
|
||||
* Unwelcome sexual attention.
|
||||
* Advocating for, or encouraging, any of the above behavior.
|
||||
|
||||
In general, if someone asks you to stop, then stop. Persisting in such
|
||||
behavior after being asked to stop is considered harassment.
|
||||
|
||||
.. _when we disagree, try to understand why:
|
||||
|
||||
* **When we disagree, try to understand why.** Disagreements, both social and
|
||||
technical, happen all the time and LLVM is no exception. It is important that
|
||||
we resolve disagreements and differing views constructively. Remember that
|
||||
we're different. The strength of LLVM comes from its varied community, people
|
||||
from a wide range of backgrounds. Different people have different
|
||||
perspectives on issues. Being unable to understand why someone holds
|
||||
a viewpoint doesn't mean that they're wrong. Don't forget that it is human to
|
||||
err and blaming each other doesn't get us anywhere. Instead, focus on helping
|
||||
to resolve issues and learning from mistakes.
|
||||
|
||||
Questions?
|
||||
==========
|
||||
|
||||
If you have questions, please feel free to contact the LLVM Foundation Code of
|
||||
Conduct Advisory Committee by emailing conduct@llvm.org.
|
||||
|
||||
|
||||
(This text is based on the `Django Project`_ Code of Conduct, which is in turn
|
||||
based on wording from the `Speak Up! project`_.)
|
||||
|
||||
.. _Django Project: https://www.djangoproject.com/conduct/
|
||||
.. _Speak Up! project: http://speakup.io/coc.html
|
||||
|