Imported Upstream version 5.18.0.167
Former-commit-id: 289509151e0fee68a1b591a20c9f109c3c789d3a
This commit is contained in:
parent
e19d552987
commit
b084638f15
4
external/llvm/docs/PDB/CodeViewSymbols.rst
vendored
@ -1,4 +0,0 @@
|
||||
=====================================
|
||||
CodeView Symbol Records
|
||||
=====================================
|
||||
|
4
external/llvm/docs/PDB/CodeViewTypes.rst
vendored
@ -1,4 +0,0 @@
|
||||
=====================================
|
||||
CodeView Type Records
|
||||
=====================================
|
||||
|
445
external/llvm/docs/PDB/DbiStream.rst
vendored
@ -1,445 +0,0 @@
|
||||
=====================================
|
||||
The PDB DBI (Debug Info) Stream
|
||||
=====================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. _dbi_intro:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
The PDB DBI Stream (Index 3) is one of the largest and most important streams
in a PDB file. It contains information about how the program was compiled
(e.g. compilation flags), the compilands (e.g. object files) that were linked
together to form the program, and the source files that were used to build it.
It also contains references to other streams that hold more detailed
information about each compiland, such as the CodeView symbol records contained
within each compiland and the source and line information for functions and
other symbols within each compiland.
|
||||
|
||||
|
||||
.. _dbi_header:
|
||||
|
||||
Stream Header
|
||||
=============
|
||||
At offset 0 of the DBI Stream is a header with the following layout:
|
||||
|
||||
|
||||
.. code-block:: c++

  struct DbiStreamHeader {
    int32_t VersionSignature;
    uint32_t VersionHeader;
    uint32_t Age;
    uint16_t GlobalStreamIndex;
    uint16_t BuildNumber;
    uint16_t PublicStreamIndex;
    uint16_t PdbDllVersion;
    uint16_t SymRecordStream;
    uint16_t PdbDllRbld;
    int32_t ModInfoSize;
    int32_t SectionContributionSize;
    int32_t SectionMapSize;
    int32_t SourceInfoSize;
    int32_t TypeServerSize;
    uint32_t MFCTypeServerIndex;
    int32_t OptionalDbgHeaderSize;
    int32_t ECSubstreamSize;
    uint16_t Flags;
    uint16_t Machine;
    uint32_t Padding;
  };
|
||||
|
||||
- **VersionSignature** - Unknown meaning. Appears to always be ``-1``.
|
||||
|
||||
- **VersionHeader** - A value from the following enum.
|
||||
|
||||
.. code-block:: c++

  enum class DbiStreamVersion : uint32_t {
    VC41 = 930803,
    V50 = 19960307,
    V60 = 19970606,
    V70 = 19990903,
    V110 = 20091201
  };
|
||||
|
||||
Similar to the :doc:`PDB Stream <PdbStream>`, this value always appears to be
|
||||
``V70``, and it is not clear what the other values are for.
|
||||
|
||||
- **Age** - The number of times the PDB has been written. Equal to the same
|
||||
field from the :ref:`PDB Stream header <pdb_stream_header>`.
|
||||
|
||||
- **GlobalStreamIndex** - The index of the :doc:`Global Symbol Stream <GlobalStream>`,
|
||||
which contains CodeView symbol records for all global symbols. Actual records
|
||||
are stored in the symbol record stream, and are referenced from this stream.
|
||||
|
||||
- **BuildNumber** - A bitfield containing values representing the major and minor
|
||||
version number of the toolchain (e.g. 12.0 for MSVC 2013) used to build the
|
||||
program, with the following layout:
|
||||
|
||||
.. code-block:: c++

  uint16_t MinorVersion : 8;
  uint16_t MajorVersion : 7;
  uint16_t NewVersionFormat : 1;
|
||||
|
||||
For the purposes of LLVM, we assume ``NewVersionFormat`` to be always ``true``.
|
||||
If it is ``false``, the layout above does not apply and the reader should consult
|
||||
the `Microsoft Source Code <https://github.com/Microsoft/microsoft-pdb>`__ for
|
||||
further guidance.
|
||||
|
||||
- **PublicStreamIndex** - The index of the :doc:`Public Symbol Stream <PublicStream>`,
|
||||
which contains CodeView symbol records for all public symbols. Actual records
|
||||
are stored in the symbol record stream, and are referenced from this stream.
|
||||
|
||||
- **PdbDllVersion** - The version number of ``mspdbXXXX.dll`` used to produce this
  PDB. This does not apply to LLVM, which does not use ``mspdbXXXX.dll``.
|
||||
|
||||
- **SymRecordStream** - The stream containing all CodeView symbol records used
|
||||
by the program. This is used for deduplication, so that many different
|
||||
compilands can refer to the same symbols without having to include the full record
|
||||
content inside of each module stream.
|
||||
|
||||
- **PdbDllRbld** - Unknown
|
||||
|
||||
- **MFCTypeServerIndex** - The index of the MFC type server in the
  :ref:`dbi_type_server_substream`.
|
||||
|
||||
- **Flags** - A bitfield with the following layout, containing various
|
||||
information about how the program was built:
|
||||
|
||||
.. code-block:: c++

  uint16_t WasIncrementallyLinked : 1;
  uint16_t ArePrivateSymbolsStripped : 1;
  uint16_t HasConflictingTypes : 1;
  uint16_t Reserved : 13;
|
||||
|
||||
The only one of these that is not self-explanatory is ``HasConflictingTypes``.
|
||||
Although undocumented, ``link.exe`` contains a hidden flag ``/DEBUG:CTYPES``.
|
||||
If it is passed to ``link.exe``, this field will be set. Otherwise it will
|
||||
not be set. It is unclear what this flag does, although it seems to have
|
||||
subtle implications on the algorithm used to look up type records.
|
||||
|
||||
- **Machine** - A value from the `CV_CPU_TYPE_e <https://msdn.microsoft.com/en-us/library/b2fc64ek.aspx>`__
|
||||
enumeration. Common values are ``0x8664`` (x86-64) and ``0x14C`` (x86).
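The ``BuildNumber`` and ``Flags`` bitfields described above can be unpacked
with plain masks and shifts. The following is a minimal sketch, not LLVM's
implementation. It assumes that the bitfields are allocated starting from the
least significant bit and that ``NewVersionFormat`` is set. The names
``BuildNumberInfo``, ``DbiFlagsInfo``, ``decodeBuildNumber`` and
``decodeDbiFlags`` are hypothetical.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Decoded view of the 16-bit BuildNumber field (hypothetical helper type).
  struct BuildNumberInfo {
    unsigned Minor;        // bits 0-7
    unsigned Major;        // bits 8-14
    bool NewVersionFormat; // bit 15
  };

  // Decoded view of the 16-bit Flags field (hypothetical helper type).
  struct DbiFlagsInfo {
    bool WasIncrementallyLinked;    // bit 0
    bool ArePrivateSymbolsStripped; // bit 1
    bool HasConflictingTypes;       // bit 2
  };

  // ASSUMPTION: bitfields are laid out from the least significant bit upward.
  inline BuildNumberInfo decodeBuildNumber(uint16_t Raw) {
    return {Raw & 0xFFu, (Raw >> 8) & 0x7Fu, (Raw & 0x8000u) != 0};
  }

  inline DbiFlagsInfo decodeDbiFlags(uint16_t Raw) {
    return {(Raw & 0x1u) != 0, (Raw & 0x2u) != 0, (Raw & 0x4u) != 0};
  }

Under these assumptions, a raw ``BuildNumber`` of ``0x8C00`` decodes to major
version 12, minor version 0, with ``NewVersionFormat`` set.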
|
||||
|
||||
Immediately after the fixed-size DBI Stream header are ``7`` variable-length
|
||||
`substreams`. The following ``7`` fields of the DBI Stream header specify the
|
||||
number of bytes of the corresponding substream. Each substream's contents will
|
||||
be described in detail :ref:`below <dbi_substreams>`. The length of the entire
|
||||
DBI Stream should equal ``64`` (the length of the header above) plus the value
|
||||
of each of the following ``7`` fields.
|
||||
|
||||
- **ModInfoSize** - The length of the :ref:`dbi_mod_info_substream`.
|
||||
|
||||
- **SectionContributionSize** - The length of the :ref:`dbi_sec_contr_substream`.
|
||||
|
||||
- **SectionMapSize** - The length of the :ref:`dbi_section_map_substream`.
|
||||
|
||||
- **SourceInfoSize** - The length of the :ref:`dbi_file_info_substream`.
|
||||
|
||||
- **TypeServerSize** - The length of the :ref:`dbi_type_server_substream`.
|
||||
|
||||
- **OptionalDbgHeaderSize** - The length of the :ref:`dbi_optional_dbg_stream`.
|
||||
|
||||
- **ECSubstreamSize** - The length of the :ref:`dbi_ec_substream`.
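Since the substreams follow the header back to back, the seven size fields are
enough to locate each of them. The sketch below assumes the substreams appear
in the order in which they are described in the Substreams section (module
info, section contribution, section map, file info, type server, EC, optional
debug header). ``DbiSubstreamSizes``, ``DbiSubstreamOffsets`` and
``computeDbiOffsets`` are hypothetical names used only for illustration.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // The seven size fields of DbiStreamHeader, in assumed on-disk order.
  struct DbiSubstreamSizes {
    int32_t ModInfoSize;
    int32_t SectionContributionSize;
    int32_t SectionMapSize;
    int32_t SourceInfoSize;
    int32_t TypeServerSize;
    int32_t ECSubstreamSize;
    int32_t OptionalDbgHeaderSize;
  };

  // Byte offsets of each substream from the start of the DBI stream.
  struct DbiSubstreamOffsets {
    uint32_t ModInfo, SectionContribution, SectionMap, SourceInfo;
    uint32_t TypeServer, EC, OptionalDbgHeader;
    uint32_t End; // expected total length of the DBI stream
  };

  constexpr uint32_t DbiHeaderSize = 64; // length of DbiStreamHeader

  inline DbiSubstreamOffsets computeDbiOffsets(const DbiSubstreamSizes &S) {
    // A negative size would indicate a corrupt header.
    assert(S.ModInfoSize >= 0 && S.SectionContributionSize >= 0 &&
           S.SectionMapSize >= 0 && S.SourceInfoSize >= 0 &&
           S.TypeServerSize >= 0 && S.ECSubstreamSize >= 0 &&
           S.OptionalDbgHeaderSize >= 0);
    DbiSubstreamOffsets O{};
    O.ModInfo = DbiHeaderSize;
    O.SectionContribution = O.ModInfo + uint32_t(S.ModInfoSize);
    O.SectionMap = O.SectionContribution + uint32_t(S.SectionContributionSize);
    O.SourceInfo = O.SectionMap + uint32_t(S.SectionMapSize);
    O.TypeServer = O.SourceInfo + uint32_t(S.SourceInfoSize);
    O.EC = O.TypeServer + uint32_t(S.TypeServerSize);
    O.OptionalDbgHeader = O.EC + uint32_t(S.ECSubstreamSize);
    O.End = O.OptionalDbgHeader + uint32_t(S.OptionalDbgHeaderSize);
    return O;
  }

A reader could compare ``End`` against the actual length of the DBI stream as
a basic consistency check.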
|
||||
|
||||
.. _dbi_substreams:
|
||||
|
||||
Substreams
|
||||
==========
|
||||
|
||||
.. _dbi_mod_info_substream:
|
||||
|
||||
Module Info Substream
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Begins at offset ``0`` immediately after the :ref:`header <dbi_header>`. The
module info substream is an array of variable-length records, each one
describing a single module (e.g. object file) linked into the program. Each
record in the array is a ``ModInfo`` structure, which embeds a
``SectionContribEntry``. The ``SectionContribEntry`` structure has the format:
|
||||
|
||||
.. code-block:: c++

  struct SectionContribEntry {
    uint16_t Section;
    char Padding1[2];
    int32_t Offset;
    int32_t Size;
    uint32_t Characteristics;
    uint16_t ModuleIndex;
    char Padding2[2];
    uint32_t DataCrc;
    uint32_t RelocCrc;
  };
|
||||
|
||||
While most of these fields are self-explanatory, the ``Characteristics`` field
warrants some elaboration. It corresponds to the ``Characteristics``
field of the `IMAGE_SECTION_HEADER <https://msdn.microsoft.com/en-us/library/windows/desktop/ms680341(v=vs.85).aspx>`__
structure.

Each module info record has the following layout:
|
||||
|
||||
.. code-block:: c++

  struct ModInfo {
    uint32_t Unused1;
    SectionContribEntry SectionContr;
    uint16_t Flags;
    uint16_t ModuleSymStream;
    uint32_t SymByteSize;
    uint32_t C11ByteSize;
    uint32_t C13ByteSize;
    uint16_t SourceFileCount;
    char Padding[2];
    uint32_t Unused2;
    uint32_t SourceFileNameIndex;
    uint32_t PdbFilePathNameIndex;
    char ModuleName[];
    char ObjFileName[];
  };
|
||||
|
||||
- **SectionContr** - Describes the properties of the section in the final binary
  which contains the code and data from this module.
|
||||
|
||||
- **Flags** - A bitfield with the following format:
|
||||
|
||||
.. code-block:: c++

  // true if this ModInfo has been written since reading the PDB.
  uint16_t Dirty : 1;
  // true if EC information is present for this module. It is unknown what
  // EC actually is.
  uint16_t EC : 1;
  uint16_t Unused : 6;
  // Type Server Index for this module. It is unknown what this is used for,
  // but it is not used by LLVM.
  uint16_t TSM : 8;
|
||||
|
||||
|
||||
- **ModuleSymStream** - The index of the stream that contains symbol information
|
||||
for this module. This includes CodeView symbol information as well as source
|
||||
and line information.
|
||||
|
||||
- **SymByteSize** - The number of bytes of data from the stream identified by
|
||||
``ModuleSymStream`` that represent CodeView symbol records.
|
||||
|
||||
- **C11ByteSize** - The number of bytes of data from the stream identified by
|
||||
``ModuleSymStream`` that represent C11-style CodeView line information.
|
||||
|
||||
- **C13ByteSize** - The number of bytes of data from the stream identified by
|
||||
``ModuleSymStream`` that represent C13-style CodeView line information. At
|
||||
most one of ``C11ByteSize`` and ``C13ByteSize`` will be non-zero.
|
||||
|
||||
- **SourceFileCount** - The number of source files that contributed to this
|
||||
module during compilation.
|
||||
|
||||
- **SourceFileNameIndex** - The offset in the names buffer of the primary
|
||||
translation unit used to build this module. All PDB files observed to date
|
||||
always have this value equal to 0.
|
||||
|
||||
- **PdbFilePathNameIndex** - The offset in the names buffer of the PDB file
|
||||
containing this module's symbol information. This has only been observed
|
||||
to be non-zero for the special ``* Linker *`` module.
|
||||
|
||||
- **ModuleName** - The module name. This is usually either a full path to an
|
||||
object file (either directly passed to ``link.exe`` or from an archive) or
|
||||
a string of the form ``Import:<dll name>``.
|
||||
|
||||
- **ObjFileName** - The object file name. In the case of a module that is
  passed directly to ``link.exe``, this is the same as **ModuleName**.
|
||||
In the case of a module that comes from an archive, this is usually the full
|
||||
path to the archive.
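Because each ``ModInfo`` record ends with two null-terminated strings, the
array can only be walked sequentially. The sketch below collects the two names
of every record. It relies on the fixed part of ``ModInfo`` being 64 bytes (the
sum of the fields listed above), and it assumes that each record is padded so
that the next one starts on a 4-byte boundary, which is not stated in this
document. ``ModuleNames``, ``readCString`` and ``readModuleNames`` are
hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <cstring>
  #include <string>
  #include <vector>

  struct ModuleNames {
    std::string ModuleName;
    std::string ObjFileName;
  };

  constexpr size_t ModInfoFixedSize = 64; // bytes preceding ModuleName

  // Reads a null-terminated string at Pos and advances Pos past the
  // terminator. Returns false if no terminator is found before Size.
  inline bool readCString(const uint8_t *Data, size_t Size, size_t &Pos,
                          std::string &Out) {
    if (Pos >= Size)
      return false;
    const void *Nul = std::memchr(Data + Pos, 0, Size - Pos);
    if (!Nul)
      return false;
    size_t Len = static_cast<const uint8_t *>(Nul) - (Data + Pos);
    Out.assign(reinterpret_cast<const char *>(Data + Pos), Len);
    Pos += Len + 1;
    return true;
  }

  // Walks the module info substream and returns the names of every record.
  // Stops at the first record that does not fit in the buffer.
  inline std::vector<ModuleNames> readModuleNames(const uint8_t *Data,
                                                  size_t Size) {
    std::vector<ModuleNames> Result;
    size_t Pos = 0;
    while (Pos <= Size && Size - Pos > ModInfoFixedSize) {
      size_t Cursor = Pos + ModInfoFixedSize;
      ModuleNames Names;
      if (!readCString(Data, Size, Cursor, Names.ModuleName) ||
          !readCString(Data, Size, Cursor, Names.ObjFileName))
        break;
      Result.push_back(Names);
      // ASSUMPTION: records are padded to a multiple of 4 bytes.
      Pos = (Cursor + 3) & ~size_t(3);
    }
    return Result;
  }

The fixed-size fields are skipped here for brevity. A real reader would decode
them before moving on to the strings.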
|
||||
|
||||
.. _dbi_sec_contr_substream:
|
||||
|
||||
Section Contribution Substream
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_mod_info_substream` ends,
|
||||
and consumes ``Header->SectionContributionSize`` bytes. This substream begins
|
||||
with a single ``uint32_t`` which will be one of the following values:
|
||||
|
||||
.. code-block:: c++

  enum class SectionContrSubstreamVersion : uint32_t {
    Ver60 = 0xeffe0000 + 19970605,
    V2 = 0xeffe0000 + 20140516
  };
|
||||
|
||||
``Ver60`` is the only value which has been observed in a PDB so far. Following
|
||||
this ``4`` byte field is an array of fixed-length structures. If the version
|
||||
is ``Ver60``, it is an array of ``SectionContribEntry`` structures. If the
|
||||
version is ``V2``, it is an array of ``SectionContribEntry2`` structures,
|
||||
defined as follows:
|
||||
|
||||
.. code-block:: c++

  struct SectionContribEntry2 {
    SectionContribEntry SC;
    uint32_t ISectCoff;
  };
|
||||
|
||||
The purpose of the second field is not well understood.
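The version field therefore selects the stride of the array that follows it.
As a rough sketch, the number of entries can be derived from the substream
size. The entry sizes of 28 and 32 bytes follow from the structure definitions
above, assuming no compiler-inserted padding. ``countSectionContribs`` is a
hypothetical helper.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  constexpr uint32_t SCVer60 = 0xeffe0000u + 19970605u;
  constexpr uint32_t SCV2 = 0xeffe0000u + 20140516u;

  // Returns the number of entries in a section contribution substream of
  // Size bytes, or -1 if the version is unknown or the size is inconsistent.
  inline long countSectionContribs(const uint8_t *Data, size_t Size) {
    if (Size < 4)
      return -1;
    // The version is stored as a little-endian uint32_t.
    uint32_t Version = uint32_t(Data[0]) | uint32_t(Data[1]) << 8 |
                       uint32_t(Data[2]) << 16 | uint32_t(Data[3]) << 24;
    size_t EntrySize;
    if (Version == SCVer60)
      EntrySize = 28; // sizeof(SectionContribEntry)
    else if (Version == SCV2)
      EntrySize = 32; // sizeof(SectionContribEntry2)
    else
      return -1;
    size_t Payload = Size - 4;
    if (Payload % EntrySize != 0)
      return -1;
    return static_cast<long>(Payload / EntrySize);
  }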
|
||||
|
||||
|
||||
.. _dbi_section_map_substream:
|
||||
|
||||
Section Map Substream
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_sec_contr_substream` ends,
|
||||
and consumes ``Header->SectionMapSize`` bytes. This substream begins with a ``4``
|
||||
byte header followed by an array of fixed-length records. The header and records
|
||||
have the following layout:
|
||||
|
||||
.. code-block:: c++

  struct SectionMapHeader {
    uint16_t Count;    // Number of segment descriptors
    uint16_t LogCount; // Number of logical segment descriptors
  };

  struct SectionMapEntry {
    uint16_t Flags;         // See the SectionMapEntryFlags enum below.
    uint16_t Ovl;           // Logical overlay number
    uint16_t Group;         // Group index into descriptor array.
    uint16_t Frame;
    uint16_t SectionName;   // Byte index of segment / group name in string table, or 0xFFFF.
    uint16_t ClassName;     // Byte index of class in string table, or 0xFFFF.
    uint32_t Offset;        // Byte offset of the logical segment within physical segment.
                            // If group is set in flags, this is the offset of the group.
    uint32_t SectionLength; // Byte count of the segment or group.
  };

  enum class SectionMapEntryFlags : uint16_t {
    Read = 1 << 0,              // Segment is readable.
    Write = 1 << 1,             // Segment is writable.
    Execute = 1 << 2,           // Segment is executable.
    AddressIs32Bit = 1 << 3,    // Descriptor describes a 32-bit linear address.
    IsSelector = 1 << 8,        // Frame represents a selector.
    IsAbsoluteAddress = 1 << 9, // Frame represents an absolute address.
    IsGroup = 1 << 10           // If set, descriptor represents a group.
  };
|
||||
|
||||
Many of these fields are not well understood, so will not be discussed further.
|
||||
|
||||
.. _dbi_file_info_substream:
|
||||
|
||||
File Info Substream
|
||||
^^^^^^^^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_section_map_substream` ends,
|
||||
and consumes ``Header->SourceInfoSize`` bytes. This substream defines the mapping
|
||||
from module to the source files that contribute to that module. Since multiple
|
||||
modules can use the same source file (for example, a header file), this substream
|
||||
uses a string table to store each unique file name only once, and then have each
|
||||
module use offsets into the string table rather than embedding the string's value
|
||||
directly. The format of this substream is as follows:
|
||||
|
||||
.. code-block:: c++

  struct FileInfoSubstream {
    uint16_t NumModules;
    uint16_t NumSourceFiles;

    uint16_t ModIndices[NumModules];
    uint16_t ModFileCounts[NumModules];
    uint32_t FileNameOffsets[NumSourceFiles];
    char NamesBuffer[][NumSourceFiles];
  };
|
||||
|
||||
**NumModules** - The number of modules for which source file information is
contained within this substream. Should match the corresponding value from the
:ref:`dbi_header`.

**NumSourceFiles** - In theory this is supposed to contain the number of source
files for which this substream contains information. But that would present a
problem, in that the ``16``-bit width of this field would prevent one from
having more than 64K source files in a program. In early versions of the file
format, this seems to have been the case. In order to support more than this,
this field is simply ignored, and the number of source files is computed
dynamically by summing up the values of the ``ModFileCounts`` array (discussed
below). In short, this value should be ignored.

**ModIndices** - This array is present, but does not appear to be useful.

**ModFileCounts** - An array of ``NumModules`` integers, each one containing
the number of source files which contribute to the module at the specified index.
While each individual module is limited to 64K contributing source files, the
union of all modules' source files may be greater than 64K. The real number of
source files is thus computed by summing this array. Note that summing this array
does not give the number of `unique` source files, only the total number of source
file contributions to modules.

**FileNameOffsets** - An array of **NumSourceFiles** integers (where **NumSourceFiles**
here refers to the 32-bit value obtained from summing **ModFileCounts**), where
each integer is an offset into **NamesBuffer** pointing to a null-terminated string.

**NamesBuffer** - An array of null-terminated strings containing the actual source
file names.
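Putting the fields together, a reader ignores ``NumSourceFiles``, sums the
per-module counts, and then resolves each offset against the names buffer. The
sketch below does this for an in-memory copy of the substream. It assumes that
the entries of ``FileNameOffsets`` are grouped by module, in module order,
which this document implies but does not state. ``readFileInfo`` and the
``readLE16`` / ``readLE32`` helpers are hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <cstring>
  #include <string>
  #include <utility>
  #include <vector>

  inline uint16_t readLE16(const uint8_t *P) {
    return uint16_t(P[0] | (P[1] << 8));
  }

  inline uint32_t readLE32(const uint8_t *P) {
    return uint32_t(P[0]) | uint32_t(P[1]) << 8 | uint32_t(P[2]) << 16 |
           uint32_t(P[3]) << 24;
  }

  // Returns, for each module, the list of contributing source file names.
  // Returns an empty result if the substream is malformed.
  inline std::vector<std::vector<std::string>>
  readFileInfo(const uint8_t *Data, size_t Size) {
    std::vector<std::vector<std::string>> Result;
    if (Size < 4)
      return Result;
    size_t NumModules = readLE16(Data);
    // The NumSourceFiles field at Data + 2 is deliberately ignored.
    size_t CountsOff = 4 + 2 * NumModules; // skips ModIndices
    size_t OffsetsOff = CountsOff + 2 * NumModules;
    if (OffsetsOff > Size)
      return Result;
    size_t Total = 0; // real number of source file contributions
    for (size_t I = 0; I < NumModules; ++I)
      Total += readLE16(Data + CountsOff + 2 * I);
    size_t NamesOff = OffsetsOff + 4 * Total;
    if (NamesOff > Size)
      return Result;
    size_t Next = 0; // index into FileNameOffsets
    for (size_t I = 0; I < NumModules; ++I) {
      std::vector<std::string> Files;
      size_t Count = readLE16(Data + CountsOff + 2 * I);
      for (size_t J = 0; J < Count; ++J, ++Next) {
        size_t NameOff = NamesOff + readLE32(Data + OffsetsOff + 4 * Next);
        if (NameOff >= Size ||
            !std::memchr(Data + NameOff, 0, Size - NameOff))
          return {};
        Files.emplace_back(reinterpret_cast<const char *>(Data + NameOff));
      }
      Result.push_back(std::move(Files));
    }
    return Result;
  }

Two modules that include the same header end up with the same offset in
``FileNameOffsets``, so the name itself is stored only once.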
|
||||
|
||||
.. _dbi_type_server_substream:
|
||||
|
||||
Type Server Substream
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_file_info_substream` ends,
|
||||
and consumes ``Header->TypeServerSize`` bytes. Neither the purpose nor the layout
|
||||
of this substream is understood, although it is assumed to be related somehow to the
|
||||
usage of ``/Zi`` and ``mspdbsrv.exe``. This substream will not be discussed further.
|
||||
|
||||
.. _dbi_ec_substream:
|
||||
|
||||
EC Substream
|
||||
^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_type_server_substream` ends,
|
||||
and consumes ``Header->ECSubstreamSize`` bytes. Neither the purpose nor the layout
|
||||
of this substream is understood, and it will not be discussed further.
|
||||
|
||||
.. _dbi_optional_dbg_stream:
|
||||
|
||||
Optional Debug Header Stream
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
Begins at offset ``0`` immediately after the :ref:`dbi_ec_substream` ends, and
|
||||
consumes ``Header->OptionalDbgHeaderSize`` bytes. This field is an array of
|
||||
stream indices (i.e. ``uint16_t``'s), each of which identifies a stream
|
||||
index in the larger MSF file which contains some additional debug information.
|
||||
Each position of this array has a special meaning, allowing one to determine
|
||||
what kind of debug information is at the referenced stream. ``11`` indices
|
||||
are currently understood, although it's possible there may be more. The
|
||||
layout of each stream generally corresponds exactly to a particular type
|
||||
of debug data directory from the PE/COFF file. The format of these fields
|
||||
can be found in the `Microsoft PE/COFF Specification <https://www.microsoft.com/en-us/download/details.aspx?id=19509>`__.
|
||||
|
||||
**FPO Data** - ``DbgStreamArray[0]``. The data in the referenced stream is a
|
||||
debug data directory of type ``IMAGE_DEBUG_TYPE_FPO``.
|
||||
|
||||
**Exception Data** - ``DbgStreamArray[1]``. The data in the referenced stream
|
||||
is a debug data directory of type ``IMAGE_DEBUG_TYPE_EXCEPTION``.
|
||||
|
||||
**Fixup Data** - ``DbgStreamArray[2]``. The data in the referenced stream is a
|
||||
debug data directory of type ``IMAGE_DEBUG_TYPE_FIXUP``.
|
||||
|
||||
**Omap To Src Data** - ``DbgStreamArray[3]``. The data in the referenced stream
|
||||
is a debug data directory of type ``IMAGE_DEBUG_TYPE_OMAP_TO_SRC``. This
|
||||
is used for mapping addresses between instrumented and uninstrumented code.
|
||||
|
||||
**Omap From Src Data** - ``DbgStreamArray[4]``. The data in the referenced stream
|
||||
is a debug data directory of type ``IMAGE_DEBUG_TYPE_OMAP_FROM_SRC``. This
|
||||
is used for mapping addresses between instrumented and uninstrumented code.
|
||||
|
||||
**Section Header Data** - ``DbgStreamArray[5]``. A dump of all section headers from
|
||||
the original executable.
|
||||
|
||||
**Token / RID Map** - ``DbgStreamArray[6]``. The layout of this stream is not
|
||||
understood, but it is assumed to be a mapping from ``CLR Token`` to
|
||||
``CLR Record ID``. Refer to `ECMA 335 <http://www.ecma-international.org/publications/standards/Ecma-335.htm>`__
|
||||
for more information.
|
||||
|
||||
**Xdata** - ``DbgStreamArray[7]``. A copy of the ``.xdata`` section from the
|
||||
executable.
|
||||
|
||||
**Pdata** - ``DbgStreamArray[8]``. This is assumed to be a copy of the ``.pdata``
|
||||
section from the executable, but that would make it identical to
|
||||
``DbgStreamArray[1]``. The difference between these two indices is not well
|
||||
understood.
|
||||
|
||||
**New FPO Data** - ``DbgStreamArray[9]``. The data in the referenced stream is a
|
||||
debug data directory of type ``IMAGE_DEBUG_TYPE_FPO``. It is not clear how this
|
||||
differs from ``DbgStreamArray[0]``, but in practice all observed PDB files have
|
||||
used the "new" format rather than the "old" format.
|
||||
|
||||
**Original Section Header Data** - ``DbgStreamArray[10]``. Assumed to be similar
|
||||
to ``DbgStreamArray[5]``, but has not been observed in practice.
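The positions listed above map naturally onto an enumeration. The sketch below
decodes the array and checks whether a given slot refers to a stream. It
assumes that an absent stream is marked with the index ``0xFFFF``, which this
document does not state. ``DbgHeaderType``, ``readDbgStreamArray`` and
``hasDbgStream`` are hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // One enumerator per position of DbgStreamArray, in the order listed above.
  enum class DbgHeaderType : uint16_t {
    FPO = 0,
    Exception,
    Fixup,
    OmapToSrc,
    OmapFromSrc,
    SectionHdr,
    TokenRidMap,
    Xdata,
    Pdata,
    NewFPO,
    SectionHdrOrig
  };

  // ASSUMPTION: this value marks a slot that has no associated stream.
  constexpr uint16_t InvalidStreamIndex = 0xFFFF;

  // Decodes the substream into an array of little-endian uint16_t indices.
  inline std::vector<uint16_t> readDbgStreamArray(const uint8_t *Data,
                                                  size_t Size) {
    std::vector<uint16_t> Streams;
    for (size_t I = 0; I + 1 < Size; I += 2)
      Streams.push_back(uint16_t(Data[I] | (Data[I + 1] << 8)));
    return Streams;
  }

  // Returns true if the slot exists and refers to a stream.
  inline bool hasDbgStream(const std::vector<uint16_t> &Streams,
                           DbgHeaderType Type) {
    size_t Slot = static_cast<size_t>(Type);
    return Slot < Streams.size() && Streams[Slot] != InvalidStreamIndex;
  }

The bounds check matters because the array may be shorter than 11 entries, as
its length is given only by ``Header->OptionalDbgHeaderSize``.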
|
3
external/llvm/docs/PDB/GlobalStream.rst
vendored
@ -1,3 +0,0 @@
|
||||
=====================================
|
||||
The PDB Global Symbol Stream
|
||||
=====================================
|
3
external/llvm/docs/PDB/HashStream.rst
vendored
@ -1,3 +0,0 @@
|
||||
=====================================
|
||||
The TPI & IPI Hash Streams
|
||||
=====================================
|
80
external/llvm/docs/PDB/ModiStream.rst
vendored
@ -1,80 +0,0 @@
|
||||
=====================================
|
||||
The Module Information Stream
|
||||
=====================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. _modi_stream_intro:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
The Module Info Stream (henceforth referred to as the Modi stream) contains
information about a single module (object file, import library, etc.) that
contributes to the binary this PDB contains debug information about. There
is one modi stream for each module, and the mapping between modi stream index
and module is contained in the :doc:`DBI Stream <DbiStream>`. The modi stream
for a single module contains line information for the compiland, as well as
all CodeView information for the symbols defined in the compiland. Finally,
there is a "global refs" substream which is not well understood.
|
||||
|
||||
.. _modi_stream_layout:
|
||||
|
||||
Stream Layout
|
||||
=============
|
||||
|
||||
A modi stream is laid out as follows:
|
||||
|
||||
|
||||
.. code-block:: c++

  struct ModiStream {
    uint32_t Signature;
    uint8_t Symbols[SymbolSize-4];
    uint8_t C11LineInfo[C11Size];
    uint8_t C13LineInfo[C13Size];

    uint32_t GlobalRefsSize;
    uint8_t GlobalRefs[GlobalRefsSize];
  };
|
||||
|
||||
- **Signature** - Unknown. In practice only the value of ``4`` has been
|
||||
observed. It is hypothesized that this value corresponds to the set of
|
||||
``CV_SIGNATURE_xx`` defines in ``cvinfo.h``, with the value of ``4``
|
||||
meaning that this module has C13 line information (as opposed to C11 line
|
||||
information). A corollary of this is that we expect to only ever see
|
||||
C13 line info, and that we do not understand the format of C11 line info.
|
||||
|
||||
- **Symbols** - The :ref:`CodeView Symbol Substream <modi_symbol_substream>`.
|
||||
``SymbolSize`` is equal to the value of ``SymByteSize`` for the
|
||||
corresponding module's entry in the :ref:`Module Info Substream <dbi_mod_info_substream>`
|
||||
of the :doc:`DBI Stream <DbiStream>`.
|
||||
|
||||
- **C11LineInfo** - A block containing CodeView line information in C11
|
||||
format. ``C11Size`` is equal to the value of ``C11ByteSize`` from the
|
||||
:ref:`Module Info Substream <dbi_mod_info_substream>` of the
|
||||
:doc:`DBI Stream <DbiStream>`. If this value is ``0``, then C11 line
|
||||
information is not present. As mentioned previously, the format of
C11 line info is not understood and we assume all line information in
modern PDBs to be in C13 format.
|
||||
|
||||
- **C13LineInfo** - A block containing CodeView line information in C13
|
||||
format. ``C13Size`` is equal to the value of ``C13ByteSize`` from the
|
||||
:ref:`Module Info Substream <dbi_mod_info_substream>` of the
|
||||
:doc:`DBI Stream <DbiStream>`. If this value is ``0``, then C13 line
|
||||
information is not present.
|
||||
|
||||
- **GlobalRefs** - The meaning of this substream is not understood.
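The three sizes recorded in the module's ``ModInfo`` entry fix the position of
every block in the modi stream. The sketch below computes those positions. It
assumes ``SymByteSize`` is at least ``4``, so that it covers the ``Signature``
field. The case of a module with no symbols is not covered by the layout
above. ``ModiLayout`` and ``computeModiLayout`` are hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Byte offsets within a single modi stream.
  struct ModiLayout {
    uint32_t SymbolsOffset;        // first byte after Signature
    uint32_t SymbolsSize;          // SymByteSize - 4
    uint32_t C11Offset;            // start of C11LineInfo
    uint32_t C13Offset;            // start of C13LineInfo
    uint32_t GlobalRefsSizeOffset; // position of the GlobalRefsSize field
  };

  inline ModiLayout computeModiLayout(uint32_t SymByteSize,
                                      uint32_t C11ByteSize,
                                      uint32_t C13ByteSize) {
    // ASSUMPTION: SymByteSize includes the 4-byte Signature.
    assert(SymByteSize >= 4);
    ModiLayout L{};
    L.SymbolsOffset = 4;
    L.SymbolsSize = SymByteSize - 4;
    L.C11Offset = SymByteSize;
    L.C13Offset = L.C11Offset + C11ByteSize;
    L.GlobalRefsSizeOffset = L.C13Offset + C13ByteSize;
    return L;
  }

For a module with ``SymByteSize = 104``, ``C11ByteSize = 0`` and
``C13ByteSize = 60``, the symbols occupy bytes 4 through 103, the C13 line
information starts at offset 104, and ``GlobalRefsSize`` is found at offset
164.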
|
||||
|
||||
.. _modi_symbol_substream:
|
||||
|
||||
The CodeView Symbol Substream
|
||||
=============================
|
||||
|
||||
The CodeView Symbol Substream is an array of variable-length
records describing the functions, variables, inlining information,
and other symbols defined in the compiland. The entire array consumes
``SymbolSize-4`` bytes. The format of a CodeView Symbol Record (and
thus of an array of CodeView Symbol Records) is described in
:doc:`CodeViewSymbols`.
|
121
external/llvm/docs/PDB/MsfFile.rst
vendored
@ -1,121 +0,0 @@
|
||||
=====================================
|
||||
The MSF File Format
|
||||
=====================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. _msf_superblock:
|
||||
|
||||
The Superblock
|
||||
==============
|
||||
At file offset 0 in an MSF file is the MSF *SuperBlock*, which is laid out as
|
||||
follows:
|
||||
|
||||
.. code-block:: c++

  struct SuperBlock {
    char FileMagic[sizeof(Magic)];
    ulittle32_t BlockSize;
    ulittle32_t FreeBlockMapBlock;
    ulittle32_t NumBlocks;
    ulittle32_t NumDirectoryBytes;
    ulittle32_t Unknown;
    ulittle32_t BlockMapAddr;
  };
|
||||
|
||||
- **FileMagic** - Must be equal to ``"Microsoft C/C++ MSF 7.00\r\n"``
  followed by the bytes ``1A 44 53 00 00 00``.
|
||||
- **BlockSize** - The block size of the internal file system. Valid values are
|
||||
512, 1024, 2048, and 4096 bytes. Certain aspects of the MSF file layout vary
|
||||
depending on the block sizes. For the purposes of LLVM, we handle only block
|
||||
sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
|
||||
- **FreeBlockMapBlock** - The index of a block within the file, at which begins
|
||||
a bitfield representing the set of all blocks within the file which are "free"
|
||||
(i.e. the data within that block is not used). This bitfield is spread across
|
||||
the MSF file at ``BlockSize`` intervals.
|
||||
**Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``! This field
is designed to support incremental and atomic updates of the underlying MSF
file. While writing to an MSF file, if the value of this field is ``1``, you
can write your new modified bitfield to block 2, and vice versa. Only when
you commit the file to disk do you need to swap the value in the SuperBlock
to point to the new ``FreeBlockMapBlock``.
|
||||
- **NumBlocks** - The total number of blocks in the file. ``NumBlocks * BlockSize``
|
||||
should equal the size of the file on disk.
|
||||
- **NumDirectoryBytes** - The size of the stream directory, in bytes. The stream
|
||||
directory contains information about each stream's size and the set of blocks
|
||||
that it occupies. It will be described in more detail later.
|
||||
- **BlockMapAddr** - The index of a block within the MSF file. At this block is
|
||||
an array of ``ulittle32_t``'s listing the blocks that the stream directory
|
||||
resides on. For large MSF files, the stream directory (which describes the
|
||||
block layout of each stream) may not fit entirely on a single block. As a
|
||||
result, this extra layer of indirection is introduced, whereby this block
|
||||
contains the list of blocks that the stream directory occupies, and the stream
|
||||
directory itself can be stitched together accordingly. The number of
|
||||
``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes / BlockSize)``.
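The constraints listed above translate into a few cheap sanity checks that a
reader can run before trusting the rest of the file. The following is a
sketch, not LLVM's validation logic. It operates on the fields that follow
``FileMagic``, already decoded into host integers. ``SuperBlockFields``,
``isPlausibleSuperBlock`` and ``numDirectoryBlocks`` are hypothetical names,
and the check that ``BlockMapAddr`` lies inside the file is an added
assumption.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // The SuperBlock fields following FileMagic, decoded to host byte order.
  struct SuperBlockFields {
    uint32_t BlockSize;
    uint32_t FreeBlockMapBlock;
    uint32_t NumBlocks;
    uint32_t NumDirectoryBytes;
    uint32_t Unknown;
    uint32_t BlockMapAddr;
  };

  inline bool isPlausibleSuperBlock(const SuperBlockFields &SB,
                                    uint64_t FileSize) {
    if (SB.BlockSize != 512 && SB.BlockSize != 1024 &&
        SB.BlockSize != 2048 && SB.BlockSize != 4096)
      return false;
    if (SB.FreeBlockMapBlock != 1 && SB.FreeBlockMapBlock != 2)
      return false;
    if (uint64_t(SB.NumBlocks) * SB.BlockSize != FileSize)
      return false;
    // ASSUMPTION: the block map must itself be a block of this file.
    return SB.BlockMapAddr < SB.NumBlocks;
  }

  // Number of ulittle32_t entries stored at BlockMapAddr, i.e.
  // ceil(NumDirectoryBytes / BlockSize).
  inline uint32_t numDirectoryBlocks(const SuperBlockFields &SB) {
    assert(SB.BlockSize != 0);
    return uint32_t((uint64_t(SB.NumDirectoryBytes) + SB.BlockSize - 1) /
                    SB.BlockSize);
  }

The 64-bit intermediate values avoid overflow when ``NumBlocks`` or
``NumDirectoryBytes`` is close to the 32-bit limit.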
|
||||
|
||||
The Stream Directory
|
||||
====================
|
||||
The Stream Directory is the root of all access to the other streams in an MSF
|
||||
file. Beginning at byte 0 of the stream directory is the following structure:
|
||||
|
||||
.. code-block:: c++

  struct StreamDirectory {
    ulittle32_t NumStreams;
    ulittle32_t StreamSizes[NumStreams];
    ulittle32_t StreamBlocks[NumStreams][];
  };
|
||||
|
||||
And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
|
||||
Note that each of the last two arrays is of variable length, and in particular
|
||||
that the second array is jagged.
|
||||
|
||||
**Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
|
||||
streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.
|
||||
|
||||
Stream 0: ceil(1000 / 4096) = 1 block
|
||||
|
||||
Stream 1: ceil(8000 / 4096) = 2 blocks
|
||||
|
||||
Stream 2: ceil(16000 / 4096) = 4 blocks
|
||||
|
||||
Stream 3: ceil(9000 / 4096) = 3 blocks
|
||||
|
||||
In total, 10 blocks are used. Let's see what the stream directory might look
|
||||
like:
|
||||
|
||||
.. code-block:: c++

  struct StreamDirectory {
    ulittle32_t NumStreams = 4;
    ulittle32_t StreamSizes[] = {1000, 8000, 16000, 9000};
    ulittle32_t StreamBlocks[][] = {
      {4},
      {5, 6},
      {11, 9, 7, 8},
      {10, 15, 12}
    };
  };
|
||||
|
||||
In total, this occupies ``15 * 4 = 60`` bytes, so ``SuperBlock->NumDirectoryBytes``
would equal ``60``, and the block at ``SuperBlock->BlockMapAddr`` would hold an
array of one ``ulittle32_t``, since ``60 <= SuperBlock->BlockSize``.
|
||||
|
||||
Note also that the streams are discontiguous, and that part of stream 3 is in the
|
||||
middle of part of stream 2. You cannot assume anything about the layout of the
|
||||
blocks!
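Decoding the directory amounts to reading the sizes first and then slicing the
jagged block array accordingly. The sketch below assumes the directory has
already been stitched together and decoded into an array of 32-bit words.
``parseStreamDirectory`` is a hypothetical name.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Splits a decoded stream directory into one block list per stream.
  // Returns an empty result if the directory is truncated.
  inline std::vector<std::vector<uint32_t>>
  parseStreamDirectory(const std::vector<uint32_t> &Words, uint32_t BlockSize) {
    std::vector<std::vector<uint32_t>> Streams;
    if (Words.empty() || BlockSize == 0)
      return Streams;
    size_t NumStreams = Words[0];
    if (Words.size() - 1 < NumStreams)
      return Streams;
    size_t Next = 1 + NumStreams; // first word of StreamBlocks
    for (size_t I = 0; I < NumStreams; ++I) {
      uint64_t StreamSize = Words[1 + I];
      // ceil(StreamSize / BlockSize) blocks belong to this stream.
      size_t NumBlocks = size_t((StreamSize + BlockSize - 1) / BlockSize);
      if (Words.size() - Next < NumBlocks)
        return {};
      const uint32_t *First = Words.data() + Next;
      Streams.emplace_back(First, First + NumBlocks);
      Next += NumBlocks;
    }
    return Streams;
  }

Fed the 15 words of the example above, this yields four block lists, the third
being ``{11, 9, 7, 8}`` and the fourth ``{10, 15, 12}``.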
|
||||
|
||||
Alignment and Block Boundaries
|
||||
==============================
|
||||
As may be clear by now, it is possible for a single field (whether it be a high
level record, a long string field, or even a single ``uint16``) to begin and
end in separate blocks. For example, if the block size is 4096 bytes, and a
``uint16`` field begins at the last byte of the current block, then it would
need to end on the first byte of the next block. Since blocks are not
necessarily contiguously laid out in the file, this means that both the consumer
and the producer of an MSF file must be prepared to split data apart
accordingly. In the aforementioned example, because MSF fields are
little-endian, the low byte of the ``uint16`` would be written to the last byte
of the stream's current block, and the high byte would be written to the first
byte of the stream's next block, which could be tens of thousands of bytes later
(or even earlier!) in the file, depending on what the stream directory says.
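One simple way to cope with this is to translate every stream offset into a
file offset through the stream's block list, one byte at a time. The sketch
below does so for an in-memory copy of the file. It assumes little-endian
field encoding, as the ``ulittle32_t`` field types suggest.
``readStreamBytes`` and ``readStreamU16`` are hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Copies Len bytes starting at stream offset Off into Out, following the
  // stream's block list. Returns false if the range leaves the stream or file.
  inline bool readStreamBytes(const std::vector<uint8_t> &File,
                              const std::vector<uint32_t> &Blocks,
                              uint32_t BlockSize, uint64_t Off, uint8_t *Out,
                              size_t Len) {
    if (BlockSize == 0)
      return false;
    for (size_t I = 0; I < Len; ++I, ++Off) {
      uint64_t BlockIdx = Off / BlockSize;
      if (BlockIdx >= Blocks.size())
        return false;
      uint64_t FileOff = uint64_t(Blocks[BlockIdx]) * BlockSize + Off % BlockSize;
      if (FileOff >= File.size())
        return false;
      Out[I] = File[FileOff];
    }
    return true;
  }

  // Reads a little-endian uint16 that may straddle two blocks.
  inline bool readStreamU16(const std::vector<uint8_t> &File,
                            const std::vector<uint32_t> &Blocks,
                            uint32_t BlockSize, uint64_t Off, uint16_t &Value) {
    uint8_t Bytes[2];
    if (!readStreamBytes(File, Blocks, BlockSize, Off, Bytes, 2))
      return false;
    Value = uint16_t(Bytes[0] | (Bytes[1] << 8));
    return true;
  }

Copying byte by byte keeps the sketch short. A production reader would more
likely copy whole runs within a block at once.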
|
80
external/llvm/docs/PDB/PdbStream.rst
vendored
@ -1,80 +0,0 @@
|
||||
========================================
|
||||
The PDB Info Stream (aka the PDB Stream)
|
||||
========================================
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. _pdb_stream_header:
|
||||
|
||||
Stream Header
|
||||
=============
|
||||
At offset 0 of the PDB Stream is a header with the following layout:
|
||||
|
||||
|
||||
.. code-block:: c++

  struct PdbStreamHeader {
    ulittle32_t Version;
    ulittle32_t Signature;
    ulittle32_t Age;
    Guid UniqueId;
  };
|
||||
|
||||
- **Version** - A value from the following enum:
|
||||
|
||||
.. code-block:: c++

  enum class PdbStreamVersion : uint32_t {
    VC2 = 19941610,
    VC4 = 19950623,
    VC41 = 19950814,
    VC50 = 19960307,
    VC98 = 19970604,
    VC70Dep = 19990604,
    VC70 = 20000404,
    VC80 = 20030901,
    VC110 = 20091201,
    VC140 = 20140508,
  };
|
||||
|
||||
While the meaning of this field appears to be obvious, in practice we have
|
||||
never observed a value other than ``VC70``, even with modern versions of
|
||||
the toolchain, and it is unclear why the other values exist. It is assumed
|
||||
that certain aspects of the PDB stream's layout, and perhaps even that of
|
||||
the other streams, will change if the value is something other than ``VC70``.
|
||||
|
||||
- **Signature** - A 32-bit time-stamp generated with a call to ``time()`` at
|
||||
the time the PDB file is written. Note that due to the inherent uniqueness
|
||||
problems of using a timestamp with 1-second granularity, this field does not
|
||||
really serve its intended purpose, and as such is typically ignored in favor
|
||||
of the ``UniqueId`` (``Guid``) field, described below.
|
||||
|
||||
- **Age** - The number of times the PDB file has been written. This can be used
|
||||
along with ``Guid`` to match the PDB to its corresponding executable.
|
||||
|
||||
- **UniqueId** - A 128-bit ``Guid`` intended to be unique across space and time.
|
||||
In general, this can be thought of as the result of calling the Win32 API
|
||||
`UuidCreate <https://msdn.microsoft.com/en-us/library/windows/desktop/aa379205(v=vs.85).aspx>`__,
|
||||
although LLVM cannot rely on that, as it must work on non-Windows platforms.
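
Decoding the header fields described above can be sketched as follows. This
is not LLVM's reader: ``ParsedPdbStreamHeader``, ``readU32LE``, and
``parsePdbStreamHeader`` are hypothetical names, and the sketch assumes the
header is tightly packed as three little-endian 32-bit fields followed by a
16-byte GUID (28 bytes total).

.. code-block:: c++

  #include <algorithm>
  #include <array>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  // Hypothetical host-side view of the header after decoding.
  struct ParsedPdbStreamHeader {
    uint32_t Version;
    uint32_t Signature;
    uint32_t Age;
    std::array<uint8_t, 16> UniqueId;
  };

  // Decode a little-endian 32-bit value regardless of host endianness.
  inline uint32_t readU32LE(const uint8_t *p) {
    assert(p != nullptr);
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) |
           (uint32_t(p[3]) << 24);
  }

  // Returns false if the buffer is too short to contain the header.
  inline bool parsePdbStreamHeader(const uint8_t *data, std::size_t size,
                                   ParsedPdbStreamHeader &out) {
    constexpr std::size_t kHeaderSize = 3 * 4 + 16; // assumed packed layout
    if (data == nullptr || size < kHeaderSize)
      return false;
    out.Version = readU32LE(data + 0);
    out.Signature = readU32LE(data + 4);
    out.Age = readU32LE(data + 8);
    std::copy(data + 12, data + kHeaderSize, out.UniqueId.begin());
    return true;
  }

The ``data`` pointer is assumed to reference the start of the PDB Stream after
its blocks have been reassembled into contiguous memory.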
|
||||
|
||||
Matching a PDB to its executable
|
||||
================================
|
||||
The linker is responsible for writing both the PDB and the final executable, and
|
||||
as a result is the only entity capable of writing the information necessary to
|
||||
match the PDB to the executable.
|
||||
|
||||
In order to accomplish this, the linker generates a guid for the PDB (or
|
||||
re-uses the existing guid if it is linking incrementally) and increments the Age
|
||||
field.
|
||||
|
||||
The executable is a PE/COFF file, and every PE/COFF file contains a
number of "directories". For our purposes here, we are interested in the "debug
|
||||
directory". The exact format of a debug directory is described by the
|
||||
`IMAGE_DEBUG_DIRECTORY structure <https://msdn.microsoft.com/en-us/library/windows/desktop/ms680307(v=vs.85).aspx>`__.
|
||||
For this particular case, the linker emits a debug directory of type
|
||||
``IMAGE_DEBUG_TYPE_CODEVIEW``. The format of this record is defined in
|
||||
``llvm/DebugInfo/CodeView/CVDebugRecord.h``, but it suffices to say here only
|
||||
that it includes the same ``Guid`` and ``Age`` fields. At runtime, a
|
||||
debugger or tool can scan the COFF executable image for the presence of
|
||||
a debug directory of the correct type and verify that the Guid and Age match.
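
The final verification step can be sketched as a plain comparison. The names
below (``PdbIdentity``, ``pdbMatchesExecutable``) are hypothetical and the
sketch assumes both pairs have already been extracted: one from the PDB Stream
header and one from the executable's CodeView debug directory record.

.. code-block:: c++

  #include <array>
  #include <cassert>
  #include <cstdint>

  // Hypothetical pair of identifying fields, as read from either file.
  struct PdbIdentity {
    std::array<uint8_t, 16> Guid;
    uint32_t Age;
  };

  // The PDB is considered a match only if both the Guid and the Age agree.
  inline bool pdbMatchesExecutable(const PdbIdentity &fromPdb,
                                   const PdbIdentity &fromExe) {
    return fromPdb.Guid == fromExe.Guid && fromPdb.Age == fromExe.Age;
  }

Real tools may apply additional or looser checks; this sketch covers only the
Guid and Age comparison described in this section.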
|
3
external/llvm/docs/PDB/PublicStream.rst
vendored
@ -1,3 +0,0 @@
|
||||
=====================================
|
||||
The PDB Public Symbol Stream
|
||||
=====================================
|
3
external/llvm/docs/PDB/TpiStream.rst
vendored
@ -1,3 +0,0 @@
|
||||
=====================================
|
||||
The PDB TPI Stream
|
||||
=====================================
|
167
external/llvm/docs/PDB/index.rst
vendored
@ -1,167 +0,0 @@
|
||||
=====================================
|
||||
The PDB File Format
|
||||
=====================================
|
||||
|
||||
.. contents::
   :local:
|
||||
|
||||
.. _pdb_intro:
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
PDB (Program Database) is a file format invented by Microsoft that contains
|
||||
debug information that can be consumed by debuggers and other tools. Since
|
||||
officially supported APIs exist on Windows for querying debug information from
|
||||
PDBs even without the user understanding the internals of the file format, a
|
||||
large ecosystem of tools has been built for Windows to consume this format. In
|
||||
order for Clang to be able to generate programs that can interoperate with these
|
||||
tools, it is necessary for us to generate PDB files ourselves.
|
||||
|
||||
At the same time, LLVM has a long history of being able to cross-compile from
|
||||
any platform to any platform, and we wish for the same to be true here. So it
|
||||
is necessary for us to understand the PDB file format at the byte-level so that
|
||||
we can generate PDB files entirely on our own.
|
||||
|
||||
This manual describes what we know about the PDB file format today: the layout
of the file, the various streams contained within, the format of individual
records within, and more.
|
||||
|
||||
We would like to extend our heartfelt gratitude to Microsoft, without whom we
|
||||
would not be where we are today. Much of the knowledge contained within this
|
||||
manual was learned through reading code published by Microsoft on their `GitHub
|
||||
repo <https://github.com/Microsoft/microsoft-pdb>`__.
|
||||
|
||||
.. _pdb_layout:
|
||||
|
||||
File Layout
|
||||
===========
|
||||
|
||||
.. important::
   Unless otherwise specified, all numeric values are encoded in little endian.
   If you see a type such as ``uint16_t`` or ``uint64_t`` going forward, always
   assume it is little endian!

.. toctree::
   :hidden:

   MsfFile
   PdbStream
   TpiStream
   DbiStream
   ModiStream
   PublicStream
   GlobalStream
   HashStream
   CodeViewSymbols
   CodeViewTypes
|
||||
|
||||
.. _msf:
|
||||
|
||||
The MSF Container
|
||||
-----------------
|
||||
A PDB file is really just a special case of an MSF (Multi-Stream Format) file.
|
||||
An MSF file is actually a miniature "file system within a file". It contains
|
||||
multiple streams (aka files) which can represent arbitrary data, and these
|
||||
streams are divided into blocks which may not necessarily be contiguously
|
||||
laid out within the file (aka fragmented). Additionally, the MSF contains a
|
||||
stream directory (aka MFT) which describes how the streams (files) are laid
|
||||
out within the MSF.
|
||||
|
||||
For more information about the MSF container format, stream directory, and
|
||||
block layout, see :doc:`MsfFile`.
|
||||
|
||||
.. _streams:
|
||||
|
||||
Streams
|
||||
-------
|
||||
The PDB format contains a number of streams which describe various information
|
||||
such as the types, symbols, source files, and compilands (e.g. object files)
|
||||
of a program, as well as some additional streams containing hash tables that are
|
||||
used by debuggers and other tools to provide fast lookup of records and types
|
||||
by name, and various other information about how the program was compiled such
|
||||
as the specific toolchain used, and more. A summary of streams contained in a
|
||||
PDB file is as follows:
|
||||
|
||||
+--------------------+------------------------------+-------------------------------------------+
| Name               | Stream Index                 | Contents                                  |
+====================+==============================+===========================================+
| Old Directory      | - Fixed Stream Index 0       | - Previous MSF Stream Directory           |
+--------------------+------------------------------+-------------------------------------------+
| PDB Stream         | - Fixed Stream Index 1       | - Basic File Information                  |
|                    |                              | - Fields to match EXE to this PDB         |
|                    |                              | - Map of named streams to stream indices  |
+--------------------+------------------------------+-------------------------------------------+
| TPI Stream         | - Fixed Stream Index 2       | - CodeView Type Records                   |
|                    |                              | - Index of TPI Hash Stream                |
+--------------------+------------------------------+-------------------------------------------+
| DBI Stream         | - Fixed Stream Index 3       | - Module/Compiland Information            |
|                    |                              | - Indices of individual module streams    |
|                    |                              | - Indices of public / global streams      |
|                    |                              | - Section Contribution Information        |
|                    |                              | - Source File Information                 |
|                    |                              | - FPO / PGO Data                          |
+--------------------+------------------------------+-------------------------------------------+
| IPI Stream         | - Fixed Stream Index 4       | - CodeView Type Records                   |
|                    |                              | - Index of IPI Hash Stream                |
+--------------------+------------------------------+-------------------------------------------+
| /LinkInfo          | - Contained in PDB Stream    | - Unknown                                 |
|                    |   Named Stream map           |                                           |
+--------------------+------------------------------+-------------------------------------------+
| /src/headerblock   | - Contained in PDB Stream    | - Unknown                                 |
|                    |   Named Stream map           |                                           |
+--------------------+------------------------------+-------------------------------------------+
| /names             | - Contained in PDB Stream    | - PDB-wide global string table used for   |
|                    |   Named Stream map           |   string de-duplication                   |
+--------------------+------------------------------+-------------------------------------------+
| Module Info Stream | - Contained in DBI Stream    | - CodeView Symbol Records for this module |
|                    | - One for each compiland     | - Line Number Information                 |
+--------------------+------------------------------+-------------------------------------------+
| Public Stream      | - Contained in DBI Stream    | - Public (Exported) Symbol Records        |
|                    |                              | - Index of Public Hash Stream             |
+--------------------+------------------------------+-------------------------------------------+
| Global Stream      | - Contained in DBI Stream    | - Global Symbol Records                   |
|                    |                              | - Index of Global Hash Stream             |
+--------------------+------------------------------+-------------------------------------------+
| TPI Hash Stream    | - Contained in TPI Stream    | - Hash table for looking up TPI records   |
|                    |                              |   by name                                 |
+--------------------+------------------------------+-------------------------------------------+
| IPI Hash Stream    | - Contained in IPI Stream    | - Hash table for looking up IPI records   |
|                    |                              |   by name                                 |
+--------------------+------------------------------+-------------------------------------------+
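
The fixed stream indices in the table can be captured in a small enum. This is
a sketch only: ``FixedStreamIndex`` and ``isFixedStream`` are hypothetical
names chosen for illustration, and the values are taken directly from the
table.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Streams that always live at the same index in every PDB file.
  enum class FixedStreamIndex : uint32_t {
    OldDirectory = 0,
    Pdb = 1,
    Tpi = 2,
    Dbi = 3,
    Ipi = 4,
  };

  // True if `index` is one of the fixed indices listed in the table. All
  // other streams are located indirectly, through the PDB, DBI, TPI, or IPI
  // streams.
  constexpr bool isFixedStream(uint32_t index) {
    return index <= static_cast<uint32_t>(FixedStreamIndex::Ipi);
  }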
|
||||
|
||||
More information about the structure of each of these can be found on the
|
||||
following pages:
|
||||
|
||||
:doc:`PdbStream`
   Information about the PDB Info Stream and how it is used to match PDBs to EXEs.

:doc:`TpiStream`
   Information about the TPI stream and the CodeView records contained within.

:doc:`DbiStream`
   Information about the DBI stream and relevant substreams including the Module Substreams,
   source file information, and CodeView symbol records contained within.

:doc:`ModiStream`
   Information about the Module Information Stream, of which there is one for each compilation
   unit, and the format of the symbols contained within.

:doc:`PublicStream`
   Information about the Public Symbol Stream.

:doc:`GlobalStream`
   Information about the Global Symbol Stream.

:doc:`HashStream`
   Information about the Hash Table stream, and how it can be used to quickly look up records
   by name.
|
||||
|
||||
CodeView
|
||||
========
|
||||
CodeView is another format which comes into the picture. While MSF defines
|
||||
the structure of the overall file, and PDB defines the set of streams that
|
||||
appear within the MSF file and the format of those streams, CodeView defines
|
||||
the format of **symbol and type records** that appear within specific streams.
|
||||
Refer to the pages on :doc:`CodeViewSymbols` and :doc:`CodeViewTypes` for
|
||||
more information about the CodeView format.
|