Bug 1132771 - Add Files to moz.build with ability to define Bugzilla component; r=glandium

The Files sub-context allows us to attach metadata to files based on
pattern matching rules.

Patterns are matched against files in a last-write-wins fashion.

The sub-context defines the BUG_COMPONENT variable, which is a 2-tuple
(actually a named tuple) defining the Bugzilla product and component for
files. There are no consumers yet. But an eventual use case will be to
suggest a bug component for a patch/commit. Another will be to
automatically suggest a bug component for a failing test.
Gregory Szorc 2015-03-01 22:15:07 -08:00
parent adf19d10b9
commit 49fd3da675
15 changed files with 493 additions and 15 deletions

View File

@ -0,0 +1,178 @@
.. _mozbuild_files_metadata:
==============
Files Metadata
==============
:ref:`mozbuild-files` provide a mechanism for attaching metadata to
files. Essentially, you define some flags to set on a file or file
pattern. Later, some tool or process queries for metadata attached to a
file of interest and it does something intelligent with that data.
Defining Metadata
=================
Files metadata is defined by using the
:ref:`Files Sub-Context <mozbuild_subcontext_Files>` in ``moz.build``
files. e.g.::

    with Files('**/Makefile.in'):
        BUG_COMPONENT = ('Core', 'Build Config')

This working example says, *for all Makefile.in files in this directory
and every directory underneath it, set the Bugzilla component to
Core :: Build Config*.
For more info, read the
:ref:`docs on Files <mozbuild_subcontext_Files>`.
How Metadata is Read
====================
``Files`` metadata is extracted in :ref:`mozbuild_fs_reading_mode`.
Reading starts by specifying a set of files whose metadata you are
interested in. For each file, the filesystem is walked to the root
of the source directory. Any ``moz.build`` files encountered during this
walk are marked as relevant to the file.
Let's say you have the following filesystem content::

    /moz.build
    /root_file
    /dir1/moz.build
    /dir1/foo
    /dir1/subdir1/foo
    /dir2/foo

For ``/root_file``, the only relevant ``moz.build`` file is
``/moz.build``.
For ``/dir1/foo`` and ``/dir1/subdir1/foo``, the relevant files are
``/moz.build`` and ``/dir1/moz.build``.
For ``/dir2/foo``, the only relevant file is ``/moz.build``.
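
The ancestor walk described above can be sketched in a few lines. This is a minimal illustration, not the ``BuildReader`` implementation: the helper name ``relevant_mozbuilds`` and the ``TREE`` set are hypothetical, paths are source-relative without a leading slash, and the sketch targets Python 3.

```python
import posixpath


def relevant_mozbuilds(path, existing):
    """Return the moz.build files in ancestor directories of ``path``.

    ``path`` is a source-relative POSIX path such as 'dir1/subdir1/foo'.
    ``existing`` is the set of moz.build paths present in a (hypothetical)
    source tree. The result is ordered root first, leaf-most last.
    """
    found = []
    directory = posixpath.dirname(path)
    while True:
        candidate = posixpath.join(directory, 'moz.build')
        if candidate in existing:
            found.append(candidate)
        if not directory:
            # We just checked the root of the source directory.
            break
        directory = posixpath.dirname(directory)
    # The walk went from leaf to root; evaluation order is root to leaf.
    found.reverse()
    return found


# Hypothetical tree matching the listing above.
TREE = {'moz.build', 'dir1/moz.build'}
```

With this tree, ``relevant_mozbuilds('dir1/subdir1/foo', TREE)`` yields the root file followed by ``dir1/moz.build``, which is also the order in which the files would be evaluated.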
Once the list of relevant ``moz.build`` files is obtained, each
``moz.build`` file is evaluated. Root ``moz.build`` file first,
leaf-most files last. This follows the rules of
:ref:`mozbuild_fs_reading_mode`, with the set of evaluated ``moz.build``
files being controlled by filesystem content, not ``DIRS`` variables.
The file whose metadata is being resolved maps to a set of ``moz.build``
files, which in turn evaluate to a list of contexts. For file metadata,
we only care about one of these contexts:
:ref:`Files <mozbuild_subcontext_Files>`.
We start with an empty ``Files`` instance to represent the file. As
we encounter a *files sub-context*, we see if it is appropriate to
this file. If it is, we apply its values. This process is repeated
until all *files sub-contexts* have been applied or skipped. The final
state of the ``Files`` instance is used to represent the metadata for
this particular file.
It may help to visualize this. Say we have 2 ``moz.build`` files::

    # /moz.build
    with Files('*.cpp'):
        BUG_COMPONENT = ('Core', 'XPCOM')

    with Files('**/*.js'):
        BUG_COMPONENT = ('Firefox', 'General')

    # /foo/moz.build
    with Files('*.js'):
        BUG_COMPONENT = ('Another', 'Component')

Querying for metadata for the file ``/foo/test.js`` will reveal 3
relevant ``Files`` sub-contexts. They are evaluated as follows:
1. ``/moz.build - Files('*.cpp')``. Does ``/*.cpp`` match
``/foo/test.js``? **No**. Ignore this context.
2. ``/moz.build - Files('**/*.js')``. Does ``/**/*.js`` match
``/foo/test.js``? **Yes**. Apply ``BUG_COMPONENT = ('Firefox', 'General')``
to us.
3. ``/foo/moz.build - Files('*.js')``. Does ``/foo/*.js`` match
``/foo/test.js``? **Yes**. Apply
``BUG_COMPONENT = ('Another', 'Component')``.
At the end of execution, we have
``BUG_COMPONENT = ('Another', 'Component')`` as the metadata for
``/foo/test.js``.
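
The walkthrough above can be reproduced with a small stand-alone sketch. It is an approximation under stated assumptions: ``pattern_to_regex`` is a hypothetical, simplified matcher and not ``mozpath.match``, the ``RULES`` list is a hand-written stand-in for the evaluated ``Files`` sub-contexts, and the code targets Python 3.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified Files pattern into a compiled regex.

    Assumption: ``*`` stays within one directory and ``**`` may span
    several. The real matching is done by ``mozpath.match``.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith('**/', i):
            parts.append('(?:.*/)?')  # zero or more leading directories
            i += 3
        elif pattern.startswith('**', i):
            parts.append('.*')  # anything, across directories
            i += 2
        elif pattern[i] == '*':
            parts.append('[^/]*')  # anything within one path component
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile(''.join(parts) + r'\Z')


# (directory of the moz.build file, pattern, BUG_COMPONENT), root first.
RULES = [
    ('', '*.cpp', ('Core', 'XPCOM')),
    ('', '**/*.js', ('Firefox', 'General')),
    ('foo', '*.js', ('Another', 'Component')),
]


def resolve(path):
    """Apply every matching rule in order; the last write wins."""
    value = None
    for srcdir, pattern, component in RULES:
        prefix = srcdir + '/' if srcdir else ''
        if not path.startswith(prefix):
            continue  # this moz.build is not an ancestor of the file
        relpath = path[len(prefix):]
        if pattern_to_regex(pattern).match(relpath):
            value = component
    return value
```

Calling ``resolve('foo/test.js')`` skips the first rule and applies the second and third in turn, so the value from ``/foo/moz.build`` is the one that remains.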
One way to look at file metadata is as a stack of data structures.
Each ``Files`` sub-context relevant to a given file is applied on top
of the previous state, starting from an empty state. The final state
wins.
.. _mozbuild_files_metadata_finalizing:
Finalizing Values
=================
The default behavior of ``Files`` sub-context evaluation is to apply new
values on top of old. In most circumstances, this results in desired
behavior. However, there are circumstances where this may not be
desired. There is thus a mechanism to *finalize* or *freeze* values.
Finalizing values is useful for scenarios where you want to prevent
wildcard matches from overwriting previously-set values. This is useful
for one-off files.
Let's take ``Makefile.in`` files as an example. The build system module
policy dictates that ``Makefile.in`` files are part of the ``Build
Config`` module and should be reviewed by peers of that module. However,
there exist ``Makefile.in`` files in many directories in the source
tree. Without finalization, a ``*`` or ``**`` wildcard matching rule
would match ``Makefile.in`` files and overwrite their metadata.
Finalizing of values is performed by setting the ``FINAL`` variable
on ``Files`` sub-contexts. See the
:ref:`Files documentation <mozbuild_subcontext_Files>` for more.
Here is an example with ``Makefile.in`` files, showing how it is
possible to finalize the ``BUG_COMPONENT`` value::

    # /moz.build
    with Files('**/Makefile.in'):
        BUG_COMPONENT = ('Core', 'Build Config')
        FINAL = True

    # /foo/moz.build
    with Files('**'):
        BUG_COMPONENT = ('Another', 'Component')

If we query for metadata of ``/foo/Makefile.in``, both ``Files``
sub-contexts match the file pattern. However, since ``BUG_COMPONENT`` is
marked as finalized by ``/moz.build``, the assignment from
``/foo/moz.build`` is ignored. The final value for ``BUG_COMPONENT``
is ``('Core', 'Build Config')``.
Here is another example::

    with Files('*.cpp'):
        BUG_COMPONENT = ('One-Off', 'For C++')
        FINAL = True

    with Files('**'):
        BUG_COMPONENT = ('Regular', 'Component')

For every file except ``.cpp`` files in this directory (e.g. ``foo.cpp``),
the bug component will be resolved as ``Regular :: Component``. However,
``foo.cpp`` has its value of ``One-Off :: For C++`` preserved because it
is finalized.
.. important::
``FINAL`` only applies to variables defined in a context.
If you want to mark one variable as finalized but want to leave
another mutable, you'll need to use 2 ``Files`` contexts.
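
The effect of ``FINAL`` can be illustrated with plain dictionaries. This is a simplified sketch and not the ``Files.__iadd__`` code: ``apply_rule`` and ``resolve`` are hypothetical helpers, each rule is a dict holding the variables set by one matching ``Files`` sub-context, and a truthy ``FINAL`` is assumed to be what triggers finalization. The sketch targets Python 3.

```python
def apply_rule(state, finalized, rule):
    """Apply one matching rule on top of ``state`` in place."""
    for key, value in rule.items():
        # FINAL is a marker, not metadata; frozen keys ignore new writes.
        if key == 'FINAL' or key in finalized:
            continue
        state[key] = value
    if rule.get('FINAL'):
        # Only the variables defined in this rule become frozen.
        finalized.update(key for key in rule if key != 'FINAL')


def resolve(rules):
    """Fold a list of matching rules (root first) into final metadata."""
    state, finalized = {}, set()
    for rule in rules:
        apply_rule(state, finalized, rule)
    return state


# The two sub-contexts that match /foo/Makefile.in in the example above.
MAKEFILE_RULES = [
    {'BUG_COMPONENT': ('Core', 'Build Config'), 'FINAL': True},
    {'BUG_COMPONENT': ('Another', 'Component')},
]
```

Because only the variables named in the finalizing rule are frozen, a rule that sets a different variable later would still be applied, which is why two ``Files`` contexts are needed to finalize one variable while leaving another mutable.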
Guidelines for Defining Metadata
================================
In general, values defined towards the root of the source tree are
generic and become more specific towards the leaves. For example,
the ``BUG_COMPONENT`` for ``/browser`` might be ``Firefox :: General``
whereas ``/browser/components/preferences`` would list
``Firefox :: Preferences``.

View File

@ -13,6 +13,7 @@ Important Concepts
Mozconfig Files <mozconfigs>
mozbuild-files
mozbuild-symbols
files-metadata
Profile Guided Optimization <pgo>
slow
environment-variables

View File

@ -30,6 +30,8 @@ The following properties make execution of ``moz.build`` files special:
1. The execution environment exposes a limited subset of Python.
2. There is a special set of global symbols and an enforced naming
convention of symbols.
3. Some symbols are inherited from previously-executed ``moz.build``
files.
The limited subset of Python is actually an extremely limited subset.
Only a few symbols from ``__builtins__`` are exposed. These include
@ -68,16 +70,77 @@ sneak into the sandbox without being explicitly defined and documented.
Reading and Traversing moz.build Files
======================================
The process responsible for reading ``moz.build`` files simply starts at
a root ``moz.build`` file, processes it, emits the globals namespace to
a consumer, and then proceeds to process additional referenced
``moz.build`` files from the original file. The consumer then examines
the globals/``UPPERCASE`` variables set as part of execution and then
converts the data therein to Python class instances.
The process for reading ``moz.build`` files roughly consists of:
The executed Python sandbox is essentially represented as a dictionary
of all the special ``UPPERCASE`` variables populated during its
execution.
1. Start at the root ``moz.build`` (``<topsrcdir>/moz.build``).
2. Evaluate the ``moz.build`` file in a new sandbox.
3. Emit the main *context* and any *sub-contexts* from the executed
sandbox.
4. Extract a set of ``moz.build`` files to execute next.
5. For each additional ``moz.build`` file, goto #2 and repeat until all
referenced files have executed.
From the perspective of the consumer, the output of reading is a stream
of :py:class:`mozbuild.frontend.reader.context.Context` instances. Each
``Context`` defines a particular aspect of data. Consumers iterate over
these objects and do something with the data inside. Each object is
essentially a dictionary of all the ``UPPERCASE`` variables populated
during its execution.
.. note::
Historically, there was only one ``context`` per ``moz.build`` file.
As the number of things tracked by ``moz.build`` files grew and more
and more complex processing was desired, it was necessary to split these
contexts into multiple logical parts. It is now common to emit
multiple contexts per ``moz.build`` file.
Build System Reading Mode
-------------------------
The traditional mode of evaluation of ``moz.build`` files is what's
called *build system reading mode*. In this mode, the ``CONFIG``
variable in each ``moz.build`` sandbox is populated from data coming
from ``config.status``, which is produced by ``configure``.
During evaluation, ``moz.build`` files often make decisions conditional
on the state of the build configuration. e.g. *only compile foo.cpp if
feature X is enabled*.
In this mode, traversal of ``moz.build`` files is governed by variables
like ``DIRS`` and ``TEST_DIRS``. For example, to execute a child
directory, ``foo``, you would add ``DIRS += ['foo']`` to a ``moz.build``
file and ``foo/moz.build`` would be evaluated.
.. _mozbuild_fs_reading_mode:
Filesystem Reading Mode
-----------------------
There is an alternative reading mode that doesn't involve the build
system and doesn't use ``DIRS`` variables to control traversal into
child directories. This mode is called *filesystem reading mode*.
In this reading mode, the ``CONFIG`` variable is a dummy, mostly empty
object. Accessing all but a few special variables will return an empty
value. This means that nearly all ``if CONFIG['FOO']:`` branches will
not be taken.
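
To make the consequence concrete, here is a small sketch of a mostly empty configuration object. ``DummyConfig`` and the ``TOPSRCDIR`` entry are hypothetical names chosen for illustration; the actual class used by the reader and its set of special variables are not shown in this document. The sketch targets Python 3.

```python
class DummyConfig(dict):
    """A mapping that returns an empty value for any unknown key."""

    def __missing__(self, key):
        # Unknown configuration variables evaluate as falsy.
        return ''


def enabled_features(config, names):
    """Return the names whose ``if config[name]:`` branch would run."""
    return [name for name in names if config[name]]


# Only a few special variables carry a value (hypothetical example).
config = DummyConfig({'TOPSRCDIR': '/path/to/src'})
```

Here ``enabled_features(config, ['FOO', 'TOPSRCDIR'])`` reports only ``TOPSRCDIR``, mirroring how conditional branches on ordinary build options are skipped in this mode.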
Instead of using content from within the evaluated ``moz.build``
file to drive traversal into subsequent ``moz.build`` files, the set
of files to evaluate is controlled by the thing doing the reading.
A single ``moz.build`` file is not guaranteed to be executable in
isolation. Instead, we must evaluate all *parent* ``moz.build`` files
first. For example, in order to evaluate ``/foo/moz.build``, one must
execute ``/moz.build`` and have its state influence the execution of
``/foo/moz.build``.
Filesystem reading mode is utilized to power the
:ref:`mozbuild_files_metadata` feature.
Technical Details
-----------------
The code for reading ``moz.build`` files lives in
:py:mod:`mozbuild.frontend.reader`. The Python sandboxes evaluation results
@ -100,9 +163,6 @@ verification step. There are multiple downstream consumers of the
``moz.build``-derived data and many will perform the same actions. This
logic can be complicated, so we have a component dedicated to it.
Other Notes
===========
:py:class:`mozbuild.frontend.reader.BuildReader`` and
:py:class:`mozbuild.frontend.reader.TreeMetadataEmitter`` have a
stream-based API courtesy of generators. When you hook them up properly,

View File

@ -19,7 +19,6 @@ from __future__ import unicode_literals
import os
from collections import OrderedDict
from contextlib import contextmanager
from mozbuild.util import (
HierarchicalStringList,
HierarchicalStringListWithFlagsFactory,
@ -31,6 +30,7 @@ from mozbuild.util import (
StrictOrderingOnAppendList,
StrictOrderingOnAppendListWithFlagsFactory,
TypedList,
TypedNamedTuple,
)
import mozpack.path as mozpath
from types import FunctionType
@ -167,7 +167,6 @@ class Context(KeyedDefaultDict):
def _factory(self, key):
"""Function called when requesting a missing key."""
defaults = self._allowed_variables.get(key)
if not defaults:
raise KeyError('global_ns', 'get_unknown', key)
@ -396,6 +395,108 @@ def ContextDerivedTypedList(type, base_class=List):
return _TypedList
BugzillaComponent = TypedNamedTuple('BugzillaComponent',
[('product', unicode), ('component', unicode)])
class Files(SubContext):
"""Metadata attached to files.
It is common to want to annotate files with metadata, such as which
Bugzilla component tracks issues with certain files. This sub-context is
where we stick that metadata.
The argument to this sub-context is a file matching pattern that is applied
against the host file's directory. If the pattern matches a file whose info
is currently being sought, the metadata attached to this instance will be
applied to that file.
Patterns are collections of filename characters with ``/`` used as the
directory separator (UNIX-style paths) and ``*`` and ``**`` used to denote
wildcard matching.
Patterns without the ``*`` character are literal matches and will match at
most one entity.
Patterns with ``*`` or ``**`` are wildcard matches. ``*`` matches files
within a single directory. ``**`` matches files across several directories.
Here are some examples:
``foo.html``
Will match only the ``foo.html`` file in the current directory.
``*.jsm``
Will match all ``.jsm`` files in the current directory.
``**/*.cpp``
Will match all ``.cpp`` files in this and all child directories.
``foo/*.css``
Will match all ``.css`` files in the ``foo/`` directory.
``bar/*``
Will match all files in the ``bar/`` directory but not any files in
child directories of ``bar/``, such as ``bar/dir1/baz``.
``baz/**``
Will match all files in the ``baz/`` directory and all directories
underneath.
"""
VARIABLES = {
'BUG_COMPONENT': (BugzillaComponent, tuple,
"""The bug component that tracks changes to these files.
Values are a 2-tuple of unicode describing the Bugzilla product and
component. e.g. ``('Core', 'Build Config')``.
""", None),
'FINAL': (bool, bool,
"""Mark variable assignments as finalized.
During normal processing, values from newer Files contexts
overwrite previously set values. Last write wins. This behavior is
not always desired. ``FINAL`` provides a mechanism to prevent
further updates to a variable.
When ``FINAL`` is set, the values of all variables defined in this
context are marked as frozen and all subsequent writes to them
are ignored during metadata reading.
See :ref:`mozbuild_files_metadata_finalizing` for more info.
""", None),
}

    def __init__(self, parent, pattern=None):
        super(Files, self).__init__(parent)
        self.pattern = pattern
        self.finalized = set()

    def __iadd__(self, other):
        assert isinstance(other, Files)

        for k, v in other.items():
            # Ignore updates to finalized flags.
            if k in self.finalized:
                continue

            # Only finalize variables defined in this instance.
            if k == 'FINAL':
                self.finalized |= set(other) - {'FINAL'}
                continue

            self[k] = v

        return self

    def asdict(self):
        """Return this instance as a dict with built-in data structures.

        Call this to obtain an object suitable for serializing.
        """
        d = {}
        if 'BUG_COMPONENT' in self:
            bc = self['BUG_COMPONENT']
            d['bug_component'] = (bc.product, bc.component)

        return d

# This defines functions that create sub-contexts.
#
# Values are classes that are SubContexts. The class name will be turned into
@ -405,6 +506,7 @@ def ContextDerivedTypedList(type, base_class=List):
# argument is always the parent context. It is up to each class to perform
# argument validation.
SUBCONTEXTS = [
Files,
]
for cls in SUBCONTEXTS:

View File

@ -74,7 +74,10 @@ from .data import (
from .reader import SandboxValidationError
from .context import Context
from .context import (
Context,
SubContext,
)
class TreeMetadataEmitter(LoggingMixin):
@ -135,6 +138,11 @@ class TreeMetadataEmitter(LoggingMixin):
raise Exception('Unhandled object of type %s' % type(o))
for out in output:
# Nothing in sub-contexts is currently of interest to us. Filter
# them all out.
if isinstance(out, SubContext):
continue
if isinstance(out, Context):
# Keep all contexts around, we will need them later.
contexts[out.objdir] = out

View File

@ -62,6 +62,7 @@ from .sandbox import (
from .context import (
Context,
ContextDerivedValue,
Files,
FUNCTIONS,
VARIABLES,
DEPRECATION_HINTS,
@ -1237,3 +1238,46 @@ class BuildReader(object):
result[path] = reduce(lambda x, y: x + y, (contexts[p] for p in paths), [])
return result, all_contexts

    def files_info(self, paths):
        """Obtain aggregate data from Files for a set of files.

        Given a set of input paths, determine which moz.build files may
        define metadata for them, evaluate those moz.build files, and
        apply file metadata rules defined within to determine metadata
        values for each file requested.

        Essentially, for each input path:

        1. Determine the set of moz.build files relevant to that file by
           looking for moz.build files in ancestor directories.
        2. Evaluate moz.build files starting with the most distant.
        3. Iterate over Files sub-contexts.
        4. If the file pattern matches the file we're seeking info on,
           apply attribute updates.
        5. Return the most recent value of attributes.
        """
        paths, _ = self.read_relevant_mozbuilds(paths)

        r = {}

        for path, ctxs in paths.items():
            flags = Files(Context())

            for ctx in ctxs:
                if not isinstance(ctx, Files):
                    continue

                relpath = mozpath.relpath(path, ctx.relsrcdir)
                pattern = ctx.pattern

                # Only do wildcard matching if the '*' character is present.
                # Otherwise, mozpath.match will match directories, which we've
                # arbitrarily chosen to not allow.
                if pattern == relpath or \
                        ('*' in pattern and mozpath.match(relpath, pattern)):
                    flags += ctx

            r[path] = flags

        return r

View File

@ -0,0 +1,2 @@
with Files('*'):
    BUG_COMPONENT = 'bad value'

View File

@ -0,0 +1,4 @@
with Files('*.jsm'):
    BUG_COMPONENT = ('Firefox', 'JS')
with Files('*.cpp'):
    BUG_COMPONENT = ('Firefox', 'C++')

View File

@ -0,0 +1,3 @@
with Files('**/Makefile.in'):
    BUG_COMPONENT = ('Core', 'Build Config')
    FINAL = True

View File

@ -0,0 +1,2 @@
with Files('**'):
    BUG_COMPONENT = ('Another', 'Component')

View File

@ -0,0 +1,2 @@
with Files('**'):
    BUG_COMPONENT = ('default_product', 'default_component')

View File

@ -0,0 +1,2 @@
with Files('*'):
    BUG_COMPONENT = ('Core', 'Build Config')

View File

@ -0,0 +1,5 @@
with Files('foo'):
    BUG_COMPONENT = ('FooProduct', 'FooComponent')
with Files('bar'):
    BUG_COMPONENT = ('BarProduct', 'BarComponent')

View File

@ -10,6 +10,7 @@ import unittest
from mozunit import main
from mozbuild.frontend.context import BugzillaComponent
from mozbuild.frontend.reader import BuildReaderError
from mozbuild.frontend.reader import BuildReader
@ -313,6 +314,70 @@ class TestBuildReader(unittest.TestCase):
self.assertEqual([ctx.relsrcdir for ctx in paths['d2/file']],
['', 'd2'])

    def test_files_bad_bug_component(self):
        reader = self.reader('files-info')

        with self.assertRaises(BuildReaderError):
            reader.files_info(['bug_component/bad-assignment/moz.build'])

    def test_files_bug_component_static(self):
        reader = self.reader('files-info')

        v = reader.files_info(['bug_component/static/foo',
                               'bug_component/static/bar',
                               'bug_component/static/foo/baz'])
        self.assertEqual(len(v), 3)
        self.assertEqual(v['bug_component/static/foo']['BUG_COMPONENT'],
                         BugzillaComponent('FooProduct', 'FooComponent'))
        self.assertEqual(v['bug_component/static/bar']['BUG_COMPONENT'],
                         BugzillaComponent('BarProduct', 'BarComponent'))
        self.assertEqual(v['bug_component/static/foo/baz']['BUG_COMPONENT'],
                         BugzillaComponent('default_product', 'default_component'))

    def test_files_bug_component_simple(self):
        reader = self.reader('files-info')

        v = reader.files_info(['bug_component/simple/moz.build'])
        self.assertEqual(len(v), 1)
        flags = v['bug_component/simple/moz.build']
        self.assertEqual(flags['BUG_COMPONENT'].product, 'Core')
        self.assertEqual(flags['BUG_COMPONENT'].component, 'Build Config')

    def test_files_bug_component_different_matchers(self):
        reader = self.reader('files-info')

        v = reader.files_info([
            'bug_component/different-matchers/foo.jsm',
            'bug_component/different-matchers/bar.cpp',
            'bug_component/different-matchers/baz.misc'])
        self.assertEqual(len(v), 3)

        js_flags = v['bug_component/different-matchers/foo.jsm']
        cpp_flags = v['bug_component/different-matchers/bar.cpp']
        misc_flags = v['bug_component/different-matchers/baz.misc']

        self.assertEqual(js_flags['BUG_COMPONENT'], BugzillaComponent('Firefox', 'JS'))
        self.assertEqual(cpp_flags['BUG_COMPONENT'], BugzillaComponent('Firefox', 'C++'))
        self.assertEqual(misc_flags['BUG_COMPONENT'], BugzillaComponent('default_product', 'default_component'))

    def test_files_bug_component_final(self):
        reader = self.reader('files-info')

        v = reader.files_info([
            'bug_component/final/foo',
            'bug_component/final/Makefile.in',
            'bug_component/final/subcomponent/Makefile.in',
            'bug_component/final/subcomponent/bar'])

        self.assertEqual(v['bug_component/final/foo']['BUG_COMPONENT'],
                         BugzillaComponent('default_product', 'default_component'))
        self.assertEqual(v['bug_component/final/Makefile.in']['BUG_COMPONENT'],
                         BugzillaComponent('Core', 'Build Config'))
        self.assertEqual(v['bug_component/final/subcomponent/Makefile.in']['BUG_COMPONENT'],
                         BugzillaComponent('Core', 'Build Config'))
        self.assertEqual(v['bug_component/final/subcomponent/bar']['BUG_COMPONENT'],
                         BugzillaComponent('Another', 'Component'))


if __name__ == '__main__':
    main()