Grammar rules have 2 variants: ones that are attached to a specific
compile function which is called to compile that grammar node, and ones
that don't have a compile function and are instead just inspected to see
what form they take.
In the compiler there is a table of all grammar rules, with each entry
having a pointer to the associated compile function. Those rules with no
compile function have a null pointer. There are 120 such rules, so that's
120 words of essentially wasted code space.
By grouping together the compile vs no-compile rules we can put all the
no-compile rules at the end of the list of rules, and then we don't need
to store the null pointers. We just have a truncated table and it's
guaranteed that when indexing this table we only index the first half,
the half with populated pointers.
This patch implements such a grouping by having a specific macro for the
compile vs no-compile grammar rules (DEF_RULE vs DEF_RULE_NC). It saves
around 460 bytes of code on 32-bit archs.
It is split into 2 functions, one to make small ints and the other to make
a non-small-int leaf node. This reduces code size by 32 bytes on
bare-arm, 64 bytes on unix (x64-64) and 144 bytes on stmhal.
In both parse.c and qstr.c, an internal chunking allocator tidies up
by calling m_renew to shrink an allocated chunk to the size used, and
assumes that the chunk will not move. However, when MICROPY_ENABLE_GC
is false, m_renew calls the system realloc, which does not guarantee
this behaviour. Environments where realloc may return a different
pointer include:
(1) mbed-os with MBED_HEAP_STATS_ENABLED (which adds a wrapper around
malloc & friends; this is where I was hit by the bug);
(2) valgrind on linux (how I diagnosed it).
The fix is to call m_renew_maybe with allow_move=false.
This fixes constant substitution so that only standalone identifiers are
replaced with their constant value (if they have one). I.e. don't
replace NAME in expressions like obj.NAME or NAME = expr.
Assignments of the form "_id = const(value)" are treated as private
(following a similar CPython convention) and code is no longer emitted
for the assignment to a global variable.
See issue #2111.
Most grammar rules can optimise to the identity if they only have a single
argument, saving a lot of RAM building the parse tree. Previous to this
patch, whether a given grammar rule could be optimised was defined (mostly
implicitly) by a complicated set of logic rules. With this patch the
definition is always specified explicitly by using "and_ident" in the rule
definition in the grammar. This simplifies the logic of the parser,
making it a bit smaller and faster. RAM usage in unaffected.
The chunks of memory that the parser allocates contain parse nodes and
are pointed to from many places, so these chunks cannot be relocated
by the memory manager. This patch makes it so that when a chunk is
shrunk to fit, it is not relocated.
Constant folding in the parser can now operate on big ints, whatever
their representation. This is now possible because the parser can create
parse nodes holding arbitrary objects. For the case of small ints the
folding is still efficient in RAM because the folded small int is stored
inplace in the parse node.
Adds 48 bytes to code size on Thumb2 architecture. Helps reduce heap
usage because more constants can be computed at compile time, leading to
a smaller parse tree, and most importantly means that the constants don't
have to be computed at runtime (perhaps more than once). Parser will now
be a little slower when folding due to calls to runtime to do the
arithmetic.
Before this patch, (x+y)*z would be parsed to a tree that contained a
redundant identity parse node corresponding to the parenthesis. With
this patch such nodes are optimised away, which reduces memory
requirements for expressions with parenthesis, and simplifies the
compiler because it doesn't need to handle this identity case.
A parenthesis parse node is still needed for tuples.
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.
MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions. By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.
Disabling both options saves about 40k of code size on 32-bit x86.
To use, put the following in mpconfigport.h:
#define MICROPY_OBJ_REPR (MICROPY_OBJ_REPR_D)
#define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_DOUBLE)
typedef int64_t mp_int_t;
typedef uint64_t mp_uint_t;
#define UINT_FMT "%llu"
#define INT_FMT "%lld"
Currently does not work with native emitter enabled.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.
This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.