Commit Graph

72 Commits

Author SHA1 Message Date
Damien George fdfcee7b1e py/parse: Make parser error handling cleaner, less spaghetti-like. 2015-10-12 12:59:18 +01:00
Damien George 64f2b213bb py: Move constant folding from compiler to parser.
It makes much more sense to do constant folding in the parser while the
parse tree is being built.  This eliminates the need to create parse
nodes that will just be folded away.  The code is slightly simpler and a
bit smaller as well.

Constant folding now has a configuration option,
MICROPY_COMP_CONST_FOLDING, which is enabled by default.
2015-10-12 12:58:45 +01:00
Damien George 366239b8b9 py/parse: Factor logic when creating parse node from and-rule. 2015-10-08 23:13:18 +01:00
Damien George 58e0f4ac50 py: Allocate parse nodes in chunks to reduce fragmentation and RAM use.
With this patch parse nodes are allocated sequentially in chunks.  This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.

Saves roughly 20% of RAM during parse stage.
2015-10-02 00:11:11 +01:00
Damien George 65dc960e3b unix-cpy: Remove unix-cpy. It's no longer needed.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests.  When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct.  The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly.  And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.

But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:

1. It outputs CPython 3.3 compatible bytecode.  CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5.  There's no point in changing the bytecode output to match
CPython anymore.

2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.

3. The bytecode tests are not run.  They were never part of Travis and
are not run locally anymore.

4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.

5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode.  Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
2015-08-17 12:51:26 +01:00
Damien George 96f0dd3cbc py/parse: Fix handling of empty input so it raises an exception. 2015-07-24 15:05:56 +00:00
Damien George fa7c61dfab py/parse: De-duplicate and simplify code for parser "or" rule. 2015-07-24 14:35:57 +00:00
Damien George ade9a05236 py: Improve allocation policy of qstr data.
Previous to this patch all interned strings lived in their own malloc'd
chunk.  On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).

With this patch interned strings are concatenated into the same malloc'd
chunk when possible.  Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.

RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned).  New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.

Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
2015-07-14 22:56:32 +01:00
Damien George 4735c45c51 py: Clean up some bits and pieces in parser, grammar. 2015-04-21 16:43:18 +00:00
nhtshot 5d323defe4 py: Update parse.c&mpconfig.h to reflect rename of mp_lexer_show_token.
This function is only used when DEBUG_PRINTERS and USE_RULE_NAME are
enabled.
2015-02-23 21:36:05 +00:00
Damien George dfe944c3e5 py: Expose compile.c:list_get as mp_parse_node_extract_list. 2015-02-13 02:29:46 +00:00
Damien George f804833a97 py: Initialise variables in mp_parse correctly, to satisfy gcc warning. 2015-02-08 13:40:20 +00:00
Damien George 7d414a1b52 py: Parse big-int/float/imag constants directly in parser.
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed.  This is wasteful in RAM and not efficient.  Now,
these constants are parsed straight away in the parser and turned into
objects.  This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
2015-02-08 01:57:40 +00:00
Damien George 0bfc7638ba py: Protect mp_parse and mp_compile with nlr push/pop block.
To enable parsing constants more efficiently, mp_parse should be allowed
to raise an exception, and mp_compile can already raise a MemoryError.
So these functions need to be protected by an nlr push/pop block.

This patch adds that feature in all places.  This allows to simplify how
mp_parse and mp_compile are called: they now raise an exception if they
have an error and so explicit checking is not needed anymore.
2015-02-07 18:33:58 +00:00
Damien George 5c670acb1f py: Be more machine-portable with size of bit fields. 2015-01-24 23:12:58 +00:00
Damien George 50912e7f5d py, unix, stmhal: Allow to compile with -Wshadow.
See issue #699.
2015-01-20 11:55:10 +00:00
Damien George 963a5a3e82 py, unix: Allow to compile with -Wsign-compare.
See issue #699.
2015-01-16 17:47:07 +00:00
Damien George d2d64f00fb py: Add "default" to switches to allow better code flow analysis.
This helps compiler produce smaller code.  Saves 124 bytes on stmhal and
bare-arm.
2015-01-14 21:32:42 +00:00
Damien George 4c81ba8015 py: Never intern data of large string/bytes object; add relevant tests.
Previously to this patch all constant string/bytes objects were
interned by the compiler, and this lead to crashes when the qstr was too
long (noticeable now that qstr length storage defaults to 1 byte).

With this patch, long string/bytes objects are never interned, and are
referenced directly as constant objects within generated code using
load_const_obj.
2015-01-13 16:21:23 +00:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 6efa66f125 py: Remove unnecessary RULE_none and PN_none from parser. 2014-12-20 18:41:59 +00:00
Damien George b47ea4eadd py: Add blank and ident flags to grammar rules to simplify parser.
This saves around 100 bytes code space on stmhal, more on unix.
2014-12-20 18:37:50 +00:00
Damien George 2870d85a11 py: Save a few code bytes in parser; make vars local where possible. 2014-12-20 18:06:08 +00:00
Damien George a4c52c5a3d py: Optimise lexer by exposing lexer type.
mp_lexer_t type is exposed, mp_token_t type is removed, and simple lexer
functions (like checking current token kind) are now inlined.

This saves 784 bytes ROM on 32-bit unix, 348 bytes on stmhal, and 460
bytes on bare-arm.  It also saves a tiny bit of RAM since mp_lexer_t
is a bit smaller.  Also will run a bit more efficiently.
2014-12-05 19:35:18 +00:00
Damien George e7bb0443cd py: Properly free string parse-node; add assertion to gc_free. 2014-10-23 14:13:05 +01:00