Reduces by about a factor of 10 on average the amount of RAM needed to
store the line-number to bytecode map in the bytecode prelude.
Using CPython3.4's stdlib for statistics: previously, an average of
13 bytes were used per (bytecode offset, line-number offset) pair, and
now with this improvement, that's down to 1.3 bytes on average.
Large RAM usage before was due to some very large steps in line numbers,
both from the start of the first line in a function way down in the
file, and also functions that have big comments and/or big strings in
them (both cases were significant).
Although the savings are large on average for the CPython stdlib, it
won't have such a big effect for small scripts used in embedded
programming.
Addresses issue #648.
This may seem a bit of a risky change, in that it may introduce crazy
bugs with respect to volatile variables in the VM loop. But, I think it
should be fine: code_state points to some external memory, so the
compiler should always read/write to that memory when accessing the
ip/sp variables (ie not put them in registers).
Anyway, it passes all tests and improves on all efficiency fronts: about
2-4% faster (64-bit unix), 16 bytes less stack space per call (64-bit
unix) and slightly less executable size (unix and stmhal).
The reason it's more efficient is save_ip and save_sp were volatile
variables, so were anyway stored on the stack (in memory, not regs).
Thus converting them to code_state->{ip, sp} doesn't cost an extra
memory dereference (except maybe to get code_state, but that can be put
in a register and then made more efficient for other uses of it).
Conflicts:
py/vm.c
Fixed stack underflow check. Use UINT_FMT/INT_FMT where necessary.
Specify maximum VM-stack byte size by multiple of machine word size, so
that on 64 bit machines it has same functionality as 32 bit.
This improves stack usage in callers to mp_execute_bytecode2, and is step
forward towards unifying execution interface for function and generators
(which is important because generators don't even support full forms
of arguments passing (keywords, etc.)).
Needed to pop the iterator object when breaking out of a for loop. Need
also to be careful to unwind exception handler before popping iterator.
Addresses issue #635.
This helps the compiler do its optimisation, makes it clear which
variables are local per opcode and which global, and makes it consistent
when extra variables are needed in an opcode (in addition to old obj1,
obj2 pair, for example).
Could also make unum local, but that's for another time.