This may seem a bit of a risky change, in that it may introduce crazy
bugs with respect to volatile variables in the VM loop. But, I think it
should be fine: code_state points to some external memory, so the
compiler should always read/write to that memory when accessing the
ip/sp variables (ie not put them in registers).
Anyway, it passes all tests and improves on all efficiency fronts: about
2-4% faster (64-bit unix), 16 bytes less stack space per call (64-bit
unix) and slightly less executable size (unix and stmhal).
The reason it's more efficient is save_ip and save_sp were volatile
variables, so were anyway stored on the stack (in memory, not regs).
Thus converting them to code_state->{ip, sp} doesn't cost an extra
memory dereference (except maybe to get code_state, but that can be put
in a register and then made more efficient for other uses of it).
Conflicts:
py/vm.c
Fixed stack underflow check. Use UINT_FMT/INT_FMT where necessary.
Specify maximum VM-stack byte size by multiple of machine word size, so
that on 64 bit machines it has same functionality as 32 bit.
This improves stack usage in callers to mp_execute_bytecode2, and is step
forward towards unifying execution interface for function and generators
(which is important because generators don't even support full forms
of arguments passing (keywords, etc.)).