There is no need to have three copies of the exception object on the top of
the native value stack. Instead, the values on the stack should be the
first two items in an nlr_buf_t: the prev pointer and the ret_val pointer.
This is all that is needed and is what the rest of the native emitter
expects is on the stack.
This patch is essentially an optimisation. Behaviour is unchanged,
although the stack layout for native exception handling now makes more
sense.
A native function allocates space on its C stack for mp_code_state_t,
followed by its Python stack, then its locals. This patch makes sure that
the native function actually starts at the start of its Python stack,
rather than at the start of mp_code_state_t (which didn't lead to any
issues so far because the mp_code_state_t is unused after the native
function sets itself up).
On x86 archs (both 32 and 64 bit) a bool return value only sets the 8-bit
al register, and the higher bits of the ax register have an undefined
value. When testing the return value of such cases it is required to just
test al for zero/non-zero. On the other hand, checking for truth or
zero/non-zero on an integer return value requires checking all bits of the
register. These two cases must be distinguished and handled correctly in
generated native code. This patch makes sure of this.
For other supported native archs (ARM, Thumb2, Xtensa) there is no such
distinction and this patch does not change anything for them.
DEBUG_printf and MICROPY_DEBUG_PRINTER is now used instead of normal
printf, and a fault is fixed in mp_obj_class_lookup with debugging enabled;
see issue #3999. Debugging can now be enabled on all ports including when
nan-boxing is used.
This patch in effect renames MICROPY_DEBUG_PRINTER_DEST to
MICROPY_DEBUG_PRINTER, moving its default definition from
lib/utils/printf.c to py/mpconfig.h to make it official and documented, and
makes this macro a pointer rather than the actual mp_print_t struct. This
is done to get consistency with MICROPY_ERROR_PRINTER, and provide this
macro for use outside just lib/utils/printf.c.
Ports are updated to use the new macro name.
This patch makes the Thumb-2 native emitter use wide ldr instructions to
call into the runtime, when the index into the native glue function table
is 32 or greater. This reduces the generated assembler code from 10 bytes
to 6 bytes, saving RAM and making native code run about 0.8% faster.
This error message did not consume all of its variable args, a bug
introduced long ago in baf6f14deb. By fixing
it to use %s (instead of keeping the string as-is and deleting the last
arg) the same error message string is now reused three times in this format
function and gives a code size reduction of around 130 bytes. It also now
gives a better error message when a non-string is passed in as an argument
to format, eg '{:d}'.format([]).
There's no need to call mp_obj_new_int() which will just fail the check for
small int and call mp_obj_new_int_from_ll() anyway.
Thanks to @Jongy for prompting this change.
In non-debug mode MP_OBJ_STOP_ITERATION is zero and comparing something to
zero can be done more efficiently in assembler than comparing to a non-zero
value.
With the recent change b488a4a848, a
generating function now has the same layout in memory as a normal bytecode
function, and so can reuse the latter's attribute accessor code to
implement __name__.
Because this function is simple it saves code size to have it inlined.
Being an auxiliary helper function (and only used in the py/ core) the
argument should always be an mp_obj_module_t*, so there's no need for the
assert (and having it would require including assert.h in obj.h).
It's a very simple function and saves code, and improves efficiency, by
being inline. Note that this is an auxiliary helper function and so
doesn't need mp_check_self -- that's used for functions that can be
accessed directly from Python code (eg from a method table).
mp_obj_module_get_globals() returns a mp_obj_dict_t*, and type->locals_dict
is a mp_obj_dict_t*, so access the map entry of the dict directly instead
of needing to cast this mp_obj_dict_t* up to an object and then calling the
mp_obj_dict_get_map() helper function.
For generating functions there is no need to wrap the bytecode function in
a generator wrapper instance. Instead the type of the bytecode function
can be changed to mp_type_gen_wrap. This reduces code size and saves a
block of GC heap RAM for each generator.
This feature is controlled at compile time by MICROPY_PY_URE_SUB, disabled
by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
This feature is controlled at compile time by
MICROPY_PY_URE_MATCH_SPAN_START_END, disabled by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
This feature is controlled at compile time by MICROPY_PY_URE_MATCH_GROUPS,
disabled by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
Before this patch the context manager's __aexit__() method would not be
executed if a return/break/continue statement was used to exit an async
with block. async with now has the same semantics as normal with.
The fix here applies purely to the compiler, and does not modify the
runtime at all. It might (eventually) be better to define new bytecode(s)
to handle async with (and maybe other async constructs) in a cleaner, more
efficient way.
One minor drawback with addressing this issue purely in the compiler is
that it wasn't possible to get 100% CPython semantics. The thing that is
different here to CPython is that the __aexit__ method is not looked up in
the context manager until it is needed, which is after the body of the
async with statement has executed. So if a context manager doesn't have
__aexit__ then CPython raises an exception before the async with is
executed, whereas uPy will raise it after it is executed. Note that
__aenter__ is looked up at the beginning in uPy because it needs to be
called straightaway, so if the context manager isn't a context manager then
it'll still raise an exception at the same location as CPython. The only
difference is if the context manager has the __aenter__ method but not the
__aexit__ method, then in that case uPy has different behaviour. But this
is a very minor, and acceptable, difference.