-DCALL_PROFILE: Count the number of function calls executed.
When this symbol is defined, the ceval mainloop and helper functions
count the number of function calls made. It keeps detailed statistics
about what kind of object was called and whether the call hit any of
the special fast paths in the code.
Optimization:
When we take the fast_function() path, which seems to be taken for
most function calls, and there is minimal frame setup to do, avoid
call PyEval_EvalCodeEx(). The eval code ex function does a lot of
work to handle keywords args and star args, free variables,
generators, etc. The inlined version simply allocates the frame and
copies the arguments values into the frame.
The optimization gets a little help from compile.c which adds a
CO_NOFREE flag to code objects that don't have free variables or cell
variables. This change allows fast_function() to get into the fast
path with fewer tests.
I measure a couple of percent speedup in pystone with this change, but
there's surely more that can be done.
globals, _Py_Ticker and _Py_CheckInterval. This also implements Jeremy's
shortcut in Py_AddPendingCall that zeroes out _Py_Ticker. This allows the
test in the main loop to only test a single value.
The gory details are at
http://python.org/sf/602191
that info to code dynamically compiled *by* code compiled with generators
enabled. Doesn't yet work because there's still no way to tell the parser
that "yield" is OK (unlike nested_scopes, the parser has its fingers in
this too).
Replaced PyEval_GetNestedScopes by a more-general
PyEval_MergeCompilerFlags. Perhaps I should not have? I doubted it was
*intended* to be part of the public API, so just did.
Python interpreter.
This change adds two new C-level APIs: PyEval_SetProfile() and
PyEval_SetTrace(). These can be used to install profile and trace
functions implemented in C, which can operate at much higher speeds
than Python-based functions. The overhead for calling a C-based
profile function is a very small fraction of a percent of the overhead
involved in calling a Python-based function.
The machinery required to call a Python-based profile or trace
function been moved to sysmodule.c, where sys.setprofile() and
sys.setprofile() simply become users of the new interface.
Introduce truly separate (sub)interpreter objects. For now, these
must be used by separate threads, created from C. See Demo/pysvr for
an example of how to use this. This also rationalizes Python's
initialization and finalization behavior:
Py_Initialize() -- initialize the whole interpreter
Py_Finalize() -- finalize the whole interpreter
tstate = Py_NewInterpreter() -- create a new (sub)interpreter
Py_EndInterpreter(tstate) -- delete a new (sub)interpreter
There are also new interfaces relating to threads and the interpreter
lock, which can be used to create new threads, and sometimes have to
be used to manipulate the interpreter lock when creating or deleting
sub-interpreters. These are only defined when WITH_THREAD is defined:
PyEval_AcquireLock() -- acquire the interpreter lock
PyEval_ReleaseLock() -- release the interpreter lock
PyEval_AcquireThread(tstate) -- acquire the lock and make the thread current
PyEval_ReleaseThread(tstate) -- release the lock and make NULL current
Other administrative changes:
- The header file bltinmodule.h is deleted.
- The init functions for Import, Sys and Builtin are now internal and
declared in pythonrun.h.
- Py_Setup() and Py_Cleanup() are no longer declared.
- The interpreter state and thread state structures are now linked
together in a chain (the chain of interpreters is a static variable
in pythonrun.c).
- Some members of the interpreter and thread structures have new,
shorter, more consistent, names.
- Added declarations for _PyImport_{Find,Fixup}Extension() to import.h.