If Py_BUILD_CORE is defined, the PyThreadState_GET() macro access
_PyRuntime which comes from the internal pycore_state.h header.
Public headers must not require internal headers.
Move PyThreadState_GET() and _PyInterpreterState_GET_UNSAFE() from
Include/pystate.h to Include/internal/pycore_state.h, and rename
PyThreadState_GET() to _PyThreadState_GET() there.
The PyThreadState_GET() macro of pystate.h is now redefined when
pycore_state.h is included, to use the fast _PyThreadState_GET().
Changes:
* Add _PyThreadState_GET() macro
* Replace "PyThreadState_GET()->interp" with
_PyInterpreterState_GET_UNSAFE()
* Replace PyThreadState_GET() with _PyThreadState_GET() in internal C
files (compiled with Py_BUILD_CORE defined), but keep
PyThreadState_GET() in the public header files.
* _testcapimodule.c: replace PyThreadState_GET() with
PyThreadState_Get(); the module is not compiled with Py_BUILD_CORE
defined.
* pycore_state.h now requires Py_BUILD_CORE to be defined.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
According to the comment, there was previously a call to PyObject_IsSubclass() involved which could fail, but since it was replaced with a call to PyType_IsSubtype(), it can no longer fail.
Replace
_PyObject_CallArg1(func, arg)
with
PyObject_CallFunctionObjArgs(func, arg, NULL)
Using the _PyObject_CallArg1() macro increases the usage of the C stack, which
was unexpected and unwanted. PyObject_CallFunctionObjArgs() doesn't have this
issue.
When Python is not compiled with PGO, the performance of Python on call_simple
and call_method microbenchmarks depend highly on the code placement. In the
worst case, the performance slowdown can be up to 70%.
The GCC __attribute__((hot)) attribute helps to keep hot code close to reduce
the risk of such major slowdown. This attribute is ignored when Python is
compiled with PGO.
The following functions are considered as hot according to statistics collected
by perf record/perf report:
* _PyEval_EvalFrameDefault()
* call_function()
* _PyFunction_FastCall()
* PyFrame_New()
* frame_dealloc()
* PyErr_Occurred()
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]