Fix the pthread+semaphore implementation of
PyThread_acquire_lock_timed() when called with timeout > 0 and
intr_flag=0: recompute the timeout if sem_timedwait() is interrupted
by a signal (EINTR).
See also the PEP 475.
The pthread implementation of PyThread_acquire_lock() now fails with
a fatal error if the timeout is larger than PY_TIMEOUT_MAX, as done
in the Windows implementation.
The check prevents any risk of overflow in PyThread_acquire_lock().
Add also PY_DWORD_MAX constant.
Issue #29234: Inlining _PyStack_AsTuple() into callers increases their stack
consumption, Disable inlining to optimize the stack consumption.
Add _Py_NO_INLINE: use __attribute__((noinline)) of GCC and Clang.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1040 => 976 (-64 B)
test_python_getitem: 976 => 912 (-64 B)
test_python_iterator: 1120 => 1056 (-64 B)
=> total: 3136 => 2944 (- 192 B)
When Python is not compiled with PGO, the performance of Python on call_simple
and call_method microbenchmarks depend highly on the code placement. In the
worst case, the performance slowdown can be up to 70%.
The GCC __attribute__((hot)) attribute helps to keep hot code close to reduce
the risk of such major slowdown. This attribute is ignored when Python is
compiled with PGO.
The following functions are considered as hot according to statistics collected
by perf record/perf report:
* _PyEval_EvalFrameDefault()
* call_function()
* _PyFunction_FastCall()
* PyFrame_New()
* frame_dealloc()
* PyErr_Occurred()