If a PyTokenizer_FromString() is called with an empty string, the
tokenizer's line_start member never gets initialized. Later, it is
compared with the token pointer 'a' in parsetok.c:193 and that behavior
can result in undefined behavior.
Added checks for integer overflows, contributed by Google. Some are
only available if asserts are left in the code, in cases where they
can't be triggered from Python code.
Rather than sprinkle casts throughout the code, change Py_CHARMASK to
always cast it's result to an unsigned char. This should ensure we
do the right thing when accessing an array with the result.
The new PyParser_*Ex() functions are based on Neal's suggestion and initial patch. The new __future__ feature makes all '' and r'' unicode strings. b'' and br'' stay (byte) strings.
This work is substantially Anthony Baxter's, from issue
1633807. I just freshened it, made a few minor tweaks,
and added the test cases. I also created issue 2412,
which is to check for 2to3's behavior with the print
function. I also added myself to ACKS.
Added 0b and 0o literals to tokenizer.
Modified PyOS_strtoul to support 0b and 0o inputs.
Modified PyLong_FromString to support guessing 0b and 0o inputs.
Renamed test_hexoct.py to test_int_literal.py and added binary tests.
Added upper and lower case 0b, 0O, and 0X tests to test_int_literal.py