Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with counting UTF-8 bytes that ignores the effect
of using the replace error handler in PyUnicode_DecodeUTF8().
If anyone wants to clean up the documentation, feel free. It's my first documentation foray, and it's not that great.
Will port to py3k with a different strategy.
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
(backport of r70452 from py3k to trunk)
the freelist contained half-initialized objects with freed pointers.
The comment
/* XXX UNREF/NEWREF interface should be more symmetrical */
was copied from tupleobject.c, and appears in some other places.
I sign the petition.