any errors that might occur during coercion of the left operand and
turning them into a TypeError with a message text that was confusing in
the given context. This patch lets any errors through, as was already
done during coercion of the right hand side.
Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with counting UTF-8 bytes that ignores the effect
of using the replace error handler in PyUnicode_DecodeUTF8().
If anyone wants to clean up the documentation, feel free. It's my first documentation foray, and it's not that great.
Will port to py3k with a different strategy.
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
(backport of r70452 from py3k to trunk)