instead. This seems more robust than returning an Unicode string with
some unconverted charcters in it.
This still doesn't support getting truly binary data out of Tcl, since
we look for the trailing null byte; but the old (pre-Unicode) code did
this too, so apparently there's no need. (Plus, I really don't feel
like finding out how Tcl deals with this in each version.)
1. In Tcl 8.2 and later, use Tcl_NewUnicodeObj() when passing a Python
Unicode object rather than going through UTF-8. (This function
doesn't exist in Tcl 8.1, so there the original UTF-8 code is still
used; in Tcl 8.0 there is no support for Unicode.) This assumes that
Tcl_UniChar is the same thing as Py_UNICODE; a run-time error is
issued if this is not the case.
2. In Tcl 8.1 and later (i.e., whenever Tcl supports Unicode), when a
string returned from Tcl contains bytes with the top bit set, we
assume it is encoded in UTF-8, and decode it into a Unicode string
object.
Notes:
- Passing Unicode strings to Tcl 8.0 does not do the right thing; this
isn't worth fixing.
- When passing an 8-bit string to Tcl 8.1 or later that has bytes with
the top bit set, Tcl tries to interpret it as UTF-8; it seems to fall
back on Latin-1 for non-UTF-8 bytes. I'm not sure what to do about
this besides telling the user to disambiguate such strings by
converting them to Unicode (forcing the user to be explicit about the
encoding).
- Obviously it won't be possible to get binary data out of Tk this
way. Do we need that ability? How to do it?
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
None in an argument list *terminates* the argument list: further
arguments are *ignored*. This isn't kosher, but too much code relies
on it, implicitly. For example, IDLE was pretty broken.
This was originally submitted by Martin von Loewis as part of his
Unicode patch; all I did was add special cases for Python int and
float objects and rearrange the object type tests somewhat to speed up
the common cases (string, int, float, tuple, unicode, object).
thread state of the thread calling mainloop() (or another event
handling function) rather than the thread state of the function that
created the client data structure.
low-level Python exit handler. This can attempt to call Python code
at a point that the interpreter and thread state have already been
destroyed, causing a Bus Error. Given the intended use of
Py_AtExit(), I'm not convinced that it's a good idea to call it
earlier during Python's finalization sequence... (Although this is
the only use for it in the entire distribution.)
PythonCmd_Error() but failed to return. The error wasn't very likely
(only when we run out of memory) but since the check is there we might
as well return the error. (I think that Barry introduced this buglet
when he added error checks everywhere.)