However, the patch from that tracker item has been extended so that the newly
included unit tests pass on Python 2.1 through 2.5. Note that Python 2.1's
uu.decode() does not have a 'quiet' argument, so we have to be sneaky.
Will port to email 3.0 (although without the backward-compatible sneakiness).
(.set_payload() gives bad .get_payload() results). Specific changes include:
Simplify the default CODEC_MAP in Charset.py to not include the Japanese and
Korean codecs. The names of the codecs differ depending on whether
you're using Python 2.4 or 2.5, which include the codecs by default, or
an earlier Python, which provides the codecs under different names through a
third-party library. Now, we attempt to discover which (if either) is available
and populate the CODEC_MAP as appropriate.
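
The discovery step described above could look roughly like the sketch below.
It is not the code from Charset.py: the helper first_available() and the
candidate names in CANDIDATES are assumptions made for illustration, and the
sketch is written against a current Python rather than 2.1 through 2.5.

```python
import codecs


def first_available(candidates):
    """Return the first name that codecs.lookup() accepts, or None."""
    for name in candidates:
        try:
            codecs.lookup(name)
        except LookupError:
            # This spelling is not installed; try the next one.
            continue
        return name
    return None


# Hypothetical candidates: a built-in style name first, then a
# third-party style name. Neither list is taken from the real package.
CANDIDATES = {
    'euc-jp': ['euc_jp', 'japanese.euc-jp'],
    'euc-kr': ['euc_kr', 'korean.euc-kr'],
}

# Start from a simplified map and add only the codecs that resolve.
CODEC_MAP = {'us-ascii': None}
for charset, names in CANDIDATES.items():
    found = first_available(names)
    if found is not None:
        CODEC_MAP[charset] = found
```

Probing with codecs.lookup() keeps the map free of entries that would fail
later at encode time.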
Message.set_charset(): When the message does not already have a
Content-Transfer-Encoding header, instead of just adding the header, we also
encode the body as defined by the assigned Charset. As before, if the
body_encoding is callable, we just call that. If not, then we add a call to
body_encode() before setting the header. This way, we guarantee that a
message's text payload is always encoded properly.
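
As a rough illustration of the set_charset() behavior described above, the
sketch below uses the email package from a current Python standard library.
It assumes that the modern package still behaves this way; the email 2.5 code
this change targets may differ in detail.

```python
from email.message import Message

msg = Message()
msg.set_payload('Hello')

# No Content-Transfer-Encoding header exists yet, so assigning a charset
# both adds the header and encodes the body.
msg.set_charset('utf-8')

cte = msg['Content-Transfer-Encoding']
decoded = msg.get_payload(decode=True)  # undo the transfer encoding
```

Because the body is encoded when the charset is assigned, get_payload()
reflects the encoded form whether or not the message is ever passed through
a Generator.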
Remove the payload encoding code from Generator._handle_text(). With the
above patch, this would cause the body to be doubly encoded. Doing this in
the Message class is better than only doing it in the Generator.
Added some new tests to ensure everything works correctly. Also changed the
way the test_email_codecs.py tests get added (using the same lookup code that
the CODEC_MAP adjustments use).
This resolves both issues for email 2.5/Python 2.3. I will patch forward to
email 3.0 for both Python 2.4 and 2.5.
name in the Content-Disposition header, so Message.get_filename() should fall
back to using that. Will port both to email 3.0 and Python 2.5 trunk.
Also, bump the email package version to 2.5.7 for eventual release. Of
course, add a test case too.
XXX Need to update the documentation.
an RFC 2231-style header could be bogus or unknown to Python. In that case,
we return the text part of the parameter undecoded. However, in
get_content_charset(), if that is not ascii, then it is an illegal charset and
so we return failobj.
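
A small sketch of the failobj case, using the email package from a current
Python. The header value is made up for illustration: it is an RFC 2231
encoded charset parameter whose decoded text contains non-ascii bytes.

```python
import email

raw = "Content-Type: text/plain; charset*=ascii''utf-8%E2%80%9D\n\nbody\n"
msg = email.message_from_string(raw)

# The decoded parameter text is not pure ascii, so it cannot be a legal
# charset name and the caller gets failobj back instead.
result = msg.get_content_charset()
fallback = msg.get_content_charset('us-ascii')
```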
Test cases and a version bump are included.
Committing this to the Python 2.3 branch because I need to generate an email
2.5.6 release that contains these patches. I will port these fixes to Python
2.4 and 2.5 for email 3.x.
quotes. Fixes SF bug #794466, with the essential patch provided by
Stuart D. Gathman. Specifically,
_parseparam(), _get_params_preserve(): Use the parsing function that
takes quotes into account, as given (essentially) in the bug report's
test program.
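
The effect of quote-aware parameter parsing can be seen through the public
API, sketched here with a current Python. The header value is an invented
example and is not the test program from the bug report.

```python
import email

raw = (
    'Content-Type: text/plain; name="report;final.txt"; charset=us-ascii\n'
    '\n'
    'body\n'
)
msg = email.message_from_string(raw)

# The semicolon inside the quoted value must not end the parameter.
name = msg.get_param('name')
charset = msg.get_param('charset')
```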
project, and with assistance from Oleg Broytmann. Specifically,
get_param(), get_params(): Document that these methods may return
parameter values that are either strings, or 3-tuples in the case of
RFC 2231 encoded parameters. The application should be prepared to
deal with such return values.
get_boundary(): Be prepared to deal with RFC 2231 encoded boundary
parameters. It makes little sense to have boundaries that are
anything but ascii, so if we get back a 3-tuple from get_param() we
will decode it into ascii and let any failures percolate up.
get_content_charset(): New method which treats the charset parameter
just like the boundary parameter in get_boundary(). Note that
"get_charset()" was already taken to return the default Charset
object.
get_charsets(): Rewrite to use get_content_charset().
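
A sketch of how an application might cope with both kinds of return value,
using a current Python. The header values are invented, and
email.utils.collapse_rfc2231_value() is a helper from the modern package; it
is assumed here for convenience and is not something this change mentions.

```python
import email
from email.utils import collapse_rfc2231_value

raw = (
    "Content-Type: text/plain; charset*=us-ascii''iso-8859-1\n"
    "Content-Disposition: attachment; filename*=us-ascii'en'annual%20report.txt\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# An RFC 2231 encoded parameter comes back as (charset, language, text).
value = msg.get_param('filename', header='content-disposition')
if isinstance(value, tuple):
    filename = collapse_rfc2231_value(value)
else:
    filename = value

# get_content_charset() does the tuple handling itself and returns a
# plain, lower-cased charset name.
charset = msg.get_content_charset()
```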
Broytmann in SF patch #600096. Specifically, the former function now
encodes the triplets, while the latter adds optional charset and
language arguments.
2045, section 5.2 states that if the Content-Type: header is
syntactically invalid, the default type should be text/plain.
Implement minimal sanity checking of the header -- it must have
exactly one slash in it. This closes SF patch #597593 by Skip, but in
a different way.
Note that these methods used to raise ValueError for invalid ctypes,
but now they won't.
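
The one-slash check can be observed with a current Python, as in the sketch
below. The header values are invented examples.

```python
import email

# No slash at all, and too many slashes: both are treated as invalid.
bad = email.message_from_string("Content-Type: text\n\nbody\n")
also_bad = email.message_from_string("Content-Type: a/b/c\n\nbody\n")

# Exactly one slash: the value is used as given.
good = email.message_from_string("Content-Type: image/png\n\nbody\n")

bad_type = bad.get_content_type()
also_bad_type = also_bad.get_content_type()
good_type = good.get_content_type()
```

No exception is raised for the invalid headers; the default type is returned
instead.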