42 Commits

Author SHA1 Message Date
Barry Warsaw
dc74e34e24 Resolve SF bug 1409403: email.Message should suppress warning from uu.decode.
However, the patch in that tracker item is elaborated such that the newly
included unit tests pass on Python 2.1 through 2.5.  Note that Python 2.1's
uu.decode() does not have a 'quiet' argument, so we have to be sneaky.

Will port to email 3.0 (although without the backward compatible sneakiness).
2006-02-09 04:03:22 +00:00
Barry Warsaw
f5853f7592 Patches to address SF bugs 1409538 (Japanese codecs in CODEC_MAP) and 1409455
(.set_payload() gives bad .get_payload() results).  Specific changes include:

Simplify the default CODEC_MAP in Charset.py to not include the Japanese and
Korean codecs.  The names of the codecs are different depending on whether
you're using Python 2.4 and 2.5, which include the codecs by default, or
earlier Pythons which provide the codecs under different names as a third
party library.  Now, we attempt to discover which (if either) is available and
populate the CODEC_MAP as appropriate.
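
A minimal sketch of that discovery step in modern Python follows. The helper
name first_available_codec and the candidate name lists are illustrative
assumptions, not the code in Charset.py; only codecs.lookup() is a real
standard-library call.

```python
import codecs

def first_available_codec(candidates):
    """Return the first codec name this interpreter can look up, else None."""
    for name in candidates:
        try:
            codecs.lookup(name)
        except LookupError:
            continue
        return name
    return None

# Hypothetical candidate names: the built-in spelling first, then a
# package-qualified spelling that a third-party codec library might use.
CANDIDATES = {
    "euc-jp": ["euc_jp", "japanese.euc-jp"],
    "euc-kr": ["euc_kr", "korean.euc-kr"],
}

# Only charsets with a codec that actually resolves end up in the map.
codec_map = {}
for charset, names in CANDIDATES.items():
    found = first_available_codec(names)
    if found is not None:
        codec_map[charset] = found
```

Probing at import time keeps the map free of entries that would fail later
with a LookupError when a message is encoded.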

Message.set_charset(): When the message does not already have a
Content-Transfer-Encoding header, instead of just adding the header, we also
encode the body as defined by the assigned Charset.  As before, if the
body_encoding is callable, we just call that.  If not, then we add a call to
body_encode() before setting the header.  This way, we guarantee that a
message's text payload is always encoded properly.
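
As a rough illustration, the same behavior can be observed with the
email.message.Message class in current Python 3, which is assumed here to
behave like the patched method for this case. The payload text is made up.

```python
from email.message import Message

msg = Message()
msg.set_payload("hello")          # plain text, no Content-Transfer-Encoding yet
had_cte_before = "Content-Transfer-Encoding" in msg

# utf-8 uses base64 as its body encoding, so set_charset() both encodes the
# payload and adds the matching Content-Transfer-Encoding header.
msg.set_charset("utf-8")

cte = msg["Content-Transfer-Encoding"]
encoded_body = msg.get_payload()             # the encoded text as stored
decoded_body = msg.get_payload(decode=True)  # bytes after undoing the encoding
```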

Remove the payload encoding code from Generator._handle_text().  With the
above patch, this would cause the body to be doubly encoded.  Doing this in
the Message class is better than only doing it in the Generator.

Added some new tests to ensure everything works correctly.  Also changed the
way the test_email_codecs.py tests get added (using the same lookup code that
the CODEC_MAP adjustments use).

This resolves both issues for email 2.5/Python 2.3.  I will patch forward to
email 3.0 for both Python 2.4 and 2.5.
2006-02-08 13:33:20 +00:00
Barry Warsaw
4e71930f41 SF bug #1403349 solution for email 2.5; some MUAs use the 'file' parameter
name in the Content-Disposition header, so Message.get_filename() should fall
back to using that.  Will port both to email 3.0 and Python 2.5 trunk.

Also, bump the email package version to 2.5.7 for eventual release.  Of
course, add a test case too.

XXX Need to update the documentation.
2006-01-17 04:34:54 +00:00
Barry Warsaw
712d474d3c get_filename(), get_content_charset(): It's possible that the charset named in
an RFC 2231-style header could be bogus or unknown to Python.  In that case,
we return the text part of the parameter undecoded.  However, in
get_content_charset(), if that is not ascii, then it is an illegal charset and
so we return failobj.

Test cases and a version bump are included.

Committing this to the Python 2.3 branch because I need to generate an email
2.5.6 release that contains these patches.  I will port these fixes to Python
2.4 and 2.5 for email 3.x.
2005-04-29 12:12:02 +00:00
Barry Warsaw
3497155a03 get_boundary(): Fix for SF bug #1060941. RFC 2046 says boundaries may begin
-- but not end -- with whitespace.
2004-11-06 00:14:05 +00:00
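
A small sketch of the resulting behavior, assuming the Python 3
email.message.Message class handles this case the same way. The boundary
value is made up for illustration.

```python
from email.message import Message

msg = Message()
# The quoted boundary has one leading and one trailing space.
msg["Content-Type"] = 'multipart/mixed; boundary=" abc "'

# Trailing whitespace is stripped; leading whitespace is kept.
boundary = msg.get_boundary()
```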
Barry Warsaw
c5a6b23719 __getitem__(): Fix docstring, SF 979924.
2004-09-28 04:55:34 +00:00
Barry Warsaw
8dcec1381e Test cases and fixes for bugs described in patch #873418: email/Message.py:
del_param fails when specifying a header.

I'll port this to Python 2.4 shortly.
2004-08-16 15:31:43 +00:00
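
For illustration, this is how del_param() is used with an explicit header in
the Python 3 email.message.Message class, which is assumed to match the fixed
behavior. The header value is made up.

```python
from email.message import Message

msg = Message()
msg["Content-Disposition"] = 'attachment; filename="report.txt"'

# Remove a parameter from a header other than the default Content-Type.
msg.del_param("filename", header="content-disposition")

remaining = msg["Content-Disposition"]
```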
Walter Dörwald
4958f2741a Backport checkin:
Fix a bunch of typos in documentation, docstrings and comments.
(From SF patch #810751)
2003-10-20 14:34:48 +00:00
Barry Warsaw
b1919b7ff5 A fix for parsing parameters when there are semicolons inside the
quotes.  Fixes SF bug #794466, with the essential patch provided by
Stuart D. Gathman.  Specifically,

_parseparam(), _get_params_preserve(): Use the parsing function that
takes quotes into account, as given (essentially) in the bug report's
test program.
2003-09-03 04:21:29 +00:00
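
A short sketch of what quote-aware parsing means in practice, using the
public get_param() method of the Python 3 email.message.Message class rather
than the private helpers named above. The header value is made up.

```python
from email.message import Message

msg = Message()
# The semicolon inside the quotes belongs to the value; it is not a separator.
msg["Content-Type"] = 'application/octet-stream; name="a;b.bin"; x=1'

name = msg.get_param("name")
x = msg.get_param("x")
```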
Barry Warsaw
89af2f65dc Backporting email 2.5.4 fixes from the trunk.
2003-08-19 04:56:46 +00:00
Barry Warsaw
6754d52521 get_payload(): Improve the TypeError message when the payload isn't of
the expected type.  In response to SF #751451.
2003-06-10 16:31:55 +00:00
Barry Warsaw
482c5f7eb7 as_string(): Added some text to the docstring to make it clear that
it's a convenience only and give hints on what to do for more
flexibility.
2003-04-18 23:04:35 +00:00
Barry Warsaw
08898499b2 get_payload(): Teach this about various uuencoded
Content-Transfer-Encodings
2003-03-11 04:33:30 +00:00
Barry Warsaw
21191d3e31 get_payload(): If we get a low-level binascii.Error when base64
decoding the payload, just return it as-is.
2003-03-10 16:13:14 +00:00
Barry Warsaw
ee07cb1d70 get_content_charset(): RFC 2046, section 4.1.2 says charsets are not case
sensitive.  Coerce the argument to lower case.
2002-10-10 15:13:26 +00:00
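
A minimal example of the lower-casing, shown with the Python 3
email.message.Message class on the assumption that it behaves the same way:

```python
from email.message import Message

msg = Message()
msg["Content-Type"] = 'text/plain; charset="ISO-8859-1"'

# The charset comes back lower-cased regardless of how the header spells it.
charset = msg.get_content_charset()
```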
Barry Warsaw
42d1d3edc0 __contains__(): Change the second argument to `name' for consistency.
I seriously doubt this will break any deployed code.

Docstring consistency with the updated .tex files.
2002-09-30 18:17:35 +00:00
Barry Warsaw
4ece778bbc is_multipart(): Use isinstance() instead of type equality.
2002-09-28 20:41:39 +00:00
Barry Warsaw
c494549566 Docstring and code cleanups, e.g. use True/False everywhere.
2002-09-28 20:40:25 +00:00
Barry Warsaw
15aefa94d0 Fixing some RFC 2231 related issues as reported in the Spambayes
project, and with assistance from Oleg Broytmann.  Specifically,

get_param(), get_params(): Document that these methods may return
parameter values that are either strings, or 3-tuples in the case of
RFC 2231 encoded parameters.  The application should be prepared to
deal with such return values.

get_boundary(): Be prepared to deal with RFC 2231 encoded boundary
parameters.  It makes little sense to have boundaries that are
anything but ascii, so if we get back a 3-tuple from get_param() we
will decode it into ascii and let any failures percolate up.

get_content_charset(): New method which treats the charset parameter
just like the boundary parameter in get_boundary().  Note that
"get_charset()" was already taken to return the default Charset
object.

get_charsets(): Rewrite to use get_content_charset().
2002-09-26 17:19:34 +00:00
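
The sketch below shows one way an application might handle both return
shapes, using the Python 3 email package on the assumption that it still
follows the contract described above. The header value is made up;
email.utils.collapse_rfc2231_value() is a real helper that accepts either a
string or a 3-tuple.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value

msg = Message()
# RFC 2231 form: charset 'language' percent-encoded-value
msg["Content-Disposition"] = "attachment; filename*=us-ascii'en'my%20file.txt"

raw = msg.get_param("filename", header="content-disposition")

if isinstance(raw, tuple):
    # RFC 2231 encoded parameter: (charset, language, value)
    charset, language, text = raw

# Works for plain strings and 3-tuples alike.
filename = collapse_rfc2231_value(raw)
```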
Barry Warsaw
fbcde75c70 get_payload(): Document that calling it with no arguments returns a
reference to the payload.
2002-09-11 14:11:35 +00:00
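
A brief illustration of why the reference semantics matter, using the
Python 3 email.message.Message class with a multipart payload (a list of
sub-messages). The part contents are made up.

```python
from email.message import Message

outer = Message()
outer["Content-Type"] = "multipart/mixed"

part = Message()
part.set_payload("first part")
outer.attach(part)

parts = outer.get_payload()   # the message's own list, not a copy
parts.append(Message())       # so this changes the message as well

count_after = len(outer.get_payload())
```

Callers that want to modify the list without touching the message should
copy it first, for example with list(outer.get_payload()).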
Barry Warsaw
3c25535dc8 _formatparam(), set_param(): RFC 2231 encoding support by Oleg
Broytmann in SF patch #600096.  Specifically, the former function now
encodes the triplets, while the latter adds optional charset and
language arguments.
2002-09-06 03:55:04 +00:00
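
A small usage sketch of the charset and language arguments, run against the
Python 3 email.message.Message class on the assumption that its output
matches the patch. The parameter name and value are made up.

```python
from email.message import Message

msg = Message()
# With a charset given, the value is written in RFC 2231 form:
#   name*=charset'language'percent-encoded-value
msg.set_param("title", "na\u00efve", charset="utf-8", language="en")

header = msg["Content-Type"]
```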
Barry Warsaw
229727fa07 replace_header(): New method given by Skip Montanaro in SF patch
#601959.  Modified slightly by Barry (who added the KeyError in case
the header is missing).
2002-09-06 03:38:12 +00:00
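
For illustration, the two cases look like this with the Python 3
email.message.Message class. The header names and values are made up.

```python
from email.message import Message

msg = Message()
msg["Subject"] = "draft"

# An existing header is replaced in place rather than duplicated.
msg.replace_header("Subject", "final")

# A missing header raises KeyError instead of being silently added.
try:
    msg.replace_header("X-Missing", "value")
    raised = False
except KeyError:
    raised = True
```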
Barry Warsaw
48b0d36b4d Typo
2002-08-27 22:34:44 +00:00
Tim Peters
280488b9a3 Whitespace normalization.
2002-08-23 18:19:30 +00:00
Barry Warsaw
f36d804b3b get_content_type(), get_content_maintype(), get_content_subtype(): RFC
2045, section 5.2 states that if the Content-Type: header is
syntactically invalid, the default type should be text/plain.
Implement minimal sanity checking of the header -- it must have
exactly one slash in it.  This closes SF patch #597593 by Skip, but in
a different way.

Note that these methods used to raise ValueError for invalid ctypes,
but now they won't.
2002-08-20 14:50:09 +00:00
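
A quick sketch of the fallback, using the Python 3 email.message.Message
class on the assumption that it keeps the one-slash check. The helper
content_type_of is hypothetical and exists only to keep the example short.

```python
from email.message import Message

def content_type_of(header_value):
    """Return (type, maintype, subtype) for a raw Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return (
        msg.get_content_type(),
        msg.get_content_maintype(),
        msg.get_content_subtype(),
    )

valid = content_type_of("image/PNG")      # exactly one slash: used as given
no_slash = content_type_of("garbage")     # no slash: falls back
two_slashes = content_type_of("a/b/c")    # two slashes: falls back
```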