Commit Graph

41 Commits

Author SHA1 Message Date
Georges Toth
303aac8c56 bpo-30681: Support invalid date format or value in email Date header (GH-22090)
I am re-submitting an older PR which was abandoned but is still relevant, #10783 by @timb07.

The issue being solved () is still relevant. The original PR #10783 was closed as
the final request changes were not applied and since abandoned.

In this new PR I have re-used the original patch plus applied both comments from the review, by @maxking and @pganssle.


For reference, here is the original PR description:
In email.utils.parsedate_to_datetime(), a failure to parse the date, or invalid date components (such as hour outside 0..23) raises an exception. Document this behaviour, and add tests to test_email/test_utils.py to confirm this behaviour.

In email.headerregistry.DateHeader.parse(), check when parsedate_to_datetime() raises an exception and add a new defect InvalidDateDefect; preserve the invalid value as the string value of the header, but set the datetime attribute to None.

Add tests to test_email/test_headerregistry.py to confirm this behaviour; also added test to test_email/test_inversion.py to confirm emails with such defective date headers round trip successfully.

This pull request incorporates feedback gratefully received from @bitdancer, @brettcannon, @Mariatta and @warsaw, and replaces the earlier PR #2254.

Automerge-Triggered-By: GH:warsaw
2020-10-26 17:31:06 -07:00
Jürgen Gmach
66a65ba43c Improve readability of formataddr docstring (GH-20963)
For me as a non native English speaker, the sentence with its embedded clause was very hard to understand.

modified:   Lib/email/utils.py

Automerge-Triggered-By: @csabella
2020-06-19 04:57:30 -07:00
Michael Selik
2702638eab bpo-34002: Minor efficiency and clarity improvements in email package. (GH-7999)
* Check intersection of two sets explicitly

Comparing ``len(a) > ``len(a - b)`` is essentially looking for an
intersection between the two sets. If set ``b`` does not intersect ``a``
then ``len(a - b)`` will be equal to ``len(a)``. This logic is more
clearly expressed as ``a & b``.

* Change while/pop to a for-loop

Copying the list, then repeatedly popping the first element was
unnecessarily slow. I also cleaned up a couple other inefficiencies.
There's no need to unpack a tuple, then re-pack and append it. The list
can be created with the first element instead of empty. Secondly, the
``endswith`` method returns a bool, so there's no need for an if-
statement to set ``encoding`` to True or False.

* Use set.intersection to check for intersections

``a.intersection(b)`` method is more clear of purpose than ``not
a.isdisjoint(b)`` and avoids an unnecessary set construction that ``a &
set(b)`` performs.

* Use not isdisjoint instead of intersection

While it reads slightly worse, the isdisjoint method will stop when it
finds a counterexample and returns a bool, rather than looping over the
entire iterable and constructing a new set.
2019-09-19 20:25:55 -07:00
INADA Naoki
bf477a99e0 bpo-31677: email: Remove re.IGNORECASE flag (GH-3868)
While there is not real bug in this case, using re.IGNORECASE without re.ASCII
leads unexpected behavior.
Instead of adding re.ASCII, this commit removes re.IGNORECASE flag because
it's easier and simpler.

This commit removes dead copy of the pattern in email.util module too.
While the pattern is same, it is compiled separately because it had different flags.
2017-10-04 12:47:38 +09:00
Rohit Balasubramanian
9e7b9b21fe bpo-31507 Add docstring to parseaddr function in email.utils.parseaddr (gh-3647) 2017-09-19 15:10:49 -04:00
Martin Panter
6245cb3c01 Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc
This affects documentation, code comments, and a debugging messages.
2016-04-15 02:14:19 +00:00
Robert Collins
2080dc97a7 Issue #22932: Fix timezones in email.utils.formatdate.
Patch from Dmitry Shachnev.
2015-08-01 08:18:22 +12:00
Serhiy Storchaka
ae760c0a2c Issue #6598: Increased time precision and random number range in
email.utils.make_msgid() to strengthen the uniqueness of the message ID.
2015-05-19 10:09:42 +03:00
R David Murray
95a8dfb924 #20976: remove unneeded quopri import in email.utils. 2014-03-23 14:18:44 -04:00
Victor Stinner
7fa767e517 Issue #20976: pyflakes: Remove unused imports 2014-03-20 09:16:38 +01:00
R David Murray
c489e83432 Merge: #17369: Improve handling of broken RFC2231 values in get_filename. 2014-02-07 15:04:26 -05:00
R David Murray
1e949890f6 #17369: Improve handling of broken RFC2231 values in get_filename.
This fixes a regression relative to python2.
2014-02-07 15:02:19 -05:00
R David Murray
3da240fd01 #18891: Complete new provisional email API.
This adds EmailMessage and, MIMEPart subclasses of Message
with new API methods, and a ContentManager class used by
the new methods.  Also a new policy setting, content_manager.

Patch was reviewed by Stephen J. Turnbull and Serhiy Storchaka,
and reflects their feedback.

I will ideally add some examples of using the new API to the
documentation before the final release.
2013-10-16 22:48:40 -04:00
R David Murray
b83ee30fc1 #11454: Reduce email module load time, improve surrogate check efficiency.
The new _has_surrogates code was suggested by Serhiy Storchaka.  See
the issue for timings, but it is far faster than any other alternative,
and also removes the load time that we previously incurred from compiling
the complex regex this replaces.
2013-06-26 12:06:21 -04:00
Andrew Svetlov
5b89840d9c Issue #16714: use 'raise' exceptions, don't 'throw'.
Patch by Serhiy Storchaka.
2012-12-18 21:26:36 +02:00
Georg Brandl
1aca31e8f3 Closes #15925: fix regression in parsedate() and parsedate_tz() that should return None if unable to parse the argument. 2012-09-22 09:03:56 +02:00
Alexander Belopolsky
f9bd9141c5 Issue #665194: Added a small optimization 2012-08-22 23:02:36 -04:00
R David Murray
097a1208bc #665194: fix variable name in exception code path.
It was correct in the original patch and I foobared it
when I restructured part of the code.
2012-08-22 21:52:31 -04:00
R David Murray
b8687df653 #665194: Update email.utils.localtime to use astimezone, and fix bug.
The new code correctly handles historic changes in UTC offsets.
A test for this should follow.

Original patch by Alexander Belopolsky.
2012-08-22 21:34:00 -04:00
R David Murray
d2d521eafd #665194: Add a localtime function to email.utils.
Without this function people would be tempted to use the other date functions
in email.utils to compute an aware localtime, and those functions are not as
good for that purpose as this code.  The code is Alexander Belopolsy's from
his proposed patch for issue 9527, with a fix (and additional tests) by Brian
K. Jones.
2012-05-25 23:22:59 -04:00
R David Murray
0b6f6c82b5 #12586: add provisional email policy with new header parsing and folding.
When the new policies are used (and only when the new policies are explicitly
used) headers turn into objects that have attributes based on their parsed
values, and can be set using objects that encapsulate the values, as well as
set directly from unicode strings.  The folding algorithm then takes care of
encoding unicode where needed, and folding according to the highest level
syntactic objects.

With this patch only date and time headers are parsed as anything other than
unstructured, but that is all the helper methods in the existing API handle.
I do plan to add more parsers, and complete the set specified in the RFC
before the package becomes stable.
2012-05-25 18:42:14 -04:00
R David Murray
c27e52265b #14731: refactor email policy framework.
This patch primarily does two things: (1) it adds some internal-interface
methods to Policy that allow for Policy to control the parsing and folding of
headers in such a way that we can construct a backward compatibility policy
that is 100% compatible with the 3.2 API, while allowing a new policy to
implement the email6 API.  (2) it adds that backward compatibility policy and
refactors the test suite so that the only differences between the 3.2
test_email.py file and the 3.3 test_email.py file is some small changes in
test framework and the addition of tests for bugs fixed that apply to the 3.2
API.

There are some additional teaks, such as moving just the code needed for the
compatibility policy into _policybase, so that the library code can import
only _policybase.  That way the new code that will be added for email6
will only get imported when a non-compatibility policy is imported.
2012-05-25 15:01:48 -04:00
R David Murray
b53319f509 #12818: remove escaping of () in quoted strings in formataddr
The quoting of ()s inside quoted strings is allowed by the RFC, but is not
needed.  There seems to be no reason to add needless escapes.
2012-03-14 15:31:47 -04:00
R David Murray
875048bd4c #665194: support roundtripping RFC2822 date stamps in the email.utils module 2011-07-20 11:41:21 -04:00
R David Murray
8debacb51c #1690608: make formataddr RFC2047 aware.
Patch by Torsten Becker.
2011-04-06 09:35:57 -04:00