The new _has_surrogates code was suggested by Serhiy Storchaka. See
the issue for timings, but it is far faster than any other alternative,
and also removes the load time that we previously incurred from compiling
the complex regex this replaces.
Without this function people would be tempted to use the other date functions
in email.utils to compute an aware localtime, and those functions are not as
good for that purpose as this code. The code is Alexander Belopolsy's from
his proposed patch for issue 9527, with a fix (and additional tests) by Brian
K. Jones.
When the new policies are used (and only when the new policies are explicitly
used) headers turn into objects that have attributes based on their parsed
values, and can be set using objects that encapsulate the values, as well as
set directly from unicode strings. The folding algorithm then takes care of
encoding unicode where needed, and folding according to the highest level
syntactic objects.
With this patch only date and time headers are parsed as anything other than
unstructured, but that is all the helper methods in the existing API handle.
I do plan to add more parsers, and complete the set specified in the RFC
before the package becomes stable.
This patch primarily does two things: (1) it adds some internal-interface
methods to Policy that allow for Policy to control the parsing and folding of
headers in such a way that we can construct a backward compatibility policy
that is 100% compatible with the 3.2 API, while allowing a new policy to
implement the email6 API. (2) it adds that backward compatibility policy and
refactors the test suite so that the only differences between the 3.2
test_email.py file and the 3.3 test_email.py file is some small changes in
test framework and the addition of tests for bugs fixed that apply to the 3.2
API.
There are some additional teaks, such as moving just the code needed for the
compatibility policy into _policybase, so that the library code can import
only _policybase. That way the new code that will be added for email6
will only get imported when a non-compatibility policy is imported.
Code contributed by Matt Giuca. quote() now encodes the input
before quoting, unquote() decodes after unquoting. There are
new arguments to change the encoding and errors settings.
There are also new APIs to skip the encode/decode steps.
[un]quote_plus() are also affected.
It consists of code from urllib, urllib2, urlparse, and robotparser.
The old modules have all been removed. The new package has five
submodules: urllib.parse, urllib.request, urllib.response,
urllib.error, and urllib.robotparser. The urllib.request.urlopen()
function uses the url opener from urllib2.
Note that the unittests have not been renamed for the
beta, but they will be renamed in the future.
Joint work with Senthil Kumaran.
MIMEApplication() requires a bytes object for its _data, so fix the tests.
We no longer need utils._identity() or utils._bdecode(). The former isn't
used anywhere AFAICT (where's "make test's" lint? <wink>) and the latter is a
kludge that is eliminated by base64.b64encode().
Current status: 5F/5E
This should restore the email package in the py3k branch to exactly what's in
the sandbox.
This wipes out 1-2 fixes made post-copy, which I'll re-apply shortly.
Completely get rid of StringIO.py and cStringIO.c.
I had to fix a few tests and modules beyond what Christian did, and
invent a few conventions. E.g. in elementtree, I chose to
write/return Unicode strings whe no encoding is given, but bytes when
an explicit encoding is given. Also mimetools was made to always
assume binary files.