Commit Graph

217 Commits

Author SHA1 Message Date
Berker Peksag
7b4bcd2004 Issue #25270: Merge from 3.5 2016-09-16 17:32:06 +03:00
Berker Peksag
4a72a7b6c4 Issue #25270: Prevent codecs.escape_encode() from raising SystemError when an empty bytestring is passed 2016-09-16 17:31:06 +03:00
R David Murray
110b6fecbb #27364: Deprecate invalid escape strings in str/bytes.
Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
2016-09-08 15:34:08 -04:00
R David Murray
44b548dda8 #27364: fix "incorrect" uses of escape character in the stdlib.
And most of the tools.

Patch by Emanuel Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Steve Dower
f5aba58480 Issue #27959: Adds oem encoding, alias ansi to mbcs, move aliasmbcs to codec lookup 2016-09-06 19:42:27 -07:00
Serhiy Storchaka
e437a10d15 Issue #23277: Remove unused imports in tests. 2016-04-24 21:41:02 +03:00
Martin Panter
8b04a945ef Merge typo fixes from 3.5 2016-04-16 09:29:17 +00:00
Martin Panter
119e502277 Fix typos in code comments and documentation 2016-04-16 09:28:57 +00:00
Martin Panter
cda80940ed Issue #15984: Merge PyUnicode doc from 3.5 2016-04-15 02:27:11 +00:00
Martin Panter
6245cb3c01 Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc
This affects documentation, code comments, and a debugging message.
2016-04-15 02:14:19 +00:00
Martin Panter
e56a919100 Issue #25523: Merge a-to-an corrections from 3.5 2015-11-02 04:27:17 +00:00
Martin Panter
2eb819f7a8 Issue #25523: Merge "a" to "an" fixes from 3.4 into 3.5 2015-11-02 04:04:57 +00:00
Martin Panter
7462b64911 Issue #25523: Correct "a" article to "an" article
This changes the main documentation, doc strings, source code comments, and a
couple error messages in the test suite. In some cases the word was removed
or edited some other way to fix the grammar.
2015-11-02 03:37:02 +00:00
Victor Stinner
797485e101 Issue #25318: Avoid sprintf() in backslashreplace()
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().

Also add unit tests for non-BMP characters.
2015-10-09 03:17:30 +02:00
Victor Stinner
1d65d9192d Issue #25301: The UTF-8 decoder is now up to 15 times as fast for error
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
2015-10-05 13:43:50 +02:00
Serhiy Storchaka
29e68edbf4 Issue #24848: Fixed bugs in UTF-7 decoding of malformed data:
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:14:03 +03:00
Serhiy Storchaka
58c8f2bb6d Issue #24848: Fixed bugs in UTF-7 decoding of malformed data:
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:13:14 +03:00
Serhiy Storchaka
28b21e50c8 Issue #24848: Fixed bugs in UTF-7 decoding of malformed data:
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
2015-10-02 13:07:28 +03:00
Victor Stinner
01ada3996b Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
2015-10-01 21:54:51 +02:00
Victor Stinner
c3713e9706 Optimize ascii/latin1+surrogateescape encoders
Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.

Initial patch written by Serhiy Storchaka.
2015-09-29 12:32:13 +02:00
Victor Stinner
f96418de05 Issue #24870: Optimize the ASCII decoder for error handlers: surrogateescape,
ignore and replace. Initial patch written by Naoki Inada.

The decoder is now up to 60 times as fast for these error handlers.

Also add unit tests for the ASCII decoder.
2015-09-21 23:06:27 +02:00
Martin Panter
9ab96946ee Issue #16473: Merge codecs doc and test from 3.4 into 3.5 2015-09-12 01:22:17 +00:00
Martin Panter
06171bd52a Issue #16473: Fix byte transform codec documentation; test quotetabs=True
This changes the equivalent functions listed for the Base-64, hex and Quoted-
Printable codecs to reflect the functions actually used. Also mention and
test the "quotetabs" setting for Quoted-Printable encoding.
2015-09-12 00:34:28 +00:00
Serhiy Storchaka
f0eeedf0d8 Issue #22681: Added support for the koi8_t encoding. 2015-05-12 23:24:19 +03:00
Serhiy Storchaka
ad8a1c3fb2 Issue #22682: Added support for the kz1048 encoding. 2015-05-12 23:16:55 +03:00
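
Several of the commits listed above optimize or fix the codec error handlers (ignore, replace, surrogateescape, surrogatepass, backslashreplace). As a rough illustration of what those handlers do, and not a reproduction of any test from these commits, here is a minimal sketch using only the standard str/bytes API. The sample inputs and variable names are chosen here for illustration.

```python
# Illustrative inputs (assumptions, not taken from the commits above).
bad_utf8 = b"abc\xffdef"          # 0xff can never appear in valid UTF-8

# Decoding: three ways to cope with the invalid byte.
ignored = bad_utf8.decode("utf-8", errors="ignore")            # drop the byte
replaced = bad_utf8.decode("utf-8", errors="replace")          # insert U+FFFD
escaped = bad_utf8.decode("utf-8", errors="surrogateescape")   # map 0xff to U+DCFF

# surrogateescape is reversible: encoding restores the original bytes.
round_trip = escaped.encode("utf-8", errors="surrogateescape")

# surrogatepass lets a lone surrogate through the UTF-8 encoder.
lone_surrogate = "\ud800".encode("utf-8", errors="surrogatepass")

# backslashreplace on encoding, including a non-BMP character.
text = "caf\xe9 \U0001f600"
ascii_bytes = text.encode("ascii", errors="backslashreplace")

print(repr(ignored), repr(replaced), repr(escaped))
print(round_trip, lone_surrogate, ascii_bytes)
```

The speedups quoted in the commit messages concern how quickly these paths run, not what they return, so the results shown by this sketch should be the same before and after those changes.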