Commit Graph

1414 Commits

Author SHA1 Message Date
Miss Islington (bot)
bdeb56cd21 bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
(cherry picked from commit 4013c17911)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-12-03 01:09:11 -08:00
Victor Stinner
85ab974f78 bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) (GH-10761)
Fix memory leak in PyUnicode_EncodeLocale() and
PyUnicode_EncodeFSDefault() on error handling.

Fix unicode_encode_locale() error handling.

(cherry picked from commit bde9d6bbb4)
2018-11-28 12:42:40 +01:00
Victor Stinner
7f9fb0f345 bpo-33954: Rewrite FILL() macro of unicodeobject.c (GH-10738)
Copy code from master: add assertions on start and value, replace 'i'
iterator with 'end' pointer for the loop stop condition.

_PyUnicode_FastFill(): fix type of 'data', it must not be constant,
since data is modified by FILL().
2018-11-27 12:42:04 +01:00
Victor Stinner
6f5fa1b4be bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623) (GH-10718)
Fix str.format(), float.__format__() and complex.__format__() methods
for non-ASCII decimal point when using the "n" formatter.

Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires
a _PyUnicodeWriter object for the buffer and a Python str object
for digits.

(cherry picked from commit 59423e3ddd)
2018-11-26 14:17:01 +01:00
Miss Islington (bot)
9fbcb1402e [3.7] bpo-35214: Fix OOB memory access in unicode escape parser (GH-10506) (GH-10522)
Discovered using clang's MemorySanitizer when it ran python3's
test_fstring test_misformed_unicode_character_name.

An msan build will fail by simply executing: ./python -c 'u"\N"'
(cherry picked from commit 746b2d35ea)


Co-authored-by: Gregory P. Smith <greg@krypto.org>


https://bugs.python.org/issue35214
2018-11-13 16:39:36 -08:00
Miss Islington (bot)
1e596d3a20 bpo-34435: Add missing NULL check to unicode_encode_ucs1(). (GH-8823)
Reported by Svace static analyzer.
(cherry picked from commit 74a307d48e)

Co-authored-by: Alexey Izbyshev <izbyshev@users.noreply.github.com>
2018-08-19 16:17:53 -04:00
Miss Islington (bot)
c721472fb8 bpo-34087: Fix buffer overflow in int(s) and similar functions (GH-8274)
`_PyUnicode_TransformDecimalAndSpaceToASCII()` missed trailing NUL char.
It caused buffer overflow in `_Py_string_to_number_with_underscores()`.

This bug is introduced in 9b6c60cb.
(cherry picked from commit 16dfca4d82)

Co-authored-by: INADA Naoki <methane@users.noreply.github.com>
2018-07-13 20:58:12 -07:00
Miss Islington (bot)
09819ef05a bpo-32827: Fix usage of _PyUnicodeWriter_Prepare() in decoding errors handler. (GH-5636) (GH-5650)
(cherry picked from commit b7e2d67f7c)


Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-02-13 09:15:57 +02:00
Xiang Zhang
86fdad093b bpo-32583: Fix possible crashing in builtin Unicode decoders (#5325)
When using customized decode error handlers, it is possible for builtin decoders
to write out-of-bounds and then crash.
2018-01-31 17:02:12 -05:00
INADA Naoki
7cc95f5069 Fix wrong assert in unicodeobject (GH-5340) 2018-01-28 02:07:09 +09:00
INADA Naoki
a49ac99029 bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342) 2018-01-27 14:06:21 +09:00
Victor Stinner
7ed7aead95 bpo-29240: Fix locale encodings in UTF-8 Mode (#5170)
Modify locale.localeconv(), time.tzname, os.strerror() and other
functions to ignore the UTF-8 Mode: always use the current locale
encoding.

Changes:

* Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or
  encoding error, they return the position of the error and an error
  message which are used to raise Unicode errors in
  PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale().
* Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx().
* PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all
  cases, especially for the strict error handler.
* Add _Py_DecodeUTF8Ex(): return more information on decoding error
  and supports the strict error handler.
* Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex().
* Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx().
* Ignore the UTF-8 mode to encode/decode localeconv(), strerror()
  and time zone name.
* Remove PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize()
  and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use
  the "current" locale.
* Remove _PyUnicode_DecodeCurrentLocale(),
  _PyUnicode_DecodeCurrentLocaleAndSize() and
  _PyUnicode_EncodeCurrentLocale().
2018-01-15 10:45:49 +01:00
Victor Stinner
cb3ae5588b bpo-29240: Ignore UTF-8 Mode in time module (#5148)
time.strftime() must use the current LC_CTYPE encoding, not UTF-8
if the UTF-8 mode is enabled.

Add _PyUnicode_DecodeCurrentLocale() function.
2018-01-11 10:37:59 +01:00
Victor Stinner
2cba6b8579 bpo-29240: readline now ignores the UTF-8 Mode (#5145)
Add new fuctions ignoring the UTF-8 mode:

* _Py_DecodeCurrentLocale()
* _Py_EncodeCurrentLocale()
* _PyUnicode_DecodeCurrentLocaleAndSize()
* _PyUnicode_EncodeCurrentLocale()

Modify the readline module to use these functions.

Re-enable test_readline.test_nonascii().
2018-01-10 22:46:15 +01:00
Victor Stinner
9dd762013f bpo-32030: Add _Py_EncodeLocaleRaw() (#4961)
Replace Py_EncodeLocale() with _Py_EncodeLocaleRaw() in:

* _Py_wfopen()
* _Py_wreadlink()
* _Py_wrealpath()
* _Py_wstat()
* pymain_open_filename()

These functions are called early during Python intialization, only
the RAW memory allocator must be used.
2017-12-21 16:20:32 +01:00
Victor Stinner
e47e698da6 bpo-32030: Add _Py_EncodeUTF8_surrogateescape() (#4960)
Py_EncodeLocale() now uses _Py_EncodeUTF8_surrogateescape(), instead
of using temporary unicode and bytes objects. So Py_EncodeLocale()
doesn't use the Python C API anymore.
2017-12-21 15:45:16 +01:00
Serhiy Storchaka
a5552f023e bpo-32240: Add the const qualifier to declarations of PyObject* array arguments. (#4746) 2017-12-15 13:11:11 +02:00
Victor Stinner
91106cd9ff bpo-29240: PEP 540: Add a new UTF-8 Mode (#855)
* Add -X utf8 command line option, PYTHONUTF8 environment variable
  and a new sys.flags.utf8_mode flag.
* If the LC_CTYPE locale is "C" at startup: enable automatically the
  UTF-8 mode.
* Add _winapi.GetACP(). encodings._alias_mbcs() now calls
  _winapi.GetACP() to get the ANSI code page
* locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
  mode. As a side effect, open() now uses the UTF-8 encoding by
  default in this mode.
* Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
  in the UTF-8 Mode.
* Update subprocess._args_from_interpreter_flags() to handle -X utf8
* Skip some tests relying on the current locale if the UTF-8 mode is
  enabled.
* Add test_utf8mode.py.
* _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
  return also the length (number of wide characters).
* pymain_get_global_config() and pymain_set_global_config() now
  always copy flag values, rather than only copying if the new value
  is greater than the old value.
2017-12-13 12:29:09 +01:00
Victor Stinner
6a54c676e6 bpo-31979: Remove unused align_maxchar() function (#4527) 2017-11-23 19:02:23 +01:00
Serhiy Storchaka
9b6c60cbce bpo-31979: Simplify transforming decimals to ASCII (#4336)
in int(), float() and complex() parsers.

This also speeds up parsing non-ASCII numbers by around 20%.
2017-11-13 21:23:48 +02:00
Serhiy Storchaka
e2f92de6a9 Add the const qualifier to "char *" variables that refer to literal strings. (#4370) 2017-11-11 13:06:26 +02:00
stratakis
e8b1965639 bpo-23699: Use a macro to reduce boilerplate code in rich comparison functions (GH-793) 2017-11-02 20:32:54 +10:00
Serhiy Storchaka
a2314283ff bpo-20047: Make bytearray methods partition() and rpartition() rejecting (#4158)
separators that are not bytes-like objects.
2017-10-29 02:11:54 +03:00
Serhiy Storchaka
56cb465cc9 bpo-31825: Fixed OverflowError in the 'unicode-escape' codec (#4058)
and in codecs.escape_decode() when decode an escaped non-ascii byte.
2017-10-20 17:08:15 +03:00
Barry Warsaw
b2e5794870 bpo-31338 (#3374)
* Add Py_UNREACHABLE() as an alias to abort().
* Use Py_UNREACHABLE() instead of assert(0)
* Convert more unreachable code to use Py_UNREACHABLE()
* Document Py_UNREACHABLE() and a few other macros.
2017-09-14 18:13:16 -07:00