1416 Commits

Author SHA1 Message Date
Miss Islington (bot)
cbc7c2c791 bpo-35552: Fix reading past the end in PyUnicode_FromFormat() and PyBytes_FromFormat(). (GH-11276)
Format characters "%s" and "%V" in PyUnicode_FromFormat() and "%s" in PyBytes_FromFormat()
no longer read memory past the limit if precision is specified.
(cherry picked from commit d586ccb04f)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2019-01-12 00:52:55 -08:00
Miss Islington (bot)
9a413faa87 bpo-35560: Remove assertion from format(float, "n") (GH-11288)
Fix an assertion error in format() in debug build for floating point
formatting with "n" format, zero padding and small width. Release build is
not impacted. Patch by Karthikeyan Singaravelan.
(cherry picked from commit 3f7983a25a)

Co-authored-by: Xtreak <tir.karthi@gmail.com>
2019-01-07 07:26:20 -08:00
Miss Islington (bot)
bdeb56cd21 bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
(cherry picked from commit 4013c17911)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-12-03 01:09:11 -08:00
Victor Stinner
85ab974f78 bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) (GH-10761)
Fix memory leak in PyUnicode_EncodeLocale() and
PyUnicode_EncodeFSDefault() on error handling.

Fix unicode_encode_locale() error handling.

(cherry picked from commit bde9d6bbb4)
2018-11-28 12:42:40 +01:00
Victor Stinner
7f9fb0f345 bpo-33954: Rewrite FILL() macro of unicodeobject.c (GH-10738)
Copy code from master: add assertions on start and value, replace 'i'
iterator with 'end' pointer for the loop stop condition.

_PyUnicode_FastFill(): fix type of 'data', it must not be constant,
since data is modified by FILL().
2018-11-27 12:42:04 +01:00
Victor Stinner
6f5fa1b4be bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623) (GH-10718)
Fix str.format(), float.__format__() and complex.__format__() methods
for non-ASCII decimal point when using the "n" formatter.

Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires
a _PyUnicodeWriter object for the buffer and a Python str object
for digits.

(cherry picked from commit 59423e3ddd)
2018-11-26 14:17:01 +01:00
Miss Islington (bot)
9fbcb1402e [3.7] bpo-35214: Fix OOB memory access in unicode escape parser (GH-10506) (GH-10522)
Discovered using clang's MemorySanitizer when it ran python3's
test_fstring test_misformed_unicode_character_name.

An msan build will fail by simply executing: ./python -c 'u"\N"'
(cherry picked from commit 746b2d35ea)


Co-authored-by: Gregory P. Smith <greg@krypto.org>


https://bugs.python.org/issue35214
2018-11-13 16:39:36 -08:00
Miss Islington (bot)
1e596d3a20 bpo-34435: Add missing NULL check to unicode_encode_ucs1(). (GH-8823)
Reported by Svace static analyzer.
(cherry picked from commit 74a307d48e)

Co-authored-by: Alexey Izbyshev <izbyshev@users.noreply.github.com>
2018-08-19 16:17:53 -04:00
Miss Islington (bot)
c721472fb8 bpo-34087: Fix buffer overflow in int(s) and similar functions (GH-8274)
`_PyUnicode_TransformDecimalAndSpaceToASCII()` missed trailing NUL char.
It caused buffer overflow in `_Py_string_to_number_with_underscores()`.

This bug is introduced in 9b6c60cb.
(cherry picked from commit 16dfca4d82)

Co-authored-by: INADA Naoki <methane@users.noreply.github.com>
2018-07-13 20:58:12 -07:00
Miss Islington (bot)
09819ef05a bpo-32827: Fix usage of _PyUnicodeWriter_Prepare() in decoding errors handler. (GH-5636) (GH-5650)
(cherry picked from commit b7e2d67f7c)


Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-02-13 09:15:57 +02:00
Xiang Zhang
86fdad093b bpo-32583: Fix possible crashing in builtin Unicode decoders (#5325)
When using customized decode error handlers, it is possible for builtin decoders
to write out-of-bounds and then crash.
2018-01-31 17:02:12 -05:00
INADA Naoki
7cc95f5069 Fix wrong assert in unicodeobject (GH-5340) 2018-01-28 02:07:09 +09:00
INADA Naoki
a49ac99029 bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342) 2018-01-27 14:06:21 +09:00
Victor Stinner
7ed7aead95 bpo-29240: Fix locale encodings in UTF-8 Mode (#5170)
Modify locale.localeconv(), time.tzname, os.strerror() and other
functions to ignore the UTF-8 Mode: always use the current locale
encoding.

Changes:

* Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or
  encoding error, they return the position of the error and an error
  message which are used to raise Unicode errors in
  PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale().
* Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx().
* PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all
  cases, especially for the strict error handler.
* Add _Py_DecodeUTF8Ex(): return more information on decoding error
  and supports the strict error handler.
* Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex().
* Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx().
* Ignore the UTF-8 mode to encode/decode localeconv(), strerror()
  and time zone name.
* Remove PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize()
  and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use
  the "current" locale.
* Remove _PyUnicode_DecodeCurrentLocale(),
  _PyUnicode_DecodeCurrentLocaleAndSize() and
  _PyUnicode_EncodeCurrentLocale().
2018-01-15 10:45:49 +01:00
Victor Stinner
cb3ae5588b bpo-29240: Ignore UTF-8 Mode in time module (#5148)
time.strftime() must use the current LC_CTYPE encoding, not UTF-8
if the UTF-8 mode is enabled.

Add _PyUnicode_DecodeCurrentLocale() function.
2018-01-11 10:37:59 +01:00
Victor Stinner
2cba6b8579 bpo-29240: readline now ignores the UTF-8 Mode (#5145)
Add new fuctions ignoring the UTF-8 mode:

* _Py_DecodeCurrentLocale()
* _Py_EncodeCurrentLocale()
* _PyUnicode_DecodeCurrentLocaleAndSize()
* _PyUnicode_EncodeCurrentLocale()

Modify the readline module to use these functions.

Re-enable test_readline.test_nonascii().
2018-01-10 22:46:15 +01:00
Victor Stinner
9dd762013f bpo-32030: Add _Py_EncodeLocaleRaw() (#4961)
Replace Py_EncodeLocale() with _Py_EncodeLocaleRaw() in:

* _Py_wfopen()
* _Py_wreadlink()
* _Py_wrealpath()
* _Py_wstat()
* pymain_open_filename()

These functions are called early during Python intialization, only
the RAW memory allocator must be used.
2017-12-21 16:20:32 +01:00
Victor Stinner
e47e698da6 bpo-32030: Add _Py_EncodeUTF8_surrogateescape() (#4960)
Py_EncodeLocale() now uses _Py_EncodeUTF8_surrogateescape(), instead
of using temporary unicode and bytes objects. So Py_EncodeLocale()
doesn't use the Python C API anymore.
2017-12-21 15:45:16 +01:00
Serhiy Storchaka
a5552f023e bpo-32240: Add the const qualifier to declarations of PyObject* array arguments. (#4746) 2017-12-15 13:11:11 +02:00
Victor Stinner
91106cd9ff bpo-29240: PEP 540: Add a new UTF-8 Mode (#855)
* Add -X utf8 command line option, PYTHONUTF8 environment variable
  and a new sys.flags.utf8_mode flag.
* If the LC_CTYPE locale is "C" at startup: enable automatically the
  UTF-8 mode.
* Add _winapi.GetACP(). encodings._alias_mbcs() now calls
  _winapi.GetACP() to get the ANSI code page
* locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
  mode. As a side effect, open() now uses the UTF-8 encoding by
  default in this mode.
* Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
  in the UTF-8 Mode.
* Update subprocess._args_from_interpreter_flags() to handle -X utf8
* Skip some tests relying on the current locale if the UTF-8 mode is
  enabled.
* Add test_utf8mode.py.
* _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
  return also the length (number of wide characters).
* pymain_get_global_config() and pymain_set_global_config() now
  always copy flag values, rather than only copying if the new value
  is greater than the old value.
2017-12-13 12:29:09 +01:00
Victor Stinner
6a54c676e6 bpo-31979: Remove unused align_maxchar() function (#4527) 2017-11-23 19:02:23 +01:00
Serhiy Storchaka
9b6c60cbce bpo-31979: Simplify transforming decimals to ASCII (#4336)
in int(), float() and complex() parsers.

This also speeds up parsing non-ASCII numbers by around 20%.
2017-11-13 21:23:48 +02:00
Serhiy Storchaka
e2f92de6a9 Add the const qualifier to "char *" variables that refer to literal strings. (#4370) 2017-11-11 13:06:26 +02:00
stratakis
e8b1965639 bpo-23699: Use a macro to reduce boilerplate code in rich comparison functions (GH-793) 2017-11-02 20:32:54 +10:00
Serhiy Storchaka
a2314283ff bpo-20047: Make bytearray methods partition() and rpartition() rejecting (#4158)
separators that are not bytes-like objects.
2017-10-29 02:11:54 +03:00