cpython

mirror of https://github.com/AdaCore/cpython.git synced 2026-02-12 12:57:15 -08:00

Author	SHA1	Message	Date
Victor Stinner	c6b292cdee	bpo-29882: Add _Py_popcount32() function (GH-20518) * Rename pycore_byteswap.h to pycore_bitutils.h. * Move popcount_digit() to pycore_bitutils.h as _Py_popcount32(). * _Py_popcount32() uses GCC and clang builtin function if available. * Add unit tests to _Py_popcount32().	2020-06-08 16:30:33 +02:00
Victor Stinner	d7c657d4b1	bpo-40302: UTF-32 encoder SWAB4() macro use a|b rather than a+b (GH-19572)	2020-04-17 19:13:34 +02:00
Victor Stinner	1ae035b7e8	bpo-40302: Add pycore_byteswap.h header file (GH-19552) Add a new internal pycore_byteswap.h header file with the following functions: * _Py_bswap16() * _Py_bswap32() * _Py_bswap64() Use these functions in _ctypes, sha256 and sha512 modules, and also use in the UTF-32 encoder. sha256, sha512 and _ctypes modules are now built with the internal C API.	2020-04-17 17:47:20 +02:00
Serhiy Storchaka	cd8295ff75	bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345)	2020-04-11 10:48:40 +03:00
Benjamin Peterson	51796e5d26	Update some www.unicode.org URLs to use HTTPS. (GH-18912)	2020-03-10 21:10:59 -07:00
Inada Naoki	02a4d57263	bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327) Avoid using temporary bytes object.	2020-02-27 13:48:59 +09:00
Andy Lester	e6be9b59a9	closes bpo-39605: Fix some casts to not cast away const. (GH-18453) gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either: Adding the const to the type cast, as in: - return _PyUnicode_FromUCS1((unsigned char*)s, size); + return _PyUnicode_FromUCS1((const unsigned char*)s, size); or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in: - PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno); + PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno); These changes will not change code, but they will make it much easier to check for errors in consts	2020-02-11 18:28:35 -08:00
Serhiy Storchaka	894263ba80	bpo-24214: Fixed the UTF-8 and UTF-16 incremental decoders. (GH-14304) * The UTF-8 incremental decoders fails now fast if encounter a sequence that can't be handled by the error handler. * The UTF-16 incremental decoders with the surrogatepass error handler decodes now a lone low surrogate with final=False.	2019-06-25 11:54:18 +03:00
Victor Stinner	709d23dee6	bpo-36775: _PyCoreConfig only uses wchar_t* (GH-13062) _PyCoreConfig: Change filesystem_encoding, filesystem_errors, stdio_encoding and stdio_errors fields type from char* to wchar_t*. Changes: * PyInterpreterState: replace fscodec_initialized (int) with fs_codec structure. * Add get_error_handler_wide() and unicode_encode_utf8() helper functions. * Add error_handler parameter to unicode_encode_locale() and unicode_decode_locale(). * Remove _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideString() to _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideStringFromString() to _PyCoreConfig_DecodeLocale().	2019-05-02 14:56:30 -04:00
Victor Stinner	3d4226a832	bpo-34523: Support surrogatepass in locale codecs (GH-8995) Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding. Changes: * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS). * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown. * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs. * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function. * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.	2018-08-29 22:21:32 +02:00
Stefan Krah	f432a3234f	bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)	2017-08-21 13:09:59 +02:00
Serhiy Storchaka	998c9cdd42	Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. Patch by Xiang Zhang.	2016-10-30 18:25:27 +02:00
Victor Stinner	1a05d6c04d	PEP 7 style for if/else in C Add also a newline for readability in normalize_encoding().	2016-09-02 12:12:23 +02:00
Raymond Hettinger	15f44ab043	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Serhiy Storchaka	bcde10aa7e	Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with correct type.	2016-05-16 09:42:29 +03:00
Victor Stinner	6bd525b656	Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path	2015-10-09 13:10:05 +02:00
Victor Stinner	ce179bf6ba	Add _PyBytesWriter_WriteBytes() to factorize the code	2015-10-09 12:57:22 +02:00
Victor Stinner	ad7715891e	_PyBytesWriter: simplify code to avoid "prealloc" parameters Substract preallocate bytes from min_size before calling _PyBytesWriter_Prepare().	2015-10-09 12:38:53 +02:00
Victor Stinner	e7bf86cd7d	Optimize backslashreplace error handler Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. It avoids to create an exception and call the slow implementation of the error handler.	2015-10-09 01:39:28 +02:00
Victor Stinner	fdfbf78114	Issue #25318: Add _PyBytesWriter API Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.	2015-10-09 00:33:49 +02:00
Victor Stinner	01ada3996b	Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.	2015-10-01 21:54:51 +02:00
Serhiy Storchaka	9ce71a6475	Fixed typos in comments.	2015-05-18 22:20:18 +03:00
Serhiy Storchaka	7e29eea926	Fixed typos in comments.	2015-05-18 22:19:42 +03:00
Serhiy Storchaka	0d4df752ac	Issue #15027: The UTF-32 encoder is now 3x to 7x faster.	2015-05-12 23:12:45 +03:00
Serhiy Storchaka	3079328d29	Reverted changeset b72c5573c5e7 (issue #15027).	2014-01-04 22:44:01 +02:00
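
Several of the commits above optimize how the UTF-8, ASCII and Latin1 encoders apply error handlers (ignore, replace, surrogateescape, surrogatepass, backslashreplace, xmlcharrefreplace). The sketch below only illustrates what those handlers produce when called from Python; it does not reproduce the C code these commits touch. The helper name encode_with and the sample strings are assumptions made for illustration.

```python
# A lone surrogate cannot be encoded to UTF-8 under the default "strict"
# handler, which makes it a convenient input for comparing error handlers.
LONE_SURROGATE = "a\udc80b"


def encode_with(handler: str, text: str = LONE_SURROGATE,
                encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: encode `text` using the named error handler."""
    return text.encode(encoding, errors=handler)


# Handlers named in the commit messages, applied through the UTF-8 encoder.
UTF8_HANDLERS = (
    "ignore",
    "replace",
    "surrogateescape",
    "surrogatepass",
    "backslashreplace",
    "xmlcharrefreplace",
)
UTF8_RESULTS = {name: encode_with(name) for name in UTF8_HANDLERS}

# The same handlers also apply to the ASCII and Latin1 encoders whenever a
# character falls outside the range the target encoding can represent.
ASCII_BACKSLASH = encode_with("backslashreplace", "\xe9", "ascii")
LATIN1_XMLREF = encode_with("xmlcharrefreplace", "\u20ac", "latin-1")

if __name__ == "__main__":
    for name, encoded in UTF8_RESULTS.items():
        print(f"{name:18} {encoded!r}")
    print("ascii/backslashreplace   ", ASCII_BACKSLASH)
    print("latin-1/xmlcharrefreplace", LATIN1_XMLREF)
```

With the default strict handler the same input raises UnicodeEncodeError, which is the slower path the backslashreplace commit describes avoiding.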


42 Commits
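
The most recent commits in this history add byte-swap helpers (_Py_bswap16(), _Py_bswap32(), _Py_bswap64()), a 32-bit population count (_Py_popcount32()), and switch the SWAB4() macro from a+b to a|b. The following is a minimal Python sketch of the arithmetic involved, not the C implementation: the function names are hypothetical, and the bit-twiddling popcount is one common fallback technique, assumed here rather than taken from the commits. The log does not state why a|b was preferred, so the sketch only shows that the two forms agree when the combined bit fields do not overlap.

```python
MASK32 = 0xFFFFFFFF


def bswap32(word: int) -> int:
    """Reverse the byte order of a 32-bit value, combining the parts with |."""
    return (((word & 0x000000FF) << 24)
            | ((word & 0x0000FF00) << 8)
            | ((word & 0x00FF0000) >> 8)
            | ((word & 0xFF000000) >> 24))


def bswap32_add(word: int) -> int:
    """Same as bswap32, but combining the parts with +.

    Each shifted part occupies a distinct byte, so no carries occur and
    the result matches the | version.
    """
    return (((word & 0x000000FF) << 24)
            + ((word & 0x0000FF00) << 8)
            + ((word & 0x00FF0000) >> 8)
            + ((word & 0xFF000000) >> 24))


def popcount32(x: int) -> int:
    """Count the set bits of a 32-bit value using parallel bit sums."""
    x &= MASK32
    x = x - ((x >> 1) & 0x55555555)                  # sums of 2-bit groups
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # sums of 4-bit groups
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # sums of 8-bit groups
    # The multiplication accumulates the four byte sums into the top byte.
    return ((x * 0x01010101) & MASK32) >> 24


if __name__ == "__main__":
    print(hex(bswap32(0x11223344)))
    print(popcount32(0b1011))
```

A quick cross-check is to compare bswap32 against int.to_bytes() and int.from_bytes() with opposite byte orders, and popcount32 against bin(x).count("1").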