Commit Graph

137 Commits

Author SHA1 Message Date
Antoine Pitrou
d0acb411ef Issue #14387: Do not include accu.h from Python.h. 2012-03-22 14:42:18 +01:00
Victor Stinner
41a863cb81 Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
   (from the locale encoding), instead of decoding them implicitly from latin1
 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
 * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
   character if unicode is NULL
 * Replace MIN/MAX macros by Py_MIN/Py_MAX
 * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
 * stringlib/localeutil.h only supports Unicode
2012-02-24 00:37:51 +01:00
Benjamin Peterson
21e0da228d remove some usage of Py_UNICODE_TOUPPER/LOWER 2012-01-11 21:00:42 -05:00
Victor Stinner
6099a03202 Issue #13624: Write a specialized UTF-8 encoder to allow more optimization
The main bottleneck was the PyUnicode_READ() macro.
2011-12-18 14:22:26 +01:00
Victor Stinner
f8eac00779 Issue #13623: Fix a performance regression introduced by issue #12170 in
bytes.find() and handle correctly OverflowError (raise the same ValueError than
the error for -1).
2011-12-18 01:17:41 +01:00
Victor Stinner
b37b17423b Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
Create an empty string with the new Unicode API.
2011-12-01 03:18:59 +01:00
Antoine Pitrou
0a3229de6b Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
This almost catches up with pre-PEP 393 performance, when decoding needed
only one pass.
2011-11-21 20:39:13 +01:00
Victor Stinner
0fc35196bb stringlib: remove unused STRINGLIB_FILL 2011-11-20 19:30:15 +01:00
Victor Stinner
7931d9a951 Replace PyUnicodeObject type by PyObject
* _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
 * Remove now useless casts to PyObject*
2011-11-04 00:22:48 +01:00
Victor Stinner
9db1a8b69f Replace PyUnicodeObject* by PyObject* where it was irrevelant
A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
2011-10-23 20:04:37 +02:00
Antoine Pitrou
ac65d96777 Issue #12170: The count(), find(), rfind(), index() and rindex() methods
of bytes and bytearray objects now accept an integer between 0 and 255
as their first argument.  Patch by Petri Lehtinen.
2011-10-20 23:54:17 +02:00
Antoine Pitrou
5b9f4c1539 Fix typo 2011-10-17 19:21:04 +02:00
Antoine Pitrou
c198d0599b Add a comment explaining this heuristic. 2011-10-13 18:07:37 +02:00
Antoine Pitrou
dda339e6d2 Simplify heuristic for when to use memchr 2011-10-13 17:58:11 +02:00
Antoine Pitrou
dd4e2f0153 Issue #13155: Optimize finding the optimal character width of an unicode string 2011-10-13 00:02:27 +02:00
Victor Stinner
d218bf14cc stringlib: Fix STRINGLIB_STR for UCS2/UCS4 2011-10-12 00:14:32 +02:00
Victor Stinner
8cc70dcf70 Fix fastsearch for UCS2 and UCS4
* If needle is 0, try (p[0] >> 16) & 0xff for UCS4
 * Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4
2011-10-11 23:22:22 +02:00
Antoine Pitrou
2c3b2302ad Issue #13134: optimize finding single-character strings using memchr 2011-10-11 20:29:21 +02:00
Martin v. Löwis
c47adb04b3 Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE. 2011-10-07 20:55:35 +02:00
Antoine Pitrou
4574e62c6e Fix massive slowdown in string formatting with str.format.
Example:
./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)"

-> before: 547 usec per loop
-> after: 13 usec per loop
-> 3.2: 22.5 usec per loop
-> 2.7: 12.6 usec per loop
2011-10-07 02:26:47 +02:00
Antoine Pitrou
dbf697ae5c Fix compilation warnings under 64-bit Windows 2011-10-06 15:34:41 +02:00
Victor Stinner
c3cec7868b Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
ucs1, ucs2 and ucs4 libraries have to scan created substring to find the
maximum character, whereas it is not need to ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
e57b1c0da1 Mark PyUnicode_FromUCS[124] as private 2011-09-28 22:20:48 +02:00
Martin v. Löwis
d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Mark Dickinson
c7d93b7614 Issue #1621: Fix undefined behaviour from signed overflow in datetime module hashes, array and list iterations, and get_integer (stringlib/string_format.h) 2011-09-25 15:34:32 +01:00