Fix SF bug #697220, string.strip implementation/doc mismatch
Attempt to make all the various string/unicode *strip methods the same.
* Doc - add doc for when functions were added
* UserString
* string/unicode object methods
* string module functions
'chars' is used for the last parameter everywhere.
Fix a nasty endcase reported by Armin Rigo in SF bug 618623:
'%2147483647d' % -123 segfaults. This was because an integer overflow
in a comparison caused the string resize to be skipped. After fixing
the overflow, this could call _PyString_Resize() with a negative size,
so I (1) test for that and raise MemoryError instead; (2) also added a
test for negative newsize to _PyString_Resize(), raising SystemError
as for all bad arguments.
An identical bug existed in unicodeobject.c, of course.
2002/08/11 12:23:04 lemburg Python/bltinmodule.c 2.262
2002/08/11 12:23:04 lemburg Objects/unicodeobject.c 2.162
2002/08/11 12:23:03 lemburg Misc/NEWS 1.461
2002/08/11 12:23:03 lemburg Lib/test/test_unicode.py 1.65
2002/08/11 12:23:03 lemburg Include/unicodeobject.h 2.39
Add C API PyUnicode_FromOrdinal() which exposes unichr() at C level.
u'%c' will now raise a ValueError in case the argument is an
integer outside the valid range of Unicode code point ordinals.
Closes SF bug #593581.
UTF-8 decoder accept broken UTF-8 sequences which encode lone
high surrogates (the pre-2.2.2 versions forgot to generate the
UTF-8 prefix \xed for these).
Fixes SF bug #610783: Lone surrogates cause bad .pyc files.
Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the
wrong thing for a unicode subclass when there were zero string
replacements. The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another. The code would actually write outside allocated memory if
replacement string was longer than the search string.
Repair widespread misuse of _PyString_Resize. Since it's clear people
don't understand how this function works, also beefed up the docs. The
most common usage error is of this form (often spread out across gotos):
if (_PyString_Resize(&s, n) < 0) {
Py_DECREF(s);
s = NULL;
goto outtahere;
}
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
if (_PyString_Resize(&s, n) < 0)
goto outtahere;
Bugfix candidate.
Original patch(es):
python/dist/src/Objects/fileobject.c:2.161
python/dist/src/Objects/stringobject.c:2.161
python/dist/src/Objects/unicodeobject.c:2.147
Apply patch diff.txt from SF feature request
http://www.python.org/sf/444708
This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
Misc/NEWS 1.387->1.388
Lib/test/string_tests.py 1.10->1.11, 1.12->1.14,
Lib/test/test_unicode.py 1.50->1.51, 1.53->1.54, 1.55->1.56
Lib/test/test_string.py 1.15->1.16
Lib/string.py 1.61->1.63
Lib/test/test_userstring.py 1.5->1.6, 1.11, 1.12
Objects/stringobject.c 2.156->2.159
Objects/unicodeobject.c 2.137->2.139
Doc/lib/libstdtypes.tec 1.87->1.88
Add a method zfill to str, unicode and UserString
and change Lib/string.py accordingly
(see SF patch http://www.python.org/sf/536241)
This also adds Guido's fix to test_userstring.py
and the subinstance checks in test_string.py
and test_unicode.py.
This is best reproduced by
while 1:
class U(unicode):
pass
U(u"xxxxxx")
The unicode_dealloc() code wasn't properly freeing the str and defenc
fields of the Unicode object when freeing a subtype instance. Fixed
this by a subtle refactoring that actually reduces the amount of code
slightly.