Commit Graph

134 Commits

Author SHA1 Message Date
Guido van Rossum
96303ce055 Fix some endcase bugs in unicode rfind()/rindex() and endswith().
These were reported and fixed by Inyeol Lee in SF bug 595350.  The
endswith() bug is already fixed in 2.3; I'll fix the others in
2.3 next.
2002-08-20 16:57:58 +00:00
Anthony Baxter
3864a45cd3 backport tim_one's patch:
Repair widespread misuse of _PyString_Resize.  Since it's clear people
don't understand how this function works, also beefed up the docs.  The
most common usage error is of this form (often spread out across gotos):

	if (_PyString_Resize(&s, n) < 0) {
		Py_DECREF(s);
		s = NULL;
		goto outtahere;
	}

The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL.  So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s):  s is already
NULL!  A correct way to write the above is the simpler (and intended)

	if (_PyString_Resize(&s, n) < 0)
		goto outtahere;

Bugfix candidate.

Original patch(es):
python/dist/src/Objects/fileobject.c:2.161
python/dist/src/Objects/stringobject.c:2.161
python/dist/src/Objects/unicodeobject.c:2.147
2002-04-30 03:41:53 +00:00
Walter Dörwald
aa82bc0817 Backport checkin:
Apply patch diff.txt from SF feature request
http://www.python.org/sf/444708

This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
2002-04-22 18:42:45 +00:00
Walter Dörwald
1c097b7102 Backport the following changes:
Misc/NEWS 1.387->1.388
Lib/test/string_tests.py 1.10->1.11, 1.12->1.14,
Lib/test/test_unicode.py 1.50->1.51, 1.53->1.54, 1.55->1.56
Lib/test/test_string.py 1.15->1.16
Lib/string.py 1.61->1.63
Lib/test/test_userstring.py 1.5->1.6, 1.11, 1.12
Objects/stringobject.c 2.156->2.159
Objects/unicodeobject.c 2.137->2.139
Doc/lib/libstdtypes.tec 1.87->1.88

Add a method zfill to str, unicode and UserString
and change Lib/string.py accordingly
(see SF patch http://www.python.org/sf/536241)

This also adds Guido's fix to test_userstring.py
and the subinstance checks in test_string.py
and test_unicode.py.
2002-04-22 11:57:06 +00:00
Michael W. Hudson
c2a5e3c0f9 Martin's fix for
[ 529104 ] broken error handling in unicode-escape

I presume this will need to be fixed on the trunk, too.

Later.
2002-03-18 12:47:52 +00:00
Michael W. Hudson
36de099f04 Fix
[ 531306 ] ucs4 build horked.

Classic C mistake, I think.

Also squashed a couple of warnings in the ucs4 build.
2002-03-18 12:43:33 +00:00
Tim Peters
f0cd73e215 Move to zlib 1.1.4 on Windows (the new version that squashes the "double
free" glitch).

unicodeobject.c:  squash compiler warnings.

Noting that test_pyclbr currently fails in 2.2.1:

    test_others (__main__.PyclbrTest) ... ??? HTTP11
    FAIL
2002-03-13 23:56:48 +00:00
Marc-André Lemburg
679e4314fd Whitespace normalization and minor cosmetics. 2002-02-25 14:51:00 +00:00
Marc-André Lemburg
d53e226f81 Fix UTF-8 encoder pointer arithmetic and restore 2.2 behaviour. 2002-02-25 14:30:49 +00:00
Michael W. Hudson
6c7078b582 Fix the problem reported in
[ #495401 ] Build troubles: --with-pymalloc

in a slightly different manner to the trunk, as discussed on python-dev.
2002-02-22 13:44:43 +00:00
Guido van Rossum
604ddf80d8 Fix for #489669 (Neil Norwitz): memory leak in test_descr (unicode).
This is best reproduced by

  while 1:
      class U(unicode):
          pass
      U(u"xxxxxx")

The unicode_dealloc() code wasn't properly freeing the str and defenc
fields of the Unicode object when freeing a subtype instance.  Fixed
this by a subtle refactoring that actually reduces the amount of code
slightly.
2001-12-06 20:03:56 +00:00
Barry Warsaw
e5c492d72a formatfloat(), formatint(): Conversion of sprintf() to PyOS_snprintf()
for buffer overrun avoidance.
2001-11-28 21:00:41 +00:00
Marc-André Lemburg
11326de657 Fix for bug #485951: repr diff between string and unicode. 2001-11-28 12:56:20 +00:00
Marc-André Lemburg
72f8213ba4 Fix for bug #438164: %-formatting using Unicode objects.
This patch also does away with an incompatibility between Jython
and CPython.
2001-11-20 15:18:49 +00:00
Marc-André Lemburg
b5507ecd3c Additional test and documentation for the unicode() changes.
This patch should also be applied to the 2.2b1 trunk.
2001-10-19 12:02:29 +00:00
Guido van Rossum
b8c65bc27f SF patch #470578: Fixes to synchronize unicode() and str()
This patch implements what we have discussed on python-dev late in
    September: str(obj) and unicode(obj) should behave similar, while
    the old behaviour is retained for unicode(obj, encoding, errors).

    The patch also adds a new feature with which objects can provide
    unicode(obj) with input data: the __unicode__ method. Currently no
    new tp_unicode slot is implemented; this is left as option for the
    future.

    Note that PyUnicode_FromEncodedObject() no longer accepts Unicode
    objects as input. The API name already suggests that Unicode
    objects do not belong in the list of acceptable objects and the
    functionality was only needed because
    PyUnicode_FromEncodedObject() was being used directly by
    unicode(). The latter was changed in the discussed way:

    * unicode(obj) calls PyObject_Unicode()
    * unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject()

    One thing left open to discussion is whether to leave the
    PyUnicode_FromObject() API as a thin API extension on top of
    PyUnicode_FromEncodedObject() or to turn it into a (macro) alias
    for PyObject_Unicode() and deprecate it. Doing so would have some
    surprising consequences though, e.g.  u"abc" + 123 would turn out
    as u"abc123"...

[Marc-Andre didn't have time to check this in before the deadline.  I
hope this is OK, Marc-Andre!  You can still make changes and commit
them on the trunk after the branch has been made, but then please mail
Barry a context diff if you want the change to be merged into the
2.2b1 release branch.  GvR]
2001-10-19 02:01:31 +00:00
Guido van Rossum
9475a2310d Enable GC for new-style instances. This touches lots of files, since
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self).  It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types.  Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again.  I'm not
sure if doing something like

void xxx_dealloc(PyObject *self)
{
	if (PyXxxCheckExact(self))
		PyObject_DEL(self);
	else
		self->ob_type->tp_free(self);
}

is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
2001-10-05 20:51:39 +00:00
Guido van Rossum
ad9744a67a Fix a bug in rendering of \\ by repr() -- it rendered as \\\ instead
of \\.
2001-09-21 15:38:17 +00:00
Marc-André Lemburg
3508e30861 Fix Unicode .join() method to raise a TypeError for sequence
elements which are not Unicode objects or strings. (This matches
the string.join() behaviour.)

Fix a memory leak in the .join() method which occurs in case
the Unicode resize fails.

Restore the test_unicode output.
2001-09-20 17:22:58 +00:00
Marc-André Lemburg
6871f6ac57 Implement the changes proposed in patch #413333. unicode(obj) now
works just like str(obj) in that it tries __str__/tp_str on the object
in case it finds that the object is not a string or buffer.
2001-09-20 12:53:16 +00:00
Marc-André Lemburg
c60e6f7771 Patch #435971: UTF-7 codec by Brian Quinlan. 2001-09-20 10:35:46 +00:00
Tim Peters
af90b3e610 str_subtype_new, unicode_subtype_new:
+ These were leaving the hash fields at 0, which all string and unicode
  routines believe is a legitimate hash code.  As a result, hash() applied
  to str and unicode subclass instances always returned 0, which in turn
  confused dict operations, etc.
+ Changed local names "new"; no point to antagonizing C++ compilers.
2001-09-12 05:18:58 +00:00
Tim Peters
7a29bd5861 More on bug 460020: disable many optimizations of unicode subclasses. 2001-09-12 03:03:31 +00:00
Tim Peters
78e0fc74bc Possibly the end of SF [#460020] bug or feature: unicode() and subclasses.
Changed unicode(i) to return a true Unicode object when i is an instance of
a unicode subclass.  Added PyUnicode_CheckExact macro.
2001-09-11 03:07:38 +00:00
Tim Peters
0ebeb584a4 PyUnicode_FromEncodedObject(): Repair memory leak in an error case. 2001-09-11 02:00:50 +00:00