130 Commits

Author SHA1 Message Date
Michael W. Hudson
c2a5e3c0f9 Martin's fix for
[ 529104 ] broken error handling in unicode-escape

I presume this will need to be fixed on the trunk, too.

Later.
2002-03-18 12:47:52 +00:00
Michael W. Hudson
36de099f04 Fix
[ 531306 ] ucs4 build horked.

Classic C mistake, I think.

Also squashed a couple of warnings in the ucs4 build.
2002-03-18 12:43:33 +00:00
Tim Peters
f0cd73e215 Move to zlib 1.1.4 on Windows (the new version that squashes the "double
free" glitch).

unicodeobject.c:  squash compiler warnings.

Noting that test_pyclbr currently fails in 2.2.1:

    test_others (__main__.PyclbrTest) ... ??? HTTP11
    FAIL
2002-03-13 23:56:48 +00:00
Marc-André Lemburg
679e4314fd Whitespace normalization and minor cosmetics. 2002-02-25 14:51:00 +00:00
Marc-André Lemburg
d53e226f81 Fix UTF-8 encoder pointer arithmetic and restore 2.2 behaviour. 2002-02-25 14:30:49 +00:00
Michael W. Hudson
6c7078b582 Fix the problem reported in
[ #495401 ] Build troubles: --with-pymalloc

in a slightly different manner to the trunk, as discussed on python-dev.
2002-02-22 13:44:43 +00:00
Guido van Rossum
604ddf80d8 Fix for #489669 (Neil Norwitz): memory leak in test_descr (unicode).
This is best reproduced by

  while 1:
      class U(unicode):
          pass
      U(u"xxxxxx")

The unicode_dealloc() code wasn't properly freeing the str and defenc
fields of the Unicode object when freeing a subtype instance.  Fixed
this by a subtle refactoring that actually reduces the amount of code
slightly.
2001-12-06 20:03:56 +00:00
Barry Warsaw
e5c492d72a formatfloat(), formatint(): Conversion of sprintf() to PyOS_snprintf()
for buffer overrun avoidance.
2001-11-28 21:00:41 +00:00
Marc-André Lemburg
11326de657 Fix for bug #485951: repr diff between string and unicode. 2001-11-28 12:56:20 +00:00
Marc-André Lemburg
72f8213ba4 Fix for bug #438164: %-formatting using Unicode objects.
This patch also does away with an incompatibility between Jython
and CPython.
2001-11-20 15:18:49 +00:00
Marc-André Lemburg
b5507ecd3c Additional test and documentation for the unicode() changes.
This patch should also be applied to the 2.2b1 trunk.
2001-10-19 12:02:29 +00:00
Guido van Rossum
b8c65bc27f SF patch #470578: Fixes to synchronize unicode() and str()
This patch implements what we have discussed on python-dev late in
    September: str(obj) and unicode(obj) should behave similar, while
    the old behaviour is retained for unicode(obj, encoding, errors).

    The patch also adds a new feature with which objects can provide
    unicode(obj) with input data: the __unicode__ method. Currently no
    new tp_unicode slot is implemented; this is left as option for the
    future.

    Note that PyUnicode_FromEncodedObject() no longer accepts Unicode
    objects as input. The API name already suggests that Unicode
    objects do not belong in the list of acceptable objects and the
    functionality was only needed because
    PyUnicode_FromEncodedObject() was being used directly by
    unicode(). The latter was changed in the discussed way:

    * unicode(obj) calls PyObject_Unicode()
    * unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject()

    One thing left open to discussion is whether to leave the
    PyUnicode_FromObject() API as a thin API extension on top of
    PyUnicode_FromEncodedObject() or to turn it into a (macro) alias
    for PyObject_Unicode() and deprecate it. Doing so would have some
    surprising consequences though, e.g.  u"abc" + 123 would turn out
    as u"abc123"...

[Marc-Andre didn't have time to check this in before the deadline.  I
hope this is OK, Marc-Andre!  You can still make changes and commit
them on the trunk after the branch has been made, but then please mail
Barry a context diff if you want the change to be merged into the
2.2b1 release branch.  GvR]
2001-10-19 02:01:31 +00:00
Guido van Rossum
9475a2310d Enable GC for new-style instances. This touches lots of files, since
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self).  It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types.  Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again.  I'm not
sure if doing something like

void xxx_dealloc(PyObject *self)
{
	if (PyXxxCheckExact(self))
		PyObject_DEL(self);
	else
		self->ob_type->tp_free(self);
}

is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
2001-10-05 20:51:39 +00:00
Guido van Rossum
ad9744a67a Fix a bug in rendering of \\ by repr() -- it rendered as \\\ instead
of \\.
2001-09-21 15:38:17 +00:00
Marc-André Lemburg
3508e30861 Fix Unicode .join() method to raise a TypeError for sequence
elements which are not Unicode objects or strings. (This matches
the string.join() behaviour.)

Fix a memory leak in the .join() method which occurs in case
the Unicode resize fails.

Restore the test_unicode output.
2001-09-20 17:22:58 +00:00
Marc-André Lemburg
6871f6ac57 Implement the changes proposed in patch #413333. unicode(obj) now
works just like str(obj) in that it tries __str__/tp_str on the object
in case it finds that the object is not a string or buffer.
2001-09-20 12:53:16 +00:00
Marc-André Lemburg
c60e6f7771 Patch #435971: UTF-7 codec by Brian Quinlan. 2001-09-20 10:35:46 +00:00
Tim Peters
af90b3e610 str_subtype_new, unicode_subtype_new:
+ These were leaving the hash fields at 0, which all string and unicode
  routines believe is a legitimate hash code.  As a result, hash() applied
  to str and unicode subclass instances always returned 0, which in turn
  confused dict operations, etc.
+ Changed local names "new"; no point to antagonizing C++ compilers.
2001-09-12 05:18:58 +00:00
Tim Peters
7a29bd5861 More on bug 460020: disable many optimizations of unicode subclasses. 2001-09-12 03:03:31 +00:00
Tim Peters
78e0fc74bc Possibly the end of SF [#460020] bug or feature: unicode() and subclasses.
Changed unicode(i) to return a true Unicode object when i is an instance of
a unicode subclass.  Added PyUnicode_CheckExact macro.
2001-09-11 03:07:38 +00:00
Tim Peters
0ebeb584a4 PyUnicode_FromEncodedObject(): Repair memory leak in an error case. 2001-09-11 02:00:50 +00:00
Guido van Rossum
e023fe0eef Make unicode subclassable. 2001-08-30 03:12:59 +00:00
Martin v. Löwis
e3eb1f2b23 Patch #427190: Implement and use METH_NOARGS and METH_O. 2001-08-16 13:15:00 +00:00
Tim Peters
772747b3f1 SF patch #438013 Remove 2-byte Py_UCS2 assumptions
Removed all instances of Py_UCS2 from the codebase, and so also (I hope)
the last remaining reliance on the platform having an integral type
with exactly 16 bits.
PyUnicode_DecodeUTF16() and PyUnicode_EncodeUTF16() now read and write
one byte at a time.
2001-08-09 22:21:55 +00:00
Tim Peters
6d6c1a35e0 Merge of descr-branch back into trunk. 2001-08-02 04:15:00 +00:00