243 Commits

Author SHA1 Message Date
Martin v. Löwis
9b8de84a89 Backported r55839 and r61350
Issue #4469: Prevent expandtabs() on string and unicode
objects from causing a segfault when a large width is passed
on 32-bit platforms.
2008-12-13 13:20:46 +00:00
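The kind of check this fix implies can be sketched in Python. This is only an illustration, not the C code from the commit: `INT_MAX` stands in for an assumed 32-bit signed size limit, and `expanded_length` is a hypothetical helper that computes the result length before anything would be allocated.

```python
INT_MAX = 2**31 - 1  # assumed 32-bit signed limit, for illustration only


def expanded_length(text, tabsize):
    """Return len(text.expandtabs(tabsize)), or raise OverflowError
    if that length would exceed INT_MAX (hypothetical helper)."""
    total = 0  # characters in lines that are already complete
    col = 0    # column within the current line
    for ch in text:
        if ch == "\t":
            if tabsize > 0:
                # advance to the next tab stop
                col += tabsize - (col % tabsize)
        elif ch in "\n\r":
            # a line break counts as one character and resets the column
            col += 1
            total += col
            col = 0
        else:
            col += 1
        # Python ints do not wrap, so the limit is checked explicitly
        if total + col > INT_MAX:
            raise OverflowError("expanded string would be too long")
    return total + col


print(expanded_length("a\tb", 8))
```

In C the same test has to be written so that the addition itself cannot wrap, for example by comparing against the limit minus the increment.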
Neal Norwitz
b93d7d52b5 Security patches from Apple: prevent int overflow when allocating memory 2008-07-31 17:04:32 +00:00
Andrew M. Kuchling
8a28c16430 [Backport r50743 | neal.norwitz]
Handle allocation failures gracefully.  Found with failmalloc.
Many (all?) of these could be backported.
2006-10-05 18:08:58 +00:00
Armin Rigo
b2d0f34240 A review of overflow-detecting code in the 2.4 branch.
* unified the way intobject, longobject and mystrtoul handle
  values around -sys.maxint-1.

* in general, trying to entirely avoid overflows in any computation
  involving signed ints or longs is extremely involved.  Fixed a few
  simple cases where a compiler might be too clever (but that's all
  guesswork).

* more overflow checks against bad data in marshal.c.
2006-10-04 10:13:32 +00:00
Andrew M. Kuchling
ab68637b52 [Backport rev. 39743 by lemburg]
Bug fix for [ 1331062 ] utf 7 codec broken.

Backport candidate.
2006-09-29 17:57:58 +00:00
Georg Brandl
79ba8e53aa Backport rev 51448:
- Patch #1541585: fix buffer overrun when performing repr() on
  a unicode string in a build with wide unicode (UCS-4) support.
2006-08-22 08:25:33 +00:00
Martin v. Löwis
cbbe647bb7 Don't crash on Py_UNICODE values < 0. Fixes #1454485. 2006-06-05 10:43:57 +00:00
Neal Norwitz
9109341c43 Backport: Patch #1488312, Fix memory alignment problem on SPARC in unicode. 2006-05-15 07:22:23 +00:00
Anthony Baxter
cb9051a608 after discussions with perky, reverted fix for Bug #1379994: Builtin
unicode_escape and raw_unicode_escape codec now encodes backslash correctly.

This caused another issue for unicode repr strings being double-escaped
(SF Bug #1459029). Correct fix will be in 2.5, but is too risky for 2.4.3.

Added a testcase for #1459029.
2006-03-28 07:32:36 +00:00
Neal Norwitz
55dd2b41e0 Fix the refleak from test_unicode.
Backport 42973 (lots of whitespace changes intermixed):

 - Reindent a confusingly indented piece of code (no intended code changes
    there)
 - Add missing DECREFs of inner-scope 'temp' variable
 - Add various missing DECREFs by changing 'return NULL' into 'goto onError'
 - Avoid double DECREF when last _PyUnicode_Resize() fails

Coverity found one of the missing DECREFs, but oddly enough not the others.
2006-03-28 06:05:21 +00:00
Hye-Shik Chang
361cd4bd6c Backport r42894: SF #1444030 Fix several potential defects found
by Coverity.
2006-03-07 15:59:09 +00:00
Neal Norwitz
7fb3aa7b3c Backport:
- Patch #1400181, fix unicode string formatting to not use the locale.
  This is how string objects work.  u'%f' could use , instead of .
  for the decimal point.  Now both strings and unicode always use periods.

This is the code that would break:

    import locale
    locale.setlocale(locale.LC_NUMERIC, 'de_DE')
    u'%.1f' % 1.0
    assert '1.0' == u'%.1f' % 1.0

I couldn't create a test case which fails, but this fixes the problem.
(tested in interpreter and reported fixed by others)
2006-01-10 06:05:57 +00:00
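A guarded variant of the snippet above, written for a current Python 3 interpreter rather than 2.4. It assumes the locale name `de_DE.UTF-8`, which may not be installed; in that case the locale change is skipped and the formatting check still runs.

```python
import locale


def format_fixed(value):
    """Format with the % operator, which should always use a period."""
    return "%.1f" % value


try:
    # Assumed locale name; not every system has it installed.
    locale.setlocale(locale.LC_NUMERIC, "de_DE.UTF-8")
except locale.Error:
    pass  # fall through and test with whatever locale is active

try:
    result = format_fixed(1.0)
finally:
    # Restore the portable "C" numeric locale.
    locale.setlocale(locale.LC_NUMERIC, "C")

assert result == "1.0"
```

Locale-aware output remains available through `locale.format_string()` when a comma as decimal separator is actually wanted.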
Hye-Shik Chang
cb92b45e41 Bug #1379994: Fix *unicode_escape codecs to encode r'\' as r'\\'
just like string codecs.
2005-12-17 04:38:31 +00:00
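The intended behaviour can be checked on a current Python 3 interpreter, where `unicode_escape` encodes to bytes. This sketch shows present-day behaviour only; it does not reproduce the 2.4 branch, where this change was later reverted.

```python
# A single backslash, written with an ordinary escaped literal.
single = "\\"

# unicode_escape doubles the backslash when encoding.
encoded = single.encode("unicode_escape")

# Decoding the doubled form gives the single backslash back.
decoded = encoded.decode("unicode_escape")

assert len(single) == 1
assert encoded == b"\\\\"
assert decoded == single
```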
Walter Dörwald
cff983722a Backport checkin:
Fix leaked reference to None.
2005-11-28 22:16:22 +00:00
Walter Dörwald
fd8e0170e2 Backport checkin:
SF bug #1251300: On UCS-4 builds the "unicode-internal" codec will now complain
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of the patch by Nik Haldimann that detects truncated data)
2005-08-30 10:46:06 +00:00
Marc-André Lemburg
a9cadcd41b Correct the handling of 0-termination of PyUnicode_AsWideChar()
and its usage in PyLocale_strcoll().

Clarify the documentation on this.

Thanks to Andreas Degert for pointing this out.
2004-11-22 13:02:31 +00:00
Marc-André Lemburg
204bd6d9d2 Applied patch for [ 1047269 ] Buffer overwrite in PyUnicode_AsWideChar.
Python 2.3.x candidate.
2004-10-15 07:45:05 +00:00
Skip Montanaro
6543b45b0c Initialize sep and seplen to suppress warning from gcc. 2004-09-16 03:28:13 +00:00
Thomas Heller
ca0d2cb66e Add a missing line continuation character. 2004-09-15 11:41:32 +00:00
Walter Dörwald
065a32f550 Make the hint about the None default less ambiguous. 2004-09-14 09:45:10 +00:00
Walter Dörwald
782afc5927 Enhance the docstrings for unicode.split() and string.split()
to make it clear that it is possible to pass None as the
separator argument to get the default "any whitespace" separator.
2004-09-14 09:40:45 +00:00
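A short illustration of what the docstring describes, using Python 3 `str` (the same holds for the 2.x `unicode.split()`): passing `None` as the separator selects the default "any whitespace" behaviour, which is useful when a maxsplit argument also has to be given.

```python
text = "a  b\tc"

# None means "split on runs of any whitespace".
default_split = text.split(None)

# None lets maxsplit be passed positionally without naming a separator.
limited_split = text.split(None, 1)

# An explicit single space behaves differently: empty fields are kept
# and the tab is not treated as a separator.
space_split = text.split(" ")

assert default_split == ["a", "b", "c"]
assert limited_split == ["a", "b\tc"]
assert space_split == ["a", "", "b\tc"]
```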
Walter Dörwald
69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
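A minimal sketch of the reader-side features named above, run against Python 3's `codecs` module. The sample text and the use of `io.BytesIO` as the underlying stream are assumptions made for the example.

```python
import codecs
import io

# Two lines of sample text, encoded as UTF-16 with a BOM.
data = "héllo\nwörld\n".encode("utf-16")

# Buffered readline on a UTF-16 stream; keepends controls the line end.
reader = codecs.getreader("utf-16")(io.BytesIO(data))
first = reader.readline(keepends=False)
second = reader.readline()

# read(chars=N) returns N decoded characters, not N bytes.
reader2 = codecs.getreader("utf-16")(io.BytesIO(data))
prefix = reader2.read(chars=2)

assert first == "héllo"
assert second == "wörld\n"
assert prefix == "hé"
```

The stateful C decoders make this possible by reporting how many bytes were consumed, so a trailing incomplete sequence can be kept for the next call instead of raising an error.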
Tim Peters
91879ab8ea PyUnicode_Join(): Bozo Alert. While this is chugging along, it may
need to convert str objects from the iterable to unicode.  So, if
someone set the system default encoding to something nasty enough,
the conversion process could mutate the input iterable as a side
effect, and PySequence_Fast doesn't hide that from us if the input was
a list.  IOW, can't assume the size of PySequence_Fast's result is
invariant across PyUnicode_FromObject() calls.
2004-08-27 22:35:44 +00:00
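The hazard described here can be modelled in Python. `careful_join` and its `convert` callback are hypothetical names for this sketch; `convert` plays the role of a conversion step that may run arbitrary code and mutate the input list.

```python
def careful_join(sep, items, convert):
    """Join converted items, re-reading the length on every step."""
    # Like PySequence_Fast: lists and tuples are used as-is,
    # anything else is materialized into a temporary tuple.
    seq = items if isinstance(items, (list, tuple)) else tuple(items)
    parts = []
    i = 0
    while i < len(seq):  # do not cache len(seq) across convert() calls
        parts.append(convert(seq[i]))
        i += 1
    return sep.join(parts)


items = ["a", "b", "c"]


def shrinking_convert(value):
    # Simulates a nasty conversion that mutates the input list.
    if value == "a":
        items.pop()
    return value.upper()


joined = careful_join(",", items, shrinking_convert)
assert joined == "A,B"
```

Had the length been read once before the loop, the third iteration would have indexed past the end of the shortened list; in C that is an out-of-bounds read rather than an `IndexError`.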
Tim Peters
05eba1fdc8 PyUnicode_Join(): Rewrote to use PySequence_Fast(). This doesn't do
much to reduce the size of the code, but greatly improves its clarity.
It's also quicker in what's probably the most common case (the argument
iterable is a list).  Against it, if the iterable isn't a list or a tuple,
a temp tuple is materialized containing the entire input sequence, and
that's a bigger temp memory burden.  Yawn.
2004-08-27 21:32:02 +00:00
Tim Peters
894c512c2f PyUnicode_Join(): Missed a spot where I intended a cast from size_t to
int.  I sure wish MS would gripe about that!  Whatever, note that the
statement above it guarantees that the cast loses no info.
2004-08-27 05:08:36 +00:00