67 Commits

Author SHA1 Message Date
Moshe Zadka
95e2265963 - #122162 -- unicodeobject.c --- Fix unicode .split() off-by-one
- Loosely based on patch #103249 -- Fix core dumps in PyUnicode_Count
2001-03-31 06:48:52 +00:00
Barry Warsaw
5b4c22806f _PyUnicode_Fini(): Initialize the local freelist walking variable `u'
after unicode_empty has been freed, otherwise it might not point to
the real start of the unicode_freelist.  Final closure for SF bug
#110681, Jitterbug PR#398.
2000-10-03 20:45:26 +00:00
Guido van Rossum
4ae8ef84da In _PyUnicode_Fini(), decref unicode_empty before tearing down the free
list.  Discovered by Barry, fix approved by MAL.
2000-10-03 18:09:04 +00:00
Fred Drake
d5fadf75e4 Rationalize use of limits.h, moving the inclusion to Python.h.
Add definitions of INT_MAX and LONG_MAX to pyport.h.
Remove includes of limits.h and conditional definitions of INT_MAX
and LONG_MAX elsewhere.

This closes SourceForge patch #101659 and bug #115323.
2000-09-26 05:46:01 +00:00
Tim Peters
38fd5b6413 Derived from Martin's SF patch 110609: support unbounded ints in %d,i,u,x,X,o formats.
Note a curious extension to the std C rules:  x, X and o formatting can never produce
a sign character in C, so the '+' and ' ' flags are meaningless for them.  But
unbounded ints *can* produce a sign character under these conversions (no fixed-
width bitstring is wide enough to hold all negative values in 2's-comp form).  So
these flags become meaningful in Python when formatting a Python long which is too
big to fit in a C long.  This required shuffling around existing code, which hacked
x and X conversions to death when both the '#' and '0' flags were specified:  the
hacks weren't strong enough to deal with the simultaneous possibility of the ' ' or
'+' flags too, since signs were always meaningless before for x and X conversions.
Isomorphic shuffling was required in unicodeobject.c.
Also added dozens of non-trivial new unbounded-int test cases to test_format.py.
2000-09-21 05:43:11 +00:00
Tim Peters
8f422461b4 Fix for bug 113934. string*n and unicode*n did no overflow checking at
all, either to see whether the # of chars fit in an int, or that the
amount of memory needed fit in a size_t.  Checking these is expensive, but
the alternative is silently wrong answers (as in the bug report) or
core dumps (which were easy to provoke using Unicode strings).
2000-09-09 06:13:41 +00:00
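The two checks named in 8f422461b4 (does the character count fit in an int, and does the byte count fit in a size_t) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the commit: the helper name `repeat_size`, its signature, and its return convention are all hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the commit): compute how many bytes are
   needed to hold `len` items of `item_size` bytes repeated `n` times.
   Returns 0 and stores the result in *nbytes, or -1 on overflow. */
int repeat_size(int len, int n, size_t item_size, size_t *nbytes)
{
    assert(nbytes != NULL);
    if (len < 0 || item_size == 0)
        return -1;
    if (n < 0)
        n = 0;                      /* a negative count yields an empty result */

    /* Check 1: does len * n fit in an int?  Dividing instead of
       multiplying avoids overflowing the product being tested. */
    if (n != 0 && len > INT_MAX / n)
        return -1;
    size_t count = (size_t)len * (size_t)n;

    /* Check 2: does count * item_size fit in a size_t? */
    if (count > SIZE_MAX / item_size)
        return -1;

    *nbytes = count * item_size;
    return 0;
}
```

A caller would test the return value before allocating, for example refusing the operation when `repeat_size(len, n, sizeof(unit), &nbytes)` returns -1.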
Fredrik Lundh
df84675f93 changed \x to consume exactly two hex digits, also for unicode
strings.  closes PEP-223.

also added \U escape (eight hex digits).
2000-09-03 11:29:49 +00:00
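The "exactly two hex digits" rule from df84675f93 can be illustrated with a small decoder. This is a sketch only, assuming a hypothetical helper `parse_x_escape`; it is not the parser code from the commit and it does not cover the \U escape.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not from the commit): decode a "\xHH" escape at
   the start of s.  Exactly two hex digits are consumed, so any further
   hex digits are left for the caller as ordinary characters.
   Returns the byte value (0..255) and sets *consumed to 4, or returns -1
   if the escape is malformed (in which case *consumed is untouched). */
int parse_x_escape(const char *s, size_t len, size_t *consumed)
{
    assert(s != NULL && consumed != NULL);
    if (len < 4 || s[0] != '\\' || s[1] != 'x')
        return -1;
    int hi = hex_digit(s[2]);
    int lo = hex_digit(s[3]);
    if (hi < 0 || lo < 0)
        return -1;
    *consumed = 4;
    return hi * 16 + lo;
}
```

With this rule, the six characters `\x4142` decode as the byte 0x41 followed by the literal characters `4` and `2`, rather than as one longer hex number.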
Barry Warsaw
ce4dc41b1a PyUnicode_AsUTF8String(): /F picks up what I missed: the local var
`str' is no longer necessary.  Gotta turn on -Wall!
2000-08-18 19:30:40 +00:00
Barry Warsaw
2dd4abf277 PyUnicode_AsUTF8String(): Don't need to explicitly incref str since
PyUnicode_EncodeUTF8() already returns the created object with the
proper reference count.  This fixes an Insure reported memory leak.
2000-08-18 06:58:15 +00:00
Marc-André Lemburg
b7520774e2 Fixed a couple of instances where a 0-length string was being
resized after creation. 0-length strings are usually shared
and _PyString_Resize() fails on these shared strings.

Fixes [ Bug #111667 ] unicode core dump.
2000-08-14 11:29:19 +00:00
Trent Mick
20abf573ef Clean up warning from Monterey compiler.
Properly end a comment block. It was terminated fine later, but only by the end
of a subsequent comment block. It was also in #if 0. This patch is so trivial I can't
believe I am talking about it. :)
2000-08-12 22:14:34 +00:00
Marc-André Lemburg
e5034378cc Removing UTF-16 aware Unicode comparison code. This kind of compare
function (together with other locale aware ones) should go into a new collation
support module. See python-dev for a discussion of this removal.

Note: This patch should also be applied to the 1.6 branch.
2000-08-08 08:04:29 +00:00
Marc-André Lemburg
bff879cabb This patch finalizes the move from UTF-8 to a default encoding in
the Python Unicode implementation.

The internal buffer used for implementing the buffer protocol
is renamed to defenc to make this change visible. It now holds the
default encoded version of the Unicode object and is calculated
on demand (NULL otherwise).

Since the default encoding defaults to ASCII, this will mean that
Unicode objects which hold non-ASCII characters will no longer
work on C APIs using the "s" or "t" parser markers. C APIs must now
explicitly provide Unicode support via the "u", "U" or "es"/"es#"
parser markers in order to work with non-ASCII Unicode strings.

(Note: this patch will also have to be applied to the 1.6 branch
 of the CVS tree.)
2000-08-03 18:46:08 +00:00
Guido van Rossum
16b1ad9c7d Changing the CNRI copyright notice according to CNRI's instructions.
This is a notice without a date, which apparently is not a claim to
copyright but only advice to the reader.  IANAL. :-)
2000-08-03 16:24:25 +00:00
Peter Schneider-Kamp
7e01890986 merge Include/my*.h into Include/pyport.h
marked my*.h as obsolete
2000-07-31 15:28:04 +00:00
Thomas Wouters
7889010731 Miscellaneous ANSIfications. I'm assuming here 'main' should take (int,
char**) and return an int even on PC platforms. If not, please fix
PC/utils/makesrc.c ;-P
2000-07-22 19:25:51 +00:00
Marc-André Lemburg
9542f48fd5 Fixed problems with UTF error reporting macros and some formatting bugs.
2000-07-17 18:23:13 +00:00
Greg Stein
af36a3aa20 gcc is being stupid with if/else constructs
clean out some other warnings
2000-07-17 09:04:43 +00:00
Greg Stein
ff975003cf stop messing around with goto and just write the macro correctly.
2000-07-16 21:39:49 +00:00
Fredrik Lundh
0e19e76aba - change \x to mean "byte" also in unicode literals
(patch #100912)
2000-07-16 18:47:43 +00:00
Tim Peters
855ffac224 Fix fatal compiler (MSVC6) error:
unicodeobject.c(735) :
    error C2143: syntax error : missing ';' before '}'
2000-07-16 17:10:50 +00:00
Marc-André Lemburg
fb625847bf Fix to a bug found by Florian Weimer:
The UTF-8 decoder is still buggy (i.e. it doesn't pass Markus Kuhn's
stress test), mainly due to the following construct:

    #define UTF8_ERROR(details)  do {                       \
        if (utf8_decoding_error(&s, &p, errors, details))   \
            goto onError;                                   \
        continue;                                           \
    } while (0)

(The "continue" statement is supposed to move on to the next iteration
of the outer loop, but of course, it doesn't: it only leaves the macro's
own do/while.  Indeed, this is a marvelous example of the dangers of the
C programming language and especially of the C preprocessor.)
2000-07-16 13:29:13 +00:00
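The pitfall reported in fb625847bf can be reproduced in a few lines. The sketch below uses hypothetical macros and functions (`SKIP_BROKEN`, `SKIP_FIXED`, `count_ascii_broken`, `count_ascii_fixed`) that only mirror the shape of UTF8_ERROR; it is not the decoder code, and the "fixed" variant is just one possible way to write such a macro, not necessarily the one adopted later in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the buggy macro: `continue` binds to the macro's own
   do/while (0), so it merely ends that wrapper loop. */
#define SKIP_BROKEN(counter) do { \
        (counter)++;              \
        continue;                 \
    } while (0)

/* One possible fix: no loop wrapper, so `continue` reaches the
   caller's loop.  Must be used inside braces. */
#define SKIP_FIXED(counter) { (counter)++; continue; }

/* Intended behaviour: count only bytes below 0x80 and skip the rest.
   With SKIP_BROKEN the skipped bytes are still counted. */
size_t count_ascii_broken(const unsigned char *buf, size_t len, size_t *skipped)
{
    assert(buf != NULL && skipped != NULL);
    size_t processed = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x80) {
            SKIP_BROKEN(*skipped);  /* falls through to processed++ */
        }
        processed++;
    }
    return processed;
}

size_t count_ascii_fixed(const unsigned char *buf, size_t len, size_t *skipped)
{
    assert(buf != NULL && skipped != NULL);
    size_t processed = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x80) {
            SKIP_FIXED(*skipped);   /* jumps to the next iteration of the for loop */
        }
        processed++;
    }
    return processed;
}
```

For a three-byte buffer containing one byte at or above 0x80, the broken variant reports three processed bytes while the fixed one reports two; both record one skipped byte.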
Thomas Wouters
7e47402264 Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in either
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").

There is a minor style issue involved: Guido seems to have preferred English
spelling (behaviour, honour) in a couple of places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
2000-07-16 12:04:32 +00:00
Jeremy Hylton
03657cfdb0 replace PyXXX_Length calls with PyXXX_Size calls
2000-07-12 13:05:33 +00:00
Marc-André Lemburg
566d8a64eb Jeremy Hylton:
better error message for unicode coercion failure
2000-07-11 09:47:04 +00:00