120 Commits

Author SHA1 Message Date
Guido van Rossum
9ff1cf05fb Backport to 2.2.1:
This is Neil's fix for SF bug 535905 (Evil Trashcan and GC interaction).

The fix makes it possible to call PyObject_GC_UnTrack() more than once
on the same object, and then move the PyObject_GC_UnTrack() call to
*before* the trashcan code is invoked.

BUGFIX CANDIDATE!
2002-03-28 20:36:50 +00:00
Tim Peters
f582b82fe9 SF bug #491415 PyDict_UpdateFromSeq2() unused
PyDict_UpdateFromSeq2():  removed it.
PyDict_MergeFromSeq2():  made it public and documented it.
PyDict_Merge() docs:  updated to reveal <wink> that the second
argument can be any mapping object.
2001-12-11 18:51:08 +00:00
Guido van Rossum
dbb53d9918 Fix of SF bug #475877 (Mutable subtype instances are hashable).
Rather than tweaking the inheritance of type object slots (which turns
out to be too messy to try), this fix adds a __hash__ to the list and
dict types (the only mutable types I'm aware of) that explicitly
raises an error.  This has the advantage that list.__hash__([]) also
raises an error (previously, this would invoke object.__hash__([]),
returning the argument's address); ditto for dict.__hash__.

The disadvantage for this fix is that 3rd party mutable types aren't
automatically fixed.  This should be added to the rules for creating
subclassable extension types: if you don't want your object to be
hashable, add a tp_hash function that raises an exception.

Also, it's possible that I've forgotten about other mutable types for
which this should be done.
2001-12-03 16:32:18 +00:00
Tim Peters
a427a2b8d0 Rename "dictionary" (type and constructor) to "dict". 2001-10-29 22:25:45 +00:00
Tim Peters
4d85953fe6 dictionary() constructor:
+ Change keyword arg name from "x" to "items".  People passing a mapping
  object can stretch their imaginations <wink>.
+ Simplify the docstring text.
2001-10-27 18:27:48 +00:00
Tim Peters
1fc240e851 Generalize dictionary() to accept a sequence of 2-sequences. At the
outer level, the iterator protocol is used for memory-efficiency (the
outer sequence may be very large if fully materialized); at the inner
level, PySequence_Fast() is used for time-efficiency (these should
always be sequences of length 2).

dictobject.c, new functions PyDict_{Merge,Update}FromSeq2.  These are
wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2-
sequences argument instead of a mapping object.  For now, I left these
functions file static, so no corresponding doc changes.  It's tempting
to change dict.update() to allow a sequence-of-2-seqs argument too.

Also changed the name of dictionary's keyword argument from "mapping"
to "x".  Got a better name?  "mapping_or_sequence_of_pairs" isn't
attractive, although more so than "mosop" <wink>.

abstract.h, abstract.tex:  Added new PySequence_Fast_GET_SIZE function,
much faster than going thru the all-purpose PySequence_Size.

libfuncs.tex:
- Document dictionary().
- Fiddle tuple() and list() to admit that their argument is optional.
- The long-winded repetitions of "a sequence, a container that supports
  iteration, or an iterator object" is getting to be a PITA.  Many
  months ago I suggested factoring this out into "iterable object",
  where the definition of that could include being explicit about
  generators too (as is, I'm not sure a reader outside of PythonLabs
  could guess that "an iterator object" includes a generator call).
- Please check my curly braces -- I'm going blind <0.9 wink>.

abstract.c, PySequence_Tuple():  When PyObject_GetIter() fails, leave
its error msg alone now (the msg it produces has improved since
PySequence_Tuple was generalized to accept iterable objects, and
PySequence_Tuple was also stomping on the msg in cases it shouldn't
have even before PyObject_GetIter grew a better msg).
2001-10-26 05:06:50 +00:00
Guido van Rossum
9475a2310d Enable GC for new-style instances. This touches lots of files, since
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self).  It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types.  Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again.  I'm not
sure if doing something like

void xxx_dealloc(PyObject *self)
{
	if (PyXxxCheckExact(self))
		PyObject_DEL(self);
	else
		self->ob_type->tp_free(self);
}

is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
2001-10-05 20:51:39 +00:00
Tim Peters
0ab085c4cb Changed the dict implementation to take "string shortcuts" only when
keys are true strings -- no subclasses need apply.  This may be debatable.

The problem is that a str subclass may very well want to override __eq__
and/or __hash__ (see the new example of case-insensitive strings in
test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts
(and subclass overrides aren't even looked for then).  Another go-fast
reason for the change is that PyCheck_StringExact() is a quicker test
than PyCheck_String(), and we make such a test on virtually every access
to every dict.

OTOH, a str subclass may also be perfectly happy using the base str eq
and hash, and this change slows them a lot.  But those cases are still
hypothetical, while Python's own reliance on true-string dicts is not.
2001-09-14 00:25:33 +00:00
Tim Peters
b95ec09a44 Repair typo in comment. 2001-09-02 18:35:54 +00:00
Tim Peters
25786c0851 Make dictionary() a real constructor. Accepts at most one argument, "a
mapping object", in the same sense dict.update(x) requires of x (that x
has a keys() method and a getitem).
Questionable:  The other type constructors accept a keyword argument, so I
did that here too (e.g., dictionary(mapping={1:2}) works).  But type_call
doesn't pass the keyword args to the tp_new slot (it passes NULL), it only
passes them to the tp_init slot, so getting at them required adding a
tp_init slot to dicts.  Looks like that makes the normal case (i.e., no
args at all) a little slower (the time it takes to call dict.tp_init and
have it figure out there's nothing to do).
2001-09-02 08:22:48 +00:00
Neil Schemenauer
e83c00efd0 Use new GC API. 2001-08-29 23:54:21 +00:00
Martin v. Löwis
e3eb1f2b23 Patch #427190: Implement and use METH_NOARGS and METH_O. 2001-08-16 13:15:00 +00:00
Guido van Rossum
05ac6de2d5 Add PyDict_Merge(a, b, override):
PyDict_Merge(a, b, 1) is the same as PyDict_Update(a, b).
PyDict_Merge(a, b, 0) does something similar but leaves existing items
unchanged.
2001-08-10 20:28:28 +00:00
Tim Peters
6d6c1a35e0 Merge of descr-branch back into trunk. 2001-08-02 04:15:00 +00:00
Barry Warsaw
66a0d1d9b9 dict_update(): Generalize this method so {}.update() accepts any
"mapping" object, specifically one that supports PyMapping_Keys() and
PyObject_GetItem().  This allows you to say e.g. {}.update(UserDict())

We keep the special case for concrete dict objects, although that
seems moderately questionable.  OTOH, the code exists and works, so
why change that?

.update()'s docstring already claims that D.update(E) implies calling
E.keys() so it's appropriate not to transform AttributeErrors in
PyMapping_Keys() to TypeErrors.

Patch eyeballed by Tim.
2001-06-26 20:08:32 +00:00
Tim Peters
c605784174 dict_repr: Reuse one of the int vars (minor code simplification). 2001-06-16 07:52:53 +00:00
Tim Peters
a7259597f1 SF bug 433228: repr(list) woes when len(list) big.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.

I don't consider this a bugfix candidate, as it's a performance boost.

Added _PyString_Join() to the internal string API.  If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
2001-06-16 05:11:17 +00:00
Tim Peters
afb6ae8452 Store the mask instead of the size in dictobjects. The mask is more
frequently used, and in particular this allows to drop the last
remaining obvious time-waster in the crucial lookdict() and
lookdict_string() functions.  Other changes consist mostly of changing
"i < ma_size" to "i <= ma_mask" everywhere.
2001-06-04 21:00:21 +00:00
Tim Peters
453163d842 lookdict: stop more insane core-dump mutating comparison cases. Should
be possible to provoke unbounded recursion now, but leaving that to someone
else to provoke and repair.
Bugfix candidate -- although this is getting harder to backstitch, and the
cases it's protecting against are mondo contrived.
2001-06-03 04:54:32 +00:00
Tim Peters
7b5d0afb1e lookdict: Reduce obfuscating code duplication with a judicious goto.
This code is likely to get even hairier to squash core dumps due to
mutating comparisons, and it's hard enough to follow without that.
2001-06-03 04:14:43 +00:00
Tim Peters
19b77cfc4b Finish the dict->string coredump fix. Need sleep.
Bugfix candidate.
2001-06-02 08:27:39 +00:00
Tim Peters
23cf6be23c Coredumpers from Michael Hudson, mutating dicts while printing or
converting to string.
Critical bugfix candidate -- if you take this seriously <wink>.
2001-06-02 08:02:56 +00:00
Tim Peters
f4b33f61fb dict_popitem(): Repaired last-second 2.1 comment, which misidentified the
true reason for allocating the tuple before checking the dict size.
2001-06-02 05:42:29 +00:00
Tim Peters
eb28ef209e New collision resolution scheme: no polynomials, simpler, faster, less
code, less memory.  Tests have uncovered no drawbacks.  Christian and
Vladimir are the other two people who have burned many brain cells on the
dict code in recent years, and they like the approach too, so I'm checking
it in without further ado.
2001-06-02 05:27:19 +00:00
Tim Peters
15d4929ae4 Implement an old idea of Christian Tismer's: use polynomial division
instead of multiplication to generate the probe sequence.  The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.

The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
 [i << 16 for i in range(20000)] by a factor of 500.  Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.

Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
2001-05-27 07:39:22 +00:00