67 Commits

Author SHA1 Message Date
Ronald Oussoren
9545a23c7f In a number of places code still refers
to "sys.platform == 'mac'" and that is
dead code because it refers to a platform
that is no longer supported (and hasn't been
supported for several releases).

Fixes issue #7908 for the trunk.
2010-05-05 19:09:31 +00:00
Georg Brandl
a6168f9e0a Queue renaming reversal part 3: move module into place and
change imports and other references. Closes #2925.
2008-05-25 07:20:14 +00:00
Alexandre Vassalotti
30ece44f2e Added stub for the Queue module to be renamed in 3.0.
Use the 3.0 module name to avoid spurious warnings.
2008-05-11 19:39:48 +00:00
Christian Heimes
c5f05e45cf Patch #2167 from calvin: Remove unused imports 2008-02-23 17:40:11 +00:00
Andrew M. Kuchling
ab26004923 Use sys.exc_info() 2006-07-26 18:15:45 +00:00
Martin Blais
215f13dd11 Normalized a few cases of whitespace in function declarations.
Found them using::

  find . -name '*.py' | while read i ; do grep 'def[^(]*( ' $i /dev/null ; done
  find . -name '*.py' | while read i ; do grep ' ):' $i /dev/null ; done

(I was doing this all over my own code anyway, because I'd been using spaces in
all defs, so I thought I'd make a run on the Python code as well.  If you need
to do such fixes in your own code, you can use xx-rename or parenregu.el within
emacs.)
2006-06-06 12:46:55 +00:00
Tim Peters
182b5aca27 Whitespace normalization, via reindent.py. 2004-07-18 06:16:08 +00:00
Andrew M. Kuchling
a982c44543 [Patch #918212] Support XHTML's 'id' attribute, which can be on any element. 2004-03-21 19:07:23 +00:00
Neal Norwitz
592c4cc460 SF bug 753592, websucker bug
Pass the proper variable when the user supplies a directory.
Will backport.
2003-07-01 04:14:28 +00:00
Mark Hammond
ce56c377a0 When bad HTML is encountered, ignore the page rather than failing with
a traceback.
2003-02-27 06:59:10 +00:00
Fred Drake
0b9e3f750c Handle the Content-Type header a little more appropriately: if it
contains options, drop them to get the major/minor content type.
Modified from the supplied patch to support more whitespace variation.
Closes SF patch #613605.
2002-11-12 22:19:34 +00:00
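
A minimal sketch of the idea described in the commit above, not the code that was actually committed: drop any options after the first ';' in a Content-Type value and tolerate whitespace variation. The function name base_content_type is hypothetical and used only for illustration.

```python
def base_content_type(header_value):
    """Return the bare major/minor content type from a Content-Type value.

    Hypothetical helper for illustration; not taken from webchecker.
    """
    # Everything after the first ';' is an option such as charset=...
    main_part = header_value.split(";", 1)[0]
    # Tolerate stray whitespace and mixed case around the type itself.
    return main_part.strip().lower()
```

Under these assumptions, a value such as "text/html; charset=iso-8859-1" reduces to "text/html", which is the form a checker would compare against.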
Walter Dörwald
aaab30e00c Apply diff2.txt from SF patch http://www.python.org/sf/572113
(with one small bugfix in bgen/bgen/scantools.py)

This replaces string module functions with string methods
for the stuff in the Tools directory. Several uses of
string.letters etc. are still remaining.
2002-09-11 20:36:02 +00:00
Walter Dörwald
88a20baa77 Apply diff.txt from SF patch http://www.python.org/sf/561478
This uses cgi.parse_header() in Checker.checkforhtml(), so that
webchecker recognises the mime type text/html even if options
are specified.
2002-06-06 17:01:21 +00:00
Andrew M. Kuchling
566c0c737f [Bug #512799] urllib.splittype() returns a 2-tuple. (Reported by seb bacon) 2002-03-08 17:19:10 +00:00
Guido van Rossum
f0953b9dff Fix SF bug #482171: webchecker dies on file: URLs w/o robots.txt
The cause seems to be that when a file URL doesn't exist,
urllib.urlopen() raises OSError instead of IOError.  Simply add this
to the except clause.  Not elegant, but effective. :-)
2001-12-11 22:41:24 +00:00
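
A rough sketch of the pattern the commit above describes: listing both exception types in one except clause. It is written against Python 3's urllib.request rather than the urllib.urlopen() of the time, and the function name open_or_none is an assumption made for illustration. In Python 3, IOError is an alias of OSError, so naming both is redundant there and is kept only to mirror the commit.

```python
import urllib.request


def open_or_none(url):
    """Try to open a URL; return None if it cannot be opened.

    Hypothetical helper for illustration; not taken from webchecker.
    """
    try:
        return urllib.request.urlopen(url)
    except (IOError, OSError):
        # A missing file: URL ends up here instead of escaping as a traceback.
        return None
```

With this shape, a file: URL that points at a nonexistent robots.txt yields None, and the caller can treat that as "no robots.txt present".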
Fred Drake
a2133339ff Only catch NameError and TypeError when attempting to subclass an
exception (for compatibility with old versions of Python).
2001-05-11 19:40:10 +00:00
Fred Drake
d34a9c98a9 Added more link attributes based on additional information from Chris
McCafferty <christopher.mccafferty@csg.ch>, and a bit of experimentation
with Navigator 4.7.

HTML-as-deployed is evil!
2001-04-05 18:14:50 +00:00
Fred Drake
f3186e8242 A number of improvements based on a discussion with Chris McCafferty
<christopher.mccafferty@csg.ch>:

Add javascript: and telnet: to the types of URLs we ignore.

Add support for several additional URL-valued attributes on the BODY,
FRAME, IFRAME, LINK, OBJECT, and SCRIPT elements.
2001-04-04 17:47:25 +00:00
Guido van Rossum
f3335e193b Patch inspired by Just van Rossum: on the Mac, in savefilename(), make
the path to save a relative path by prefixing it with os.sep (':').
Also fix an indent inconsistency in the same function.
2000-04-25 21:13:24 +00:00
Guido van Rossum
918429b3b2 Moved robotparser.py to the Lib directory.
If you do a "cvs update" in the Lib directory, it will pop up there.
2000-03-29 16:02:45 +00:00
Guido van Rossum
84306246f1 Fix suggested by Magnus Kessler: in class Page, it is possible for
self.parser to be None; in that case don't dereference it in
getnames().
2000-03-28 20:10:39 +00:00
Guido van Rossum
dc8b7980e0 Skip Montanaro:
The robotparser.py module currently lives in Tools/webchecker.  In
preparation for its migration to Lib, I made the following changes:

    * renamed the test() function _test
    * corrected the URLs in _test() so they refer to actual documents
    * added an "if __name__ == '__main__'" catcher to invoke _test()
      when run as a main program
    * added doc strings for the two main methods, parse and can_fetch
    * replaced usage of regsub and regex with corresponding re code
2000-03-27 19:29:31 +00:00
Guido van Rossum
4755ee567d Complete the integration of Sam Bayer's fixes. 1999-11-17 15:41:47 +00:00
Guido van Rossum
497a19879d Changed from importing wcnew back to webchecker. 1999-11-17 15:40:48 +00:00
Guido van Rossum
e284b21457 Integrated Sam Bayer's wcnew.py code. It seems silly to keep two
files.  Removed Sam's "SLB" change comments; otherwise this is the
same as wcnew.py.
1999-11-17 15:40:08 +00:00