Commit Graph

45 Commits

Author SHA1 Message Date
Guido van Rossum
4755ee567d Complete the integration of Sam Bayer's fixes. 1999-11-17 15:41:47 +00:00
Guido van Rossum
497a19879d Changed fron importing wcnew back to webchecker. 1999-11-17 15:40:48 +00:00
Guido van Rossum
e284b21457 Integrated Sam Bayer's wcnew.py code. It seems silly to keep two
files.  Removed Sam's "SLB" change comments; otherwise this is the
same as wcnew.py.
1999-11-17 15:40:08 +00:00
Guido van Rossum
61b95db389 # *NOT* by Sam Bayer: reindented to use 4 spaces like the rest here,
# and removed trailing whitespace.
1999-11-17 15:13:21 +00:00
Guido van Rossum
64acb5ce93 Samuel L. Bayer:
- same trick with "import wcnew; webchecker = wcnew" as above
- updated readhtml() method to handle pair representation; used
  new name suppression infrastructure from wcnew.py to suppress
  processing name anchors

[And untabified --GvR]
1999-11-17 15:04:26 +00:00
Guido van Rossum
a8946406df Samuel L. Bayer:
- added -t and -a arguments
- added "import wcnew; webchecker = wcnew" in place of "import
  webchecker" (I assume that if you're happy with the changes, you'll
  just replace webchecker.py with wcnew.py, but if I were to do that,
  the diffs would be incomprehensible)
- fixed buggy -v argument (I think you got out of sync with the
  way verbosity was handled in webchecker vs. wcgui between 1.5 and
  1.5.2)
- made -v actually do something by adding a call to c.setflags()
  (probably the same problem as above)
- updated references to URLs to accommodate wcnew.py's pair
  representation; added appropriate calls to format_url() to handle
  display; added argument to ListPanel() initialization to provide
  access to format_url()

[And untabified --GvR]
1999-11-17 15:03:52 +00:00
Guido van Rossum
f97eecccb7 Samuel L. Bayer:
- same fixes from webchecker.py
- incorporated small diff between current webchecker.py and 1.5.2
- fixed bug where "extra roots" added with the -t argument were being
  checked as real roots, not just as possible continuations
- added -a argument to suppress checking of name anchors

[And untabified --GvR]
1999-11-17 15:02:53 +00:00
Guido van Rossum
dbd5c3e63b Samuel L. Bayer:
- forced new done origins to set errors if they're in self.bad (fixes
  bug where only the first of a number of errorful references to a
  link is reported under some circumstances)
- suppressed adding duplicates to self.todo list (cleans up printout
  in wcgui details)
1999-11-17 15:00:14 +00:00
Guido van Rossum
0ec1493d0b Some changes (maybe not enough?) to make it work on Windows with local
file URLs.
1999-04-26 23:11:46 +00:00
Guido van Rossum
545006259d Added Samuel Bayer's new webchecker.
Unfortunately his code breaks wcgui.py in a way that's not easy
to fix.  I expect that this is a temporary situation --
eventually Sam's changes will be merged back in.
(The changes add a -t option to specify exceptions to the -x
option, and explicit checking for #foo style fragment ids.)
1999-03-24 19:09:00 +00:00
Guido van Rossum
909bc18188 Recover from failed saves; when a file turns out to be a directory,
create a directory and moer the original file to the index.html.
1999-01-03 13:06:00 +00:00
Guido van Rossum
a42c1ee21d Added note() message to Page class -- this was used but didn't exist.
(The alternative would be to call self.checker.note() but since
self.checker might be None that's not quite right.
1998-08-06 21:31:13 +00:00
Guido van Rossum
b77a68e6b1 Rewrite to support multiple suckers, each with their own thread. 1998-07-08 03:05:22 +00:00
Guido van Rossum
125700addb Instead of printint, use self.message() or self.note(). 1998-07-08 03:04:39 +00:00
Guido van Rossum
0a13f7f23a # This is a new module I wrote over the weekend. Again, you missed the
# checkin email because my PC doesn't have the "Mail" command.

Add threading (now that it works).  Also some small adaptations to
Unix again.
1998-06-15 14:49:16 +00:00
Guido van Rossum
e3bd82117f Primitive GUI for websucker. 1998-06-15 12:35:19 +00:00
Guido van Rossum
d328a9b5f4 Fix the way a trailing / is changed to /index.html so that it
doesn't depend on the value of os.sep.  (I.e. ported to Windows :-)
1998-06-15 12:34:41 +00:00
Guido van Rossum
6eb9d32c43 sort the urls in the todo list 1998-06-15 12:33:02 +00:00
Guido van Rossum
bee64533d6 Use a try-except so that the pickle file is written even when we die
because of an unexpected exception.
1998-04-27 19:35:15 +00:00
Guido van Rossum
986abac1ba Give in to tabnanny 1998-04-06 14:29:28 +00:00
Guido van Rossum
88b02cf346 Use a better way to bind the checkext instance variable to a check
button widget, not involving a __getattr__() method but a callback on
the widget.
1998-03-05 20:12:18 +00:00
Guido van Rossum
1a7eae919a Adapt to new webchecker structure. Due to better structure of
getpage(), much less duplicate code is needed -- we only need to
override readhtml().
1998-02-21 20:08:39 +00:00
Guido van Rossum
00756bd4a6 Major overhaul. Don't use global variable (e.g. verbose); use
instance variables.  Make all global functions methods, for easy
overriding.  Restructure getpage() for easy overriding.  Add
save_pickle() method and load_pickle() global function to make it
easier for other programs to emulate the toplevel interface.
1998-02-21 20:02:09 +00:00
Guido van Rossum
f326134e5c Map .shtml to text/html. 1997-10-07 14:56:42 +00:00
Guido van Rossum
d57548023f A variant on webchecker that creates a mirror copy of a remote site. 1997-10-06 18:54:25 +00:00