Ezio Melotti
8e596a765c
#17802 : Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow.
2013-05-01 16:18:25 +03:00
Ezio Melotti
46495182d0
#15156 : HTMLParser now uses the new "html.entities.html5" dictionary.
2012-06-24 22:02:56 +02:00
Ezio Melotti
3861d8b271
#15114 : the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup.
2012-06-23 15:27:51 +02:00
Ezio Melotti
0780b6bc58
#14538 : HTMLParser can now correctly parse start tags that contain a bare /.
2012-04-18 19:18:22 -06:00
Ezio Melotti
29877e8e04
HTMLParser is now able to handle slashes in the start tag.
2012-02-21 09:25:00 +02:00
Ezio Melotti
e31ddedb0e
Fix an index and clean up comments.
2012-02-13 20:20:00 +02:00
Ezio Melotti
f4ab491901
Improve handling of declarations in HTMLParser.
2012-02-13 15:50:37 +02:00
Ezio Melotti
86f67123be
Fix htmlparser tests to always use the right collector.
2012-02-13 14:11:27 +02:00
Ezio Melotti
5211ffe4df
#13993 : HTMLParser is now able to handle broken end tags when strict=False.
2012-02-13 11:24:50 +02:00
Ezio Melotti
fa3702dc28
#13960 : HTMLParser is now able to handle broken comments when strict=False.
2012-02-10 10:45:44 +02:00
Ezio Melotti
62f3d0300e
#13576 : add tests about the handling of (possibly broken) condcoms.
2011-12-19 07:29:03 +02:00
Ezio Melotti
15cb489234
#13358 : HTMLParser now calls handle_data only once for each CDATA.
2011-11-18 18:01:49 +02:00
Ezio Melotti
c2fe57762b
#1745761 , #755670 , #13357 , #12629 , #1200313 : improve attribute handling in HTMLParser.
2011-11-14 18:53:33 +02:00
Ezio Melotti
b245ed1cdf
Group tests about attributes in a separate class.
2011-11-14 18:13:22 +02:00
Ezio Melotti
c1e73c30e9
Make sure that the tolerant parser still parses valid HTML correctly.
2011-11-01 18:57:15 +02:00
Ezio Melotti
b9a48f7144
Avoid reusing the same collector in the tests.
2011-11-01 15:00:59 +02:00
Ezio Melotti
18b0e5b79b
#12008 : add a test.
2011-11-01 14:42:54 +02:00
Ezio Melotti
7de56f6a04
#670664 : Fix HTMLParser to correctly handle the content of `<script>...</script>` and `<style>...</style>`.
2011-11-01 14:12:22 +02:00
Ezio Melotti
f50ffa94ab
#13273 : fix a bug that prevented HTMLParser from properly detecting some tags when strict=False.
2011-10-28 13:21:09 +03:00
Ezio Melotti
d9e0b068af
#12888 : Fix a bug in HTMLParser.unescape that prevented it from unescaping more than 128 entities. Patch by Peter Otten.
2011-09-05 17:11:06 +03:00
Ezio Melotti
2e3607c1e7
#7311 : fix html.parser to accept non-ASCII attribute values.
2011-04-07 22:03:31 +03:00
Senthil Kumaran
164540fee1
Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax
2010-12-28 15:55:16 +00:00
R. David Murray
b579dba119
#1486713 : Add a tolerant mode to HTMLParser.
The motivation for adding this option is that the functionality it
provides used to be provided by sgmllib in Python2, and was used by,
for example, BeautifulSoup. Without this option, the Python3 version
of BeautifulSoup and the many programs that use it are crippled.
The original patch was by 'kxroberto'. I modified it heavily but kept his
heuristics and test. I also added additional heuristics to fix #975556 ,
#1046092 , and part of #6191 . This patch should be completely backward
compatible: the behavior with the default strict=True is unchanged.
2010-12-03 04:06:39 +00:00
Victor Stinner
e021f4b206
Recorded merge of revisions 81500-81501 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81500 | victor.stinner | 2010-05-24 23:33:24 +0200 (lun., 24 mai 2010) | 2 lines
Issue #6662 : Fix parsing of malformatted charref (&#bad;)
........
r81501 | victor.stinner | 2010-05-24 23:37:28 +0200 (lun., 24 mai 2010) | 2 lines
Add the author of the last fix (Issue #6662 )
........
2010-05-24 21:46:25 +00:00
Benjamin Peterson
5a53fdeee8
Merged revisions 78678,78680,78682 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78678 | benjamin.peterson | 2010-03-04 21:07:59 -0600 (Thu, 04 Mar 2010) | 1 line
set svn:eol-style
........
r78680 | benjamin.peterson | 2010-03-04 21:15:07 -0600 (Thu, 04 Mar 2010) | 1 line
set svn:eol-style on Lib files
........
r78682 | benjamin.peterson | 2010-03-04 21:20:06 -0600 (Thu, 04 Mar 2010) | 1 line
remove the svn:executable property from files that don't have shebang lines
........
2010-03-05 03:33:11 +00:00