UTF-8 auto-detection tests. Bug 811363

Simon Montagu 2012-11-16 11:33:12 -08:00
parent 9b5a25b4e0
commit 67c7db920b
29 changed files with 555 additions and 0 deletions

View File

@ -41,6 +41,34 @@ MOCHITEST_CHROME_FILES = \
test_bug631751be.html \
bug638318_text.html \
test_bug638318.html \
bug811363-1.text \
bug811363-2.text \
bug811363-3.text \
bug811363-4.text \
bug811363-5.text \
bug811363-6.text \
bug811363-7.text \
bug811363-8.text \
bug811363-9.text \
bug811363-invalid-1.text \
bug811363-invalid-2.text \
bug811363-invalid-3.text \
bug811363-invalid-4.text \
bug811363-invalid-5.text \
test_bug811363-1-1.html \
test_bug811363-1-2.html \
test_bug811363-1-3.html \
test_bug811363-1-4.html \
test_bug811363-1-5.html \
test_bug811363-2-1.html \
test_bug811363-2-2.html \
test_bug811363-2-3.html \
test_bug811363-2-4.html \
test_bug811363-2-5.html \
test_bug811363-2-6.html \
test_bug811363-2-7.html \
test_bug811363-2-8.html \
test_bug811363-2-9.html \
$(NULL)
include $(topsrcdir)/config/rules.mk

View File

@ -0,0 +1 @@
Two-byte UTF-8 including the first and last characters in the range: €Шерлок߿

View File

@ -0,0 +1,3 @@
Three-byte UTF-8, first byte 0xE0, including first and last characters
in the range: ࠀशर्लक࿿

View File

@ -0,0 +1,3 @@
Three-byte UTF-8, first byte 0xE1-EC, including first and last characters
in the range: ကシャーロック쿿

View File

@ -0,0 +1,3 @@
Three-byte UTF-8, first byte 0xED, including first and last characters
in the range: 퀀홈하홈탐퟿

View File

@ -0,0 +1,3 @@
Three-byte UTF-8, first byte 0xEE-EF, including first and last characters
in the range: ﴍﻟﻮﻙ￿

View File

@ -0,0 +1,3 @@
Four-byte UTF-8, first byte 0xF0, including first and last characters
in the range: 𐀀𐌲𐌿𐍄𐌹𐍃𐌺 𿿿

View File

@ -0,0 +1,3 @@
Four-byte UTF-8, first byte 0xF1-F3, including first and last characters
in the range: 񀀀񠀀 񠀁 񠀂󿿿

View File

@ -0,0 +1,3 @@
Four-byte UTF-8, first byte 0xF4, including first and last characters
in the range:􀀀􈀀 􈀁 􈀂􏿿

View File

@ -0,0 +1,2 @@
Four byte UTF-8, first byte 0xF0, including BMP only:𐤔𐤓𐤋𐤅𐤒
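
Note on the valid fixtures: each data file above targets one lead-byte range and includes the first and last code points of that range. The sketch below is not part of the commit; it uses the standard TextEncoder API to print the UTF-8 bytes of those boundary code points, so the lead bytes named in the fixture descriptions can be checked by hand. The helper name utf8Hex and the boundaries table are illustrative assumptions.

```javascript
// Illustrative helper (not from the commit): show the UTF-8 bytes of a
// code point as upper-case hex, e.g. to confirm which lead byte it uses.
const encoder = new TextEncoder();

function utf8Hex(codePoint) {
  const bytes = encoder.encode(String.fromCodePoint(codePoint));
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

// First and last code point of each lead-byte range named in the fixtures.
const boundaries = [
  ["2-byte, 0xC2-DF", 0x0080, 0x07FF],
  ["3-byte, 0xE0", 0x0800, 0x0FFF],
  ["3-byte, 0xE1-EC", 0x1000, 0xCFFF],
  ["3-byte, 0xED", 0xD000, 0xD7FF],
  ["3-byte, 0xEE-EF", 0xE000, 0xFFFF],
  ["4-byte, 0xF0", 0x10000, 0x3FFFF],
  ["4-byte, 0xF1-F3", 0x40000, 0xFFFFF],
  ["4-byte, 0xF4", 0x100000, 0x10FFFF],
];

for (const [label, first, last] of boundaries) {
  console.log(`${label}: ${utf8Hex(first)} .. ${utf8Hex(last)}`);
}
```

The 0xED row stops at U+D7FF because U+D800 through U+DFFF are surrogates, which is the case the invalid fixtures cover separately.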

View File

@ -0,0 +1,4 @@
Orphaned continuation bytes: €<>ƒ„…†‡ˆ‰ŠŒ<E280B9>Ž<EFBFBD>
<EFBFBD>“”•˜™šœ<EFBFBD>žŸ
 ¡¢£¤¥¦§¨©ª«¬­®¯
°±²³´µ¶·¸¹º»¼½¾¿

View File

@ -0,0 +1,3 @@
First bytes of 2-byte sequences (0xc0-0xdf), each followed by a space character: À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
First bytes of 3-byte sequences (0xe0-0xef), each followed by a space character: à á â ã ä å æ ç è é ê ë ì í î ï
First bytes of 4-byte sequences (0xf0-0xf4), each followed by a space character: ð ñ ò ó ô

View File

@ -0,0 +1,2 @@
3-byte sequence with last byte missing (U+0000): à°
4-byte sequence with last byte missing (U+0000): ð°€

View File

@ -0,0 +1 @@
Overlong encodings: <20><> <20><><EFBFBD> <20><><EFBFBD><EFBFBD>

View File

@ -0,0 +1,3 @@
Isolated surrogates: <20><><EFBFBD> <20><><EFBFBD>
Surrogate pairs: <20><><EFBFBD><EFBFBD><EFBFBD><EFBFBD> <20><><EFBFBD><EFBFBD><EFBFBD><EFBFBD>
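
Note on the invalid fixtures: the five files above cover orphaned continuation bytes, lead bytes with no continuation, truncated sequences, overlong encodings, and encoded surrogates. A minimal sketch of a well-formedness check that rejects all five categories follows. It is based on the well-formed byte sequence table in the Unicode Standard, not on the detector code this commit tests; the names RANGES and isWellFormedUtf8 are assumptions made for illustration.

```javascript
// Sketch only: a table-driven UTF-8 well-formedness check.
// Each row: [lead low, lead high, second-byte low, second-byte high, length].
const RANGES = [
  [0xC2, 0xDF, 0x80, 0xBF, 2],
  [0xE0, 0xE0, 0xA0, 0xBF, 3], // a lower second byte would be overlong
  [0xE1, 0xEC, 0x80, 0xBF, 3],
  [0xED, 0xED, 0x80, 0x9F, 3], // a higher second byte would encode a surrogate
  [0xEE, 0xEF, 0x80, 0xBF, 3],
  [0xF0, 0xF0, 0x90, 0xBF, 4], // a lower second byte would be overlong
  [0xF1, 0xF3, 0x80, 0xBF, 4],
  [0xF4, 0xF4, 0x80, 0x8F, 4], // a higher second byte would exceed U+10FFFF
];

function isWellFormedUtf8(bytes) {
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    if (lead < 0x80) { // ASCII
      i += 1;
      continue;
    }
    const row = RANGES.find(([lo, hi]) => lead >= lo && lead <= hi);
    // No row: orphaned continuation byte, 0xC0/0xC1 (overlong), or 0xF5 and up.
    if (!row) return false;
    const [, , lo2, hi2, length] = row;
    if (i + length > bytes.length) return false; // sequence cut off by end of input
    if (bytes[i + 1] < lo2 || bytes[i + 1] > hi2) return false;
    for (let k = 2; k < length; k++) {
      if (bytes[i + k] < 0x80 || bytes[i + k] > 0xBF) return false;
    }
    i += length;
  }
  return true;
}

console.log(isWellFormedUtf8([0xD0, 0xA8]));       // true: a valid 2-byte sequence
console.log(isWellFormedUtf8([0xC0, 0x80]));       // false: overlong encoding
console.log(isWellFormedUtf8([0xED, 0xA0, 0x80])); // false: encoded surrogate
```

A detector has more to decide than this check does, since it must also pick a fallback charset when the input is not UTF-8, which is what the expected values in the test pages below record.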

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-invalid-1.text",
"ISO-8859-1",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-invalid-2.text",
"windows-1252",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-invalid-3.text",
"windows-1252",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-invalid-4.text",
"windows-1252",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-invalid-5.text",
"ISO-8859-1",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-1.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-2.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-3.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-4.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-5.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-6.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-7.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-8.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>

View File

@ -0,0 +1,35 @@
<!DOCTYPE HTML>
<html>
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=811363
-->
<head>
<title>Test for Bug 811363</title>
<script type="text/javascript"
src="chrome://mochikit/content/tests/SimpleTest/SimpleTest.js">
</script>
<script type="text/javascript" src="CharsetDetectionTests.js"></script>
<link rel="stylesheet" type="text/css"
href="chrome://mochikit/content/tests/SimpleTest/test.css" />
</head>
<body>
<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=811363">Mozilla Bug 811363</a>
<p id="display"></p>
<div id="content" style="display: none">
</div>
<iframe id="testframe"></iframe>
<pre id="test">
<script class="testbody" type="text/javascript">
/** Test for Bug 811363 **/
CharsetDetectionTests("bug811363-9.text",
"UTF-8",
new Array("ja_parallel_state_machine",
"zh_parallel_state_machine",
"zhtw_parallel_state_machine",
"zhcn_parallel_state_machine",
"cjk_parallel_state_machine",
"universal_charset_detector"));
</script>
</pre>
</body>
</html>