Dong-hee Na
113feb3ec2
bpo-40328: Add tool for generating cjk mapping headers (GH-19602)
2020-04-30 02:34:24 +09:00
Benjamin Peterson
51796e5d26
Update some www.unicode.org URLs to use HTTPS. (GH-18912)
2020-03-10 21:10:59 -07:00
Benjamin Peterson
051b9d08d1
closes bpo-39926: Update Unicode to 13.0.0. (GH-18910)
2020-03-10 20:41:34 -07:00
Greg Price
a65678c5c9
bpo-37760: Convert from length-18 lists to a dataclass, in makeunicodedata. (GH-15265)
...
Now the fields have names! Much easier to keep straight as a
reader than the elements of an 18-tuple.
Runs about 10-15% slower: from 10.8s to 12.3s, on my laptop.
Fortunately that's perfectly fine for this maintenance script.
2019-09-12 10:23:43 +01:00
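A minimal sketch of the idea behind this change, not the script's actual code: positional list access is replaced by named attributes. The class name `UcdRecord`, the helper `from_row`, and the reduction to three fields are assumptions made for illustration (the commit concerns 18 fields).

```python
from dataclasses import dataclass


@dataclass
class UcdRecord:
    # Hypothetical, heavily reduced record: only the first three
    # UnicodeData.txt fields are modelled here.
    codepoint: str
    name: str
    general_category: str


def from_row(row):
    """Build a record from a semicolon-split UnicodeData.txt row."""
    return UcdRecord(codepoint=row[0], name=row[1], general_category=row[2])


row = "0041;LATIN CAPITAL LETTER A;Lu".split(";")
record = from_row(row)

# Before: the reader has to remember that index 2 is the general category.
category_by_index = row[2]
# After: the field name documents itself.
category_by_name = record.general_category
```

Attribute access costs slightly more than list indexing, which is consistent with the modest slowdown the commit message reports.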
Greg Price
3cbc23aa22
bpo-37758: Cut always-constant conditionals on sys.maxunicode. (GH-15302)
...
Since PEP 393 in Python 3.3, this value is always 0x10ffff, the
maximum codepoint in Unicode; there's no longer such a thing as a
UCS-2 build of Python, which couldn't properly represent some
characters.
There are a couple of spots left where we still condition on the value
of this constant. Take them out.
2019-09-09 08:20:40 -07:00
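As a rough illustration of what an "always-constant conditional" looks like, assuming Python 3.3 or later: the function names below are hypothetical and do not come from the script.

```python
import sys


def last_codepoint_old():
    # Old-style guard. On Python 3.3+ only the first branch can run,
    # because sys.maxunicode is always 0x10FFFF.
    if sys.maxunicode == 0x10FFFF:
        return 0x10FFFF
    return 0xFFFF  # dead branch: narrow (UCS-2) builds no longer exist


def last_codepoint_new():
    # The conditional is cut; the constant is used directly.
    return 0x10FFFF
```

Both functions return the same value on any supported interpreter, so the guard can be removed without changing behaviour.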
Greg Price
3e4498d35c
bpo-37760: Avoid cluttering work tree with downloaded Unicode files. (GH-15128)
2019-08-14 18:18:53 -07:00
Greg Price
c03e698c34
bpo-37760: Factor out standard range-expanding logic in makeunicodedata. (GH-15248)
...
Much like the lower-level logic in commit ef2af1ad4, we had
4 copies of this logic, written in a couple of different ways.
They're all implementing the same standard, so write it just once.
2019-08-13 19:28:38 -07:00
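For context, UnicodeData.txt abbreviates large blocks as a pair of rows whose names end in ", First>" and ", Last>". The sketch below shows one way to expand such pairs in a single place. The function name `expand_ranges`, the two-column input, and the choice to label expanded entries with the range name are assumptions for illustration; the sample range is deliberately shortened and is not real data.

```python
def expand_ranges(rows):
    """Yield (codepoint, name) pairs, filling in First/Last ranges.

    `rows` is an iterable of (hex_codepoint, name) pairs in file order.
    """
    range_start = None
    range_name = None
    for code_hex, name in rows:
        code = int(code_hex, 16)
        if name.endswith(", First>"):
            # Remember where the range begins and strip the markers.
            range_start = code
            range_name = name[1:-len(", First>")]
        elif name.endswith(", Last>"):
            # Emit every code point in the closed interval.
            for cp in range(range_start, code + 1):
                yield cp, range_name
            range_start = None
        else:
            yield code, name


sample = [
    ("0041", "LATIN CAPITAL LETTER A"),
    ("3400", "<CJK Ideograph Extension A, First>"),
    ("3403", "<CJK Ideograph Extension A, Last>"),  # shortened for the example
]
expanded = list(expand_ranges(sample))
```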
Greg Price
99d208efed
bpo-37760: Constant-fold some old options in makeunicodedata. (GH-15129)
...
The `expand` option was introduced in 2000 in commit fad27aee1.
It appears to have been always set since it was committed, and
what it does is tell the code to do something essential. So,
just always do that, and cut the option.
Also cut the `linebreakprops` option, which isn't consulted anymore.
2019-08-12 22:59:30 -07:00
Greg Price
ef2af1ad44
bpo-37760: Factor out the basic UCD parsing logic of makeunicodedata. (GH-15130)
...
There were 10 copies of this, and almost as many distinct versions of
exactly how it was written. They're all implementing the same
standard. Pull them out to the top, so the more interesting logic
that remains becomes easier to read.
2019-08-12 22:20:56 -07:00
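The shared standard here is the UCD file format: `#` starts a comment, fields are separated by `;`, and whitespace around fields is not significant. A minimal sketch of a single shared parser, assuming that format; the name `parse_ucd_lines` and the sample lines are hypothetical, and this is not the script's actual helper.

```python
def parse_ucd_lines(lines):
    """Yield a list of stripped fields for each data line."""
    for line in lines:
        # Drop any trailing comment, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        yield [field.strip() for field in line.split(";")]


sample = [
    "# Example data in UCD style",
    "",
    "0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..Z",
    "00AA          ; Latin",
]
parsed = list(parse_ucd_lines(sample))
```

With one parser at the top of the script, each caller only has to interpret the fields it cares about.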
Stefan Behnel
faa2948654
Clean up and reduce visual clutter in the makeunicode.py script. (GH-7558)
2019-06-01 21:49:03 +02:00
Benjamin Peterson
3aca40d3cb
closes bpo-36861: Update Unicode database to 12.1.0. (GH-13214)
...
Adds ㋿.
2019-05-08 20:59:35 -07:00
Inada Naoki
6fec905de5
bpo-36642: make unicodedata const (GH-12855)
2019-04-17 08:40:34 +09:00
Serhiy Storchaka
172bb39452
bpo-22831: Use "with" to avoid possible fd leaks in tools (part 2). (GH-10927)
2019-03-30 08:33:02 +02:00
Benjamin Peterson
738c19f4c5
closes bpo-33376: Update to Unicode 12.0.0. (GH-12256)
2019-03-09 16:25:55 -08:00
Benjamin Peterson
7c69c1c0fb
update to Unicode 11.0.0 (closes bpo-33778) (GH-7439)
...
Also, standardize indentation of generated tables.
2018-06-06 20:14:28 -07:00
Benjamin Peterson
279a96206f
bpo-30736: upgrade to Unicode 10.0 (#2344)
...
Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
2017-06-22 22:31:08 -07:00
Zachary Ware
6b6e687766
bpo-27425: Be more explicit in .gitattributes (GH-840)
...
Updates checked-in line endings on several files.
2017-06-10 14:58:42 -05:00
Jon Dufresne
3972628de3
bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
...
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator expression>) when supported (e.g. any(), all(), tuple(), min(), & max())
2017-05-18 07:35:54 -07:00
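The four replacements listed in this commit can be illustrated as follows. The variable names and sample data are made up for the example; each pair produces the same result.

```python
words = ["alpha", "beta", "gamma"]

# list(<generator expression>) -> list comprehension
lengths_old = list(len(w) for w in words)
lengths_new = [len(w) for w in words]

# dict(<generator expression>) -> dict comprehension
by_word_old = dict((w, len(w)) for w in words)
by_word_new = {w: len(w) for w in words}

# set(<list literal>) -> set literal
vowels_old = set(["a", "e", "i", "o", "u"])
vowels_new = {"a", "e", "i", "o", "u"}

# func(<list comprehension>) -> func(<generator expression>)
longest_old = max([len(w) for w in words])
longest_new = max(len(w) for w in words)
```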
Benjamin Peterson
6775231597
Unicode 9.0.0
...
Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata.
2016-09-14 23:53:47 -07:00
R David Murray
44b548dda8
#27364: fix "incorrect" uses of escape character in the stdlib.
...
And most of the tools.
Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Benjamin Peterson
4801383c29
upgrade to Unicode 8.0.0
2015-06-27 15:45:56 -05:00
Serhiy Storchaka
ba9ac5b5c4
Issue #16261: Converted some bare except statements to except statements
...
with specified exception type. Original patch by Ramchandra Apte.
2015-05-20 10:33:40 +03:00
Zachary Ware
774ac377da
Closes #17202: Merge with 3.4
2015-04-13 12:11:40 -05:00
Zachary Ware
4c9c848159
Issue #17202: Add .bat to .hgeol to force them to CRLF.
...
Using LF can cause a script to fail if it tries to use a label that is
split across 512 byte blocks. Who knows why.
2015-04-13 11:59:54 -05:00
Serhiy Storchaka
82e07b92b3
Issue #23181: More "codepoint" -> "code point".
2015-01-18 11:33:31 +02:00