2010-06-04 Damien Diederen <dd@crosstwine.com>

* create-category-table.cs: Utility to generate reasonably-packed
Unicode tables.

This program generates (partially) bi-level tables encoding the
contents of the Unicode character category database.

Mono embeds a linear table with category codes for the Unicode BMP
(first 65536 codepoints), and lacks information about characters
in the astral planes--leading to requests such as bug 480178.
Extending the linear table to cover the full codespace is not an
ideal solution, as that would expand the embedded "blob" by a
factor of 17.

The new tables generated by this program can be used to support
the full range of characters. An additional level of indirection
used for characters outside the U+0000..U+FFFF range enables
"page" sharing, so that the total amount of embedded data only
grows by 13.5kB.

Cf. in-file comments for usage instructions.

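The "page" sharing described in the 2010-06-04 entry can be sketched
in Python as follows. This is only an illustration, not the C#
generator: the page size of 256 and the names build_two_level and
lookup are assumptions, and for simplicity the sketch applies the
indirection to the whole codespace rather than only to the planes
above the BMP.

```python
import unicodedata

PAGE_SIZE = 256  # assumed page size; the real generator may use another


def build_two_level(values):
    """Split a flat table into unique pages plus an index of page numbers."""
    pages = []      # distinct pages, each a tuple of PAGE_SIZE values
    page_ids = {}   # page contents -> position in `pages`
    index = []      # one entry per PAGE_SIZE-sized block of the input
    for start in range(0, len(values), PAGE_SIZE):
        page = tuple(values[start:start + PAGE_SIZE])
        if page not in page_ids:
            page_ids[page] = len(pages)
            pages.append(page)
        index.append(page_ids[page])
    return index, pages


def lookup(index, pages, cp):
    """Return the value stored for code point `cp`."""
    return pages[index[cp // PAGE_SIZE]][cp % PAGE_SIZE]


# Flat category table for the whole codespace (17 planes of 65536 entries).
flat = [unicodedata.category(chr(cp)) for cp in range(0x110000)]
index, pages = build_two_level(flat)
```

Identical blocks (for instance the many fully unassigned ones) are
stored once, so len(pages) ends up smaller than len(index).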
2010-05-17 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : fix extender search index for LastIndexOf().
Fixed bug #605094.

2010-04-20 Damien Diederen <dd@crosstwine.com>

* Normalization.cs: Really apply canonical reordering "recursively."

Before this, a sequence of code points with the combining
classes (22, 33, 11) would be reordered to (22, 11, 33) instead of
the correct (11, 22, 33). This is because the 'i--' would be
directly cancelled by the 'i++' in the for loop.

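The reordering problem in the 2010-04-20 entry can be reproduced
with a small Python sketch that works on combining classes only.
The function names are hypothetical and this is not the code from
Normalization.cs; single_forward_pass merely models the effect of
a step back that is cancelled by the loop increment.

```python
def single_forward_pass(ccc):
    """Model of the old behaviour: each pair is visited exactly once."""
    ccc = list(ccc)
    for i in range(1, len(ccc)):
        if ccc[i] != 0 and ccc[i - 1] > ccc[i]:
            ccc[i - 1], ccc[i] = ccc[i], ccc[i - 1]
    return ccc


def reorder(ccc):
    """Canonical reordering: sort each run of non-starters by class."""
    ccc = list(ccc)
    i = 1
    while i < len(ccc):
        # A starter (class 0) is never moved, and nothing moves across it.
        if ccc[i] != 0 and ccc[i - 1] > ccc[i]:
            ccc[i - 1], ccc[i] = ccc[i], ccc[i - 1]
            # Step back so the moved element is compared with its new
            # predecessor; stay at 1 when already at the front.
            i = max(i - 1, 1)
        else:
            i += 1
    return ccc
```

With the classes from the entry, single_forward_pass([22, 33, 11])
gives [22, 11, 33], while reorder([22, 33, 11]) gives [11, 22, 33].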
2010-04-20 Damien Diederen <dd@crosstwine.com>

* Normalization.cs: The correct "checkType" argument to
Decompose() is NFD or NFKD when normalizing to NFC resp. NFKC.

* StringTest.cs: More NFC test cases.

2010-04-20 Damien Diederen <dd@crosstwine.com>

* Normalization.cs: Implement algorithmic Hangul composition.
Calling Normalize(NormalizationForm.FormC) on Korean characters
now works properly (bnc#480152).

* StringTest.cs: Add test cases for Hangul composition.

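For reference, algorithmic Hangul composition as specified by the
Unicode Standard can be sketched in Python like this. The constants
are the ones published in the standard; the function name
compose_hangul is hypothetical and the sketch is not a transcription
of Normalization.cs.

```python
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables


def compose_hangul(cps):
    """Combine L+V and LV+T jamo sequences into precomposed syllables."""
    out = []
    for cp in cps:
        if out:
            last = out[-1]
            l_index = last - L_BASE
            v_index = cp - V_BASE
            # Leading consonant followed by a vowel: build an LV syllable.
            if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
                out[-1] = S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
                continue
            s_index = last - S_BASE
            t_index = cp - T_BASE
            # LV syllable followed by a trailing consonant: build LVT.
            if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                    and 0 < t_index < T_COUNT):
                out[-1] = last + t_index
                continue
        out.append(cp)
    return out
```

For example, compose_hangul([0x1112, 0x1161, 0x11AB]) yields the
single syllable U+D55C.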
2010-04-20 Damien Diederen <dd@crosstwine.com>

* Normalization.cs: Follow the spec when checking composition pairs.

Figure 7 in section 1.3 of http://unicode.org/reports/tr15/ shows
how when doing composition, one has to examine the successive
(starter, candidate) pairs, and combine if a matching canonical
decomposition exists.

The original algorithm was, instead, iterating on canonical
decompositions, and, for each one, trying to match a sequence
of (starter, non-starter, ...). This, however, does not produce
the same results as it is violating some implicit ordering
constraints in the Unicode tables.

E.g., when composing the following sequence of codepoints, the
original algorithm was picking:

    03B7 0313 0300 0345
    ^^^^      ^^^^
    1F74 0313 0345
    ^^^^      ^^^^
    1FC2 0313

and would stop at 1FC2 0313 as there is no decomposition matching
it. The new algorithm, which follows the guidance of the pretty
figure 7, ends up doing:

    03B7 0313 0300 0345
    ^^^^ ^^^^
    1F20 0300 0345
    ^^^^ ^^^^
    1F22 0345
    ^^^^ ^^^^
    1F92

resulting in the correct 1F92.

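The pair-by-pair composition described above can be sketched in
Python as follows. The COMPOSE table is an assumption that holds
only the five primary composites needed for this example (a real
table is derived from UnicodeData.txt minus the composition
exclusions), and the function name compose is hypothetical. The
loop follows the sample algorithm of UAX #15, not Mono's code.

```python
import unicodedata

# Example-only table: (starter, candidate) -> primary composite.
COMPOSE = {
    (0x03B7, 0x0313): 0x1F20,
    (0x03B7, 0x0300): 0x1F74,
    (0x1F20, 0x0300): 0x1F22,
    (0x1F74, 0x0345): 0x1FC2,
    (0x1F22, 0x0345): 0x1F92,
}


def compose(cps):
    """Compose canonically ordered code points, one pair at a time."""
    if not cps:
        return []
    out = [cps[0]]
    starter = 0  # index in `out` of the most recent starter
    last_ccc = unicodedata.combining(chr(cps[0]))
    if last_ccc != 0:
        last_ccc = 256  # leading non-starter: nothing composes onto it
    for cp in cps[1:]:
        ccc = unicodedata.combining(chr(cp))
        composite = COMPOSE.get((out[starter], cp))
        # The candidate is blocked when a kept character since the
        # starter has a combining class >= the candidate's class.
        if composite is not None and (last_ccc < ccc or last_ccc == 0):
            out[starter] = composite
        else:
            if ccc == 0:
                starter = len(out)
            last_ccc = ccc
            out.append(cp)
    return out
```

Called on [0x03B7, 0x0313, 0x0300, 0x0345], the loop goes through
1F20 and 1F22 and ends with [0x1F92], as in the second diagram.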
2010-04-19 Damien Diederen <dd@crosstwine.com>

* Normalization.cs: Recursively apply the Unicode decomposition mapping.

According to http://www.unicode.org/reports/tr15/tr15-31.html,
section 1.3:

"To transform a Unicode string into a given Unicode Normalization
Form, the first step is to fully decompose the string. [...] Full
decomposition involves recursive application of the
Decomposition_Mapping values, because in some cases a complex
composite character may have a Decomposition_Mapping into a
sequence of characters, one of which may also have its own
non-trivial Decomposition_Mapping value."

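A minimal Python sketch of the recursive application quoted above,
using the Decomposition_Mapping values exposed by the standard
unicodedata module. The function name decompose is hypothetical;
canonical reordering and Hangul syllables (which are decomposed
algorithmically) are deliberately left out.

```python
import unicodedata


def decompose(cp, compat=False):
    """Fully decompose one code point by applying the mapping recursively."""
    mapping = unicodedata.decomposition(chr(cp)).split()
    if mapping and mapping[0].startswith("<"):
        if not compat:
            return [cp]        # compatibility mapping: not used for NFD
        mapping = mapping[1:]  # drop the "<tag>" marker
    if not mapping:
        return [cp]            # no Decomposition_Mapping for this character
    result = []
    for part in mapping:
        # Each element of the mapping may itself decompose further.
        result.extend(decompose(int(part, 16), compat))
    return result
```

U+1F92 maps to 1F22 0345, 1F22 maps to 1F20 0300, and 1F20 maps to
03B7 0313, so decompose(0x1F92) needs three levels of recursion.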
2010-02-18 Gabriel Burt <gabriel.burt@gmail.com>

* Normalization.cs: Implement algorithmic Hangul decomposition; calling
string.Normalize on Korean characters now works properly (bnc#480152).
This reduces the number of errors in 'make test' from 27k to 4.8k.

* StringNormalizationTestSource.cs:
* Makefile: Use the local, working copy of Normalization etc., so as to make
modifying Normalization.cs and then testing your changes with 'make test'
possible. Also, fix building/running of tests, patch by Alexander
Kojevnikov.

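Algorithmic Hangul decomposition, as specified by the Unicode
Standard, can be sketched in Python as follows. The constants come
from the standard; the function name decompose_hangul is
hypothetical and the code is not taken from Normalization.cs.

```python
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11172 precomposed syllables


def decompose_hangul(cp):
    """Split a precomposed Hangul syllable into its conjoining jamo."""
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]  # not a precomposed syllable: leave untouched
    jamo = [
        L_BASE + s_index // N_COUNT,               # leading consonant
        V_BASE + (s_index % N_COUNT) // T_COUNT,   # vowel
    ]
    t_index = s_index % T_COUNT
    if t_index:
        jamo.append(T_BASE + t_index)              # optional trailing consonant
    return jamo
```

Because the mapping is pure arithmetic, no per-syllable table data
has to be embedded.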
2009-09-18 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs : Handle blocked characters which are not
immediately next to the primary composite character. This fixes
some Arabic string sequence normalization.
* Makefile : fix test build.

2009-09-17 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs : some renaming for disambiguation.
* NormalizationTableUtil.cs : fix some wrong ranges in
mapIdxToComposite. This fixes some Arabic normalization (and more).
* normalization-notes.txt : added some notes on the implementation.

2008-06-19 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs :
- reverted the previous index calculation change. It was correctly
  implemented and I rather broke it.
- fix index calculation on combining.
- NFKD was incorrectly directed to combining path. It should not.
- Simplify quick check.

2008-06-15 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs : For NFC and NFKC, IsNormalized() did not do
enough to check composed characters. It's not possible without
the actual composition, so just call Normalize() and compare them.
In Normalize(), the mapping helper didn't pick the correct map index
since the table for index stores index for "uncompressed" numbers.
* NormalizationTableUtil.cs : updated to the latest UCD.
* Makefile : to build test, source file must be downloaded too.

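The "normalize and compare" approach from the 2008-06-15 entry can
be illustrated in Python with the standard unicodedata module. The
helper name is_normalized is hypothetical; the sketch shows the
idea only and says nothing about the cost of the C# version.

```python
import unicodedata


def is_normalized(form, text):
    """Report whether `text` is already in the given normalization form."""
    # Normalizing a string that is already normalized leaves it unchanged,
    # so comparing with the normalized copy answers the question.
    return unicodedata.normalize(form, text) == text
```

For instance, "e" followed by U+0301 is in NFD but not in NFC,
whereas the precomposed U+00E9 is in NFC but not in NFD.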
2008-11-05 Atsushi Enomoto <atsushi@ximian.com>

* ucd.cs : Write type for *_count. Add notice to not edit
unicode-data.h directly.

2008-11-04 Atsushi Enomoto <atsushi@ximian.com>

* ucd.cs : new code to generate unicode table for eglib.

2008-07-04 Andreas Nahr <ClassDevelopment@A-SoftTech.com>

* SortKey: Fix parameter names, add attribute, small formatting

2008-06-27 Rodrigo Kumpera <rkumpera@novell.com>

* CodePointIndexer.cs : Make TableRange a struct instead
of a class so we save 2 memory ops per ToIndex loop.

2008-04-02 Atsushi Enomoto <atsushi@ximian.com>

* SortKey.cs : check null arguments. Fixed bug #376171.

2007-07-20 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : I wonder how long its build
has been broken ...

2007-03-06 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : disable QuickCheckPossible(), which is
inaccurate and inefficient. Fixed bug #79714.

2007-02-15 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : character filtering is needed for
OrdinalIgnoreCase in 2.0 profile. Fixed bug #80865.

2007-01-25 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : GetTailContraction() was broken to pick correct
contraction/special sortkey out and thus LastIndexOf() failed when
it is involved. Fixed bug #80612.

2007-01-22 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : for non-StringSort comparison, level5 (- and ')
should be still skipped after initial level5 check is done (while
they were simply treated as a normal character). Fixed bug #78748.
* SortKeyBuffer.cs : Fixed NRE in french sort.

2006-12-25 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : added IndexOf() implementation for Ordinal
and OrdinalIgnoreCase, though Ordinal version is not used (since
it is slower than icall).

2006-05-30 Miguel de Icaza <miguel@novell.com>

* MSCompatUnicodeTable.cs: Remove the fixed loading and compute it
just when we actually consume it. This only fixes the
!USE_C_HEADER case.

2006-04-14 Atsushi Enomoto <atsushi@ximian.com>

* README: removed obsolete info.
* Normalization.cs : canonical reordering should participate in the
decomposition step. In reordering, string append was incomplete.
Combining class check is required in NFD check. Icall is written
using IntPtr now.

2005-12-07 Zoltan Varga <vargaz@gmail.com>

* SimpleCollator.cs: Fix a warning.

2005-11-30 Sebastien Pouliot <sebastien@ximian.com>

* SimpleCollator.cs: Fix CAS support. The static ctor/var try to get
the environment variable MUCH too soon (i.e. the security manager
needs the collator).

2005-11-29 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : direct fast-path optimization for IndexOf().

2005-11-29 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs :
- CompareQuick(): added immediateBreakup to avoid extraneous sortkey
  computation.
- QuickCheckPossible(): index used for s1 was incorrect.

2005-11-29 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : added another quick check for CompareInternal()
that does almost ordinal comparison for quick-checkable strings.
(It affects Compare(), IndexOf(), IsSuffix() etc. as well.)

2005-11-14 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : (IsIgnorable) \0 is not ignorable.
Fixed bug 76702.

2005-11-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs :
Created another struct to reduce method arguments. Created another
set of flags that keeps "once-matched" state (counterpart of
checkedFlags, now neverMatchFlags).

2005-11-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs :
- Added CompareOrdinalIgnoreCase() for NET_2_0 RTM.
- Reduced extra parameter from LastIndexOfSortKey().
- LastIndexOf() should use GetTailContraction for the source string.
  And then, target could match in the middle of the possible
  "replacement contraction" of the source string, so use
  LastIndexOfSortKey() to catch them.
- Fixed GetTailContraction() that caused index out of range.

2005-11-11 Atsushi Enomoto <atsushi@ximian.com>

* Makefile : Now use MONO_DISABLE_MANAGED_COLLATION.
* SortKey.cs : some members are virtual.

2005-10-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : modified to use stackalloc for byte array.

2005-09-27 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : in CompareInternal(), there was a possibility of
infinite loop. Fixed bug #76243.

2005-09-20 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : In IsPrefix/IsSuffix, if target is an empty string,
immediately return true.

2005-09-09 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : IsSuffix() optimization logic was buggy, so just
use pretty simple way with LastIndexOf() (no significant perf.
problem).

2005-09-01 Atsushi Enomoto <atsushi@ximian.com>

* README, Collation-notes.txt, CollationDataStructures.txt :
removed obsolete info and added some notes.

2005-08-10 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs : remove warned code.
* managed-collation.patch : now it's not required anymore.

2005-08-10 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : added IsSortable(string).

2005-08-10 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : Now all collator methods are thread safe.

All instance non-readonly fields turned into arguments of every
method that uses those fields.
(Sadly it is the end of no-memory-cost collator era. mcs bootstrap
now needs +100KB memory consumption.)

2005-08-09 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : made "checkedFlags" nullable and made it
an argument of every index method (to make it thread safe).

2005-08-09 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs,
MSCompatUnicodeTable.cs :
- Now IsIgnorable() is aggregated to be one invocation to check
  completely ignorable, nonspacing and symbols.
- Introduced "already checked" flags for IndexOf() and LastIndexOf()
  to skip sortkey binary check on the same characters. Significant
  perf. improvement for such case as IndexOf("AABCBABC...Z",'Z').

2005-08-08 Gert Driesen <drieseng@users.sourceforge.net>

* SortKey.cs: Marked Serializable to match MS.NET.

2005-08-08 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
Makefile : changed resources output directory.

2005-08-04 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-tests.cs,
StringNormalizationTestSource.cs : new files for Unicode
Normalization test generator.
* Makefile : added support for above.

2005-08-03 Atsushi Enomoto <atsushi@ximian.com>

* NormalizationTableUtil.cs : oops, it does not compile.
* managed-collation.patch : I guess having managed resource would be
better for collation. At least current code has such #define so
Makefile should be in sync with it.

2005-08-03 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs : Fixed CharMapComparer which
incorrectly returned 0 when the second arg is shorter. Reduced
extraneous helperIndex map. Other minor fixes and code removal.
* Normalization.cs : several fixes to support blocked combine handling.
* NormalizationTableUtil.cs : tiny member renaming.

2005-08-03 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs,
NormalizationTableUtil.cs,
Normalization.cs : several bugfixes on index miscomputation.
Renamed using aliases (csc will bork). Primary combine safety is now
computed during UnicodeData.txt parse.
Maximum NFKD length was 18, not 4 (U+FDFA).

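The NFKD length mentioned for U+FDFA can be checked with a short
Python snippet based on the standard unicodedata module. It is
only a cross-check of the figure quoted in the entry, not part of
the table generator; the variable names are arbitrary.

```python
import unicodedata

# U+FDFA is a single Arabic ligature whose compatibility decomposition
# expands into several words separated by spaces.
expansion = unicodedata.normalize("NFKD", "\uFDFA")
length = len(expansion)
code_points = " ".join(f"{ord(c):04X}" for c in expansion)
```

A buffer sized for four characters per input character is thus too
small for this ligature.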
2005-08-02 Atsushi Enomoto <atsushi@ximian.com>

* managed-collation.patch : added Normalization support.
* managed-collation-icall.patch : added, including normalization stuff.

BTW when will collation code be checked in?

2005-08-02 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs : Unified three normalization source
generators, to compute IsUnsafe flag. Fixed helperIndex array type
in C header output.
* create-char-mapping-source.cs,
create-combining-class-source.cs : thus removed.
* Makefile : thus modified for the above integration.
* NormalizationTableUtil.cs : Extended to contain IsUnsafe flag.
* Normalization.cs : Several fixes to make Normalize() actually work.

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs,
Normalization.cs,
create-char-mapping-source.cs,
create-combining-class-source.cs,
Makefile : converted managed array to pointers (like collation stuff).

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* NormalizationTableUtil.cs : further table range optimization.
* create-normalization-source.cs,
create-char-mapping-source.cs,
create-combining-class-source.cs : added C header output support.

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs, Normalization.cs :
Now property size is < 256, so directly embed value in "props" array.
Add QuickCheck(c,checkType) and remove IsNFD/C/KD/KC and delegates.

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* create-combining-class-source.cs,
create-char-mapping-source.cs,
create-normalization-source.cs,
NormalizationTableUtil.cs,
Normalization.cs : String.Normalize() does not handle surrogate
characters. Mapping information in DerivedNormalizationProps.txt
is not used in the code (that from UnicodeData.txt is used).
Hangul syllables are computed instead of embedded in the tables.
* managed-collation.patch : removed IntPtrStream and Makefile patches.

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : IsSortable() was broken.

2005-07-29 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : added helper for CompareInfo.IsSortable().

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* create-tailoring.cfg : added for convenience of contraction check.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* create-normalization-source.cs,
SimpleCollator.cs,
SortKeyBuffer.cs,
create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
SortKey.cs,
create-collation-element-table.cs,
MSCompatUnicodeTable.cs,
CodePointIndexer.cs,
create-combining-class-source.cs : added copyright lines.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : removed extraneous definition.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
MSCompatUnicodeTable.cs : full C header support, finally.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* Normalization.cs,
NormalizationTableUtil.cs,
create-char-mapping-source.cs : more aggressive data compression.
It now ignores characters that are >= U+10000.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* Makefile,
Normalization.template,
Normalization.cs : renamed existing file.

2005-07-28 Atsushi Enomoto <atsushi@ximian.com>

* NormalizationTableUtil.cs,
Normalization.template,
create-combining-class-source.cs : GetCombiningClass is now
implemented as indexer based array.
* Makefile : renamed output filename.
* create-mscompat-collation-table.cs : removed comments that do not
make sense now.
* create-tailoring.cs : use utf-8 output (and fixed filename).

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : hacked safer IPA extensions.
* Collation-notes.txt : status of sortkey table.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : some Greek mapping fix.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : diacritical weight is not
treated correctly when they are picked from letter names, as flags.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : fixed culture-dependent
nonspacing mark weight.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : some Hebrew case letter fixes.
Some diacritical fixes on symbols.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed level 3 weight of
Arabic presentation forms.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed some diacritical weight
of Arabic presentation forms.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : more status updates. It's almost complete,
except for sortkey values.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : similar optimization also for LastIndexOf().

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : the previous patch was missing IgnoreNonSpace
case.

2005-07-27 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : reduced extra sortkey value computation in
MatchesForward(). It makes IndexOf() roughly 30% faster.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* SortKey.cs : GetHashCode() returns a value based on its byte data.
Removed unused code.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : consider extractions in invariant culture.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : (unsafeFlags) be compact ;-)

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : When the tail of the target does not match more
than 3 times, then IsSuffix() will never be true (3 is the max
length of an expansion; \uFB03 -> ffi). It brings significant
performance boost when "source" string is very long.
* MSCompatUnicodeTable.cs : added MaxExpansionLength constant.
Reordered code lines.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : updated implementation status.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : Implemented quick codepoint comparison in
Compare(). Comparison became 125x faster.
* mono-tailoring-source.txt : added tiny comment.

2005-07-26 Atsushi Enomoto <atsushi@ximian.com>

* mono-tailoring-source.txt : Added all single sortkey remapping to
all cultures (still need to fill contractions and annotate possible
buggy mapping referencing to CLDR).
* SimpleCollator.cs : removed unused code.
* MSCompatUnicodeTable.cs : tiny cast removal.

2005-07-25 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs,
create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : Now CJK mapping data is stored as byte
arrays. Thus SimpleCollator does not need to use bitwise and shift
operations to get sortkey value and they could be managed resources.

2005-07-25 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
MSCompatUnicodeTable.cs,
MSCompatUnicodeTableUtil.cs : From the result of sortkey comparison
between None and IgnoreWidth, width compat table could be computed
in somewhat simple way. So removed that table and all related code.
Increased the collation resource version.

2005-07-25 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Added C header output support.

2005-07-25 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : FillLetterNFKD() could also be
applied to Cyrillic letters. Saved some of them.

2005-07-24 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : oh, ok, so we already have
GetManifestResourceInternal() ;-)
* managed-collation.patch : in Assembly.cs made that method internal.

2005-07-24 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : the pointer based icall code could be
also applicable for USE_MANAGED_RESOURCE mode.

2005-07-23 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : added icall support code (not enabled
unless the first line is commented out).

2005-07-22 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : Added resource version output (and ignore
in case of version mismatch). Removed obsolete, commented out code.

2005-07-22 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs,
MSCompatUnicodeTable.cs,
create-mscompat-collation-table.cs : Now they use unmanaged pointers
instead of managed arrays.
* managed-collation.patch : Now it contains patch for IntPtrStream.cs
and Assembly.cs as well.

2005-07-22 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs,
SimpleCollator.cs : Moved tailoring support classes to
MSCompatUnicodeTable.cs and drawn out from SimpleCollator.
Now that cjk and tailoring support are filled inside
MSCompatUnicodeTable, no managed array is exposed.

2005-07-22 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
SimpleCollator.cs,
MSCompatUnicodeTable.cs : Now it's not exposing collation table
internals as managed arrays (to switch to unmanaged pointers).

2005-07-22 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : tiny nonspacing mark fix.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed most of Greek mappings.
* MSCompatUnicodeTable.cs : don't lock string.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : More Latin diacritical fixes.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : There were still missing
math symbol mappings. Added several hacky diacritical weight for
Latin characters.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : fixed a few diacritical weight
on Cyrillic characters. Fixed ParseTailoringSource() to handle
non-heading escape sequence (\uXXXX) as expected.

2005-07-21 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : added more aggressive index limits for
table optimization at data size, in cost of speed.

2005-07-20 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : fixed Arabic tertiary weight.

2005-07-20 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Mapping for hyphens and
punctuation are kinda finished. Rewrote batch mapping method to
collect all NFKD. Required modification on mapping is done.

2005-07-20 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : minor mapping fixes on accent
marks and punctuations.

2005-07-20 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
and Box drawing mapping.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed almost all numbers.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Symbol mappings are almost done.
Removed hack that gave dummy mappings to blank symbols.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : more fix on arrows. Fix on box
drawings. Some code refactoring to eliminate hack.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed some secondary weight
in Devanagari and arrows.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : a set of tiny mapping fixes.

2005-07-19 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : some diacritical fixes for
Latin. Added batch mapping method that considers computed
diacritical weight (for numbers).

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* managed-collation.patch : forgot to add System.String patch.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : added resource existence check (required
for mscorlib transient time from the one without resources to the
one with resources).

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : fixed punctuations and hyphen
(shift) primary weight.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : more nonspacing mark fixes.
Some non-basic Cyrillic diacritical weight fixes.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
and level 3. Tiny Hangul weight fixes.
* MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : some normal characters that have
"narrow" NFKD mapping are regarded as "wide" and thus level 3 weight
values were different. Handle U+30FB as category A.
* MSCompatUnicodeTable.cs : U+30FB does not have special weight.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : more diacritical weight fixes.
Removed some unused code.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed some Thai and Arabic
level 2 weight.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Fixed nonspacing marks in
Malayalam, Thai and Lao. Removed extraneous hack.

2005-07-15 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
Some refactoring on IndexOf() code. Removed unused Matches().
* Collation-notes.txt : some methods needed to be reimplemented, so
rewrote the description.

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
Thus supported extenders in IsSuffix().

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : more IsSuffix() simplification, but it will be
stopped here since it cannot handle extenders (implementing new
approach one).

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : simplified IsSuffix() code.

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
entire replacement string if char target was an expansion.
IsSuffix() was using a method for IsPrefix() which was incorrect.
Removed old IsPrefix() code.

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : IndexOf() was incorrectly sharing the same
byte[] field in different areas of code. Now extenders in both
source and target really work in IndexOf().

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
* SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.

2005-07-14 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : Now FilterExtender() handles all extender
support. IndexOf() and LastIndexOf() now support extenders.
IndexOf() and LastIndexOf() did not proceed contraction source
length as expected. Tiny refactoring on private IsPrefix() to take
stringSort argument.

2005-07-13 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : when restoring from expansion, go back to the
top of the loop (to avoid index out of range).
Now IsPrefix() is implemented to reuse Compare() and thus it now
supports extender as well.
* Collation-notes.txt : status update. Deleted optimization part in
status section (it is duplicate).

2005-07-13 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : some code reordering.
* create-mscompat-collation-table.cs : it was still missing U+3094.

2005-07-13 Atsushi Enomoto <atsushi@ximian.com>

* SimpleCollator.cs : Compare() now supports extender (e.g. U+30FC).

2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : In GetSortKey(), don't update previousChar when
|
|
it is not primary (e.g. don't "extend" diacritical mark).
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* managed-collation.patch : CompareInfo.Compare() should consider
|
|
the possibilities that non-empty string might be actually empty
|
|
in culture-sensitive context.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
|
|
target is "empty" (in culture-sensitive context).
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
|
|
characters in target string.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : When IgnoreWidth is specified, all Kana
|
|
characters are regarded as half-width.
|
|
Even though IgnoreWidth is specified, it should not ignore case.
|
|
For special weight comparison, the default values (E4) are bigger
|
|
than non-default values.
|
|
* SortKeyBuffer.cs : It should save LCID and original string.
|
|
* create-mscompat-collation-table.cs : For Japanese half-width kana,
|
|
it should not be counted in widthCompat map since IgnoreWidth does
|
|
not really ignore those differences.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed missing Japanese bits.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs :
|
|
tiny diacritical weight fix for U+20D0-U+20E1.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : ja CJK ideograph got completed.
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed CJK custom Japanese
|
|
mapping. It (maybe as well as other CJK tables) mixes NFKD. For
|
|
Japanese, modified NFKD table (because of Windows lame design).
|
|
|
|
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
|
|
* MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
|
|
invoked at any time it is required.
|
|
* SimpleCollator.cs : call FillCJK() above in .ctor().
|
|
* MSCompatUnicodeTableUtil.cs : CJK range was wider.
|
|
* create-mscompat-collation-table.cs : CJK binary was missing the
|
|
length. CJK remapping is being moved to ModifyUnidata().
|
|
For cjk-ja mapping, we have to consider compat characters to be
|
|
added to the map, besides the raw UCA table.
|
|
|
|
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
|
|
|
|
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
|
|
contraction as expected. Fixed Compare() to save s2's contraction
|
|
as expected.
|
|
* TestDriver.cs : added LastIndexOf() tester w/ indexes.
|
|
|
|
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
|
|
incorrectly use Compare().
|
|
* TestDriver.cs : more moved to nunit tests.
|
|
|
|
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : several fixes on Compare().
|
|
- Ignorable characters are skipped at the top of the loop.
|
|
- IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
|
|
- In such case that s1 index is increased while s2 contraction is
|
|
replaced, s1 was advanced inconsistently (bug).
|
|
- IsIgnorable() now also checks IgnoreNonSpace.
|
|
- Fixed FilterOptions() that does not work for IgnoreWidth at all.
|
|
* TestDriver.cs : now some are moved to nunit tests.
|
|
* Collation-notes.txt : minor todo update.
|
|
|
|
2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : Compare() was ignoring such case that both
|
|
entire strings have '-' to be compared.
|
|
* Collation-notes.txt : more status updates.
|
|
* TestDriver.cs : added '-' use cases.
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : to be same as other buggy part, it now handles
|
|
U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
|
|
previous character.
|
|
Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
|
|
* create-mscompat-collation-table.cs : dummy values for extenders.
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
|
|
should be computed from ExtenderType, and voice mark weight should
|
|
be considered.
|
|
* MSCompatUnicodeTable.cs : added tiny comment.
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
|
|
* SimpleCollator.cs : support for extender (U+309D etc.).
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : some punct/symbols fix.
|
|
* managed-collation.patch : new (and temporary) file to support
|
|
managed collation in mscorlib.
|
|
* README : described how to use managed collation.
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
|
|
U+482-4C8 (though needs diacritical fixes).
|
|
* MSCompatUnicodeTable.cs : tiny comment for alternative impl.
|
|
|
|
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
|
|
computation code, since it looks like the same way as Latin letters
|
|
have. Thus removed all other approaches (UCA, by letter name).
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : diacritical fix for "double-
|
|
struck". Syriac nonspacing fixes.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : more math symbol weight fixes.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
|
|
implemented (no stub). Some other fixes on category 8-A.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : some minor fixes on Arabic,
|
|
Korean and Japanese sortkey weights.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : More diacritical fixes.
|
|
Georgian characters do not have level 2 weights but level 3.
|
|
|
|
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Roman numeral characters
|
|
have diacritical weight. quick hack for control signs (U+2400..)
|
|
and box drawings.
|
|
|
|
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : improving Latin mappings.
|
|
Setting non-ASCII Latin characters' primary weight between those
|
|
ASCII characters, and setting diacritical weight (hacky).
|
|
* MSCompatUnicodeTable.cs :
|
|
Kanatype check: fixed (voice marks) and improved (comparison order).
|
|
|
|
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : more diacritical fixes.
|
|
primary weight fixes on punctuations in category 07.
|
|
|
|
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : several diacritical fixes.
|
|
* TestDriver.cs : sortkey dumper should use StringSort.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : fixed incorrect indexer setup. Optimized
|
|
GetContraction() call a bit.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : fixed incorrect level 2
|
|
output type.
|
|
* MSCompatUnicodeTable.cs : remove debug line.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTableUtil.cs,
|
|
MSCompatUnicodeTable.cs,
|
|
CodePointIndexer.cs,
|
|
create-mscompat-collation-table.cs : made some members internal and
|
|
accessible from other classes. Many indexes could be 0 by default.
|
|
* SimpleCollator.cs : optimizations. avoid method call.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : more updates.
|
|
* SimpleCollator.cs : Added quick check for Ordinal comparison.
|
|
Fixed special weight comparison. It cannot be customizable in the
|
|
implementation (and it won't be harmful).
|
|
* mono-tailoring-source.txt : thus updated comment.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : Compare() was missing French sort support.
|
|
* TestDriver.cs : added example case.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : updated status. Eliminated descriptions on
|
|
"iterator" (I avoided it for performance concern). Fixed misc.
|
|
incorrect descriptions.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collator.cs : Now that SimpleCollator became feature complete, it is
|
|
not useful anymore.
|
|
|
|
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : implemented decent Compare() that immediately
|
|
stops at first primary difference.
|
|
|
|
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : indexers might return -1.
|
|
|
|
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
|
|
buggy (length check for source was missing).
|
|
|
|
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed tailoring table output
|
|
to be in correct and countable order. Now if tailoring alias was not
|
|
found, just stop the build.
|
|
* MSCompatUnicodeTable.cs : several build fixes. Now it works to read
|
|
assembly resources.
|
|
* mono-tailoring-source.txt : commented out CJK aliases that miss
|
|
target.
|
|
* Makefile : needed further filename fixes.
|
|
|
|
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
|
|
(now it is working as a standalone file).
|
|
* Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
|
|
(the generator now creates both binary resources and C# source).
|
|
|
|
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Now it generates binary
|
|
resources (to parent directory).
|
|
* MSCompatUnicodeTable.template : added conditional code that fills
|
|
collation tables from manifest resources.
|
|
* Makefile : remove collation table binaries as well on "make clean".
|
|
Removed extraneous dependency.
|
|
|
|
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template,
|
|
SimpleCollator.cs : removed extraneous GetExpansion().
|
|
|
|
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : IsSuffix() also supports contractions.
|
|
* TestDriver.cs : IsSuffix() example contraction cases.
|
|
|
|
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
|
|
what current IsPrefix() does). For expansion of target, IsPrefix()
|
|
should check the no-match case that expansion is longer than input.
|
|
Some refactoring of IsPrefix().
|
|
Added GetContractionTal() for IsSuffix() (not used yet).
|
|
|
|
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* TestDriver.cs : added IsPrefix() expansion cases.
|
|
* SimpleCollator.cs : IsPrefix() now supports contractions (with much
|
|
complexity), and it now returns bool again.
|
|
IndexOf() for replacement should make use of IndexOfPrimitiveChar()
|
|
since expansions won't be expanded recursively.
|
|
|
|
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : commonized character comparison in IsPrefix()
|
|
and IsSuffix(). csc compile fix.
|
|
* CompareInfoImpl.cs : deleted.
|
|
|
|
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* TestDriver.cs : added SimpleCollator.ctor() sanity check.
|
|
Added replacement contraction example.
|
|
* SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
|
|
contraction in source string. Extracted matching code to Matches().
|
|
Replacement contraction was including extraneous '\x0'.
|
|
|
|
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : updated status.
|
|
* CollationDataStructures.txt : tiny fixes.
|
|
* SimpleCollator.cs :
|
|
Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
|
|
namespace Util and csc borked).
|
|
GetContraction was incorrectly returning first item.
|
|
Private IsPrefix() now returns int (but it might not be in real use).
|
|
Extracted simple char comparison to CompareCharSimple().
|
|
IndexOf() and LastIndexOf() now fully handle contractions (both
|
|
binary key and string replacement) in "target" (for "s" not yet).
|
|
* TestDriver.cs : be more verbose.
|
|
* mono-tailoring-source.txt : added comment.
|
|
* MSCompatUnicodeTable.template :
|
|
Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
|
|
namespace Util and csc borked).
|
|
|
|
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : compute COMBINING blah marks as
|
|
well as those characters WITH blah.
|
|
* TestDriver.cs : added combining sortkey cases.
|
|
|
|
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* mono-tailoring-source.txt : fixed description on '*' in sortkeys.
|
|
* SimpleCollator.cs : Now it fully uses tailoring info. Fixed
|
|
contraction search that worked only when string is contraction.
|
|
Removed commented code. Minor refactoring.
|
|
* TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
|
|
|
|
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs,
|
|
* mono-tailoring-source.txt : removed extraneous level 4 sortkey
|
|
which cannot be supported.
|
|
* SimpleCollator.cs : added GetContraction() and used in some places.
|
|
Now CompareOptions is set only once. Reordered some code (e.g.
|
|
ignorable check -> get compat char -> compare).
|
|
|
|
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : sort tailoring tables before actual usage.
|
|
Support diacritical remappings (it is customized collation rule
|
|
which does not exist in UCA).
|
|
|
|
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : build culture specific tailoring table from
|
|
TailoringInfo and unified data array.
|
|
* create-mscompat-collation-table.cs : Added null termination to
|
|
sortkey map tailorings (mostly to save my eyes).
|
|
* MSCompatUnicodeTable.template : added public TailoringValues.
|
|
|
|
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKeyBuffer.cs : handle special weight (category 06) characters.
|
|
* Collation-notes.txt : Updated description on special weight (it was
|
|
incorrect).
|
|
* TestDriver.cs : added special weight cases.
|
|
|
|
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template : added GetTailoringInfo().
|
|
* SimpleCollator.cs : Now tailoring information is acquired and used.
|
|
(FrenchSort is supported but Compare() won't work as expected since
|
|
the table is still incomplete for those diacritical marks).
|
|
* SortKeyBuffer.cs : On reversing diacritical weights, it should
|
|
ignore zeros. Reset() should reset frenchSorted flag.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Further fixes on Jamo,
|
|
diacritical weights by character name, and *Numbers primary weights.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : More fixes on Devanagari,
|
|
Gujarati, Oriya, Tamil and Lao sortkeys.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
|
|
sortkey values.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed Thai character primary
|
|
and secondary values. Fixed Thaana letters. Added more LAMESPEC
|
|
CJK compat. Fixed some circled CJK secondary weight.
|
|
Hacked some nonspacing mark sortkey value adjustment.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : CP932.TXT was not parsed as
|
|
expected. JIS ordering was incorrect. OtherNumbers that represent
|
|
10 or more values had their offset computed incorrectly. Some Hangul
|
|
compat characters have a different offset.
|
|
|
|
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed 0x8 category characters.
|
|
Added hack for need-to-be-fixed characters to fall into 0xA category.
|
|
* create-collation-element-table.cs : previous checkin seems to have failed :(
|
|
* README: updated a bit.
|
|
|
|
2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* CodePointIndexer.cs :
|
|
removed extraneous switch (I could use empty array for that need).
|
|
* CollationElementTableUtil.cs : primary weight type became ushort.
|
|
* create-collation-element-table.cs : several bugfixes.
|
|
collElem should be int. It was skipping most of entries because of
|
|
incorrect string tokenization.
|
|
|
|
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : handle some Jamo NFKD.
|
|
|
|
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : forgot to commit in the last checkin.
|
|
* create-mscompat-collation-table.cs : fixed Arabic shift weight chars.
|
|
* TestDriver.cs : switch table dumper and collator testing.
|
|
* SortKey.cs : for now comment out internal indexes (not in use).
|
|
|
|
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template,
|
|
SimpleCollator.cs : support for culture dependent CJK table.
|
|
|
|
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs,
|
|
MSCompatUnicodeTableUtil.cs : make CJK table more compact.
|
|
|
|
2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : Fixed stupid index search when start != 0.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
|
|
now starts from "start" and proceeds backward by "length".
|
|
* TestDriver.cs : fix warning.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* TestDriver.cs : more tests.
|
|
* SimpleCollator.cs : LastIndexOf() is not setting search length
|
|
on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-normalization-source.cs : output propValue as uint.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKey.cs : Now it is System.Globalization.SortKey.
|
|
To replace existing implementation, it now requires lcid and
|
|
CompareOptions. Added required members.
|
|
* SortKeyBuffer.cs : thus .ctor() requires LCID.
|
|
* SimpleCollator.cs : made required changes above.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* CodePointIndexer.cs : added CompressArray(). Now it requires two more
|
|
parameters for default index and codepoint.
|
|
* CollationElementTableUtil.cs,
|
|
NormalizationTableUtil.cs : required changes wrt above change.
|
|
* MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
|
|
* MSCompatUnicodeTable.template : Now it uses codepoint indexer.
|
|
* create-mscompat-collation-table.cs : Now it outputs compressed array.
|
|
* Makefile : now collation requires MSCompatUnicodeTableUtil.cs
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs :
|
|
Implemented IsSuffix() and LastIndexOf().
|
|
Several fixes on index > 0 cases.
|
|
* TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : updated (status, impl. classes).
|
|
* MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
|
|
|
|
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
|
|
and IsPrefix(). Tiny code refactoring.
|
|
* TestDriver.cs : sample IsPrefix() and IndexOf() usage.
|
|
* MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
|
|
|
|
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs :
|
|
IndexOf(string, char, CompareOptions) implementation.
|
|
* TestDriver.cs : sample IndexOf() usage.
|
|
|
|
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : was missing most important
|
|
kind of blocks - equivalent expansions (e.g. invariant mappings).
|
|
More readable mappings.
|
|
|
|
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* mono-tailoring-source.txt : new file. It describes tailoring
|
|
information. Basically examined under .NET 1.x.
|
|
* create-mscompat-collation-table.cs : consume the file above.
|
|
* MSCompatUnicodeTable.template : now tailorings is not a stub.
|
|
* CollationDataStructures.txt : minor fixes.
|
|
* SortKeyBuffer.cs,
|
|
SimpleCollator.cs : added FrenchSort support.
|
|
* Collation-notes.txt : added description on Latin primary weights.
|
|
* ldml-limited.rng : added note.
|
|
* create-tailorings.cs : added note. more serialization (but won't be
|
|
used anyways).
|
|
|
|
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKeyBuffer.cs : non-primary character is added to previous
|
|
diacritical weight.
|
|
* TestDriver.cs : added example case of above.
|
|
|
|
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : IgnoreSymbols support.
|
|
* TestDriver.cs : compilation fix. IgnoreSymbols example.
|
|
* create-mscompat-collation-table.cs : more Hangul fixes.
|
|
|
|
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : more Hangul fixes.
|
|
* SortKey.cs : it will replace sys.globalization.SortKey. It has
|
|
some internal members.
|
|
* SortKeyBuffer.cs : now it uses SortKey instead of byte[].
|
|
* SimpleCollator.cs : CompareOptions support. However I don't think
|
|
it will be developed anymore since SortKey never enables IndexOf().
|
|
* TestDriver.cs : a few CompareOptions cases.
|
|
|
|
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SimpleCollator.cs : simple collator implementation that just will
|
|
use GetSortKey() for all its basis.
|
|
* TestDriver.cs : sample code that uses this collator set.
|
|
* MSCompatUnicodeTable.template : removed test driver from here.
|
|
|
|
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Hangul fixes.
|
|
Now fewer than 300 characters lack sortkey weights.
|
|
* MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
|
|
|
|
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Added control picture mappings.
|
|
Minor primary weight fixes.
|
|
|
|
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Added mappings for box
|
|
drawings and blocks.
|
|
|
|
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Added mappings for arrows.
|
|
|
|
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : added support for letterlike
|
|
characters and squared CJK compatibility characters, ordered by
|
|
character names (0x0E category).
|
|
* Collation-notes.txt : added description on that.
|
|
|
|
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template : Now expansions are simulated.
|
|
* create-mscompat-collation-table.cs : filled Korean number level2.
|
|
Reordered some code blocks to fill correct diacritical differences.
|
|
* Collation-notes.txt : some corrections and minor additions.
|
|
|
|
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template :
|
|
Now dumper test driver uses SortKeyBuffer for dogfooding.
|
|
* create-mscompat-collation-table.cs : some diacritical level fixes
|
|
(with non-working extra latin check).
|
|
* SortKeyBuffer.cs : several fixes to get working as a practical code.
|
|
* Collator.cs : make it compilable, leaving things as NotImplemented.
|
|
|
|
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : some fixes on primary category
|
|
07 (miscellaneous symbols and punctuations).
|
|
|
|
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : more mapping fixes on numbers,
|
|
letters, variable weight characters, circled Japanese and CJK.
|
|
* MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
|
|
inclusive. Simplified dumper code.
|
|
|
|
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : finished Hangul (both Jamo
|
|
and Syllables). sortkey dumper diff lines dropped from 30000 to 8000.
|
|
|
|
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : added some nonspacing marks in
|
|
either correct or hacky way.
|
|
|
|
2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : several improvements. Japanese
|
|
Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
|
|
numeric characters, diacritically decorated Latin alphabets. Fixed
|
|
some diacritical weights detection.
|
|
* MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
|
|
marks' primary weight as empty.
|
|
* Collation-notes.txt : some updates.
|
|
|
|
2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : don't process nonexact NFKD
|
|
mapping as equivalent, however store CJK extensions into NFKD map
|
|
even if one does not strictly match.
|
|
Now am going to fill Hangul into tables (unlike UCA it does not look
|
|
possible to calculate sortkey value).
|
|
Fixed Cyrillic and Georgian UCA based orderings.
|
|
* MSCompatUnicodeTable.template : added CJK extension sortkey
|
|
calculation.
|
|
|
|
2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : Fixed Latin alphabet support.
|
|
Added Latin with diacritical and CJK extension.
|
|
* MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
|
|
|
|
2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
|
|
now not used though). Filled CJK ideograph, still not perfect.
|
|
Fixed number primary keys. NFKD numbers and CJK ideographs are now
|
|
considered, including brackets elimination.
|
|
* Makefile : now it downloads DerivedAge.txt.
|
|
* MSCompatUnicodeTable.template : added dummy code dumper. It computes
|
|
PrivateUse, Surrogate and Hangul Syllables.
|
|
* Collation-notes.txt : Noted that Hangul Syllables need more love.
|
|
|
|
2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-tailorings.cs : added configuration support. sort them.
|
|
I wonder if it is really usable. Having own format might be better.
|
|
* create-mscompat-collation-table.cs : fixing some sortkey numbers,
|
|
making them closer to Windows. Now it handles NFKD in some places.
|
|
* MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
|
|
* CollationDataStructures.txt : added description on tailoring
|
|
fields, though they are subject to change.
|
|
|
|
2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-tailorings.cs, ldml-limited.rng : new files.
|
|
* LdmlReader.cs : removed old file.
|
|
|
|
2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* SortKeyBuffer.cs : split from Collator.cs. Now it considers
|
|
practical use, reflecting updated sortkey constant design.
|
|
Especially level 4 weight is split to 4 arrays that are merged in
|
|
the last stage of GetSortKey().
|
|
* Collator.cs : thus SortKeyBuffer is removed from here.
|
|
Additionally, removed some extraneous bits in other classes.
|
|
* Collation-notes.txt : Some editorial fixes. Added information on
|
|
Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
|
|
be stored in simple byte arrays).
|
|
* CodePointIndexer.cs,
|
|
create-collation-element-table.cs,
|
|
CollationElementTable.template,
|
|
NormalizationTableUtil.cs : short CodePointIndexer method names.
|
|
* create-mscompat-collation-table.cs : Additional info on why some
|
|
meaningful characters are ignored in Windows (Unicode version
|
|
difference). Removed U+070F from special check (was extraneous).
|
|
|
|
2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* MSCompatUnicodeTable.template:
|
|
Moved body implementation to table creator and put those bool
|
|
results into an array.
|
|
* create-mscompat-collation-table.cs :
|
|
So imported those methods. Modified array output to emit "0x"
|
|
only for more than 9.
|
|
* create-normalization-source.cs : ditto on "0x" output matter.
|
|
* CollationDataStructures.txt : so now it holds ignorableFlags.
|
|
|
|
2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt, CollationDataStructures.txt :
|
|
separate document for data structure design.
|
|
|
|
2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-mscompat-collation-table.cs : added culture-dependent CJK
|
|
table creation. It uses CLDR as its basis. (Culture independent CJK
|
|
is not ready BTW).
|
|
* Makefile : added CLDR archive downloading support.
|
|
* MSCompatUnicodeTable.template : tiny renamings.
|
|
* Collation-notes.txt : additional CJK info.
|
|
|
|
2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt, create-mscompat-collation-table.cs :
|
|
added secondary weight support for BlahNumber characters.
|
|
|
|
2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* downloaded : added directory. All downloaded files are stored here.
|
|
* Makefile : use "downloaded" directory.
|
|
Added more auto-download stuff.
|
|
* create-mscompat-collation-table.cs :
|
|
Added Japanese square kana support.
|
|
|
|
2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
|
|
* create-mscompat-collation-table.cs : added support for Arabic abjad,
|
|
Estrangela and Thaana.
|
|
* MSCompatUnicodeTable.template : removed BOM.
|
|
|
|
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Collation-notes.txt : wrong comment cleanup and spelling fixes.
|
|
* create-mscompat-collation-table.cs : added diacritic support for
|
|
Latin letters (as long as covered in primary weight).
|
|
|
|
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* Makefile : minor fixes. Added warning lines to generated sources.
|
|
|
|
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
|
|
|
|
* create-char-mapping-source.cs :
|
|
Removed ToWidthInsensitive() generation.
|
|
|
|
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>

* create-mscompat-collation-table.cs : Now it dumps level 1 to 3 values.
ToWidthInsensitive() is implemented here, using an array (which is
to be optimized using CodePointIndexer).
* MSCompatUnicodeTable.cs : renamed to MSCompatUnicodeTable.template.
* MSCompatUnicodeTable.template : now it is used to generate
MSCompatUnicodeTable.cs, which is now ready to be used.
* Makefile : added MSCompatUnicodeTable.cs build support. Now it
supports "make normalization" and "make collation".

2005-05-30 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : The description of ICU was very incorrect. It
is now more rational and sane.
* create-mscompat-collation-table.cs : fixed some indexes.
* Makefile : added "mstablegen" target.
* MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.

2005-05-26 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more analysis on "letters".
* create-mscompat-collation-table.cs : more proof-of-concept code.

2005-05-25 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more info. Started letter sortkey analysis
(some of the other stuff is really not understandable right now).
* create-mscompat-collation-table.cs : table generator
proof-of-concept source (not compilable).
* MSCompatUnicodeTable.cs : moved some code to the new source.
Some more fixes.

2005-05-20 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : started level 2 weight analysis.

2005-05-19 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : Additional information on how to create
level 3 tables.
* MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().

2005-05-19 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : More case weight (level 3) analysis. I'm
likely to just write a table generator.

2005-05-18 Atsushi Enomoto <atsushi@ximian.com>

* MSCompatUnicodeTable.cs : part of level 4 weight implementation.

2005-05-18 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt :
Added task list.
Revised comparison methods; backward iteration is possible.
More on char-by-char comparison.
Level 4 comparison is actually a bit more complex.
Misc corrections.
* Collator.cs : some conceptual updates to match the above.

2005-05-17 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : Japanese voice mark is level 2, and Hangul
properties are level 3.

2005-05-17 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : Made it more readable. More analysis on
level 3 and 4 sortkey structures.
* Collator.cs : some compilation fixes (not compilable yet).

2005-05-16 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : Analysis of the variable-weighting (level 5)
sortkey format.
* Collator.cs : updated the corresponding level 5 part, and more.

2005-05-13 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more updates.
* Collator.cs : rewrote from scratch. Rough sketches of the sortkey
buffer, character iterator and collator methods. Not compiling.

2005-05-13 Atsushi Enomoto <atsushi@ximian.com>

* Collator.cs : I am going to replace it with a new one. No need for
a CompareOptions-dependent Comparer.

2005-05-13 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : There seems to be a bit more complexity.

2005-05-10 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more updates; getting close to writing the
sortkey generator code.

2005-05-09 Atsushi Enomoto <atsushi@ximian.com>

* CompareInfoImpl.cs, Collator.cs : conceptual update
* Collation-notes.txt : some corrections and additions.
* Makefile : added LDML input (but it won't be used at all).

2005-04-28 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more updates.

2005-04-26 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : more updates.

2005-04-26 Atsushi Enomoto <atsushi@ximian.com>

* Collation-notes.txt : some updates.
* create-char-mapping-source.cs : superscripts and subscripts are also
ignored in IgnoreWidth comparison.
* Makefile : tiny touch fix.

2005-04-25 Atsushi Enomoto <atsushi@ximian.com>

* CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).

2005-04-25 Atsushi Enomoto <atsushi@ximian.com>

* create-char-mapping-source.cs : Now it generates
ToWidthInsensitive() from the decomposition types <wide> and <narrow>.
* MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.

2005-04-25 Atsushi Enomoto <atsushi@ximian.com>

* README, LdmlReader.cs, DataStructures.txt : new files.

2005-04-25 Atsushi Enomoto <atsushi@ximian.com>

* CodePointIndexer.cs,
Collation-notes.txt,
CollationElementTable.template,
CollationElementTableUtil.cs,
create-char-mapping-source.cs,
create-collation-element-table.cs,
create-combining-class-source.cs,
create-normalization-source.cs,
Makefile,
MSCompatUnicodeTable.cs,
Normalization.template,
NormalizationTableUtil.cs : initial checkin (to private branch).