Jo Shields a575963da9 Imported Upstream version 3.6.0
Former-commit-id: da6be194a6b1221998fc28233f2503bd61dd9d14
2014-08-13 10:39:27 +01:00

2010-06-04 Damien Diederen <dd@crosstwine.com>
* create-category-table.cs: Utility to generate reasonably-packed
Unicode tables.
This program generates (partially) bi-level tables encoding the
contents of the Unicode character category database.
Mono embeds a linear table with category codes for the Unicode BMP
(first 65536 codepoints), and lacks information about characters
in the astral planes--leading to requests such as bug 480178.
Extending the linear table to cover the full codespace is not an
ideal solution, as that would expand the embedded "blob" by a
factor of 17.
The new tables generated by this program can be used to support
the full range of characters. An additional level of indirection
used for characters outside the U+0000..U+FFFF range enables
"page" sharing, so that the total amount of embedded data only
grows by 13.5kB.
Cf. in-file comments for usage instructions.
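
The page sharing described above can be sketched as follows. This is an illustration only, not the generator's actual layout: the 256-codepoint page size and the names build_tables() and lookup() are assumptions, and the sketch uses the indirection for the whole codespace, whereas the entry above describes it only for characters outside U+0000..U+FFFF.

```python
import unicodedata

PAGE_BITS = 8                   # assumed page size: 256 code points
PAGE_SIZE = 1 << PAGE_BITS
MAX_CP = 0x110000               # full Unicode codespace (17 planes)

def build_tables():
    """Return (index, pages); index maps a page number to a slot in pages."""
    pages = []                  # unique pages, each a tuple of category codes
    slot_of = {}                # page contents -> slot, so equal pages are shared
    index = []
    for start in range(0, MAX_CP, PAGE_SIZE):
        page = tuple(unicodedata.category(chr(cp))
                     for cp in range(start, start + PAGE_SIZE))
        slot = slot_of.get(page)
        if slot is None:
            slot = len(pages)
            slot_of[page] = slot
            pages.append(page)
        index.append(slot)
    return index, pages

def lookup(index, pages, cp):
    """Two-step lookup: page number first, then offset inside the page."""
    return pages[index[cp >> PAGE_BITS]][cp & (PAGE_SIZE - 1)]

INDEX, PAGES = build_tables()
```

Large unassigned or uniform ranges collapse into a single shared page, which is why the second level costs far less than a linear table over all 17 planes.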
2010-05-17 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : fix extender search index for LastIndexOf().
Fixed bug #605094.
2010-04-20 Damien Diederen <dd@crosstwine.com>
* Normalization.cs: Really apply canonical reordering "recursively."
Before this, a sequence of code points with the combining
classes (22, 33, 11) would be reordered to (22, 11, 33) instead of
the correct (11, 22, 33). This is because the 'i--' would be
directly cancelled by the 'i++' in the for loop.
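
A minimal sketch of the step-back logic, assuming the reordering is done by adjacent swaps. It works on bare combining classes rather than code points, and canonical_reorder() is a hypothetical name, not the method in Normalization.cs.

```python
def canonical_reorder(classes):
    """Return the combining classes in canonical order (adjacent swaps)."""
    out = list(classes)
    i = 1
    while i < len(out):
        prev, cur = out[i - 1], out[i]
        # Starters (class 0) never move; swap only when a non-starter
        # follows a mark with a strictly higher class.
        if cur != 0 and prev > cur:
            out[i - 1], out[i] = cur, prev
            # Re-check the moved item against its new predecessor. In a
            # for loop, 'i--' followed by the loop's own 'i++' leaves i
            # unchanged, so this re-check would never happen.
            if i > 1:
                i -= 1
            continue
        i += 1
    return out
```

With the input (22, 33, 11) the first swap gives (22, 11, 33); stepping back then lets 11 move past 22 as well.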
2010-04-20 Damien Diederen <dd@crosstwine.com>
* Normalization.cs: The correct "checkType" argument to
Decompose() is NFD or NFKD when normalizing to NFC resp. NFKC.
* StringTest.cs: More NFC test cases.
2010-04-20 Damien Diederen <dd@crosstwine.com>
* Normalization.cs: Implement algorithmic Hangul composition.
Calling Normalize(NormalizationForm.FormC) on Korean characters
now works properly (bnc#480152).
* StringTest.cs: Add test cases for Hangul composition.
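
For reference, the arithmetic composition defined by the Unicode Standard can be sketched as below. The constants are those of the standard; compose_hangul() is a hypothetical name, and the sketch handles only Hangul jamo, not the table-driven composition of other scripts.

```python
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT   # 11172 precomposed syllables

def compose_hangul(text):
    """Combine L+V and LV+T jamo sequences into precomposed syllables."""
    out = []
    for ch in text:
        cp = ord(ch)
        if out:
            last = ord(out[-1])
            # Leading consonant followed by a vowel -> LV syllable.
            l_index = last - L_BASE
            v_index = cp - V_BASE
            if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
                out[-1] = chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)
                continue
            # LV syllable followed by a trailing consonant -> LVT syllable.
            s_index = last - S_BASE
            t_index = cp - T_BASE
            if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                    and 0 < t_index < T_COUNT):
                out[-1] = chr(last + t_index)
                continue
        out.append(ch)
    return "".join(out)
```

Because the syllable is computed from the jamo indexes, no per-syllable table entries are needed.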
2010-04-20 Damien Diederen <dd@crosstwine.com>
* Normalization.cs: Follow the spec when checking composition pairs.
Figure 7 in section 1.3 of http://unicode.org/reports/tr15/ shows
how when doing composition, one has to examine the successive
(starter, candidate) pairs, and combine if a matching canonical
decomposition exists.
The original algorithm was, instead, iterating on canonical
decompositions, and, for each one, trying to match a sequence
of (starter, non-starter, ...). This, however, does not produce
the same results as it is violating some implicit ordering
constraints in the Unicode tables.
E.g., when composing the following sequence of codepoints, the
original algorithm was picking:
03B7 0313 0300 0345
^^^^ ^^^^
1F74 0313 0345
^^^^ ^^^^
1FC2 0313
and would stop at 1FC2 0313 as there is no decomposition matching
it. The new algorithm, which follows the guidance of the pretty
figure 7, ends up doing:
03B7 0313 0300 0345
^^^^ ^^^^
1F20 0300 0345
^^^^ ^^^^
1F22 0345
^^^^ ^^^^
1F92
resulting in the correct 1F92.
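
The pairwise walk can be sketched as below, following the sample logic of UAX #15. PAIRS holds only the five canonical pairs needed for this example; a real implementation derives the full table from UnicodeData.txt. compose() is a hypothetical name, not the method in Normalization.cs.

```python
import unicodedata

# (starter, candidate) -> primary composite; example subset only.
PAIRS = {
    ("\u03b7", "\u0313"): "\u1f20",
    ("\u03b7", "\u0300"): "\u1f74",
    ("\u1f20", "\u0300"): "\u1f22",
    ("\u1f74", "\u0345"): "\u1fc2",
    ("\u1f22", "\u0345"): "\u1f92",
}

def compose(text):
    """Compose an already decomposed and reordered string pair by pair."""
    if not text:
        return text
    out = [text[0]]
    starter_pos = 0
    last_class = unicodedata.combining(text[0])
    if last_class != 0:
        last_class = 256        # leading combining mark: nothing to attach to
    for ch in text[1:]:
        ch_class = unicodedata.combining(ch)
        composite = PAIRS.get((out[starter_pos], ch))
        # The candidate combines unless a mark of equal or higher class
        # sits between it and the starter (i.e. it is blocked).
        if composite is not None and (last_class < ch_class or last_class == 0):
            out[starter_pos] = composite
            continue
        if ch_class == 0:
            starter_pos = len(out)
        last_class = ch_class
        out.append(ch)
    return "".join(out)
```

Each successful combination replaces the starter, so the next candidate is matched against 1F20, then 1F22, which is the order shown in the second trace above.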
2010-04-19 Damien Diederen <dd@crosstwine.com>
* Normalization.cs: Recursively apply the Unicode decomposition mapping.
According to http://www.unicode.org/reports/tr15/tr15-31.html,
section 1.3:
"To transform a Unicode string into a given Unicode Normalization
Form, the first step is to fully decompose the string. [...] Full
decomposition involves recursive application of the
Decomposition_Mapping values, because in some cases a complex
composite character may have a Decomposition_Mapping into a
sequence of characters, one of which may also have its own
non-trivial Decomposition_Mapping value."
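
A minimal sketch of the recursion, using Python's unicodedata module as a stand-in for the generated tables. full_canonical_decompose() is a hypothetical name; the sketch covers canonical mappings only and omits canonical reordering and the algorithmic Hangul case.

```python
import unicodedata

def full_canonical_decompose(ch):
    """Apply the canonical Decomposition_Mapping until nothing changes."""
    mapping = unicodedata.decomposition(ch)
    # An empty mapping means no decomposition; a leading "<tag>" marks a
    # compatibility mapping, which canonical decomposition must skip.
    if not mapping or mapping.startswith("<"):
        return ch
    return "".join(full_canonical_decompose(chr(int(cp, 16)))
                   for cp in mapping.split())
```

For example, U+1F92 maps to 1F22 0345, 1F22 maps to 1F20 0300, and 1F20 maps to 03B7 0313, so a single non-recursive pass would stop one or two levels short.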
2010-02-18 Gabriel Burt <gabriel.burt@gmail.com>
* Normalization.cs: Implement algorithmic Hangul decomposition; Calling
string.Normalize on Korean characters now works properly (bnc#480152).
This reduces the number of errors in 'make test' from 27k to 4.8k.
* StringNormalizationTestSource.cs:
* Makefile: Use the local, working copy of Normalization etc., so as to make
modifying Normalization.cs and then testing your changes with 'make test'
possible. Also, fix building/running of tests, patch by Alexander
Kojevnikov.
2009-09-18 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs : Handle blocked characters which are not
immediately next to the primary composite character. This fixes
some Arabic string sequence normalization.
* Makefile : fix test build.
2009-09-17 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs : some renaming for disambiguation.
* NormalizationTableUtil.cs : fix some wrong ranges in
mapIdxToComposite. This fixes some Arabic normalization (and more).
* normalization-notes.txt : added some notes on the implementation.
2008-06-19 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs :
- reverted the previous index calculation change. It was correctly
implemented and I rather broke it.
- fix index calculation on combining.
- NFKD was incorrectly directed to combining path. It should not.
- Simplify quick check.
2008-06-15 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs : For NFC and NFKC, IsNormalized() was not working
enough to check composed characters. It's not possible without
the actual composition, so just call Normalize() and compare them.
In Normalize() mapping helper didn't pick correct map index since
the table for index stores index for "uncompressed" numbers.
* NormalizationTableUtil.cs : updated to the latest UCD.
* Makefile : to build test, source file must be downloaded too.
2008-11-05 Atsushi Enomoto <atsushi@ximian.com>
* ucd.cs : Write type for *_count. Add notice to not edit
unicode-data.h directly.
2008-11-04 Atsushi Enomoto <atsushi@ximian.com>
* ucd.cs : new code to generate unicode table for eglib.
2008-07-04 Andreas Nahr <ClassDevelopment@A-SoftTech.com>
* SortKey: Fix parameter names, add attribute, small formatting
2008-06-27 Rodrigo Kumpera <rkumpera@novell.com>
* CodePointIndexer.cs : Make TableRange a struct instead
of a class so we save 2 memory ops per ToIndex loop.
2008-04-02 Atsushi Enomoto <atsushi@ximian.com>
* SortKey.cs : check null arguments. Fixed bug #376171.
2007-07-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : I wonder how long its build
has been broken ...
2007-03-06 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : disable QuickCheckPossible(), which is
inaccurate and inefficient. Fixed bug #79714.
2007-02-15 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : character filtering is needed for
OrdinalIgnoreCase in 2.0 profile. Fixed bug #80865.
2007-01-25 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : GetTailContraction() was broken to pick correct
contraction/special sortkey out and thus LastIndexOf() failed when
it is involved. Fixed bug #80612.
2007-01-22 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : for non-StringSort comparison, level5 (- and ')
should be still skipped after initial level5 check is done (while
they were simply treated as a normal character). Fixed bug #78748.
* SortKeyBuffer.cs : Fixed NRE in french sort.
2006-12-25 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : added IndexOf() implementation for Ordinal
and OrdinalIgnoreCase, though Ordinal version is not used (since
it is slower than icall).
2006-05-30 Miguel de Icaza <miguel@novell.com>
* MSCompatUnicodeTable.cs: Remove the fixed loading and compute it
just when we actually consume it. This only fixes the
!USE_C_HEADER case.
2006-04-14 Atsushi Enomoto <atsushi@ximian.com>
* README: removed obsolete info.
* Normalization.cs : canonical reordering should participate in the
decomposition step. In reordering, string append was incomplete.
Combining class check is required in NFD check. Icall is written
using IntPtr now.
2005-12-07 Zoltan Varga <vargaz@gmail.com>
* SimpleCollator.cs: Fix a warning.
2005-11-30 Sebastien Pouliot <sebastien@ximian.com>
* SimpleCollator.cs: Fix CAS support. The static ctor/var try to get
the environment variable MUCH too soon (i.e. the security manager
needs the collator).
2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : direct fast-path optimization for IndexOf().
2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs :
- CompareQuick(): added immediateBreakup to avoid extraneous sortkey
computation.
- QuickCheckPossible(): index used for s1 was incorrect.
2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : added another quick check for CompareInternal()
that does almost ordinal comparison for quick-checkable strings.
(It affects Compare(), IndexOf(), IsSuffix() etc. as well.)
2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : (IsIgnorable) \0 is not ignorable.
Fixed bug 76702.
2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs :
Created another struct to reduce method arguments. Created another
set of flags that keeps "once-matched" state (counterpart of
checkedFlags, now neverMatchFlags).
2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs :
- Added CompareOrdinalIgnoreCase() for NET_2_0 RTM.
- Reduced extra parameter from LastIndexOfSortKey().
- LastIndexOf() should use GetTailContraction for the source string.
And then, target could match in the middle of the possible
"replacement contraction" of the source string, so use
LastIndexOfSortKey() to catch them.
- Fixed GetTailContraction() that caused index out of range.
2005-11-11 Atsushi Enomoto <atsushi@ximian.com>
* Makefile : Now use MONO_DISABLE_MANAGED_COLLATION.
* SortKey.cs : some members are virtual.
2005-10-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : modified to use stackalloc for byte array.
2005-09-27 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : in CompareInternal(), there was a possibility of
infinite loop. Fixed bug #76243.
2005-09-20 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : In IsPrefix/IsSuffix, if target is an empty string,
immediately return true.
2005-09-09 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IsSuffix() optimization logic was buggy, so just
use pretty simple way with LastIndexOf() (no significant perf.
problem).
2005-09-01 Atsushi Enomoto <atsushi@ximian.com>
* README, Collation-notes.txt, CollationDataStructures.txt :
removed obsolete info and added some notes.
2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs : remove warned code.
* managed-collation.patch : now it's not required anymore.
2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : added IsSortable(string).
2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Now all collator methods are thread safe.
All instance non-readonly fields turned into arguments of every
method that uses those fields.
(Sadly it is the end of no-memory-cost collator era. mcs bootstrap
now needs +100KB memory consumption.)
2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : made "checkedFlags" nullable and made it
an argument of every index method (to make it thread safe).
2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs,
MSCompatUnicodeTable.cs :
- Now IsIgnorable() is aggregated to be one invocation to check
completely ignorable, nonspacing and symbols.
- Introduced "already checked" flags for IndexOf() and LastIndexOf()
to skip sortkey binary check on the same characters. Significant
perf. improvement for such case as IndexOf("AABCBABC...Z",'Z').
2005-08-08 Gert Driesen <drieseng@users.sourceforge.net>
* SortKey.cs: Marked Serializable to match MS.NET.
2005-08-08 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
Makefile : changed resources output directory.
2005-08-04 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-tests.cs,
StringNormalizationTestSource.cs : new files for Unicode
Normalization test generator.
* Makefile : added support for above.
2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
* NormalizationTableUtil.cs : oops, it does not compile.
* managed-collation.patch : I guess having managed resource would be
better for collation. At least current code has such #define so
Makefile should be in sync with it.
2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs : Fixed CharMapComparer which
incorrectly returned 0 when the second arg is shorter. Reduced
extraneous helperIndex map. Other minor fixes and code removal.
* Normalization.cs : several fixes to support blocked combine handling.
* NormalizationTableUtil.cs : tiny member renaming.
2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs,
NormalizationTableUtil.cs,
Normalization.cs : several bugfixes on index miscomputation.
Renamed using aliases (csc will bork). Primary combine safety is now
computed during UnicodeData.txt parse.
Maximum NFKD length was 18, not 4 (U+FDFA).
2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
* managed-collation.patch : added Normalization support.
* managed-collation-icall.patch : added, including normalization stuff.
BTW when will collation code be checked in?
2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs : Unified three normalization source
generators, to compute IsUnsafe flag. Fixed helperIndex array type
in C header output.
* create-char-mapping-source.cs,
create-combining-class-source.cs : thus removed.
* Makefile : thus modified for the above integration.
* NormalizationTableUtil.cs : Extended to contain IsUnsafe flag.
* Normalization.cs : Several fixes to make Normalize() actually work.
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs,
Normalization.cs,
create-char-mapping-source.cs,
create-combining-class-source.cs,
Makefile : converted managed array to pointers (like collation stuff).
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* NormalizationTableUtil.cs : further table range optimization.
* create-normalization-source.cs,
create-char-mapping-source.cs,
create-combining-class-source.cs : added C header output support.
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs, Normalization.cs :
Now property size is < 256, so directly embed value in "props" array.
Add QuickCheck(c,checkType) and remove IsNFD/C/KD/KC and delegates.
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* create-combining-class-source.cs,
create-char-mapping-source.cs,
create-normalization-source.cs,
NormalizationTableUtil.cs,
Normalization.cs : String.Normalize() does not handle surrogate
characters. Mapping information in DerivedNormalizationProps.txt
is not used in the code (that from UnicodeData.txt is used).
Hangul syllables are computed instead of embedded in the tables.
* managed-collation.patch : removed IntPtrStream and Makefile patches.
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : IsSortable() was broken.
2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : added helper for CompareInfo.IsSortable().
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* create-tailoring.cfg : added for convenience of contraction check.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs,
SimpleCollator.cs,
SortKeyBuffer.cs,
create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
SortKey.cs,
create-collation-element-table.cs,
MSCompatUnicodeTable.cs,
CodePointIndexer.cs,
create-combining-class-source.cs : added copyright lines.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : removed extraneous definition.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
MSCompatUnicodeTable.cs : full C header support, finally.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* Normalization.cs,
NormalizationTableUtil.cs,
create-char-mapping-source.cs : more aggressive data compression.
It now ignores characters that are >= U+10000.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* Makefile,
Normalization.template,
Normalization.cs : renamed existing file.
2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
* NormalizationTableUtil.cs,
Normalization.template,
create-combining-class-source.cs : GetCombiningClass is now
implemented as indexer based array.
* Makefile : renamed output filename.
* create-mscompat-collation-table.cs : removed comments that do not
make sense now.
* create-tailoring.cs : use utf-8 output (and fixed filename).
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : hacked safer IPA extensions.
* Collation-notes.txt : status of sortkey table.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some Greek mapping fix.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : diacritical weight is not
treated correctly when they are picked from letter names, as flags.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed culture-dependent
nonspacing mark weight.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some Hebrew case letter fixes.
Some diacritical fixes on symbols.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed level 3 weight of
Arabic presentation forms.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed some diacritical weight
of Arabic presentation forms.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : more status updates. It's almost complete,
except for sortkey values.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : similar optimization also for LastIndexOf().
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : the previous patch was missing IgnoreNonSpace
case.
2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : reduced extra sortkey value computation in
MatchesForward(). It makes IndexOf() roughly 30% faster.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* SortKey.cs : GetHashCode() returns a value based on its byte data.
Removed unused code.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : consider extractions in invariant culture.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : (unsafeFlags) be compact ;-)
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : When the tail of the target does not match more
than 3 times, then IsSuffix() will never be true (3 is the max
length of an expansion; \uFB03 -> ffi). It brings significant
performance boost when "source" string is very long.
* MSCompatUnicodeTable.cs : added MaxExpansionLength constant.
Reordered code lines.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : updated implementation status.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Implemented quick codepoint comparison in
Compare(). Comparison became 125x faster.
* mono-tailoring-source.txt : added tiny comment.
2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
* mono-tailoring-source.txt : Added all single sortkey remapping to
all cultures (still need to fill contractions and annotate possible
buggy mapping referencing to CLDR).
* SimpleCollator.cs : removed unused code.
* MSCompatUnicodeTable.cs : tiny cast removal.
2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs,
create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : Now CJK mapping data is stored as byte
arrays. Thus SimpleCollator does not need to use bitwise and shift
operations to get sortkey value and they could be managed resources.
2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
MSCompatUnicodeTable.cs,
MSCompatUnicodeTableUtil.cs : From the result of sortkey comparison
between None and IgnoreWidth, width compat table could be computed
in somewhat simple way. So removed that table and all related code.
Increased the collation resource version.
2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Added C header output support.
2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : FillLetterNFKD() could also be
applied to Cyrillic letters. Saved some of them.
2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : oh, ok, so we already have
GetManifestResourceInternal() ;-)
* managed-collation.patch : in Assembly.cs made that method internal.
2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : the pointer based icall code could be
also applicable for USE_MANAGED_RESOURCE mode.
2005-07-23 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : added icall support code (not enabled
unless the first line is commented out).
2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : Added resource version output (and ignore
in case of version mismatch). Removed obsolete, commented out code.
2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs,
MSCompatUnicodeTable.cs,
create-mscompat-collation-table.cs : Now they use unmanaged pointers
instead of managed arrays.
* managed-collation.patch : Now it contains patch for IntPtrStream.cs
and Assembly.cs as well.
2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs,
SimpleCollator.cs : Moved tailoring support classes to
MSCompatUnicodeTable.cs and drawn out from SimpleCollator.
Now that cjk and tailoring support are filled inside
MSCompatUnicodeTable, no managed array is exposed.
2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
SimpleCollator.cs,
MSCompatUnicodeTable.cs : Now it's not exposing collation table
internals as managed arrays (to switch to unmanaged pointers).
2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : tiny nonspacing mark fix.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed most of Greek mappings.
* MSCompatUnicodeTable.cs : don't lock string.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : More Latin diacritical fixes.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : There were still missing
math symbol mappings. Added several hacky diacritical weight for
Latin characters.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed a few diacritical weight
on Cyrillic characters. Fixed ParseTailoringSource() to handle
non-heading escape sequence (\uXXXX) as expected.
2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs : added more aggressive index limits for
table optimization at data size, in cost of speed.
2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed Arabic tertiary weight.
2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Mapping for hyphens and
punctuation are kinda finished. Rewrote batch mapping method to
collect all NFKD. Required modification on mapping is done.
2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : minor mapping fixes on accent
marks and punctuations.
2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
and Box drawing mapping.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed almost all numbers.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Symbol mappings are almost done.
Removed hack that gave dummy mappings to blank symbols.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more fix on arrows. Fix on box
drawings. Some code refactoring to eliminate hack.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed some secondary weight
in Devanagari and arrows.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : a set of tiny mapping fixes.
2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some diacritical fixes for
Latin. Added batch mapping method that considers computed
diacritical weight (for numbers).
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* managed-collation.patch : forgot to add System.String patch.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : added resource existence check (required
for mscorlib transition time from the one without resources to the
one with resources).
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed punctuations and hyphen
(shift) primary weight.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more nonspacing mark fixes.
Some non-basic Cyrillic diacritical weight fixes.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
and level 3. Tiny Hangul weight fixes.
* MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some normal characters who have
"narrow" NFKD mapping are regarded as "wide" and thus level 3 weight
values were different. Handle U+30FB as category A.
* MSCompatUnicodeTable.cs : U+30FB does not have special weight.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more diacritical weight fixes.
Removed some unused code.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed some Thai and Arabic
level 2 weight.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed nonspacing marks in
Malayalam, Thai and Lao. Removed extraneous hack.
2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
Some refactoring on IndexOf() code. Removed unused Matches().
* Collation-notes.txt : some methods needed to be reimplemented, so
rewrote the description.
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
Thus supported extenders in IsSuffix().
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : more IsSuffix() simplification, but it will be
stopped here since it cannot handle extenders (implementing new
approach one).
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : simplified IsSuffix() code.
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
entire replacement string if char target was an expansion.
IsSuffix() was using a method for IsPrefix() which was incorrect.
Removed old IsPrefix() code.
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IndexOf() was incorrectly sharing the same
byte[] field in different areas of code. Now extenders in both
source and target really work in IndexOf().
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
* SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Now FilterExtender() handles all extender
support. IndexOf() and LastIndexOf() now supports extenders.
IndexOf() and LastIndexOf() did not proceed contraction source
length as expected. Tiny refactoring on private IsPrefix() to take
stringSort argument.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : when restoring from expansion, go back to the
top of the loop (to avoid index out of range).
Now IsPrefix() is implemented to reuse Compare() and thus it now
supports extender as well.
* Collation-notes.txt : status update. Deleted optimization part in
status section (it is duplicate).
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : some code reordering.
* create-mscompat-collation-table.cs : it was still missing U+3094.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Compare() now supports extender (e.g. U+39FC).
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : In GetSortKey(), don't update previousChar when
it is not primary (e.g. don't "extend" diacritical mark).
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* managed-collation.patch : CompareInfo.Compare() should consider
the possibilities that non-empty string might be actually empty
in culture-sensitive context.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IndexOf() and LastIndexOf() returns start when
target is "empty" (in culture-sensitive context).
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
characters in target string.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : When IgnoreWidth is specified, all Kana
characters are regarded as half-width.
Even though IgnoreWidth is specified, it should not ignore case.
For special weight comparison, the default values (E4) are bigger
than non-default values.
* SortKeyBuffer.cs : It should save LCID and original string.
* create-mscompat-collation-table.cs : For Japanese half-width kana,
it should not be counted in widthCompat map since IgnoreWidth does
not really ignore those differences.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed missing Japanese bits.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs :
tiny diacritical weight fix for U+20D0-U+20E1.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : ja CJK ideograph got completed.
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed CJK custom Japanese
mapping. It (maybe as well as other CJK tables) mixes NFKD. For
Japanese, modified NFKD table (because of Windows lame design).
2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
* Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
* MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
invoked at any time it is required.
* SimpleCollator.cs : call FillCJK() above in .ctor().
* MSCompatUnicodeTableUtil.cs : CJK range was wider.
* create-mscompat-collation-table.cs : CJK binary was missing the
length. CJK remapping is being moved to ModifyUnidata().
For cjk-ja mapping, we have to consider compat characters to be
added to the map, besides the raw UCA table.
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
* SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
contraction as expected. Fixed Compare() to save s2's contraction
as expected.
* TestDriver.cs :added LastIndexOf() tester w/ indexes.
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
* managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
incorrectly use Compare().
* TestDriver.cs : more moved to nunit tests.
2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : several fixes on Compare().
- Ignorable characters are skipped at the top of the loop.
- IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
- In such case that s1 index is increased while s2 contraction is
replaced, s1 is inconsistently proceeded (bug).
- IsIgnorable() now also checks IgnoreNonSpace.
- Fixed FilterOptions() that does not work for IgnoreWidth at all.
* TestDriver.cs : now some are moved to nunit tests.
* Collation-notes.txt : minor todo update.
2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Compare() was ignoring such case that both
entire strings have '-' to be compared.
* Collation-notes.txt : more status updates.
* TestDriver.cs : added '-' use cases.
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : to be same as other buggy part, it now handles
U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
previous character.
Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
* create-mscompat-collation-table.cs : dummy values for extenders.
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
should be computed from ExtenderType, and voice mark weight should
be considered.
* MSCompatUnicodeTable.cs : added tiny comment.
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
* SimpleCollator.cs : support for extender (U+309D etc.).
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some punct/symbols fix.
* managed-collation.patch : new (and temporary) file to support
managed collation in mscorlib.
* README : described how to use managed collation.
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
U+482-4C8 (though needs diacritical fixes).
* MSCompatUnicodeTable.cs : tiny comment for alternative impl.
2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
computation code, since it appears to work the same way as for Latin
letters. Thus removed all other approaches (UCA, by letter name).
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : diacritical fix for "double-
struck". Syriac nonspacing fixes.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more math symbol weight fixes.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
implemented (no stub). Some other fixes on category 8-A.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some minor fixes on Arabic,
Korean and Japanese sortkey weights.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : More diacritical fixes.
Georgian characters do not have level 2 weights but level 3.
2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Roman numeral characters
have diacritical weight. quick hack for control signs (U+2400..)
and box drawings.
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : improving Latin mappings.
Setting non-ASCII Latin characters' primary weight between those
ASCII characters, and setting diacritical weight (hacky).
* MSCompatUnicodeTable.cs :
Kanatype check: fixed (voice marks) and improved (comparison order).
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more diacritical fixes.
primary weight fixes on punctuations in category 07.
2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : several diacritical fixes.
* TestDriver.cs : sortkey dumper should use StringSort.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : fixed incorrect indexer setup. Optimized
GetContraction() call a bit.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : fixed incorrect level 2
output type.
* MSCompatUnicodeTable.cs : remove debug line.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTableUtil.cs,
MSCompatUnicodeTable.cs,
CodePointIndexer.cs,
create-mscompat-collation-table.cs : made some members internal and
accessible from other classes. Many indexes could be 0 by default.
* SimpleCollator.cs : optimizations. avoid method call.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more updates.
* SimpleCollator.cs : Added quick check for Ordinal comparison.
Fixed special weight comparison. It cannot be customizable in the
implementation (and it won't be harmful).
* mono-tailoring-source.txt : thus updated comment.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Compare() was missing French sort support.
* TestDriver.cs : added example case.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : updated status. Eliminated descriptions on
"iterator" (I avoided it for performance concern). Fixed misc.
incorrect descriptions.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* Collator.cs : Now that SimpleCollator became feature complete, it is
not useful anymore.
2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : implemented decent Compare() that immediately
stops at first primary difference.
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : indexers might return -1.
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
buggy (length check for source was missing).
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed tailoring table output
to be in correct and countable order. Now if tailoring alias was not
found, just stop the build.
* MSCompatUnicodeTable.cs : several build fixes. Now it works to read
assembly resources.
* mono-tailoring-source.txt : commented out CJK aliases that miss
target.
* Makefile : needed further filename fixes.
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
(now it is working as a standalone file).
* Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
(the generator now creates both binary resources and C# source).
2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Now it generates binary
resources (to parent directory).
* MSCompatUnicodeTable.template : added conditional code that fills
collation tables from manifest resources.
* Makefile : remove collation table binaries as well on "make clean".
Removed extraneous dependency.
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template,
SimpleCollator.cs : removed extraneous GetExpansion().
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IsSuffix() also supports contractions.
* TestDriver.cs : IsSuffix() example contraction cases.
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
what current IsPrefix() does). For expansion of target, IsPrefix()
should check the no-match case that expansion is longer than input.
Some refactoring of IsPrefix().
Added GetContractionTal() for IsSuffix() (not used yet).
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
* TestDriver.cs : added IsPrefix() expansion cases.
* SimpleCollator.cs : IsPrefix() now supports contractions (with much
of complexity), and it now returns bool again.
IndexOf() for replacement should make use of IndexOfPrimitiveChar()
since expansions won't be expanded recursively.
2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : commonized character comparison in IsPrefix()
and IsSuffix(). csc compile fix.
* CompareInfoImpl.cs : deleted.
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
* TestDriver.cs : added SimpleCollator.ctor() sanity check.
Added replacement contraction example.
* SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
contraction in source string. Extracted matching code to Matches().
Replacement contraction was including extraneous '\x0'.
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : updated status.
* CollationDataStructures.txt : tiny fixes.
* SimpleCollator.cs :
Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
namespace Util and csc borked).
GetContraction was incorrectly returning first item.
Private IsPrefix() now returns int (but it might not be in real use).
Extracted simple char comparison to CompareCharSimple().
IndexOf() and LastIndexOf() now fully handle contractions (both
binary key and string replacement) in "target" (for "s" not yet).
* TestDriver.cs : be more verbose.
* mono-tailoring-source.txt : added comment.
* MSCompatUnicodeTable.template :
Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
namespace Util and csc borked).
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : compute COMBINING blah marks as
well as those characters WITH blah.
* TestDriver.cs : added combining sortkey cases.
2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
* mono-tailoring-source.txt : fixed description on '*' in sortkeys.
* SimpleCollator.cs : Now it fully uses tailoring info. Fixed
contraction search that worked only when string is contraction.
Removed commented code. Minor refactoring.
* TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
* mono-tailoring-source.txt : removed extraneous level 4 sortkey
which cannot be supported.
* SimpleCollator.cs : added GetContraction() and used in some places.
Now CompareOptions is set only once. Reordered some code (e.g.
ignorable check -> get compat char -> compare).
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : sort tailoring tables before actual usage.
Support diacritical remappings (it is customized collation rule
which does not exist in UCA).
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : build culture specific tailoring table from
TailoringInfo and unified data array.
* create-mscompat-collation-table.cs : Added null termination to
sortkey map tailorings (mostly to save my eyes).
* MSCompatUnicodeTable.template : added public TailoringValues.
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
* SortKeyBuffer.cs : handle special weight (category 06) characters.
* Collation-notes.txt : Updated description on special weight (it was
incorrect).
* TestDriver.cs : added special weight cases.
2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template : added GetTailoringInfo().
* SimpleCollator.cs : Now tailoring information is acquired and used.
(FrenchSort is supported but Compare() won't work as expected since
the table is still incomplete for those diacritical marks).
* SortKeyBuffer.cs : On reversing diacritical weights, it should
ignore zeros. Reset() should reset frenchSorted flag.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Further fixes on Jamo,
diacritical weights by character name, and *Numbers primary weights.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : More fixes on Devanagari,
Gujarati, Oriya, Tamil and Lao sortkeys.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
sortkey values.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed Thai character primary
and secondary values. Fixed Thaana letters. Added more LAMESPEC
CJK compat. Fixed some circled CJK secondary weight.
Hacked some nonspacing mark sortkey value adjustment.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : CP932.TXT was not parsed as
expected. JIS ordering was incorrect. Offsets for OtherNumbers that
represent values of 10 or more were computed incorrectly. Some Hangul
compat characters have different offsets.
2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed 0x8 category characters.
Added hack for need-to-be-fixed characters to fall into 0xA category.
* create-collation-element-table.cs : previous checkin seems to have failed :(
* README: updated a bit.
2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
* CodePointIndexer.cs :
removed extraneous switch (I could use empty array for that need).
* CollationElementTableUtil.cs : primary weight type became ushort.
* create-collation-element-table.cs : several bugfixes.
collElem should be int. It was skipping most of entries because of
incorrect string tokenization.
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : handle some Jamo NFKD.
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : forgot to commit in the last checkin.
* create-mscompat-collation-table.cs : fixed Arabic shift weight chars.
* TestDriver.cs : switch table dumper and collator testing.
* SortKey.cs : for now comment out internal indexes (not in use).
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template,
SimpleCollator.cs : support for culture dependent CJK table.
2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs,
MSCompatUnicodeTableUtil.cs : make CJK table more compact.
2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : Fixed stupid index search when start != 0.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
now starts from "start" and proceeds backward by "length".
* TestDriver.cs : fix warning.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* TestDriver.cs : more tests.
* SimpleCollator.cs : LastIndexOf() was not setting search length
on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* create-normalization-source.cs : output propValue as uint.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* SortKey.cs : Now it is System.Globalization.SortKey.
To replace existing implementation, it now requires lcid and
CompareOptions. Added required members.
* SortKeyBuffer.cs : thus .ctor() requires LCID.
* SimpleCollator.cs : made required changes above.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* CodePointIndexer.cs : added CompressArray(). Now it requires two more
parameters for default index and codepoint.
* CollationElementTableUtil.cs,
NormalizationTableUtil.cs : required changes wrt above change.
* MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
* MSCompatUnicodeTable.template : Now it uses codepoint indexer.
* create-mscompat-collation-table.cs : Now it outputs compressed array.
* Makefile : now collation requires MSCompatUnicodeTableUtil.cs
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs :
Implemented IsSuffix() and LastIndexOf().
Several fixes on index > 0 cases.
* TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : updated (status, impl. classes).
* MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
and IsPrefix(). Tiny code refactoring.
* TestDriver.cs : sample IsPrefix() and IndexOf() usage.
* MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs :
IndexOf(string, char, CompareOptions) implementation.
* TestDriver.cs : sample IndexOf() usage.
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : was missing most important
kind of blocks - equivalent expansions (e.g. invariant mappings).
More readable mappings.
2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
* mono-tailoring-source.txt : new file. It describes tailoring
information. Basically examined under .NET 1.x.
* create-mscompat-collation-table.cs : consume the file above.
* MSCompatUnicodeTable.template : tailorings is no longer a stub.
* CollationDataStructures.txt : minor fixes.
* SortKeyBuffer.cs,
SimpleCollator.cs : added FrenchSort support.
* Collation-notes.txt : added description on Latin primary weights.
* ldml-limited.rng : added note.
* create-tailorings.cs : added note. more serialization (but won't be
used anyways).
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
* SortKeyBuffer.cs : non-primary character is added to previous
diacritical weight.
* TestDriver.cs : added example case of above.
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : IgnoreSymbols support.
* TestDriver.cs : compilation fix. IgnoreSymbols example.
* create-mscompat-collation-table.cs : more Hangul fixes.
2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more Hangul fixes.
* SortKey.cs : it will replace sys.globalization.SortKey. It has
some internal members.
* SortKeyBuffer.cs : now it uses SortKey instead of byte[].
* SimpleCollator.cs : CompareOptions support. However I don't think
it will be developed anymore since SortKey never enables IndexOf().
* TestDriver.cs : a few CompareOptions cases.
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
* SimpleCollator.cs : simple collator implementation that will just
use GetSortKey() as the basis for everything.
* TestDriver.cs : sample code that uses this collator set.
* MSCompatUnicodeTable.template : removed test driver from here.
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Hangul fixes.
Now fewer than 300 characters lack sortkey weights.
* MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Added control picture mappings.
Minor primary weight fixes.
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Added mappings for box
drawings and blocks.
2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Added mappings for arrows.
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : added support for letterlike
characters and squared CJK compatibility characters, ordered by
character names (0x0E category).
* Collation-notes.txt : added description on that.
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template : Now expansions are simulated.
* create-mscompat-collation-table.cs : filled Korean number level2.
Reordered some code blocks to fill correct diacritical differences.
* Collation-notes.txt : some corrections and minor additions.
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template :
Now dumper test driver uses SortKeyBuffer for dogfooding.
* create-mscompat-collation-table.cs : some diacritical level fixes
(with non-working extra latin check).
* SortKeyBuffer.cs : several fixes to get working as a practical code.
* Collator.cs : make it compilable, leaving things as NotImplemented.
2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : some fixes on primary category
07 (miscellaneous symbols and punctuations).
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : more mapping fix on numbers,
letters, variable weight characters, circled Japanese and CJK.
* MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
inclusive. Simplified dumper code.
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : finished Hangul (both Jamo
and Syllables). sortkey dumper diff shrank from 30000 lines to 8000.
2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : added some nonspacing marks in
either correct or hacky way.
2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : several improvements. Japanese
Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
numeric characters, Latin letters decorated with diacritics. Fixed
some diacritical weight detection.
* MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
marks' primary weight as empty.
* Collation-notes.txt : some updates.
2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : don't process nonexact NFKD
mapping as equivalent, however store CJK extensions into NFKD map
even if one does not strictly match.
Now am going to fill Hangul into tables (unlike UCA it does not look
possible to calculate sortkey value).
Fixed Cyrillic and Georgian UCA based orderings.
* MSCompatUnicodeTable.template : added CJK extension sortkey
calculation.
2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Fixed Latin alphabet support.
Added Latin with diacritics and CJK extension.
* MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
now not used though). Filled CJK ideograph, still not perfect.
Fixed number primary keys. NFKD numbers and CJK ideographs are now
considered, including brackets elimination.
* Makefile : now it downloads DerivedAge.txt.
* MSCompatUnicodeTable.template : added dummy code dumper. It computes
PrivateUse, Surrogate and Hangul Syllables.
* Collation-notes.txt : Noted that Hangul Syllables need more love.
2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
* create-tailorings.cs : added configuration support. sort them.
I wonder if it is really usable. Having own format might be better.
* create-mscompat-collation-table.cs : fixing some sortkey numbers,
making closer to windows. Now it handles NFKD in some places.
* MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
* CollationDataStructures.txt : added description on tailoring
fields, though they are subject to change.
2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
* create-tailorings.cs, ldml-limited.rng : new file.
* LdmlReader.cs : removed old file.
2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
* SortKeyBuffer.cs : split from Collator.cs. Now it considers
practical use, reflecting updated sortkey constant design.
Especially level 4 weight is split to 4 arrays that are merged in
the last stage of GetSortKey().
* Collator.cs : thus SortKeyBuffer is removed from here.
Additionally, removed some extraneous bits in other classes.
* Collation-notes.txt : Some editorial fixes. Added information on
Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
be stored in simple byte arrays).
* CodePointIndexer.cs,
create-collation-element-table.cs,
CollationElementTable.template,
NormalizationTableUtil.cs : short CodePointIndexer method names.
* create-mscompat-collation-table.cs : Additional info on why some
meaningful characters are ignored in Windows (Unicode version
difference). Removed U+070F from special check (was extraneous).
2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.template:
Moved body implementation to table creator and put those bool
results into an array.
* create-mscompat-collation-table.cs :
So imported those methods. Modified array output to emit "0x"
only for values greater than 9.
* create-normalization-source.cs : ditto on "0x" output matter.
* CollationDataStructures.txt : so now it holds ignorableFlags.
2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt, CollationDataStructures.txt :
separate document for data structure design.
2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : added culture-dependent CJK
table creation. It uses CLDR as its basis. (Culture independent CJK
is not ready BTW).
* Makefile : added CLDR archive downloading support.
* MSCompatUnicodeTable.template : tiny renamings.
* Collation-notes.txt : additional CJK info.
2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt, create-mscompat-collation-table.cs :
added secondary weight support for BlahNumber characters.
2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
* downloaded : added directory. All downloaded files are stored here.
* Makefile : use "downloaded" directory.
Added more auto-download stuff.
* create-mscompat-collation-table.cs :
Added Japanese square kana support.
2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
* create-mscompat-collation-table.cs : added support for Arabic abjad,
Estrangela and Thaana.
* MSCompatUnicodeTable.template : removed BOM.
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : wrong comment cleanup and spelling fixes.
* create-mscompat-collation-table.cs : added diacritic support for
Latin letters (as long as covered in primary weight).
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
* Makefile : minor fixes. Added warning lines to generated sources.
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
* create-char-mapping-source.cs :
Removed ToWidthInsensitive() generation.
2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
* create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
ToWidthInsensitive() is implemented here, using an array (which is
to be optimized using CodePointIndexer).
* MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
* MSCompatUnicodeTable.template : now it is used to generate
MSCompatUnicodeTable.cs which got ready to be used.
* Makefile : added MSCompatUnicodeTable.cs build support. Now it
supports "make normalization" and "make collation".
2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : Description of ICU was very incorrect. It is
now more rational and sane.
* create-mscompat-collation-table.cs : fixed some indexes.
* Makefile : added "mstablegen" target.
* MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more analysis on "letters".
* create-mscompat-collation-table.cs : more proof of concepts.
2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more info. Started letter sortkey analysis
(some of the other stuff is really incomprehensible right now.)
* create-mscompat-collation-table.cs : table generator proof-of-
concept source (not compilable).
* MSCompatUnicodeTable.cs : moved some code to the new source.
Some more fixes.
2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : started level 2 weight analysis.
2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : Additional information on how to create
level 3 tables.
* MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : More case weight (level 3) analysis. I'm
likely to just write table generator.
2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
* MSCompatUnicodeTable.cs : part of level 4 weight implementation.
2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt :
Added task list.
Revised comparison methods; backward iteration is possible.
More on char-by-char comparison.
Level 4 comparison is actually a bit more complex.
Misc corrections.
* Collator.cs : some conceptual updates wrt above.
2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : Japanese voice mark is level 2, and Hangul
properties are level 3.
2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : Make it more readable. More analysis on
level 3 and 4 sortkey structures.
* Collator.cs : some compilation fixes (not compilable yet).
2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : Analysis on variable-weighting (level 5)
sortkey format.
* Collator.cs : updated corresponding part of level 5, and more.
2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more updates.
* Collator.cs : rewrote from scratch. Some rough sketch for sortkey
buffer, character iterator and collator methods. Not compiling.
2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
* Collator.cs : Am going to replace it with new one. No need for
CompareOptions-dependent Comparer.
2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : There seems to be a bit more complexity.
2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more updates, getting close to writing the
sortkey generator code.
2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
* CompareInfoImpl.cs, Collator.cs : conceptual update
* Collation-notes.txt : some corrections and additions.
* Makefile : added LDML input (but it won't be used at all).
2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more updates.
2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : more updates.
2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
* Collation-notes.txt : some updates.
* create-char-mapping-source.cs : superscripts and subscripts are also
ignored in IgnoreWidth comparison.
* Makefile : tiny touch fix.
2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
* CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
* create-char-mapping-source.cs : Now it generates
ToWidthInsensitive() from decomposition types <wide> and <narrow>.
* MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
* README, LdmlReader.cs, DataStructures.txt : new files.
2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
* CodePointIndexer.cs,
Collation-notes.txt,
CollationElementTable.template,
CollationElementTableUtil.cs,
create-char-mapping-source.cs,
create-collation-element-table.cs,
create-combining-class-source.cs,
create-normalization-source.cs,
Makefile,
MSCompatUnicodeTable.cs,
Normalization.template,
NormalizationTableUtil.cs : initial checkin (to private branch).