Put the contents of the README into the documentation. First, we need to distribute a pre-compiled documentation xml module: ============ * DOM requires Utf16-encoding for strings. Can we make all the packages generic, so that the user can decide whether to use Utf16 or Utf8 ? * Unicode: Add conversion to Wide_Char and Wide_String * Support for xml:* attributes (space, ...) Note that xml:space means that we must use the Ignorable_Whitespace callback in SAX. * Should Encoding_Scheme include support for From_Utf32 and To_Utf32 * Unicode: http://www.unicode.org/unicode/standard/versions/Unicode3.0.html * DOM: Finish implementation for the few remaining cases * SAX: Should have a feature to specify whether entity refs in attribute values should be expanded or not * Parser: Checking of self-contained entities could be cleaner (ie done at a more general level). For instance, directly at the end of Next_Token, but we must exclude Text from the tests. * Can we implement the lexical parser as a state machine (Next_Char called at a single place). This might make it shorter (but probably not more efficient) Release 0.7 =========== * Parse_Element_Model_From_Entity should use Next_Token (for entitites) * File_Input shouldn't store the whole file in memory (have a buffer of 64k instead). * No need to return Id in Next_Token, should be stored directly in Parser.Id. This would be more efficient. This allows for a simplification in Parse_Element_Model. * For efficiency, Sax.Attributes should be an array * We must normalize PublicID (see 4.2.2) * Group all DTD-related field into a separate record (Entities, Default_Atts)