Mirror of https://github.com/AdaCore/VSS.git, synced 2026-02-12 13:06:25 -08:00
* Move Ada.Containers.Vectors instances out of local scope to avoid instance initialization on each Match call
* Replace State_Vectors with an array to avoid the overhead of container operations
* Create a copy of VSS.Implementation.Strings.Cursor without initialization to avoid initializing a possibly large state vector
* Replace Instruction_Vectors with arrays in Match to avoid the overhead of container operations
* Move Character_Iterator and Pos to the outer scope in Match to avoid the overhead of class-wide casts
* Unroll Append_State manually because the compiler can't make tail calls
RegExp engine
This regexp engine is intended to implement ECMAScript regular expressions (Unicode mode), but currently only part of the specification is implemented.
The following constructs are supported so far:
| RegExp | Description |
|---|---|
| x y | Match x, then y |
| x \| y | Match either x or y |
| x * | Match x zero or more times |
| x + | Match x one or more times |
| x ? | Match x zero or one time |
| (?: x ) | Non-capturing group |
| ( x ) | Capturing group |
| \p{ N } | Character of the general category N |
| \P{ N } | Character not of the general category N |
| [ x ] | Character class x |
| [^ x ] | Character not in the class x |
| [ x - y ] | Character in range x..y |
| [\p{ N }] | Character of the general category N, inside a character class |
| [\P{ N }] | Character not of the general category N, inside a character class |
| ^ | Start of line assertion |
| $ | End of line assertion |
| \b | Word boundary assertion |
| \B | Not a word boundary assertion |
| \d | A digit (like [0-9]) |
| \D | Not a digit (like [^0-9]) |
| \s | A whitespace character (like [\p{Z}\r\n\t\f\v]) |
| \S | Not a whitespace |
| \w | A word character (like [A-Za-z0-9_]) |
| \W | Not a word character (like [^A-Za-z0-9_]) |
| x | Literal character x, where x is not one of ^$.*+?]{} |
| \ x | Literal character x, where x is one of ^$.*+?]{} |
| \n \r \t | Line feed, carriage return, tabulation and other control characters |
| [\n\r] | The same in a character class |
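
Because the engine targets ECMAScript Unicode-mode syntax, the constructs in the table can be tried out in any ECMAScript runtime. The TypeScript sketch below is only an illustration of that syntax: it uses the runtime's built-in `RegExp` with the `u` flag, not the VSS API, and all variable names are hypothetical. It restricts itself to constructs listed in the table; the VSS engine may still behave differently wherever its implementation is incomplete.

```typescript
// Illustration only: built-in ECMAScript RegExp in Unicode mode ("u" flag),
// NOT the VSS API. Patterns are plain strings, so backslashes are doubled.

// Capturing group, non-capturing group, quantifiers and ^ / $ assertions:
// a major number, optionally followed by a dot and a minor number.
const versionPattern = new RegExp("^(\\d+)(?:\\.(\\d+))?$", "u");
const fullMatch = versionPattern.exec("12.5");  // groups 1 and 2 both captured
const majorOnlyMatch = versionPattern.exec("12"); // group 2 does not participate

// General category escapes: \p{Lu} is an uppercase letter,
// \P{L} is any character that is not a letter.
const startsWithUpper = new RegExp("^\\p{Lu}", "u").test("\u00C9dition");
const hasNonLetter = new RegExp("\\P{L}", "u").test("abc");

// Character class with ranges, and a negated character class.
const isHex = new RegExp("^[0-9A-Fa-f]+$", "u").test("1aF0");
const hasNoDigits = new RegExp("^[^0-9]+$", "u").test("abc");

// Word boundary assertion: "cat" must appear as a whole word.
const wholeWordCat = new RegExp("\\bcat\\b", "u");
const catInSentence = wholeWordCat.test("a cat sat");
const catInLongerWord = wholeWordCat.test("concatenate");

// Escaped literal: \. matches only a dot character.
const literalDot = new RegExp("^a\\.b$", "u");
const dotMatches = literalDot.test("a.b");
const letterMatches = literalDot.test("axb");
```

In this sketch `majorOnlyMatch` still succeeds, but its second group is `undefined`, which shows the difference between a group that matched an empty string and one that did not take part in the match.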