Maxim Reznik e85309344c RegExp performance optimization
* Move Ada.Containers.Vectors instances out of local scope to avoid
  instance initialization on each Match call
* Replace State_Vectors with an array to avoid overheads on container
  operations
* Create a copy of VSS.Implementation.Strings.Cursor without
  initialization to avoid initialization of a possible big state vector
* Replace Instruction_Vectors with arrays in Match to avoid overheads
  on container operations
* Move Character_Iterator and Pos to outer scope in Match to avoid
  overheads on class-wide casts
* Unroll Append_State manually because the compiler can't make tail calls
2023-05-16 13:08:23 +03:00

RegExp engine

This regexp engine is intended to implement ECMAScript regular expressions (Unicode mode), but currently only part of the specification is implemented.

The following constructs are currently supported:

| RegExp | Description |
|--------|-------------|
| `x y` | Match x, then y |
| `x \| y` | Match either x or y |
| `x *` | Match x zero or more times |
| `x +` | Match x one or more times |
| `x ?` | Match x zero or one time |
| `(?: x )` | Non-capturing group |
| `( x )` | Capturing group |
| `\p{ N }` | Character of the general category N |
| `\P{ N }` | Character not of the general category N |
| `[ x ]` | Character class x |
| `[^ x ]` | Character not in the class x |
| `[ x - y ]` | Character in the range x..y |
| `[\p{ N }]` | Character of the general category N |
| `[\P{ N }]` | Character not of the general category N |
| `^` | Start of line assertion |
| `$` | End of line assertion |
| `\b` | Word boundary assertion |
| `\B` | Not a word boundary assertion |
| `\d` | A digit (like `[0-9]`) |
| `\D` | Not a digit (like `[^0-9]`) |
| `\s` | A whitespace character (like `[\p{z}\r\n\t\f\v]`) |
| `\S` | Not a whitespace character |
| `\w` | A word character (like `[A-Za-z0-9_]`) |
| `\W` | Not a word character (like `[^A-Za-z0-9_]`) |
| `x` | Literal character x, where x is not one of `^$.*+?]{}` |
| `\ x` | Literal character x, where x is one of `^$.*+?]{}` |
| `\n` `\r` `\t` | Newline, tab, and other control characters |
| `[\n\r]` | The same inside a character class |
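
Because the engine targets ECMAScript semantics, the constructs above can be tried out against any ECMAScript implementation. The sketch below is only an illustration: it uses the built-in JavaScript `RegExp` with the `u` (Unicode mode) flag, not this engine's own API, and the helper name `firstMatch` is hypothetical. It assumes that this engine agrees with the ECMAScript behaviour for these particular patterns.

```typescript
// Illustration only: exercises the built-in ECMAScript RegExp in Unicode
// mode ("u" flag). It does NOT call the Ada engine described in this README.

// Hypothetical helper: returns the first match of `pattern` in `text` as
// [whole match, capture 1, ...], or null when nothing matches.
function firstMatch(pattern: string, text: string): string[] | null {
  const m = new RegExp(pattern, "u").exec(text);
  return m === null ? null : m.slice();
}

// Sequence, repetition, non-capturing and capturing groups.
console.log(firstMatch("(?:ab)+(c)", "xababc"));        // ["ababc", "c"]

// Alternation and an optional element.
console.log(firstMatch("a|bc?", "xbz"));                // ["b"]

// General categories: an uppercase letter followed by non-uppercase characters.
console.log(firstMatch("\\p{Lu}\\P{Lu}*", "hello World")); // ["World"]

// Word boundaries and word characters.
console.log(firstMatch("\\b\\w+\\b", "  foo_1 bar"));   // ["foo_1"]

// Start/end assertions with digits (single-line input).
console.log(firstMatch("^\\d+$", "2023"));              // ["2023"]
console.log(firstMatch("^\\d+$", "20a3"));              // null

// An escaped special character followed by a character range.
console.log(firstMatch("\\$[0-9]+", "cost $42"));       // ["$42"]
```

The patterns are passed as strings, so each backslash is doubled; in this engine the pattern text itself would contain a single backslash, as listed in the table.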

Useful articles