
Standard Functions
==================

Fold
----

Fold takes the values produced by a generator and reduces them to a single
value. It is written as a suffix of the generator, for example ``child``.
Folding is computed on capture variables. The general syntax is:

.. code-block:: text

   <prefix generator>.fold (<initial value>, <folding expression>)

In a fold expression, two specific references can be used to create the result:

- The entity currently generated by the prefix, stored in the standard iterator
  reference ``it``.
- The previous value of the result, stored in the usual left reference ``@``.

For example:

.. code-block:: text

   child ().fold ("", @ & ", " & it);
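
The reduction can be modeled outside of UWrap. The following Python sketch is
only an illustration of the semantics described above, not the actual
implementation: the ``fold`` helper and the ``children`` list are hypothetical
stand-ins for the fold function and for the values generated by ``child ()``.

.. code-block:: python

   from functools import reduce


   def fold(generated, initial, step):
       """Model of <prefix>.fold (<initial value>, <folding expression>).

       step(acc, it) plays the role of the folding expression: acc stands
       for the left reference @ and it for the iterator reference it.
       """
       return reduce(step, generated, initial)


   # Hypothetical stand-in for the values generated by child ().
   children = ["A", "B", "C"]

   # Equivalent of: child ().fold ("", @ & ", " & it)
   result = fold(children, "", lambda acc, it: acc + ", " + it)
   print(result)  # ", A, B, C"

Under this model the result starts with a separator, since the folding
expression is also applied to the first value. The optional third parameter
of fold, described later in this section, is one way to avoid that.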

For more complex expressions, it's possible to use explicit accumulators and
references in addition to the default ones. The previous example can be written:

.. code-block:: text

   child (v: true).fold (i: "", i: (i & ", " & v));

The above uses ``i`` as an accumulator and concatenates the successive values
of ``child``, captured in ``v``, into the result. If ``fold`` is the last call
of a selector, and the result of the selector is captured, then early evaluation
of that value can be used to write the above expression differently:

.. code-block:: text

   i: child (v: true).fold ("", i & ", " & v);

A third, optional parameter to fold provides an expression that is evaluated
between two values. This can be used in particular to implement separators.
For example:

.. code-block:: text

   child ().fold ("", @ & it, @ & ", ");

If ``child ()`` generates three values, A, B and C, the result will be "A, B, C".
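
The role of the third parameter can be sketched in Python as follows. This is
a model written under assumptions, not the actual implementation: the
``fold_with_separator`` helper and its parameter names are hypothetical, and
the separator expression is assumed to run only between two consecutive
values.

.. code-block:: python

   def fold_with_separator(generated, initial, step, between):
       """Model of <prefix>.fold (<initial>, <folding expr>, <separator expr>).

       step(acc, it) models the folding expression, between(acc) models the
       expression evaluated in between two values.
       """
       acc = initial
       first = True
       for it in generated:
           if not first:
               # Only evaluated between two values, never before the first one.
               acc = between(acc)
           acc = step(acc, it)
           first = False
       return acc


   # Hypothetical stand-in for the values generated by child ().
   children = ["A", "B", "C"]

   # Equivalent of: child ().fold ("", @ & it, @ & ", ")
   result = fold_with_separator(
       children,
       "",
       lambda acc, it: acc + it,
       lambda acc: acc + ", ",
   )
   print(result)  # "A, B, C"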

Filters
-------

A filter is a function that filters the values returned or generated by a
given prefix. The general syntax is:

.. code-block:: text

   <prefix generator>.filter (<filtering expression>)

A filter can be a complete regular expression. For example, consider a tree of
the following structure: "some entity named A" -> "some random node" -> "some
entity named B". This would not be matched by:

.. code-block:: text

   child (\ f_name ("A") \ f_name ("B"))

This is because there's a node between "A" and "B" (a common situation with
langkit-generated trees, where a list node often sits between two related
entities).

An alternative could be to take into account this extra node in the child
expression instead:

.. code-block:: text

   child (\ f_name ("A") \ many (not Entity ()) \ f_name ("B"))

However, it can be useful to isolate the ``A \ B`` pattern and hide the fact
that there can be nodes in between. This can be done with a two-stage
expression:

.. code-block:: text

   child (\ many (not Entity ()) \ Entity ()).filter (\ f_name ("A") \ f_name ("B"))

This can be taken one step further using a function:

.. code-block:: text

   function entity_child do
      pick child (\ many (not Entity ()) \ Entity ()).all()
   end;

   entity_child ().filter (\ f_name ("A") \ f_name ("B"))

In order for regular expressions to be used, the prefix of the filter must
support them. Tree predicates such as ``child`` do, either directly as a
parameter, ``child (A \ B)``, or using ``filter`` as a suffix,
``child ().filter (A \ B)``.

Custom functions may also support regular expressions. To do so, they need to
operate on the current iteration reference ``it`` and return a node. If that's
the case, any section of the regular expression will be interpreted as a call
to the prefix on the current value of ``it``.

Let's take an example. Consider an expression of the form
``P1.P2.P3.filter (A \ B \ C)``:

- First, the values generated by ``P1.P2.P3`` are iterated over until one of
  them matches A.
- A then becomes the iterated node ``it``. P3 is then called again with this
  new value, and the values generated by this call are iterated over until one
  of them matches B.
- B then becomes the new iterated node ``it``. P3 is then called again, and
  so on.

This is the behavior highlighted in the previous example.
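
The matching process can be sketched in Python. This is only a model under
stated assumptions, not the actual implementation: ``Node``,
``direct_children``, ``entity_child``, ``named`` and ``filter_chain`` are
hypothetical names, and trying the next candidate when the rest of the
expression fails is an assumption that the text above does not state.

.. code-block:: python

   from dataclasses import dataclass, field


   @dataclass
   class Node:
       name: str
       is_entity: bool = True
       children: list = field(default_factory=list)


   def direct_children(node):
       """Models a prefix that only generates direct children."""
       yield from node.children


   def entity_child(node):
       """Models a prefix that skips non-entity nodes to reach entities."""
       for child in node.children:
           if child.is_entity:
               yield child
           else:
               yield from entity_child(child)


   def named(name):
       """Models a section of the regular expression such as f_name ("A")."""
       return lambda node: node.name == name


   def filter_chain(prefix, start, sections):
       """Match each section in turn, calling the prefix again on each match."""
       if not sections:
           return []
       head, rest = sections[0], sections[1:]
       for candidate in prefix(start):
           if head(candidate):
               # The matched node becomes the new iterated node.
               tail = filter_chain(prefix, candidate, rest)
               if tail is not None:
                   return [candidate] + tail
       return None


   # "entity A" -> "list node" -> "entity B"
   b = Node("B")
   list_node = Node("list", is_entity=False, children=[b])
   a = Node("A", children=[list_node])
   root = Node("root", children=[a])

   sections = [named("A"), named("B")]
   direct = filter_chain(direct_children, root, sections)
   skipping = filter_chain(entity_child, root, sections)
   print(direct)                      # None
   print([n.name for n in skipping])  # ['A', 'B']

In this sketch the same ``A \ B`` sections fail with the direct prefix and
succeed with the one that hides intermediate nodes, which mirrors the
``entity_child`` function shown earlier.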

Functions on Strings
--------------------

The following functions are available for various text formats:

TODO: implementation to be finalized

- ``text (<some value>)``: ensures that the resulting value is interpreted as
  text. The function actually creates a dynamic conversion which is only
  resolved when converted to an actual string.
- ``string (<some value>)``: resolves the value passed as parameter to a final
  string, resolving all the components it's computed from.
- ``to_lower (<some value>)``: resolves a string and converts it to lower case.
- ``to_upper (<some value>)``: resolves a string and converts it to upper case.
- ``normalize_ada_name (<some value>)``: resolves a string and converts it to
  follow the casing of Ada identifiers.
- ``replace_text (<value>, <pattern>, <by>)``: resolves a string and replaces
  all occurrences of pattern by a certain value. TODO: should be replace_string
  and replace_string_all instead.
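
The difference between a deferred text and a resolved string can be sketched
in Python. This is a hypothetical model, not the actual implementation, which
is still to be finalized. The ``Text`` class is an invented stand-in, the
casing rule used for ``normalize_ada_name`` (capitalizing each
underscore-separated segment) is an assumption, and so is treating the
pattern of ``replace_text`` as a regular expression.

.. code-block:: python

   import re


   class Text:
       """Deferred text: its parts stay unresolved until string() is called."""

       def __init__(self, *parts):
           self.parts = parts


   def string(value):
       """Resolve a value, and every component it is built from, to a str."""
       if isinstance(value, Text):
           return "".join(string(part) for part in value.parts)
       if callable(value):
           return string(value())
       return str(value)


   def to_lower(value):
       return string(value).lower()


   def to_upper(value):
       return string(value).upper()


   def normalize_ada_name(value):
       # Assumed rule: capitalize each underscore-separated segment.
       return "_".join(s.capitalize() for s in string(value).split("_"))


   def replace_text(value, pattern, by):
       # Assumption: the pattern is a regular expression.
       return re.sub(pattern, by, string(value))


   name = {"value": "my_package"}
   deferred = Text("package ", lambda: name["value"])
   name["value"] = "other_package"  # changed before resolution

   print(string(deferred))                     # "package other_package"
   print(to_upper(deferred))                   # "PACKAGE OTHER_PACKAGE"
   print(normalize_ada_name("some_ada_NAME"))  # "Some_Ada_Name"
   print(replace_text("a.b.c", r"\.", "_"))    # "a_b_c"

In this model the deferred text observes the change made to ``name`` because
nothing is resolved until ``string`` is called.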