If a runner requires multiple calls to run_shader_tests() for
different shader model ranges, those calls are moved inside the single
runner call.
For the same reason, the trace() messages are also moved inside the
runner calls.
We can now pass (sm<4) and (sm>=4) to the "fail" and "todo" qualifiers,
and we can combine several of these qualifier arguments using "&" for
AND and "|" for OR.
Examples:
todo(sm>=4 & sm<6)
todo(sm<4 | sm>6)
Parentheses are not supported.
Adding further shader model ranges for the tests, if we need them,
should be easier now, since they only have to be added to the
"valid_args" array.
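
For illustration, the qualifier syntax described above could be
evaluated with a small table-driven routine like the one below. This is
only a sketch, not the runner's actual code: the layout of valid_args,
the helper names (match_arg, skip_spaces, eval_qualifier_args) and the
strict left-to-right evaluation without operator precedence are all
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of accepted arguments. The real "valid_args" array
 * may look different; supporting a new range means adding one entry. */
static const struct
{
    const char *text;
    unsigned int min_major, max_major; /* Inclusive shader model range. */
}
valid_args[] =
{
    {"sm>=4", 4, ~0u},
    {"sm>=6", 6, ~0u},
    {"sm>6",  7, ~0u},
    {"sm<4",  0, 3},
    {"sm<6",  0, 5},
};

static const char *skip_spaces(const char *p)
{
    while (*p == ' ')
        ++p;
    return p;
}

/* Match one argument at *p, advance past it, and store whether the
 * given shader model major version satisfies it. */
static bool match_arg(const char **p, unsigned int sm_major, bool *value)
{
    size_t i, len;

    for (i = 0; i < sizeof(valid_args) / sizeof(*valid_args); ++i)
    {
        len = strlen(valid_args[i].text);
        if (!strncmp(*p, valid_args[i].text, len))
        {
            *value = sm_major >= valid_args[i].min_major
                    && sm_major <= valid_args[i].max_major;
            *p += len;
            return true;
        }
    }
    return false;
}

/* Evaluate e.g. "sm>=4 & sm<6". Returns 1 if the qualifier applies,
 * 0 if it does not, and -1 on a syntax error (including parentheses).
 * Assumption: operators are applied strictly left to right. */
static int eval_qualifier_args(const char *args, unsigned int sm_major)
{
    const char *p = skip_spaces(args);
    bool result, value;
    char op;

    if (!match_arg(&p, sm_major, &result))
        return -1;
    for (;;)
    {
        p = skip_spaces(p);
        if (!*p)
            return result;
        op = *p++;
        if (op != '&' && op != '|')
            return -1;
        p = skip_spaces(p);
        if (!match_arg(&p, sm_major, &value))
            return -1;
        result = (op == '&') ? (result && value) : (result || value);
    }
}
```

With this sketch, eval_qualifier_args("sm>=4 & sm<6", 5) returns 1,
while the same expression returns 0 for shader model 6, and an argument
string containing parentheses is rejected with -1.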
The relative-addressed case in shader_register_normalise_arrayed_addressing()
leaves the control point id in idx[0], while for constant register
indices it is placed in idx[1]. The latter case could be fixed instead,
but placing the control point count in the outer dimension is more
logical.
The FXC optimiser sometimes converts a local array of input values into
direct array addressing of the inputs, which can result in a
dcl_indexrange instruction spanning input elements with different masks.