The generated Vulkan calls look right and do not trigger any
validation error, but the returned timestamp is 0. A valid
timestamp is returned if the CopyResource() call is commented,
or the second EndQuery() call is moved before CopyResource(),
or the first EndQuery() call is commented. I am not seeing any
sensible pattern here, so I guess there is just a bug in
MoltenVK.
Specifically, MoltenVK seems to be able to load from stencil, but
the specific replicating swizzle (repeating the stencil value on
all the channels) is not honored. The stencil value is read only
on the red channel.
At the current moment this is a little odd because for SM1 [test]
directives are skipped, and the [shader] directives are not executed by
the shader_runner_vulkan.c:compile_shader() but by the general
shader_runner.c:compile_shader(). So in principle it is a little weird
that we go through the vulkan runner.
But fret not, because in the future we plan to make the parser agnostic
to the language of the tests, so we will get rid of the general
shader_runner.c:compile_shader() function and instead call a
runner->compile_shader() function, defined for each runner. Granted,
most of these may call a generic implementation that uses native
compiler in Windows, and vkd3d-shader on Linux, but it would be more
conceptually correct.
If the runners require multiple calls to run_shader_tests() for
different shader model ranges, these are moved inside the sole runner
call.
For the same reason, the trace() messages are also moved inside the
runner calls.
We can now pass (sm<4) and (sm>=4) to "fail" and "todo" qualifiers, and
we can use multiple of these qualifier arguments using "&" for AND and
"|" for OR.
examples:
todo(sm>=4 & sm<6)
todo(sm<4 | sm>6)
parenthesis are not supported.
Adding additional model ranges for the tests, if we need them, should be
easier now, since they only have to be added to the "valid_args" array.
The relative-addressed case in shader_register_normalise_arrayed_addressing()
leaves the control point id in idx[0], while for constant register
indices it is placed in idx[1]. The latter case could be fixed instead,
but placing the control point count in the outer dimension is more
logical.
The FXC optimiser sometimes converts a local array of input values into
direct array addressing of the inputs, which can result in a
dcl_indexrange instruction spanning input elements with different masks.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=56162
Storing to a vector component using a non-constant index is not allowed
on profiles lower than 6.0. Unless this happens inside a loop that can be
unrolled, which we are not doing yet.
For this reason, a validate_nonconstant_vector_store_derefs pass is
added to detect these cases.
Ideally we would want to emit an hlsl_error on this pass, but before
implementing loop unrolling, we could reach this point on valid HLSL.
Also, as pointed out by Nikolay in the mentioned bug, currently
new_offset_from_path_index() fails an assertion when this happens,
because it expects an hlsl_ir_constant, so a check is added.
It also felt correct to emit an hlsl_fixme there, despite the
redundancy.
Apparently Metal doesn't support specifying a bias directly in the
sampler, and, with "nearest" mip filtering, it doesn't switch
precisely at LOD 0.5 (though still between 0.5 and 0.6).