Historically the SPIR-V backend was only fed by the TPF parser,
which only generates _sat destination modifiers. Now it is fed
by the D3DBC parser too (among others), so it mustn't assert on
other modifiers.
Modifier _pp can be trivially ignored. Modifier _centroid would
probably require some handling, but I'm not immediately sure of
what should happen and it doesn't look like a very urgent thing
anyway, so I'm degrading the assertion to FIXME().
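
As a rough illustration of the policy described above, here is a minimal
sketch. The enum values and the helper name filter_dst_modifiers() are
hypothetical and are not the actual VSIR or vkd3d-shader definitions; the
fprintf() call merely stands in for FIXME().

```c
#include <stdio.h>

/* Hypothetical modifier flags; the real VSIR constants may differ. */
enum dst_modifier
{
    DST_MOD_NONE              = 0x0,
    DST_MOD_SATURATE          = 0x1,
    DST_MOD_PARTIAL_PRECISION = 0x2,
    DST_MOD_CENTROID          = 0x4,
};

/* Returns the subset of modifiers the backend has to act upon. */
static unsigned int filter_dst_modifiers(unsigned int modifiers)
{
    unsigned int handled = modifiers & DST_MOD_SATURATE;

    /* _pp is only a precision hint, so dropping it is always valid. */
    modifiers &= ~(DST_MOD_SATURATE | DST_MOD_PARTIAL_PRECISION);

    /* _centroid is not handled yet: warn instead of asserting. */
    if (modifiers & DST_MOD_CENTROID)
        fprintf(stderr, "fixme: Ignoring _centroid destination modifier.\n");

    return handled;
}
```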
SPIR-V images have a "depth" parameter that, as far as I understand
(the spec doesn't look terribly clear in that regard), specifies
whether the image can be used for depth-comparison operations.
In TPF (and therefore in VSIR) the same information is specified
on the sampler type instead of on the image type. This puts us in
a hard spot, because in principle an image can be used with
many different samplers, and the mapping might even be unknown
at compilation time, so it's not clear how we should define our
images.
We currently have some algorithms to deal with that, but they are
incomplete and lead to SPIR-V validation errors like:
Expected Image to have the same type as Result Type Image
%63 = OpSampledImage %62 %59 %61
The problem here is that the image has a non-depth type, but is
being sampled as a depth image. This check was added recently to
SPIRV-Tools, which is how we became aware of the problem.
As I said, it's not easy in general to decide whether an image is
going to be sampled with depth-comparison operators or not.
Fortunately the SPIR-V spec allows marking the depth parameter as
unknown (using value 2), so until we come up with something better
we use that for all images to please the validator and avoid
giving misleading hints to the driver.
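
The choice described above can be sketched as follows. The three values
are those the SPIR-V specification defines for the Depth operand of
OpTypeImage; the enum and the helper image_depth_operand() are
hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Values of the Depth operand of OpTypeImage. */
enum image_depth
{
    IMAGE_DEPTH_NO      = 0, /* not a depth image */
    IMAGE_DEPTH_YES     = 1, /* a depth image */
    IMAGE_DEPTH_UNKNOWN = 2, /* no indication either way */
};

/* Hypothetical helper: whatever we believe about the samplers an image
 * will be paired with, we currently declare the depth as unknown. */
static uint32_t image_depth_operand(int used_with_comparison_sampler)
{
    (void)used_with_comparison_sampler; /* deliberately ignored for now */
    return IMAGE_DEPTH_UNKNOWN;
}
```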
Numeric types are used very frequently, and doing a tree search
each time one is needed tends to waste a lot of time.
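
The caching idea can be sketched roughly as below. This is not the actual
vkd3d-shader code: the struct, the enum and declare_type_slow() are
hypothetical, with the latter standing in for the existing tree-based
lookup.

```c
#include <stdint.h>

enum component_type { TYPE_FLOAT, TYPE_INT, TYPE_UINT, TYPE_COUNT };
#define MAX_COMPONENT_COUNT 4

struct type_cache
{
    /* SPIR-V ids of already declared types; 0 means "not declared yet". */
    uint32_t ids[TYPE_COUNT][MAX_COMPONENT_COUNT];
    unsigned int slow_lookups;
    uint32_t next_id;
};

/* Stand-in for the expensive tree search plus declaration. */
static uint32_t declare_type_slow(struct type_cache *cache,
        enum component_type type, unsigned int component_count)
{
    (void)type;
    (void)component_count;
    ++cache->slow_lookups;
    return ++cache->next_id;
}

/* component_count must be in the range 1..MAX_COMPONENT_COUNT. */
static uint32_t get_numeric_type_id(struct type_cache *cache,
        enum component_type type, unsigned int component_count)
{
    uint32_t *slot = &cache->ids[type][component_count - 1];

    /* Only the first request for a given type takes the slow path. */
    if (!*slot)
        *slot = declare_type_slow(cache, type, component_count);
    return *slot;
}
```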
I ran the compilation of ~1000 DXBC-TPF shaders randomly taken from
my collection and measured the performance using callgrind and the
kcachegrind "cycle count" estimation.
BEFORE:
* 1,764,035,136 cycles
* 1,767,948,767 cycles
* 1,773,927,734 cycles
AFTER:
* 1,472,384,755 cycles
* 1,469,506,188 cycles
* 1,470,191,425 cycles
So callgrind would estimate a 16% improvement at least.
I ran the compilation of ~1000 DXBC-TPF shaders randomly taken from
my collection and measured the performance using callgrind and the
kcachegrind "cycle count" estimation.
BEFORE:
* 1,846,641,596 cycles
* 1,845,635,336 cycles
* 1,841,335,225 cycles
AFTER:
* 1,764,035,136 cycles
* 1,767,948,767 cycles
* 1,773,927,734 cycles
So callgrind would estimate a 3.6% improvement at least.
The counterpoint is that the caller might get an allocation that
is potentially bigger than necessary. I would expect that allocation
to be rather short-lived anyway, so that's probably not a problem.
Avoid overflowing the (Wine) debug log buffer when output lines are too
long, and keep spirv-text output more legible. The output is still valid
SPIR-V asm, as the assembler does not care which kind of whitespace
is used.
Commits 343c7942e1 and
94c74d2c00 moved applying the NonReadable
and Coherent decorations from spirv_compiler_emit_resource_declaration()
to spirv_compiler_build_descriptor_variable(), but unfortunately missed
the non-array path in the latter function.
The missing NonReadable decoration causes segmentation faults in
rasteriser-ordered-views.shader_test (among others) on my Intel SKL GT2
setup in particular.
If two UAV descriptors are mapped to the same variable, and one is
read from while the other is not, the variable would get the NonReadable
decoration despite being read from later.
The existing code reuses the same SPIR-V variable for all descriptors mapped to
the same Vulkan binding, and applies the NonReadable decoration based on the
VKD3D_SHADER_DESCRIPTOR_INFO_FLAG_UAV_READ only. This potentially causes the
decoration to be applied twice, should two non-read descriptors be mapped to
the same variable, which isn't allowed in SPIR-V, and the validator complains.
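
One way to avoid both problems is to decide per variable rather than per
descriptor, and to emit the decoration once. The following is a sketch under
assumed names: struct descriptor, DESCRIPTOR_FLAG_UAV_READ (a stand-in for
VKD3D_SHADER_DESCRIPTOR_INFO_FLAG_UAV_READ) and binding_is_non_readable() are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VKD3D_SHADER_DESCRIPTOR_INFO_FLAG_UAV_READ. */
#define DESCRIPTOR_FLAG_UAV_READ 0x1u

struct descriptor
{
    unsigned int binding; /* Vulkan binding, i.e. the shared variable */
    unsigned int flags;
};

/* The variable may be decorated NonReadable only if none of the descriptors
 * mapped to it is read from. The caller applies the decoration once. */
static bool binding_is_non_readable(const struct descriptor *descriptors,
        size_t count, unsigned int binding)
{
    bool found = false;
    size_t i;

    for (i = 0; i < count; ++i)
    {
        if (descriptors[i].binding != binding)
            continue;
        if (descriptors[i].flags & DESCRIPTOR_FLAG_UAV_READ)
            return false;
        found = true;
    }

    return found;
}
```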
As the newly added documentation describes, this reroll serves two purposes:
* to allow shader parameters to be used for any target type (which allows using
parameters for things like Direct3D 8-9 alpha test),
* to allow the union in struct vkd3d_shader_parameter to contain types larger
than 32 bits (by specifying them indirectly through a pointer).
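
To illustrate the second point only, here is a sketch of a parameter whose
union holds small values inline and larger ones through a pointer. The type
and field names are hypothetical and do not reproduce the actual layout of
struct vkd3d_shader_parameter or its successor.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical data types, for illustration only. */
enum example_data_type { EXAMPLE_DATA_UINT32, EXAMPLE_DATA_UINT64 };

struct example_parameter
{
    enum example_data_type data_type;
    union
    {
        uint32_t u32;        /* small values are stored inline */
        const void *pointer; /* larger values are stored indirectly */
    } u;
};

/* Reads the parameter as a 64-bit value, whichever way it is stored. */
static uint64_t example_parameter_get_u64(const struct example_parameter *p)
{
    uint64_t value;

    if (p->data_type == EXAMPLE_DATA_UINT32)
        return p->u.u32;

    memcpy(&value, p->u.pointer, sizeof(value));
    return value;
}
```

Since larger types only occupy a pointer in the union, adding one later
should not change the size of the structure.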