These tests check that each individual sample computes the right
value during a multisampled rendering operation. Checking whether the
result is correct after the multisample resolve isn't enough, because
errors at different samples belonging to the same pixel might cancel
out.
Instead, for each shader invocation we compute the expected result and
return the absolute value of the difference between the expected and
computed values. This way errors at different samples cannot cancel
out; they add up.
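
To illustrate why the absolute difference matters, here is a minimal
Python sketch. It is not the test's shader code; it assumes a resolve
that simply averages the samples of a pixel, and the helper names
(resolve, per_sample_error) and sample values are made up for the
example.

```python
def resolve(samples):
    """Average the per-sample values (assumed box-filter resolve)."""
    return sum(samples) / len(samples)


def per_sample_error(computed, expected):
    """Absolute difference per sample, as each invocation would return."""
    return [abs(c - e) for c, e in zip(computed, expected)]


# Hypothetical values for one pixel with four samples.
expected = [0.25, 0.50, 0.75, 1.00]
# Samples 0 and 1 are wrong, but their errors have opposite signs.
computed = [0.50, 0.25, 0.75, 1.00]

# Comparing resolved values hides the bug: both averages are identical.
hidden = resolve(computed) == resolve(expected)

# Resolving the absolute errors exposes it: the result is non-zero.
resolved_error = resolve(per_sample_error(computed, expected))

print(hidden, resolved_error)
```

With the second approach a correct pixel resolves to exactly zero, so
any non-zero value signals that at least one sample was wrong.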

With this change we have a more recent version of SPIRV-Tools and no
longer have to recompile Mesa to test llvmpipe. This fixes a few
failing tests, but also breaks a couple.