The previous Intel-based macOS runner was replaced with one that runs
as a virtual machine on top of an Apple Silicon host.
Since the current macOS runner cannot yet handle different job exit
codes, we temporarily allow failure unconditionally.
This will be reverted as soon as the runner issue is fixed.
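
As a rough sketch in GitLab CI terms: `allow_failure: true` tolerates any
failure of the job, whereas the `exit_codes` form tolerates only the listed
codes. The job name `macos-example` and the exit code 77 are placeholders
assumed for illustration; they are not taken from the actual configuration,
and the fragment omits the rest of the job definition.

```yaml
# Hypothetical job name; the real job definition is not shown here.
macos-example:
  # Temporary workaround: any failure of this job is tolerated.
  allow_failure: true
  # Narrower form to return to once the runner issue is fixed
  # (77 is an assumed placeholder exit code):
  # allow_failure:
  #   exit_codes:
  #     - 77
```
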
Colour autodetection doesn't work, but GitLab does in fact support colour
output. Perhaps more importantly, the "Scroll to next failure" feature
essentially scans the output for red text.
Mostly to avoid polluting other logs and artifacts, and also to avoid
recompiling crosstests over and over. Eventually the artifacts produced
at this stage should be run on native Windows.