Similarly to the already done split from
HLSL IR -> d3dbc
to
HLSL IR -> vsir -> d3bc
we now start splitting the
HLSL IR -> tpf
translation into
HLSL IR -> vsir -> tpf
So hlsl_sm4_write is split into two functions, sm4_generate_vsir() and
tpf_compile().
This translation should be completed once tpf_compile() no longer needs
the hlsl_ctx and entry_func parameters.