This moves construction of the mapping between VP outputs
and FP inputs into validation.
The map also contains slots for special outputs like clip
distance and point size, so we need to at least merge the
VP related and FP related parts on validation if we want
to support those.
Now we match every single FP input component with results
from the VP and leave those not read out of the map, or
replace those not written by 0 (xyz) or 1 (w).
The bitmap indicating linear interpolants is also filled,
and flat FP inputs are mapped in only after non-flat ones,
as is required.
Furthermore, we can save some space by only fetching VP
attrs we actually use, and avoid wasting any output regs
because of TGSI using less than 4 components.
We're going to try to reorder the scalar ops of a vector instr
to accomodate swizzles that would otherwise require us to emit
to an additional TEMP first (like MOV R0.xy, R0.zx).
1D tile span support for depth/stencil/color/textures
Z and stencil buffers are always tiled, so this fixes
sw access to Z and stencil buffers. color and textures
are currently linear, but this adds span support when we
implement 1D tiling.
This fixes the text in progs/demos/engine and progs/tests/z*
This fixes the compilation errors reported in bug 23945 but someone more
familiar with the code should review for correctness and close the bug
report.
Should be easier to read and work with than the older ways of emitting
TGSI tokens.
Also, emit simpler TGSI than previously:
- translate away source and dest extended modifiers
- translate away the SWZ opcode
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure
intermediate results do not overwrite yet-to-be read source registers.
This change ensures that MOV/SWZ handle this correctly, which is poor but
no worse than the current tgsi_exec.c path. Remove the fallback as there
is nothing to be gained correctness-wise between the two implementations now.
Fixing this properly looks like a bit of work in this code, but might be
easily achieved by sending destination writes to temporary storage.
Previously ureg would always call the driver's create-shader function. This
allows the caller the opportunity to hold onto the tokens if it needs to
reuse them, eg. to create an internal draw shader.