Let eg. sp_get_samples_rect be hooked directly in as the tgsi sampler
routine.
Add a field to determine whether this is a vertex or fragment sampling
call, and massage parameters to match the tgsi call.
Fixes two issues - firstly for mipmap levels with one or more
dimensions smaller than tilesize, the code was sampling off the edge
of the texture (but still within the tile).
Secondly, in the linear_mipmap_linear case, both the default code and
new fastpath were incorrect. This change fixes the fastpath and adds
a comment to the default path, which still needs to be fixed.
Basically the issue is that the coordinates in the smaller texture
level are/were being computed by just dividing thecoordinates from the
larger texture level by two, as in:
x0[j] /= 2;
y0[j] /= 2;
x1[j] /= 2;
y1[j] /= 2;
The issues with this are signficant. Initially x1 is most often equal
to x0+1, but after this, it will likely be equal to x0, so we will not
actually be performing the linear blend within the smaller mipmap.
The fastpath code avoided this (recalculated x1), but was still using
the weighting factors from the larger mipmap level (xw, yw), which
were incorrect.
Change the fastpath code to do two full, independent linear samples of
the two mipmap levels before blending. The default code needs to do
the same thing.
SOA dependencies can happen when a register is used both as a source and
destination and the source is swizzled. For example:
MOV T, T.yxwz; would expand into:
MOV t0, t1;
MOV t1, t0;
MOV t2, t3;
MOV t3, t2;
The second instruction will produce the wrong result since we wrote to t0
in the first instruction. We need to use an intermediate temporary to fix
this.
This will take more work to fix for all TGSI instructions. This seems to
happen with MOV instructions more than anything else so fix that case now
and warn on others.
Fixes piglit glsl-vs-loop test (when not using SSE). See bug 23317.
SOA dependencies can happen when a register is used both as a source and
destination and the source is swizzled. For example:
MOV T, T.yxwz; would expand into:
MOV t0, t1;
MOV t1, t0;
MOV t2, t3;
MOV t3, t2;
The second instruction will produce the wrong result since we wrote to t0
in the first instruction. We need to use an intermediate temporary to fix
this.
This will take more work to fix for all TGSI instructions. This seems to
happen with MOV instructions more than anything else so fix that case now
and warn on others.
Fixes piglit glsl-vs-loop test (when not using SSE). See bug 23317.
linear-mip-linear-repeat-POT sampling faspath, provides a very nice
speedup to apps that do this common type of texturing.
Test case: demos/terrain, turn fog off, turn texturing on.
Without patch: 12 fps
With patch: 20 fps.
Could previously emit opcodes with label arguments, but was no way to
patch them with the actual destinations of those labels.
Adds two functions:
ureg_get_instruction_number - to get the id of the next instruction
to be emitted
ureg_fixup_label - to patch an emitted label to point to a given
instruction number.
Need some more complex examples than u_simple_shader, so far this has
only been compile-tested.