Edit: Don't do it for the main function of (graphics) shaders,
its inputs and outputs always go through TGSI_FILE_INPUT/OUTPUT.
This prevents all TEMPs from counting as live out and reduces
register pressure.
The point is to keep an independent dictionary for each function.
The array that was being used as dictionary has been converted into a
"bimap" for two different reasons: first, because having an almost
empty instance of an array with as many entries as registers there are
in the program, once for every function, would be wasteful, and
second, because we want to be able to map Value pointers back to
locations at some point.
The reason is that several passes (regalloc, function argument
binding, inlining) are going to require the callees of a function to
be processed before the caller.
Instruction attributes like WriteALUResult and ALUResultCompare
were being discarded during the some of the local transformations.
This fixes the following piglit tests:
glsl1-inequality (vec2, pass)
loopfunc
fs-any-bvec2-using-if
fs-op-ne-bvec2-bvec2-using-if
fs-op-ne-ivec2-ivec2-using-if
fs-op-ne-mat2-mat2-using-if
fs-op-ne-vec2-vec2-using-if
fs-op-ne-mat2x3-mat2x3-using-if
fs-op-ne-mat2x4-mat2x4-using-if
https://bugs.freedesktop.org/show_bug.cgi?id=45921
NOTE: This is a candidate for the stable branches.
The loop registers weren't being cleared, so any shader that was
executed after a shader containing loops was at risk of having a loop
randomly inserted into it.
This fixes over one hundred piglit tests, although these test
only failed during full piglit runs and would pass if
run individually. The exact number of piglit tests that this patch
fixes will vary depending on the version of piglit and the order the
tests are run.
NOTE: This is a candidate for the stable branches.
Cuts 8/1068 instructions from glyphy's fragment shaders on i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This lets us significantly shorten p->instructions->push_tail(ir), and
will be used in a few more places.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now we can fold a bunch of our expression setup in ff_fragment_shader
into single-line, parseable commits.
v2: Make it actually work. I wasn't setting num_components in the
mask structure, and not setting up a mask structure is way easier.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Having to explicitly dereference is irritating and bloats the code,
when the compiler can detect and do the right thing.
v2: Use a little shim class to produce the automatic dereference
generation at compile time as opposed to runtime, while also
allowing compile-time type checking.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>