The explicit call to fs_builder::group() in emit_single_fb_write() is
required by the builder (otherwise the assertion in fs_builder::emit()
would fail) because the subsequent LOAD_PAYLOAD and FB_WRITE
instructions are in some cases emitted with a non-native execution
width. The previous code would always use the channel enables for the
first quarter, which is dubious but probably worked in practice
because FB writes are never emitted inside non-uniform control flow
and we don't pass the kill-pixel mask via predication in the cases
where we have to fall-back to SIMD8 writes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Yes, it's incorrect to use the 0-th channel enable group
unconditionally without considering the execution and regioning
controls of the instruction that uses the spilled value, but it
matches the previous behaviour exactly, the builder just makes the
preexisting problem more obvious because emitting an instruction of
non-native SIMD width without having called .group() or .exec_all()
explicitly would have led to an assertion failure.
I'll fix the problem in a follow-up series, as the solution is going
to be non-trivial.
Reviewed-by: Matt Turner <mattst88@gmail.com>
This simplifies opt_peephole_sel() slightly by emitting the SEL
instructions immediately after they are created, what makes the
sel_inst and mov_imm_inst arrays unnecessary and will make it possible
to get rid of the explicit inserts when the pass is migrated to the IR
builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
LOAD_PAYLOAD instructions need the same treatment as any other
generator instructions, at least FB writes and typed surface messages
will need a payload built with non-zero execution controls.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Most of these fields affect the behaviour of the instruction, but
apparently we currently don't CSE the kind of instructions for which
these fields could make a difference in the VEC4 back-end. That's
likely to change soon though when we start using send-from-GRF for
texture sampling and surface access messages.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Most of these fields affect the behaviour of the instruction so it
could actually break the program if we CSE a pair of otherwise
matching instructions with different values of these fields.
Reviewed-by: Matt Turner <mattst88@gmail.com>
The purpose of this change is threefold: First, it improves the
modularity of the compiler back-end by separating the functionality
required to construct an i965 IR program from the rest of the visitor
god-object, what in turn will reduce the coupling between other
components and the visitor allowing a more modular design. This patch
doesn't yet remove the equivalent functionality from the visitor
classes, as it involves major back-end surgery.
Second, it improves consistency between the scalar and vector
back-ends. The FS and VEC4 builders can both be used to generate
scalar code with a compatible interface or they can be used to
generate natural vector width code -- 1 or 4 components respectively.
Third, the approach to IR construction is somewhat different to what
the visitor classes currently do. All parameters affecting code
generation (execution size, half control, point in the program where
new instructions are inserted, etc.) are encapsulated in a stand-alone
object rather than being quasi-global state (yes, anything defined in
one of the visitor classes is effectively global due to the tight
coupling with virtually everything else in the compiler back-end).
This object is lightweight and can be copied, mutated and passed
around, making helper IR-building functions more flexible because they
can now simply take a builder object as argument and will inherit its
IR generation properties in exactly the same way that a discrete
instruction would from the same builder object.
The emit_typed_write() function from my image-load-store branch is an
example that illustrates the usefulness of the latter point: Due to
hardware limitations the function may have to split the untyped
surface message in 8-wide chunks. That means that the several
functions called to help with the construction of the message payload
are themselves required to set the execution width and half control
correctly on the instructions they emit, and to allocate all registers
with half the default width. With the previous approach this would
require the used helper functions to be aware of the parameters that
might differ from the default state and explicitly set the instruction
bits accordingly. With the new approach they would get a modified
builder object as argument that would influence all instructions
emitted by the helper function as if it were the default state.
Another example is the fs_visitor::VARYING_PULL_CONSTANT_LOAD()
method. It doesn't actually emit any instructions, they are simply
created and inserted into an exec_list which is returned for the
caller to emit at some location of the program. This sort of two-step
emission becomes unnecessary with the builder interface because the
insertion point is one more of the code generation parameters which
are part of the builder object. The caller can simply pass
VARYING_PULL_CONSTANT_LOAD() a modified builder object pointing at the
location of the program where the effect of the constant load is
desired. This two-step emission (which pervades the compiler back-end
and is in most cases redundant) goes away: E.g. ADD() now actually
adds two registers rather than just creating an ADD instruction in
memory, emit(ADD()) is no longer necessary.
v2: Drop scalarizing VEC4 builder.
v3: Take a backend_shader as constructor argument. Improve handling
of debug annotations and execution control flags.
v4: Drop Gen6 IF with inline comparison. Rename "instr" variable.
Initialize cursor to NULL by default and add method to explicitly
point the builder at the end of the program.
Reviewed-by: Matt Turner <mattst88@gmail.com>
simple_list.h defines a number of macros with short non-namespaced
names that can easily collide with other declarations (first_elem,
last_elem, next_elem, prev_elem, at_end), and according to the comment
it was only being included because of struct simple_node, which is no
longer used in this file.
Reviewed-by: Brian Paul <brianp@vmware.com>
This was using the wrong extension, ARB_stencil_texturing
doesn't mention any changes in this area.
Fixes "dEQP-GLES3.functional.fbo.completeness.renderable.texture.
stencil.stencil_index8."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90751
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
and some more code refactoring. No functional changes in this patch.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
v3: Use ffs() and a switch loop in
tr_mode_horizontal_texture_alignment() (Ben)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
v3: Use ffs() and a switch loop in
tr_mode_vertical_texture_alignment() (Ben)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
and change the name to brw_miptree_choose_tiling().
V3: Remove redundant function parameters. (Topi)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This refactoring is required by later patches in this series.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>