Calculate the IP ranges of the shader as an analysis pass. This will
later replace the existing tracking of start_ip/end_ip as the blocks are
changed (and the defer/adjust scheme to avoid too much churn when that
happen).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
Stop using the start_ip / end_ip, these are not really important for
those tests. What the test care was the number of instructions in the
block to check for changes and ensure we can peek at them by index.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
It was used by the pass tests to verify output with TEST_DEBUG=1,
replace it with brw_print_instructions().
The output is slightly different (not printing IP, not reordering the
blocks), we can add those features as we need, but given the usage was
already very reduced, don't bother with that until need arises.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
In 35df3925ca ("brw: ensure VUE header writes in HS/DS/GS stages") I
misread the PRMs and thought that the VF would initialize the header.
What actually happens is that the VF does not write valid values in
there and the PRMs explicitly say that the VS shader should overwrite
whatever is in there.
We could avoid writing the header in some cases when no HW is going to
read back the header. For example with rendering disables through
3DSTATE_STREAMOUT::RenderingDisable. But those cases are dynamic and
the compiler is not able to tell. So just always write the header.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 35df3925ca ("brw: ensure VUE header writes in HS/DS/GS stages")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12880
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34211>
The values already use BRW, make it consistent.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>
Will be useful for testing BFloat16 in later patches. No change
expected to the compiler itself.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>
Allow us to parse instructions in a form we currently generate
```
dpas.8x8(8) g55<1>F g47<1,1,0>F g31<1,1,0>HF g39<1,1,0>HF { align1 WE_all 1Q $4 };
```
Regions are not really needed, but this will be handled in a later patch
(that will also stop printing the regions).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34031>
This currently treats coarse and fine derivatives the same, but Qualcomm
needs to know whether just coarse derivatives are used or fine
derivatives/quad ops are also used. Rename this to
needs_coarse_quad_helper_invocations make clear the difference from the
new field, needs_full_quad_helper_invocations.
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: 264d8a6766 ("ir3: Set need_full_quad depending on info.fs.require_full_quads")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>
The key change is to use a builder to write the expected shader result
and compare that. To make this less error prone, a few helper functions
were added
- a way to allocate VGRFs from both shaders in parallel, that way the
same brw_reg can be used in both of them;
- assertions that a pass will make progress or not, and proper output
when the unexpected happens;
- use a common brw_shader_pass_test class so to collect some of the helpers;
- make some helpers work directly with builder.
The idea is to improve the signal in tests, so that the disasm comments
are not necessary anymore. For example
```
TEST_F(saturate_propagation_test, basic)
{
brw_reg dst1 = bld.vgrf(BRW_TYPE_F);
brw_reg src0 = bld.vgrf(BRW_TYPE_F);
brw_reg src1 = bld.vgrf(BRW_TYPE_F);
brw_reg dst0 = bld.ADD(src0, src1);
set_saturate(true, bld.MOV(dst1, dst0));
/* = Before =
*
* 0: add(16) dst0 src0 src1
* 1: mov.sat(16) dst1 dst0
*
* = After =
* 0: add.sat(16) dst0 src0 src1
* 1: mov(16) dst1 dst0
*/
brw_calculate_cfg(*v);
bblock_t *block0 = v->cfg->blocks[0];
EXPECT_EQ(0, block0->start_ip);
EXPECT_EQ(1, block0->end_ip);
EXPECT_TRUE(saturate_propagation(v));
EXPECT_EQ(0, block0->start_ip);
EXPECT_EQ(1, block0->end_ip);
EXPECT_EQ(BRW_OPCODE_ADD, instruction(block0, 0)->opcode);
EXPECT_TRUE(instruction(block0, 0)->saturate);
EXPECT_EQ(BRW_OPCODE_MOV, instruction(block0, 1)->opcode);
EXPECT_FALSE(instruction(block0, 1)->saturate);
}
```
becomes
```
TEST_F(saturate_propagation_test, basic)
{
brw_builder bld = make_shader(MESA_SHADER_FRAGMENT, 16);
brw_builder exp = make_shader(MESA_SHADER_FRAGMENT, 16);
brw_reg dst0 = vgrf(bld, exp, BRW_TYPE_F);
brw_reg dst1 = vgrf(bld, exp, BRW_TYPE_F);
brw_reg src0 = vgrf(bld, exp, BRW_TYPE_F);
brw_reg src1 = vgrf(bld, exp, BRW_TYPE_F);
bld.ADD(dst0, src0, src1);
bld.MOV(dst1, dst0)->saturate = true;
EXPECT_PROGRESS(brw_opt_saturate_propagation, bld);
exp.ADD(dst0, src0, src1)->saturate = true;
exp.MOV(dst1, dst0);
EXPECT_SHADERS_MATCH(bld, exp);
}
```
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33936>
The problem occurs with a series of instructions build the subgroup
invocation value :
mov(8) g23<1>UW 0x76543210V
add(8) g23.8<1>UW g23<8,8,1>UW 0x0008UW
add(16) g23.16<1>UW g23<16,16,1>UW 0x0010UW
Our register spilling code operates on physical registers (64B on
Xe2+) and using the brw_inst::is_partial_write() helper only considers
32B registers. So the spiller doesn't see that the add(16) instruction
is doing a partial write and ends up discarding the previous value.
You can reproduce the issue by running a test like :
INTEL_DEBUG=spill_fs ./deqp-vk -n dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.constant.uint8_uint8.buffer.rowmajor.linear
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: aa494cbacf ("brw: align spilling offsets to physical register sizes")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33642>
The premise of the change was wrong: the case where the defs analysis
was required was rare and requiring the analysis inside just the
case we care was being used for another analysis too. So for now,
the change doesn't really helps. I'll revisit this whole pass later on.
This backs out commit 6e19215810.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34039>
Instead of calling `require()` every instruction, call it once per pass.
Even though the defs are cached (i.e. we are not re-calculating them every
instruction), this prevents the extra check and the call to analysis
validation.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34010>
- brw_lower_3src_null_dest: Allocating a new destination, so include
INSTRUCTION_DATA_FLOW class.
- brw_lower_alu_restriction: Removing instruction, so include
INSTRUCTION_IDENTITY. No details are changed so remove
INSTRUCTION_DETAIL.
- brw_lower_vgrfs_to_fixed_grfs: Changing source and destination
numbers, so include INSTRUCTION_DETAIL.
- brw_lower_send_gather: Insert new instructions (scalar register) and
change sources and other information on existing ones. So include
INSTRUCTION_DETAIL and INSTRUCTION_IDENTITY. Promote to INSTRUCTIONS.
- brw_opt_eliminate_find_live_channel: Can change source, so include
INSTRUCTION_DATA_FLOW.
- brw_opt_copy_propagation_defs and brw_opt_cse_defs: Both can remove
instructions, so include INSTRUCTION_IDENTITY. Promote to
INSTRUCTIONS.
- brw_opt_saturate_propagation: Instruction can have `sat` modified,
and operands can have type modified, so include INSTRUCTION_DETAIL.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33993>
This was only needed on Sandybridge. We can delete the brw code,
and replace the generic devinfo bit with a helper inside the elk
compiler itself.
Thanks to Iván Briano for noticing we still had dead brw code for this.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
This is only needed for original 965G/GM clipper code, which only exists
in the legacy compiler. Send it off to live with the elk.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
Lowering the indirect derefs multiple times leads to very inefficient
shaders because of all the control flow inserted.
In particular on some DGC tests with mesh shaders, the tests can spin
for 1hour on an i7 and still not complete compilation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33809>
For each label store its offset and two lists of uses (for JIP and UIP).
Because the parser itself already restricts what opcodes can use labels
(and which ones), don't re-validate them.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33522>
The "JIP:" and "UIP:" markers were being ignored, so was possible
to switch their order in the text but the parser would act as the
same. Just fix the order now and enforce it through the parsing.
Since we are here, remove the "Jump:" and "Pop:" that are not used
for Gfx9+ anymore.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33522>