Commit graph

4222 commits

Author SHA1 Message Date
Matt Turner
1ab5b4f7db intel/compiler: Use FALLTHROUGH
Reported by clang's `-Wimplicit-fallthrough`.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>
2025-03-13 20:11:09 +00:00
Caio Oliveira
91feef40db brw: Simplify the test code for brw passes
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The key change is to use a builder to write the expected shader result
and compare that.  To make this less error prone, a few helper functions
were added

- a way to allocate VGRFs from both shaders in parallel, that way the
  same brw_reg can be used in both of them;
- assertions that a pass will make progress or not, and proper output
  when the unexpected happens;
- use a common brw_shader_pass_test class so to collect some of the helpers;
- make some helpers work directly with builder.

The idea is to improve the signal in tests, so that the disasm comments
are not necessary anymore.  For example

```
TEST_F(saturate_propagation_test, basic)
{
   brw_reg dst1 = bld.vgrf(BRW_TYPE_F);
   brw_reg src0 = bld.vgrf(BRW_TYPE_F);
   brw_reg src1 = bld.vgrf(BRW_TYPE_F);
   brw_reg dst0 = bld.ADD(src0, src1);
   set_saturate(true, bld.MOV(dst1, dst0));

   /* = Before =
    *
    * 0: add(16)       dst0  src0  src1
    * 1: mov.sat(16)   dst1  dst0
    *
    * = After =
    * 0: add.sat(16)   dst0  src0  src1
    * 1: mov(16)       dst1  dst0
    */

   brw_calculate_cfg(*v);
   bblock_t *block0 = v->cfg->blocks[0];

   EXPECT_EQ(0, block0->start_ip);
   EXPECT_EQ(1, block0->end_ip);

   EXPECT_TRUE(saturate_propagation(v));
   EXPECT_EQ(0, block0->start_ip);
   EXPECT_EQ(1, block0->end_ip);
   EXPECT_EQ(BRW_OPCODE_ADD, instruction(block0, 0)->opcode);
   EXPECT_TRUE(instruction(block0, 0)->saturate);
   EXPECT_EQ(BRW_OPCODE_MOV, instruction(block0, 1)->opcode);
   EXPECT_FALSE(instruction(block0, 1)->saturate);
}
```

becomes

```
TEST_F(saturate_propagation_test, basic)
{
   brw_builder bld = make_shader(MESA_SHADER_FRAGMENT, 16);
   brw_builder exp = make_shader(MESA_SHADER_FRAGMENT, 16);

   brw_reg dst0 = vgrf(bld, exp, BRW_TYPE_F);
   brw_reg dst1 = vgrf(bld, exp, BRW_TYPE_F);
   brw_reg src0 = vgrf(bld, exp, BRW_TYPE_F);
   brw_reg src1 = vgrf(bld, exp, BRW_TYPE_F);

   bld.ADD(dst0, src0, src1);
   bld.MOV(dst1, dst0)->saturate = true;

   EXPECT_PROGRESS(brw_opt_saturate_propagation, bld);

   exp.ADD(dst0, src0, src1)->saturate = true;
   exp.MOV(dst1, dst0);

   EXPECT_SHADERS_MATCH(bld, exp);
}
```

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33936>
2025-03-13 17:43:17 +00:00
Lionel Landwerlin
35df3925ca brw: ensure VUE header writes in HS/DS/GS stages
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12820
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34041>
2025-03-13 16:06:01 +00:00
Lionel Landwerlin
c60180ba63 brw: fix spilling for Xe2+
The problem occurs with a series of instructions build the subgroup
invocation value :

mov(8)          g23<1>UW        0x76543210V
add(8)          g23.8<1>UW      g23<8,8,1>UW    0x0008UW
add(16)         g23.16<1>UW     g23<16,16,1>UW  0x0010UW

Our register spilling code operates on physical registers (64B on
Xe2+) and using the brw_inst::is_partial_write() helper only considers
32B registers. So the spiller doesn't see that the add(16) instruction
is doing a partial write and ends up discarding the previous value.

You can reproduce the issue by running a test like :

INTEL_DEBUG=spill_fs ./deqp-vk -n dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.constant.uint8_uint8.buffer.rowmajor.linear

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: aa494cbacf ("brw: align spilling offsets to physical register sizes")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33642>
2025-03-13 15:29:22 +00:00
Caio Oliveira
ce71f2badd brw: Move defs analysis back to its place in saturate propagation
The premise of the change was wrong: the case where the defs analysis
was required was rare and requiring the analysis inside just the
case we care was being used for another analysis too.  So for now,
the change doesn't really helps.  I'll revisit this whole pass later on.

This backs out commit 6e19215810.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34039>
2025-03-13 10:59:30 +00:00
Caio Oliveira
6e19215810 brw: Get the reference to brw_def_analysis only once in saturate propagation
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Instead of calling `require()` every instruction, call it once per pass.
Even though the defs are cached (i.e. we are not re-calculating them every
instruction), this prevents the extra check and the call to analysis
validation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34010>
2025-03-13 04:52:01 +00:00
Caio Oliveira
308f56ef82 brw: Add missing dependency classes to various passes
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
- brw_lower_3src_null_dest: Allocating a new destination, so include
  INSTRUCTION_DATA_FLOW class.

- brw_lower_alu_restriction: Removing instruction, so include
  INSTRUCTION_IDENTITY.  No details are changed so remove
  INSTRUCTION_DETAIL.

- brw_lower_vgrfs_to_fixed_grfs: Changing source and destination
  numbers, so include INSTRUCTION_DETAIL.

- brw_lower_send_gather: Insert new instructions (scalar register) and
  change sources and other information on existing ones.  So include
  INSTRUCTION_DETAIL and INSTRUCTION_IDENTITY.  Promote to INSTRUCTIONS.

- brw_opt_eliminate_find_live_channel: Can change source, so include
  INSTRUCTION_DATA_FLOW.

- brw_opt_copy_propagation_defs and brw_opt_cse_defs: Both can remove
  instructions, so include INSTRUCTION_IDENTITY.  Promote to
  INSTRUCTIONS.

- brw_opt_saturate_propagation: Instruction can have `sat` modified,
  and operands can have type modified, so include INSTRUCTION_DETAIL.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33993>
2025-03-12 22:44:10 +00:00
Kenneth Graunke
cdbedc9eff intel: Move unlit centroid workaround into the elk compiler
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This was only needed on Sandybridge.  We can delete the brw code,
and replace the generic devinfo bit with a helper inside the elk
compiler itself.

Thanks to Iván Briano for noticing we still had dead brw code for this.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
2025-03-10 17:23:07 -07:00
Kenneth Graunke
7f6b1dee2c intel: Move devinfo->has_compr4 into the elk compiler
Used in exactly one place in elk.  Off to live there.

Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
2025-03-10 17:23:07 -07:00
Kenneth Graunke
be8ec31e72 intel: Move devinfo->has_negative_rhw_bug into the elk compiler
This is only needed for original 965G/GM clipper code, which only exists
in the legacy compiler.  Send it off to live with the elk.

Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
2025-03-10 17:23:07 -07:00
Caio Oliveira
1744ecc1ce brw: Remove dead code from control flow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33957>
2025-03-10 19:23:17 +00:00
Caio Oliveira
89f0db0aaa brw: Remove extra interface in brw_cfg types
The C++ one is more used, so let that one remain.  These data structures
are not used from C sources anymore.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33957>
2025-03-10 19:23:17 +00:00
Lionel Landwerlin
1835bf3520 brw: avoid calling lower_indirect_derefs multiple times
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Lowering the indirect derefs multiple times leads to very inefficient
shaders because of all the control flow inserted.

In particular on some DGC tests with mesh shaders, the tests can spin
for 1hour on an i7 and still not complete compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33809>
2025-03-09 20:52:01 +00:00
Sagar Ghuge
1bfe2571f5 intel/compiler: Lower sample index into coord for MSRT messages
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32690>
2025-03-07 23:06:14 +00:00
Sagar Ghuge
bea9d79cb9 intel/compiler: Add support for MSAA typed load/store messages
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32690>
2025-03-07 23:06:14 +00:00
Caio Oliveira
ff59013571 brw: Rework label tracking in assembler
For each label store its offset and two lists of uses (for JIP and UIP).
Because the parser itself already restricts what opcodes can use labels
(and which ones), don't re-validate them.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33522>
2025-03-06 17:06:20 -08:00
Caio Oliveira
7b45d31df0 brw: Add support for GOTO/JOIN in the assembler
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33522>
2025-03-06 17:06:20 -08:00
Caio Oliveira
9df254731e brw: Make assembler strict about JIP and UIP order
The "JIP:" and "UIP:" markers were being ignored, so was possible
to switch their order in the text but the parser would act as the
same.  Just fix the order now and enforce it through the parsing.

Since we are here, remove the "Jump:" and "Pop:" that are not used
for Gfx9+ anymore.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33522>
2025-03-06 17:06:19 -08:00
Caio Oliveira
25875f5e79 brw: Remove bblock_t parameters from various passes
These are either unused or can be trivially replaced by
a block stored in an instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
8e2a7cb42d brw: Embed at_end() inside brw_builder(brw_shader *) constructor
All remaining uses of that constructor would also use at_end(),
and vice-versa.  So just implement that behavior in the constructor
itself.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
6f37e6f104 brw: Add explicit way to get an empty brw_builder
And use brw_builder(brw_shader *) and brw_builder() constructors
where possible.

The way tests are written, it is necessary to initialize an "empty"
builder -- which is later replaced by a proper one.  Default parameter
NULL make that initialization implicit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
32e562ae01 brw: Simplify brw_builder "insert before inst" constructor
Since brw_inst now has the block it belongs and the block can
reach the shader, the only necessary information to create a
builder is the brw_inst itself.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
66307811c3 brw: Remove block parameter from brw_inst::remove()
Use brw_inst::block instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
7924d48bcd brw: Use brw_inst::block in CSE
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
b0b0fa8624 brw: Use brw_inst::block in Combine Constants
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
07d0af763d brw: Use brw_inst::block in Def analysis
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
705d448bc3 brw: Add block pointer in brw_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
b71ec53048 brw: Remove unused function
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Caio Oliveira
54912281a0 brw: Always verify EU compaction in debug mode
There's already code to verify that any compacted instruction
that we produce is equivalent to the original uncompacted
instruction -- including detailed output if it fails.

This patch enables this verification in debug build and will
abort in case it fails.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33821>
2025-03-06 00:14:14 +00:00
Lionel Landwerlin
93a327c4e6 anv/brw: move INTEL_MSAA_* flag computation to the compiler
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33751>
2025-03-05 17:20:12 +00:00
Lionel Landwerlin
beaba53010 brw: make intel_shader_enums.h opencl importable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33751>
2025-03-05 17:20:12 +00:00
Caio Oliveira
dd1ca1588d brw: Fix size in assembler when compacting
Calculation was wrongly walking uncompacted instructions, even if we had
some compacted in the middle, generating invalid size.  Since we are
here just drop the instruction count, since in practice the caller will
have to walk the instruction stream anyway.

Fixes: 6267585778 ("intel/brw: Also return the size of the assembled shader")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33532>
2025-03-03 20:43:56 +00:00
Kenneth Graunke
88309a9818 brw: Rename shared function enums for clarity
Our name for this enum was brw_message_target, but it's better known as
shared function ID or SFID.  Call it brw_sfid to make it easier to find.

Now that brw only supports Gfx9+, we don't particularly care whether
SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5.  Also, the LSC SFIDs
were confusingly tagged "GFX12" but aren't available on Gfx12.0; they
were introduced with Alchemist/Meteorlake.

GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing.  It sounds
like the SFID to use for the sampler on Gfx6+, however it has nothing to
do with the sampler at all.  BRW_SFID_SAMPLER remains the sampler SFID.
On Haswell, we ran out of messages on the main data cache data port, and
so they introduced two additional ones, for more messages.  The modern
Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2.  I think
the "sampler" name came from some idea about reorganizing messages that
never materialized (instead, the LSC came as a much larger cleanup).

Recently we've adopted the term "HDC" for the legacy data cluster, as
opposed to "LSC" for the modern Load/Store Cache.  To make clear which
SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2.

We were also citing the G45, Sandybridge, and Ivybridge PRMs for a
compiler that supports none of those platforms.  Cite modern docs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>
2025-02-27 08:49:24 +00:00
Tapani Pälli
78e5157a9c intel/compiler: add a spec note about L1WT types being uncached
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33755>
2025-02-27 05:38:35 +00:00
Paulo Zanoni
fd10764cff brw: extend the NOP+WHILE workaround
It turns out that we need to add a NOP not only in between two
consecutive WHILE instructions, but also after every control flow
instruction that immediately precedes a WHILE.

v2: Rebase after the renames.

Fixes: 5ca883505e ("brw: add a NOP in between WHILE instructions on LNL")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>
2025-02-26 22:23:16 +00:00
Paulo Zanoni
3596b4e325 brw: add instructions missing from is_control_flow()
I'm not aware of any workloads that will be impacted by this change,
but let's keep our list of control flow instructions complete. A
shader-db run on MTL tells me nothing changes.

v2: "The scheduler relies on HALT not being considered control flow to
be able to move code past HALT instructions. Doing this would prevent
such optimization from happening and would reduce performance
dramatically in some cases." - Francisco.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>
2025-02-26 22:23:16 +00:00
Karol Herbst
dad5ee1039 intel/brw, lp: enable lower_pack_64_4x16
The compiler won't be able to emit pack_64_4x16, so we should prevent
nir_opt_algebraic to optimize to it. This fixes an infinite optimization
loop inside brw_nir_optimize:

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    64    %61999 = pack_64_2x32_split %61995, %61998
    64       %76 = iadd %100, %79
                   @store_global (%61999, %76)

nir_opt_algebraic
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    64    %62001 = pack_64_4x16 %62000
    64       %76 = iadd %100, %79
                   @store_global (%62001, %76)

nir_lower_pack
    16x4     %77 = @load_global (%80)
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    16    %62002 = mov %62000.y
    16    %62003 = mov %62000.x
    32    %62004 = pack_32_2x16_split %62003, %62002
    16    %62005 = mov %62000.w
    16    %62006 = mov %62000.z
    32    %62007 = pack_32_2x16_split %62006, %62005
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

// brw_nir_optimize loops here

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %62004 = pack_32_2x16_split %77.x, %77.y
    32    %62007 = pack_32_2x16_split %77.z, %77.w
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

llvmpipe has a similar issue inside lp_build_opt_nir

Fixes: b1bc691b0f ("nir/algebraic: add and improve pack/unpack patterns")
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33347>
2025-02-26 20:43:39 +00:00
Ian Romanick
495812d8e0 brw/print: Don't let SHADER_OPCODE_FLOW affect indentation
In `fossilize-replay --pipeline-hash 375a63e14afa96c4
fossils/fossil-db/steam-dxvk/f1_22_abu_dhabi.dx12vk-ultra.foz`,
`cf_count` would get decremented below zero. This would lead trying to
print `UINT_MAX` levels of indentation just a few lines below. I ran
out of disk space and patience before that finished. 🤣

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33748>
2025-02-26 19:50:30 +00:00
Lionel Landwerlin
d0c980caa7 brw: avoid setting up the sampler header bits when unused
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33704>
2025-02-26 17:19:04 +00:00
Lionel Landwerlin
8b4f997168 brw: optimize load payload with immediate headers
Currently the condition to use a single MOV is failing on immediate
values, so we emit 2 MOVs in SIMD8 instead of a single SIMD16.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33704>
2025-02-26 17:19:04 +00:00
Alyssa Rosenzweig
ff94b155ab treewide: port remaining nir_metadata_preserve users
apply our semantic patch manually to the remaining users. Coccinelle bailed on
these files for whatever reason, I guess.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>
2025-02-26 15:19:53 +00:00
Alyssa Rosenzweig
9a58a8257e treewide: Switch to nir_progress
Via the Coccinelle patch at the end of the commit message, followed by

sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')
ninja -C ~/mesa/build clang-format
cd ~/mesa/src/compiler/nir && clang-format -i *.c
agxfmt

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -return true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -return false;
    -}
    +bool progress = prog_expr;
    +return nir_progress(progress, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
    +nir_progress(prog, impl, metadata);

    @@
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    -return true;
    +return nir_progress(true, impl, metadata);

    @@
    expression impl;
    @@

    -nir_metadata_preserve(impl, nir_metadata_all);
    -return false;
    +return nir_no_progress(impl);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    -other_prog |= prog;
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +nir_progress(prog, impl, metadata);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -other_prog = true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    identifier prog;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -prog = true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +prog = prog | nir_progress(impl_progress, impl, metadata);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -other_prog = true;
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    identifier prog;
    @@

    -if (prog_expr) {
    -prog = true;
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +prog = prog | nir_progress(impl_progress, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +nir_progress(impl_progress, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    -prog = true;
    +prog = nir_progress(true, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -}
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -}
    +nir_progress(prog, impl, metadata);

    @@
    expression impl;
    @@

    -nir_metadata_preserve(impl, nir_metadata_all);
    +nir_no_progress(impl);

    @@
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    +nir_progress(true, impl, metadata);

squashme! sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>
2025-02-26 15:19:53 +00:00
Sagar Ghuge
6f7a76e9d9 intel/compiler: Zero out the header for texel fetch
It looks like even if we pass the header not present in the sampler descriptor,
it's not helping with the correct behavior of texelFetch.

Experiment on real HW shows that if we just zero out the header and include it
in the message, it helps with the correct behavior. I'm not sure if there is a
valid HW workaround for this one.

We can skip masking the sampler message header bits 4:0 but masking them out
doesn't hurt in this case.

Increasing number of parameter impact sampler performance, For example,
a sample message using 5 parameters will not be able to sustain the same
throughput as a sample message with only 4 valid parameters. We should
look out for any perf impact with respect to texel fetch.

This patch fixes ~3k tests involving texelFetch instruction on Xe3+

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33562>
2025-02-26 00:23:49 +00:00
Caio Oliveira
a030acd7c3 brw: Reformat brw_gram.y and brw_lex.l
Change to use Mesa space indentation.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33739>
2025-02-25 22:57:51 +00:00
Caio Oliveira
7311bcfd6a intel/brw: Don't need to repair CFG in brw_opt_combine_constants
Since a previous change ensured that a DO-block is guaranteed to not be
followed by a DO-block, it is sufficient to pick the next block without
requiring to repair the CFG.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>
2025-02-24 23:25:06 +00:00
Caio Oliveira
d2c39b1779 intel/brw: Always have a (non-DO) block after a DO in the CFG
Make the "block after DO" more stable so that adding instructions after
a DO doesn't require repairing the CFG.  Use a new SHADER_OPCODE_FLOW
instruction that is a placeholder representing "go to the next block"
and disappears at code generation.

For some context, there are a few facts about how CFG currently works

- Blocks are assumed to not be empty;
- DO is always by itself in a block, i.e. starts and ends a block;
- There are no empty blocks;
- Predicated WHILE and CONTINUE will link to the "block after DO";
- When nesting loops, it is possible that the "block after DO" is
  another "DO".

Reasons and further explanations for those are in the brw_cfg.c comments.

What makes this new change useful is that a pass might want to add
instructions between two DO instructions.  When that happens, a new
block must be created and any predicated WHILE and CONTINUE must be
repaired.

So, instead of requiring a repair (which has proven to be tricky in
the past), this change adds a block that can be "virtually" empty but
allow instructions to be added without further changes.

One alternative design would be allowing empty blocks, that would be
a deeper change since the blocks are currently assumed to be not empty
in various places.  We'll save that for when other changes are made to
the CFG.

The problem described happens in brw_opt_combine_constants, and a
different patch will clean that up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>
2025-02-24 23:25:06 +00:00
Caio Oliveira
d32a5ab0e4 intel/brw: Use the builder DO() function in all places
Shorter and a preparation to add some functionality to DO().

Had to make it const since that's the convention for builder, so
just made all the sibling helpers const too.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>
2025-02-24 23:25:06 +00:00
Lionel Landwerlin
ce7208c3ee brw: add support for texel address lowering
The expectations are :
  - no MSAA images
  - a single tiling mode is used when not linear

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>
2025-02-23 15:16:50 +00:00
Lionel Landwerlin
b25e050ec7 brw: add support for 64bit storage images load/store
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>
2025-02-23 15:16:50 +00:00
Lionel Landwerlin
3bd4c5a166 brw: include UGM fence when TGM + lowered image->global
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32676>
2025-02-23 15:16:50 +00:00