Commit graph

18 commits

Author SHA1 Message Date
Lionel Landwerlin
1d1866a84b brw: apply same workaround to spawn than trace opcode
Working around BRW's limitations

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>
2026-01-20 21:25:52 +00:00
Lionel Landwerlin
6d19b898e7 anv/brw: prep work for SIMD32 ray queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>
2026-01-12 12:19:21 +00:00
José Roberto de Souza
518705a4fe intel/brw: Split to a function the code that calculate sampler channels that should be written
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This block of code will be re-used in a future patch, also it reduces a bit the
size and complexity of lower_sampler_logical_send().
No changes in behavior intended here.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38792>
2025-12-15 06:57:15 -08:00
Kenneth Graunke
bdacab49c7 brw: Use LSC extended descriptor offsets for Xe2 URB messages
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
URB messages on Xe2 are LSC messages with FLAT addressing.  We can
specify a S19 immediate offset in the extended message descriptor,
which should be more than adequate to hold any offsets we need.

We wrote the original URB code before implementing that, and never
doubled back to take advantage of it.  But doing so can drop ADDs
near every URB access.

fossil-db results on Battlemage:

   Totals:
   Instrs: 232239759 -> 231432254 (-0.35%)
   Cycle count: 34044435848.0 -> 34055507100.0 (+0.03%); split: -0.00%, +0.04%
   Spill count: 520370 -> 520362 (-0.00%); split: -0.00%, +0.00%
   Fill count: 470790 -> 470803 (+0.00%); split: -0.00%, +0.00%
   Max live registers: 72111853 -> 72111369 (-0.00%); split: -0.00%, +0.00%

   Totals from 227920 (28.89% of 788851) affected shaders:
   Instrs: 59841897 -> 59034392 (-1.35%)
   Cycle count: 683385208.0 -> 694456460.0 (+1.62%); split: -0.14%, +1.76%
   Spill count: 17278 -> 17270 (-0.05%); split: -0.10%, +0.06%
   Fill count: 17481 -> 17494 (+0.07%); split: -0.03%, +0.10%
   Max live registers: 23052652 -> 23052168 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
2025-12-12 00:12:03 +00:00
Kenneth Graunke
9482d392a1 brw: Fix outdated comments about urb->offset units
I recently converted urb->offset to be in bytes on Xe2, but neglected to
update these comments that still said OWord.

Fixes: 9ffae42975 ("brw: Store brw_urb_inst::offset in bytes on Xe2")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
2025-12-12 00:12:03 +00:00
Lionel Landwerlin
d51c0b8988 brw: fix SS surfaces usage
In 80c89909f3 ("brw: fixup immediate bindless surface handling") I
forgot that we have a special usage for the only _SS surface (the
scratch surface).

Because it's only delivered in the 31:10 bits of R0 and because we
want to minimize the amount of shader instructions for scratch
messages, the surface offset in shifted right by the driver to align
things properly for the 31:6 extended descriptor format.

This is unfortunately incompatible with the full 32bit format of
ExBSO. So this surface type currently cannot be considered bindless.

We might revisit later if we start using _SS surfaces for other
things.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 80c89909f3 ("brw: fixup immediate bindless surface handling")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38618>
2025-11-24 16:12:27 +00:00
Lionel Landwerlin
80c89909f3 brw: fixup immediate bindless surface handling
This is unused at the moment but the backend incorrectly assumes
immediate handles are for the binding table (therefore not bindless).

Some new CTS tests are using an immediate bindless handle which is
broken.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38359>
2025-11-14 00:24:55 +00:00
Kenneth Graunke
9ffae42975 brw: Store brw_urb_inst::offset in bytes on Xe2
Xe2 uses byte offsets rather than OWord offsets.  We've been storing the
per-slot offsets in bytes on Xe2 for a while, but kept the global offset
immediate in OWords for some reason, choosing to lower it during logical
send lowering.

This patch makes both offsets (global immediate, per-slot) in the same
units, so they could be added together if necessary without scaling.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
2025-11-11 10:55:44 +00:00
Lionel Landwerlin
51893699a2 brw: stop emitting flush operations for begin/end interlock
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
NIR barrier intrinsics are already added for required flushing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38242>
2025-11-06 09:33:25 +02:00
Lionel Landwerlin
c5d313a2a8 brw: handling dynamic programmable offsets pre-Xe2
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
2025-10-21 06:13:10 +00:00
Lionel Landwerlin
b722e17203 brw: get rid of GET_BUFFER_SIZE opcode
Rely on RESINFO which is what was used already.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:16 +00:00
Lionel Landwerlin
efcba73b49 brw: switch to new sampler payload description scheme
Instead of having abstracted opcodes, we target directly the HW format
at the NIR translation.

The payload description gives us the order of the payload sources (we
can use that for pretty printing) and we don't have to have a
complicated scheme in the logical send lowering for the ordering. All
we have to do is build the header if needed as well as the descriptors.

PTL Fossil-db stats:
 Totals from 66759 (13.54% of 492917) affected shaders:
 Instrs: 44289221 -> 43957404 (-0.75%); split: -0.81%, +0.06%
 Send messages: 2050378 -> 2042607 (-0.38%)
 Cycle count: 3878874713 -> 3712848434 (-4.28%); split: -4.44%, +0.16%
 Max live registers: 8773179 -> 8770104 (-0.04%); split: -0.06%, +0.03%
 Max dispatch width: 1677408 -> 1707952 (+1.82%); split: +1.85%, -0.03%
 Non SSA regs after NIR: 11407821 -> 11421041 (+0.12%); split: -0.03%, +0.15%
 GRF registers: 5686983 -> 5838785 (+2.67%); split: -0.24%, +2.91%

LNL Fossil-db stats:

 Totals from 57911 (15.72% of 368381) affected shaders:
 Instrs: 39448036 -> 38923650 (-1.33%); split: -1.41%, +0.08%
 Subgroup size: 1241360 -> 1241392 (+0.00%)
 Send messages: 1846696 -> 1845137 (-0.08%)
 Cycle count: 3834818910 -> 3784003027 (-1.33%); split: -2.33%, +1.00%
 Spill count: 21866 -> 22168 (+1.38%); split: -0.07%, +1.45%
 Fill count: 59324 -> 60339 (+1.71%); split: -0.00%, +1.71%
 Scratch Memory Size: 1479680 -> 1483776 (+0.28%)
 Max live registers: 7521376 -> 7447841 (-0.98%); split: -1.04%, +0.06%
 Non SSA regs after NIR: 9744605 -> 10113728 (+3.79%); split: -0.01%, +3.80%

Only 2 titles negatively impacted (spilling) :
  - Shadow of the Tomb Raider
  - Red Dead Redemption 2

All impacted shaders were already spilling.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:15 +00:00
José Roberto de Souza
19de4b82f9 intel/brw: Store and set sfid in memory fences
sfid is another field that is not preserved after brw_transform_inst_to_send()
so we need to store it before transform and retore it to preserve the sfid value.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
a259f64595 intel/brw: Call lower_hdc_memory_fence_and_interlock() with brw_send_inst
With that we can avoid some as_send() calls.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
5b4deb7d2d intel/brw: Fix LSC fence scope and flush type
Opcodes SHADER_OPCODE_INTERLOCK and SHADER_OPCODE_MEMORY_FENCE are emitted as
brw_send_inst and at nir to brw conversion the desc field is set with scope and
flush type of the instruction.
But when brw_inst is converted to brw_send_inst all special fields of
brw_send_inst are set to 0, causing scope and flush type to always be 0.

So here calling lower_lsc_memory_fence_and_interlock() with brw_send_inst
parameter and storing the desc before brw_transform_inst_to_send().

I still have not figure out why we need do brw_transform_inst_to_send() even
if it is already a brw_send_inst but not doing so causes a segfault in
foreach_block_and_inst_safe(block, brw_inst, inst, s.cfg) of
brw_lower_logical_sends(), also other opcodes of that function does something
similar so I don't think that is wrong.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
6e02351c58 intel/brw: Share mode code in lower_lsc_varying_pull_constant_logical_send()
By dynamic setting num_channel we can share more code in
lower_lsc_varying_pull_constant_logical_send().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37853>
2025-10-14 18:26:39 +00:00
Lionel Landwerlin
37a9c5411f brw: serialize messages on Gfx12.x if required
The Intel EU fusion feature needs to be disabled on SEND messages
where either the texture handle, sampler handle, sampler header is not
identical on fused threads.

This is the case in particular with accesses on non-uniform
texture/sampler handles but could also strike with dynamic
programmable offsets (currently disabled).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>
2025-10-10 11:19:39 +00:00
Kenneth Graunke
73cbb35442 brw: Move into a new src/intel/compiler/brw subdirectory
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This keeps the directory structure a bit more organized:
- brw specific code
- elk specific code
- common NIR passes that could be used in both places

It also means that you can now 'git grep' in the brw directory without
finding a bunch of elk code, or having to "grep thing b*".

Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
2025-10-09 07:01:47 +00:00
Renamed from src/intel/compiler/brw_lower_logical_sends.cpp (Browse further)