Commit graph

4664 commits

Author SHA1 Message Date
Caio Oliveira
84963d6833 intel/brw: Take shader in the brw_generator::generate_code() parameters
Simplify the calls in all the stage compile functions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:20 +00:00
Caio Oliveira
c19a4150b5 intel/brw: Simplify variant tracking in brw_compile_fs
Remove the cfg variables and use the shader pointers directly.  Reset
the variant pointer if a shader failed or will not be used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:20 +00:00
Caio Oliveira
834e30d244 intel/brw: Simplify tracking of dispatch_width_limit in brw_compile_fs
Keep it in a variable, that way don't need to check which shader to look
for the limit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:20 +00:00
Caio Oliveira
9d53e27579 intel/brw: Remove brw_shader::import_uniforms()
The brw_shader::uniforms now is derived from the nir_shader.  The
only exception is compute shaders for older Gfx versions, so we
move the adjust logic for that.

The benefit here is untangling the code for compilation variants,
that before needed to keep track of the first that compiled to,
in most cases, copy an integer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:19 +00:00
Caio Oliveira
b8a35a8a27 brw: Pass per_primitive_offset in brw_shader_params
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:19 +00:00
Caio Oliveira
6ca9021758 brw: Add brw_shader_params
And unify the initialization code for brw_shader.  Avoid passing
brw_compile_params since for a single compilation we might have
multiple shaders (the case for BS stage).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33541>
2025-08-28 00:06:18 +00:00
Caio Oliveira
1c933b6511 brw: Fix checking sources of wrong instruction in opt_address_reg_load
Fixes: 8ac7802ac8 ("brw: move final send lowering up into the IR")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37019>
2025-08-27 22:50:23 +00:00
Lionel Landwerlin
93996c07e2 brw: fix broadcast opcode
The problem with the current code is that there is a disconnect between :
   - the virtual register size allocated
   - the dispatch size
   - the size_written value

Only the last 2 are in sync and this confuses the spiller that only
looks at the destination register allocation & dispatch size to figure
out how much to spill.

The solution in this change is to make BROADCAST more like
MOV_INDIRECT, so that you can do a BROADCAST(8) that actually reads a
SIMD32 register. We put the size of the register read into src2.

Now the spiller sees correct read/write sizes just looking at the
destination register & dispatch size.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 662339a2ff ("brw/build: Use SIMD8 temporaries in emit_uniformize")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13614
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36564>
2025-08-28 00:23:44 +03:00
Lionel Landwerlin
e6ca709a4e brw: fix INTEL_DEBUG=spill_fs
We need to dirty the instruction BRW_DEPENDENCY_INSTRUCTIONS &
BRW_DEPENDENCY_VARIABLES if anything was spilled.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a6b0783375 ("brw: Use brw_ip_ranges in scheduling / regalloc")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13233
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36925>
2025-08-27 15:08:35 +00:00
Lionel Landwerlin
3362b8dcb5 brw: use a scalar builder for the load_payload on transpose loads
I noticed SIMD32 shaders have that kind of pattern :

mov(32)         g94<1>D         0D                              { align1 WE_all };
send(1)         g15UD           g94UD           nullUD          0x6210d500                0x02010000
                ugm MsgDesc: ( load, a32, d32, V16, transpose, L1STATE_L3MOCS dst_len = 1, src0_len = 1, src1_len = 0 bti )  BTI 2  base_offset 16  { align1 WE_all 1N I@5 $1 };

Why use a 32 wide register for a SEND that is only going to read the first lane?

We can stick a single physical register and reduce register pressure.

DG2 fossils-db results :

Totals:
Instrs: 157417515 -> 157417796 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15362185116 -> 15363086774 (+0.01%); split: -0.05%, +0.05%
Max live registers: 29059141 -> 29051166 (-0.03%)
Max dispatch width: 5071256 -> 5075720 (+0.09%); split: +0.33%, -0.24%

Totals from 82132 (14.43% of 569221) affected shaders:
Instrs: 26564632 -> 26564913 (+0.00%); split: -0.00%, +0.00%
Cycle count: 4630907475 -> 4631809133 (+0.02%); split: -0.16%, +0.18%
Max live registers: 5425037 -> 5417062 (-0.15%)
Max dispatch width: 128384 -> 132848 (+3.48%); split: +12.92%, -9.45%

LNL fossils-db results :

Totals:
Instrs: 141870413 -> 141870745 (+0.00%); split: -0.00%, +0.00%
Cycle count: 20176018818 -> 20191262632 (+0.08%); split: -0.07%, +0.14%
Max live registers: 44858167 -> 44838370 (-0.04%)

Totals from 51859 (10.55% of 491590) affected shaders:
Instrs: 16834547 -> 16834879 (+0.00%); split: -0.00%, +0.00%
Cycle count: 5761980106 -> 5777223920 (+0.26%); split: -0.24%, +0.50%
Max live registers: 5893878 -> 5874081 (-0.34%)

Perf A/B testing only reported a 0.5% improvement on DG2 on one trace, no changes on BMG.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36958>
2025-08-26 12:03:22 +00:00
Lionel Landwerlin
27c69acb6a brw: remove uniform from opt_offsets
Those are for push constants, no point in doing that because :
   - there is no HW constant offsets in push constants (payload
     delivery), it's just register offset calculation
   - if we have an dynamic value it's already using MOV_INDIRECT

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e103afe7be ("brw: run the nir_opt_offsets pass and set the maximum offset size")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36958>
2025-08-26 12:03:22 +00:00
Konstantin Seurer
9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Caio Oliveira
74a4e7dd4b brw: Fix folding case for MAD instruction with all immediates
Fixes: b605f76b2a ("brw/algebraic: Constant fold multiplicands of MAD")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36867>
2025-08-21 17:19:18 +00:00
Caio Oliveira
eec64c865f brw: Add disabled test for MAD constant folding
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36867>
2025-08-21 17:19:18 +00:00
Calder Young
c7e48f79b7 brw,anv: Reduce UBO robustness size alignment to 16 bytes
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Instead of being encoded as a contiguous 64-bit mask of individual registers,
the robustness information is now encoded as a vector of up to 4 bytes that
represent the limits of each of the pushed UBO ranges in 16 byte units.
Some buggy Direct3D workloads are known to depend on a robustness alignment
as low as 16 bytes to work properly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>
2025-08-21 09:04:55 +00:00
Lionel Landwerlin
2281e88381 brw: make assign_curb_setup visible in optimizer debug
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>
2025-08-21 09:04:54 +00:00
Lionel Landwerlin
df37c7ca74 brw: fix analysis dirtying with pulled constants
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5c17299084 ("brw: enable A64 pulling of push constants")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>
2025-08-21 09:04:53 +00:00
Marek Olšák
c601308615 nir: convert nir_instr_worklist to init/fini semantics w/out allocation
This removes the malloc overhead.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:49 +00:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Lionel Landwerlin
fe38fb858c brw: workaround broken indirect RT messages on Gfx11
Unfortunately we cannot use the indirect descriptor on Gfx11, it
appears to just drop writes. Other platforms appear to be fine.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
Lionel Landwerlin
a0844458b8 brw: enable opt_register_coalesce to work with multiple EOT blocks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
Lionel Landwerlin
c4c7ff3f8f brw: enable register allocation to deal with multiple EOTs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
Caio Oliveira
4fda724fd4 brw: Avoid invalid access when compacting out-of-bounds JIP/UIP
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Usually JIP will be valid, but as part of other changes, it will be
possible to have a shader that have multiple EOT messages and end with
and ENDIF instruction.  Its JIP will point after the program ends.
This is fine but was tripping up the compaction code.

Change compaction to not read its internal structures beyond the last
instruction.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36822>
2025-08-20 00:54:41 +00:00
Caio Oliveira
148063670d brw: If the instruction is already a SEND, no need to resize sources
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Kept an assert as a placeholder in case we had something odd going on
that this code was protecting.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>
2025-08-19 13:54:43 +00:00
Caio Oliveira
cebac156c4 brw: Only access valid sources in lower_btd_logical_send()
Only the SHADER_OPCODE_BTD_SPAWN_LOGICAL has sources, so
only reach for them when handling that instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>
2025-08-19 13:54:43 +00:00
Caio Oliveira
dc960936fc brw: Move resize_sources() earlier when lowering FIND_LIVE_CHANNELS
Move it before the new source is used.  This currently works because all
instructions have a minimum amount of sources allocated, but a later commit
will change that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>
2025-08-19 13:54:43 +00:00
Caio Oliveira
fe2e2fabcd brw: Make sure copied instruction don't copy the list pointers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>
2025-08-19 13:54:43 +00:00
Caio Oliveira
5a34f676a5 brw: Define order for fixes in 3-src operand fix
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>
2025-08-19 13:54:43 +00:00
Sagar Ghuge
49b917baaf intel/compiler: Fix ray geometry index
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We have only 24-bit wide geometry index, not the 28-bit wide.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36796>
2025-08-19 09:32:55 +00:00
Matt Turner
6fd4dc353c elk/algebraic: Protect SHUFFLE from OOB indices
Akin to b67230de63 ("intel/fs: Protect opt_algebraic from OOB BROADCAST
indices"), we need to protect SHUFFLE as well.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36779>
2025-08-19 09:15:19 +00:00
Matt Turner
b4b692c486 brw/algebraic: Protect SHUFFLE from OOB indices
Akin to b67230de63 ("intel/fs: Protect opt_algebraic from OOB BROADCAST
indices"), we need to protect SHUFFLE as well.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13351
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36779>
2025-08-19 09:15:19 +00:00
Lionel Landwerlin
c871a62a75 brw: move URB channel mask shifting to the lowering pass
For example Xe2 uses the LSC and doesn´t need the shifting, so let's
just apply it where it's needed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36757>
2025-08-13 12:01:49 +00:00
Lionel Landwerlin
68838d7001 brw: reorder reloc enums to leave embedded samplers at the end
So that the driver can allocate an array of relocations using
BRW_SHADER_RELOC_EMBEDDED_SAMPLER_HANDLE + number_of_embedded_samplers

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36757>
2025-08-13 12:01:49 +00:00
Lionel Landwerlin
46c16f854e brw: compute consistent clip/cull distance masks with VUE
We can optimize the VUE layout in cases where all shaders are compiled
together and some outputs are unused. So we need to have consistent
clip/cull_distance_mask with the VUE.

Previously we could have a VUE without ClipDistance present in the
header and yet have a non zero clip_distance_mask. This would trip the
HW into taking into account a VUE field that doesn't exist.

Here we set the clip/cull_distance_mask to 0 if the associated output
is not written by the shader. The written outputs are always
consistent with what's in the VUE.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2d396f6085 ("intel: prepare VUE layout for more than 2 layouts")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13685
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36734>
2025-08-13 06:24:44 +00:00
Sagar Ghuge
cac3b4f404 anv: Mask off excessive invocations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For unaligned invocations, don't launch two COMPUTE_WALKER, instead we
can mask off excessive invocations in the shader itself at nir level and
launch one additional workgroup.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36245>
2025-08-12 23:17:02 +00:00
Kenneth Graunke
5e9de5317e brw: Validate that send payloads can't be imms or have source mods
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
To ensure we haven't missed resolving these things.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:11 +00:00
Kenneth Graunke
22165defb5 brw: Drop interlock and memory fence logical opcodes from is_payload()
These are lowered to sends prior to any callers of this helper.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:11 +00:00
Kenneth Graunke
ed4fadbb16 brw: Drop INTERPOLATE_AT_* opcodes from is_payload()
These are lowered to sends prior to any callers of this helper.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:10 +00:00
Kenneth Graunke
e2022017ce brw: Drop uniform pull constant load virtual opcode from is_send()
The logical send lowering already resolves sources when constructing
the send payload, so prior to that lowering, we don't need to apply
any special restrictions here.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:10 +00:00
Kenneth Graunke
9d5cd03ea8 brw: Drop interlock and memory fence logical opcodes from is_send()
The logical send lowering already resolves sources when constructing
the send payload, so prior to that lowering, we don't need to apply
any special restrictions here.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:09 +00:00
Kenneth Graunke
342ff81df0 brw: Drop INTERPOLATE_AT_* opcodes from is_send()
The goal here was to avoid propagating source modifiers, unusual
regions, and other things that couldn't be used as a send source.

A few patches ago ("brw: Properly resolve non-sendable sources in a few
logical opcodes") we fixed the logical send lowering to handle these
by resolving them when constructing the send payload.  So now prior
to lowering, we don't need to treat these opcodes specially.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:08 +00:00
Kenneth Graunke
47fe9d28e7 brw: Enumerate SHADER_OPCODE_SEND sources and standardize how many
This introduces enums for SHADER_OPCODE_SEND[_GATHER] sources, similar
similar to what we've done for most of the newer logical opcodes.  This
allows us to use actual names for sources rather than remembering their
order, or leaving ourselves comments like /* ex_desc */ all over.  It
will also make it easier to add or reorder sources in the future.

While we're at it, we also standardize on the number of sources.
Previously, we allowed SHADER_OPCODE_SEND to have either 3 (monosend) or
4 (split send) sources, but this is mostly for haphazard historical
reasons.  We now specify all sources every time, eliminating the need
for careful inst->source checks before accessing the last source.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:08 +00:00
Kenneth Graunke
00d38b980d brw: Properly resolve non-sendable sources in a few logical opcodes
Sources decorated with source modifiers, immediates, or particular
stride combinations may not be directly usable as SEND operands.  We
have to resolve them to an ordinary VGRF first.

Most opcodes do this as part of broader payload construction, but these
send directly because the messages are very simple.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:06 +00:00
Kenneth Graunke
b848fa4595 brw: Rename is_send_from_grf to is_send, replace other is_send() helper
The is_send() helper is just a wrapper around inst->is_send_from_grf()
now, so we can combine the two.  Trim the name from is_send_from_grf()
to is_send(), as it's shorter, and also matches is_math().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
e7d20bc86a brw: Drop inst->mlen check from is_send()
We used to have inst->mlen set on various virtual opcodes, but these
days the only instructions that should have inst->mlen set are
SHADER_OPCODE_SEND and SHADER_OPCODE_SEND_GATHER, which are already
covered in inst->is_send_from_grf().  So we don't need to check for mlen
specifically.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
3c455c3532 brw: Stop using is_send_from_grf() in CSE pass
Explicitly list FS_OPCODE_INTERPOLATE_AT_* as allowed, as they were
already allowed by the default case.  Interlock, memory fence, and
barrier were disallowed and remain so.  Uniform pull constant load
was allowed and remains so.  SHADER_OPCODE_SEND and SEND_GATHER get
explicit handling.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
e5ed6f64d9 brw: Stop checking inst->is_send_from_grf() for g127 register hack
Every case but SHADER_OPCODE_SEND and SHADER_OPCODE_BARRIER will be
lowered to SEND before register allocation happens.  And the barrier
send has a null destination, so the restriction doesn't apply.

Note that this hack is for Gfx9 only, so we don't need to worry about
Xe3's SHADER_OPCODE_SEND_GATHER feature.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
b0eb90ddb1 brw: Assert that EOT is always SHADER_OPCODE_SEND on pre-Xe3
We used to have other opcodes as well, but we've since transitioned
entirely to logical send lowering prior to register allocation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
90dbbc69bb brw: Use BAD_FILE instead of ARF null for second send payload
A number of places emit monolithic sends, where the second payload is
empty.  Some places were using a BAD_FILE register, while others were
specifying the hardware ARF null register.  Switch to BAD_FILE for
consistency - this is usually what we do for "source isn't present".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:04 +00:00
Lionel Landwerlin
6d863fda2d anv/brw: move sample_shading_enable to wm_prog_data
The vulkan runtime doesn´t store this parameter in the dynamic state
(since it's not a dynamic state). Just capture it at compile time and
leave on the wm_prog_data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36665>
2025-08-08 14:06:58 +00:00