Commit graph

5257 commits

Author SHA1 Message Date
Kenneth Graunke
66fbfe7bf3 brw: Fix single patch thread dispatch masks in NIR
Arguably a little more code but it brings us a bit closer to not
needing separate per-stage "run" functions.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Kenneth Graunke
4a9aa3ecc4 brw: Combine brw_assign_*_urb_setup() into one function
They all do exactly the same thing, except that GS multiplies by an
extra factor, and TCS has urb_read_length == 0 so it skips one line.

No need for four copies.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Kenneth Graunke
7d463a45f7 brw: Simplify GS load_invocation_id handling
Just return the register instead of having multiple functions stash the
register in an array of registers.  Way too much hoopla here.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Kenneth Graunke
9933882182 brw: Purge source_depth_to_render_target
This was used for Gfx4-5.  Since then, we're just passing around a
boolean that nobody wants.  Even if someone did, a better plan is to
just check nir->info directly.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Sagar Ghuge
cb423ee636 anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921
WA states that we need to allocate maximum number of stackIDs per DSS
from RT_DISPATCH_GLOBALS to 2048.

We can still throttle/control the CFE_STATE::StackID to be in range
specified by the field.

This does impact performance having CFE_STATE::stackIDs capped to 2K
by default. More the outstanding ray queries, larger the working set and
have more impact on cache hit rate.

This affect performance on Xe2+ onwards:
* Boundary Benchmark:            36.2%
* Solar Bay extreme:             9.8%
* Hitman world of assassination: 3.9%

Fixes: c1a44e8d43 ("anv: force StackIDControl value for Wa_14021821874")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40310>
2026-03-10 22:41:54 +00:00
Lionel Landwerlin
f508c6acbb brw/nir: improve shader_indirect_data_intel handling
Use is_scalar to know if we can do transpose loading.

Also enable vectorization if 2 intrinsics share the same source (it
means the only difference is the base).

Fixes: e14d6b535c ("brw/nir: add new intrinsics to load data from the indirect address")
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>
2026-03-10 18:24:04 +00:00
Paulo Zanoni
85751506ab elk: don't use instr->const_index[] directly
From what I understand, use of const_index[] by the driver is
dangerous and should be avoided, as commits such as a6330ed4d0
("nir: add ACCESS to load_uniforms") may result in the indexes
changing, breaking the driver. Switch to using the parameter names in
order to make the code more future-proof.

For elk_fs_nir.cpp and elk_vec4_tes.cpp we can verify in the generated
nir_intrinsics.c that the wanted value is actually
nir_intrinsic_base().

For elk_nir.c, according to Caio Oliveira:

  "The code is checking for certain load/store via the is_input() and
   is_output() checks a few lines above. I've checked all them have
   BASE at 0."

Thanks to Ian Romanick for his guidance regarding this patch.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39438>
2026-03-10 01:03:42 +00:00
Ian Romanick
ffd4497e48 brw/asm: Don't drop accumulator number in the assembler
Previously "acc1" or "acc2" would be stored as acc0.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:39 +00:00
Ian Romanick
1ae7a82811 brw: Fix encoding of accumulator sources of 3-source instructions
Previously the accumulator was always forced to be acc0.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:39 +00:00
Ian Romanick
6531c425a0 brw/emit: Src1 can be accumulator on Gfx12.5 and newer
v2: Add Bspec reference number. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:39 +00:00
Ian Romanick
c3a5b62c08 brw/validate: Perform more 3-src validation in brw_validate instead of brw_eu_emit
v2: s/Lake/Ice Lake/ in a comment. Noticed by Caio. Add a missing Xe2
Bspec reference number. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:39 +00:00
Ian Romanick
1f45e33072 brw/validate: Implicit read of accumulator cannot also have explicit read
v2: Add Bspec reference number. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:38 +00:00
Ian Romanick
8a6de2d973 brw/validate: Eliminate duplicate integer multiply validation
I think two MRs must have crossed in the mail so to speak. Keep Caio's
formatting and error message, and keep my PRM quote.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>
2026-03-09 19:21:38 +00:00
Ian Romanick
64c60582b5 elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
shader-db:

Broadwell
total instructions in shared programs: 18607516 -> 18607530 (<.01%)
instructions in affected programs: 2095 -> 2109 (0.67%)
helped: 0 / HURT: 8

total cycles in shared programs: 955704436 -> 955702925 (<.01%)
cycles in affected programs: 34299 -> 32788 (-4.41%)
helped: 2 / HURT: 6

All Haswell and older platforms had similar results. (Haswell shown)
total instructions in shared programs: 16989200 -> 16989201 (<.01%)
instructions in affected programs: 461 -> 462 (0.22%)
helped: 0 / HURT: 1

total cycles in shared programs: 946537070 -> 946537035 (<.01%)
cycles in affected programs: 16378 -> 16343 (-0.21%)
helped: 1 / HURT: 0

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40284>
2026-03-09 18:41:55 +00:00
Ian Romanick
6c6c6ce054 brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
This optimization was added in October 2013, and the error was only just
now discovered. Removing the SEL.G.SAT optimization affected zero
shader-db shaders, and it affected 9 fossil-db shaders for instruction
size only.

I haven't checked to see if any of the hurt shaders are helped by
!39987.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17093041 -> 17093055 (<.01%)
instructions in affected programs: 2072 -> 2086 (0.68%)
helped: 0 / HURT: 8

total cycles in shared programs: 876739578 -> 876739154 (<.01%)
cycles in affected programs: 18946 -> 18522 (-2.24%)
helped: 2 / HURT: 6

fossil-db:

Lunar Lake
Totals:
Instrs: 906230557 -> 906240487 (+0.00%); split: -0.00%, +0.00%
CodeSize: 14498856128 -> 14499003168 (+0.00%); split: -0.00%, +0.00%
Send messages: 40667184 -> 40667205 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104068494103 -> 104068561943 (+0.00%); split: -0.00%, +0.00%
Max live registers: 189570192 -> 189570204 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48157648 -> 48157552 (-0.00%)
Non SSA regs after NIR: 139823587 -> 139823016 (-0.00%); split: -0.00%, +0.00%

Totals from 9172 (0.46% of 1985212) affected shaders:
Instrs: 10774709 -> 10784639 (+0.09%); split: -0.00%, +0.09%
CodeSize: 177868384 -> 178015424 (+0.08%); split: -0.08%, +0.17%
Send messages: 311154 -> 311175 (+0.01%); split: -0.00%, +0.01%
Cycle count: 232471392 -> 232539232 (+0.03%); split: -0.15%, +0.18%
Max live registers: 1243549 -> 1243561 (+0.00%); split: -0.00%, +0.01%
Max dispatch width: 196672 -> 196576 (-0.05%)
Non SSA regs after NIR: 509663 -> 509092 (-0.11%); split: -0.19%, +0.08%

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40284>
2026-03-09 18:41:55 +00:00
Lionel Landwerlin
9f2215b480 anv/brw: remove push constant load emulation from the backend compiler
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Anv is responsible for much of how the data is accessed (where is the
push constant base pointer located, etc...), so move the memory load
there.

Fossildb on LNL :
Totals from 135931 (8.65% of 1572134) affected shaders:
Instrs: 68518228 -> 67142101 (-2.01%); split: -2.05%, +0.05%
CodeSize: 1123507040 -> 1092022560 (-2.80%); split: -2.88%, +0.08%
Subgroup size: 32 -> 16 (-50.00%)
Send messages: 4401584 -> 4402565 (+0.02%); split: -0.02%, +0.04%
Cycle count: 4626573038 -> 4619434858 (-0.15%); split: -0.89%, +0.74%
Spill count: 451759 -> 452407 (+0.14%); split: -0.43%, +0.57%
Fill count: 374513 -> 377440 (+0.78%); split: -0.76%, +1.54%
Max live registers: 15788042 -> 15791399 (+0.02%); split: -0.05%, +0.08%
Max dispatch width: 3349408 -> 3346192 (-0.10%); split: +0.09%, -0.19%
Non SSA regs after NIR: 9477038 -> 9498328 (+0.22%); split: -0.27%, +0.50%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Lionel Landwerlin
e14d6b535c brw/nir: add new intrinsics to load data from the indirect address
This address is delivered on Gfx12.5+ in compute/mesh/task shaders
from the command stream instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Lionel Landwerlin
7b1533414a brw/nir: enable constant offsets for global_constant_uniform_block_intel
Will be useful to retain the base offset added in 0e9453291c ("brw:
improve push constant loading using base offsets") once we move push
constant data loading into NIR.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Lionel Landwerlin
d7c64af78e brw: use scalar build for immediate offsets
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Ian Romanick
8624da56ee brw: Also check for ADDRESS file in update_for_reads
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.

Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
2026-03-05 00:02:51 +00:00
Ian Romanick
366410e913 brw: Use brw_reg_is_arf in update_for_reads
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
2026-03-05 00:02:51 +00:00
Ian Romanick
a548466186 brw: Don't mark_invalid in update_for_reads for non-VGRF destination
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.

v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
2026-03-05 00:02:50 +00:00
Alyssa Rosenzweig
ef2a95a40a brw: move brw_can_coherent_fb_fetch to a C header
this isn't C++ brw code, it's just a devinfo query.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>
2026-03-02 12:44:42 +00:00
Alyssa Rosenzweig
d6d1dc5822 brw: move brw_nir_pack_vs_input to brw_nir.c
It's just a pass like the others.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>
2026-03-02 12:44:42 +00:00
Caio Oliveira
df4042371f anv: Set PIPELINE_SELECT systolic mode based on shader usage
For Gfx125 workloads that use systolic mode, this might mean
an extra PIPELINE_SELECT when flipping between a compute shader
that use the mode and another that doesn't use the mode
(or vice-versa).

Reviewed-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>
2026-02-26 19:05:56 +00:00
Caio Oliveira
ffc3219d57 brw: Add lowering for nir_cmat_call_op_per_element_op
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39904>
2026-02-26 18:45:20 +00:00
Caio Oliveira
63e1592f8d brw/scoreboard: Don't track dependencies for UNDEFs
Dependencies in UNDEFs were already not propagated by
update_inst_scoreboard(), since the instruction there
was not consider neither ordered or unordered; and also
not being used to resolve implicit dependencies.

The generator was already ignoring any baked dependency
but for cases where UNDEF had two dependencies, a sync nop
would be generated -- which would be redundant with a
later sync nop.

Since we know UNDEFs have no dependencies, stop treating
them specially when trimming dependencies.

This patch remove this particular class of redundant sync nops.
No functional change is expected.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39875>
2026-02-26 06:54:48 +00:00
Lionel Landwerlin
487586fefa anv: implement inline parameter promotion from push constants
Push constants on bindless stages of Gfx12.5+ don't get the data
delivered in the registers automatically. Instead the shader needs to
load the data with SEND messages.

Those stages do get a single InlineParameter 32B block of data
delivered into the EU. We can use that to promote some of the push
constant data that has to be pulled otherwise.

The driver will try to promote all push constant data (app + driver
values) if it can, if it can't it'll try to promote only the driver
values (usually a shader will only use a few driver values). If even
the drivers values won't fit, give up and don't use the inline
parameter at all.

LNL internal fossil-db:

Totals from 315738 (20.08% of 1572649) affected shaders:
Instrs: 155053691 -> 154920901 (-0.09%); split: -0.09%, +0.00%
CodeSize: 2578204272 -> 2574991568 (-0.12%); split: -0.15%, +0.02%
Send messages: 8235628 -> 8184485 (-0.62%); split: -0.62%, +0.00%
Cycle count: 43911938816 -> 43901857748 (-0.02%); split: -0.05%, +0.03%
Spill count: 481329 -> 473185 (-1.69%); split: -1.82%, +0.13%
Fill count: 405617 -> 399243 (-1.57%); split: -1.86%, +0.28%
Max live registers: 34309395 -> 34309300 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 8298224 -> 8299168 (+0.01%)
Non SSA regs after NIR: 18492887 -> 17631285 (-4.66%); split: -4.73%, +0.08%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:09 +00:00
Lionel Landwerlin
7f19814414 brw/nir: handle inline_data_intel more like push_data_intel
It's pretty much the same mechanism, except it's a different register
location.

With this change we gain indirect loading support.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:09 +00:00
Caio Oliveira
922e3c75cf brw: Explicitly set group=0 in generator for SYNC used in workaround
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Instead of using whatever group was set by the previous
instruction.  No behavior change, just normalizes what
we generate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>
2026-02-20 17:11:59 +00:00
Caio Oliveira
4382d51cd0 brw: Make brw_builder::uniform() ignore previous group
The `group()` helper creates the new builder "relative" to the existing
one, so this was resulting in some uniform instructions having
a non-zero channel offset ("group") -- which was surprising and had no
practical effect.

Normalize to always use group = 0.  No change in behavior expected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>
2026-02-20 16:50:41 +00:00
Ian Romanick
da1fd9786b elk/cmod: Don't propagate from CMP to ADD if there is a write between
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

No shader-db changes on any ELK platform. I suspect the problematic
cases only occur after scheduling has rearranged instructions. This is
likely the reason BRW didn't experience this problem until 09450faf.

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
2026-02-19 21:28:55 +00:00
Ian Romanick
bdbfe8de4d elk/cmod: Don't propagate from CMP to possible Inf + (-Inf)
This is a backport of BRW e26270249b.

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18623918 -> 18624594 (<.01%)
instructions in affected programs: 125179 -> 125855 (0.54%)
helped: 0 / HURT: 139

total cycles in shared programs: 957073100 -> 957072484 (<.01%)
cycles in affected programs: 16534168 -> 16533552 (<.01%)
helped: 42 / HURT: 68

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
2026-02-19 21:28:54 +00:00
Ian Romanick
d1614cd6db brw/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

shader-db:

Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3

total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1

Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11

total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%

Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%

Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
2026-02-19 21:28:54 +00:00
José Roberto de Souza
39ec9e3448 intel/brw: Add and call brw_lsc_supports_base_offset() in places that checks for support of this feature
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>
2026-02-19 16:53:03 +00:00
José Roberto de Souza
91c5744e25 intel/brw: Use computed push constants size in brw_assign_urb_setup()
It was already computed in brw_shader::assign_curb_setup() so we can use it
in brw_assign_urb_setup().

There was a mismatch between assign_curb_setup() and brw_assign_urb_setup() when
push_sizes were not multiple of REG_SIZE, the first one was aligning every
push_sizes before sum it, while brw_assign_urb_setup() was only aligning the sum
of all push_size.

By luck the only places that did not had a push_size aligned to REG_SIZE only
had one push_size, so this was not an issue.

So here also fixing this mismatch and adding an assert to caught any future
mismatch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>
2026-02-19 16:53:03 +00:00
Alyssa Rosenzweig
5386e93865 brw: use data helper
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Kenneth Graunke
1478329c53 iris: Move ALT mode handling from brw to iris
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We just read this from the NIR and store it in iris_compiled_shader,
there's no reason for the backend compiler to be involved.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
2026-02-19 02:51:00 +00:00
Kenneth Graunke
b985494d6f iris: Create our own enums for system values
These days, our system value concept is just about iris_program
communicating to iris_state which values to upload into a UBO.
Nowhere in that process is the backend compiler involved, so it
doesn't make sense for there to be brw/elk mechanisms.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
2026-02-19 02:51:00 +00:00
Kenneth Graunke
53c5798194 iris: Move passthrough TCS generation out of brw and into iris
iris needs this, but anv does not, and it's just a small wrapper around
common NIR lowering anyway.  This also removes some brw/elk splitting.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
2026-02-19 02:50:59 +00:00
Kenneth Graunke
341687a019 brw: Drop extra validation from TCS passthrough creation
nir_create_passthrough_tcs already validates the result, we don't need
to validate a second time.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
2026-02-19 02:50:59 +00:00
Kenneth Graunke
a8481295d8 brw: Only lower system values for passthrough TCS
This cuts 75 passes that do nothing useful.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
2026-02-19 02:50:58 +00:00
Rhys Perry
f44de53586 nir: only set fp_math_ctrl if meaningful
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Kenneth Graunke
add69407c7 brw: Use memset for initializing varying/slot maps
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
2026-02-16 15:15:38 -08:00
Kenneth Graunke
19d9e10f4d brw: Drop VUE header values and position from wm_prog_data->inputs
The FS doesn't read these from the VUE so we don't care about them.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
2026-02-16 15:15:36 -08:00
Kenneth Graunke
5e48094d72 brw: Drop BRW_VARYING_SLOT_PAD and brw_varying_slot enum
In elk, we tried to store our own "driver" enum values after Mesa's
VARYING_SLOT_MAX.  In brw, we eliminated all of these except for an
unnecessary "BRW_VARYING_SLOT_PAD" value.  This was used for empty
slots, so vue_map::slot_to_varying[] could store something.  This
patch replaces BRW_VARYING_SLOT_PAD with -1.

Our "driver" enum values overlapped with VARYING_SLOT_PATCH0, leading
to unnecessary headaches.  Now gl_varying_slot_name_for_stage will do
the right thing for both regular and patch varyings.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
2026-02-16 15:15:35 -08:00
Kenneth Graunke
16ab31f358 brw: Use NUM_TOTAL_VARYING_SLOTS instead of VARYING_SLOT_TESS_MAX
This is a bit larger, but also clearer.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
2026-02-16 15:15:34 -08:00
Kenneth Graunke
3dbeaf18c8 iris: Defeature native two-sided color support
This drops native support for legacy GL's two-sided color feature
in favor of lowering it via nir_lower_two_sided_color().  Instead
of having a whole bunch of state management hassle to set up the
SBE unit to swizzle between the COL and BFC VUE slots, and have it
transparently deliver one or the other to the fragment shader, we
simply deliver both and insert a conditional select there:

   (is-front-facing ? front color : back color)

This also works even for > 16 varyings, where swizzling via the SBE
unit isn't viable.

zink, asahi, freedreno, lima, panfrost, r600, v3d, and vc4 all use
this lowering rather than having native support.  Only four games in
our shader-db even use this feature.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>
2026-02-16 15:15:33 -08:00
Kenneth Graunke
e0fc4a7c54 brw: Drop brw_compiler option from brw_no_indirect_mask()
It's unused.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
2026-02-16 21:33:49 +00:00
Kenneth Graunke
c2df854359 brw: Make a devinfo temporary in lower_mem_access_bitsizes
Less typing.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>
2026-02-16 21:33:49 +00:00