Commit graph

15325 commits

Author SHA1 Message Date
Iván Briano
836a22d1a2 anv: don't try to fast clear D/S with multiview
If multiview is enabled on the render pass, baseLayer and layerCount
will be 0 and 1 respectively and throw us off.
We can still fast clear if view_mask == 1, but anything else hits the
BLORP_BATCH_NO_EMIT_DEPTH_STENCIL restriction.

Fixes: e488773b29 ("anv: Fast clear depth/stencil surface in vkCmdClearAttachments")

Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5d22f307d5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Ian Romanick
2a2dba1bc7 elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
shader-db:

Broadwell
total instructions in shared programs: 18607516 -> 18607530 (<.01%)
instructions in affected programs: 2095 -> 2109 (0.67%)
helped: 0 / HURT: 8

total cycles in shared programs: 955704436 -> 955702925 (<.01%)
cycles in affected programs: 34299 -> 32788 (-4.41%)
helped: 2 / HURT: 6

All Haswell and older platforms had similar results. (Haswell shown)
total instructions in shared programs: 16989200 -> 16989201 (<.01%)
instructions in affected programs: 461 -> 462 (0.22%)
helped: 0 / HURT: 1

total cycles in shared programs: 946537070 -> 946537035 (<.01%)
cycles in affected programs: 16378 -> 16343 (-0.21%)
helped: 1 / HURT: 0

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 64c60582b5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Ian Romanick
829e5ccc84 brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
This optimization was added in October 2013, and the error was only just
now discovered. Removing the SEL.G.SAT optimization affected zero
shader-db shaders, and it affected 9 fossil-db shaders for instruction
size only.

I haven't checked to see if any of the hurt shaders are helped by
!39987.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17093041 -> 17093055 (<.01%)
instructions in affected programs: 2072 -> 2086 (0.68%)
helped: 0 / HURT: 8

total cycles in shared programs: 876739578 -> 876739154 (<.01%)
cycles in affected programs: 18946 -> 18522 (-2.24%)
helped: 2 / HURT: 6

fossil-db:

Lunar Lake
Totals:
Instrs: 906230557 -> 906240487 (+0.00%); split: -0.00%, +0.00%
CodeSize: 14498856128 -> 14499003168 (+0.00%); split: -0.00%, +0.00%
Send messages: 40667184 -> 40667205 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104068494103 -> 104068561943 (+0.00%); split: -0.00%, +0.00%
Max live registers: 189570192 -> 189570204 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48157648 -> 48157552 (-0.00%)
Non SSA regs after NIR: 139823587 -> 139823016 (-0.00%); split: -0.00%, +0.00%

Totals from 9172 (0.46% of 1985212) affected shaders:
Instrs: 10774709 -> 10784639 (+0.09%); split: -0.00%, +0.09%
CodeSize: 177868384 -> 178015424 (+0.08%); split: -0.08%, +0.17%
Send messages: 311154 -> 311175 (+0.01%); split: -0.00%, +0.01%
Cycle count: 232471392 -> 232539232 (+0.03%); split: -0.15%, +0.18%
Max live registers: 1243549 -> 1243561 (+0.00%); split: -0.00%, +0.01%
Max dispatch width: 196672 -> 196576 (-0.05%)
Non SSA regs after NIR: 509663 -> 509092 (-0.11%); split: -0.19%, +0.08%

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 6c6c6ce054)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Ian Romanick
0d52c7941e brw: Also check for ADDRESS file in update_for_reads
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.

Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8624da56ee)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Ian Romanick
815691378b brw: Use brw_reg_is_arf in update_for_reads
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 366410e913)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Ian Romanick
f21bc439a1 brw: Don't mark_invalid in update_for_reads for non-VGRF destination
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.

v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a548466186)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Lionel Landwerlin
195fbfb2f1 anv: dirty all push constant stages in simple shader
Above we're reprogramming push constants as well at a couple of
workarounds that require dirtying all stages.

cmd_buffer->state.gfx.push_constant_stages was already set in the
above function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4fa1eddb4c ("anv: optimize binding table flushing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14953
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 38ef732169)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Lionel Landwerlin
98ec831d58 anv: add missing handling for attachment locations in secondaries
Fixes:
  dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.interaction_with_shader_object
  dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.remap_single_attachment_shader_object

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 095c470d25)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Lionel Landwerlin
03847a6f0b anv: remove snprintf for aux op transition
With perfetto that string is processed later leading to
use-after-free.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 413e169f45)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Lionel Landwerlin
77f3279c37 anv: dirty descriptors after blorp operations
Blorp emits 3DSTATE_BINDING_TABLE_POINTER_* instructions in 3D mode.

At the moment we're saved by the push constants reemitting the btp but
we'll drop that in the next commit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 533c748b34)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
José Roberto de Souza
b7752ddbc3 intel/perf: Add HSW verx10 to intel_perf_query_result_write_mdapi()
HSW is verx10 75 and when we switched from ver to verx10 I forgot to add the case
75.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a097a3d214 ("intel/perf: Change mdapi switch cases from ver to verx")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14902
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 48c685ee39)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Lionel Landwerlin
6f75431e98 anv: disable ccs modifier reporting when ccs modifiers are disabled
Reporting the modifiers when we're going to disable it in the back
hits various asserts in anv_image.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Helps: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14853
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4f38b5c888)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Lionel Landwerlin
5fa6c15b36 anv: apply the same ccs disabling for Xe3 than Xe2
The new compression scheme introduced in Xe2 also applies to Xe3, so
we're liable for the same bugs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4ac47f8dde)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
bfeb230f9b elk/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

No shader-db changes on any ELK platform. I suspect the problematic
cases only occur after scheduling has rearranged instructions. This is
likely the reason BRW didn't experience this problem until 09450faf.

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit da1fd9786b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
024c5de569 elk/cmod: Don't propagate from CMP to possible Inf + (-Inf)
This is a backport of BRW e26270249b.

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18623918 -> 18624594 (<.01%)
instructions in affected programs: 125179 -> 125855 (0.54%)
helped: 0 / HURT: 139

total cycles in shared programs: 957073100 -> 957072484 (<.01%)
cycles in affected programs: 16534168 -> 16533552 (<.01%)
helped: 42 / HURT: 68

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit bdbfe8de4d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
d68b3091b2 brw/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

shader-db:

Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3

total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1

Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11

total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%

Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%

Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit d1614cd6db)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
0710d042db elk: Call nir_opt_algebraic_late in elk_postprocess_nir
Make sure that lowering undone in elk_nir_optimize are reapplied.

No shader-db or fossil-db changes on any Intel platform. This is most
likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't
fossil-db test either of those platforms.

I tried doing a similar thing here as is done in BRW (previous commit),
but that caused a couple Haswell shaders to fall off a performance
cliff:

total spills in shared programs: 8247 -> 8311 (0.78%)
spills in affected programs: 6 -> 70 (1066.67%)
helped: 0 / HURT: 2

total fills in shared programs: 8558 -> 8910 (4.11%)
fills in affected programs: 6 -> 358 (5866.67%)
helped: 0 / HURT: 2

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit df704bd38e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
1f65b768a1 brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
Move the call to nir_opt_algebraic_late after the last time
brw_nir_optimize might be called. nir_opt_algebraic_distribute_src_mods
works together with the late algebraic optimizations, so move it also.

shader-db:

Lunar Lake
total instructions in shared programs: 17081222 -> 17080842 (<.01%)
instructions in affected programs: 419931 -> 419551 (-0.09%)
helped: 545 / HURT: 826

total cycles in shared programs: 878437752 -> 879236226 (0.09%)
cycles in affected programs: 506003142 -> 506801616 (0.16%)
helped: 3091 / HURT: 3189

LOST:   18
GAINED: 16

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19994270 -> 19993231 (<.01%)
instructions in affected programs: 490499 -> 489460 (-0.21%)
helped: 660 / HURT: 800

total cycles in shared programs: 882498776 -> 882834186 (0.04%)
cycles in affected programs: 477858602 -> 478194012 (0.07%)
helped: 3458 / HURT: 3564

total fills in shared programs: 4371 -> 4370 (-0.02%)
fills in affected programs: 7 -> 6 (-14.29%)
helped: 1 / HURT: 0

LOST:   28
GAINED: 10

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
total instructions in shared programs: 19943849 -> 19942782 (<.01%)
instructions in affected programs: 467384 -> 466317 (-0.23%)
helped: 655 / HURT: 796

total cycles in shared programs: 860085674 -> 861410289 (0.15%)
cycles in affected programs: 426900998 -> 428225613 (0.31%)
helped: 3250 / HURT: 3441

LOST:   19
GAINED: 14

fossil-db:

Lunar Lake
Totals:
Instrs: 926472091 -> 926204838 (-0.03%); split: -0.04%, +0.01%
CodeSize: 14845921056 -> 14842776112 (-0.02%); split: -0.10%, +0.08%
Send messages: 41459570 -> 41459574 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104481085069 -> 104583692712 (+0.10%); split: -0.14%, +0.24%
Spill count: 3454651 -> 3457340 (+0.08%); split: -0.15%, +0.23%
Fill count: 4958779 -> 4958487 (-0.01%); split: -0.46%, +0.45%
Max live registers: 193805970 -> 193839002 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 49114416 -> 49113776 (-0.00%); split: +0.01%, -0.01%
Non SSA regs after NIR: 142953905 -> 142800740 (-0.11%); split: -0.12%, +0.01%

Totals from 420256 (20.80% of 2020128) affected shaders:
Instrs: 448571327 -> 448304074 (-0.06%); split: -0.09%, +0.03%
CodeSize: 7312002800 -> 7308857856 (-0.04%); split: -0.21%, +0.17%
Send messages: 17716494 -> 17716498 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52178854998 -> 52281462641 (+0.20%); split: -0.28%, +0.48%
Spill count: 2945654 -> 2948343 (+0.09%); split: -0.17%, +0.26%
Fill count: 4404768 -> 4404476 (-0.01%); split: -0.51%, +0.51%
Max live registers: 60875448 -> 60908480 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 9455280 -> 9454640 (-0.01%); split: +0.04%, -0.04%
Non SSA regs after NIR: 60542740 -> 60389575 (-0.25%); split: -0.28%, +0.02%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1000081384 -> 999726726 (-0.04%); split: -0.05%, +0.01%
CodeSize: 16764458080 -> 16761624256 (-0.02%); split: -0.09%, +0.07%
Subgroup size: 27599528 -> 27599544 (+0.00%)
Send messages: 45538933 -> 45538951 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93303830912 -> 93370118192 (+0.07%); split: -0.19%, +0.26%
Spill count: 3739306 -> 3739719 (+0.01%); split: -0.22%, +0.23%
Fill count: 5089719 -> 5083626 (-0.12%); split: -0.56%, +0.44%
Max live registers: 122041364 -> 122055848 (+0.01%); split: -0.00%, +0.01%
Max dispatch width: 38117296 -> 38127200 (+0.03%); split: +0.06%, -0.03%
Non SSA regs after NIR: 164296197 -> 164299306 (+0.00%); split: -0.01%, +0.01%

Totals from 338754 (14.82% of 2285730) affected shaders:
Instrs: 452723479 -> 452368821 (-0.08%); split: -0.10%, +0.03%
CodeSize: 7861878032 -> 7859044208 (-0.04%); split: -0.19%, +0.16%
Subgroup size: 16 -> 32 (+100.00%)
Send messages: 17050010 -> 17050028 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52881801997 -> 52948089277 (+0.13%); split: -0.33%, +0.46%
Spill count: 3271458 -> 3271871 (+0.01%); split: -0.25%, +0.26%
Fill count: 4628422 -> 4622329 (-0.13%); split: -0.61%, +0.48%
Max live registers: 30738902 -> 30753386 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 4787264 -> 4797168 (+0.21%); split: +0.47%, -0.26%
Non SSA regs after NIR: 61748026 -> 61751135 (+0.01%); split: -0.03%, +0.03%

Tiger Lake
Totals:
Instrs: 1011068379 -> 1010977290 (-0.01%); split: -0.03%, +0.02%
CodeSize: 14197751744 -> 14197683040 (-0.00%); split: -0.07%, +0.07%
Send messages: 46431228 -> 46431220 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85066526419 -> 85085088071 (+0.02%); split: -0.16%, +0.18%
Spill count: 3853750 -> 3855185 (+0.04%); split: -0.15%, +0.19%
Fill count: 6716746 -> 6719594 (+0.04%); split: -0.25%, +0.29%
Max live registers: 122307387 -> 122326083 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 38009632 -> 38003280 (-0.02%); split: +0.03%, -0.05%
Non SSA regs after NIR: 158403572 -> 158415390 (+0.01%); split: -0.01%, +0.02%

Totals from 277728 (12.17% of 2281577) affected shaders:
Instrs: 349206856 -> 349115767 (-0.03%); split: -0.07%, +0.05%
CodeSize: 5042621104 -> 5042552400 (-0.00%); split: -0.20%, +0.20%
Send messages: 13132243 -> 13132235 (-0.00%); split: -0.00%, +0.00%
Cycle count: 36183327716 -> 36201889368 (+0.05%); split: -0.38%, +0.43%
Spill count: 2210072 -> 2211507 (+0.06%); split: -0.26%, +0.33%
Fill count: 4188439 -> 4191287 (+0.07%); split: -0.39%, +0.46%
Max live registers: 24956695 -> 24975391 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3948832 -> 3942480 (-0.16%); split: +0.32%, -0.48%
Non SSA regs after NIR: 45616425 -> 45628243 (+0.03%); split: -0.04%, +0.06%

Ice Lake
Totals:
Instrs: 1009584306 -> 1009411757 (-0.02%); split: -0.02%, +0.01%
CodeSize: 12593466880 -> 12592958096 (-0.00%); split: -0.01%, +0.01%
Send messages: 47274203 -> 47274171 (-0.00%); split: -0.00%, +0.00%
Cycle count: 84920281455 -> 84914027301 (-0.01%); split: -0.05%, +0.04%
Spill count: 2988523 -> 2986191 (-0.08%); split: -0.14%, +0.07%
Fill count: 5296078 -> 5288737 (-0.14%); split: -0.21%, +0.07%
Max live registers: 125429384 -> 125444786 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 41269072 -> 41267312 (-0.00%); split: +0.03%, -0.03%
Non SSA regs after NIR: 163223895 -> 163236623 (+0.01%); split: -0.01%, +0.02%

Totals from 243818 (10.45% of 2334244) affected shaders:
Instrs: 296953759 -> 296781210 (-0.06%); split: -0.08%, +0.02%
CodeSize: 3643224480 -> 3642715696 (-0.01%); split: -0.04%, +0.03%
Send messages: 11518671 -> 11518639 (-0.00%); split: -0.00%, +0.00%
Cycle count: 33065548412 -> 33059294258 (-0.02%); split: -0.13%, +0.11%
Spill count: 1346515 -> 1344183 (-0.17%); split: -0.32%, +0.15%
Fill count: 2537906 -> 2530565 (-0.29%); split: -0.43%, +0.14%
Max live registers: 21476776 -> 21492178 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3727288 -> 3725528 (-0.05%); split: +0.31%, -0.35%
Non SSA regs after NIR: 41050474 -> 41063202 (+0.03%); split: -0.04%, +0.07%

Skylake
Totals:
Instrs: 513573157 -> 513462971 (-0.02%); split: -0.02%, +0.00%
CodeSize: 5950280672 -> 5950001392 (-0.00%); split: -0.01%, +0.00%
Send messages: 24909757 -> 24909758 (+0.00%); split: -0.00%, +0.00%
Cycle count: 57636102242 -> 57634726342 (-0.00%); split: -0.03%, +0.03%
Spill count: 627286 -> 627241 (-0.01%); split: -0.01%, +0.00%
Fill count: 837888 -> 837804 (-0.01%); split: -0.01%, +0.00%
Max live registers: 87272271 -> 87284192 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 32278832 -> 32271800 (-0.02%); split: +0.02%, -0.04%
Non SSA regs after NIR: 87387713 -> 87387614 (-0.00%); split: -0.00%, +0.00%

Totals from 177432 (10.30% of 1722906) affected shaders:
Instrs: 127170648 -> 127060462 (-0.09%); split: -0.10%, +0.01%
CodeSize: 1443406368 -> 1443127088 (-0.02%); split: -0.03%, +0.01%
Send messages: 5444220 -> 5444221 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15423028495 -> 15421652595 (-0.01%); split: -0.10%, +0.10%
Spill count: 235844 -> 235799 (-0.02%); split: -0.03%, +0.01%
Fill count: 333783 -> 333699 (-0.03%); split: -0.03%, +0.01%
Max live registers: 13765573 -> 13777494 (+0.09%); split: -0.01%, +0.10%
Max dispatch width: 3086880 -> 3079848 (-0.23%); split: +0.24%, -0.47%
Non SSA regs after NIR: 17623772 -> 17623673 (-0.00%); split: -0.00%, +0.00%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 11b96a84b0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
2874160ce2 brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
Make sure that lowering undone in brw_nir_optimize are reapplied.

No shader-db changes on any Intel platform.

Why are there fossil-db changes on platforms that don't support ray tracing?

Lunar Lake
Totals:
Instrs: 926636441 -> 926636313 (-0.00%); split: -0.00%, +0.00%
Send messages: 41510729 -> 41510723 (-0.00%); split: -0.00%, +0.00%
Cycle count: 104509492613 -> 104509490569 (-0.00%); split: -0.00%, +0.00%
Max live registers: 193792922 -> 193792890 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 150091934 -> 150092170 (+0.00%); split: -0.00%, +0.00%

Totals from 10 (0.00% of 2020428) affected shaders:
Instrs: 8142 -> 8014 (-1.57%); split: -3.14%, +1.57%
Send messages: 192 -> 186 (-3.12%); split: -7.29%, +4.17%
Cycle count: 131892 -> 129848 (-1.55%); split: -6.93%, +5.38%
Max live registers: 1442 -> 1410 (-2.22%); split: -3.05%, +0.83%
Non SSA regs after NIR: 950 -> 1186 (+24.84%); split: -26.95%, +51.79%

Meteor Lake
Totals:
Instrs: 1000805547 -> 1000805543 (-0.00%); split: -0.00%, +0.00%
Cycle count: 93131592265 -> 93131619619 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122081268 -> 122081244 (-0.00%); split: -0.00%, +0.00%

Totals from 16 (0.00% of 2286241) affected shaders:
Instrs: 18652 -> 18648 (-0.02%); split: -1.39%, +1.37%
Cycle count: 369520 -> 396874 (+7.40%); split: -2.94%, +10.34%
Max live registers: 1350 -> 1326 (-1.78%); split: -4.15%, +2.37%

DG2
Totals:
Instrs: 999834626 -> 999834651 (+0.00%); split: -0.00%, +0.00%
Send messages: 45719398 -> 45719403 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93118238139 -> 93118269557 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122098944 -> 122098936 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 169413734 -> 169413661 (-0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2286795) affected shaders:
Instrs: 18799 -> 18824 (+0.13%); split: -1.04%, +1.18%
Send messages: 492 -> 497 (+1.02%); split: -2.44%, +3.46%
Cycle count: 352838 -> 384256 (+8.90%); split: -1.08%, +9.98%
Max live registers: 1237 -> 1229 (-0.65%); split: -2.91%, +2.26%
Non SSA regs after NIR: 2191 -> 2118 (-3.33%); split: -20.86%, +17.53%

Tiger Lake
Totals:
Instrs: 1011816778 -> 1011816714 (-0.00%); split: -0.00%, +0.00%
Send messages: 46515289 -> 46515285 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85148902406 -> 85148894668 (-0.00%); split: -0.00%, +0.00%
Max live registers: 122362180 -> 122362172 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 38036160 -> 38036176 (+0.00%)
Non SSA regs after NIR: 160317521 -> 160317649 (+0.00%); split: -0.00%, +0.00%

Totals from 6 (0.00% of 2282318) affected shaders:
Instrs: 9204 -> 9140 (-0.70%); split: -1.43%, +0.74%
Send messages: 258 -> 254 (-1.55%); split: -3.10%, +1.55%
Cycle count: 287652 -> 279914 (-2.69%); split: -3.29%, +0.60%
Max live registers: 552 -> 544 (-1.45%); split: -2.90%, +1.45%
Max dispatch width: 48 -> 64 (+33.33%)
Non SSA regs after NIR: 914 -> 1042 (+14.00%); split: -14.00%, +28.01%

Ice Lake
Totals:
Instrs: 1012203285 -> 1012203249 (-0.00%); split: -0.00%, +0.00%
Send messages: 47358859 -> 47358858 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85112165276 -> 85112171905 (+0.00%); split: -0.00%, +0.00%
Max live registers: 125545002 -> 125544992 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 41335696 -> 41335656 (-0.00%)
Non SSA regs after NIR: 166448597 -> 166448602 (+0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2335519) affected shaders:
Instrs: 16486 -> 16450 (-0.22%); split: -1.67%, +1.46%
Send messages: 368 -> 367 (-0.27%); split: -4.89%, +4.62%
Cycle count: 347643 -> 354272 (+1.91%); split: -1.34%, +3.25%
Max live registers: 1104 -> 1094 (-0.91%); split: -3.80%, +2.90%
Max dispatch width: 192 -> 152 (-20.83%)
Non SSA regs after NIR: 2100 -> 2105 (+0.24%); split: -21.76%, +22.00%

Skylake
Totals:
Instrs: 504548665 -> 504548057 (-0.00%); split: -0.00%, +0.00%
Send messages: 24479148 -> 24479118 (-0.00%); split: -0.00%, +0.00%
Cycle count: 57575198140 -> 57575179256 (-0.00%); split: -0.00%, +0.00%
Max live registers: 85570671 -> 85570575 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 85097646 -> 85098486 (+0.00%); split: -0.00%, +0.00%

Totals from 22 (0.00% of 1703671) affected shaders:
Instrs: 19866 -> 19258 (-3.06%); split: -3.72%, +0.66%
Send messages: 464 -> 434 (-6.47%); split: -8.19%, +1.72%
Cycle count: 250854 -> 231970 (-7.53%); split: -9.23%, +1.70%
Max live registers: 2024 -> 1928 (-4.74%); split: -5.53%, +0.79%
Non SSA regs after NIR: 2498 -> 3338 (+33.63%); split: -8.33%, +41.95%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 5af0b8bd09)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Alyssa Rosenzweig
806f0a35a4 brw: drop buggy SLM optimization
This was incorrect for OpenCL due to the possibility of variable shared memory
existing despite shared_size == 0. Fortunately the optimization it was trying to
do should be done in NIR via nir_opt_barrier_modes so we can just drop the brw
code and move on with our merry lives. Fixes OpenCL tests on Iris:

non_uniform_work_group non_uniform_3d_barriers
basic async_strided_copy_local_to_global

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bd5ebbb2f8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
1994d93542 isl: fix 32bit math with 4GB buffer size
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d956957153)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
af97f7fe38 anv: add missing constant cache invalidation for descriptor buffers
A descriptor buffer promoted to push constants requires a constant
cache invalidation if it is modified on the device.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 42b70cf05a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
12da136c07 anv: fix nested command buffer relocations
When executing 3 command buffers :

vkCmdExecuteCommands(CB_B, CB_C);
vkCmdExecuteCommands(CB_A, CB_B);

vkQueueSubmit(CB_A);

We're not transfering correctly the relocations of CB_C from CB_B to
CB_A.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e64889635c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
2abdb028dd anv: flush render caches on first pipeline select
Given a situation like this :
  - CB_A: begin, renderDepthA, end
  - CB_B: begin, computeA, barrier (depth), computeB, end

The depth cache is not being flushed between renderDepthA & computeB
because :
  - it's not flushed at the end of CB_A (it's not required)
  - when CB_B starts, we're still on GFX pipeline mode but do not
    flush render caches because pipeline mode is unknown
  - when barrier is CB_B is executed, we're already in compute
    pipeline mode and HW cannot flush depth.

The fix is to flush RT/depth cached when switching from unknown
pipeline mode any pipeline mode.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e6dae6ef5f ("vulkan: Optimize implicit end_subpass barrier")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14816
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: David Gow <david@davidgow.net>
(cherry picked from commit 888ac904a3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Juston Li
6dbd6ee94b anv: set missing protected bit for protected depth/stencil surfaces
This bit is set in mocs for other protected attachment types by
anv_image_fill_surface_state() however was ommited for depth/stencil
attachments here.

Without the protected bit set, it causes heavy black artifacting when
attaching a protected depth attachment image to a framebuffer.

Fixes: 794b0496e9 ("anv: enable protected memory")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f84ed620c2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
2c250e6235 elk/cse: use copies in operands_match instead of in-place modification
`operands_match` was modifying instruction source operands in-place
(through the `elk_fs_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 14c65322e8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
03e6f285e5 elk/cse: fix operands_match corrupting non-IMM register data
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.

Guard all `.f` accesses with `.file == IMM` checks.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 93f39f87c4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
2b221e5a1a brw/cse: use copies in operands_match instead of in-place modification
`operands_match` was modifying instruction source operands in-place
(through the `brw_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit b302faad8b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
14c7d820cd brw/cse: fix operands_match corrupting non-IMM register data
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.

Guard all `.f` accesses with `.file == IMM` checks.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit f5e0f63216)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:19 +01:00
Caio Oliveira
99bb93440f brw: Fix cooperative matrix constant sources other than src0
Code was wrongly using src0 to pick the constant value.

Fixes: bf9ad36f2d ("brw: Properly handle cooperative matrices created with constants")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 6b0e29bc77)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Tapani Pälli
3442ccdcba anv: skip compressed flag for bo if not supported by modifier
This has not been problem before the compression hint given to kernel
but now that we set it we hit problems when allocating bo if modifier
does not support compression.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14625
Fixes: f91de58818 ("anv: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit fc814fa828)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Calder Young
cd67481fd2 anv: Avoid dumping BVH before command buffer is submitted
Fixes a race condition where a BVH will be dumped before its command buffer is
actually submitted if a different command buffer completes between the time the
BVH dump is recorded and the time the command buffer is actually submitted.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Fixes: 1b55f101 ("anv/bvh: Dump BVH synchronously upon command buffer completion")
(cherry picked from commit 95e471e558)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
e1446659f8 anv: set DisableAnyMCTRresponsefix to zero on init
This is to make sure early culling related Wa_16020518922 is enabled
properly.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14204
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9aaed82543)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
3988bebbe9 intel/genxml: add CHICKEN_RASTER_2 with required bit for Xe3
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 61b5e91bba)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Hyunjun Ko
a7d0da012e anv/video: disable encoder on untested platforms
Not enough tested on over Gen12 platforms.
Turns out to be not working on DG2, for example.

Cc: mesa-stable
Closes: #14449

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39676>
(cherry picked from commit d2c24a0d8b)
2026-02-04 18:39:35 +01:00
Nanley Chery
a7ace43e9a anv: Don't set the display flag on WSI blit sources
These images are never used with scanout hardware.

Fixes: 2c00b7d1e6 ("anv: flag WSI images as scanout images for ISL")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit c429d7479e)
2026-02-04 18:39:34 +01:00
Nanley Chery
d6d5071a84 anv: Treat non-WSI PRESENT_SRC as TRANSFER_SRC
For non-WSI images, explicitly map VK_IMAGE_LAYOUT_PRESENT_SRC_KHR to
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL in anv_layout_to_aux_state().

Before this patch, the function passed PRESENT_SRC into
vk_image_layout_to_usage_flags() and got a return value of 0 from it
(that function expects that layout to be explicitly handled by the
caller). This caused the logic dependent on the return value to be
unreliable.

Fixes: c5cad407f8 ("anv: handle non-wsi images in anv_layout_to_aux_state")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit f616d4fb2a)
2026-02-04 18:39:34 +01:00
Nanley Chery
7571128959 anv: Fix clear state of WSI blit sources during presentation
On gfx12+, this fixes assert failures in hybrid GPU scenarios.

Fixes: 811c413f98 ("anv: Don't return the Xe2+ fast-clear type early")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit 476f461ce7)
2026-02-04 18:39:34 +01:00
Nanley Chery
f4e0da9e07 anv: Don't return the Xe2+ fast-clear type early
Don't return early from anv_layout_to_fast_clear_type() for Xe2+. We'll
need to make more use of the function for some MCS changes in later
commits.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit 811c413f98)
2026-02-04 18:39:34 +01:00
Hyunjun Ko
e73e4e1554 anv/video: Compute AV1 tile positions internally
The pMiColStarts/pMiRowStarts arrays from applications may have
incorrect units. Instead of using them directly, compute the tile
start positions in superblock units internally based on the tile
dimensions.

Cc: mesa-stable
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
(cherry picked from commit 8e9fec8e40)
2026-02-04 18:39:33 +01:00
Hyunjun Ko
162ef4da2c anv/video: fix a typo in Vulkan AV1 decoding.
Cc: mesa-stable
Fixes: e510efed05d("anv: support in-loop super resolution for AV1 decoding")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
(cherry picked from commit 8004f46466)
2026-02-04 18:39:33 +01:00
Tapani Pälli
d89eceaa2c anv: route clear operations on compute to companion
This fixes bunch of cts tests hitting issues when attempting
anv_image_mcs_op with compute.

Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39581>
(cherry picked from commit 85978ccd28)
2026-02-04 18:39:33 +01:00
Iván Briano
cb8a069e24 brw: fix local_invocation_index with quad derivaties on mesh/task shaders
For mesh/task shaders, the thread payload provides a local invocation
index, but it's always linear so it doesn't give the correct value when
quad derivatives are in use.
The lowering pass where all of this is done correctly for compute
shaders assumes load_local_invocation_index will be lowered in the
backend for mesh/task, calculates the values for the quads correctly but
then avoid replacing the original intrinsic and we remain with the wrong
results.

Add an intel specific intrinsic and always lower the generic one to that
(or whatever else was calculated) to avoid ambiguities and fix the value
for quad derivatives.

Fixes future CTS tests using mesh/task shaders under:
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*

Fixes: d89bfb1ff7 ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>
(cherry picked from commit 5b48805b42)
2026-01-28 16:17:59 +01:00
Nanley Chery
c2eca1a1cc anv: Fix the fast clear type for FCV writes
We started allowing non-default clear colors with FCV in commit
cd8e120b97. When rendering to an image with FCV, set the fast-clear
type to ANV_FAST_CLEAR_ANY if the image properties allow such
fast-clears.

Fixes: cd8e120b97 ("anv: Allow more single subresource fast-clears with FCV")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit ce196c9de5)
2026-01-28 16:17:59 +01:00
Nanley Chery
f3db65d95e anv: Update predicated resolve documentation
* Don't mention gfx7-8 due to the hasvk split.
* Account for the array of clear colors.

Fixes: 0e6b132a75 ("anv: Access more colors in fast_clear_memory_range")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit e7854d06a5)
2026-01-28 16:17:59 +01:00
Nanley Chery
a680c20d40 intel/isl: Fix QPitch of arrayed MCS
From RENDER_SURFACE_STATE::AuxiliarySurfaceQPitch on BDW+,

   This field must be set to an integer multiple of the Surface
   Vertical Alignment

Accomplish this by aligning the height of each MCS layer to main
surface's vertical alignment. Prevents the following test group from
failing on Xe2 when a future commit enables multi-layer fast-clears in
anv:

   dEQP-VK.api.image_clearing.*.
   clear_color_attachment.multiple_layers.
   *_clamp_input_sample_count_*

The main test I used to debug this:

   dEQP-VK.api.image_clearing.core.
   clear_color_attachment.multiple_layers.
   a8b8g8r8_unorm_pack32_64x11_clamp_input_sample_count_2

Backport-to: 25.3
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit eb4a581e44)
2026-01-28 16:17:59 +01:00
Tapani Pälli
41026e14f9 blorp: fix asserts hit with msaa blorp blits on xe3
Tested on PTL, fixes various copy_and_blit tests that utilize compute
after ab9d3528dc that exposed this to them.

Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39548>
(cherry picked from commit bb84773c81)
2026-01-28 16:17:59 +01:00
Nanley Chery
0d3857c832 blorp: Fix Tile64 clear redescription assertion
Prevent assert failures in a future commit where Tile64 will be selected
more often.

Fixes: 42ef23ecd1 ("intel/blorp: Don't redescribe some Tile64 clears")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
(cherry picked from commit 6fc0e5c0aa)
2026-01-28 16:17:59 +01:00
Nanley Chery
cec72c7a29 intel/isl: Fix miptail selection for compressed textures
When determining if an LOD can fit within a miptail, we must minify in
pixel space and then convert to elements.

Prevents the following test case from failing when Yf is force-enabled:

   dEQP-VK.image.texel_view_compatible.graphic.extended.3d_image.texture_read.astc_8x5_srgb_block.r32g32b32a32_uint

Fixes: 46f45d62d1 ("intel/isl: Start using miptails")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
(cherry picked from commit add742fca6)
2026-01-28 16:17:59 +01:00
Sushma Venkatesh Reddy
6c6ed2a9e6 brw: Use lookup tables for Gfx12+ 3src type encoding/decoding
The previous Gfx12+ implementation using bit masking is failing for FP8
types, so replacing with explicit lookup tables.
For float types, the encoding now aligns with brw_data_type_float, ensuring
correct behavior for DPAS and other 3-source instructions.

Fixes: d1d4e3d530 ("brw: Add EU assembler support for float8")

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39448>
(cherry picked from commit 0ce4e8ba6f)
2026-01-28 16:17:58 +01:00