Commit graph

15951 commits

Author SHA1 Message Date
Caio Oliveira
1ebc14bcb9 brw: Stop tracking inline parameter usage in prog_key/prog_data
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Since inline parameter is the last field of the thread payload, the
backend can always assume they may exist.  They won't affect the
position of other payload fields and the register allocator will
reuse any unused space.

In Anv, also update EmitInlineParameter for Task/Mesh/CS to reflect
previous changes in inline parameter setup.  Remove/Update some stale
comments since we are here.

Finally, remove the prog_key/prog_data bits that tracked whether inline
data or a push address was needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41230>
2026-04-30 16:39:22 +00:00
Samuel Pitoiset
f2ce2868c5 ci: uprev vkd3d
This contains new tests for DGC+multiview which are valid in DX12
but invalid in Vulkan, unless RADV allows support for it. Important
to have coverage for us because it's used for Crimson Desert.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41193>
2026-04-30 15:00:02 +00:00
Lionel Landwerlin
b795a1a20c intel/tools: add eu stall viewer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
d595529475 imgui: update copy and port all tools using it
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
0a965c0bce anv: add a shader-dump debug option
Will use this with EU stall monitor.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
3951a00d86 anv: reorder debug options
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:43 +00:00
Lionel Landwerlin
5a462d77ff anv: remove a bunch of KHR alias uses
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>
2026-04-30 09:04:01 +00:00
Lionel Landwerlin
4c7948ec0d anv: stop using queue priority KHR aliases
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>
2026-04-30 09:04:01 +00:00
Lionel Landwerlin
dad8f65611 anv: fix null pointer access
Reproduces with dEQP-VK.pipeline.no_queues.pipeline_binary.compute

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 595889018a ("anv: implement VK_KHR_maintenance9")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41233>
2026-04-30 09:04:01 +00:00
Caio Oliveira
e1745e0bd9 brw: Fix max_dispatch_width collection for CS with variable size
The intention of the original commit was to make all the shaders report
the same max_dispatch_width.  When CS has multiple variants, this was
not happening as expected.

Fixes: 2acc2f18ea ("intel/compiler: report max dispatch width statistic")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41209>
2026-04-29 15:52:04 +00:00
Alyssa Rosenzweig
a78634ccb0 jay/to_binary: rename grf -> phys_reg
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
since it covers accumulators to

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
ab87a035c9 jay: drop a bunch of stale TODO and XXX
These are either done, or never going to be done, or otherwise stale or
silly or unnecessary. Drop a bunch.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
70d09d97ef jay: predicate NoMask instructions in uniform IF's
Totals:
Instrs: 4742391 -> 4742257 (-0.00%)
CodeSize: 70245120 -> 70243520 (-0.00%); split: -0.00%, +0.00%

Totals from 81 (3.06% of 2647) affected shaders:
Instrs: 337727 -> 337593 (-0.04%)
CodeSize: 4992992 -> 4991392 (-0.03%); split: -0.03%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
f199f00564 jay: adjust flag replication
Now instructions still read/write UFLAG, which preserves the information about
lane 0 we need for proper predication etc.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
930d36b54a jay: smarten predication pass
Merge the empty else optimization, the then-block predication, and the
break-while fusion into a unified "try to predicate each side of an if, peephole
optimizing control flow" optimization. This is simpler and more general.

Totals:
Instrs: 4783809 -> 4775647 (-0.17%)
CodeSize: 70766656 -> 70674064 (-0.13%); split: -0.13%, +0.00%

Totals from 1109 (41.90% of 2647) affected shaders:
Instrs: 4130644 -> 4122482 (-0.20%)
CodeSize: 61180848 -> 61088256 (-0.15%); split: -0.15%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
80081ef7b2 jay: check for inverse-ballots in jay_uses_flag
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
86f19bc983 jay: propagate inverse-ballots only locally
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
d7283a25d7 jay: do not copyprop ballots globally
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
5828b66b65 jay: convert to LCSSA
for correctness with loops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
fed6b7bea0 jay: drop UGPR->UMEM spilling path
This is totally broken now that we have a physical CFG for UGPRs. And of course,
UGPRs generally were totally broken without the physical CFG. So I conclude
this code basically never worked. Which is good because it was also basically
always dead too. Just delete it and replace with a clear error message, instead
of pretending it works and either randomly splatting validation or just straight
up miscompiling silently or whatever.

We might need an alternative UGPR->GPR spill path some day but that day is not
today.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
ad040f2fbb jay: introduce a physical control flow graph
Consider:

   u0 = foo()

   if (divergent) {
      u0 = bar()
      r0 = baz(u0)
   } else {
      r0 = quux(u0)
   }

Logically, this is fine, there is no interference between bar() and u0. But
physically, both sides of the if execute so the bar() write to u0 overwrites the
variable the else reads. So this is a miscompile.

The solution is to model the extra edges in the physical control flow graph,
which lives next to the existing logical control flow graph. Liveness for UGPRs
now follows the physical CFG, while liveness for GPRs continues to follow the
logical CFG. That models the interference properly, while still allowing phis to
work as before (since phis writing UGPRs follow uniform bits of control flow
that are necessarily critical edge free for the same reason the logical CFG is).

Because our RA copies shuffled registers back at block ends (following
Colombet), there's no issue with live range splits here (unlike aco which
inserts phis for this case and then needs to worry about critical edges around
those phis).

There might still be an extremely-challenging-to-hit bug here with UGPR spilling
which I need to think more about. It might be fine as-is? Not convinced though.
But this is big enough and strictly less broken than what we have right now and
the full solution will build on this, so here we are.

Fixes artefating in SuperTuxKart and Celestia knows what else.

Totals:
Instrs: 2770938 -> 2771269 (+0.01%); split: -0.00%, +0.02%
CodeSize: 40133712 -> 40138480 (+0.01%); split: -0.01%, +0.02%

Totals from 158 (5.97% of 2647) affected shaders:
Instrs: 514523 -> 514854 (+0.06%); split: -0.02%, +0.09%
CodeSize: 7603040 -> 7607808 (+0.06%); split: -0.03%, +0.09%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
fadb826515 jay/opt_propagate: disable f64 opts for now
could be done but would need more work.

No stats change.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
8e4145948f jay/opt_propagate: fold uflag copies
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
b9f8f2477e jay: inline jay_control()
This accessor is more opaque imho.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
978d20e5fe jay: drop jay_exec_mask
this strategy is panning out nicely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
238c4ecf40 jay: fix 16-bit predicated compares
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
0bd4f1b874 jay: consolidate file prefixes
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
15365f8ea2 jay: jayize swsb print
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
fccd68625c jay: shrink stack allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Kenneth Graunke
0a5c748e19 jay: Don't forget UACCUM!
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
3308626e12 jay/assign_flags: don't burn a flag for ballots
Increases GPR pressure somehow but it's obviously the right thing to do.

SIMD16:

   Totals:
   Instrs: 2767536 -> 2767381 (-0.01%); split: -0.01%, +0.00%
   CodeSize: 44323392 -> 40075680 (-9.58%); split: -9.58%, +0.00%

   Totals from 2147 (81.11% of 2647) affected shaders:
   Instrs: 2704498 -> 2704343 (-0.01%); split: -0.01%, +0.00%
   CodeSize: 43477568 -> 39229856 (-9.77%); split: -9.77%, +0.00%

SIMD32:

   Totals:
   Instrs: 4731031 -> 4746775 (+0.33%); split: -0.33%, +0.67%
   CodeSize: 76609152 -> 70004080 (-8.62%); split: -8.68%, +0.06%
   Number of spill instructions: 50110 -> 50187 (+0.15%); split: -0.00%, +0.16%
   Number of fill instructions: 51341 -> 51804 (+0.90%); split: -0.00%, +0.91%

   Totals from 2136 (80.70% of 2647) affected shaders:
   Instrs: 4666677 -> 4682421 (+0.34%); split: -0.34%, +0.67%
   CodeSize: 75735136 -> 69130064 (-8.72%); split: -8.78%, +0.06%
   Number of spill instructions: 50108 -> 50185 (+0.15%); split: -0.00%, +0.16%
   Number of fill instructions: 51339 -> 51802 (+0.90%); split: -0.00%, +0.91%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
2c77717e5c jay/assign_flags: don't burn a null flag
SIMD32:

   Totals from 423 (15.98% of 2647) affected shaders:
   Instrs: 740042 -> 736360 (-0.50%); split: -1.25%, +0.75%
   CodeSize: 11984176 -> 11925888 (-0.49%); split: -1.23%, +0.74%
   Number of spill instructions: 4675 -> 4676 (+0.02%)
   Number of fill instructions: 5698 -> 5684 (-0.25%); split: -0.28%, +0.04%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
796886f72c jay/assign_flags: refactor for next commit
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Georg Lehmann
26ec32dada intel/nir_opt_peephole_ffma: fix fp_math_ctlr for modifiers
If abs/neg don't preserve nan/inf/sz, the whole expressions won't.

Fixes: 1b0808adf3 ("intel/nir: Make ffma peephole optimization preserve fp_fast_math flags")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41101>
2026-04-28 18:26:58 +00:00
Tapani Pälli
bdaf8b6b39 anv: do not use resource barrier with split barriers
Fixes failing CTS tests using asymmetric and non-asymmetric (regular)
split barriers.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15310
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41237>
2026-04-28 18:07:37 +00:00
Ian Romanick
e301817753 brw: Don't lower phis involved in DPAS instructions to scalar
On my Arc A380 (DG2), this more than doubles the performance of Jeff
Bolz's cooperative matrix benchmark. With llama.cpp modified to use
cooperative matrix on DG2, performance is improved by 37%.

Closes: #15311
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Matt Corallo <git@bluematt.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41172>
2026-04-27 18:09:16 +00:00
Ian Romanick
09b43966ba brw: Lower all phis to scalar
The next commit will cause some very specific phis to not be lowered to
scalar, and that's the reason the callback is used instead of
nir_lower_all_phis_to_scalar.

It's worth noting that the comment in nir_lower_phis_to_scalar.c
specifically calls out Deus Ex as the reason some phis should not be
lowered. At least on current BRW, zero shaders from Deus Ex trace were
affected for spills or fills on any Intel platform.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17050005 -> 17051449 (<.01%)
instructions in affected programs: 41032 -> 42476 (3.52%)
helped: 29 / HURT: 159

total cycles in shared programs: 876411976 -> 876433702 (<.01%)
cycles in affected programs: 1455550 -> 1477276 (1.49%)
helped: 40 / HURT: 150

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 916599633 -> 916694854 (+0.01%); split: -0.00%, +0.01%
CodeSize: 14705971792 -> 14708302384 (+0.02%); split: -0.00%, +0.02%
Send messages: 40870114 -> 40870113 (-0.00%)
Cycle count: 102360965889 -> 102364169753 (+0.00%); split: -0.00%, +0.01%
Spill count: 3460669 -> 3460240 (-0.01%)
Fill count: 4988325 -> 4987891 (-0.01%)
Max live registers: 192914542 -> 192918153 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48848112 -> 48848128 (+0.00%)
Non SSA regs after NIR: 141633613 -> 141671589 (+0.03%); split: -0.00%, +0.03%

Totals from 5713 (0.28% of 2010434) affected shaders:
Instrs: 5215921 -> 5311142 (+1.83%); split: -0.09%, +1.91%
CodeSize: 88940784 -> 91271376 (+2.62%); split: -0.20%, +2.82%
Send messages: 284751 -> 284750 (-0.00%)
Cycle count: 275671864 -> 278875728 (+1.16%); split: -0.74%, +1.90%
Spill count: 857 -> 428 (-50.06%)
Fill count: 845 -> 411 (-51.36%)
Max live registers: 667776 -> 671387 (+0.54%); split: -0.86%, +1.40%
Max dispatch width: 160416 -> 160432 (+0.01%)
Non SSA regs after NIR: 1127904 -> 1165880 (+3.37%); split: -0.10%, +3.47%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Matt Corallo <git@bluematt.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41172>
2026-04-27 18:09:16 +00:00
Jaishankar Rajendran
12f43d048e anv: tune parameters of the ASTC software decoding
Signed-off-by: Prakhar Vishwakarma <prakhar.vishwakarma@intel.com>
Signed-off-by: Jaishankar Rajendran <jaishankar.rajendran@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41205>
2026-04-27 15:17:04 +00:00
Jaishankar Rajendran
cd941d3970 vulkan/runtime: enable parametrization of ASTC software decode
Enable the driver to select :
  - LUT allocation alignment
  - LUT memory flags selection

Signed-off-by: Prakhar Vishwakarma <prakhar.vishwakarma@intel.com>
Signed-off-by: Jaishankar Rajendran <jaishankar.rajendran@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41205>
2026-04-27 15:17:04 +00:00
José Roberto de Souza
965e28ff8a intel/tools: Fix parse of '[HWCTX].replay_*' in aubinator_error_decode_xe
This hides [HWCTX].replay_offset and [HWCTX].replay_length for error decoder as
those are not relevante when just reading the error decoded.

From:
GuC ID: 33
	Name: bcs33
	Class: 3
	Logical mask: 0x1
	Width: 1
	Ref: 65
	Timeout: 0 (ms)
	Timeslice: 1000 (us)
	Preempt timeout: 640000 (us)
	HW Context Desc: 0x03862000
	HW Ring address: 0x0385e000
	HW Indirect Ring State: 0x00000000
	LRC Head: (memory) 0
	LRC Tail: (internal) 4408, (memory) 4408
	Ring start: (memory) 0x0385e000
	Start seqno: (memory) -127
	Seqno: (memory) -128
	Timestamp: 0x00000001
	Job Timestamp: 0x0000005c
type char:
	[HWCTX].replay_offset: 0x0
type char:
	[HWCTX].replay_length: 0x1000
	Schedule State: 0x241
	Flags: 0x0

To:
**** Contexts ****
GuC ID: 33
	Name: bcs33
	Class: 3
	Logical mask: 0x1
	Width: 1
	Ref: 65
	Timeout: 0 (ms)
	Timeslice: 1000 (us)
	Preempt timeout: 640000 (us)
	HW Context Desc: 0x03862000
	HW Ring address: 0x0385e000
	HW Indirect Ring State: 0x00000000
	LRC Head: (memory) 0
	LRC Tail: (internal) 4408, (memory) 4408
	Ring start: (memory) 0x0385e000
	Start seqno: (memory) -127
	Seqno: (memory) -128
	Timestamp: 0x00000001
	Job Timestamp: 0x0000005c
	Schedule State: 0x241
	Flags: 0x0

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Carlos Santa <carlos.santa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41113>
2026-04-24 20:19:09 +00:00
Alyssa Rosenzweig
bccaeb28bb brw/nir_lower_cs_intrinsics: do some math at 16-bit
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
There are less than 2^16 lanes within a threadgroup, so it is safe to do
all math at 16-bit. This allows us to use 16-bit integer division which is
much faster than 32-bit integer division (in terms of the lowerings).

In a "hello world" kernel with variable wg size, simd32 goes 72 inst -> 57
inst on jay and 82 -> 67 inst on brw.

OTOH it's a loss for non-variable wg size, so do it only there to avoid
unwelcome stats regresions on Vulkan.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41084>
2026-04-24 17:13:24 +00:00
Caio Oliveira
0422165d9a brw: Remove various unused fields
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
These are a mix of fields whose last used was removed or fields that were
never used, possibly because they remained in a patch while the rest of the
code changed before landing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41139>
2026-04-24 15:04:25 +00:00
Sagar Ghuge
f36b6c8f13 anv: Update values for DispatchTimeoutCounter
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
BTD unit will keep accumulating the threads and then eventually dispatch
those active threads once it reaches the counter.

I guess dispatching too fast will not have full occupancy at the BTD
unit, instead we just pick the half of max value for counter.

This patch also add drirc option to dispatch_timeout_counter and tweak
values internally with respect to HW limits. Default value we have right
now is 512 clocks, we can for sure tune it per app.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40733>
2026-04-24 01:38:20 +00:00
Sagar Ghuge
8a990b5a1c intel/genxml: Added dispatch timeout counter extended field
Since field is split in between multiple fields, we have to manually
write the values and refer to Bspec 43851 for exact values.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40733>
2026-04-24 01:38:20 +00:00
Emma Anholt
01cb024922 ci/intel: Switch over to the new tool for restricted traces.
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
system monitoring for replay debug (OOMs especially), and DXVK support
(I've added a few traces, but not most of the collection because I didn't
want to block on stabilizing this job with everything).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41115>
2026-04-23 22:54:12 +00:00
Sagar Ghuge
e65e62b17f intel/genxml: Disable compute walker mid-thread preemption
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On Xe, we have this bit reversed. It's called Thread preemption Disable.
On Xe2+ (Bspec 56590), it's called Thread preemption with option
enabled/disabled.

AFAIK, we don't support mid-thread preemption. This patch set values
properly according to bspec.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41120>
2026-04-23 19:24:41 +00:00
Lionel Landwerlin
b3fe0cb34e anv: expose VK_KHR_shader_constant_data
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40741>
2026-04-23 19:02:27 +00:00
Tapani Pälli
c105366165 drirc/anv: add flag to disable VK_EXT_subgroup_size_control
This can be used to workaround problem cases with application
controlled subgroup size.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40813>
2026-04-23 13:16:05 +00:00
Iván Briano
c5edb90046 anv: silence warning
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
../src/intel/vulkan/genX_init_state.c: In function ‘gfx9_CreateSampler’:
../src/intel/vulkan/genX_init_state.c:1507:40: warning: ‘border_color_offset’ may be used uninitialized [-Wmaybe-uninitialized]
 1507 |       sampler_state.BorderColorPointer = border_color_offset;

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41116>
2026-04-22 16:17:35 -07:00
GKraats
3c01e6139a hasvk: unbreak assert format != ISL_FORMAT_UNSUPPORTED
Format is set to ISL_FORMAT_UNSUPPORTED at anv_get_format_plane at src/intel/vulkan_hasvk/anv_formats.c,
because Ivy Bridge does not support enough 24 and 48-bits formats.

Problem solved by checking format after the call.

Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40237>
2026-04-22 20:35:25 +00:00