Commit graph

16150 commits

Author SHA1 Message Date
Kenneth Graunke
b01d286083 jay: Move render target store payload/descriptor construction to backend
Constructing the render target store payload is more complex than we can
reasonably handle at the NIR level.  The main reason is that samplemask
and stencil are packed 16-bit and 8-bit parameters, respectively, which
are intermixed with other values that are 32-bit.  In SIMD32 mode, the
packed sub-32-bit values take up fewer registers than normal values.

Currently we also don't specialize the NIR for each FS dispatch width,
and we can't construct the message descriptor without knowing it.

So, we alter nir_intrinsic_store_render_target_intel to take each of
the expected parameters - colour, depth, stencil, samplemask,
src0_alpha, and discard predicate.  We construct the payloads and
descriptors in the backend.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
bc22a37d98 jay: schedule for pressure
Implement a simple pre-RA bottom-up list scheduler with the goal of decreasing
register pressure. On Xe2, this significantly reduces spilling.

SSA form allows us to estimate register demand cheaply and accurately, which
theoretically [1] gives this algorithm the two Hippocratic properties:

1. Shaders with low register pressure are unaffected.
2. Register pressure can only be decreased, never increased.

In other words: first, do no harm.

The heuristic itself is very simple: greedily choose instructions that decrease
liveness using a backwards list scheduler. This is far from optimal! But thanks
to the above properties, even a heuristic that picked random instructions would
be a win overall - by construction, we can only ever win.

In other words: this scheduler is your older brother powering off the game
console any time he's about to lose a game, maintaining a 100% win rate.

[1] In reality, neither property is strictly satisfied due to the messy details
of mapping our clean logical model onto Intel's many weird physical register
files. Nevertheless, the algorithm is well-motivated and the empirical results
on Xe2 are excellent.

SIMD16:

   Totals:
   Instrs: 2754194 -> 2753957 (-0.01%); split: -0.23%, +0.22%
   CodeSize: 41094768 -> 41092768 (-0.00%); split: -0.23%, +0.23%
   Number of spill instructions: 1724 -> 1129 (-34.51%)
   Number of fill instructions: 1912 -> 1119 (-41.47%)

   Totals from 168 (6.35% of 2647) affected shaders:
   Instrs: 850994 -> 850757 (-0.03%); split: -0.75%, +0.73%
   CodeSize: 12825680 -> 12823680 (-0.02%); split: -0.74%, +0.73%
   Number of spill instructions: 1724 -> 1129 (-34.51%)
   Number of fill instructions: 1912 -> 1119 (-41.47%)

SIMD32:

   Totals:
   Instrs: 4688858 -> 4557800 (-2.80%); split: -3.53%, +0.74%
   CodeSize: 70177200 -> 68214816 (-2.80%); split: -3.53%, +0.74%
   Number of spill instructions: 50316 -> 45795 (-8.99%); split: -9.56%, +0.57%
   Number of fill instructions: 51526 -> 45075 (-12.52%); split: -13.23%, +0.71%

   Totals from 819 (30.94% of 2647) affected shaders:
   Instrs: 3810182 -> 3679124 (-3.44%); split: -4.35%, +0.91%
   CodeSize: 57044000 -> 55081616 (-3.44%); split: -4.35%, +0.91%
   Number of spill instructions: 49264 -> 44743 (-9.18%); split: -9.76%, +0.58%
   Number of fill instructions: 50182 -> 43731 (-12.86%); split: -13.58%, +0.73%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
81e21a8756 jay: factor jay_op_(starts,ends)_block queries
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
e72ffb0046 jay: annotate pure sends
for scheduling, CSE, etc

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
c069b7e47c jay/opt_propagate: avoid branching on poison
logically it doesn't matter because we'll bail on a later check, but this is
still UB and therefore releases nasal demons.

i am jealous of Faith's Rust compilers. there, I said it.

==107281== Conditional jump or move depends on uninitialised value(s)
==107281==    at 0x7069768: propagate_backwards (jay_opt_propagate.c:327)
==107281==    by 0x7069768: jay_opt_propagate_backwards (jay_opt_propagate.c:367)
==107281==    by 0x7058960: jay_compile (jay_from_nir.c:2677)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
4b0c3f5c32 jay/lower_scoreboard: add asserts on key bounds
if these are botched you get UB (-:

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
4c97493b69 jay/lower_scoreboard: handle accumulator hazard
Challenging to hit but fixes
dEQP-GLES3.functional.shaders.swizzle_math_operations.vector_multiply.mediump_ivec4_wzyx_zyxw_fragment
with scheduling changes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
9a68101bc2 jay/liveness: drop redundant source filtering
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
9b68b4e7a1 jay/liveness: speed up physical CFG merging
on top of scheduler changes, compile-time of shaders/blender/1017.shader_test:

Difference at 95.0% confidence
	-0.00173202 +/- 0.00116931
	-0.791537% +/- 0.532384%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
1b50d3eed2 jay/liveness: remove pointless bitset init
dup initializes it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
5da3b57605 jay: insert simd32 deswizzle in a dedicated pass
we don't actually need the DESWIZZLE pseudo instruction, and the pseudo op
complicates pre-RA scheduling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Alyssa Rosenzweig
47c6601d5e jay: relax fragment payload layout
this isn't optimal but it should unblock bring up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Co-authored-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:46 +00:00
Kenneth Graunke
cb75c9f962 brw: Lower sample_pos for non-per-sample shaders in NIR
We generalize the sample_mask_in lowering to handle this too.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>
2026-05-21 15:34:45 +00:00
Collabora's Gfx CI Team
18ba81e5b6 Uprev Piglit to 6fd29fe44f8857b876a67bee962919635f22ecc8
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
11ce9eb56e...6fd29fe44f

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40989>
2026-05-20 21:37:44 +00:00
Christoph Neuhauser
7eba054c5b anv: Add compute only divergent atomics fusion optimization for Blender
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Blender uses atomic operations as part of its virtual shadow mapping
implementation. Virtual shadow mapping page tagging in compute shaders
benefits from divergent atomics fusion, while fragment shaders doing the
atomic raster step in general have worse performance with this
optimization turned on.
Thus, an option is added to only apply divergent atomics fusion to compute
shaders in ANV, and this option is enabled for Blender.

Initial support for divergent atomics fusion optimization in ANV was added
in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631.

Signed-off-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41706>
2026-05-20 19:29:15 +00:00
Jordan Justen
28f6a442c6 brw/compact: Precompact using 2src fields on 3src instructions
In shader-db, with `-p skl`, shaders/0ad/12.shader_test does not
compact an instruction because precompact overwrites portions of the
instruction. (Treating the three source instruction as a two source
when accessing instruction fields.)

This instruction could be compacted:

mad(8)          g65<1>F         g61<4,4,1>F     g64<4,4,1>F     -g17<4,4,1>F { align16 1Q };

But, since precompact erroneously sets bits, the instruction isn't
compacted.

Fossil testing:

 * Tested with 0a3f3fd193 ("brw: drop unused color_outputs_valid
   key") reverted, as fossils are currently producing inconsitent
   results otherwise.

 * Tested skl, icl, dg2, mtl, lnl, bmg and ptl. Only skl had a change.

SKL:

Totals:
CodeSize: 8335219296 -> 8320248992 (-0.18%)

Totals from 359508 (14.42% of 2492689) affected shaders:
CodeSize: 2838254352 -> 2823284048 (-0.53%)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41588>
2026-05-20 11:52:52 -07:00
Iván Briano
d0253e25c4 intel/dev: ARL-H supports EXECUTE_INDIRECT_*
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41372>
2026-05-19 22:41:53 +00:00
Iván Briano
b420958166 anv, iris: fix MOCS Index setting of EXECUTE_INDIRECT_* commands
Unlike most other things where the MOCS setting combines the MOCS Index
and the protected memory bit, the EXECUTE_INDIRECT_DRAW/DISPATCH
commands take only the MOCS Index, and it's limited to only 4 bits.
Enabling the feature on ARL-H caused some tests to hit an assert when
the MOCS selected ended up out of range.

Rename the field to avoid confusion (and match documentation) and set it
through a helper function that calls the same old function and shifts it
down to fit.

Fixes: d1109f67bb ("iris: Emit EXECUTE_INDIRECT_DRAW when available")
Fixes: d161e3c2e2 ("iris: Emit a EXECUTE_INDIRECT_DISPATCH when available")
Fixes: 580728564e ("anv: Emit a EXECUTE_INDIRECT_DISPATCH when available")
Fixes: 6d4f43f0d6 ("anv: Emit EXECUTE_INDIRECT_DRAW when available")
Fixes: 7a9e82e82f ("genxml/12.5: Add the EXECUTE_INDIRECT_DISPATCH instruction")
Fixes: 4229757309 ("genxml/12.5: Add the EXECUTE_INDIRECT_DRAW instruction")
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41372>
2026-05-19 22:41:53 +00:00
Iván Briano
7b26ff692b anv: fix return of cmd_buffer_set_indirect_stride() function
Unless the tristate is unset, which is not, it will be true when casted
to bool, as the return of this function expects.

Fixes: 2741ddd75a ("anv: fix issues found with indirect data stride")
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41372>
2026-05-19 22:41:53 +00:00
Konstantin Seurer
690d9b0d00 util/u_trace: Rework resource management
Stops allocating events in chunks. u_trace_event is allocated using a
linear allocator which has minimal overhead. Buffers for timestamps are
allocated using a custom allocator.

As a sideeffect, it is possible to deduplicate consecutive tracepoints.

Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41271>
2026-05-19 20:27:59 +00:00
Samuel Pitoiset
54b71e9e77 util: pass a struct to driParseConfigFiles()
It would be easier to add more functionalities like shader hashes etc.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41657>
2026-05-19 19:51:45 +00:00
José Roberto de Souza
180d8cb544 intel/brw: Fix nir_intrinsic_load_inline_data_intel register offset calculation
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
In case of nir_intrinsic_load_inline_data_intel it was not using base_offset to
create the uniform, instead it was using only the special BRW_INLINE_PARAM_REG
value that later will be replaced by the inline_data fixed register.

So here using base_offset for both intrinsics, adding BRW_INLINE_PARAM_REG if
nir_intrinsic_load_inline_data_intel and then in brw_shader::assign_curb_setup
checking for inst->src[i].nr >= BRW_INLINE_PARAM_REG and adjusting brw_reg by
the remaining of the subtraction with BRW_INLINE_PARAM_REG.

Fixes: 7f19814414 ("brw/nir: handle inline_data_intel more like push_data_intel")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41607>
2026-05-19 19:30:18 +00:00
Karol Herbst
4f5e2e34d3 ci: update traces due to ffma rework
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:42 +00:00
Karol Herbst
e9c1cce35f nir: remove ffma_old
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:42 +00:00
Karol Herbst
a9206a271a intel/brw: port over to nir_op_ffma
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:33 +00:00
Karol Herbst
6208a590cb intel/jay: support nir_op_ffma
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:32 +00:00
Karol Herbst
df69364e69 intel/elk: port over to nir_op_ffma
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:32 +00:00
Karol Herbst
a9b18f8607 nir: rename ffma to ffma_old
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:27 +00:00
Lionel Landwerlin
7882321d4f anv: only reprogram line-stipple if enabled
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41581>
2026-05-19 16:53:38 +00:00
Lionel Landwerlin
d6751f2a3b anv: further optimize dirty state after secondary emission
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41581>
2026-05-19 16:53:38 +00:00
Nanley Chery
ec40c95385 anv: Add transfer_src usage for ANDROID_external_format_resolve
The android extension enables the driver to blit from single-sampled
color attachments.

Adding this image usage expressess that functionality and causes anv to
generate the ISL_FORMAT_RAW-formatted clear color during fast-clears.
This fixes an assert failure when anv tries to override the clear color
format used for a blorp_blit() call to ISL_FORMAT_RAW.

There are other ways to handle this, but this solution is consistent
with our handling of multisample images (which may be resolved as well).

Fixes: 465c186fc5 ("anv: Prepare for format width changes in blorp_copy()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15463
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41650>
2026-05-19 15:41:52 +00:00
Nanley Chery
da547a1a4d intel/blorp: Halve max bpp for some redescribed blits
We cannot use 128bpp formats with Y-tiling on gfx6 and prior.

Fixes: eb8883f3ef ("intel/blorp: Redescribe surfaces for copies")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15435
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41650>
2026-05-19 15:41:52 +00:00
Lionel Landwerlin
f19fc91c51 anv: bump max compute workgroup count
The HW can do up to UINT32_MAX but we're using that value to signal
indirect dispatch arguments.

A game like Resident Evil Requiem will use more than 64k on X
dimension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41592>
2026-05-19 11:23:52 +00:00
Eric Engestrom
28f3f2569d meson/intel: only build libblorp_elk when requested
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
All users already depend on `idep_intel_blorp_elk`.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41617>
2026-05-18 19:09:03 +00:00
Sergi Blanch Torne
70bf937c89 Revert "ci: disable Collabora's farm due to maintenance"
This reverts commit aaec108637.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41635>
2026-05-18 17:00:39 +00:00
Nemallapudi, Jaikrishna
e47ed60ee6 intel/dev: fix timebase_scale ticks-to-ns precision loss across 2^32
Android CTS CtsGpuProfilingDataTest#testProfilingDataProducersAvailable
intermittently fails with "Render stages reported before their
VkQueueSubmit events". Root cause is in the Perfetto clock correlation:
render-stage timestamps go through intel_device_info_timebase_scale()
while VkQueueSubmit packets use BOOTTIME directly, so any drift in the
scaler shows up as render stages preceding their submits.

intel_device_info_timebase_scale() scales the upper and lower halves
of the raw timestamp separately and recombines them, but silently
drops the upper-half division's remainder. When the frequency doesn't
evenly divide 1e9, every wrap past 2^32 loses a fixed number of ns
and shows up as a step in Perfetto's GPU-vs-BOOTTIME snapshot offset.

Carry the upper-half remainder into the lower-half numerator before
dividing, so no precision is lost. All intermediates still fit in
uint64_t.

Cc: mesa-stable
Signed-off-by: Nemallapudi, Jaikrishna <nemallapudi.jaikrishna@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41630>
2026-05-18 10:18:00 +00:00
Calder Young
f60749ff3c brw: Add support for ACCESS_CAN_REORDER memory ordering
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Passes the ACCESS_CAN_REORDER flag from NIR on to the backend so that we
can lower the loads to a non-volatile SEND. This allows the scheduler to
freely reorder them around stores or fences.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41008>
2026-05-17 19:03:24 +00:00
Calder Young
bb4878b203 brw: Allow instruction reordering around memory writes
Our scheduler is overly conservative about reordering instructions
around memory writes or fences. Fortunately, there are several simple
assumptions we can make about our IR to schedule these things a lot
more fluidly:

 * Unless its an EOT, a SEND instruction's side effects will only be
   observed through other SEND instructions

 * The effects of workgroup barriers, memory fences, and BRW_OPCODE_SYNC,
   are only used in the IR to synchronize SEND instructions

 * All other scheduler dependencies related to memory access are already
   expressed through the source and destination operands

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41008>
2026-05-17 19:03:24 +00:00
Caio Oliveira
3f8a083f28 intel/perf: Show type, data type and units in intel_perf_query_layout
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41623>
2026-05-17 16:21:08 +00:00
Caio Oliveira
3628d6e532 intel/perf: Add helpers to get names of enums
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41623>
2026-05-17 16:21:08 +00:00
Lionel Landwerlin
b24a4c3cd0 anv: temporarily reenable scratch page by default
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
A couple of games are showing pagefaults :
  - https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15450
  - https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15474

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04bfdb287b ("anv: Disable scratch page by default on Xe KMD")
Reviewed-by: Calder Young <calder.young@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41596>
2026-05-15 19:47:12 +00:00
Caio Oliveira
f7fed3bdf8 intel/perf: Use intel_perf_context as ralloc parent of sample buffers
Prefer the context instead of the config structure.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41591>
2026-05-15 16:32:49 +00:00
Caio Oliveira
d3cfe04b3d intel: Move cmat configurations to anv_physical_device
Some cooperative properties are defined by the driver itself and
are not a property of the HW.  In particular whether the scope is
subgroup or workgroup is not directly related to the HW.

It could make sense encode the DPAS combinations into intel_device_info
but we are not using all possible combinations yet and wouldn't be very
useful in practice.

The new scheme was based on radv and will set us up for also filling
the flexible dimensions properties too.

Note: this also fixes a subtle issue where ARL was incorrectly inheriting
the PRE_XEHP configurations which included FLOAT16/FLOAT16/FLOAT16/FLOAT16
which it does not support.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41564>
2026-05-15 05:38:49 +00:00
Caio Oliveira
088eeb2d81 anv: When using INTEL_LOWER_DPAS disable BFloat16 cmat configurations
Those configurates are not currently supported by the
emulation pass brw_lower_dpas.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41564>
2026-05-15 05:38:49 +00:00
Caio Oliveira
6bcf70bd85 anv: Remove saturating cmat configurations when INTEL_LOWER_DPAS=1
Since we don't have any DPAS-based implementation of those, it is odd to
support them in the emulation mode that is only enabled with the debug
flag INTEL_LOWER_DPAS nowadays.  Remove it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41564>
2026-05-15 05:38:48 +00:00
Lionel Landwerlin
682dc50776 brw/jay: move sample_mask_in handling to NIR
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41529>
2026-05-14 14:05:06 +00:00
Lionel Landwerlin
df5a6d7b87 brw/jay: move some coarse lowering to NIR
We add a pass to allow testing partially known fs config bits (main
user is DX11 always disabling VRS/coarse).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41529>
2026-05-14 14:05:06 +00:00
Lionel Landwerlin
ccef88173b anv: add SIMD32 requirement heuristic for Dragon Dogma 2
A few compute shaders are doing BC3 image generation on the device and
then generate incorrect data if running at SIMD16. That data is then
sampled in a vertex shader that generates incorrect geometry.

See https://github.com/ValveSoftware/Proton/issues/7595#issuecomment-4343662131

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41501>
2026-05-14 10:39:25 +00:00
Lionel Landwerlin
dfa7e15f7c brw: simplify VF component packing code
We can determine used components earlier.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41501>
2026-05-14 10:39:25 +00:00
Lionel Landwerlin
8e3084dfe6 anv: add an option to disable allocation over subscription
Usually I'm able to run B580 capture on LNL, but in some cases the
oversubscription on replay would lead to allocation failures.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41501>
2026-05-14 10:39:25 +00:00