Commit graph

15285 commits

Author SHA1 Message Date
Nanley Chery
fe372f3b1b anv: Don't allow STORAGE + CCS for Y_TILED mod
This can happen as a result of us adding on CCS to modifiers which don't
support it on gfx9-11.

Fixes image corruption seen with the following test:

   $ mpv av://lavfi:testsrc --config=no --vo=gpu-next --scale=ewa_lanczossharp --fs

Fixes: 01c4ea771c ("anv: Enable storage accesses with modifiers on gfx12+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12910
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38855>
2025-12-10 20:09:09 +00:00
Caio Oliveira
7bd238fa5a brw: Properly set 'desc as register' for SEND in assembler
The non-split SEND case was missing setting this.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38876>
2025-12-10 19:46:52 +00:00
Yonggang Luo
be4ad5c819 meson: Remove VK_ICD_FILENAMES totally from source tree.
This is a follow up of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28516

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@google.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> hk changes
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for RADV changes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38637>
2025-12-10 14:46:11 +00:00
Dylan Baker
938fb7703e anv/video: Cast intentional read past end of struct member to void*
Coverity notices that we read past the end of the array we're pointing
to, which is intentional, we want to copy additional members from the
source struct into the target pointer. As such, cast to a `void *`,
since this will make Coverity happy.

CID: 1649589
Fixes: 314de7af06 ("anv: Initial support for VP9 decoding")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38438>
2025-12-10 14:18:59 +00:00
Valentine Burley
4cbf5062b7 ci: Uprev GL & GLES CTS
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
2025-12-10 11:31:33 +00:00
Valentine Burley
a65a7dbac9 ci: Uprev VKCTS
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
2025-12-10 11:31:31 +00:00
Valentine Burley
3bb9880468 anv/ci: Increase timeout for nightly JSL job
This has been timing out for a while now.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
2025-12-10 11:31:30 +00:00
Lionel Landwerlin
6e92720ece anv/brw: drop cs_prog_key::lower_unaligned_dispatch usage
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38837>
2025-12-10 07:44:31 +00:00
Jianxun Zhang
ff3589b460 anv: Enable compression on importing Android buffers (xe2)
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Consolidate importing paths by using the new importing
function so that compressed buffer can be imported
correctly.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>
2025-12-09 14:16:43 -08:00
Jianxun Zhang
0c523b6661 anv: Use gralloc helper to get tiling
The helper gets tiling and modifier in a single step.
The later will be used in the coming changes.

Copy the changes introduced in
cf5c294df4.
Suggested-by: Juston Li <justonli@google.com>

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>
2025-12-09 14:16:42 -08:00
Jianxun Zhang
7dbff29de1 anv: Replace ANV_MAX_PLANES with ISL_MODIFIER_MAX_PLANES
As discussed in the reviews of cf5c294df4,
the 'plane' in this context means plane of a drm
modifier, so it makes sense to just use the new ISL
macro once it is available.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>
2025-12-09 14:16:42 -08:00
Jianxun Zhang
33074e3ebe isl: Add a macro for number of maximum planes of modifiers
We will need it in multiple places in the following changes.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>
2025-12-09 14:16:42 -08:00
Jianxun Zhang
fa8f98138a anv: And a new function to consolidate import paths
The new added function will be invoked on several paths
of importing Android native and hardware buffers.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>
2025-12-09 14:16:42 -08:00
Dylan Baker
0735551b08 anv/video: Read the right source for memcpy
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
I'm assuming this based off the `if` branch above, after reading the
code for bit that Coverity pointed out in that branch. It doesn't look
correct to start at the base pointer, which will be 0 initialized and
has 52 bits of zero padding, while the default values are 255.

Fixes: 314de7af06 ("anv: Initial support for VP9 decoding")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38437>
2025-12-09 12:15:45 -08:00
Dylan Baker
26aba9dc9f anv/video: void cast array we intentionally read off the end of
Coverity notices we're reading off the end of the array here, which is
true. We also intend to do that because we want to read the next field as
well. Cast to a `void *` to help Coverity out.

CID: 1649593
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38437>
2025-12-09 12:15:35 -08:00
Calder Young
2fbc722dcf anv: Fix misplaced assertion in anv_scratch_pool_alloc
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Fixes: ee42a489 ("anv: Fix scratch pool buffer allocation sizes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38861>
2025-12-09 06:13:53 +00:00
Calder Young
ee42a48984 anv: Fix scratch pool buffer allocation sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38840>
2025-12-08 20:09:57 +00:00
Gil Pedersen
858364be71 intel: Add PIPE_FORMAT_R10G10B10X2_UNORM support
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This utilizes the RGBX format faking logic from e8cd7a30 to enable
PIPE_FORMAT_R10G10B10X2_UNORM renderer support using swizzling.

This format is needed for better HDR rendering support in the iris driver, to
support the Proton / Wine DXGI implementation, which requires an RGBA ordered
renderer for its Vulkan implementation. This in turn requires the Wayland
display to support both alpha and opaque formats. The check currently fails,
since only PIPE_FORMAT_R10G10B10A2_UNORM is exposed when Gallium (iris) is
the DRI Wayland renderer.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38616>
2025-12-03 11:22:38 +00:00
Lionel Landwerlin
86419dd519 brw: remove driver specific load_num_workgroup lowering
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38735>
2025-12-02 22:44:05 +00:00
Lionel Landwerlin
578d2f0daa anv: move load_num_workgroups tracking to driver
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38735>
2025-12-02 22:44:04 +00:00
Calder Young
5bf3546cc6 anv: Use companion cmd buffer for CCS and MCS image barriers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37627>
2025-12-02 22:22:24 +00:00
Calder Young
69f6966ae2 anv: Add shorthand for executing on the companion cmd buffer
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37627>
2025-12-02 22:22:24 +00:00
Calder Young
fe0aed2302 anv: Fix missing const qualifiers on some params in anv_blorp.c
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37627>
2025-12-02 22:22:24 +00:00
Ian Romanick
d64ce23b08 elk: only lower flrp once
No shader-db changes on any Intel platform. Both Iris and Crocus use
st_nir_opts, which calls nir_lower_flrp before brw_nir_optimize. The
call still needs to exist for hasvk, but I don't collect fossil-db data
for hasvk.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12526>
2025-12-02 21:28:05 +00:00
Alyssa Rosenzweig
e4b8b758b1 brw: only lower flrp once
No shader-db changes on any Intel platform.

fossil-db:

Lunar Lake
Totals:
Instrs: 926275147 -> 926273376 (-0.00%); split: -0.00%, +0.00%
Cycle count: 106012190597 -> 106011255305 (-0.00%); split: -0.00%, +0.00%
Spill count: 3424180 -> 3424168 (-0.00%)
Fill count: 4877035 -> 4877017 (-0.00%)
Max live registers: 193918196 -> 193918122 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 49106544 -> 49106448 (-0.00%); split: +0.00%, -0.00%
Non SSA regs after NIR: 231281721 -> 231281719 (-0.00%)

Totals from 1705 (0.08% of 2020028) affected shaders:
Instrs: 926974 -> 925203 (-0.19%); split: -0.28%, +0.09%
Cycle count: 39024288 -> 38088996 (-2.40%); split: -2.77%, +0.37%
Spill count: 2229 -> 2217 (-0.54%)
Fill count: 2977 -> 2959 (-0.60%)
Max live registers: 183056 -> 182982 (-0.04%); split: -0.20%, +0.16%
Max dispatch width: 46880 -> 46784 (-0.20%); split: +0.07%, -0.27%
Non SSA regs after NIR: 263520 -> 263518 (-0.00%)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12526>
2025-12-02 21:28:05 +00:00
Lionel Landwerlin
36ba2672ca anv: reintroduce non independent sets dynamic descriptor optimization
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38678>
2025-12-02 13:25:20 +00:00
Lionel Landwerlin
0ca870c6f3 anv: fix broken ray tracing dynamic descriptors
We completely missed that handling.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e76ed91d3f ("anv: switch over to runtime pipelines")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14284
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38678>
2025-12-02 13:25:20 +00:00
Lionel Landwerlin
5c53c6e693 vulkan/runtime: track dynamic descriptor offsets for RT pipelines
Dynamic descriptors are mapped an array of offsets provided through
vkCmdBindDescriptorSets*() commands.

When pipelines are compiled with independent sets layouts, the
implementation might have to do additional runtime calculation to
figure out what offset in the contiguous array maps to what dynamic
descriptor in the pipeline layout.

For graphics pipelines you can always compute that information when
binding the shaders. There is always a limited amount of shaders (5
max).

For ray tracing pipelines, there could be lots of shaders to process
at every pipeline binding call. Besides there is no interface from the
runtime to the driver to list all the shaders used at the moment.

So do that tracking in the runtime and pass the information down to
the driver through the cmd_set_rt_state() vfunc.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 69a04151db ("vulkan/runtime: add ray tracing pipeline support")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38678>
2025-12-02 13:25:20 +00:00
Lionel Landwerlin
a4e9e660d4 brw/iris: remove fs key for coherent_fb_fetch
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38737>
2025-12-02 12:44:35 +00:00
Lionel Landwerlin
296325b787 anv: add 32-wide subgroup requirement heuristic
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13052
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38705>
2025-12-02 10:30:21 +00:00
Tapani Pälli
b2b5e83894 anv: add vk_wsi_disable_unordered_submits and enable for GTK
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
See radv change 0d9d45db4e for further explanation.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14354
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38765>
2025-12-02 10:22:02 +02:00
Kenneth Graunke
320f91a5ab intel/elk: Also disable output constant offset src folding
Same fix from brw.

Fixes: 9a56672f56 ("nir: add shader_info::disable_input/output_offset_src_constant_folding")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38758>
2025-12-01 20:10:37 +00:00
Hans-Kristian Arntzen
d7cf200b49 vulkan/wsi: Add missing KHR_surface_maintenance1 promotions.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Weird that CTS did not catch that ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 11195eb8de ("vulkan: Add KHR_swapchain_maintenance1 promotions.")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38728>
2025-11-30 12:17:33 +01:00
Hans-Kristian Arntzen
11195eb8de vulkan: Add KHR_swapchain_maintenance1 promotions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37157>
2025-11-30 10:30:53 +01:00
Calder Young
c0d809820f intel: Fix calculation of max_scratch_ids on fused devices
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The subslice IDs provided by the SR0.0 EU register are not adjusted to account
for fusing, so the upper bound max_scratch_ids can vary from device to device
depending on what specific slices were fused during manufacturing.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38689>
2025-11-29 15:10:29 +00:00
Faith Ekstrand
4711e5954e nir: Always use sysvals in lower_input_attachments()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The last holdouts of the var options are gone so we can just emit the
system values.  This is overall simpler as it confines all the sysval to
var logic to nir_lower_sysvals_to_varyings().

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38562>
2025-11-29 00:50:34 +00:00
Marek Olšák
fa0bea5ff8 nir: remove nir_io_add_const_offset_to_base
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_constant_folding does it now.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Marek Olšák
9a56672f56 nir: add shader_info::disable_input/output_offset_src_constant_folding
and set it where needed to prevent nir_opt_constant_folding from breaking
those drivers.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Valentine Burley
f9ef7e0f64 Revert "anv/ci: Run vkd3d job in parallel"
With the new vkd3d-proton uprev, a random crash has appeared when
running in parallel.

This reverts commit 45c9c61ad3.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652>
2025-11-28 11:44:28 +00:00
Samuel Pitoiset
92a468f8f2 ci: uprev vkd3d
vkd3d-proton had an issue with its runner and few tests were excluded
by accident.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652>
2025-11-28 11:44:28 +00:00
Tapani Pälli
ba89826b75 anv: add furmark workaround layer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14274
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38410>
2025-11-28 09:26:41 +00:00
Yonggang Luo
0a32d5e6fd treewide: Use regexp to replace usage of setenv with os_set_option.
setenv\((.*), 1\);
=>
os_set_option($1, true);

setenv\((.*), 0\);
=>
os_set_option($1, false);

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>
2025-11-27 18:22:34 +00:00
Lionel Landwerlin
515d8f8e3a brw: fix sample mask flag emission
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's also used for testing helper invocations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e3328dfa2f ("brw: only initialize sample mask flag if needed")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38699>
2025-11-27 15:59:35 +00:00
Calder Young
09e8a54087 anv: Fix ray query shadow stack buffer size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38685>
2025-11-26 22:49:52 +00:00
Ian Romanick
0c089a5c32 brw: Eliminate duplicate fills
When the register allocator decides to spill a value, all reads of that
value are filled. This can result in cases where the same value is
filled many times in a single block. In those cases, the result of an
earlier fill may still be available when a later fill occurs.

This optimization replaces the later fill with a move from the result of
the earlier fill.

v2: Use FIXED_GRF for register overlap tests. Since this is after
register allocation, the VGRF values will not tell the whole truth.

v3: Use brw_transform_inst. Suggested by Caio. Add
brw_scratch_inst::offset instead of storing it as a source. Suggested by
Lionel.

v4: In intervening spill to the same location also invalidates the
value. 🤦

v5: Don't eliminate a fill if its destination partially overlaps the
preceeding fill destination. Fixes failures in cooperative matrix CTS.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17249903 -> 17249653 (<.01%)
instructions in affected programs: 35550 -> 35300 (-0.70%)
helped: 20 / HURT: 0

total cycles in shared programs: 893092398 -> 893101836 (<.01%)
cycles in affected programs: 2501720 -> 2511158 (0.38%)
helped: 6 / HURT: 14

total fills in shared programs: 1901 -> 1776 (-6.58%)
fills in affected programs: 1757 -> 1632 (-7.11%)
helped: 20 / HURT: 0

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 929949528 -> 926770338 (-0.34%)
Cycle count: 105126671329 -> 104851299099 (-0.26%); split: -0.28%, +0.02%
Fill count: 6520785 -> 5021518 (-22.99%)

Totals from 54281 (2.69% of 2018922) affected shaders:
Instrs: 239616289 -> 236437099 (-1.33%)
Cycle count: 22051883404 -> 21776511174 (-1.25%); split: -1.33%, +0.08%
Fill count: 6406295 -> 4907028 (-23.40%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:13 +00:00
Ian Romanick
d2e3707ecc brw: Eliminate redundant fills and spills
When the register allocator decides to spill a value, all writes to that
value are spilled and all reads are filled. In regions where there is
not high register pressure, a spill of a value may be followed by a fill
of that same file while the spilled register is still live. This
optimization pass finds these cases, and it converts the fill to a move
from the still-live register.

The restriction that the spill and the fill must have matching NoMask
really hampers this optimization. With the restriction removed, the pass
was more than 2x helpful.

v2: Require force_writemask_all to be the same for the spill and the fill.

v3: Use FIXED_GRF for register overlap tests. Since this is after
register allocation, the VGRF values will not tell the whole truth.

v4: Use brw_transform_inst. Suggested by Caio. The allows two of the
loops to be merged. Add brw_scratch_inst::offset instead of storing it
as a source. Suggested by Lionel.

v5: Add no-fill-opt debug option to disable optimizations. Suggested by
Lionel.

v6: Move a calculation outside a loop. Suggested by Lionel.

v7: Check that spill ranges overlap instead of just checking initial
offset. Zero shaders in fossil-db were affected, but some CTS with
spill_fs were fixed (e.g.,
dEQP-VK.subgroups.arithmetic.compute.subgroupmin_uint64_t_requiredsubgroupsize).
Suggested by Lionel.

v8: Add DEBUG_NO_FILL_OPT to debug_bits in
brw_get_compiler_config_value(). Noticed by Lionel.

shader-db:

Lunar Lake
total instructions in shared programs: 17249907 -> 17249903 (<.01%)
instructions in affected programs: 10684 -> 10680 (-0.04%)
helped: 2 / HURT: 0

total cycles in shared programs: 893092630 -> 893092398 (<.01%)
cycles in affected programs: 237320 -> 237088 (-0.10%)
helped: 2 / HURT: 0

total fills in shared programs: 1903 -> 1901 (-0.11%)
fills in affected programs: 110 -> 108 (-1.82%)
helped: 2 / HURT: 0

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19968898 -> 19968778 (<.01%)
instructions in affected programs: 33020 -> 32900 (-0.36%)
helped: 10 / HURT: 0

total cycles in shared programs: 885157211 -> 884925015 (-0.03%)
cycles in affected programs: 39944544 -> 39712348 (-0.58%)
helped: 8 / HURT: 2

total fills in shared programs: 4454 -> 4394 (-1.35%)
fills in affected programs: 2678 -> 2618 (-2.24%)
helped: 10 / HURT: 0

fossil-db:

Lunar Lake
Totals:
Instrs: 930445228 -> 929949528 (-0.05%)
Cycle count: 105195579417 -> 105126671329 (-0.07%); split: -0.07%, +0.00%
Spill count: 3495279 -> 3494400 (-0.03%)
Fill count: 6767063 -> 6520785 (-3.64%)

Totals from 43844 (2.17% of 2018922) affected shaders:
Instrs: 212614840 -> 212119140 (-0.23%)
Cycle count: 19151130510 -> 19082222422 (-0.36%); split: -0.39%, +0.03%
Spill count: 2831100 -> 2830221 (-0.03%)
Fill count: 6128316 -> 5882038 (-4.02%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1001375893 -> 1001113407 (-0.03%)
Cycle count: 92746180943 -> 92679877883 (-0.07%); split: -0.08%, +0.01%
Spill count: 3729157 -> 3728585 (-0.02%)
Fill count: 6697296 -> 6566874 (-1.95%)

Totals from 35062 (1.53% of 2284674) affected shaders:
Instrs: 179819265 -> 179556779 (-0.15%)
Cycle count: 18111194752 -> 18044891692 (-0.37%); split: -0.41%, +0.04%
Spill count: 2453752 -> 2453180 (-0.02%)
Fill count: 5279259 -> 5148837 (-2.47%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:13 +00:00
Ian Romanick
b7f5285ad3 brw: Add fill and spill opcodes for LSC platforms
These opcodes are emitted during register allocation instead of the
scratch reads and writes that were previously emitted. These
instructions contain additional information (i.e., the instruction
encodes the scratch offset) that enable optimizations to be added
later.

The fill and spill opcodes are lowered to scratch reads and writes
shortly after register allocation. Eventually this lower may have some
optimizations (e.g., reuse previous address calculations for successive
spills).

v2: Add brw_scratch_inst::offset instead of storing it as a
source. Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:12 +00:00
Ian Romanick
2215003d95 brw: Add OPT macro to brw_shader.cpp like brw_opt.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:11 +00:00
Ian Romanick
1f42ff530c brw: Return the new register from brw_lower_vgrf_to_fixed_grf
...and make the function public.

v2: s/struct brw_reg/brw_reg/. Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:11 +00:00
Ian Romanick
243a3a4ca7 brw: Don't pass compressed to brw_lower_vgrf_to_fixed_grf
The parameter is never used. It's recalculated in the function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:10 +00:00