Commit graph

210577 commits

Author SHA1 Message Date
Lionel Landwerlin
fe38fb858c brw: workaround broken indirect RT messages on Gfx11
Unfortunately we cannot use the indirect descriptor on Gfx11, it
appears to just drop writes. Other platforms appear to be fine.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
Lionel Landwerlin
a0844458b8 brw: enable opt_register_coalesce to work with multiple EOT blocks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
Lionel Landwerlin
c4c7ff3f8f brw: enable register allocation to deal with multiple EOTs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>
2025-08-20 15:01:50 +00:00
David Rosca
325de7fe7e pipe: Remove now unused is_video_target_buffer_supported
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
a4aed7e517 radeonsi: Remove now unused si_vid_is_target_buffer_supported
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
0df4eed1e2 radeonsi/vcn: Support VPE with decode processing
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
10ac8567de radeonsi/vcn: Support EFC with encode processing
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
efc6d27fd4 frontends/va: Add support for decode/encode processing
This implements support for Decode processing, which allows performing a
processing operation on the decoded picture in a single call without
having to use a separate processing context.

This also implements the same functionality for encoding, which is
useful to perform conversion from RGB to YUV in a single call, and it
allows us to properly support the conversion inside encoder (eg. EFC on
AMD).
For Encode processing the additional output buffer is required, the same as
with Decode processing, but the driver may not use it to perform the
conversion (in cases where the conversion can be done by the encoder hw).
This means the contents of the additional buffer are undefined, and
applications should not rely on the buffer actually containing the output
picture of the conversion.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
b0a5d78247 frontends/va: Remove EFC support
It will be moved to encode processing in next commits.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:45 +00:00
David Rosca
d0eec62831 frontends/va: Change vlVaPostProcCompositor to take pipe_vpp_desc arg
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:44 +00:00
David Rosca
d2f3721d99 frontends/va: Refactor vlVaVidEngineBlit
Add struct pipe_vpp_desc as argument.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:44 +00:00
David Rosca
5ae6290446 frontends/va: Cleanup CreateContext
Also create video processor here, instead of when processing first
picture.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
2025-08-20 14:25:44 +00:00
Boris Brezillon
5e01ec4bd0 util/format: Auto-generate a bunch of YUV helpers
Now that the YUV subsampling pattern is encoded in the name, we can
auto-generate a bunch of helpers that were previously hand-written,
and are pretty often lagging behind when new formats are added.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:07 +00:00
Boris Brezillon
f20ee2806e util/format: Add subsampling info to our YUV-as-RGB format names
This will allow for more autogen and is good to have regardless, because
it makes it clear what the subsampling is when looking at the name.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:07 +00:00
Boris Brezillon
75ba8f403d util/format: Use more descriptive names for YUV formats
This is the first step for more auto-generated YUV helpers. We keep
the short/fourcc names as aliases, and generate defines so we don't have
to patch the existing code, but ultimately, it'd be good to consistently
use the fully descriptive names so it's easier to reason about the
formats when reading the code.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:07 +00:00
Boris Brezillon
fabd0d82db util/format: Auto-generate the enum pipe_format definition
I've recently discovered a case where the enum entry was defined, but the
description in the yaml was missing, leading to a NULL deref when we
were querying the util_format_description object for this format.

This autogen of the enum will also allow for more autogen, and proper
classification of formats.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:06 +00:00
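The failure mode described in fabd0d82db (an enum entry with no matching description, ending in a NULL dereference) comes from keeping two lists in sync by hand. The commit generates the enum together with the descriptions; the sketch below illustrates the same single-source idea with a C X-macro, which is a swapped-in technique for illustration and not what Mesa does. All names (`DEMO_FORMATS`, `demo_format`, `demo_format_description`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format list: the single source of truth.
 * Columns: enum identifier, short name, bytes per block. */
#define DEMO_FORMATS(X)                    \
   X(DEMO_FORMAT_NONE,      "none",      0) \
   X(DEMO_FORMAT_R8_UNORM,  "r8_unorm",  1) \
   X(DEMO_FORMAT_R16_UNORM, "r16_unorm", 2)

/* The enum is expanded from the list ... */
enum demo_format {
#define X(id, name, bytes) id,
   DEMO_FORMATS(X)
#undef X
   DEMO_FORMAT_COUNT
};

struct demo_format_description {
   const char *name;
   unsigned bytes_per_block;
};

/* ... and so is the description table, so an enum entry cannot exist
 * without a description. */
static const struct demo_format_description
demo_descriptions[DEMO_FORMAT_COUNT] = {
#define X(id, name, bytes) [id] = { name, bytes },
   DEMO_FORMATS(X)
#undef X
};

/* Lookup that never returns NULL for a valid enum value. */
static const struct demo_format_description *
demo_format_description(enum demo_format format)
{
   assert((unsigned)format < DEMO_FORMAT_COUNT);
   return &demo_descriptions[format];
}
```

Adding a format means adding one line to `DEMO_FORMATS`; both the enum and the table pick it up on the next build.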
David Rosca
20ad09af25 radeonsi: Map X6R10/X6R10X6G10 formats to R16/R16G16
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:06 +00:00
David Rosca
ddb42b2fc5 auxiliary/vl: Map X6R10/X6R10X6G10 formats to R16/R16G16
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>
2025-08-20 14:01:05 +00:00
Eric Engestrom
1fad1516b8 meson: add spirv-tools option to disable the optional dependency
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36689>
2025-08-20 12:50:40 +00:00
Michal Krol
e3476b4dbd lavapipe: Bump maxTransformFeedbackBufferDataStride to 2048.
D3D10 requires SO buffer stride to be at least 2048 bytes.

Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36842>
2025-08-20 12:33:00 +00:00
Mary Guillemard
1d03897564 pan/bi: Run opt_sink and opt_move in preprocess
We can do some movement for UBO and SSBO after they are lowered in
preprocess.

We already do this in postprocess but this now also catches SSBOs as they
are lowered in postprocess.

Overall, this reduces fills (fewer loads from TLS) in fossils (excluding
parallel-rdp, as it still crashes):

Totals:
Instrs: 115242 -> 115046 (-0.17%); split: -0.20%, +0.03%
CodeSize: 1168896 -> 1164928 (-0.34%); split: -0.35%, +0.01%
Estimated normalized CVT cycles: 762.015625 -> 757.109375 (-0.64%); split: -0.75%, +0.11%
Estimated normalized Load/Store cycles: 12693.0 -> 12680.0 (-0.10%); split: -0.11%, +0.01%
Number of spill instructions: 358 -> 359 (+0.28%)
Number of fill instructions: 1600 -> 1584 (-1.00%)

Totals from 127 (15.82% of 803) affected shaders:
Instrs: 31753 -> 31557 (-0.62%); split: -0.73%, +0.12%
CodeSize: 335104 -> 331136 (-1.18%); split: -1.22%, +0.04%
Estimated normalized CVT cycles: 205.546875 -> 200.640625 (-2.39%); split: -2.78%, +0.40%
Estimated normalized Load/Store cycles: 3935.0 -> 3922.0 (-0.33%); split: -0.36%, +0.03%
Number of spill instructions: 124 -> 125 (+0.81%)
Number of fill instructions: 452 -> 436 (-3.54%)

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
7e86653a6f pan/bi: remove dead variables in preprocess
This should have no effect apart from cleaning up NIR_DEBUG print outputs a
bit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
bc8a277551 pan/bi: Split bi_optimize_nir and run bi_optimize_loop_nir in preprocess
We now have bi_optimize_loop_nir following optimize_nir from NAK.

Overall, the more we can clean up early the better; this shouldn't cause
many changes.

For fossils/sascha-willems:
Totals:
Instrs: 40884 -> 40879 (-0.01%); split: -0.02%, +0.01%
Estimated normalized FMA cycles: 588.078125 -> 588.015625 (-0.01%)
Estimated normalized CVT cycles: 249.875 -> 249.859375 (-0.01%); split: -0.04%, +0.04%

Totals from 9 (1.44% of 627) affected shaders:
Instrs: 1521 -> 1516 (-0.33%); split: -0.66%, +0.33%
Estimated normalized FMA cycles: 9.1875 -> 9.125 (-0.68%)
Estimated normalized CVT cycles: 11.125 -> 11.109375 (-0.14%); split: -0.98%, +0.84%

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
6ab7a03aef panfrost: Split texture lowering passes
We now have lower_texture_early and lower_texture.

lower_texture_early handles nir_lower_tex and (in the future) could handle
anything backend specific that needs to happen before nir_lower_io.

lower_texture handles actual lowering of backend specific things that
must happen after nir_lower_tex and nir_lower_io.

This allows us to finally not run nir_lower_tex two times in panvk.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
310eabacc0 panfrost: Move nir_lower_io outside of postprocess
Moving it out of there will allow us to shuffle and move API specific parts
out of there.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
a3f935c850 panfrost: Split compilers preprocess_nir
As we are going to move texture and IO lowering, this splits the preprocess
functions in two, one handling preprocess and the other postprocess.

The split is done right before lower_io and has no functional change for
now.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
04e9a93339 panvk: Lower sampler and texture index in case of offset
We are going to move to run nir_lower_tex once and before
lower_descriptors.

To avoid needing to rerun it, let's never generate a sampler or texture
index in lower_descriptors when offset is present.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
62bfd3f132 panvk: Remove unused color_output_var function in fb_preload
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
5aba96d4ac pan/bi: Stop exposing bifrost_nir_lower_load_output
Unused outside of pan/bi and also remove orphan bifrost_nir_lower_xfb
declaration.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
7ba81b5f95 pan/bi: Move pan_lower_sample_pos to next block
This should only run on frag shaders, let's group it the same way we
have it in midgard compiler.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Yonggang Luo
9034a19aba radv: Fixes warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type';
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): error C2220: the following warning is treated as an error
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

../src/amd/vulkan/radv_dgc.c(2155): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_dgc.c(2155): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_dgc.c(2155): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Yonggang Luo
58e55a9e45 radv: Fixes warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast
../src/amd/vulkan/radv_pipeline.c(148): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_pipeline.c(148): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline.c(148): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/radv_pipeline.c(150): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline.c(150): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Yonggang Luo
1430798eac radv: Fixes warning implicit conversion from enum type
../src/amd/vulkan/radv_pipeline_rt.c(142): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_pipeline_rt.c(142): warning C5286: implicit conversion from enum type 'VkShaderGroupShaderKHR' to enum type 'VkRayTracingShaderGroupTypeKHR'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline_rt.c(142): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
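The three radv commits above (9034a19aba, 58e55a9e45, 1430798eac) address MSVC warnings C5287 and C5286, whose messages both end with "use an explicit cast to silence this warning". A minimal sketch of that pattern follows; the enum and function names are hypothetical stand-ins and are not taken from radv.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for two unrelated enum types that happen to
 * share numeric values. */
enum demo_event_type { DEMO_EVENT_DRAW = 1, DEMO_EVENT_DISPATCH = 2 };
enum demo_api_type   { DEMO_API_DRAW = 1, DEMO_API_DISPATCH = 2 };

/* Using two different enum types as operands of one operator is the
 * situation the C5287 message describes. Casting both sides to a common
 * integer type states the intent explicitly. */
static bool
demo_event_matches_api(enum demo_event_type event, enum demo_api_type api)
{
   return (unsigned)event == (unsigned)api;
}

/* Converting one enum type to another without a cast is the situation
 * the C5286 message describes. The explicit cast makes it visible. */
static enum demo_api_type
demo_event_to_api(enum demo_event_type event)
{
   assert(event == DEMO_EVENT_DRAW || event == DEMO_EVENT_DISPATCH);
   return (enum demo_api_type)event;
}
```

The casts do not change the generated values; they only document that mixing the two types is intentional.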
Yonggang Luo
652e0d8ccf amdcommon: Use { 0 } initialize struct for .c files
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Lionel Landwerlin
ed471927e5 vulkan/runtime: use a pipeline flag for unaligned dispatches
The problem with the current flag is that it seems to belong to
VkShaderCreateFlagsEXT, not VkPipelineShaderStageCreateFlagBits.

Also it is completely skipped by the vk_pipeline.c code.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7b634ebb63 ("vulkan/runtime: Add VK_SHADER_CREATE_UNALIGNED_DISPATCH_BIT_MESA flag")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36828>
2025-08-20 11:17:52 +00:00
David Rosca
f4808ea46f radv/video: Add support for VK_KHR_video_encode_intra_refresh
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>
2025-08-20 10:58:00 +00:00
David Rosca
c1610da677 vulkan/video: Add intra refresh support
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>
2025-08-20 10:57:59 +00:00
Georg Lehmann
639b91bb48 aco/isel: fix vectorized i2i16 with 8bit vec8 source
The extract index is in dwords, not bytes.

Fixes: 92d433c54a ("aco: vectorize conversions from 8bit to 16bit")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>
2025-08-20 10:13:22 +00:00
David Rosca
638fa01203 radv/video: Enable AV1 decode workaround for gfx1153
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>
2025-08-20 09:51:32 +00:00
David Rosca
4893e09c10 radeonsi/vcn: Enable AV1 decode workaround for gfx1153
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>
2025-08-20 09:51:32 +00:00
David Rosca
231d877cc8 ac/vcn_dec: Add av1_intrabc_workaround
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>
2025-08-20 09:51:32 +00:00
Valentine Burley
021a3f768b zink/ci: Update expectations from nightly jobs
Document current failures and flakes from the nightly jobs.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>
2025-08-20 08:53:36 +00:00
Valentine Burley
c4d8c5ed4a zink/ci: Switch to quick_gl profile for nightly ANV jobs
The full nightly jobs have been failing for a while without much interest
in them.

Reduce Piglit coverage by switching to the `quick_gl` profile, which
is what the pre-merge jobs run.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>
2025-08-20 08:53:36 +00:00
Valentine Burley
6b88e2bd38 anv/ci: Update expectations from nightly jobs
Document current failures and flakes from the nightly jobs, and add a
skip for tests that are timing out.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>
2025-08-20 08:53:36 +00:00
Valentine Burley
e4fc3e4ee6 anv/ci: Lower concurrency for nightly jobs
The nightly jobs can hit OOMs on JSL and ADL, so reduce the number of
threads used by deqp-runner to avoid that.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>
2025-08-20 08:53:36 +00:00
Job Noorman
7752cc26c4 ir3: use offset_shift for SSBO intrinsics
Our SSBO access instructions expect offsets in units of the accessed
type's size. However, we were ingesting SSBO intrinsics that use byte
addresses. We were fixing this up in ir3_nir_lower_io_offsets by
inserting a ushr or, if possible, propagating this shift into another
shift that's part of the address calculation.

Having to insert a ushr is unfortunate, as for most accesses, it should
be possible to extract this shift directly from the access chain because
the array strides and struct offsets would be properly aligned. It also
prevents nir_opt_offsets from finding constant additions to extract as they
would be hidden behind a ushr that often cannot be optimized away.

57ea689273 ("ir3: optimize SSBO offset shifts for nir_opt_offsets")
tried to overcome the latter problem somewhat by pushing a ushr into
additions. This turned out to be unsound because even though SSBO
offsets are unsigned, intermediate results in the offset calculation
might be negative values which means we should use ishr in those cases.
Unfortunately, we cannot know when to use ushr or ishr.

This commit switches ir3 to the newly introduced offset_shift index for
SSBO intrinsics. This allows the shift to be extracted when lowering
derefs in nir_lower_explicit_io. In some cases, we still might have to add an
extra shift to make sure the offset uses the correct units. It turns out
that this is very rare and using offset_shift greatly improves the
shader stats:

Totals from 33267 (20.20% of 164705) affected shaders:
MaxWaves: 440368 -> 455258 (+3.38%); split: +3.40%, -0.01%
Instrs: 22974358 -> 21844188 (-4.92%); split: -4.98%, +0.06%
CodeSize: 45456418 -> 43099334 (-5.19%); split: -5.22%, +0.03%
NOPs: 4612549 -> 4524353 (-1.91%); split: -2.97%, +1.05%
MOVs: 802018 -> 817547 (+1.94%); split: -3.29%, +5.23%
COVs: 381987 -> 382061 (+0.02%); split: -0.03%, +0.05%
Full: 514078 -> 477339 (-7.15%); split: -7.18%, +0.04%
(ss): 544419 -> 502332 (-7.73%); split: -9.12%, +1.39%
(sy): 292099 -> 304697 (+4.31%); split: -3.19%, +7.50%
(ss)-stall: 2106134 -> 2104011 (-0.10%); split: -1.82%, +1.71%
(sy)-stall: 9704720 -> 10324864 (+6.39%); split: -4.64%, +11.03%
STPs: 11301 -> 10074 (-10.86%)
LDPs: 18654 -> 17202 (-7.78%)
Preamble Instrs: 4652214 -> 4580289 (-1.55%); split: -1.59%, +0.04%
Early Preamble: 13977 -> 13978 (+0.01%)
Constlen: 1881764 -> 1881304 (-0.02%); split: -0.03%, +0.01%
Last helper: 5157587 -> 5074042 (-1.62%); split: -1.86%, +0.24%
Subgroup size: 2262976 -> 2263232 (+0.01%)
Cat0: 5065452 -> 4976324 (-1.76%); split: -2.73%, +0.97%
Cat1: 1241085 -> 1251974 (+0.88%); split: -2.52%, +3.40%
Cat2: 8462897 -> 7723367 (-8.74%); split: -8.74%, +0.01%
Cat3: 5738382 -> 5735312 (-0.05%); split: -0.06%, +0.00%
Cat5: 761945 -> 763017 (+0.14%); split: -0.00%, +0.14%
Cat6: 199819 -> 197766 (-1.03%); split: -1.34%, +0.31%
Cat7: 890192 -> 581842 (-34.64%); split: -35.20%, +0.57%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
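The unsoundness described in 7752cc26c4 (pushing a ushr into an addition whose intermediate terms may be negative) can be reproduced with plain 32-bit arithmetic. The sketch below is an illustration only: `ushr` and `ishr` mirror the NIR opcode names but are ordinary C helpers written for this example, and the `shift_after_add`/`push_*` functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: zero-fills from the top. */
static uint32_t
ushr(uint32_t v, unsigned s)
{
   assert(s < 32);
   return v >> s;
}

/* Arithmetic right shift on a 32-bit pattern: replicates the sign bit.
 * Written on unsigned values to avoid implementation-defined behavior. */
static uint32_t
ishr(uint32_t v, unsigned s)
{
   assert(s < 32);
   uint32_t r = v >> s;
   if (s > 0 && (v & 0x80000000u))
      r |= ~(UINT32_MAX >> s);
   return r;
}

/* Reference result: shift the final byte offset a + b. */
static uint32_t
shift_after_add(uint32_t a, uint32_t b, unsigned s)
{
   return ushr(a + b, s);
}

/* Shift pushed into the addition with ushr. */
static uint32_t
push_ushr(uint32_t a, uint32_t b, unsigned s)
{
   return ushr(a, s) + ushr(b, s);
}

/* Shift pushed into the addition with ishr. */
static uint32_t
push_ishr(uint32_t a, uint32_t b, unsigned s)
{
   return ishr(a, s) + ishr(b, s);
}
```

With a = 16 and b = (uint32_t)-8, the reference result is 2, `push_ishr` also gives 2, but `push_ushr` gives 0x40000002 because the negative term is zero-filled. With a = 0x80000000 and b = 0 the roles flip: `push_ushr` matches the reference 0x20000000 while `push_ishr` gives 0xE0000000. Neither shift is right for every operand, which is the "we cannot know when to use ushr or ishr" problem from the commit message.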
Job Noorman
30716cc524 nir/lower_explicit_io: add support for offset_shift
The goal here is to generate addresses that are a right-shifted version
of the actual byte address and record the shift amount in the
offset_shift index. While we could just insert a ushr at the end of
deref chains, this will prevent the shift from being optimized away in many
cases. Instead, we try to extract the shift from the array strides and
struct offsets that make up the deref chain, and only insert a ushr when
absolutely necessary (i.e., for casts). This means we have to walk the
entire deref chain at once for accesses that support offset_shift and we
don't use the standard algorithm of replacing each deref one at a time.

To be able to legally right-shift casts, we use the alignment
information and never shift more than what the alignment could support.
It should also be noted that casts generally have two sources: something
provided by the driver (e.g., a Vulkan resource index) or a variable
pointer coming from a phi/bcsel. For the latter, the entire access chain
consists of multiple parts that are ended by either a phi/bcsel or an
access. Only the part that ends in an access is handled by this new
algorithm; the other parts are handled as usual. This is necessary
because we have no way to encode the offset shift or to even know how
much we would be able to shift without knowing how it is accessed.

This commit adds the general implementation for lowering accesses using
offset_shift and adds a compiler option for drivers to enable it for
SSBO accesses.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
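One way to picture the idea in 30716cc524 is to take the shift out of the array strides and struct offsets up front, so the final sum never needs a ushr. The sketch below is a simplified model under stated assumptions, not the NIR implementation: every `demo_` name is hypothetical, struct member offsets are folded into a single constant, and the base alignment is modeled as one power-of-two value (NIR also tracks an offset within that alignment, which is left out here).

```c
#include <assert.h>
#include <stdint.h>

/* One array step of a hypothetical access chain: index * stride bytes. */
struct demo_array_step {
   uint32_t index;
   uint32_t stride;
};

/* Count trailing zero bits: the largest shift that divides v exactly.
 * Zero is divisible by any power of two, so report 32 for it. */
static unsigned
demo_ctz(uint32_t v)
{
   unsigned n = 0;
   if (v == 0)
      return 32;
   while (!(v & 1)) {
      v >>= 1;
      n++;
   }
   return n;
}

static unsigned
demo_min(unsigned a, unsigned b)
{
   return a < b ? a : b;
}

/* Largest shift, capped at max_shift, that the base alignment, the
 * constant offset and every stride can absorb without losing bits. */
static unsigned
demo_common_shift(const struct demo_array_step *steps, unsigned count,
                  uint32_t const_offset, uint32_t base_align,
                  unsigned max_shift)
{
   assert(max_shift < 32);
   unsigned shift = demo_min(max_shift, demo_ctz(base_align));
   shift = demo_min(shift, demo_ctz(const_offset));
   for (unsigned i = 0; i < count; i++)
      shift = demo_min(shift, demo_ctz(steps[i].stride));
   return shift;
}

/* Offset in units of (1 << shift) bytes, built from pre-shifted strides
 * instead of shifting the final byte offset. */
static uint32_t
demo_shifted_offset(const struct demo_array_step *steps, unsigned count,
                    uint32_t const_offset, unsigned shift)
{
   assert(shift < 32);
   uint32_t offset = const_offset >> shift;
   for (unsigned i = 0; i < count; i++) {
      assert((steps[i].stride & ((1u << shift) - 1)) == 0);
      offset += steps[i].index * (steps[i].stride >> shift);
   }
   return offset;
}
```

For a chain with steps (index 3, stride 16) and (index 1, stride 4), a constant offset of 8 and a 16-byte aligned base, a requested shift of 2 is fully absorbed: the shifted offset is 15, and 15 << 2 recovers the byte offset 60. If the base were only 2-byte aligned, the shift would be capped at 1, which reflects the commit's rule of never shifting more than the alignment supports.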
Job Noorman
1406eafbcd nir/lower_explicit_io: add alignment parameters to address builder
We will need this when building shifted addresses. Since adding these
parameters has a lot of code churn which would distract from the main
changes, it is split-off in a separate commit.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
Job Noorman
553a439b54 nir/lower_explicit_io: use nir_io_offset to pass around addresses
We will add support for shifted addresses; this commit makes sure the
APIs of the functions already support passing shifts.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
Job Noorman
4c9afbd01d nir/lower_explicit_io: add helper to build address
The helper is used to build the address passed to
build_explicit_io_load/store. For now, it simply takes care of adding
the component offset when scalarizing. In the future, this can be used
to do more complex address manipulations, like calculating the full
deref chain address.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00