fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 15:20:10 +01:00

Author	SHA1	Message	Date
Lionel Landwerlin	fe38fb858c	brw: workaround broken indirect RT messages on Gfx11 Unfortunately we cannot use the indirect descriptor on Gfx11, it appears to just drop writes. Other platforms appear to be fine. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>	2025-08-20 15:01:50 +00:00
Lionel Landwerlin	a0844458b8	brw: enable opt_register_coalesce to work with multiple EOT blocks Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>	2025-08-20 15:01:50 +00:00
Lionel Landwerlin	c4c7ff3f8f	brw: enable register allocation to deal with multiple EOTs Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>	2025-08-20 15:01:50 +00:00
David Rosca	325de7fe7e	pipe: Remove now unused is_video_target_buffer_supported Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	a4aed7e517	radeonsi: Remove now unused si_vid_is_target_buffer_supported Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	0df4eed1e2	radeonsi/vcn: Support VPE with decode processing Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	10ac8567de	radeonsi/vcn: Support EFC with encode processing Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	efc6d27fd4	frontends/va: Add support for decode/encode processing This implements support for Decode processing allowing to perform processing operation on the decoded picture in one single call without having to use separate processing context. This also implements the same functionality for encoding, which is useful to perform conversion from RGB to YUV in a single call, and it allows us to properly support the conversion inside encoder (eg. EFC on AMD). For Encode processing the additional output buffer is required same as with Decode processing, but driver may not use it to perform the conversion (in case where the conversion can be done by the encoder hw). This means the contents of the additional buffer is undefined, and application should not rely on the buffer actually containing output picture of the conversion. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	b0a5d78247	frontends/va: Remove EFC support It will be moved to encode processing in next commits. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:45 +00:00
David Rosca	d0eec62831	frontends/va: Change vlVaPostProcCompositor to take pipe_vpp_desc arg Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:44 +00:00
David Rosca	d2f3721d99	frontends/va: Refactor vlVaVidEngineBlit Add struct pipe_vpp_desc as argument. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:44 +00:00
David Rosca	5ae6290446	frontends/va: Cleanup CreateContext Also create video processor here, instead of when processing first picture. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>	2025-08-20 14:25:44 +00:00
Boris Brezillon	5e01ec4bd0	util/format: Auto-generate a bunch of YUV helpers Now that the YUV subsampling pattern is encoded in the name, we can auto-generate a bunch of helpers that were previously hand-written, and are pretty often lagging behind when new formats are added. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:07 +00:00
Boris Brezillon	f20ee2806e	util/format: Add subsampling info to our YUV-as-RGB format names This will allow for more autogen and is good to have regardless, because it makes it clear what the subsampling is when looking at the name. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:07 +00:00
Boris Brezillon	75ba8f403d	util/format: Use more descriptive names for YUV formats This is the first step for more auto-generated YUV helpers. We keep the short/fourcc names as aliases, and generate defines so we don't have to patch the existing code, but ultimately, it'd be good to consistently use the fully descriptive names so it's easier to reason about the formats when reading the code. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:07 +00:00
Boris Brezillon	fabd0d82db	util/format: Auto-generate the enum pipe_format definition I've recently discovered a case where the enum entry was defined, but the description in the yaml was missing, leading to a NULL deref when we were querying the util_format_description object for this format. This autogen of the enum will also allow for more autogen, and proper classification of formats. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:06 +00:00
David Rosca	20ad09af25	radeonsi: Map X6R10/X6R10X6G10 formats to R16/R16G16 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:06 +00:00
David Rosca	ddb42b2fc5	auxiliary/vl: Map X6R10/X6R10X6G10 formats to R16/R16G16 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35177>	2025-08-20 14:01:05 +00:00
Eric Engestrom	1fad1516b8	meson: add spirv-tools option to disable the optional dependency Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36689>	2025-08-20 12:50:40 +00:00
Michal Krol	e3476b4dbd	lavapipe: Bump maxTransformFeedbackBufferDataStride to 2048. D3D10 requires SO buffer stride to be at least 2048 bytes. Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36842>	2025-08-20 12:33:00 +00:00
Mary Guillemard	1d03897564	pan/bi: Run opt_sink and opt_move in preprocess We can do some movement for UBO and SSBO after they are lowered in preprocess. We already do this in postprocess but this now also catch SSBOs as they are lowered in postprocess. Overall, reduce fills (less load from TLS) in fossils (excluding parallel-rdp as it crash still): Totals: Instrs: 115242 -> 115046 (-0.17%); split: -0.20%, +0.03% CodeSize: 1168896 -> 1164928 (-0.34%); split: -0.35%, +0.01% Estimated normalized CVT cycles: 762.015625 -> 757.109375 (-0.64%); split: -0.75%, +0.11% Estimated normalized Load/Store cycles: 12693.0 -> 12680.0 (-0.10%); split: -0.11%, +0.01% Number of spill instructions: 358 -> 359 (+0.28%) Number of fill instructions: 1600 -> 1584 (-1.00%) Totals from 127 (15.82% of 803) affected shaders: Instrs: 31753 -> 31557 (-0.62%); split: -0.73%, +0.12% CodeSize: 335104 -> 331136 (-1.18%); split: -1.22%, +0.04% Estimated normalized CVT cycles: 205.546875 -> 200.640625 (-2.39%); split: -2.78%, +0.40% Estimated normalized Load/Store cycles: 3935.0 -> 3922.0 (-0.33%); split: -0.36%, +0.03% Number of spill instructions: 124 -> 125 (+0.81%) Number of fill instructions: 452 -> 436 (-3.54%) Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	7e86653a6f	pan/bi: remove dead variables in preprocess This should have no effect apart from cleaning up NIR_DEBUG print outputs a bit. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	bc8a277551	pan/bi: Split bi_optimize_nir and run bi_optimize_loop_nir in preprocess We now have bi_optimize_loop_nir following optimize_nir from NAK. Overall the more we can cleanup early the better, shouldn't cause much changes. For fossils/sascha-willems: Totals: Instrs: 40884 -> 40879 (-0.01%); split: -0.02%, +0.01% Estimated normalized FMA cycles: 588.078125 -> 588.015625 (-0.01%) Estimated normalized CVT cycles: 249.875 -> 249.859375 (-0.01%); split: -0.04%, +0.04% Totals from 9 (1.44% of 627) affected shaders: Instrs: 1521 -> 1516 (-0.33%); split: -0.66%, +0.33% Estimated normalized FMA cycles: 9.1875 -> 9.125 (-0.68%) Estimated normalized CVT cycles: 11.125 -> 11.109375 (-0.14%); split: -0.98%, +0.84% Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	6ab7a03aef	panfrost: Split texture lowering passes We now have lower_texture_early and lower_texture. lower_texture_early handle nir_lower_tex and (in the future) could handle anything that is backend specific that need to happen before nir_lower_io. lower_texture handles actual lowering of backend specific things that must happen after nir_lower_tex and nir_lower_io. This allows us to finally not run nir_lower_tex two times in panvk. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	310eabacc0	panfrost: Move nir_lower_io outside of postprocess Moving it out of there will allow us to shuffle and move API specific parts out of there. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	a3f935c850	panfrost: Split compilers preprocess_nir As we are going to move texture and IO lowering, this split preprocess functions in two, one handling preprocess the other postprocess. The split is done right before lower_io and has no functional change for now. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	04e9a93339	panvk: Lower sampler and texture index in case of offset We are going to move to run nir_lower_tex once and before lower_descriptors. To avoid needing to rerun it, let's never generate a sampler or texture index in lower_descriptors when offset is present. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	62bfd3f132	panvk: Remove unused color_output_var function in fb_preload Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	5aba96d4ac	pan/bi: Stop exposing bifrost_nir_lower_load_output Unused outside of pan/bi and also remove orphan bifrost_nir_lower_xfb declaration. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Mary Guillemard	7ba81b5f95	pan/bi: Move pan_lower_sample_pos to next block This should only run on frag shaders, let's group it the same way we have it in midgard compiler. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>	2025-08-20 12:11:43 +00:00
Yonggang Luo	9034a19aba	radv: Fixes warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; ../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): error C2220: the following warning is treated as an error ../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning ../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings ../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning ../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings ../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning ../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings ../src/amd/vulkan/radv_dgc.c(2155): error C2220: the following warning is treated as an error ../src/amd/vulkan/radv_dgc.c(2155): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning ../src/amd/vulkan/radv_dgc.c(2155): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>	2025-08-20 11:39:19 +00:00
Yonggang Luo	58e55a9e45	radv: Fixes warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast ../src/amd/vulkan/radv_pipeline.c(148): error C2220: the following warning is treated as an error ../src/amd/vulkan/radv_pipeline.c(148): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning ../src/amd/vulkan/radv_pipeline.c(148): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings ../src/amd/vulkan/radv_pipeline.c(150): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning ../src/amd/vulkan/radv_pipeline.c(150): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>	2025-08-20 11:39:19 +00:00
Yonggang Luo	1430798eac	radv: Fixes warning implicit conversion from enum type ../src/amd/vulkan/radv_pipeline_rt.c(142): error C2220: the following warning is treated as an error ../src/amd/vulkan/radv_pipeline_rt.c(142): warning C5286: implicit conversion from enum type 'VkShaderGroupShaderKHR' to enum type 'VkRayTracingShaderGroupTypeKHR'; use an explicit cast to silence this warning ../src/amd/vulkan/radv_pipeline_rt.c(142): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>	2025-08-20 11:39:19 +00:00
Yonggang Luo	652e0d8ccf	amdcommon: Use { 0 } initialize struct for .c files Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>	2025-08-20 11:39:19 +00:00
Lionel Landwerlin	ed471927e5	vulkan/runtime: use a pipeline flag for unaligned dispatches The problem with the current flag is that it seems to belong to VkShaderCreateFlagsEXT, not VkPipelineShaderStageCreateFlagBits. Also it is completely skipped by the vk_pipeline.c code. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7b634ebb63` ("vulkan/runtime: Add VK_SHADER_CREATE_UNALIGNED_DISPATCH_BIT_MESA flag") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36828>	2025-08-20 11:17:52 +00:00
David Rosca	f4808ea46f	radv/video: Add support for VK_KHR_video_encode_intra_refresh Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>	2025-08-20 10:58:00 +00:00
David Rosca	c1610da677	vulkan/video: Add intra refresh support Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>	2025-08-20 10:57:59 +00:00
Georg Lehmann	639b91bb48	aco/isel: fix vectorized i2i16 with 8bit vec8 source The extract index is in dwords, not bytes. Fixes: `92d433c54a` ("aco: vectorize conversions from 8bit to 16bit") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>	2025-08-20 10:13:22 +00:00
David Rosca	638fa01203	radv/video: Enable AV1 decode workaround for gfx1153 Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
David Rosca	4893e09c10	radeonsi/vcn: Enable AV1 decode workaround for gfx1153 Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
David Rosca	231d877cc8	ac/vcn_dec: Add av1_intrabc_workaround Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>	2025-08-20 09:51:32 +00:00
Valentine Burley	021a3f768b	zink/ci: Update expectations from nightly jobs Document current failures and flakes from the nightly jobs. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	c4d8c5ed4a	zink/ci: Switch to quick_gl profile for nightly ANV jobs The full nightly jobs have been failing for a while without much interest in them. Reduce Piglit coverage by switching to the `quick_gl` profile, which is what the pre-merge jobs run. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	6b88e2bd38	anv/ci: Update expectations from nightly jobs Document current failures and flakes from the nightly jobs, and add a skip for tests that are timing out. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Valentine Burley	e4fc3e4ee6	anv/ci: Lower concurrency for nightly jobs The nightly jobs can hit OOMs on JSL and ADL, so reduce the number of threads used by deqp-runner to avoid that. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>	2025-08-20 08:53:36 +00:00
Job Noorman	7752cc26c4	ir3: use offset_shift for SSBO intrinsics Our SSBO access instructions expect offsets in units of the accessed type's size. However, we were ingesting SSBO intrinsics that use byte addresses. We were fixing this up in ir3_nir_lower_io_offsets by inserting a ushr or, if possible, propagating this shift into another shift that's part of the address calculation. Having to insert a ushr is unfortunate, as for most accesses, it should be possible to extract this shift directly from the access chain because the array strides and struct offsets would be properly aligned. It also prohibits nir_opt_offsets to find constant additions to extract as they would be hidden behind a ushr that often cannot be optimized away. `57ea689273` ("ir3: optimize SSBO offset shifts for nir_opt_offsets") tried to overcome the latter problem somewhat by pushing a ushr into additions. This turned out to be unsound because even though SSBO offsets are unsigned, intermediate results in the offset calculation might be negative values which means we should use ishr in those cases. Unfortunately, we cannot know when to use ushr or ishr. This commit switches ir3 to the newly introduced offset_shift index for SSBO intrinsics. This allows the shift to be extracted when lowering derefs in nir_lower_explicit_io. In some cases, we still might have to add an extra shift to make sure the offset uses the correct units. It turns out that this is very rare and using offset_shift greatly improves the shader stats: Totals from 33267 (20.20% of 164705) affected shaders: MaxWaves: 440368 -> 455258 (+3.38%); split: +3.40%, -0.01% Instrs: 22974358 -> 21844188 (-4.92%); split: -4.98%, +0.06% CodeSize: 45456418 -> 43099334 (-5.19%); split: -5.22%, +0.03% NOPs: 4612549 -> 4524353 (-1.91%); split: -2.97%, +1.05% MOVs: 802018 -> 817547 (+1.94%); split: -3.29%, +5.23% COVs: 381987 -> 382061 (+0.02%); split: -0.03%, +0.05% Full: 514078 -> 477339 (-7.15%); split: -7.18%, +0.04% (ss): 544419 -> 502332 (-7.73%); split: -9.12%, +1.39% (sy): 292099 -> 304697 (+4.31%); split: -3.19%, +7.50% (ss)-stall: 2106134 -> 2104011 (-0.10%); split: -1.82%, +1.71% (sy)-stall: 9704720 -> 10324864 (+6.39%); split: -4.64%, +11.03% STPs: 11301 -> 10074 (-10.86%) LDPs: 18654 -> 17202 (-7.78%) Preamble Instrs: 4652214 -> 4580289 (-1.55%); split: -1.59%, +0.04% Early Preamble: 13977 -> 13978 (+0.01%) Constlen: 1881764 -> 1881304 (-0.02%); split: -0.03%, +0.01% Last helper: 5157587 -> 5074042 (-1.62%); split: -1.86%, +0.24% Subgroup size: 2262976 -> 2263232 (+0.01%) Cat0: 5065452 -> 4976324 (-1.76%); split: -2.73%, +0.97% Cat1: 1241085 -> 1251974 (+0.88%); split: -2.52%, +3.40% Cat2: 8462897 -> 7723367 (-8.74%); split: -8.74%, +0.01% Cat3: 5738382 -> 5735312 (-0.05%); split: -0.06%, +0.00% Cat5: 761945 -> 763017 (+0.14%); split: -0.00%, +0.14% Cat6: 199819 -> 197766 (-1.03%); split: -1.34%, +0.31% Cat7: 890192 -> 581842 (-34.64%); split: -35.20%, +0.57% Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	30716cc524	nir/lower_explicit_io: add support for offset_shift The goal here is to generate addresses that are a right-shifted version of the actual byte address and record the shift amount in the offset_shift index. While we could just insert a ushr at the end of deref chains, this will prevent the shift to be optimized away in many cases. Instead, we try to extract the shift from the array strides and struct offsets that make up the deref chain, and only insert a ushr when absolutely necessary (i.e., for casts). This means we have to walk the entire deref chain at once for accesses that support offset_shift and we don't use the standard algorithm of replacing each deref one at a time. To be able to legally right-shift casts, we use the alignment information and never shift more than what the alignment could support. It should also be noted that casts generally have two sources: something provided by the driver (e.g., a Vulkan resource index) or a variable pointer coming from a phi/bcsel. For the latter, the entire access chain consists of multiple parts that are ended by either a phi/bcsel or an access. Only the part the ends in an access is handled by this new algorithm; the other parts are handled as usual. This is necessary because we have no way to encode the offset shift or to even know how much we would be able to shift without knowing how it is accessed. This commit adds the general implementation for lowering accesses using offset_shift and adds a compiler option for drivers to enable it for SSBO accesses. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	1406eafbcd	nir/lower_explicit_io: add alignment parameters to address builder We will need this when building shifted addresses. Since adding these parameters has a lot of code churn which would distract from the main changes, it is split-off in a separate commit. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	553a439b54	nir/lower_explicit_io: use nir_io_offset to pass around addresses We will add support for shifted addresses; this commit makes sure the APIs of the functions already support passing shifts. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	4c9afbd01d	nir/lower_explicit_io: add helper to build address The helper is used to build the address passed to build_explicit_io_load/store. For now, it simply takes care of adding the component offset when scalarizing. In the future, this can be used to do more complex address manipulations, like calculating the full deref chain address. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
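
Note on the offset_shift commits above (7752cc26c4, 30716cc524): the arithmetic they describe can be checked with a small model. The sketch below is an illustration only, assuming 32-bit wrapping offsets. The helper names (`to_u32`, `ushr`, `ishr`, `shifted_offset`) are hypothetical and are not Mesa or NIR functions. It shows why distributing a ushr over an addition breaks when one operand is a negative intermediate, why ishr is not a universal substitute, and how the shift can instead be taken out of an aligned stride and member offset.

```python
MASK32 = 0xFFFFFFFF


def to_u32(x):
    """Wrap a Python int to an unsigned 32-bit value."""
    return x & MASK32


def ushr(x, s):
    """Logical (unsigned) right shift on a 32-bit value."""
    return to_u32(x) >> s


def ishr(x, s):
    """Arithmetic (signed) right shift on a 32-bit value."""
    x = to_u32(x)
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret the bit pattern as a negative number
    return to_u32(x >> s)


def shifted_offset(index, stride, member_offset, shift):
    """Build an offset that is already in units of (1 << shift) bytes.

    Hypothetical helper: the shift is folded into the stride and member
    offset, which only works when both are multiples of the access unit.
    """
    unit = 1 << shift
    if stride % unit or member_offset % unit:
        raise ValueError("stride/offset not aligned; an explicit shift is needed")
    return to_u32(index * (stride >> shift) + (member_offset >> shift))


# Byte offset 16 + (-8) = 8, accessed in 4-byte units (shift = 2).
exact = ushr(to_u32(16 + -8), 2)
pushed_ushr = to_u32(ushr(16, 2) + ushr(-8, 2))  # ushr pushed into the add
pushed_ishr = to_u32(ishr(16, 2) + ishr(-8, 2))  # ishr pushed into the add

print(hex(exact), hex(pushed_ushr), hex(pushed_ishr))

# A large unsigned offset is where ishr gives the wrong answer instead.
print(hex(ushr(0x80000000, 2)), hex(ishr(0x80000000, 2)))

# Extracting the shift from an aligned stride avoids both problems.
print(shifted_offset(3, 16, 8, 2), (3 * 16 + 8) >> 2)
```

In this model the ushr pushed into the addition disagrees with the exact result for the negative operand, while ishr disagrees for the large unsigned offset, which matches the commit message's point that neither shift can be chosen safely after the fact.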


210577 commits