fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-08 06:58:05 +02:00

Author	SHA1	Message	Date
Dhruv Mark Collins	3d6fab0404	fd/pps: Allocate performance counters from high-to-low The UMD will be switching to allocating counters from low-to-high, so to avoid the chances of conflict with this new policy the PPS driver now allocates the other way around. Additionally, this will future proof it for the MSM-DRM uAPI for performance counters which will similarly allocate from high-to-low. Cc: mesa-stable Signed-off-by: Dhruv Mark Collins <mark@igalia.com> Assisted-by: OpenAI Codex (GPT-5.4) (cherry picked from commit `24849eef9f`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Dhruv Mark Collins	920a848027	tu/autotune: Fail gracefully when CP counters are unavailable When preemption optimization is supported then the necessary CP counters being missing causes a device initialization error which is unnecessary as support can simply be disabled instead to allow for a more graceful fail. This also fixes A8XX which doesn't have performance counters hooked up yet. Cc: mesa-stable Signed-off-by: Dhruv Mark Collins <mark@igalia.com> Assisted-by: OpenAI Codex (GPT-5.4) (cherry picked from commit `a5ec9b7892`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Zan Dobersek	c63da3365d	tu: only support userspace-managed perfcounters on a7xx and earlier Future kernel API for perfcounter management will likely be required for a8xx and onwards. For a7xx and earlier, cmdstream-based selector and counter register management is still supported. Cc: mesa-stable Signed-off-by: Zan Dobersek <zdobersek@igalia.com> (cherry picked from commit `c2708afbc7`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Erik Faye-Lund	0d6e04debe	panvk: drop out-of-date TODO We already did this, so let's drop this TODO. Fixes: `d36e6af329` ("panvk: Bump the max image size on v11+") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (cherry picked from commit `f137207108`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Michel Dänzer	705dabbc26	winsys/amdgpu: Use render node only as fallback If ac_drm_device_initialize returns -EACCES for the fd passed in. A render node file description can't have DRM master status, which means AMDGPU_CTX_PRIORITY_HIGH can't work without CAP_SYS_NICE (which generally only the root user has). Fixes: `8f30e90fc1` ("winsys/amdgpu: Prefer render node FD for ac_drm_device_initialize") (cherry picked from commit `5cc3264b53`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Samuel Pitoiset	3740d70fc5	radv: lower SHADER_RECORD_INDEX to non-uniform This fixes an issue with RADV and NVIDIA-RTX/Donut-Samples with heap support in vkd3d-proton. Backport-to: 26.1 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `477c44ba93`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Samuel Pitoiset	010072b5bc	vulkan: add an option to lower SHADER_RECORD_INDEX to non-uniform Applications are required to set NonUniform if the resource is arrayed, but with VK_DESCRIPTOR_MAPPING_SOURCE_HEAP_WITH_SHADER_RECORD_INDEX_EXT, the resource is non-arrayed in the shader. So, it's technically not required to set it. Although, the offset can vary per-lane and NonUniform is implicit. Backport-to: 26.1 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `8e2869fa41`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Connor Abbott	eb7c92b6a2	ir3: Use correct immediate size for constlen calculation "size" is the allocated size of the array, not the number of immediates actually used. We could wind up returning a too-large constlen, larger than 512, and since the binning variant uses the non-binning variant's constlen as its max_const we could make binning variants use c512.x and crash when encoding. Fixes: `86f3c0c4c2` ("ir3: simplify constlen calculation") (cherry picked from commit `49d29d4f10`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Connor Abbott	c15245f814	ir3: Don't reset immediate count to 0 after lowering We need to know the immediate count even after lowering, to compute the overall const size. Previously we were using the capacity field, but that's unreliable and won't be available once we switch to a real dynamic array container instead of (poorly) reinventing one. Fixes: `86f3c0c4c2` ("ir3: simplify constlen calculation") (cherry picked from commit `280c64d720`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Karol Herbst	76f4e2078a	nak: the MS location comes last in TLD, same spot as depth compare in TEX Some Max Payne 3 shaders are impacted by this and probably will fix some issue there. The VK CTS isn't testing this, but it was verified to fix a real problem by inserting 0 offsets into the instruction and having CTS tests fail with the old ordering. Totals from 3 (0.00% of 1163204) affected shaders: CodeSize: 2496 -> 2736 (+9.62%) Static cycle count: 732 -> 741 (+1.23%) Fixes: `ad01fbdda0` ("nak: Add a NIR texture lowering pass") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> (cherry picked from commit `e09045e26c`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Eric Engestrom	ddb44422f7	.pick_status.json: Update to `806fcc6193` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>	2026-04-22 14:34:48 +02:00
Eric Engestrom	0108eba5ed	VERSION: bump for 26.1.0-rc1	2026-04-15 15:42:35 +02:00
Alyssa Rosenzweig	3a9ef908ea	intel: fuse off Jay in Mesa 26.1 Jay is under heavy development and is not considered released. It is available in upstream Mesa for developers to hack on but is not part of the 26.1 release. Add a comment acting like a chicken bit to fuse off the compiler while minimizing conflicts with backports (which is why we don't remove Jay wholesale from the release). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>	2026-04-15 15:39:49 +02:00
Benjamin Cheng	9182da14a7	radv: Relax linear requirement to VCN1 and prior With the previous commit ("ac/surface: Filter swizzle modes for VCN"), only video-compatible swizzle modes will be picked, so we can enable tiling for VCN2+. Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>	2026-04-15 12:48:57 +00:00
Benjamin Cheng	fcaab2b921	ac/surface: Filter swizzle modes for VCN This will allow compatible swizzle modes to be picked for RADV (radeonsi filters modifiers when creating video surfaces). This mirrors the logic from ac_modifier_supports_video, and in addition ensures that XOR swizzle modes are disabled for image arrays because VCN does not support slice indices. Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>	2026-04-15 12:48:57 +00:00
Christian Gmeiner	713cecb1df	panvk: Advertise VK_EXT_rgba10x6_formats Map X6R10X6G10X6B10X6A10_UNORM to the native R10X6G10X6B10X6A10X6_UNORM HW format on PAN_ARCH >= 11 where it is supported. Enable the extension with formatRgba10x6WithoutYCbCrSampler in the physical device, allowing VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 to be used as a regular color format without YCbCr sampler conversion. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>	2026-04-15 12:16:53 +00:00
Christian Gmeiner	9f172ba4da	util/format, vulkan: Add PIPE_FORMAT_X6R10X6G10X6B10X6A10_UNORM The format has 4 x 16-bit words with 10-bit unorm values in bits [15:6] and 6 padding bits in [5:0]. Since this requires 8 channel slots but the format system only supports 4, use layout "other" with hand-written pack/unpack conversion functions. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>	2026-04-15 12:16:53 +00:00
Christian Gmeiner	81b8113a9f	radv: Don't advertise any features for R10X6G10X6B10X6A10X6_UNORM_4PACK16 The recent addition of PIPE_FORMAT_X6R10X6G10X6B10X6A10_UNORM caused vk_format_to_pipe_format() to map VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 to a real pipe format, which made radv_physical_device_get_format_properties() advertise BLIT_SRC/SAMPLED_IMAGE for it. The hardware samples the data as plain R16G16B16A16 UNORM, which doesn't match the 10-bit UNORM semantics the spec (and CTS) require, so dEQP-VK.api.copy_and_blit.core.blit_image.* tests with r10x6g10x6b10x6a10x6_unorm_4pack16 as the source started failing on gfx1201. Override the mapping to PIPE_FORMAT_NONE so RADV reports zero format features, matching the behavior prior to the new pipe format being added. Proper support can be restored once VK_EXT_rgba10x6_formats is implemented. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>	2026-04-15 12:16:53 +00:00
Samuel Pitoiset	dc0d6100f9	radv/ci: document a descriptor heap failure Test bug. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>	2026-04-15 11:22:10 +00:00
Samuel Pitoiset	6462055e38	radv/ci: fix setting RADV_EXPERIMENTAL=heap It's overwritten if manually set per jobs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>	2026-04-15 11:22:10 +00:00
Samuel Pitoiset	282bb0d11b	radv/ci: update flakes of VKCTS jobs Collected after 25 runs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>	2026-04-15 11:22:10 +00:00
Samuel Pitoiset	3af1f8dc0a	radv/ci: remove a hack for the number of deqp instances with RENOIR Latest VKCTS main uses way less memory than before, and increasing the number of deqp instances to 16 seems to work just fine now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>	2026-04-15 11:22:10 +00:00
Samuel Pitoiset	3777f7fe3b	ci: uprev VKCTS main to 634a3fc62d82c34de68c3b1add25e6b7f5777524 RADV is the only driver using VKCTS main. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>	2026-04-15 11:22:10 +00:00
Icenowy Zheng	b8c5e47949	pvr: propagate get_vis_results flag from secondary cmdbuf gfx jobs When recording secondary command buffers with occlusion queries, the get_vis_results flag could be set for some graphics sub_cmd's job. Propagate this flag from secondary command buffer graphics sub_cmds to primary command buffer sub_cmds to ensure that occlusion queries in secondary command buffers are correctly executed. Fixes: `5c34be4340` ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.") Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>	2026-04-15 11:03:06 +00:00
Icenowy Zheng	87f4122e11	pvr: fix the code copying query_indices to sub_query_indices There's a dynarray field inside gfx sub_cmd called sub_query_indices, which will contain pending query indices for gfx sub_cmds inside a secondary command buffer. It's expected that when finishing such gfx sub_cmds, the content of query_indices is going to be moved there. However the `util_dynarray_append_dynarray()` call is called with wrong parameter order, thus it's copying sub_query_indices to query_indices and then immediately wiping query_indices, forgetting all query indices in such case. Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries in secondary command buffers. Fixes: `8c506c4b03` ("pvr: Use util_dynarray_append_dynarray()") Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>	2026-04-15 11:03:06 +00:00
Icenowy Zheng	36f34a72c1	pvr: finalize query_indices array after ending last sub_cmd The last sub_cmd in the command buffer could be a graphics one, and when ending a graphics sub_cmd, the query_indices array will be checked to know whether an occlusion query starts during this graphics sub_cmd. Finalize the query_indices array after ending the last sub_cmd, otherwise the check for query initiation may have a false negative result. Fixes the `dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff. random.seed6` test case. Fixes: `2b1992a000` ("pvr: Implement vkCmdBeginQuery API.") Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>	2026-04-15 11:03:05 +00:00
Lorenzo Rossi	7ccca9f972	pan/compiler: Document compilation pipeline expectations Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	43ba475d4c	panfrost,panvk: Move lower_texture_early inside preproc Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	e24228e327	panfrost,panvk: Move lower_texture_late inside postproc Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	d096a8e962	panfrost: Move lower_res_indices before postproc Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00
Lorenzo Rossi	eafc822dbd	panfrost,panvk: Move postprocess near shader_compile Ideally there should be only sysval lowering in the middle. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00
Lorenzo Rossi	83fd45aa5a	pan/compiler: Fix noperspective int varyings Ints and floats do not need to match between VS and FS, some crazy shaders might write an uint from the VS and read a noperspective float from the FS. There will be new tests in the conformance tests that check that too shortly. Is this a performance regression? yes. Can we fix this easily? No, we'll need dynamic prolog/epilog linking. Since maybe_noperspective is almost useless after this fix, the whole logic has been removed Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00
Lorenzo Rossi	6e67f2a996	pan/compiler: Don't crash nopersp if pos is undefined VS does not need to write the position, it can also leave it as undefined. We agree that there isn't much sense in noperspective varyings with undefined perspective, but we still do not want to crash. This does lead to some real crashes if we mistake some int varying to noperspective (see next commit). Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00
Georg Lehmann	5607417f57	radv: remove radv_nir_lower_viewport_to_zero nir_opt_varyings does this. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:17:01 +00:00
Georg Lehmann	c842186e39	radv: remove lower array vars to elem No Foz-DB changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:17:01 +00:00
Georg Lehmann	c7a953809a	radv: don't lower io vars to scalar Done later on lowered IO. Foz-DB Navi48: Totals from 4 (0.00% of 205045) affected shaders: Instrs: 1434 -> 1418 (-1.12%) CodeSize: 7912 -> 7848 (-0.81%) Latency: 5688 -> 5646 (-0.74%) InvThroughput: 642 -> 646 (+0.62%) PreVGPRs: 104 -> 100 (-3.85%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:17:01 +00:00
Georg Lehmann	b2e59d80b0	radv: do not shrink vectors when lowering IO vars to scalar I wanted to move this later, but looking at the stats, this pass actually hurts here because it shrinks smem loads that would be better vectorized. So just remove it. Foz-DB Navi48: Totals from 2268 (1.11% of 205045) affected shaders: Instrs: 1573491 -> 1569535 (-0.25%); split: -0.35%, +0.10% CodeSize: 8399092 -> 8378632 (-0.24%); split: -0.39%, +0.14% SpillSGPRs: 312 -> 355 (+13.78%) Latency: 12223349 -> 12225239 (+0.02%); split: -0.20%, +0.21% InvThroughput: 2235646 -> 2236174 (+0.02%); split: -0.15%, +0.17% VClause: 26526 -> 26549 (+0.09%); split: -0.02%, +0.11% SClause: 34974 -> 34053 (-2.63%); split: -3.01%, +0.37% Copies: 114417 -> 115513 (+0.96%); split: -0.33%, +1.28% Branches: 28085 -> 26899 (-4.22%); split: -4.24%, +0.02% PreSGPRs: 98109 -> 99024 (+0.93%); split: -0.10%, +1.03% PreVGPRs: 78224 -> 78226 (+0.00%) VALU: 929067 -> 928588 (-0.05%); split: -0.08%, +0.03% SALU: 204756 -> 206936 (+1.06%); split: -0.19%, +1.26% SMEM: 67181 -> 64687 (-3.71%); split: -3.83%, +0.11% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:17:00 +00:00
Georg Lehmann	f001afad23	radv: remove some unneeded passes from radv_nir_lower_io_vars_to_scalar No Foz-DB changes on Navi48. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:16:59 +00:00
Georg Lehmann	e0883d107a	radv: do not remove dead variables No Foz-DB changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:16:59 +00:00
Georg Lehmann	8c98ed9e85	radv: do not vectorize io variables No Foz-DB changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:16:59 +00:00
Georg Lehmann	d14fc27f44	radv: do not vectorize fs out variables This is scalarized later anyway. No Foz-DB changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:16:58 +00:00
Georg Lehmann	5d8c817fd7	radv: lower lowered io to scalar We already did this for everything except fragment shader outputs with epilogs. If we move it a bit earlier, we can stop lowering IO variables to scalar. Foz-DB Navi48: Totals from 1001 (0.49% of 205045) affected shaders: MaxWaves: 31252 -> 31256 (+0.01%) Instrs: 372258 -> 372036 (-0.06%); split: -0.14%, +0.08% CodeSize: 1999064 -> 1997836 (-0.06%); split: -0.13%, +0.06% VGPRs: 39096 -> 39072 (-0.06%) Latency: 1235558 -> 1235435 (-0.01%); split: -0.08%, +0.07% InvThroughput: 213845 -> 213875 (+0.01%); split: -0.06%, +0.07% VClause: 5840 -> 5838 (-0.03%) SClause: 10964 -> 10969 (+0.05%); split: -0.03%, +0.07% Copies: 21469 -> 21545 (+0.35%); split: -0.42%, +0.78% Branches: 5326 -> 5324 (-0.04%) PreSGPRs: 34214 -> 34206 (-0.02%); split: -0.03%, +0.01% PreVGPRs: 21931 -> 22001 (+0.32%); split: -0.06%, +0.38% VALU: 212386 -> 212418 (+0.02%); split: -0.07%, +0.09% SALU: 50409 -> 50378 (-0.06%); split: -0.07%, +0.01% VMEM: 8352 -> 8331 (-0.25%) SMEM: 17966 -> 17963 (-0.02%) This is mostly RA noise in GPL FS shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>	2026-04-15 09:16:58 +00:00
Natalie Vock	1f998b38f4	radv: Run nir_opt_deref after first optimization loop Only at this point are loads from uninitialized variables lowered to undef and copy-propagated so that nir_opt_deref's cast-of-undef optimization works properly. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40799>	2026-04-15 08:42:12 +00:00
Natalie Vock	57f796752d	nir/deref: Elide loads/stores from deref cast of undef These can never be meaningful. DOOM: The Dark Ages also relies on this. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40799>	2026-04-15 08:42:12 +00:00
Job Noorman	118b975ce7	ir3: use ldg.k load size ldg.k can copy up to 256 vec4s at once but we currently emit one ldg.k per vec4. Fix this by using the load size field of ldg.k. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>	2026-04-15 07:58:01 +00:00
Job Noorman	a1272cabe0	ir3/isa: fix load size encoding for ldg.k The load size field starts at b23 instead of b24 and is 8 bits in size. b23 makes the blob disassembler select between interpreting the load size as an immediate or a GPR. However, using a GPR doesn't work as the HW still seems to interpret the field as an immediate. We copy the blob's behavior here for consistency. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>	2026-04-15 07:58:01 +00:00
Job Noorman	e6529b54c0	ir3: add support for the ldg.k a1.x addressing mode We assumed a1.x addressing doesn't work. However, it turns out it actually does work but instead of taking the offset's high bits from a1.x and adding an immediate to the low bits, the full offset is stored in a1.x and the offset is ignored. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>	2026-04-15 07:58:01 +00:00
Job Noorman	bf167ca73b	ir3: allow shared address src for ldg.k Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>	2026-04-15 07:58:00 +00:00
Christoph Pillmayer	3427b20b71	pan/bi: Fix MEMMOV size calculation Doing stores first, loads second doesn't work because there can be chains of store, load, store... . Use a fixed point approach instead to calculate sizes for all destinations. Fixes: `2fd5b8a391` ("pan/bi: Account for MEMMOV in bi_record_sizes") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40915>	2026-04-15 07:22:35 +00:00
Job Noorman	ce810bb19b	ir3/parser: add @constlen header Constlen cannot always be derived from the usage of @const et al. For example when using ldc.k/ldg.k. Add a @constlen header to explicitly set it. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40940>	2026-04-15 06:46:10 +00:00
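
Note on commit 9f172ba4da: the message describes a layout of four 16-bit words, each holding a 10-bit unorm value in bits [15:6] with 6 padding bits in [5:0]. The following is only a rough sketch of what pack/unpack for such a layout could look like. All function names are hypothetical and are not Mesa's actual helpers, and the word order (R, G, B, A) and zero-filled padding are assumptions not stated in the commit message.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating the bit layout only; not Mesa code. */

/* Place a 10-bit value in bits [15:6]; padding bits [5:0] are left zero
 * (zero padding is an assumption). */
static inline uint16_t pack_x6_unorm10(uint16_t value10)
{
   return (uint16_t)((value10 & 0x3ffu) << 6);
}

/* Recover the 10-bit value, discarding the 6 padding bits. */
static inline uint16_t unpack_x6_unorm10(uint16_t word)
{
   return (uint16_t)(word >> 6);
}

/* Convert a float to 10-bit unorm, clamping to [0, 1] and rounding
 * to nearest. NaN and negative inputs map to 0. */
static inline uint16_t float_to_unorm10(float f)
{
   if (!(f > 0.0f))
      return 0;
   if (f > 1.0f)
      f = 1.0f;
   return (uint16_t)(f * 1023.0f + 0.5f);
}

/* Convert a 10-bit unorm value back to a float in [0, 1]. */
static inline float unorm10_to_float(uint16_t value10)
{
   return (float)(value10 & 0x3ffu) / 1023.0f;
}

/* Pack one RGBA pixel into four 16-bit words (assumed order: R, G, B, A). */
static inline void pack_rgba10x6(uint16_t dst[4], const float rgba[4])
{
   for (int i = 0; i < 4; i++)
      dst[i] = pack_x6_unorm10(float_to_unorm10(rgba[i]));
}

/* Unpack four 16-bit words back into RGBA floats. */
static inline void unpack_rgba10x6(float rgba[4], const uint16_t src[4])
{
   for (int i = 0; i < 4; i++)
      rgba[i] = unorm10_to_float(unpack_x6_unorm10(src[i]));
}
```

Each channel occupies a full 16-bit word, so a pixel takes 8 bytes. Per the commit message, the actual implementation uses layout "other" with hand-written conversion functions because the format system only supports 4 channel slots.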

221242 commits