fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 07:58:14 +02:00

Author	SHA1	Message	Date
Hans-Kristian Arntzen	42f021fc29	radv: Enable EXT_present_timing. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:51 +00:00
Samuel Pitoiset	14d3fb5f1b	radv: add a workaround for a synchronization bug in Strange Brigade Vulkan This game has broken synchronization reported by VVL and it indeed doesn't wait for idle right before present. Workaround this by injecting a full barrier (easier than rewriting the dep struct). This only applies to the Vulkan backend. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14705 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39480>	2026-01-27 09:18:25 +00:00
Wang Ruitang	e11c04c0cc	amd/common/virtio: use device fd to init sync provider Use fd after dup instead of the one before dup to avoid drm_syncobj_find failed in guest kernel when dev is found in dev_list. When dev is not found in dev_list, it uses device fd which is duplicated, to init sync provider. And when it's found, the same device fd should be used. Otherwise, it would cause inconsistency and failures like in the Android domU CTS test where the guest kernel attempts to locate a syncobj. This occurs because vdrm_device_connect and VIRTGPU_EXECBUFFER ioctl use fd after dup while util_sync_provider_drm uses the one before dup. The fix has been validated with the CtsSdkSandboxWebkitTestCases in Android domU, and the previously failing test cases no longer occur. Signed-off-by: Ruitang.Wang@amd.com Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39520>	2026-01-27 08:24:35 +00:00
David Rosca	62f07b8c63	radeonsi/vcn: Add low latency decode debug option Similar to the low latency option for encode, this reduces latency of decoding at the cost of increased power usage. Can be enabled with AMD_DEBUG=lowlatencydec Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>	2026-01-26 15:00:06 +00:00
Benjamin Cheng	c10ebb0fda	radv/video: Use a more reliable way of computing tile sizes Some apps (old FFmpeg, contemporary CTS) send down pMi{Col,Row}Starts in SB units, not MI units. Instead of depending on those values which could be unreliable, derive the tile sizes in SB using other parameters. Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39492>	2026-01-26 14:41:20 +00:00
Georg Lehmann	809fb0fba3	ac/nir/lower_ps_late: emit scalar f2f16_rtz for when one half of a packed export is undef Foz-DB Navi48: Totals from 7200 (8.74% of 82405) affected shaders: Instrs: 9056391 -> 9048177 (-0.09%); split: -0.09%, +0.00% CodeSize: 48681288 -> 48640684 (-0.08%); split: -0.09%, +0.00% VGPRs: 413088 -> 413784 (+0.17%) Latency: 76340711 -> 76320080 (-0.03%); split: -0.03%, +0.00% InvThroughput: 12692959 -> 12684618 (-0.07%); split: -0.07%, +0.00% VClause: 148823 -> 148821 (-0.00%) Copies: 601739 -> 601874 (+0.02%); split: -0.01%, +0.03% VALU: 5213356 -> 5207253 (-0.12%); split: -0.12%, +0.00% SALU: 1160815 -> 1160817 (+0.00%); split: -0.00%, +0.00% VOPD: 79520 -> 79444 (-0.10%); split: +0.09%, -0.18% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:23 +00:00
Georg Lehmann	8c895c5c61	ac/nir/lower_ps_late: CSE partial packed exports Foz-DB Navi48: Totals from 425 (0.52% of 82405) affected shaders: Instrs: 1110029 -> 1109658 (-0.03%); split: -0.03%, +0.00% CodeSize: 6135272 -> 6133848 (-0.02%); split: -0.02%, +0.00% VGPRs: 29856 -> 29844 (-0.04%) Latency: 10258411 -> 10258043 (-0.00%); split: -0.00%, +0.00% InvThroughput: 1898177 -> 1897661 (-0.03%) Copies: 88221 -> 88173 (-0.05%) VALU: 575276 -> 574894 (-0.07%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	e74323577f	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for salu Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	6cbd16daae	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx8+ Do this late because the v_cvt_pkrtz_f16_f32 can be applied to its operand. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Georg Lehmann	57ca974d1d	aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx6/7 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:21 +00:00
Georg Lehmann	ba73792de0	aco/optimizer: fix parsing salu p_insert as shift Fixes: `88f7e3fff3` ("aco/optimizer: parse pseudo alu instructions") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:21 +00:00
Georg Lehmann	830d6de9ff	aco/isel: optimize pack_32_2x16_split(undef, const) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
Rhys Perry	928ecfc6c0	radv: fix RADV_DEBUG=shaderstats with RT pipelines radv_dump_shader_stats() printed stats for every shader with a certain stage, and we called this function each time an RT shader is compiled. This means we could repeat the stats for a shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39484>	2026-01-26 09:26:14 +00:00
Rhys Perry	e59a0df302	aco/insert_fp_mode: remove incorrect assertion This can happen if a loop has no continues, and the later code should work fine in this situation. This fixes war_thunder/0013a69e097b2471 on navi21. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `6b9d28ab9b` ("aco/insert_fp_mode: insert fp mode in reverse") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39481>	2026-01-26 08:57:33 +00:00
Samuel Pitoiset	c91ed27582	radv: use the SQTT enable bit for PKT3_DISPATCH_TASKMESH_INDIRECT_MULTI_ACE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>	2026-01-26 08:10:53 +00:00
Samuel Pitoiset	e272c8062d	radv: use the SQTT enable bit for PKT3_DISPATCH_MESH_INDIRECT_MULTI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>	2026-01-26 08:10:53 +00:00
Samuel Pitoiset	c7da19e2bf	radv: use the SQTT enable bit for PKT3_DRAW_{INDEX}_INDIRECT_MULTI This reports more info in RGP. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>	2026-01-26 08:10:52 +00:00
Samuel Pitoiset	e5982496f6	radv: move emitting SQTT markers closer to the draw/dispatch packets Some packets already include a SQTT enable bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>	2026-01-26 08:10:52 +00:00
Georg Lehmann	5827de9cd6	aco/gfx12: use 64bit add/sub to swap sgprs Not writing SCC requires less instructions and gives the scheduler more freedom. Foz-DB GFX1201: Totals from 114 (0.14% of 82179) affected shaders: Instrs: 276265 -> 275791 (-0.17%) CodeSize: 1460504 -> 1458504 (-0.14%) Latency: 902933 -> 902548 (-0.04%); split: -0.04%, +0.00% InvThroughput: 166517 -> 166512 (-0.00%) SClause: 6703 -> 6698 (-0.07%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39329>	2026-01-23 10:13:19 +00:00
Georg Lehmann	763b4f1f0a	radv/gfx11: add a RADV_PERFTEST flag to expose bfloat16 cmat This doesn't pass CTS because of precision issues, but might still be useful. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14699 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39456>	2026-01-23 09:41:20 +00:00
Marek Olšák	ebeb904c95	ac,radeonsi: set optimal COMPUTE_DISPATCH_INTERLEAVE for buffer clears/copies Small buffer clears are a bit faster now. The numbers were tuned specifically for this compute shader. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:39 +00:00
Marek Olšák	a5e1d31dad	ac/nir/meta: tune 12B clear buffer performance for gfx12 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:39 +00:00
Marek Olšák	9257cf04a1	ac/nir/meta: tune image clear & copy performance for gfx12 Compute shaders are the fastest for all copies and some clears. Note that this is a very different compute shader than the one in RADV. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:38 +00:00
Natalie Vock	15328a5ef3	aco: Fix parameter stack size calculation This only accounted for 1/32 (or 1/64) of the actual parameter size. In some cases this meant that some threads were smashing other threads' stacks. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39455>	2026-01-22 22:02:31 +00:00
jaap aarts	8f7941f92d	radv/sqtt: Prevent concurrent submit when sqtt is enabled cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39090>	2026-01-21 18:55:56 +00:00
Timur Kristóf	87a8d19b51	ac/gpu_info: Remove FIXME from regalloc hang description This is now implemented. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>	2026-01-21 17:24:57 +00:00
Timur Kristóf	fc0827126f	radv: Remove previous mitigation of CS regalloc hang bug Now that all larger workgroup sizes are lowered to 256, the old workaround is not needed anymore. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>	2026-01-21 17:24:57 +00:00
Timur Kristóf	86ff28b3da	radv: Allow using compute queue with CS regalloc hang bug on GFX7 Now that all larger workgroup sizes are lowered to 256, the regalloc hang cannot mess up the compute queues anymore. Still don't allow compute queues on GFX6 though, they are prone to hangs. Needs further investigation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>	2026-01-21 17:24:56 +00:00
Timur Kristóf	d31b4451f2	radv: Lower larger workgroups to 256 for CS regalloc bug This is the safest maximum workgroup size if we want to avoid the hang on affected GPUs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>	2026-01-21 17:24:56 +00:00
Rhys Perry	2c9775b339	aco: reduce memory usage of live_var_analysis Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39408>	2026-01-21 12:03:43 +00:00
Rhys Perry	874255e899	aco: use size_t for monotonic_buffer_resource Necessary for really big shaders. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14650 Backport-to: 25.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39408>	2026-01-21 12:03:42 +00:00
Samuel Pitoiset	de64c7238a	ac/nir: fix computing cube derivatives when the major axis is negative This corresponds to the face 1.0, 3.0 or 5.0. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39303>	2026-01-21 07:12:34 +00:00
Qiang Yu	4708eb85d7	radv: fix primitive restart gpu hang for pre gfx10 PAL always set WD_SWITCH_ON_EOP for pre gfx10 when primitive restart is enabled to prevent gpu hang. It only happens when specific index stream with primitive restart. Since we don't know what's the exact problem, just follow PAL to disable 4x primitive rate when primitive restart is enabled. GFX10+ does not use this function. Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39292>	2026-01-21 02:38:26 +00:00
Natalie Vock	e8f1dc687c	aco: Use parameter assignment hints for any-hit shaders Query the signature of the traversal function stored in the any-hit shader and make the parameter locations between the two match up, to remove unnecessary movs inside the traversal loop. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:55 +00:00
Natalie Vock	a32709674a	aco: Add parameter assignment hints Parameter assignment hints allow to influence parameter register assignment logic with user-specified affinities. If there is an affinity declared for a parameter, the assignment logic will try to match the registers a parameter and its affinity are assigned. It also allows to hint that certain registers are not suitable for assigning parameters to and should be avoided. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:55 +00:00
Natalie Vock	2d6ecf303a	aco: Add and use nir_abi_to_aco helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:55 +00:00
Natalie Vock	30f6eacfad	radv/rt: Call ahit/isec shaders Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:55 +00:00
Natalie Vock	a03e9287c3	radv/rt: Compile ahit/isec shaders to asm We can express any-hit/intersection shaders as functions, too. Any-hit/Intersection shaders need the usual parameters like launch IDs/descriptor data/ray properties, origin, direction/etc., but also some special parameters related to traversal state. Any-hit/intersection shaders need to return whether the hit was accepted and/or traversal should be terminated, as well as the intersection T value (for intersection shaders). Both any-hit and intersection shaders also need to be passed hit attributes via parameters. Closest-Hit shaders need those too, but we pass them out-of-band via LDS. LDS is used for the traversal stack when any-hit/intersection shaders, so we need to pass them via parameters. Hit attributes are similar to ray payloads in the sense that they're dynamically sized depending on how much space the application uses. However, unlike ray payloads, hit attribute sizes have a strict upper bound of 8 dwords. To make managing parameters easier, we put all hit attributes in a single vector parameter with 0-8 components. This prevents having a function with two sets of arbitrary numbers of parameters. This commit sets up ahit/isec function signatures and implements lowering for ahit/isec-specific intrinsics in the context of these functions. Subsequent commits will merely have to call into these functions to execute a separate-compiled any-hit/intersection shader. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:55 +00:00
Natalie Vock	e74e0983a7	radv/rt: Fix terminate_ray handling for intersection shaders terminate_ray should only return from any-hit shaders, it should not skip the intersection shader. If we insert a nir_jump_return when processing the already-inlined any-hit shader, the intersection shader will be skipped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Natalie Vock	646d3b9645	radv/nir: Make nir_lower_intersection_shader public Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Natalie Vock	1fb005b487	radv/nir: Add and use radv_nir_return_param_from_type helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Natalie Vock	bde7bebc01	radv/rt: Don't consider non-internal INTERSECTION shaders as the traversal shader Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Natalie Vock	b52adac42c	aco: Tweak ABI register param limits Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Natalie Vock	7a2f050daa	aco: Put boolean parameters inside SGPRs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>	2026-01-20 21:49:54 +00:00
Samuel Pitoiset	71f5434142	radv: optimize layered fast clear colors when comp-to-single is supported comp-to-single is supported since GFX10, it's a new type of DCC fast clear which doesn't require FCE and doesn't require to set fast clear registers (ie. comp-to-reg). This means that it's possible to fast clear even if not all slices are bound, because the clear code is stored in the main image. This improves performance in Dirt Rally 2.0 by +2-5%. Other games that have layered clears would also benefit on GFX10+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39394>	2026-01-20 18:28:59 +00:00
Samuel Pitoiset	8781dd85c2	radv/meta: add support for fast clearing color images with non-zero baseArrayLayer Like vkCmdClearAttachments(). This is a preliminary change for the next commit which will enable these fast clears when comp-to-single is supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39394>	2026-01-20 18:28:59 +00:00
Icenowy Zheng	734b6a8c35	vk: descriptors: sort bindings along with flags Vulkan spec requires binding flags to be matched with the binding with the same index, however currently bindings are sorted with flags not properly sorted, which leads to bindings and flags mismatch. Resolve this by adding optional flags info to the parameters of vk_create_sorted_bindings(), and refactoring panvk/pvr (which really pair bindings and flags instead of only iterating flags) to use sorted flags. Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com> Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>	2026-01-20 15:55:47 +00:00
Samuel Pitoiset	3e7f38efa8	radv: always fast-clear color image with comp-to-single on GFX11-11.5 This is possible because no comp-to-reg and no FCE. This probably helps a bunch on GFX11+ if GENERAL is widely used with color images. And since VK_KHR_unified_image_layout it's likely the case on GFX11-11.5. GFX10-10.3 could also benefit from this but some MSAA with DCC fast-clears are currently broken and they need to be fixed first. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39396>	2026-01-20 15:15:35 +00:00
Georg Lehmann	711598982a	ac/nir,radv: remove ac_nir_opt_pack_half Foz-DB Navi21: Totals from 2937 (3.01% of 97591) affected shaders: Instrs: 1908695 -> 1908291 (-0.02%); split: -0.02%, +0.00% CodeSize: 10232148 -> 10229224 (-0.03%); split: -0.03%, +0.01% VGPRs: 142168 -> 142080 (-0.06%) Latency: 8052895 -> 8052622 (-0.00%); split: -0.01%, +0.01% InvThroughput: 2550330 -> 2549602 (-0.03%); split: -0.03%, +0.01% VClause: 32601 -> 32603 (+0.01%); split: -0.01%, +0.02% Copies: 118570 -> 118587 (+0.01%); split: -0.04%, +0.05% PreVGPRs: 110090 -> 110082 (-0.01%) VALU: 1468422 -> 1468043 (-0.03%); split: -0.03%, +0.00% SALU: 173858 -> 173828 (-0.02%) Foz-DB Navi48: Totals from 4196 (4.30% of 97637) affected shaders: MaxWaves: 118678 -> 118680 (+0.00%); split: +0.01%, -0.01% Instrs: 3627604 -> 3624093 (-0.10%); split: -0.10%, +0.00% CodeSize: 18956684 -> 18939824 (-0.09%); split: -0.09%, +0.01% VGPRs: 225624 -> 225060 (-0.25%); split: -0.26%, +0.01% Latency: 11856204 -> 11857280 (+0.01%); split: -0.01%, +0.02% InvThroughput: 2388584 -> 2389178 (+0.02%); split: -0.01%, +0.03% VClause: 50409 -> 50410 (+0.00%) SClause: 64701 -> 64699 (-0.00%) Copies: 208353 -> 207522 (-0.40%); split: -0.43%, +0.03% PreVGPRs: 161314 -> 161306 (-0.00%) VALU: 2345604 -> 2345172 (-0.02%); split: -0.02%, +0.00% SALU: 391466 -> 388723 (-0.70%) VOPD: 1788 -> 1806 (+1.01%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>	2026-01-20 14:48:23 +00:00
Georg Lehmann	939b4a6476	aco/optimizer: apply v_cvt_pkrtz_f16_f32 as fma_mix to operands Foz-DB Navi21: Totals from 2085 (2.14% of 97591) affected shaders: Instrs: 4880879 -> 4882355 (+0.03%); split: -0.04%, +0.07% CodeSize: 26869332 -> 26881744 (+0.05%); split: -0.02%, +0.06% VGPRs: 93944 -> 94160 (+0.23%); split: -0.06%, +0.29% Latency: 40035558 -> 40035595 (+0.00%); split: -0.02%, +0.02% InvThroughput: 10333800 -> 10329093 (-0.05%); split: -0.06%, +0.01% VClause: 139147 -> 139148 (+0.00%) Copies: 454527 -> 454656 (+0.03%); split: -0.00%, +0.03% VALU: 3214838 -> 3211105 (-0.12%) Foz-DB Navi48: Totals from 2349 (2.41% of 97637) affected shaders: Instrs: 6471998 -> 6471817 (-0.00%); split: -0.05%, +0.05% CodeSize: 34793372 -> 34808748 (+0.04%); split: -0.02%, +0.06% VGPRs: 141804 -> 142560 (+0.53%) Latency: 45225910 -> 45226000 (+0.00%); split: -0.01%, +0.01% InvThroughput: 9152634 -> 9149850 (-0.03%); split: -0.04%, +0.01% VClause: 148536 -> 148537 (+0.00%) Copies: 527206 -> 527336 (+0.02%); split: -0.01%, +0.03% VALU: 3491701 -> 3487347 (-0.12%); split: -0.12%, +0.00% VOPD: 669 -> 683 (+2.09%); split: +2.69%, -0.60% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>	2026-01-20 14:48:23 +00:00


19749 commits