fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 05:48:12 +02:00

Author	SHA1	Message	Date
Rhys Perry	f6581b41c4	aco/ra: don't require alignment for NPOT SGPR temporaries Aligning these can create situations where register allocation is impossible. Only pseudo-instructions can use these, and they don't require any alignment. I'm not sure if these temporaries actually happen in practice. This probably only affects the get_reg()'s compact_relocate_vars fallback path, which doesn't usually happen for SGPRs. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>	2025-04-29 15:15:10 +00:00
Rhys Perry	623230a6ef	aco/ra: change sorting in compact_relocate_vars D16 MIMG or pseudo-scalar transcendental instructions might give lower limits to the maximum register available for their definitions, so just try to place them earlier. This is also part of fixing compact_relocate_vars with aligned NPOT def/killed-op space (the second part is the later commit which changes get_stride()). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>	2025-04-29 15:15:10 +00:00
Eric Engestrom	ef1792bea8	amd/ci: document regression in e612e840...e210b79c Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34740>	2025-04-29 11:09:35 +02:00
Eric Engestrom	80b1aea705	amd/ci: disable retry on nightly radeonsi-vangogh-glcts-full job https://gitlab.freedesktop.org/mesa/mesa/-/jobs/75399939 should not have been retried. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34740>	2025-04-29 09:01:47 +00:00
Ricardo Garcia	bc44d029df	radv: Ignore image barrier queue families if equal The src and dst queue family indices in an image memory barrier may contain arbitrary values that can be ignored unless both are different. This fixes a crash in upcoming CTS tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34691>	2025-04-29 08:15:28 +00:00
Samuel Pitoiset	1fccc09abe	radv: fix re-emitting VRS state when rendering begins This state also depends on whether a VRS attachment is used. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11693 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34735>	2025-04-29 07:00:09 +00:00
Eric Engestrom	4227982326	ci: rename misleading -postmerge stages to -nightly These stages are for the jobs that are skipped in merge pipelines, automatically run in nightly pipelines, and are available to run manually in other pipelines. None of these ever run in post-merge pipelines. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34590>	2025-04-29 05:49:00 +00:00
Valentine Burley	f8e87fbf50	ci/amd: Convert to using the new container based rootfs Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34451>	2025-04-28 20:08:32 +00:00
Valentine Burley	9870512787	ci/lava: Use the new test-video-based rootfs for VA-API jobs Add new job definitions using the new debian/x86_64-test-video container, and convert the radeonsi-raven-va and radeonsi-raven-vaapi-fluster jobs to use the rootfs derived from this container. Now that we are no longer downloading the Fluster vectors at runtime, the parallelism of the radeonsi-raven-vaapi-fluster job can be greatly decreased. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34451>	2025-04-28 20:08:32 +00:00
Valentine Burley	e80045d23e	ci/lava: Use the new container based rootfs for piglit traces Also change the name of the job definitions to match the other definitions. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34451>	2025-04-28 20:08:32 +00:00
Rhys Perry	de896234d8	aco: improve spilling of clobbered operands We can ignore live_changes for these. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34699>	2025-04-28 10:04:43 +00:00
Rhys Perry	7fe84024cb	aco: fix get_temp_reg_changes with clobbered operands The spiller might have tried to spill a live-through first or second s_fmac_f32 operand, but this wouldn't have reduced the SGPRs if the third operand wasn't killed. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13038 Fixes: `d6cb45dbb0` ("aco/spill: Allow spilling live-through operands") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34699>	2025-04-28 10:04:43 +00:00
Rhys Perry	3c021b79b4	aco/ra: use a correct stride for subdword get_reg_impl fossil-db (gfx1201): Totals from 3 (0.00% of 79377) affected shaders: Instrs: 1312 -> 1308 (-0.30%) CodeSize: 7112 -> 7096 (-0.22%) Latency: 5381 -> 5382 (+0.02%) InvThroughput: 753 -> 752 (-0.13%) Copies: 69 -> 68 (-1.45%) VALU: 836 -> 835 (-0.12%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34679>	2025-04-25 14:43:41 +00:00
Rhys Perry	ae6d4f1195	aco/ra: update_renames() before add_subdword_definition() The register file tests here should be done after update_renames(). Normally, get_reg() wouldn't have to move anything to make space for a 1-3 byte definition. This was encountered with skip_optimistic_path=true and a get_reg_impl() bug (fixed in a later commit) which caused suboptimal register assignment. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34679>	2025-04-25 14:43:41 +00:00
Eric Engestrom	1d7cce2700	ci/ci-tron: default HWCI_TEST_SCRIPT to deqp-runner, as it's almost always what's run Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34710>	2025-04-25 12:16:28 +00:00
Eric Engestrom	20631a07ca	ci/test: rename .b2c-vkd3d-proton-test to .test-vkd3d-proton It has nothing to do with ci-tron, it just happens that the first vkd3d job was running on ci-tron. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34710>	2025-04-25 12:16:28 +00:00
Eric Engestrom	ce79b8a799	radv/ci: move radv-kabini-vkd3d out of gitlab-ci-inc.yml It's currently disabled, which is probably why it was accidentally moved there. While at it, fix its name to match the rest of the jobs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34710>	2025-04-25 12:16:28 +00:00
Eric Engestrom	aecdf762ce	amd/ci: ci yaml indentation Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34710>	2025-04-25 12:16:27 +00:00
Rhys Perry	b03e071583	aco/gfx11: create waitcnt for workgroup vmem barriers It seems this is necessary on GFX11. Similar to `576a2e798c` Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Backport-to: 25.0 Backport-to: 25.1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34634>	2025-04-25 10:41:52 +00:00
Georg Lehmann	65411350ac	aco/insert_exec: disable empty quads when leaving divergent control, even if not top level We don't restore exec after uniform top level branches, so nothing disabled empty quads after a demote in divergent control flow in a uniform branch. Foz-DB Navi31: Totals from 17 (0.02% of 79789) affected shaders: Instrs: 34573 -> 34572 (-0.00%) CodeSize: 186876 -> 186872 (-0.00%) Latency: 324145 -> 324141 (-0.00%) Copies: 1467 -> 1458 (-0.61%) PreSGPRs: 802 -> 800 (-0.25%) SALU: 2531 -> 2530 (-0.04%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34670>	2025-04-24 15:43:09 +00:00
Timur Kristóf	3ad385b9cc	radv: Clear dirty flag for clip rects state after emitting it. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com> Fixes: `0ba3a8b3cc` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>	2025-04-24 15:13:44 +00:00
Timur Kristóf	3a05477ac6	radv: Clear dirty flag for MSAA state after emitting it. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com> Fixes: `08918f0880` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13022 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>	2025-04-24 15:13:44 +00:00
Georg Lehmann	6d2190300a	radv/nir/lower_cmat: tightly pack 8bit gfx11 acc matrix Invalid for now, but used by vkd3d-proton, where the use case is to convert a result matrix to lower precision, followed by a store. For 16bit accumulation matrices, GFX11 only uses 16bits per 32bit register. RADV's coop matrix code pads the unused space with undefs and uses a vector with twice as many elements as the matrix length. Extending that to 8bit by leaving 24 bits unused is unnecessary, as there is no hw unit that requires it. And in wave32, it would also result in vectors larger than NIR's limit. So tightly pack 8bit matrices without any undef padding. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>	2025-04-24 06:37:44 +00:00
Georg Lehmann	bbc9bc9d24	radv/nir/lower_cmat: use cmat_mul instead of duplicating hw details for type conversion Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>	2025-04-24 06:37:44 +00:00
Georg Lehmann	31a3430570	radv/nir/lower_cmat: use radv_nir_cmat_bits consistently Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>	2025-04-24 06:37:44 +00:00
Rhys Perry	62e50de5d0	aco: use v_perm_b32 for byte swaps within a VGPR on gfx10 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34636>	2025-04-23 18:23:18 +00:00
Rhys Perry	a43783fd76	aco: use v_perm_b32 for do_pack_2x16 on gfx10+ fossil-db (gfx1201); Totals from 93 (0.12% of 79377) affected shaders: Instrs: 373212 -> 372761 (-0.12%) CodeSize: 2062752 -> 2063704 (+0.05%); split: -0.00%, +0.05% Latency: 4172059 -> 4171993 (-0.00%); split: -0.00%, +0.00% InvThroughput: 1299144 -> 1299093 (-0.00%) Copies: 51268 -> 50831 (-0.85%) Branches: 10980 -> 10979 (-0.01%) VALU: 220192 -> 219756 (-0.20%) VOPD: 48 -> 47 (-2.08%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34636>	2025-04-23 18:23:18 +00:00
Georg Lehmann	dd3e1190a2	aco/insert_exec: reset temporary when recreating wqm mask from exact mask The old, now incorrect temporary was still used for invert blocks and loop masks. Foz-DB Navi31: Totals from 379 (0.48% of 79789) affected shaders: Instrs: 399471 -> 399897 (+0.11%); split: -0.00%, +0.11% CodeSize: 2197292 -> 2198908 (+0.07%); split: -0.00%, +0.08% Latency: 2500636 -> 2500895 (+0.01%); split: -0.00%, +0.01% SClause: 7912 -> 7918 (+0.08%); split: -0.04%, +0.11% Copies: 25687 -> 26068 (+1.48%); split: -0.04%, +1.53% PreSGPRs: 15648 -> 15562 (-0.55%) SALU: 35125 -> 35517 (+1.12%) Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12901 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13019 Fixes: `b872ff6ef2` ("aco/insert_exec_mask: if applicable, use s_wqm to restore exec after divergent CF") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34659>	2025-04-23 09:37:50 +00:00
Georg Lehmann	13f6be262a	aco/insert_exec: only restore wqm mask after control flow if necessary The next commit will make this not free, so we should avoid it if possible. Foz-DB Navi31: Totals from 3933 (4.93% of 79789) affected shaders: Instrs: 5726914 -> 5727295 (+0.01%); split: -0.00%, +0.01% CodeSize: 31307100 -> 31308884 (+0.01%); split: -0.00%, +0.01% SpillSGPRs: 1797 -> 1793 (-0.22%); split: -0.33%, +0.11% Latency: 58973929 -> 58974343 (+0.00%); split: -0.00%, +0.00% InvThroughput: 8591893 -> 8591911 (+0.00%); split: -0.00%, +0.00% SClause: 209074 -> 209115 (+0.02%); split: -0.00%, +0.02% Copies: 423965 -> 432420 (+1.99%) Branches: 149976 -> 149979 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 200175 -> 200663 (+0.24%) VALU: 3440165 -> 3440156 (-0.00%); split: -0.00%, +0.00% SALU: 555727 -> 556143 (+0.07%); split: -0.00%, +0.08% Fixes: `b872ff6ef2` ("aco/insert_exec_mask: if applicable, use s_wqm to restore exec after divergent CF") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34659>	2025-04-23 09:37:50 +00:00
Pierre-Eric Pelloux-Prayer	992a340eab	ac/nir: init blake3 for cs blit shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34574>	2025-04-23 07:59:10 +00:00
Karol Herbst	e3edc6029b	ac/llvm: use mul24 intrinsics With the current code in clpeak, LLVM ended up generating v_mad_u64_u32 instructions; with this we get nice v_mad_u32_s24 ones instead and a 4x performance increase in the int24 benchmark. Suggested-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34630>	2025-04-23 01:11:48 +00:00
Georg Lehmann	8f3489f351	aco/isel: create WMMA with constant C matrix if possible Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:57 +00:00
Georg Lehmann	4fa3fb87c7	aco/insert_NOPs: allow WMMA with constant C matrix Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:56 +00:00
Georg Lehmann	c3964e87f8	radv: apply fneg/fabs modifiers to wmma Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:55 +00:00
Georg Lehmann	6d7e67d986	nir,amd: add neg_lo/hi modifiers to cmat_matmul_amd Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:55 +00:00
Georg Lehmann	b0c8f31600	aco: set opsel_hi to 1 for WMMA This is ignored by the hardware but LLVM requires it to disassemble GFX12 WMMA. Cc: mesa-stable Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:54 +00:00
Eric Engestrom	2bcb55f3f6	aco: help clang 20 do some additions and subtractions clang 20 complains: ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 837 \| vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1; \| ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 5 into destination object ‘vaddr’ of size 5 832 \| uint8_t vaddr[5] = {0, 0, 0, 0, 0}; \| ^~~~~ ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 837 \| vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1; \| ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 6 into destination object ‘vaddr’ of size 5 832 \| uint8_t vaddr[5] = {0, 0, 0, 0, 0}; \| ^~~~~ ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 837 \| vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1; \| ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 7 into destination object ‘vaddr’ of size 5 832 \| uint8_t vaddr[5] = {0, 0, 0, 0, 0}; \| ^~~~~ But `i < MIN2(instr->operands.back().size() - 1, 5 - num_vaddr)` means `i` is at most `5 - num_vaddr - 1`, which means `vaddr[num_vaddr + i]` => `vaddr[num_vaddr + 5 - num_vaddr - 1]` => `vaddr[5 - 1]` => `vaddr[4]` which is within the valid indices. For some reason, using signed `int` instead allows clang to figure this out, so let's do that since we don't need the extra range. While at it, use ARRAY_SIZE(vaddr) instead of hard-coding the same `5` in several places. Backport-to: 25.0 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34625>	2025-04-21 15:16:02 +00:00
Marek Olšák	4a51089f30	radv: fix incorrect patch_outputs_read for TCS with dynamic state Fixes: `8c2f9f0665` - radv: switch to the new TCS LDS/offchip size computation Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	2948f7ce96	ac/gpu_info: rename tess ring variables, fold double_offchip_wg Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	d2e016c37d	ac/nir: don't store tess levels for TES in TCS if no_varying is set Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	be8977811b	ac/nir: remove shader_info parameter from ac_nir_compute_tess_wg_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	6d9e708642	ac/gpu_info: reduce the tess offchip ring size and compute it proportionately .. to the CU count. We allocated too much. This reduces the tess offchip ring size as follows (examples): - GFX11-12: - Navi31, Navi33, and Navi48 get 75% decrease. - Navi32 gets 68.75% decrease. - Phoenix gets 81.25% decrease. - Phoenix2 gets 93.75% decrease. - GFX10.3: - Navi21 and Navi22 get 37.5% decrease. - Navi23 and Navi24 get 50% decrease. - Rembrandt gets 62.5% decrease. - VanGogh gets 75% decrease. - Raphael gets 93.75% decrease. - GFX8-9: - Vega10 gets 0% decrease. - Vega20 gets 49.6% decrease. - Raven gets 65.3% decrease. - Raven2 gets 93.7% decrease. - Stoney gets 81% decrease. No difference in performance was measured. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	9333c0a1ed	ac/gpu_info: compute the tess factor ring size proportionately to the CU count No change in the size on GPUs with 16 CUs per SE such as Navi31 and Navi48. Navi21 and Navi32 get 25% increase. (20 CUs per SE) APUs get a significant decrease. For example: - Phoenix gets 25% decrease - Vangogh gets 50% decrease - Phoenix2 gets 75% decrease - Raphael and Stoney get 87.5% decrease Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	5fb2de9454	ac/nir: don't include TCS offchip size in LDS_SIZE This drastically reduces LDS usage for TCS. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	b8f2fb81f6	ac/gpu_info: print tessellation ring info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	b8d15fee3d	ac: minor cleanup of ac_compute_num_tess_patches No change in behavior. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	a905a17f39	ac: use HS offchip wg size from radeon_info in ac_compute_num_tess_patches Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	d82eda72a1	ac/gpu_info: move HS info into radeon_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	ea294349bd	radv: move the tess factor ring after the tess offchip ring to match radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:54:59 -04:00
Marek Olšák	c057d9105f	ac/gpu_info: add total_tess_ring_size Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:54:59 -04:00


17426 commits
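
Note on commit 2bcb55f3f6: the message argues that the loop bound `i < MIN2(instr->operands.back().size() - 1, 5 - num_vaddr)` keeps `vaddr[num_vaddr + i]` within a 5-element array. The sketch below checks that argument by brute force. It is not the code from aco_assembler.cpp: `kVaddrSize`, `max_written_index`, and `worst_case_index` are hypothetical names, the operand size range is an arbitrary assumption, and `std::min` stands in for `MIN2`. It uses signed `int` throughout, as the commit message says the fix does.

```cpp
#include <algorithm>

// Hypothetical stand-in for ARRAY_SIZE(vaddr) in the commit message.
constexpr int kVaddrSize = 5;

// Largest index num_vaddr + i that the quoted loop would write for one
// combination of inputs, or -1 if the loop body never runs.
int max_written_index(int num_vaddr, int operand_size)
{
   int max_index = -1;
   const int limit = std::min(operand_size - 1, kVaddrSize - num_vaddr);
   for (int i = 0; i < limit; i++)
      max_index = std::max(max_index, num_vaddr + i);
   return max_index;
}

// Scan every num_vaddr in [0, kVaddrSize] and an assumed range of operand
// sizes (1..16 dwords) and return the largest index ever written.
int worst_case_index()
{
   int worst = -1;
   for (int num_vaddr = 0; num_vaddr <= kVaddrSize; num_vaddr++)
      for (int operand_size = 1; operand_size <= 16; operand_size++)
         worst = std::max(worst, max_written_index(num_vaddr, operand_size));
   return worst;
}
```

Under these assumptions `worst_case_index()` returns 4, that is `kVaddrSize - 1`, which matches the `vaddr[4]` conclusion in the commit message.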