Samuel Pitoiset
9b774963fe
radv/ci: set RADV_DEBUG=novideo for NAVI21
...
Otherwise, the jobs just hang.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:50 +00:00
Samuel Pitoiset
82cd2df7b0
radv/ci: bump number of deqp-runner jobs to 32 for GFX1201
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:50 +00:00
Samuel Pitoiset
6fd1b9b397
radv/ci: drop RADV_PERFTEST=video_decode,video_encode for NAVI31
...
With up-to-date video firmwares, these flags are no longer needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:50 +00:00
Samuel Pitoiset
49d780db93
radv/ci: use the custom 6.17.3 kernel for POLARIS10
...
Looks like the SDMA regression is no longer reproducible.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:49 +00:00
Samuel Pitoiset
07d1461c53
radv/ci: use the custom 6.17.3 kernel for NAVI21/NAVI31
...
Now that the zerovram performance regression is fixed, everything
should be fine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:49 +00:00
Samuel Pitoiset
fc178c047d
radv/ci: uprev kernel to 6.17.3 + drm/buddy backported fixes for zerovram
...
Until we have a stable kernel which contains these fixes.
This applies to all jobs except NAVI21/NAVI31 which still use 6.6 and
POLARIS10 which is stucked to 6.15.9 for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37934 >
2025-10-20 10:30:48 +00:00
Benjamin Cheng
b1370e1935
radv/video: Fill maxCodedExtent caps first
...
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
Later code (i.e. max qp map extent filling) depends on this.
Fixes: ae6ea69c85 ("radv: Implement VK_KHR_video_encode_quantization_map")
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37940 >
2025-10-17 17:58:08 +00:00
Josh Simmons
b10c1a1952
radv: Fix crash in sqtt due to uninitalized value
...
Fixes: 772b9ce411 ("radv: Remove qf from radv_spm/sqtt/perfcounter where applicable")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37900 >
2025-10-17 06:10:46 +00:00
Samuel Pitoiset
abcaa46f6c
amd,radv,radeonsi: add ac_cmdbuf_flush_vgt_streamout()
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:41 +00:00
Samuel Pitoiset
679332f9a9
amd,radv,radeonsi: add ac_emit_cp_acquire_mem()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:40 +00:00
Samuel Pitoiset
9ad7fb8569
amd,radv,radeonsi: add ac_emit_cp_gfx_scratch()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:40 +00:00
Samuel Pitoiset
9ff8e71b4e
amd,radv,radeonsi: add ac_emit_cp_tess_rings()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:39 +00:00
Samuel Pitoiset
47a64f5b6f
amd,radv,radeonsi: add ac_emit_cp_gfx11_ge_rings()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:38 +00:00
Samuel Pitoiset
8f80a8502d
radv: use ac_emit_cp_pfp_sync_me() more
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:37 +00:00
Samuel Pitoiset
044bafb6ac
amd: add a predicate parameter to ac_emit_cp_pfp_sync_me()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:36 +00:00
Samuel Pitoiset
48b4a43e8f
amd,radv,radeonsi: add ac_emit_cp_set_predication()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:36 +00:00
Samuel Pitoiset
ad907efae2
radv: use ac_emit_cond_exec() more
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870 >
2025-10-16 06:31:35 +00:00
Alyssa Rosenzweig
84d8e6824b
treewide: don't check before free
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.
Via Coccinelle patches:
@@
expression ptr;
@@
-if (ptr) {
-free(ptr);
-}
+free(ptr);
@@
expression ptr;
@@
-if (ptr) {
-FREE(ptr);
-}
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr) {
-ralloc_free(ptr);
-}
+ralloc_free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-free(ptr);
-}
-
+free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-FREE(ptr);
-}
-
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-ralloc_free(ptr);
-}
-
+ralloc_free(ptr);
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892 >
2025-10-15 23:01:33 +00:00
Timur Kristóf
4982f435f9
radv: Document SWITCH_ON_EOP and WD_SWITCH_ON_EOP
...
Just add some code comments for the next person trying to
understand these bits. No functional changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885 >
2025-10-15 18:08:50 +00:00
Timur Kristóf
8ea08747b8
radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR
...
Mitigate a GPU hang in Dota 2 and Rise of the Tomb Raider
by reducing the primitive rate for triangle lists.
This workaround is not documented by AMD and may not be correct.
The problem isn't well understood and needs further investigation
to narrow down what the root cause is. Until then, it's better
to give users something that works, even if not optimal.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885 >
2025-10-15 18:08:50 +00:00
Timur Kristóf
6f499141f5
radv: Disable compute queues when the regalloc bug is present
...
Compute queues may run compute dispatches in parallel with
the graphics queue, even from other processes/apps.
At the moment we can't make sure that all compute shaders
use a workgroup size of 256 to mitigate the regalloc hang,
so disable compute queues on affected chips.
Can be reverted if a better mitigation is found in the future.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885 >
2025-10-15 18:08:49 +00:00
Rhys Perry
2985fb0df3
radv: allow WGP mode with task/mesh
...
In practice, mesh shaders probably won't use WGP mode because they use
LDS.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791 >
2025-10-15 13:37:48 +01:00
Rhys Perry
dd2f34c777
radv: use CU mode when LDS is used
...
This improves performance of llama.cpp.
fossil-db (navi21):
Totals from 1598 (2.00% of 79825) affected shaders:
MaxWaves: 30182 -> 29278 (-3.00%); split: +0.04%, -3.03%
Instrs: 1013136 -> 1013065 (-0.01%); split: -0.07%, +0.07%
CodeSize: 5275876 -> 5274948 (-0.02%); split: -0.06%, +0.04%
VGPRs: 86176 -> 88016 (+2.14%); split: -0.22%, +2.36%
SpillVGPRs: 0 -> 11 (+inf%)
Scratch: 0 -> 4096 (+inf%)
Latency: 7954289 -> 7824742 (-1.63%); split: -1.64%, +0.01%
InvThroughput: 1511429 -> 1510912 (-0.03%); split: -0.89%, +0.86%
VClause: 26503 -> 26460 (-0.16%); split: -0.23%, +0.07%
SClause: 19032 -> 19039 (+0.04%); split: -0.01%, +0.05%
Copies: 74577 -> 74329 (-0.33%); split: -0.79%, +0.46%
Branches: 20278 -> 20279 (+0.00%)
VALU: 665079 -> 664831 (-0.04%); split: -0.09%, +0.05%
SALU: 124899 -> 124818 (-0.06%); split: -0.08%, +0.01%
VMEM: 46141 -> 46163 (+0.05%)
fossil-db (navi31):
Totals from 1609 (2.02% of 79825) affected shaders:
MaxWaves: 39724 -> 38880 (-2.12%)
Instrs: 1147767 -> 1147595 (-0.01%); split: -0.04%, +0.03%
CodeSize: 5777072 -> 5776376 (-0.01%); split: -0.03%, +0.02%
VGPRs: 91752 -> 93132 (+1.50%); split: -0.03%, +1.53%
Latency: 7526930 -> 7396201 (-1.74%); split: -1.74%, +0.00%
InvThroughput: 1083131 -> 1088328 (+0.48%); split: -0.45%, +0.93%
VClause: 25864 -> 25789 (-0.29%); split: -0.33%, +0.04%
SClause: 19136 -> 19135 (-0.01%); split: -0.02%, +0.01%
Copies: 80797 -> 80501 (-0.37%); split: -0.42%, +0.05%
VALU: 674455 -> 674160 (-0.04%); split: -0.05%, +0.01%
SALU: 123849 -> 123806 (-0.03%)
fossil-db (gfx1201):
Totals from 1614 (2.02% of 79839) affected shaders:
MaxWaves: 40140 -> 39296 (-2.10%)
Instrs: 1183227 -> 1183102 (-0.01%); split: -0.04%, +0.03%
CodeSize: 6091060 -> 6090636 (-0.01%); split: -0.03%, +0.03%
VGPRs: 90708 -> 92040 (+1.47%); split: -0.01%, +1.48%
Latency: 7588683 -> 7425866 (-2.15%); split: -2.15%, +0.00%
InvThroughput: 1070469 -> 1075700 (+0.49%); split: -0.50%, +0.99%
VClause: 25691 -> 25597 (-0.37%); split: -0.37%, +0.00%
SClause: 19095 -> 19086 (-0.05%); split: -0.05%, +0.01%
Copies: 80753 -> 80452 (-0.37%); split: -0.42%, +0.05%
VALU: 665218 -> 664922 (-0.04%); split: -0.05%, +0.01%
SALU: 144059 -> 144011 (-0.03%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791 >
2025-10-15 13:37:48 +01:00
Rhys Perry
3e9921f52e
radv: only call radv_should_use_wgp_mode() once
...
This will let the compiler choose between CU and WGP mode.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791 >
2025-10-15 13:37:48 +01:00
Daniel Schürmann
b2c44e3a65
amd: change radeon_info::lds_size_per_workgroup for GFX10+ to 64KB
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Even though in WGP-mode, there is a total of 128KB LDS, a single workgroup
can access at most 64KB.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:09 +00:00
Daniel Schürmann
eecd1c020d
amd: keep ac_shader_config::lds_size unaligned
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:09 +00:00
Daniel Schürmann
fe6ff6d1ef
aco: remove DeviceInfo::lds_encoding_granule and DeviceInfo::lds_alloc_granule
...
Use utility functions instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:08 +00:00
Daniel Schürmann
f99eba729d
amd/common: remove radeon_info::lds_alloc_granularity and radeon_info::lds_encode_granularity
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:08 +00:00
Daniel Schürmann
6fd5766620
amd: add and use utility functions for LDS size encoding
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:08 +00:00
Daniel Schürmann
11db02d5d9
radv: calculate LDS allocation requirements independently from the compiler
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:07 +00:00
Daniel Schürmann
b651234414
amd: change ac_shader_config::lds_size to bytes
...
We still keep it aligned to allocation granularity.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:07 +00:00
Daniel Schürmann
190bab9b77
radv: use lds_alloc_granularity alignment for stats
...
Totals from 780 (0.98% of 79839) affected shaders: (Navi48)
LDS: 5544960 -> 5944320 (+7.20%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:07 +00:00
Daniel Schürmann
c35d53667e
radv: fix max_waves calculation for tesselation
...
Totals from 337 (0.42% of 79839) affected shaders: (Navi48)
MaxWaves: 8084 -> 4474 (-44.66%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:07 +00:00
Daniel Schürmann
216d14fb65
radv/rt: fix LDS size calculation with LLVM for inlined stages
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577 >
2025-10-15 11:20:07 +00:00
David Rosca
282e8285f1
ac/surface: Limit video modifiers to 64K_S also for VCN 2.2
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14032
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37819 >
2025-10-15 10:24:29 +00:00
Samuel Pitoiset
e7695cc7c7
radv/ci: bump timeout for radv-gfx1201-vkcts to 5 minutes more
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37665 >
2025-10-15 09:18:48 +00:00
Samuel Pitoiset
24a95ed3e5
ci: uprev VKCTS main to db48c34bebaf3359453e44ab151a2ff9f9c58eb2
...
RADV is the only driver using main.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37665 >
2025-10-15 09:18:48 +00:00
Samuel Pitoiset
88f53906d8
amd,radv,radeonsi: add ac_emit_cp_pfp_sync_me()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:25 +00:00
Samuel Pitoiset
7ead034a06
amd,radv,radeonsi: add ac_emit_cp_copy_data()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:25 +00:00
Samuel Pitoiset
af169d7393
radv: use ac_emit_cp_{acquire,release}_mem_pws() when syncing GE rings
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:24 +00:00
Samuel Pitoiset
0e358cec52
amd,radv,radeonsi: add ac_emit_cp_release_mem_pws()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:24 +00:00
Samuel Pitoiset
c45035ceb4
amd,radv,radeonsi: add ac_emit_cp_acquire_mem_pws()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:23 +00:00
Samuel Pitoiset
6329e282b8
amd,radv,radeonsi: add ac_emit_cp_wait_mem()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:23 +00:00
Samuel Pitoiset
82cbfc964a
amd,radv: add ac_emit_write_data_imm()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:20 +00:00
Samuel Pitoiset
ac262c351f
amd,radv: add ac_emit_cond_exec()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:20 +00:00
Samuel Pitoiset
34e445e1fc
radv: use COPY_DATA_DST_MEM when writing timestamps
...
Only for consistency.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813 >
2025-10-15 07:35:19 +00:00
David Rosca
3e458d06b8
radeonsi/vcn: Support BT2020 matrix with EFC
...
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37058 >
2025-10-15 06:06:44 +00:00
Daniel Schürmann
d0b87a0d5f
ac/nir_flag_smem_for_loads: call divergence analysis internally
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Also don't flag more SMEM instructions (in ACO) after the last
call to ac_nir_lower_mem_access_bit_sizes().
Totals from 75 (0.09% of 79839) affected shaders: (Navi48)
Instrs: 191246 -> 189960 (-0.67%)
CodeSize: 996840 -> 985976 (-1.09%)
Latency: 3066184 -> 2945500 (-3.94%)
InvThroughput: 355373 -> 353106 (-0.64%); split: -0.66%, +0.02%
SClause: 4848 -> 4699 (-3.07%)
Copies: 13827 -> 13925 (+0.71%); split: -0.07%, +0.78%
Branches: 5176 -> 5003 (-3.34%)
PreSGPRs: 6222 -> 6272 (+0.80%)
VALU: 108934 -> 108993 (+0.05%); split: -0.00%, +0.06%
SALU: 31679 -> 31210 (-1.48%); split: -1.51%, +0.03%
SMEM: 7158 -> 6739 (-5.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843 >
2025-10-14 16:33:12 +00:00
Daniel Schürmann
c8830a1a79
radv: delay ac_nir_lower_mem_access_bit_sizes
...
Totals from 937 (1.17% of 79839) affected shaders: (Navi48)
MaxWaves: 25196 -> 25272 (+0.30%); split: +0.33%, -0.03%
Instrs: 619389 -> 617586 (-0.29%); split: -0.38%, +0.09%
CodeSize: 3161156 -> 3147008 (-0.45%); split: -0.49%, +0.04%
VGPRs: 54456 -> 54300 (-0.29%); split: -0.31%, +0.02%
Latency: 1548791 -> 1550501 (+0.11%); split: -0.16%, +0.27%
InvThroughput: 233599 -> 233561 (-0.02%); split: -0.10%, +0.08%
Copies: 42820 -> 42799 (-0.05%); split: -0.63%, +0.58%
PreVGPRs: 43480 -> 43479 (-0.00%)
VALU: 348357 -> 348358 (+0.00%); split: -0.08%, +0.08%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843 >
2025-10-14 16:33:12 +00:00
Daniel Schürmann
9b1a635bb3
amd/common: merge radv_nir_opt_access_speculate() into ac_nir_flag_smem_for_loads()
...
One shader is negatively affected, but we save 2 entire iterations over every shader.
This effect is also mitigated with the next commits.
Totals from 1 (0.00% of 79839) affected shaders: (Navi48)
Instrs: 947 -> 958 (+1.16%)
CodeSize: 4728 -> 4732 (+0.08%)
Latency: 20678 -> 20723 (+0.22%)
InvThroughput: 2697 -> 2698 (+0.04%)
SClause: 26 -> 27 (+3.85%)
Copies: 139 -> 145 (+4.32%)
Branches: 46 -> 47 (+2.17%)
VALU: 460 -> 463 (+0.65%)
SALU: 201 -> 204 (+1.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843 >
2025-10-14 16:33:12 +00:00