Commit graph

18434 commits

Author SHA1 Message Date
Samuel Pitoiset
d40e841cc4 radv: dirty some states from graphics pipeline earlier
This might actually fix a couple of things because needed dynamic
states are computed before radv_emit_graphics_pipeline(), so dirtying
them too late doesn't make much sense.

This doesn't fix anything known.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36900>
2025-08-21 15:45:48 +00:00
Samuel Pitoiset
5024c02d45 radv: precompute the depth clip enable
This should avoid re-emitting some states if it doesn't actually change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36852>
2025-08-21 08:23:04 +00:00
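The idea of precomputing a derived state so that unchanged values are not re-emitted can be sketched as follows. This is a hypothetical Python illustration, not RADV code: `CmdState`, `DIRTY_VIEWPORT`, `set_depth_clip_enable` and `flush` are invented names. The derived value is cached, and the dirty bit is raised only when the cached value actually changes.

```python
DIRTY_VIEWPORT = 1 << 0  # hypothetical dirty bit, not a RADV constant


class CmdState:
    """Toy command-buffer state that caches one derived value."""

    def __init__(self):
        self.depth_clip_enable = None  # cached (precomputed) value
        self.dirty = 0                 # bitmask of states to re-emit
        self.emit_count = 0            # number of simulated emits

    def set_depth_clip_enable(self, value):
        # Raise the dirty bit only if the precomputed value changed.
        if self.depth_clip_enable != value:
            self.depth_clip_enable = value
            self.dirty |= DIRTY_VIEWPORT

    def flush(self):
        # Emit only the states flagged dirty, then clear the flags.
        if self.dirty & DIRTY_VIEWPORT:
            self.emit_count += 1
        self.dirty = 0


state = CmdState()
state.set_depth_clip_enable(True)
state.flush()
state.set_depth_clip_enable(True)   # same value: nothing becomes dirty
state.flush()
state.set_depth_clip_enable(False)  # changed value: one more emit
state.flush()
```

In this toy run the state is set three times but emitted only twice, because the second update does not change the cached value.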
Samuel Pitoiset
2b5844df0e radv: precompute the depth clamp mode
This should avoid re-emitting the state if it doesn't actually change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36852>
2025-08-21 08:23:04 +00:00
Samuel Pitoiset
413f781234 radv: add a new dirty bit for the viewport state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36852>
2025-08-21 08:23:03 +00:00
Samuel Pitoiset
2733b2953e radv: emit depth clamp enable as part of the viewport state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36852>
2025-08-21 08:23:02 +00:00
Samuel Pitoiset
9c6f37c533 radv: get the depth clamp mode earlier when emitting viewports
Outside of the loop is also faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36852>
2025-08-21 08:23:01 +00:00
Yiwei Zhang
ea902a0e41 radv: advertise present_id/wait behind RADV_USE_WSI_PLATFORM
wsi_common_vk_instance_supports_present_wait returns true for all
supported wsi platforms here, so we can unconditionally advertise them
behind RADV_USE_WSI_PLATFORM like the other wsi extensions (also to not
tangle with Android).

v2: guard presentId2 and presentWait2 features as well

v3: drop direct option query for vk_khr_present_wait

Acked-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36835>
2025-08-21 07:53:15 +00:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Marek Olšák
271a1d8dd9 util/hash_table: don't allocate hash_table_u64::table, declare it statically
We can use _mesa_hash_table_init instead of _mesa_hash_table_create.
It doesn't have to be allocated.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Marek Olšák
ed246aafd8 util/set: set _mesa_set_init return type to void
It always returns true because it no longer allocates anything.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Marek Olšák
c12118decf radv,zink,st/mesa: use _mesa_set_fini instead of ralloc_free
This is the correct way to free the set.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:47 +00:00
Yonggang Luo
9034a19aba radv: Fixes warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type';
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): error C2220: the following warning is treated as an error
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1040): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1052): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/layers/radv_sqtt_layer.c(1059): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

../src/amd/vulkan/radv_dgc.c(2155): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_dgc.c(2155): warning C5287: operands are different enum types 'rgp_sqtt_marker_event_type' and 'rgp_sqtt_marker_general_api_type'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_dgc.c(2155): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Yonggang Luo
58e55a9e45 radv: Fixes warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast
../src/amd/vulkan/radv_pipeline.c(148): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_pipeline.c(148): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline.c(148): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/amd/vulkan/radv_pipeline.c(150): warning C5287: operands are different enum types 'VkShaderStageFlagBits' and '<unnamed-enum-RADV_GRAPHICS_STAGE_BITS>'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline.c(150): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Yonggang Luo
1430798eac radv: Fixes warning implicit conversion from enum type
../src/amd/vulkan/radv_pipeline_rt.c(142): error C2220: the following warning is treated as an error
../src/amd/vulkan/radv_pipeline_rt.c(142): warning C5286: implicit conversion from enum type 'VkShaderGroupShaderKHR' to enum type 'VkRayTracingShaderGroupTypeKHR'; use an explicit cast to silence this warning
../src/amd/vulkan/radv_pipeline_rt.c(142): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
Yonggang Luo
652e0d8ccf amdcommon: Use { 0 } initialize struct for .c files
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36862>
2025-08-20 11:39:19 +00:00
David Rosca
f4808ea46f radv/video: Add support for VK_KHR_video_encode_intra_refresh
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36718>
2025-08-20 10:58:00 +00:00
Georg Lehmann
639b91bb48 aco/isel: fix vectorized i2i16 with 8bit vec8 source
The extract index is in dwords, not bytes.

Fixes: 92d433c54a ("aco: vectorize conversions from 8bit to 16bit")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>
2025-08-20 10:13:22 +00:00
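The unit mix-up named in this message can be illustrated with a small sketch. It is not ACO code: the helper names and the little-endian packing are assumptions made for the example. An 8-bit vec8 occupies two dwords, so component `i` sits in dword `i // 4` at byte `i % 4`, and an operation that expects a dword index must be given `i // 4` rather than `i`.

```python
def locate_component(i, bytes_per_dword=4):
    """Return (dword_index, byte_in_dword) for 8-bit component i."""
    return i // bytes_per_dword, i % bytes_per_dword


def extract_u8(dwords, i):
    """Extract 8-bit component i from a list of packed 32-bit dwords."""
    dword_index, byte_index = locate_component(i)
    return (dwords[dword_index] >> (8 * byte_index)) & 0xFF


# Eight components 0x10..0x17 packed little-endian into two dwords.
packed = [0x13121110, 0x17161514]
components = [extract_u8(packed, i) for i in range(8)]
```

Using the component number directly as the dword index would read past the two packed dwords for any `i >= 2`, which is the kind of error a bytes-versus-dwords confusion produces.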
David Rosca
638fa01203 radv/video: Enable AV1 decode workaround for gfx1153
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>
2025-08-20 09:51:32 +00:00
David Rosca
231d877cc8 ac/vcn_dec: Add av1_intrabc_workaround
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36725>
2025-08-20 09:51:32 +00:00
Samuel Pitoiset
e10d955bc4 radv/ci: document a very recent ACO regression on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>
2025-08-20 06:31:15 +00:00
Samuel Pitoiset
eaaef8db5a radv/ci: make radv-gfx1201-vkcts a pre-merge job
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>
2025-08-20 06:31:14 +00:00
Samuel Pitoiset
640aed5727 radv/ci: reduce the timeout for radv-gfx1201-vkcts
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>
2025-08-20 06:31:14 +00:00
Samuel Pitoiset
9b9f62125b radv/ci: use 3 parallel jobs for radv-gfx1201-vkcts
For pre-merge testing, it's required to be around 10 minutes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>
2025-08-20 06:31:14 +00:00
Samuel Pitoiset
d25952c3d3 radv/ci: update expected list of failures/flakes on GFX1201
50 runs in a row without any unexpected failures/hangs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806>
2025-08-20 06:31:14 +00:00
Kovac, Krunoslav
9452f2ca3f amd/vpelib: Minor Refactor
[WHY]
There will be more conditions for bypassing degamma, so refactor.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Chan, Roy
dda6a76b54 amd/vpelib: check stream_count as well before accessing streams
[WHY]
It was found that the caller may pass stream_count = 0 while the
streams array contains garbage.
This randomly results in output_ctx being modified, which leads to a
validation failure.

[HOW]
Add a check on stream_count.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Roy Chan <Roy.Chan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
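The guard described in this message can be sketched as below. This is a hypothetical Python illustration rather than vpelib code, and `first_stream` is an invented helper: the point is that the count is checked in addition to the array itself, so the array is never read when the caller reports zero streams.

```python
def first_stream(streams, stream_count):
    """Return the first stream, or None when there is nothing valid to read."""
    # Check the count as well as the array: with stream_count == 0 the
    # array contents are unspecified and must not be touched.
    if streams is None or stream_count == 0:
        return None
    return streams[0]


ignored = first_stream(["garbage"], 0)  # count says "no streams"
valid = first_stream(["stream0"], 1)
```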
Zhao, Jiali
2b50600a71 amd/vpelib: Extend TMZ value to 8 bit
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Ansari, Muhammad
c26cf7f74d amd/vpelib: VPE Events
[WHY]
Further debugging requires knowing the build cmd variables.

[HOW]
Added these input and output parameters to vpe events.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Leder, Brendan Steve (Brendan)
a486404e4d amd/vpelib: General cleanup / optimization tasks
Various small optimizations that have been accumulating, deal with them
in one commit:

- Add erase functionality for vector util, remove memsets for time opt.
- Update should_gen_cmd_info to take in any stream variables.
- Program funcs should directly program - update mpcc mux hook func to
  take in blend_mode.
- Add reserved bits for debug flags.

Signed-off-by: Brendan Steven, Leder <BrendanSteven.Leder@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Okenczyc, Andrzej
e5cdc78e0e amd/vpelib: Move predication size calculation to bufs_req
Calculation for the worst case scenario in bufs_req should also include
predication command size.

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Andrzei Okenczyc <Andrzej.Okenczyc@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Assadian, Navid
fbeaca1202 amd/vpelib: Add necessary pointer casting
Add necessary pointer casting to prevent unexpected behavior

Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Navid Assadian <Navid.Assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
2025-08-20 10:42:01 +08:00
Natalie Vock
4de3a5cce3 radv: Only expose indirect raytracing on gfx7+
It relies on unaligned indirect dispatches which are broken on gfx6.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30811>
2025-08-19 18:34:41 +00:00
Samuel Pitoiset
baaf5d643a radv: emit inlined push constants with buffered SH regs on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>
2025-08-19 18:01:23 +00:00
Samuel Pitoiset
c710eaa443 radv: emit descriptor pointers with buffered SH regs on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>
2025-08-19 18:01:22 +00:00
Samuel Pitoiset
95d2f009a9 radv: emit compute pipeline with buffered SH regs on GFX12
This also includes RT, task shaders and DGC IES for compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>
2025-08-19 18:01:21 +00:00
Samuel Pitoiset
bbf8338443 radv: rework the helper to emit buffered regs on GFX12
Also reserve enough space if needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>
2025-08-19 18:01:21 +00:00
Samuel Pitoiset
1f26f93aa7 radv: emit relocation for task shaders at the same place as other stages
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>
2025-08-19 18:01:21 +00:00
Daniel Schürmann
0546ecfadb aco/scheduler: small refactor of schedule_VMEM()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
Daniel Schürmann
0c590eb903 aco/scheduler: schedule VMEM store clauses during the regular forward pass
Totals from 1456 (1.82% of 79839) affected shaders: (Navi48)

MaxWaves: 37780 -> 37128 (-1.73%); split: +0.15%, -1.87%
Instrs: 3788175 -> 3788435 (+0.01%); split: -0.04%, +0.04%
CodeSize: 20468648 -> 20467432 (-0.01%); split: -0.04%, +0.03%
VGPRs: 86820 -> 91440 (+5.32%); split: -0.10%, +5.42%
Latency: 26866232 -> 26858867 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 3491741 -> 3828339 (+9.64%); split: -0.02%, +9.66%
VClause: 90413 -> 89426 (-1.09%); split: -1.27%, +0.18%
SClause: 130532 -> 130530 (-0.00%); split: -0.00%, +0.00%
Copies: 347397 -> 347806 (+0.12%); split: -0.11%, +0.23%
Branches: 117476 -> 117496 (+0.02%)
VALU: 1897427 -> 1897830 (+0.02%); split: -0.02%, +0.04%
SALU: 602365 -> 602379 (+0.00%)
VOPD: 1259 -> 1251 (-0.64%); split: +0.24%, -0.87%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
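The percentages in statistics lines such as the ones above can be recomputed from the before and after totals. The following is a minimal sketch, assuming the `Name: before -> after (+x.xx%)` layout seen in this log; the regular expression and helper names are illustrative and are not part of any Mesa tool.

```python
import re

# Matches the leading "Name: before -> after (+x.xx%)" part of a stats line.
STAT_RE = re.compile(r"^(\w+): (\d+) -> (\d+) \(([+-]\d+\.\d+)%\)")


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def check_line(line, tolerance=0.01):
    """Return (name, computed_percent, matches_reported) for one stats line."""
    m = STAT_RE.match(line)
    if m is None:
        raise ValueError("unrecognized stats line: " + line)
    name = m.group(1)
    before, after = int(m.group(2)), int(m.group(3))
    reported = float(m.group(4))
    computed = percent_change(before, after)
    return name, computed, abs(computed - reported) <= tolerance


name, computed, ok = check_line(
    "VGPRs: 86820 -> 91440 (+5.32%); split: -0.10%, +5.42%"
)
```

The `split` part of each line is ignored by this sketch; only the overall change is checked against the reported figure.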
Daniel Schürmann
f601eb8555 aco/scheduler: move clauses as batch
Totals from 391 (0.49% of 79839) affected shaders:

Instrs: 612478 -> 612515 (+0.01%); split: -0.06%, +0.06%
CodeSize: 3342896 -> 3343228 (+0.01%); split: -0.04%, +0.05%
Latency: 6909794 -> 6909938 (+0.00%); split: -0.03%, +0.03%
VClause: 10752 -> 10167 (-5.44%); split: -5.46%, +0.02%
Copies: 26623 -> 26627 (+0.02%); split: -0.00%, +0.02%
VALU: 377494 -> 377499 (+0.00%); split: -0.00%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
Daniel Schürmann
70f0c065e8 aco/scheduler: ignore potential SMEM stalls when forming clauses
Totals from 4190 (5.25% of 79839) affected shaders: (Navi48)

MaxWaves: 117020 -> 117014 (-0.01%)
Instrs: 4801892 -> 4801547 (-0.01%); split: -0.06%, +0.05%
CodeSize: 25327632 -> 25325500 (-0.01%); split: -0.05%, +0.04%
VGPRs: 236452 -> 236488 (+0.02%)
Latency: 30569070 -> 30539464 (-0.10%); split: -0.13%, +0.04%
InvThroughput: 4891650 -> 4891062 (-0.01%); split: -0.03%, +0.01%
VClause: 119615 -> 118763 (-0.71%); split: -1.02%, +0.31%
SClause: 100482 -> 100297 (-0.18%); split: -0.44%, +0.26%
Copies: 326644 -> 326756 (+0.03%); split: -0.19%, +0.22%
Branches: 98982 -> 98980 (-0.00%)
VALU: 2712397 -> 2712534 (+0.01%); split: -0.02%, +0.03%
SALU: 591836 -> 591817 (-0.00%); split: -0.00%, +0.00%
VOPD: 993 -> 987 (-0.60%); split: +0.20%, -0.81%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:11 +00:00
Daniel Schürmann
d3a0f268b9 aco/scheduler: short-cut downwards_move_clause() when no movement is done
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:11 +00:00
Daniel Schürmann
8543b6cf2e aco/scheduler: remove DownwardsCursor::clause_demand
As we stop scheduling after forming clauses, this value
is not needed anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:10 +00:00
Daniel Schürmann
5ae30deffb aco/scheduler: remove DownwardsCursor::insert_demand_clause
This partially reverts 93872270f0 ('aco/scheduler: keep track of RegisterDemand at DownwardsCursor::insert_idx{_clause}').

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:10 +00:00
Daniel Schürmann
e95d728a98 aco/scheduler: split downwards_move_clause() from downwards_move()
We will do batched moves for clauses with the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:09 +00:00
Daniel Schürmann
37299a8d1a aco/scheduler: Stop downwards scheduling after encountering the first clause
Totals from 9899 (12.40% of 79839) affected shaders: (Navi48)

MaxWaves: 276355 -> 276317 (-0.01%); split: +0.01%, -0.02%
Instrs: 8781768 -> 8766504 (-0.17%); split: -0.25%, +0.07%
CodeSize: 46297556 -> 46236104 (-0.13%); split: -0.19%, +0.06%
VGPRs: 574680 -> 574800 (+0.02%); split: -0.00%, +0.03%
Latency: 54261324 -> 54357916 (+0.18%); split: -0.14%, +0.32%
InvThroughput: 9122700 -> 9121115 (-0.02%); split: -0.07%, +0.05%
VClause: 222062 -> 218499 (-1.60%); split: -2.33%, +0.73%
SClause: 167138 -> 163233 (-2.34%); split: -2.43%, +0.09%
Copies: 602395 -> 598560 (-0.64%); split: -1.21%, +0.57%
Branches: 161939 -> 161932 (-0.00%); split: -0.01%, +0.00%
VALU: 5063999 -> 5060199 (-0.08%); split: -0.14%, +0.07%
SALU: 988254 -> 988285 (+0.00%); split: -0.02%, +0.02%
VOPD: 2478 -> 2443 (-1.41%); split: +0.40%, -1.82%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:09 +00:00
Daniel Schürmann
fb6b95517e aco/scheduler: check dependencies of entire clause upfront
and bail if any instruction of the clause can't be moved.

Totals from 4310 (5.40% of 79839) affected shaders:

MaxWaves: 115826 -> 115834 (+0.01%)
Instrs: 6256436 -> 6257599 (+0.02%); split: -0.05%, +0.07%
CodeSize: 32816488 -> 32820768 (+0.01%); split: -0.04%, +0.05%
VGPRs: 260184 -> 260172 (-0.00%)
Latency: 41207213 -> 41052150 (-0.38%); split: -0.45%, +0.07%
InvThroughput: 6822608 -> 6815208 (-0.11%); split: -0.14%, +0.03%
VClause: 148412 -> 147133 (-0.86%); split: -1.03%, +0.17%
SClause: 120854 -> 120856 (+0.00%); split: -0.01%, +0.01%
Copies: 425910 -> 427276 (+0.32%); split: -0.25%, +0.57%
VALU: 3572293 -> 3573647 (+0.04%); split: -0.03%, +0.07%
VOPD: 2803 -> 2816 (+0.46%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:08 +00:00
Daniel Schürmann
7e63251d1f aco/isel: refactor store_shared() by directly matching NIR intrinsics to ACO opcodes
Totals from 1435 (1.80% of 79839) affected shaders: (Navi48)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133>
2025-08-19 14:28:15 +00:00
Daniel Schürmann
e504c2543a radv: unconditionally call ac_nir_lower_mem_access_bit_sizes()
radv_nir_lower_io_to_mem() might also create unaligned memory accesses.

Totals from 1339 (1.68% of 79839) affected shaders: (Navi48)

MaxWaves: 35424 -> 35408 (-0.05%); split: +0.07%, -0.12%
Instrs: 1080783 -> 1047739 (-3.06%)
CodeSize: 5559464 -> 5311520 (-4.46%)
VGPRs: 78900 -> 78852 (-0.06%); split: -0.17%, +0.11%
Latency: 2802027 -> 2769668 (-1.15%); split: -1.16%, +0.01%
InvThroughput: 439935 -> 439313 (-0.14%); split: -0.23%, +0.09%
SClause: 15188 -> 15187 (-0.01%)
Copies: 63302 -> 62585 (-1.13%); split: -1.35%, +0.22%
PreVGPRs: 64891 -> 64901 (+0.02%)
VALU: 604979 -> 605116 (+0.02%); split: -0.04%, +0.06%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133>
2025-08-19 14:28:15 +00:00
Daniel Schürmann
1fde289539 aco/isel: refactor load_shared() by directly matching NIR intrinsics to ACO opcodes
Totals from 3 (0.00% of 79839) affected shaders: (Navi48)

Instrs: 700 -> 698 (-0.29%)
CodeSize: 3860 -> 3852 (-0.21%)
Latency: 2351 -> 2349 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133>
2025-08-19 14:28:15 +00:00