mesa/src
Rhys Perry 7c6be36cf4 aco: don't emit waitcnts before subgroup-scope execution barriers
This delays the waitcnt for has_attr_ring_wait_bug by a few instructions.

fossil-db (gfx1201):
Totals from 9 (0.00% of 208640) affected shaders:
Instrs: 19352 -> 19506 (+0.80%)
CodeSize: 101180 -> 101716 (+0.53%)
Latency: 660221 -> 678782 (+2.81%); split: -0.00%, +2.81%
InvThroughput: 95106 -> 97398 (+2.41%)

fossil-db (navi33):
Totals from 58834 (28.20% of 208626) affected shaders:
Instrs: 22424304 -> 22424571 (+0.00%)
CodeSize: 110198112 -> 110199184 (+0.00%)
Latency: 115894319 -> 126491124 (+9.14%); split: -0.00%, +9.14%
InvThroughput: 19424631 -> 19754358 (+1.70%); split: -0.00%, +1.70%

I don't think the stats are very accurate. This seems to often move the
s_waitcnt down into a divergent branch, but the wait still happens later
if the branch isn't taken, so the wait is counted twice.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>
2026-06-10 12:13:18 +00:00
amd aco: don't emit waitcnts before subgroup-scope execution barriers 2026-06-10 12:13:18 +00:00
android_stub android_stub: purge unused log utils 2026-05-01 20:23:23 +00:00
asahi hk: use drirc_gen 2026-06-10 07:17:14 +00:00
broadcom v3dv: use drirc_gen 2026-06-10 07:17:14 +00:00
c11
compiler nir/lower_undef_to_zero: add filter argument 2026-06-10 06:05:05 +00:00
drm-shim drm-shim: Include the hex of the driver ioctl for unimplemented ioctls. 2026-06-04 20:17:34 +00:00
egl egl/gbm: Eliminate max_age local variable 2026-05-29 16:38:01 +00:00
etnaviv etnaviv: Update headers from rnndb 2026-06-06 22:02:04 +00:00
freedreno util/drirc: remove the driver option in drirc_validate 2026-06-10 07:17:14 +00:00
gallium iris: only call brw_nir_fs_needs_null_rt() with no render targets 2026-06-10 11:08:47 +00:00
gbm gbm: Replace VER_MIN with common MIN2 2026-04-30 13:00:03 +00:00
getopt
gfxstream gfxstream: kumquat: validate device dmabuf support before use 2026-06-08 21:38:19 +00:00
glx glx/windows: Drop static from driwindowsCreateScreen() 2026-05-18 13:33:35 +00:00
gtest
imagination pvr: use drirc_gen 2026-06-10 07:17:14 +00:00
imgui imgui: update copy and port all tools using it 2026-04-30 10:59:45 +00:00
intel brw: fix null render target decision 2026-06-10 11:08:47 +00:00
kosmickrisp nir/lower_undef_to_zero: add filter argument 2026-06-10 06:05:05 +00:00
loader loader: check if the kernel driver is amdgpu 2026-05-27 10:19:50 +00:00
mesa mesa/main: cast GLhandleARB to unsigned int in api trace 2026-06-05 23:43:49 -07:00
microsoft dzn: use drirc_gen 2026-06-10 07:17:14 +00:00
nouveau nvk: use the new generation script for drirc 2026-06-10 07:17:14 +00:00
panfrost panvk: Advertise VK_GOOGLE_display_timing 2026-06-10 11:33:00 +00:00
poly poly: Fix range used for index unroll bounds checks 2026-05-26 10:39:00 +00:00
tool pps: Re-emit time clock_sync more regularly 2026-05-06 21:37:15 +00:00
util util/drirc: remove the driver option in drirc_validate 2026-06-10 07:17:14 +00:00
virtio venus: use drirc_gen 2026-06-10 07:17:14 +00:00
vulkan vulkan/wsi: Constify wsi_instance_supports_google_display_timing(..) 2026-06-10 11:33:00 +00:00
x11 meson: Add support for building zink + Turnip/KGSL 2026-03-31 15:00:29 +00:00
.clang-format intel: add Jay 2026-04-10 18:21:21 +00:00
meson.build bin: add drm-shim script 2026-06-08 23:07:43 +00:00