mesa/src
Iago Toral Quiroga 086ed1e54b broadcom/compiler: emit instructions producing flags earlier
We usually emit flag-producing instructions right before the
flags are consumed, but this is suboptimal for register pressure:
if an instruction is only used to generate flags, delaying it
until just before the flags are read extends the live ranges of
its sources for no gain. This pass checks for such instructions
and tries to move them as early as possible.
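
The idea can be illustrated with a toy model. The sketch below is not the code from this commit: the `Inst` type, `hoist_flag_producers` and `max_live` are hypothetical names, the IR is a single straight-line block, and the legality rule (stop at the definition of a source, or at any instruction that reads or writes flags) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Inst:
    """Hypothetical toy instruction; not a Mesa data structure."""
    name: str
    dst: Optional[str] = None        # temp written, or None if flags-only
    srcs: Tuple[str, ...] = ()       # temps read
    writes_flags: bool = False
    reads_flags: bool = False


def hoist_flag_producers(block: List[Inst]) -> List[Inst]:
    """Move flags-only instructions as early as this toy model allows."""
    out = list(block)
    for pos in range(len(out)):
        inst = out[pos]
        # Only consider instructions whose sole result is the flags.
        if not inst.writes_flags or inst.dst is not None:
            continue
        target = pos
        while target > 0:
            prev = out[target - 1]
            # Stop at the definition of one of our sources, or at anything
            # that touches the flags (assumed single flags register).
            if prev.dst in inst.srcs or prev.writes_flags or prev.reads_flags:
                break
            target -= 1
        out.insert(target, out.pop(pos))
    return out


def max_live(block: List[Inst]) -> int:
    """Peak number of live temps, assuming nothing is live out of the block."""
    live, peak = set(), 0
    for inst in reversed(block):
        if inst.dst is not None:
            live.discard(inst.dst)
        live.update(inst.srcs)
        peak = max(peak, len(live))
    return peak


block = [
    Inst("ld_a", dst="a"),
    Inst("ld_b", dst="b"),
    Inst("ld_c", dst="c"),
    Inst("ld_d", dst="d"),
    Inst("add", dst="e", srcs=("c", "d")),
    Inst("cmp", srcs=("a", "b"), writes_flags=True),   # flags-only
    Inst("sel", dst="r", srcs=("e", "c"), reads_flags=True),
]
hoisted = hoist_flag_producers(block)
print([i.name for i in hoisted])
print(max_live(block), "->", max_live(hoisted))  # 4 -> 2
```

In this example the compare moves up to just after `ld_b`, so `a` and `b` die before `c`, `d` and `e` become live, and the peak number of live temps drops from 4 to 2.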

Shader-db results below show this is effective at reducing
register pressure, allowing a few shaders to increase thread
counts and/or reduce spilling:

total instructions in shared programs: 11057173 -> 11057076 (<.01%)
instructions in affected programs: 1955543 -> 1955446 (<.01%)
helped: 4214
HURT: 3905
Inconclusive result (value mean confidence interval includes 0).

total threads in shared programs: 425096 -> 425170 (0.02%)
threads in affected programs: 74 -> 148 (100.00%)
helped: 37
HURT: 0
Threads are helped.

total uniforms in shared programs: 3846275 -> 3845674 (-0.02%)
uniforms in affected programs: 23574 -> 22973 (-2.55%)
helped: 217
HURT: 30
Uniforms are helped.

total max-temps in shared programs: 2222910 -> 2220488 (-0.11%)
max-temps in affected programs: 61904 -> 59482 (-3.91%)
helped: 2145
HURT: 113
Max-temps are helped.

total spills in shared programs: 4294 -> 4280 (-0.33%)
spills in affected programs: 148 -> 134 (-9.46%)
helped: 8
HURT: 0

total fills in shared programs: 6497 -> 6468 (-0.45%)
fills in affected programs: 291 -> 262 (-9.97%)
helped: 8
HURT: 0

total sfu-stalls in shared programs: 14344 -> 14611 (1.86%)
sfu-stalls in affected programs: 1308 -> 1575 (20.41%)
helped: 217
HURT: 335
Inconclusive result (%-change mean confidence interval includes 0).

total inst-and-stalls in shared programs: 11071517 -> 11071687 (<.01%)
inst-and-stalls in affected programs: 1946767 -> 1946937 (<.01%)
helped: 4191
HURT: 3909
Inconclusive result (value mean confidence interval includes 0).

total nops in shared programs: 270628 -> 269829 (-0.30%)
nops in affected programs: 22032 -> 21233 (-3.63%)
helped: 1213
HURT: 571
Inconclusive result (%-change mean confidence interval includes 0).
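
The percentages in parentheses above are consistent with a plain relative change, (after - before) / before. A minimal sketch to re-derive them, assuming that formula and two-decimal formatting (`pct_change` is a hypothetical helper, not part of shader-db):

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from before to after, formatted to two decimals."""
    return f"{(after - before) / before * 100:.2f}%"


# Values copied from the "affected programs" lines of the report above.
print("threads: ", pct_change(74, 148))        # 100.00%
print("uniforms:", pct_change(23574, 22973))   # -2.55%
print("spills:  ", pct_change(148, 134))       # -9.46%
```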

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30511>
2024-08-07 09:28:39 +02:00
amd aco: test xor swap16 path 2024-08-06 20:40:12 +00:00
android_stub vulkan/android: Add helper to probe AHB support 2024-05-14 14:53:44 +00:00
asahi agx: use opt_uniform_atomics 2024-08-06 11:48:18 -04:00
broadcom broadcom/compiler: emit instructions producing flags earlier 2024-08-07 09:28:39 +02:00
c11 build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
compiler nir/opt_peephole_select: allow speculatable load constant 2024-08-06 20:01:37 +00:00
drm-shim drm-shim: stub synobj_timeline_wait and query ioctl 2024-07-16 11:17:59 +02:00
egl egl: simplify multibuffers check 2024-08-05 20:33:15 +00:00
etnaviv meson: centralize checking for new enough meson for rust support 2024-07-31 16:22:43 +00:00
freedreno ir3/postched: don't prioritize instructions with soft delays 2024-08-05 12:20:03 +00:00
gallium iris: Disable fast clear when surface height is 16k 2024-08-06 19:14:04 +00:00
gbm gbm: delete DRI_FLUSH remnants 2024-08-05 20:33:14 +00:00
getopt build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
glx glx: include src/gallium for apple 2024-08-01 16:01:17 +00:00
gtest build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
imagination pvr: Handle VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO 2024-07-23 10:44:21 +00:00
imgui
intel anv: Disable fast clear when surface height is 16k 2024-08-06 19:14:04 +00:00
loader gallium: move loader_dri_create_image to dri frontend 2024-08-01 15:28:03 +00:00
loader_dri3 loader/dri3: delete loader_dri3_extensions 2024-08-01 15:28:03 +00:00
mapi mesa_interface: Move out of GL/internal/ 2024-07-17 23:47:05 +00:00
mesa mesa: check for enabled extensions for *UID enums 2024-08-02 15:04:41 +00:00
microsoft microsoft/clc: Split struct copies before vars_to_ssa in pre-inline optimizations 2024-07-22 21:16:58 +00:00
nouveau nvk: EXT_post_depth_coverage 2024-08-05 19:26:04 +00:00
panfrost panvk: Pass attrib_buf_idx_offset to desc_copy_info 2024-07-26 08:55:26 +00:00
tool build: pass licensing information in SPDX form 2024-06-29 12:42:49 -07:00
util util/u_queue: Replace relative time wait hack with u_cnd_monotonic 2024-08-06 16:37:59 +00:00
virtio venus/ci: Update skip tests to prevent timeouts 2024-08-01 08:45:54 +00:00
vulkan vulkan: MESA_VK_ENABLE_SUBMIT_THREAD=0 disables threaded submit 2024-08-06 13:19:40 +00:00
x11 loader: move some common dri3 functions out of dri3 loader 2024-07-31 18:50:38 +00:00
.clang-format hk: add Vulkan driver for Apple GPUs 2024-07-26 18:40:47 +00:00
meson.build loader: split out dri3 into subdir 2024-07-31 18:50:38 +00:00