mesa/src at cb3ac95d030066e9965a693363911e678c413e03 - fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 00:28:08 +02:00


Christian Gmeiner cb3ac95d03 etnaviv: nir: improve uniform usage for ALU opc	2023-05-31 09:19:29 +00:00

The current code in lower_alu(..) counts how many const values are used by one ALU opc. If more than one is used, the compiler tries to fix this, e.g. by resolving them into a single combined const src. We do this because some GPUs only allow one const src per ISA instruction. It is allowed, however, to use the same const for multiple srcs.

Let's have a closer look at a real-world shader:

    impl main {
        /* preds: */
        vec1 32 ssa_0 = load_const (0x3f800000 = 1.000000)
        vec1 32 ssa_1 = load_const (0x00000000 = 0.000000)
        vec4 32 ssa_2 = intrinsic load_uniform (ssa_1) (base=0, range=1, dest_type=bool32 /*38*/) /* u_var */
        vec1 32 ssa_4 = fmul ssa_2.x, ssa_2.y
        vec1 32 ssa_11 = load_const (0x00000000 = 0.000000)
        vec1 32 ssa_13 = seq ssa_2.w, ssa_11
        vec1 32 ssa_6 = fmul ssa_2.z, ssa_13
        vec1 32 ssa_7 = fmul ssa_4, ssa_6
        vec1 32 ssa_9 = deref_var &gl_FragColor (shader_out vec4)
        vec4 32 ssa_10 = vec4 ssa_7, ssa_7, ssa_7, ssa_0
        intrinsic store_deref (ssa_9, ssa_10) (wrmask=xyzw /*15*/, access=0)
        /* succs: block_1 */
        block block_1:
    }

The current compiler transforms it to:

    impl main {
        block block_0:
        /* preds: */
        vec1 32 ssa_0 = load_const (0x3f800000 = 1.000000)
        vec4 32 ssa_14 = load_const (0x00000000, 0x00000001, 0x00000002, 0x00000003) = (0.000000, 0.000000, 0.000000, 0.000000)
        vec2 32 ssa_15 = load_const (0x00000000, 0x00000001) = (0.000000, 0.000000)
        vec1 32 ssa_4 = fmul ssa_15.x, ssa_15.y
        vec2 32 ssa_16 = load_const (0x00000003, 0x00000000) = (0.000000, 0.000000)
        vec1 32 ssa_13 = seq ssa_16.x, ssa_16.y
        vec1 32 ssa_6 = fmul ssa_14.z, ssa_13
        vec1 32 ssa_7 = fmul ssa_4, ssa_6
        vec1 32 ssa_9 = deref_var &gl_FragColor (shader_out vec4)
        vec1 32 ssa_17 = mov ssa_0
        vec4 32 ssa_10 = vec4 ssa_7, ssa_7, ssa_7, ssa_17
        intrinsic store_deref (ssa_9, ssa_10) (wrmask=xyzw /*15*/, access=0)
        /* succs: block_1 */
        block block_1:
    }

There is no need to create ssa_15, as we can use ssa_14 for the first fmul.

With this change the compiler creates the following shader:

    impl main {
        block block_0:
        /* preds: */
        vec1 32 ssa_0 = load_const (0x3f800000 = 1.000000)
        vec4 32 ssa_14 = load_const (0x00000000, 0x00000001, 0x00000002, 0x00000003) = (0.000000, 0.000000, 0.000000, 0.000000)
        vec1 32 ssa_4 = fmul ssa_14.x, ssa_14.y
        vec2 32 ssa_15 = load_const (0x00000003, 0x00000000) = (0.000000, 0.000000)
        vec1 32 ssa_13 = seq ssa_15.x, ssa_15.y
        vec1 32 ssa_6 = fmul ssa_14.z, ssa_13
        vec1 32 ssa_7 = fmul ssa_4, ssa_6
        vec1 32 ssa_9 = deref_var &gl_FragColor (shader_out vec4)
        vec1 32 ssa_16 = mov ssa_0
        vec4 32 ssa_10 = vec4 ssa_7, ssa_7, ssa_7, ssa_16
        intrinsic store_deref (ssa_9, ssa_10) (wrmask=xyzw /*15*/, access=0)
        /* succs: block_1 */
        block block_1:
    }

This change reduces immediate pressure and reduces the CPU cycles spent.

No piglit or deqp regression seen.

shader-db results for GC2000:

    total instructions in shared programs: 955128 -> 955128 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total temps in shared programs: 85689 -> 85689 (0.00%)
    temps in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total immediates in shared programs: 155428 -> 155240 (-0.12%)
    immediates in affected programs: 1840 -> 1652 (-10.22%)
    helped: 34
    HURT: 1
    helped stats (abs) min: 4 max: 16 x̄: 5.65 x̃: 4
    helped stats (rel) min: 2.94% max: 33.33% x̄: 16.92% x̃: 16.67%
    HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
    HURT stats (rel) min: 14.29% max: 14.29% x̄: 14.29% x̃: 14.29%
    95% mean confidence interval for immediates value: -6.57 -4.17
    95% mean confidence interval for immediates %-change: -19.83% -12.23%
    Immediates are helped.

    total loops in shared programs: 0 -> 0
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0
    LOST: 0
    GAINED: 0

    Total CPU time (seconds): 102.55 -> 96.35 (-6.05%)

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23323>
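The rule described in the commit message (one const src per instruction, but the same const may feed several srcs) can be illustrated with a small standalone sketch. This is not the code from lower_alu(..); the struct alu_src type, its fields, and both helper functions are hypothetical names invented for illustration, and the model assumes a constant vector can be identified by a plain integer id.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ALU source operand.
 * This is NOT Mesa's nir_alu_src; it only models what the sketch needs. */
struct alu_src {
   bool is_const;  /* true if the source reads a constant/uniform vector */
   int  const_id;  /* which constant vector is read, e.g. 14 for ssa_14 */
   char swizzle;   /* selected component: 'x', 'y', 'z' or 'w' */
};

/* Count how many distinct constant vectors one instruction reads.
 * Sources that select different components of the same vector count once,
 * because the same const may be used for multiple srcs. */
static unsigned
count_distinct_consts(const struct alu_src *srcs, size_t num_srcs)
{
   unsigned distinct = 0;

   for (size_t i = 0; i < num_srcs; i++) {
      if (!srcs[i].is_const)
         continue;

      /* Was this constant vector already seen in an earlier source? */
      bool seen = false;
      for (size_t j = 0; j < i; j++) {
         if (srcs[j].is_const && srcs[j].const_id == srcs[i].const_id) {
            seen = true;
            break;
         }
      }

      if (!seen)
         distinct++;
   }

   return distinct;
}

/* A combined constant is only needed when more than one distinct
 * constant vector is read by the instruction. */
static bool
needs_combined_const(const struct alu_src *srcs, size_t num_srcs)
{
   return count_distinct_consts(srcs, num_srcs) > 1;
}
```

Under this model, `fmul ssa_14.x, ssa_14.y` reads a single constant vector and needs no extra load_const, while `seq ssa_2.w, ssa_11` reads two different ones and still gets a combined vec2, which matches the shader dumps in the commit message.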
amd	Fix DGC bug where indirect count > maxSequencesCount.	2023-05-31 07:49:54 +00:00
android_stub	util/log: improve logger_android	2023-02-22 17:55:40 +00:00
asahi	asahi: Reformat using the new style	2023-05-29 21:06:12 +00:00
broadcom	v3dv: Update texture padding logic to match v3d changes	2023-05-31 05:27:08 +00:00
c11
compiler	nir/print: Print locations for geometry shader inputs	2023-05-30 16:25:07 -04:00
drm-shim	drm-shim: Use anonymous file for file override	2023-05-16 04:31:22 +00:00
egl	meson: remove needless c++17-overrides	2023-05-19 12:45:31 +00:00
etnaviv	mesa/main: drop use_legacy_math_rules	2023-05-04 06:11:44 +00:00
freedreno	freedreno/drm: Don't try to export suballoc bo	2023-05-30 21:37:12 +00:00
gallium	etnaviv: nir: improve uniform usage for ALU opc	2023-05-31 09:19:29 +00:00
gbm	gbm: drop unnecessary vulkan dependency	2023-02-23 18:31:22 +00:00
getopt
glx	glx: fix build with APPLEGL	2023-05-15 03:50:30 +00:00
gtest	gtest: Update to 1.13.0	2023-05-14 11:09:02 +00:00
imagination	pvr: Fix page faults in occlusion query tests	2023-05-30 10:53:41 +00:00
imgui
intel	intel/dev: switch defect identifiers to use lineage numbers	2023-05-30 22:13:41 +00:00
loader	loader/dri3: temporarily work around a crash when front is NULL	2023-05-18 06:25:46 +00:00
mapi	mesa: Add EXT_instanced_arrays support	2023-04-11 10:22:35 +00:00
mesa	treewide: Use nir_replicate	2023-05-30 16:24:21 -04:00
microsoft	treewide: Use nir_replicate	2023-05-30 16:24:21 -04:00
nouveau	treewide: Avoid nir_lower_regs_to_ssa calls	2023-05-24 17:30:03 +00:00
panfrost	pan/lower_framebuffer: Use nir_replicate	2023-05-30 16:24:21 -04:00
tool	meson: remove needless c++17-overrides	2023-05-19 12:45:31 +00:00
util	anv: override vendorID for Cyberpunk 2077	2023-05-30 01:05:36 -07:00
virtio	venus: enable VK_EXT_image_2d_view_of_3d	2023-05-30 22:52:12 +00:00
vulkan	vulkan: use cmd size array for queued cmd allocations	2023-05-31 03:13:22 +00:00
.clang-format	treewide: Add a .clang-format file	2023-05-29 21:06:12 +00:00
meson.build	hgl: remove	2023-02-18 00:44:43 +00:00