mesa/src/panfrost
Alyssa Rosenzweig 2c446b6636 pan/mdg: Limit work registers for large workgroups
When more than 8 registers are used, Midgard can only fit 64 threads in a
thread group. For barriers to work properly, a threadgroup must fit an entire
work group. The GL driver configures the hardware to have threadgroups the size
of work groups. That means if more than 64 threads are used in a workgroup, and
more than 8 registers are used, the hardware will fault spawning threads.

To workaround this hardware limitation, we need to limit the number of work
registers used depending on the size of the workgroup. Typically, the work group
size is known at compile-time so that determination can usually be made without
variants. To avoid variants, we make a pessimistic estimate in the case when
it's not known at compile-time.

shader-db shows 6 shaders affected. I expect that all of these would fault with
DATA_INVALID_FAULT if they tried to execute before this patch, due to the
oversize local size, and faulting is even slower than spilling ;-)

Fixes dEQP-GLES31.functional.synchronization.* on Mali-T860.

instructions HURT:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 121 -> 157 (29.75%)
instructions HURT:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 121 -> 157 (29.75%)
instructions HURT:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 141 -> 184 (30.50%)
instructions HURT:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 141 -> 184 (30.50%)
instructions HURT:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 513 -> 933 (81.87%)
instructions HURT:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 505 -> 1002 (98.42%)

bundles HURT:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 73 -> 116 (58.90%)
bundles HURT:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 73 -> 116 (58.90%)
bundles HURT:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 61 -> 97 (59.02%)
bundles HURT:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 61 -> 97 (59.02%)
bundles HURT:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 281 -> 701 (149.47%)
bundles HURT:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 278 -> 775 (178.78%)

registers helped:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 11 -> 8 (-27.27%)
registers helped:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 11 -> 8 (-27.27%)
registers helped:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 14 -> 8 (-42.86%)
registers helped:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 14 -> 8 (-42.86%)
registers helped:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 16 -> 8 (-50.00%)
registers helped:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 16 -> 8 (-50.00%)

threads helped:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)
threads helped:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)
threads helped:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)
threads helped:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)
threads helped:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)
threads helped:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%)

spills HURT:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 0 -> 5
spills HURT:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 0 -> 5
spills HURT:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 0 -> 8
spills HURT:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 0 -> 8
spills HURT:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 0 -> 112
spills HURT:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 0 -> 146

fills HURT:   shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 0 -> 26
fills HURT:   shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 0 -> 26
fills HURT:   shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 0 -> 33
fills HURT:   shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 0 -> 33
fills HURT:   shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 0 -> 209
fills HURT:   shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 0 -> 234

total instructions in shared programs: 1521691 -> 1522766 (0.07%)
instructions in affected programs: 1542 -> 2617 (69.71%)
helped: 0
HURT: 6
HURT stats (abs)   min: 36.0 max: 497.0 x̄: 179.17 x̃: 43
HURT stats (rel)   min: 29.75% max: 98.42% x̄: 50.13% x̃: 30.50%
95% mean confidence interval for instructions value: -49.36 407.69
95% mean confidence interval for instructions %-change: 17.14% 83.12%
Inconclusive result (value mean confidence interval includes 0).

total bundles in shared programs: 649296 -> 650371 (0.17%)
bundles in affected programs: 827 -> 1902 (129.99%)
helped: 0
HURT: 6
HURT stats (abs)   min: 36.0 max: 497.0 x̄: 179.17 x̃: 43
HURT stats (rel)   min: 58.90% max: 178.78% x̄: 94.01% x̃: 59.02%
95% mean confidence interval for bundles value: -49.36 407.69
95% mean confidence interval for bundles %-change: 36.20% 151.83%
Inconclusive result (value mean confidence interval includes 0).

total registers in shared programs: 90681 -> 90647 (-0.04%)
registers in affected programs: 82 -> 48 (-41.46%)
helped: 6
HURT: 0
helped stats (abs) min: 3.0 max: 8.0 x̄: 5.67 x̃: 6
helped stats (rel) min: 27.27% max: 50.00% x̄: 40.04% x̃: 42.86%
95% mean confidence interval for registers value: -8.03 -3.30
95% mean confidence interval for registers %-change: -50.95% -29.13%
Registers are helped.

total threads in shared programs: 55717 -> 55723 (0.01%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 6
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.00 1.00
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are helped.

total spills in shared programs: 1108 -> 1392 (25.63%)
spills in affected programs: 0 -> 284
helped: 0
HURT: 6

total fills in shared programs: 4721 -> 5282 (11.88%)
fills in affected programs: 0 -> 561
helped: 0
HURT: 6

Cc: mesa-stable
Closes: #7228
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19092>
2022-10-17 18:56:13 +00:00
..
bifrost panfrost: Remove load_kernel_input path 2022-10-05 16:09:21 +00:00
ci pan/mdg: Limit work registers for large workgroups 2022-10-17 18:56:13 +00:00
drm-shim panfrost: Advertise all textures in drm-shim 2022-04-26 17:47:49 +00:00
ds meson: replace manual compiler flags with meson arguments 2022-08-24 22:13:19 +00:00
include panfrost: Centralize our model list 2022-01-28 17:47:46 +00:00
lib util/mesa/wide: Rename _SIMPLE_MTX_INITIALIZER_NP to SIMPLE_MTX_INITIALIZER 2022-10-14 03:27:41 +00:00
midgard pan/mdg: Limit work registers for large workgroups 2022-10-17 18:56:13 +00:00
perf panfrost: Separate core ID range from core count 2022-07-08 01:14:55 +00:00
shared lima,panfrost: Use row stride for tiling routines 2022-05-03 14:20:15 +00:00
tools panfrost: Add userspace crash dump decoder and analyser 2022-08-31 23:54:55 +00:00
util panfrost: Honour flush-to-zero controls on Valhall 2022-09-19 17:22:58 +00:00
vulkan panvk: Implement VK_KHR_descriptor_update_template 2022-09-17 03:32:29 +00:00
meson.build panfrost: Add userspace crash dump decoder and analyser 2022-08-31 23:54:55 +00:00