mesa/src
Timur Kristóf 1ff9c1fe5d nir: Add pass to lower workgroup size
Lowers a shader to use a smaller workgroup to do the same work,
while it will still appear as a bigger workgroup to applications.

To achieve this, the pass augments the control flow (CF) of the shader
so that each real subgroup will execute two or more logical
subgroups. A logical subgroup represents what the application
can observe as a subgroup.

The size of a logical subgroup is the same as a real subgroup.
Only one logical subgroup may be executed per real subgroup
at the same time. This ensures that all subgroup operations
keep working and the subgroup invocation ID stays the same.

- When the CF contains barriers, we can't just repeat
  the code. Instead, we need to augment each CF node individually
  so that it is aware of logical subgroups.

- When parts of the CF don't contain any barriers, we can simply
  repeat and predicate that CF for each logical subgroup.
  It is technically not necessary to implement this strategy, but
  in practice it helps reduce the number of branches in the shader
  and therefore improves compile times.
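
The difference between the two cases above can be illustrated with a
small CPU-side model. This is only a sketch under stated assumptions,
not the pass itself: it does not use any NIR API, it models a single
real subgroup, it leaves out predication of inactive logical
subgroups, and every name and size in it (SUBGROUP_SIZE,
LOGICAL_SUBGROUPS, the segment_* helpers, the run_* functions) is
hypothetical. The modelled shader writes to shared memory, hits a
barrier, then reads a neighbour's value.

```c
#include <assert.h>
#include <string.h>

/* All sizes and names are hypothetical, chosen only for illustration. */
#define SUBGROUP_SIZE     4  /* size of a real (and of a logical) subgroup */
#define LOGICAL_SUBGROUPS 2  /* logical subgroups run by one real subgroup */
#define LOGICAL_WG_SIZE   (SUBGROUP_SIZE * LOGICAL_SUBGROUPS)

static int shared_mem[LOGICAL_WG_SIZE];

/* Shader code before the barrier: each invocation publishes a value. */
static void segment_before_barrier(int logical_id)
{
   assert(logical_id >= 0 && logical_id < LOGICAL_WG_SIZE);
   shared_mem[logical_id] = 100 + logical_id;
}

/* Shader code after the barrier: each invocation reads its neighbour. */
static int segment_after_barrier(int logical_id)
{
   assert(logical_id >= 0 && logical_id < LOGICAL_WG_SIZE);
   return shared_mem[(logical_id + 1) % LOGICAL_WG_SIZE];
}

/* Repeating the whole shader once per logical subgroup. This is only
 * valid for barrier-free code: here the first logical subgroup runs
 * past the barrier before the second one has written anything. */
static void run_whole_shader_repeated(int out[LOGICAL_WG_SIZE])
{
   memset(shared_mem, 0, sizeof(shared_mem));
   for (int ls = 0; ls < LOGICAL_SUBGROUPS; ls++) {
      for (int lane = 0; lane < SUBGROUP_SIZE; lane++)
         segment_before_barrier(ls * SUBGROUP_SIZE + lane);
      /* The barrier would be here, but only one logical subgroup
       * has reached it. */
      for (int lane = 0; lane < SUBGROUP_SIZE; lane++) {
         int id = ls * SUBGROUP_SIZE + lane;
         out[id] = segment_after_barrier(id);
      }
   }
}

/* Repeating each barrier-free segment separately: every logical
 * subgroup finishes a segment before any of them passes the barrier. */
static void run_per_segment(int out[LOGICAL_WG_SIZE])
{
   memset(shared_mem, 0, sizeof(shared_mem));
   for (int ls = 0; ls < LOGICAL_SUBGROUPS; ls++)
      for (int lane = 0; lane < SUBGROUP_SIZE; lane++)
         segment_before_barrier(ls * SUBGROUP_SIZE + lane);
   /* The real barrier goes here; with a single real subgroup in this
    * model there is nothing else to wait for. */
   for (int ls = 0; ls < LOGICAL_SUBGROUPS; ls++) {
      for (int lane = 0; lane < SUBGROUP_SIZE; lane++) {
         int id = ls * SUBGROUP_SIZE + lane;
         out[id] = segment_after_barrier(id);
      }
   }
}
```

In this model, run_per_segment() gives every invocation its
neighbour's published value, while run_whole_shader_repeated() makes
the last invocation of the first logical subgroup read a slot that
has not been written yet. The lane index inside each logical subgroup
is the same in both variants, which matches the statement that the
subgroup invocation ID stays the same.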

The pass is mainly intended for working around HW limitations,
for example when the HW has an upper limit on the workgroup size
or doesn't support workgroups at all, but the API requires a
certain minimum.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Anna Maniscalco <anna.maniscalco2000@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2025-12-19 19:11:36 -06:00
amd radv: rename RADEON_FLAG_VA_UNCACHED to RADEON_FLAG_GL2_BYPASS 2025-12-16 07:17:08 +00:00
android_stub android_stub: fix missing prototypes issues 2025-12-02 20:03:02 +00:00
asahi nir: add nir_lower_single_sampled::lower_sample_mask_in option 2025-12-11 22:50:10 +00:00
broadcom v3dv: Enable TFU blits with raster destinations on 7.1 HW (RPi5) 2025-12-15 11:57:51 +00:00
c11 c11/threads: fix build on c23 2025-11-10 07:01:50 +10:00
compiler nir: Add pass to lower workgroup size 2025-12-19 19:11:36 -06:00
drm-shim drm-shim: handle DRM_CAP_ADDFB2_MODIFIERS 2025-11-24 12:34:08 +00:00
egl egl/x11: Fix memory leak when querying translated coord. 2025-12-11 14:58:59 +00:00
etnaviv Uprev Piglit to 2842979ebe03b99c33c3e49af5960c69be6c6d46 2025-12-12 21:45:24 +00:00
freedreno ir3/legalize: schedule (eq) more accurately 2025-12-13 00:01:02 +00:00
gallium panfrost: do not over-estimate format tib-size 2025-12-16 13:05:57 +00:00
gbm mesa: replace most occurrences of getenv() with os_get_option() 2025-11-06 04:36:13 +00:00
getopt
gfxstream meson: Remove VK_ICD_FILENAMES totally from source tree. 2025-12-10 14:46:11 +00:00
glx apple_cgl.c: Fix error: call to undeclared function 'os_get_option' 2025-11-20 18:39:19 +00:00
gtest
imagination treewide: Use wsi_common_is_swapchain_image() helper 2025-12-11 20:20:39 +00:00
imgui imgui: Silence build warnings for imgui 2025-09-16 06:16:19 +00:00
intel brw: Move MATH related validation 2025-12-16 01:34:46 +00:00
kosmickrisp treewide: Use wsi_common_is_swapchain_image() helper 2025-12-11 20:20:39 +00:00
loader loader: Wrap nouveau_zink_predicate with HAVE_LIBDRM 2025-11-20 18:39:19 +00:00
mesa gallium: Make upload_cb0 return a releasebuf 2025-12-12 00:55:55 +00:00
microsoft meson: Remove VK_ICD_FILENAMES totally from source tree. 2025-12-10 14:46:11 +00:00
nouveau nvk: Use rendering state attachment count when setting SET_CT_SELECT 2025-12-15 09:03:42 +00:00
panfrost panfrost: do not over-estimate format tib-size 2025-12-16 13:05:57 +00:00
poly nir: remove nir_io_add_const_offset_to_base 2025-11-29 00:16:38 +00:00
tool pps/meson: minor refactor for pps_deps 2025-11-08 18:39:00 -08:00
util util: Move STACK_ARRAY into util 2025-12-12 10:03:02 +01:00
virtio treewide: Use wsi_common_is_swapchain_image() helper 2025-12-11 20:20:39 +00:00
vulkan Revert "device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr" 2025-12-15 16:46:13 +00:00
x11 treewide: strip unneeded inc_gallium inc_gallium_aux 2025-11-13 22:01:43 +00:00
.clang-format util: Add sparse bitset data structure 2025-11-06 21:34:33 +00:00
meson.build kk: Add KosmicKrisp 2025-10-20 17:46:38 +00:00