nir: Add pass to lower workgroup size

Lowers a shader to use a smaller workgroup to do the same work,
while it will still appear as a bigger workgroup to applications.

To achieve this, the pass augments the CF of the shader
so that each real subgroup will execute two or more logical
subgroups. A logical subgroup represents what the application
can observe as a subgroup.

The size of a logical subgroup is the same as a real subgroup.
Only one logical subgroup may be executed per real subgroup
at the same time. This ensures that all subgroup operations
keep working and the subgroup invocation ID stays the same.

- When the CF contains barriers, we need can't just repeat
  the code and we need to augment each CF node individually
  so that they are aware of logical subgroups.

- In case parts of the CF don't contain any barriers, we can simply
  repeat and predicate that CF for each logical subgroup.
  It is technically not necessary to implement this strategy, but
  in practice it helps reduce the amount of branches in the shader
  and therefore improves compile times.

The pass is mainly intended for working around HW limitations,
for example when the HW has an upper limit on the workgroup size
or doesn't support workgroups at all, but the API requires a
certain minimum.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Anna Maniscalco <anna.maniscalco2000@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
This commit is contained in:
Timur Kristóf 2025-10-16 12:31:38 +02:00
parent c1bf22b56f
commit 1ff9c1fe5d
3 changed files with 1118 additions and 0 deletions

View file

@ -244,6 +244,7 @@ else
'nir_lower_bit_size.c', 'nir_lower_bit_size.c',
'nir_lower_ubo_vec4.c', 'nir_lower_ubo_vec4.c',
'nir_lower_uniforms_to_ubo.c', 'nir_lower_uniforms_to_ubo.c',
'nir_lower_workgroup_size.c',
'nir_lower_sysvals_to_varyings.c', 'nir_lower_sysvals_to_varyings.c',
'nir_metadata.c', 'nir_metadata.c',
'nir_mod_analysis.c', 'nir_mod_analysis.c',

View file

@ -5124,6 +5124,8 @@ bool nir_split_struct_vars(nir_shader *shader, nir_variable_mode modes);
bool nir_lower_returns_impl(nir_function_impl *impl); bool nir_lower_returns_impl(nir_function_impl *impl);
bool nir_lower_returns(nir_shader *shader); bool nir_lower_returns(nir_shader *shader);
bool nir_lower_workgroup_size(nir_shader *shader, const uint32_t target_wg_size);
nir_def *nir_inline_function_impl(nir_builder *b, nir_def *nir_inline_function_impl(nir_builder *b,
const nir_function_impl *impl, const nir_function_impl *impl,
nir_def **params, nir_def **params,

File diff suppressed because it is too large Load diff