v3dv: lower oversized compute workgroups to 256 invocations

V3D advertises maxComputeWorkGroupInvocations = 256 but ggml-vulkan
in many cases ignores this limit an creates compute pipelines with
over this limit. Although this is a bug in the application we can
take advantage of nir_lower_workgroup_size and make the application
work.

This issue was causing an assertion failure at nir_to_vir.c:

  assert(c->local_invocation_index_bits <= 8);

The solution is lowering the oversized workgroups to a 256-invocation
workgroup loop, like radv and radeonsi are doing on GFX7, by running
nir_lower_workgroup_size(256) for this scenario.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41257>
This commit is contained in:
Jose Maria Casanova Crespo 2026-04-28 13:08:15 +02:00 committed by Marge Bot
parent c3ba5effe2
commit d95076e581

View file

@ -3143,6 +3143,26 @@ lower_compute(struct nir_shader *nir)
NIR_PASS(_, nir, nir_lower_explicit_io,
nir_var_mem_shared, nir_address_format_32bit_offset);
/* V3D can't execute workgroups with more than 256 invocations
* (maxComputeWorkGroupInvocations). If the shader requested a
* larger workgroup, serialize it into a 256-invocation one.
*/
const uint32_t wg_size = nir->info.workgroup_size[0] *
nir->info.workgroup_size[1] *
nir->info.workgroup_size[2];
if (wg_size > V3D_MAX_CSD_WG_SIZE) {
perf_debug("Compute shader requested workgroup size %u (>256); "
"lowering to a 256-invocation workgroup wrapping an "
"outer loop (workgroup_size=(%u,%u,%u)).\n",
wg_size,
nir->info.workgroup_size[0],
nir->info.workgroup_size[1],
nir->info.workgroup_size[2]);
NIR_PASS(_, nir, nir_lower_workgroup_size, V3D_MAX_CSD_WG_SIZE);
v3d_optimize_nir(NULL, nir);
nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
}
struct nir_lower_compute_system_values_options sysval_options = {
.has_base_workgroup_id = true,
};