v3d,v3dv: fix BO allocation for shared vars

We need to allocate "shared size" bytes for each workgroup in the
dispatch, but we were incorrectly multiplying by the number of
workgroups in each supergroup instead, which would typically cause
us to allocate less memory than actually required.

The reason this issue was not visible until now is that the kernel
driver uses a large page alignment on all BO allocations, which
causes us to "waste" a lot of memory after each allocation.
Incidentally, this wasted memory ensured that out-of-bounds
accesses would not cause issues, since they would typically land
in unused memory regions between aligned allocations. However,
experimenting with reduced memory alignments exposed the issue,
which manifested in the UE4 Shooter demo as a GPU hang caused
by corrupted state from out-of-bounds memory writes to CS
shared memory.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27675>
(cherry picked from commit 1880e7cfed)
Iago Toral Quiroga 2024-02-19 11:15:01 +01:00 committed by Eric Engestrom
parent 2e1ccf1c59
commit d0ea44cfdc
3 changed files with 3 additions and 3 deletions

@@ -2034,7 +2034,7 @@
 "description": "v3d,v3dv: fix BO allocation for shared vars",
 "nominated": true,
 "nomination_type": 0,
-"resolution": 0,
+"resolution": 1,
 "main_sha": null,
 "because_sha": null,
 "notes": null

@@ -4327,7 +4327,7 @@ cmd_buffer_create_csd_job(struct v3dv_cmd_buffer *cmd_buffer,
    if (cs_variant->prog_data.cs->shared_size > 0) {
       job->csd.shared_memory =
          v3dv_bo_alloc(cmd_buffer->device,
-                       cs_variant->prog_data.cs->shared_size * wgs_per_sg,
+                       cs_variant->prog_data.cs->shared_size * num_wgs,
                        "shared_vars", true);
       if (!job->csd.shared_memory) {
          v3dv_flag_oom(cmd_buffer, NULL);

@@ -1390,7 +1390,7 @@ v3d_launch_grid(struct pipe_context *pctx, const struct pipe_grid_info *info)
       v3d->compute_shared_memory =
          v3d_bo_alloc(v3d->screen,
                       v3d->prog.compute->prog_data.compute->shared_size *
-                      wgs_per_sg,
+                      num_wgs,
                       "shared_vars");
    }
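
The sizing arithmetic described in the commit message can be sketched in isolation. This is a minimal illustration, not driver code: the helper names (align_up, shared_bo_size, undersize_hidden_by_alignment) are hypothetical, and the sizes and alignments in the usage note are made-up values chosen only to show how a large allocation alignment could mask an undersized request.

```c
#include <stdint.h>

/* Hypothetical helper: round size up to a power-of-two alignment. */
static uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes needed for shared variables: one copy of the shader's
 * shared storage for each workgroup counted in num_wgs. */
static uint32_t shared_bo_size(uint32_t shared_size, uint32_t num_wgs)
{
    return shared_size * num_wgs;
}

/* Returns 1 when an undersized request still covers the required
 * bytes once it is rounded up to the allocation alignment, that is,
 * when the padding would absorb the out-of-bounds accesses. */
static int undersize_hidden_by_alignment(uint32_t required,
                                         uint32_t requested,
                                         uint32_t alignment)
{
    return align_up(requested, alignment) >= required;
}
```

For example, with an assumed shared_size of 1024 bytes, 64 workgroups in total and 16 workgroups per supergroup, the required size is 65536 bytes while the old expression requests only 16384. Under an assumed 128 KiB alignment the padded allocation still covers the required range, so the bug stays hidden; under an assumed 4 KiB alignment it does not.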