mesa/src/panfrost/midgard
Alyssa Rosenzweig 59a3e12039 panfrost: do not push "true" UBOs
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!

The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:

- True UBOs are allocated in GPU BOs, not CPU-allocated buffers. That means they
  live in write-combine memory, which we cannot read from efficiently (at least
  depending on coherency details that were never plumbed through panfrost.ko and
  unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
  true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
  dubious... When I benchmarked this on MT8192 in early 2023, this pushing
  improved FPS in SuperTuxKart but hurt FPS in Dolphin.

- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
  infrastructure to sort this mess out in theory. What this means is that
  pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
  ever hit, our performance is utterly trashed. But it gets worse.

- True UBOs can be written in the same batch that reads them. For example, we
  could bind a buffer as a transform feedback buffer, do a draw with XFB, then
  rebind as a UBO and do a draw reading. This is where we collapse -- our logic
  will flush the writer, which is the same batch we were in the middle of
  enqueueing a draw to. When we try to push words, we'll crash with theatrics.
  This could be solved by smartening the batch tracking logic but it's not
  trivial by any means.
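
The three cases in the list above can be sketched as a toy classifier. This is
only an illustration of the reasoning, not the driver's batch tracking code:
`struct batch`, `struct resource`, `pending_writer` and `classify_cpu_push` are
all hypothetical names, and the model assumes each buffer has at most one
unsubmitted writer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a driver batch and a GPU buffer. */
struct batch {
   unsigned id;
};

struct resource {
   /* Batch with unsubmitted GPU writes to this buffer, or NULL if none. */
   const struct batch *pending_writer;
};

enum push_hazard {
   PUSH_SAFE,            /* no pending GPU writer, CPU contents are current */
   PUSH_FLUSH_AND_STALL, /* another batch writes it: flush it, then wait */
   PUSH_SELF_FLUSH,      /* the batch being recorded writes it */
};

/* Classify what reading `ubo` on the CPU at draw time would require while
 * `current` is the batch the draw is being enqueued to. */
static enum push_hazard
classify_cpu_push(const struct resource *ubo, const struct batch *current)
{
   assert(ubo != NULL && current != NULL);

   if (ubo->pending_writer == NULL)
      return PUSH_SAFE;

   /* Flushing the writer would flush the batch we are in the middle of. */
   if (ubo->pending_writer == current)
      return PUSH_SELF_FLUSH;

   return PUSH_FLUSH_AND_STALL;
}
```

In this model the XFB-then-UBO sequence lands in `PUSH_SELF_FLUSH`, the case
that crashes, while a write from an earlier batch lands in
`PUSH_FLUSH_AND_STALL`, the case that only destroys performance.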

So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!
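
As a rough sketch of what "stop doing it" amounts to, the push pass can refuse
any load that does not come from the lowered-uniform UBO. The struct and
function names here are hypothetical and the real pass works on compiler IR
rather than a plain struct. The only detail taken from the text above is that
OpenGL uniforms are lowered to UBO #0. The alignment and constant-offset checks
are assumptions added to make the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* OpenGL uniforms are lowered to UBO #0, which is CPU-allocated. */
#define LOWERED_UNIFORM_UBO 0

/* Hypothetical summary of one UBO load as seen by a push pass. */
struct ubo_load {
   unsigned ubo_index;   /* UBO binding being read */
   bool index_is_const;  /* false if the UBO is selected dynamically */
   uint32_t offset_B;    /* byte offset of the load */
   bool offset_is_const; /* false if the offset is computed at run time */
};

/* Return true if the load may be promoted to a pushed uniform. */
static bool
can_push_load(const struct ubo_load *load)
{
   assert(load != NULL);

   /* We must know at compile time exactly which word is read. */
   if (!load->index_is_const || !load->offset_is_const)
      return false;

   /* True UBOs live in GPU BOs and may have GPU writers: never read them
    * on the CPU at draw time. */
   if (load->ubo_index != LOWERED_UNIFORM_UBO)
      return false;

   /* Assumption: pushed uniforms are handled in 32-bit words. */
   return (load->offset_B % 4) == 0;
}
```

With this filter a load from UBO #0 at a constant, word-aligned offset is
still pushed, while the same load from any other UBO stays a regular UBO read
on the GPU.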

Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.

I have a branch somewhere that pushes on the GPU on v9, but I'm making this
change on NIR time to unblock a core change that was crashing piglit due to this
pile of unsoundness. Let's fix the correctness issues first, then someone can
look at recovering performance later when we're not blocking unrelated work.

Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)

According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
2025-04-10 08:05:21 +00:00
Name                               Last commit                                                      Date
compiler.h                         pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
disassemble.c                      midgard: Make disassembler take a const void*                    2024-06-17 07:31:50 +00:00
disassemble.h                      midgard: Make disassembler take a const void*                    2024-06-17 07:31:50 +00:00
helpers.h                          pan/midgard: constify pointers                                   2025-01-22 13:50:44 +00:00
meson.build                        panfrost: Kill panfrost-job.h                                    2025-01-07 11:10:55 +00:00
midgard.h
midgard_address.c
midgard_compile.c                  pan/mdg: call nir_lower_is_helper_invocation                     2025-03-08 07:47:40 +00:00
midgard_compile.h                  nir: remove dead code due to IO being always lowered in st/mesa  2025-01-22 02:15:04 +00:00
midgard_derivatives.c              pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
midgard_emit.c                     pan/midgard: constify pointers                                   2025-01-22 13:50:44 +00:00
midgard_errata_lod.c               treewide: use nir_shader_tex_pass                                2025-02-24 19:33:26 +00:00
midgard_helper_invocations.c
midgard_liveness.c                 pan/midgard: constify pointers                                   2025-01-22 13:50:44 +00:00
midgard_nir.h                      pan: s/NIR_PASS_V/NIR_PASS/                                      2024-12-05 08:49:45 +00:00
midgard_nir_algebraic.py           nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic   2024-10-03 09:02:07 +00:00
midgard_nir_lower_image_bitsize.c  treewide: use nir_metadata_control_flow                          2024-06-17 16:28:14 -04:00
midgard_nir_type_csel.c            pan: s/NIR_PASS_V/NIR_PASS/                                      2024-12-05 08:49:45 +00:00
midgard_ops.c
midgard_ops.h
midgard_opt_copy_prop.c
midgard_opt_dce.c
midgard_opt_perspective.c          pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
midgard_opt_prop.c
midgard_print.c                    pan/midgard: constify pointers                                   2025-01-22 13:50:44 +00:00
midgard_print_constant.c
midgard_quirks.h                   pan/mdg: quirk to disable auto32                                 2024-05-09 21:21:32 +00:00
midgard_ra.c                       pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
midgard_ra_pipeline.c
midgard_schedule.c                 pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
mir.c                              pan/compiler: don't pass midgard_instruction by value            2025-01-22 13:50:44 +00:00
mir_promote_uniforms.c             panfrost: do not push "true" UBOs                                2025-04-10 08:05:21 +00:00
mir_squeeze.c
nir_fuse_io_16.c                   treewide: Switch to nir_progress                                 2025-02-26 15:19:53 +00:00