mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-05-16 07:38:14 +02:00
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!
The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:
- True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's
write-combine memory, which we cannot read from efficiently (at least
depending on coherency details that were never plumbed through panfrost.ko and
unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
dubious... When I benchmarked this on MT8192 in early 2023, this pushing
improved FPS in SuperTuxKart but hurt FPS in Dolphin.
- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
infrastructure to sort this mess out in theory. What this means is that
pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
ever hit, our performance is utterly trashed. But it gets worse.
- True UBOs can be written in the same batch that reads them. For example, we
could bind a buffer as a transform feedback buffer, do a draw with XFB, then
rebind as a UBO and do a draw reading. This is where we collapse -- our logic
will flush the writer, which is the same batch we were in the middle of
enqueueing a draw to. When we try to push words, we'll crash with theatrics.
This could be solved by smartening the batch tracking logic but it's not
trivial by any means.
So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!
Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.
I have a branch somewhere doing this on v9 but I'm doing this on NIR time to
unblock a core change that was crashing piglit due to this pile of unsoundness.
Let's fix the correctness issues first, then someone can look at recovering
performance later when we're not blocking unrelated work.
Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)
According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit
README.portability
CROSS-PLATFORM PORTABILITY GUIDELINES FOR GALLIUM3D
= General Considerations =
The frontend and winsys driver support a rather limited number of
platforms. However, the pipe drivers are meant to run on a wide range of
platforms. Hence the pipe drivers, the auxiliary modules, and all public
headers in general, should strictly follow these guidelines to ensure
portability.
= Compiler Support =
* Include util/compiler.h.
* Cast explicitly when converting to integer types of smaller sizes.
* Cast explicitly when converting between float, double and integral types.
* Don't use named struct initializers.
* Don't use macros with a variable number of arguments. Use static inline
functions instead.
* Don't use C99 features.
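As a rough illustration of the rules above, the following sketch shows
explicit casts, a positional initializer in place of a named one, and a static
inline function where a variadic macro might otherwise be tempting. The names
example_rect, example_default_rect, example_pack_u8 and example_area are made
up for this sketch and are not part of Gallium. In-tree code would include
util/compiler.h, which is assumed here to provide a portable `inline`.

```c
/* Hypothetical example type; not part of Gallium. */
struct example_rect
{
   int width;
   int height;
};

/* Positional initializer instead of { .width = 4, .height = 3 }. */
static const struct example_rect example_default_rect = { 4, 3 };

/* Convert a float to an 8-bit value, clamping the input to [0, 1].
 * Both the float-to-int and the int-to-smaller-int conversions are cast
 * explicitly. */
static inline unsigned char example_pack_u8(float value)
{
   int scaled;

   if (value < 0.0f)
      value = 0.0f;
   if (value > 1.0f)
      value = 1.0f;

   scaled = (int) (value * 255.0f + 0.5f);
   return (unsigned char) scaled;
}

/* A static inline function instead of a variadic macro. */
static inline int example_area(const struct example_rect *rect)
{
   return rect->width * rect->height;
}
```

Calling example_area(&example_default_rect) multiplies the two positional
fields, and example_pack_u8 clamps out-of-range inputs before narrowing.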
= Standard Library =
* Avoid including standard library headers. Most standard library functions are
not available in Windows Kernel Mode. Use the appropriate p_*.h include.
== Memory Allocation ==
* Use MALLOC, CALLOC, FREE instead of the malloc, calloc, free functions.
* Use align_pointer() function defined in u_memory.h for aligning pointers
in a portable way.
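To make the idea concrete, here is a minimal standalone sketch of allocation
wrappers and portable pointer alignment. It is not a copy of u_memory.h: the
EXAMPLE_MALLOC, EXAMPLE_CALLOC and EXAMPLE_FREE macros and the
example_align_pointer and example_alloc_aligned functions are hypothetical
stand-ins, and the sketch assumes the alignment is a power of two and that
<stdint.h> is available for uintptr_t.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for wrapper macros such as MALLOC/CALLOC/FREE.
 * Routing every allocation through macros lets a platform swap the
 * underlying allocator in one place. */
#define EXAMPLE_MALLOC(size)        malloc(size)
#define EXAMPLE_CALLOC(count, size) calloc(count, size)
#define EXAMPLE_FREE(ptr)           free(ptr)

/* Round a pointer up to the next multiple of alignment.
 * Assumes alignment is a power of two. */
static inline void *example_align_pointer(const void *unaligned,
                                          uintptr_t alignment)
{
   uintptr_t address = (uintptr_t) unaligned;
   uintptr_t aligned = (address + alignment - 1) & ~(alignment - 1);

   return (void *) aligned;
}

/* Allocate at least size bytes starting at an aligned address.
 * *base_out receives the pointer that must later go to EXAMPLE_FREE. */
static inline void *example_alloc_aligned(size_t size, uintptr_t alignment,
                                          void **base_out)
{
   void *base = EXAMPLE_MALLOC(size + alignment - 1);

   *base_out = base;
   if (!base)
      return NULL;

   return example_align_pointer(base, alignment);
}
```

The caller keeps the base pointer for freeing and uses the aligned pointer for
data; over-allocating by alignment - 1 bytes guarantees the aligned region
still fits inside the allocation.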
== Debugging ==
* Use the functions/macros in p_debug.h.
* Don't include assert.h or call abort, printf, etc.
= Code Style =
== Inheritance in C ==
The main thing we do is mimic inheritance by structure containment.
Here's a silly made-up example:
/* base class */
struct buffer
{
   int size;
   void (*validate)(struct buffer *buf);
};
/* sub-class of buffer */
struct texture_buffer
{
   struct buffer base; /* the base class, MUST COME FIRST! */
   int format;
   int width, height;
};
Then, we'll typically have cast-wrapper functions to convert base-class
pointers to sub-class pointers where needed:
static inline struct texture_buffer *texture_buffer(struct buffer *buf)
{
   return (struct texture_buffer *) buf;
}
To create/init a sub-classed object:
struct buffer *create_texture_buffer(int w, int h, int format)
{
   struct texture_buffer *t = malloc(sizeof(*t));
   if (!t)
      return NULL; /* allocation failed */
   t->format = format;
   t->width = w;
   t->height = h;
   t->base.size = w * h;
   t->base.validate = tex_validate;
   return &t->base;
}
Example sub-class method:
void tex_validate(struct buffer *buf)
{
   struct texture_buffer *tb = texture_buffer(buf);
   assert(tb->format);
   assert(tb->width);
   assert(tb->height);
}
Note that we typically do not use typedefs to make "class names"; we use
'struct whatever' everywhere.
Gallium's pipe_context and the subclassed psb_context, etc., are prime examples
of this. There are also many examples in Mesa and the Mesa state tracker.
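For reference, the fragments in this section can be combined into one
standalone sketch that compiles outside the tree. Because it is standalone, it
uses plain malloc, free and assert instead of the Gallium wrappers recommended
earlier. The destroy_buffer helper is an addition made up for this sketch; it
relies on the base struct being the first member, so the base pointer is also
the address of the whole allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* base class */
struct buffer
{
   int size;
   void (*validate)(struct buffer *buf);
};

/* sub-class of buffer */
struct texture_buffer
{
   struct buffer base; /* the base class, MUST COME FIRST! */
   int format;
   int width, height;
};

/* cast-wrapper: base-class pointer to sub-class pointer */
static inline struct texture_buffer *texture_buffer(struct buffer *buf)
{
   return (struct texture_buffer *) buf;
}

/* sub-class method, reached through the base-class function pointer */
static void tex_validate(struct buffer *buf)
{
   struct texture_buffer *tb = texture_buffer(buf);

   assert(tb->format);
   assert(tb->width);
   assert(tb->height);
}

/* create/init a sub-classed object, returned as a base-class pointer */
static struct buffer *create_texture_buffer(int w, int h, int format)
{
   struct texture_buffer *t = malloc(sizeof(*t));

   if (!t)
      return NULL;

   t->format = format;
   t->width = w;
   t->height = h;
   t->base.size = w * h;
   t->base.validate = tex_validate;
   return &t->base;
}

/* Hypothetical helper: valid because base is the first member. */
static void destroy_buffer(struct buffer *buf)
{
   free(buf);
}
```

A caller would create the object with create_texture_buffer(4, 2, 1), invoke
buf->validate(buf) without knowing the concrete type, and release it with
destroy_buffer(buf).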