Currently, we always use a hierarchy mask with all levels enabled. While this is
efficient for geometry-heavy workloads like 3D games, it is wasteful for 2D
applications that draw very few vertices. For drawing just a few textured quads,
the overhead of small bin sizes outweighs any performance advantages, so it's a
bit slower. More problematically, small bin sizes require tremendous amounts of
memory for the polygon lists, leading to significant memory consumption (~10MB)
for the polygon list for even the simplest of 2D blits.
To reduce our memory footprint, we need to choose our hierarchy masks more
carefully. In general, we want to allow small bin sizes for geometry-heavy
workloads but not for geometry-light workloads. We estimate vertex count in the
driver as a proxy for this, and use a simple heuristic to select a bin size
based on the estimated vertex count. None of this is an exact science, and the
heuristic could probably be tuned. Nevertheless, the heuristic used (comparing
framebuffer size to vertex count) works well in practice, significantly reducing
the memory footprint of 2D applications like Firefox without hurting the
performance of 3D applications.
I originally wrote this patch while diagnosing high memory footprints on my
Midgard laptop, which is why only Midgard is in scope here. On Bifrost and
Valhall, we have a similar hiearchy mask selection problem. It seems likely that
the same heuristic would work there too, but it's a different code path that I
have not integrated or tested. I'll leave that for the adventurous reader, to
get the memory footprint win there too.
(It's also possible the win is smaller on newer Malis than on Midgard, since Arm
claims they optimized the tiler data structures on the newer parts. There's
probably still some merit to the idea.)
On Mali-T860, glmark2 -bdesktop frametime decreased by 1.35% +/- 0.91% at 95%
confidence, showing a slight win for 2D workloads No statistically significant
difference for glmark2 -bshading:shading=phong, since 3D workloads continue to
use the same hierarchy masks.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19482>
In the next commit, we will refine our algorithm to select hierarchy masks based
on the vertex count. In preparation, augment the driver to track rough estimates
of the vertex count so we have a "geometry complexity" input for the heuristic.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19482>
As Intel does. These functions are written with the expectation that they will
be inlined away, allowing gcc's copy-prop and constant folding to eliminate the
template struct and any unused fields.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21848>
6148e3aae7 ("mesa: Fix ctx->Texture.CubeMapSeamless") introduced a hack, where
seamless cube maps would be requested even for GLES2 contexts despite the spec,
on the assumption that GLES2 gallium drivers would ignore the bit. But that
requires Gallium drivers to know what GLES version they advertise, which is a
horrible layering violation. When the commit was written 8 years ago, there were
classic drivers to contend with so it made sense as a fix to get GLES 3.0 up and
running. With classic drivers gone, it's time to sunset the hack and restore the
intended behaviour by setting ctx->Texture.CubeMapSeamless only once we know the
version.
In addition to fixing a semantic issue in the Gallium contract and preventing a
regression from the next commit, this fixes cube maps on Mali-T720 under
Panfrost. In general, Panfrost supports GLES3 (and honours the seamless flag
everywhere) but on T720 we only advertise GLES2 due to missing MRT support on
older Midgard devices, so we need the flag set properly to distinguish these
cases.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21978>
.lava-test hidden job was setting the HWCI_TEST_SCRIPT variable to deqp
runner. But that is not always the case. When we run piglit traces jobs,
we use piglit-traces.sh instead, for example.
Splitting into:
- .lava-test-deqp (deqp-runner + deqp)
- .lava-traces (deqp-runner + piglit)
- .lava-piglit (piglit-runner + piglit)
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Co-authored-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22065>
Previosuly it was assumed that primitives where always converted to
triangles if the driver did not support all primitives, however that's
not true for a driver that supports quads but not quad strips.
Fixes piglit spec@!opengl 1.1@dlist-fdo3129-01 on Panfrost
Fixes: dcbf2423d2 ("vbo/dlist: add vertices to incomplete primitives")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21987>
These were removed and replaced by new Bifrost RSD fields, don't print the wrong
values. Harmless but noises up the decoding.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Since 50b82ca818 ("nir/lower_blend,agx,panfrost: Use lowered I/O"),
nir_lower_blend needs to be called after lowering I/O rather than before.
Furthermore, after lowering blend, we need (in general) to lower the resulting
load_output intrinsics. Now that we have a proper preprocess_nir hook, there is
a natural place in panvk_vX_shader to do this.
Fixes dEQP-VK.pipeline.blend.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This will give the driver [notably, PanVK] a chance to lower dual source
blending without having the dual stores turned into store_combined_output_pan.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
If new load_output are created after preprocessing NIR (namely, from blend
lowering in panvk), this lowering needs to be called to lower load_output to the
vendor intrinsic with conversion descriptor.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Only the GL driver produces/consumes these, they shouldn't be in the common
shader_info.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Drop the backend compiler sysval handling in favour of the pass in the GL
driver, bringing us into compliance with Ekstrand's rule.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Blend constants are sysvals, it's just that they can sometimes be inlined
depending on the pipeline state. The old "inline blend constant" pass is a
special case of the new "lower all sysvals" pass in panvk.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Per Ekstrand's Rule. This avoids the "fixed sysval" hack that Faith introduced
to get this behaviour with the GL sysval handling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Now the only passes that depend on the shader key can run late, so we can
preprocess ahead-of-time once and throw away the original shader. This reduces
the cost of shader variants, as well as deduplicates some lowering for
transform feedback shaders.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Do it explicitly in NIR rather than implicitly in the Midgard compiler. This
avoids a nasty sideband input for the render target formats and sample count,
for blend shaders on midgard only.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This is a flag-day change to how we compile. We split preprocessing NIR into a
separate step from compiling, giving the driver a chance to apply its own
lowerings on the preprocessed NIR before the final optimization loop. During
that time, the different producers of NIR (panfrost, panvk, blend shaders, blit
shaders...) will be able to (differently) lower system values.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This will be needed to decouple the lowering in the Midgard compiler from the
specific sampler descriptors used in the blit code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Removes a lot of indentation, and improves metadata handling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
It doesn't make sense for shader stages other than fragment (and blend which is
fragment-like), assert this.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
To prepare for the new compile flow, where this will be called by the driver
instead of internally in the compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
To prepare for the new compile flow, where this will be called by the driver
instead of internally in the compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Nothing in the optimization loop should remat the lowered instructions, so
there's no need to do it inside the loop.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Nothing in the optimization loop should remat the lowered instructions, so
there's no need to do it inside the loop.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This sideband input is now unused, as the information is available locally
within the NIR as it should be.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
It is unused and should stay unused, as any use is a violation of Ekstrand's
rule.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This gets rid of the hidden gl_BaseVertex system value which violates Ekstrand's
rule.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
We need different settings for Bifrost and Valhall. Keeping everything static
simplifies lifetimes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
See previous commits for justification. Later, we'll split up NIR processing in
a few steps to give the caller a chance to lower the sysval, at which point the
goofy inputs here will go away.
v2: Only lower in fragment shaders. Likely harmless to run elsewhere but still
wrong because the location enum is defined per-stage.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
This uses the new NIR sysvals to avoid materializing magic sysvals in the
driver, getting us closer to the Ekstrand Rule.
v2: Only lower for fragment shaders. Lowering in vertex shaders should be a
no-op, except that FRAG_RESULT_SAMPLE_MASK shadows a VARYING_SLOT for fog
coords, causing v1 of this patch to regress fog. Caught by the G52 piglit job in
CI. Thank you, Marge.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
These two switches are redundant.
Furthermore, bi_tex_op could previously assume its input was a supported texop,
so it returned undefined values for unsupported texops. Now, without the guard
in front of it, bi_tex_op should check for supported texops, so we need to drop
the unsupported texops from the switch.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
Compilers are free to make the assumption that pointers don't violate
strict aliasing. If that assumption is incorrect, as it is with the
framebuffer pointer packing code here, the job can fail.
This depends heavily on the compiler and optimization levels, so it's
hard to reproduce, but it did happen for at least two users running with
-O2 on gcc.
Fixes: 67cbbf9417 ("panfrost: Use framebuffer pointer XML")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8627
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21991>
Skip heavyweight crashing tests that have the potential to take down not just
themselves but also other Piglit tests running concurrently via piglit-runner
(which would otherwise become piglit-runner level flakes).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
These are (have always been) quite broken. Given that the whole section is
already in the flakes.txt, and there's no plan for improving this (I've tried
and fails), I'd rather just skip the section and reduce the noise in
the #panfrost-ci channel.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
This is really dumb.
But this fixes arb_shader_language_420pack-active-sampler-conflict on v7 which
otherwise dereferences a null pointer trying to access the nonexistant texture
arrays, or DATA_INVALID_FAULTs if you give it a texture array filled with
zeroes. But it seems happy if you bind in null textures. This is dumb but less
faults in Piglit is good for reducing flakes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
mesa/st doesn't like to use 24-bit textures, preferring RGBX over true RGB even
for texture views where this isn't valid. Given how silly true RGB is in
practice, I'd rather drop support and fix texture views than go against the
grain and risk more issues down the line since nobody else in tree is testing
these paths and apps really shouldn't be caring.
Fixes page faults in arb_texture_view-rendering-formats_gles3 which tries to
sample an R8G8B8_UINT texture with a R8G8B8X8_UNORM view in one subcase. That
test is now passing reliably.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
This is signed, not unsigned. We were already passing negatives and silently
relying on 2's complement and C to do the right thing. But that's silly. We
should just, actually do the right thing.
Found while struggling to debug primitive-restart-draw-mode.
v2: Update the other architectures too, including a decode_csf.c change for the
v10 incarnation of this v4-era field.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
We just want a bit-exact transfer for integers. Using .auto32 accomplishes this
without any clamping shenanigans. Fixes gl-3.0-vertexattribipointer.
Note we can't use .auto32 unconditionally, since reading a uint vertex as float
is supposed to convert (or something like that, gl-2.0-vertexattribpointer tests
the bad case at any rate).
Fixes: 482cc273af ("pan/bi: Implement load attribute with the builder")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
draw->index_bounds_valid tells drivers that the values of min_index/max_index
are set correctly and can be used e.g. to allocate memory for varyings. If set
incorrectly, the GL promises badness.
But, with primconvert, we go mucking with index buffers and then never update
the bounds. So it doesn't matter if the original index bounds were valid, we
can't promise the original bounds are *still* valid. If we were trying to
optimize CPU overhead, we could try to preserve the new min/max index but seeing
as only older Mali cares about this flag, and if you're using primconvert you're
already screwed, I'm not too inclined to go rework primconvert.
Fixes* page faults in primitive-restart-draw-mode on Mali-G52 for GL_QUAD_STRIPS
and GL_POLYGON, which hit the primconvert path. The full dmesg splat looks like:
[ 5438.811727] panfrost ffe40000.gpu: Unhandled Page fault in AS0 at VA 0x000000100A16BAC0
Reason: TODO
raw fault status: 0x25002C1
decoded fault status: SLAVE FAULT
exception type 0xC1: TRANSLATION_FAULT_1
access type 0x2: READ
source id 0x250
Notice that a high bit is randomly set in the address, this is trying to read
a varying from the actual varying buffer in the vicinity of 0xa16bac0. What's
actually happening is that we're trying to read index #0 despite promising the
driver a minimum index of 2, causing an integer underflow as we try to read
index -2, or as the hardware sees, 4294967294.
As long as we stop lying to panfrost about the bounds being correct, panfrost is
able to calculate the real (post-primconverted) bounds on its own, fixing the
test.
* Alternatively, maybe Panfrost should just ignore this bit, in which I don't
know why we have it in Gallium, since it's probably not conformant to fault on
out-of-range glDrawRangeElements.
Fixes: 72ff53098c ("gallium: add pipe_draw_info::index_bounds_valid")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>