If we have per-draw state (vertex ID stuff), there's an ordering
mismatch. Fixes
dEQP-GLES31.functional.draw_base_vertex.draw_elements_instanced_base_vertex.builtin_variable.vertex_id
on Midgard; I'm not sure why it was passing on Bifrost before. Should
also fix DRAWID issues on both architectures.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Ensures a valid schedule/regalloc is possible when vectors are used in
funny ways, as occurs in dEQP-GLES31, where it results in a scheduler
hang (or, with prior patches, an RA failure).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Technically we can stick the offset in the vertex ID attribute record,
but this is a faster way to get the test passing. And Midgard perf?
What's that?
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
You'd just get a blend shader anyway, and since they're not spec
requirements, let's not worry about backporting the Midgard lowerings.
Takes dEQP-GLES31.functional.fbo.color.tex2d.* on Midgard from crashing
to not supported.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Instead of reading the wrong side of the union (undefined behaviour). Fixes
a GenXML assertion failure in
KHR-GLES31.core.texture_buffer.texture_buffer_texture_buffer_range
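
For illustration only, a minimal sketch of the general pattern rather than
the actual driver code: the struct, field and function names below are
hypothetical. The point is to check which member of the union is live
before reading it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view descriptor: buffer and texture parameters share storage,
 * and is_buffer records which member was written. */
struct view_desc {
   bool is_buffer;
   union {
      struct { uint32_t offset, size; } buf;
      struct { uint32_t first_level, last_level; } tex;
   } u;
};

/* Returns the size for buffer views and 0 otherwise. The tag is checked
 * first, so the member that was not written is never read. */
static uint32_t
view_buffer_size(const struct view_desc *v)
{
   assert(v != NULL);
   return v->is_buffer ? v->u.buf.size : 0;
}
```
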
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Since the GLSL compilers will pack together flat varyings with no regard
to type, under the assumption the backend can deal with it. I guess we
can deal with it then... Fixes failures in
dEQP-GLES31.functional.separate_shader.random.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Instead of doing a complicated hack with the POT divisor, just zero the
stride of the linear attribute buffer like we do on the CPU.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
The function is complicated enough as it is -- hide the bit twiddling
behind a helper function.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
If the (vbi, divisor) tuple matches, we can save an attribute buffer
descriptor. We do the linking at CSO create time. This should be a bit
more cache friendly.
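
A rough sketch of the linking idea, assuming a simplified element type:
vertex_elem, link_attrib_buffers and MAX_ATTRIBS are hypothetical names for
illustration, not the driver's. Each element reuses the descriptor of an
earlier element with the same (vbi, divisor) tuple.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ATTRIBS 16

/* Hypothetical, simplified vertex element: only the fields used for linking. */
struct vertex_elem {
   uint32_t vbi;     /* vertex buffer index */
   uint32_t divisor; /* instance divisor */
};

/* For each element, writes the index of the buffer descriptor it should use
 * into buffer_of[], reusing a descriptor whenever an earlier element has the
 * same (vbi, divisor) tuple. Returns the number of unique descriptors. */
static unsigned
link_attrib_buffers(const struct vertex_elem *elems, unsigned count,
                    unsigned *buffer_of)
{
   struct vertex_elem unique[MAX_ATTRIBS];
   unsigned nr_unique = 0;

   assert(count <= MAX_ATTRIBS);

   for (unsigned i = 0; i < count; ++i) {
      unsigned j;

      /* Linear search is fine for the handful of attributes involved. */
      for (j = 0; j < nr_unique; ++j) {
         if (unique[j].vbi == elems[i].vbi &&
             unique[j].divisor == elems[i].divisor)
            break;
      }

      /* No match: allocate a new descriptor slot. */
      if (j == nr_unique)
         unique[nr_unique++] = elems[i];

      buffer_of[i] = j;
   }

   return nr_unique;
}
```

With elements (0,0), (1,0), (0,0), (0,1) this yields three descriptors
instead of four, with the first and third elements sharing one.
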
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Broken in several ways. Hide it until we can get this sorted, and have a
test plan to keep it sorted.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
It's too complicated and probably for no actual benefit. The main reason
we have BGR formats is for display, but that's export and doesn't get
hit by this path. Internal BGRA textures are possible with a Mesa
extension but sufficiently rare that I regret suggesting this as a
possible optimization. My apologies, and thanks for the fish.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
Reduces RA memory footprint by 4x, fixing an OOM in the following dEQP
test, which would otherwise allocate 8 GB of memory:
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.36
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
If the shader packs multiple varyings into the same location with
different location_frac, we'll need to lower to a single varying store
that collects all of the channels together. This is not trivial during
codegen, but it is trivial to do in NIR right before codegen by relying
on nir_lower_io_to_temporaries. Since we're guaranteed all varyings will
be written exactly once, in the exit block, we can scan the shader
linearly and collect stores together in a single pass.
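
The collection step can be sketched outside of NIR as follows. This is a
standalone illustration under simplifying assumptions, not the pass itself:
struct store, struct combined and collect_stores are hypothetical stand-ins
for the real intrinsics and builder calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_LOCATIONS 32

/* Hypothetical stand-in for a varying store: `count` channels written to
 * `location`, starting at channel `frac` (0..3). */
struct store {
   unsigned location, frac, count;
   float value[4];
};

/* One combined store per location: a vec4 plus a mask of written channels. */
struct combined {
   float value[4];
   uint8_t mask;
};

/* Single linear pass: fold every partial store into the combined store for
 * its location, placing channels according to frac. */
static void
collect_stores(const struct store *stores, unsigned nr,
               struct combined out[MAX_LOCATIONS])
{
   memset(out, 0, sizeof(struct combined) * MAX_LOCATIONS);

   for (unsigned i = 0; i < nr; ++i) {
      const struct store *s = &stores[i];

      assert(s->location < MAX_LOCATIONS && s->frac + s->count <= 4);

      for (unsigned c = 0; c < s->count; ++c) {
         unsigned chan = s->frac + c;

         /* Each channel is written exactly once, so no overlap is expected. */
         assert(!(out[s->location].mask & (1u << chan)));

         out[s->location].value[chan] = s->value[c];
         out[s->location].mask |= (uint8_t)(1u << chan);
      }
   }
}
```

Two vec2 stores to the same location with frac 0 and frac 2 end up as one
vec4 store with all four channels set in the mask.
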
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>