drivers that use nir_assign_io_var_locations() with EXT_shader_object all
have the same bug with a shader interface that looks like this:
shader output block:
* PSIZ
* VAR0
* VAR8
shader input block:
* VAR0
* VAR8
in this case, output driver locations will be assigned like:
* PSIZ=0
* VAR0=1
* VAR8=2
and input driver locations will be:
* VAR0=0
* VAR8=1
which breaks the shaders even though this is a totally legitimate thing
to do
thus, a second set of shaders have to be created without PSIZ to work around
the bug since I've already spent 18+ hours trying to fix it and have only succeeded
in breaking every driver that uses it
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22725>
Indeed, the hardcoded framebuffer cleanup doesn't handle "resolve".
For instance, this issue is triggered with "piglit/bin/glx-copy-sub-buffer -samples=2 -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: f5bde99cbd ("gallium: plumb resolve attachments through from frontends -> pipe_framebuffer_state")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22554>
We had a lot of implicit locks going on even though there was strictly no
need in doing so. This makes the compilation APIs more atomic while also
providing a cleaner interface.
Not in the mood of splitting it up without deadlocking in the middle. So
it's one big commit sadly.
Signed-off-by: Karol Herbst <git@karolherbst.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22434>
0: KIL -none.1111
Negate is not allowed for texturing opcodes, so the incorrect swizzle
was detected, however later optimization, where we try to rewrite incorrect
swizzles from constant (immediate) registers by adding a new ones with
correct order was interfering and not handling this correctly, so we
ended with
CONST[0] = { -1.0000 -1.0000 -1.0000 -1.0000 }
0: KIL const[0].xyz-w;
Even if it would get the swizzle right, texturing opcodes can't read from
constant registers, so just skip it and let this be handled by a later
part which inserts an extra mov instead.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Fixes: a8e1e5b5c2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22704>
The side effect is removing the aco/llvm backend bc optimization code
and linear/persp_centroid variable.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22199>
Instead of forcing vertex buffer stride to be 4 byte aligned only,
DX10 actually allows the stride to be non 4-byte aligned but the
alignment of an element must be the nearest power of 2 greater or equal to the
width of the element's format, or 4, whichever is less. So the requirement is
better met with PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY which if set to
TRUE, the sum of vertex element offset + vertex buffer offset + vertex buffer
stride must be aligned to the vertex attributes component size.
Note: PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY cannot be set
with other alignment-requiring CAPs, so we have to return 0 for all the
other alignement CAPs.
This avoids some unnecessary software vertex translate fallback.
cc: mesa-stable
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22689>
Indeed, the function nir_to_tgsi() returns an ureg_get_tokens() allocated
object which is assigned locally. The ureg_get_tokens() allocated object
should be freed.
For instance, this issue is triggered with a llvm enabled lima,
"piglit/bin/gl-1.0-rendermode-feedback -auto -fbo":
Direct leak of 512 byte(s) in 1 object(s) allocated from:
#0 0x7faeaa4500 in __interceptor_realloc (/usr/lib64/libasan.so.6+0xa4500)
#1 0x7fa4a88f1c in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
#2 0x7fa4a88f1c in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
#3 0x7fa4a900f4 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
#4 0x7fa4a900f4 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
#5 0x7fa4a91dfc in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
#6 0x7fa4b20a2c in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4011
#7 0x7fa4a0c914 in draw_create_vertex_shader ../src/gallium/auxiliary/draw/draw_vs.c:77
Fixes: b5e782f5f4 ("aux/draw: use nir_to_tgsi for draw shader in llvm path")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21924>
We can now no longer rely on certain dirty bits to re-trigger draw time
resource tracking. We need to use the new fd_dirty*_resource() APIs.
Fixes `org.skia.skqp.SkQPRunner#gles_recordopts` on android 9.
Fixes: 0a62a874fc ("freedreno: Re-work dirty-resource tracking")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22683>
Previously it was assumed that between the and the variable there was
only one deref.
To handle all cases a new function is introduced that recreates a chain
of derefs.
Fixes: 5a4083349f ("zink: add provoking vertex mode lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22678>
Previously the passthrough gs shader loaded some values with uniform
loads using sevaral hardcoded values.
This was not flexible for other drivers and started becoming too
unflexible for zink itself.
Use system values instead and use a lowering pass in zink.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22667>
Some chips support firmware based mcbp. If supported this means
radeonsi needs to allocate 3 buffers and pass them to the firmware.
From there, the firmware will handle mcbp and register shadowing
on its own so we don't need to insert LOAD packet in the preamble.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21986>
It's not a lazy loaded type so doing the Once::call_once in every
Platform::get gives us a pointless atomic, which might be slow on some
platforms.
Every application has to call clGetPlatformIDs so we only need to do it
there.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22649>
Enabling fp64 support makes rarely sense, but in case we do claim it, we
should also tell if it's a pure software implementation.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22649>