the only dynamic allocation left for geometry shaders is all done in the setup
indirect kernel. so just pass the heap to that kernel directly, so we don't
reserve a heap for direct draws with GS (including pure-VS XFB). this should
reduce our memory footprint a lot in certain apps.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34661>
use 8-bit index buffer instead of 32-bit to significantly decrease the size of
serialized geometry shaders (agx_gs_info is not dynamic).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34661>
rather than a bunch of subtle booleans telling the driver how to invoke the GS
rast shader, collect everything into a common enum, and provide (CL safe)
helpers to do the appropriate calculations rather than duplicating across
GL/VK/indirects.
this fixes suboptimal handling of instancing with list topologies.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34661>
For h264/h265/av1/vp9, give warning when application is
sending more slices than allowed by limit, and stop copying
remaining slices to avoid unwanted behaviour.
Cc: mesa-stable
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34633>
They were both missing subtle cases. While we're here, fix a bunch of
SrcTypes. They shouldn't matter in practice since it's just used to
determine how many GPRs to allocate but we may as well get them right.
Fixes: 078ffb860b ("nak/sm20: Add initial SM20 encoding")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34675>
Since an FAbs or FNegAbs modifier is going to force the source out to a
register no matter what, we should do this first. That way we avoid the
unnecessary source swaps or other evictions when we have to evict the
source anyway because it has a modifier.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34675>
After the previous changes to varyings, this test no longer crashes.
Update the expected result from Crash to Fail to reflect that.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34074>
This logic to enable LD_VAR_BUF[_IMM] is on the conservative side.
For fixed varyings, we would need to know what the VS outputs to correctly
compute the indices the FS has to load from. For general varyings, the
locations are aligned either by the linker or by the application in case
of separable shaders.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34074>
This is no longer used after the previous commit and should therefore
be removed.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34074>
This change removes the limit of 16 varyings caused by the 8-bit offset
value used in LD_VAR_BUF[_IMM]. LD_VAR[_IMM] is used instead and the
necessary ADs are emitted at draw time.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34074>
This is already supported, all we have to do is advertise it.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
A8_UNORM on v9+ is using RGBA8_UNORM as a pixel format with the
A8_UNORM clump format to dealing with the diffences between
RGBA8 and the actual A8 in-memory layout.
The problem is, LEA_TEX only loads the InternalConversionDescriptor
which contains only the pixel format, and that's what ST_CVT uses
to do the conversion, so we'll actually store 4 components instead
of one.
This shows up with
dEQP-VK.image.load_store.without_any_format.buffer.a8_unorm* after
enabling maintenance5.
For now I've turned off the image storage capability for A8_UNORM
on all gens, but I'd be fine disabling it only on v9+ if you think
that's preferable.
Fixes: d95423686f ("pan/format: Add PAN_BIND_STORAGE_IMAGE flag")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
It doesn't do much, but let's call it, just in case this changes at
some point.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
It's not used, and we could retrieve the device from
image->vk.base.device if we had to anyway.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
The image is not supposed to be modified there.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
This is needed for maintenance5.
While at it, move the buffer offseting opertaion to CmdBindVertexBuffers2()
instead of applying the offset at draw time.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
This is already supported, all we have to do is advertise it.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>
si_destroy_context needs to call context->set_debug_callback(...) to
avoid the debug logs to access the destroyed context.
Adding this change introduced a different problem: when an aux context
is destroyed from si_destroy_screen, parts of the screen have been
freed already: the shader_compiler_queue_*.
c467a87e06 ("radeonsi: Destroy queues before the aux contexts") moved
the util_queue_destroy calls above the context destruction, but with
the 59a3f38ff6 change, it's not needed anymore: si_destroy_context
will finish the screen shader queues before proceeding with releasing,
so use-after-free isn't possible.
Fixes: 59a3f38ff6 ("radeonsi: clear the debug callback on ctx destroy")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12035
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34574>
We now support non-zero firstInstance when instance attributes have
a divisor != 1.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34642>
Letting the shader offset instanceID by baseInstance works only if
the divisor is one. If the divisor is greater than one, the firstInstance
parameter shouldn't be applied this divisor, but it currently is. Zero
divisors are also problematic, in that they will force use of the
instance zero attribute all the time.
The only way to fix that is to tweak the offsets of the per-instance
attributes instead, like is done in the JM backend.
Fixes: 1570f0172e ("panvk: Fix base_{instance,vertex} handling")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34642>
Most of the arguments we pass to emit_vs_attrib() can be extracted
from panvk_cmd_buffer, so let's pass a cmdbuf before we add more to
this function.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34642>
Radeonsi hasn't yet passed conformance, so it's not part of the auto set
at this point in time. But this will allow distribution to enable it if
they feel comfortable enough, or to disable it again, if it causes too
many problems.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34621>
With the current code in clpeak LLVM ended up generating v_mad_u64_u32
instructions, with this we get nice v_mad_u32_s24 ones instead and an 4x
performance increase in the int24 benchmark.
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34630>
Add debug option to show current shader type being
compiled within anv_shader_bin_create.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>
We are reaching our limit of adding flags to intel_debug
(apporaching 64 flags). Switch intel_debug to a bitset,
which gives us almost "unlimited" bits to use in the future.
v2(Michael Cheng): Fixed a few ci errors
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>