Initialize the inverted predication VA only when it is used
for the first time.
This is needed to get conditional rendering work correctly with
task shaders because the internal compute cmdbuf may not exist
yet when conditional rendering starts.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
This can't do it in the loop because it doesn't easily know what
attributes use a binding.
We could do it in a separate loop, but there's no point, especially since
zink does CmdSetVertexInputEXT() after CmdBindVertexBuffers2().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: c335a4d70e ("radv: dynamically calculate misaligned_mask for dynamic vertex input")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17521>
We're about to make it so that the compiler warns/errors if you use the
wrong iterator macro. Fix up a bunch of places where someone used the
wrong one before we break anything.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17630>
When the application uses depth values outside of the [0.0,1.0] range
with VK_EXT_depth_range_unrestricted, or when explicitly disabled.
Otherwise, the driver can clamp to [0.0,1.0] internally for optimal
performance.
From the Vulkan spec "in 28.10.1. Depth Clamping and Range Adjustment":
"If depth clamping is not enabled and zf is not in the range [0, 1]
and either VK_EXT_depth_range_unrestricted is not enabled, or the
depth attachment has a fixed-point format, then zf is undefined
following this step."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16349>
RADV and PowerVR use the same implementation.
Turnip does use a slightly modified version but the helper only has one
use -> just inline it and get rid of turnip's vk_format.h.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17515>
The PS epilog would be a "normal" compiled shader using RA, etc. It
will declare up to 8x4 VGPRs for all color exports.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
The main FS would have to jump to the PC of the PS epilog. Given that
shaders are allocated in the 32-bit addr space, one user SGPR is fine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
To pass the necessary pipeline information for compiling PS epilogs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
This is currently disabled but it will be used for testing first,
and then for graphics pipeline libraries.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
When using a static callable stack, the required scratch has already
been allocated.
Dynamic stacks are located at the end of scratch memory
and are allocated on demand using radv_set_rt_stack_size.
Static stacks live at the start of scratch memory and are allocated in
create_rt_shader by setting scratch_size.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17579>
This function causes a crash with RADV_DEBUG=llvm and this commit
works around that crash.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17580>
We used to do it for every queue, which was duplicate work as pstate is
per-device. It could also cause trouble when multiple hw_ctx are created as
the call will succeed for only one of them and the rest will return -EBUSY.
Simplify and fix this by only setting for the first non-null hw_ctx.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17541>
In the direct path we already skipped draws, but in DGC I noticed
that just emitting these packets can cause issues ...
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17269>
Got to put the commandbuffers & uploadbuffers there. With DGC
those can be allocated by the application.
Excluding it from all other buffers/images to avoid using the
precious 32bit address space.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17269>
Required for indirect(2) ray tracing to work.
Fixes the following tests:
dEQP-VK.ray_tracing_pipeline.trace_rays_indirect2.indirect_*
dEQP-VK.ray_tracing_pipeline.trace_rays_cmds_maintenance_1.indirect2_*
Fixes: 16585664 ("radv: vkCmdTraceRaysIndirect2KHR")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316>
Required for indirect(2) ray tracing to work.
Fixes: b30f96dd ("radv,aco: Use ray_launch_size_addr")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316>
With dynamic vertex bindings the vertex format lookups are a lot
more frequent (vs being baked in the pipeline). Add a simple lookup
cache using a dynamic array to keep track of the hw values, and
avoid repeated translation.
This also reduces the memset to just the bitfields since all
the others will be overwritten.
Seen in perf traces gputest gimark with zink on radv.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15846>
From looking at the CTS,
VK_QUERY_TYPE_ACCELERATION_STRUCTURE_SIZE_KHR
refers to the serialization size and not to the
actual, current size.
Fixes the following CTS:
dEQP-VK.ray_tracing_pipeline.acceleration_structures.query_pool_results.cpu.buffer.size
dEQP-VK.ray_tracing_pipeline.acceleration_structures.query_pool_results.cpu.memory.size
dEQP-VK.ray_tracing_pipeline.acceleration_structures.query_pool_results.gpu.buffer.size
dEQP-VK.ray_tracing_pipeline.acceleration_structures.query_pool_results.gpu.memory.size
Fixes: 5d56c2c ("radv: Add accel struct queries for maintenance1")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17444>
This workaround looks actually broken. We added it in the past
because otherwise the game would just report 3GiB of video memory
(ie. size of GTT on SD). Though, with this workaround enabled, the
game explodes in memory easily.
One theory is that because we fake integrated GPUs as discrete GPUS,
and because we report 6GiB of VRAM (ie. driver redistributes memory
for small carveout), the game thinks there is 6GiB of VRAM only and
then keep allocating stuff.
People reported that the memory explosion is gone without this
workaround applied and I confirmed this myself.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17421>
These set the pass and make sure we don't have multiple submissions
at the same time touching the perf counters/pass at the same time.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>