Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103513
Fixes: de88979413 ("radv: Implement VK_AMD_shader_info")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Otherwise we will leak them, load duplicates from disk rather
than memory and never write items loaded from disk to the apps
pipeline cache.
Fixes: fd24be134f 'radv: make use of on-disk cache'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.
When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.
v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
does not include the terminator)
v4: Fixed LDS reporting. (Bas)
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Serious Sam Fusion 2017 uses a huge number of occlusion queries,
and the allocated query pool buffer is greater than 4096 bytes.
This slightly improves performance (tested in Ultra) from
117.2 FPS to 119.7 FPS (~+2%) on my RX480.
This also improves Talos, from 69 FPS to 72/73 FPS (~+5%).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Only needed when the CS path is used.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When binding a new pipeline, we applied all dynamic states
without checking if they really need to be re-emitted. This
doesn't seem to be useful for the meta operations because only
the viewports/scissors are updated.
This should reduce the number of commands added to the IB
when a new graphics pipeline is bound.
Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state()
and set the dirty flags directly there.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The depth bounds test values are either set at pipeline
creation or dynamically using vkCmdSetDepthBounds().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This was duplicated between both drivers, share here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.
Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Not sure how useful this is, but it makes it more consistent.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.
Fixes: 06f05040eb (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The IR is reused in different pipeline combinations so we need
to clone it to avoid link time optimistaions messing up the
original copy.
Fixes: 06f05040eb (radv: Link shaders)
Reviewed-by: Dave Airlie <airlied@redhat.com>
This was the actual cause of GPU hangs fixed by 0fdd531457 ("radv:
Fix pipeline cache locking issues"), since multiple threads would end
up trying to create the variants for a single entry.
Now that we're locking around the whole of this function, this isn't
really necessary (we either create all or none of the variants), but
fix this anyway in case things change later.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 17.3 <mesa-stable@lists.freedesktop.org>
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.
I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.
This at least fixes radv dEQP-VK.texture.shadow.* on VI.
Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Since it also uses the output vector before writing to memory.
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Otherwise we just need to write them to the tf ring.
this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.
Fixes: ce03c119ce 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.
Fixes: 75dfab24a2 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.
Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.
Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
They don't take a single wave anymore and we need the barriers.
Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.
This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>