This wasn't calculating the correct value, this along with
a nir patch fixes a regression in:
dEQP-VK.tessellation.shader_input_output.barrier
Fixes: 043d14db30 (ac/nir: don't write tcs outputs to LDS that aren't read back.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reverts commit 2294d35b24.
We can't do this without adjusting the input SGPRs/VGPRs logic.
For now, just revert it. I will send a proper solution later.
It fixes a rendering issue in F1 2017 that CTS didn't catch up.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
anv merges the tess info correctly, but radv wasn't doing this.
This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw
Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
And move the comment to amd/common.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Use 16_ABGR instead of 32_ABGR if Z isn't written.
Ported from RadeonSI.
No CTS regressions on Polaris.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
ac_shader_util.c will contain shader helpers for RadeonSI
and RADV.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We should also not load the input SGPRs and VGPRS, but
let's start with this for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Use a boolean instead because the number of needed SGPRs
is always 3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The number of grid components is always 3 when gl_NumWorkGroups
is declared, because it relies on the number of components of
nir_instrinsic_load_num_work_groups.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We never supported it. Missed during copy and pasting.
Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Without this we get the error "FPExt only operates on FP" when
converting the following:
vec1 32 ssa_5 = b2f ssa_4
vec1 64 ssa_6 = f2f64 ssa_5
Which results in:
%44 = and i32 %43, 1065353216
%45 = fpext i32 %44 to double
With this patch we now get:
%44 = and i32 %43, 1065353216
%45 = bitcast i32 %44 to float
%46 = fpext float %45 to double
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.
Since it is all within a basic block, the compiler should be
able to merge any extra loads.
v2: Change visit_get_buffer_size too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), they do not need to
allocate it with the deciate allocation struct.
Fixes: a639d40f13 "radv: add support for local bos. (v3)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Copied from RadeonSI.
This fixes all CTS
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*
And some other ones which use the same format.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This patch is ported from RadeonSI and it has two effects.
It fixes a rendering issue which affects F1 2017 and Dawn
of War 3 (Vega only) because LLVM was ending up by generating
the new v_mad_mix_{hi,lo} instructions which appear to be
buggy in some way. Not sure if Mesa is generating something
wrong or if the issue is in LLVM only. Anyway, that explains why
the DOW3 issue can't be reproduced with GL on Vega.
It also improves performance because v_cvt_pkrtz_f16 is faster,
and because I guess the rounding mode behaviour is similar between
GL and VK, we can use it. About performance, it improves Talos
by +3/4% but I don't see any other impacts.
No CTS regressions on Polaris.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Allows apps to determine the LLVM version so that they can decide
whether or not to enable workarounds for LLVM issues.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The handle type in the case statement is supposed to be VK_EXTERNAL_-
MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.
Fixes: 546e747867 ("radv: Implement VK_EXT_external_memory_dma_buf")
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The WSI core code does all the hard work. Just add the wrappers and
turn it on.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Now that we have anv_device_init/finish functions, there's no reason to
have the individual driver do any more work than that.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
This drops the unneeded callbacks struct as well as the queue_get_family
callback we were using before we'd pulled QueuePresent inside.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>