It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.
Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
shuffle is implemented with shared VGPRs with ACO and Wave64.
Fixes dEQP-VK.subgroups.shuffle.framebuffer.subgroupshuffle*_vertex
with Wave64.
Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4595>
Fixes
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.
Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4593>
To workaround a crash with Wolfeinstein Younglood because the
games creates one descriptor with
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV...
I reported the problem to Machine Games, but still no answer, so
let's remove the unreachable calls (which are technically not
unreachable for buggy apps) to help gamers.
Note that AMDVLK and AMDGPU-PRO don't crash because they ignore
unsupported descriptor types.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4571>
It's unsupported because small bitsizes are still not completely
supported. It should have been disabled by default with ACO.
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4549>
For emitting RMW packets in the command stream. This new helper
will be useful for implementing extended dynamic states to only
overwrite the fields that need to be updated instead of storing
more values in the pipeline.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4531>
This gives +8% with Wolfeinstein Youngblood on my Vega64, and
according to someone else, it also improves performance with Doom
2016 and Wolfenstein 2 (and probably other ID Tech games).
This improvement is because Youngblood uses GENERAL for the main
depth-only pass and TC-compat HTILE is now enabled with GENERAL if
we know that we are outside of a render loop. This obviously also
reduces the number of HTILE decompressions from/to GENERAL.
Note that Youngblood violates the Vulkan spec regarding render loops
because they are only allowed with input attachments. Expect possible
rendering issues if apps use render loops with the wrong way (ie.
without input attachmens) because HTILE might not be coherent if
a depth-stencil texture is sampled and rendered in the same draw.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2704
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4391>
If no texture fetches happen it's useless to enable TC-compat HTILE.
Because the driver currently doesn't support TC-compat HTILE for
storage images we don't have to check.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4497>
This disables all fp16 shader control features on GFX8 because only
GFX9+ supports double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
This feature allows to use both 16-bit integers and 16-bit floats
as inputs/outputs.
This disables storageInputOutput16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.
This also reduces performance with Wolfeinstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.
We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Only GFX9+ support double rate packed math instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.
Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
This replaces emit_vertex with:
if (vertex_count < max_vertices) {
emit_vertex_with_counter vertex_count ...
vertex_count += 1
}
Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.
pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)
No pipeline-db changes on other generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
The goal of this function was to return whether a depth-stencil image
has HTILE, in comparison to radv_layout_is_htile_compressed() which
is used to know whether a depth-stencil image has HTILE compressed.
These two functions are actually similar and they have never been
used for what they were supposed to. Remove radv_layout_has_htile()
in favour of radv_layout_is_htile_compressed() for now. If it's
needed in the future, I will re-introduce this concept properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
Statistics will be added to ACO in later commits.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Enabling a Vulkan extension doesn't mean that all features need
to be implemented. DOOM Eternal crashes at launch if that ext
is not supported but it doesn't matter if the features are enabled
or not.
Let's enable it like we did for VK_KHR_16bit_storage.
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead
we have to use readlane with a waterfall loop to defeat the LLVM
backend.
This fixes DOOM Eternal which requires subgroup shuffle.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
The Vulkan spec 1.2.135 says:
"pSizes is an optional array of buffer sizes, specifying the maximum
number of bytes to capture to the corresponding transform feedback
buffer. If pSizes is NULL, or the value of the pSizes array element
is VK_WHOLE_SIZE, then the maximum bytes captured will be the size
of the corresponding buffer minus the buffer offset."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2650
Fixes: b4eb029062 ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.
This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*
Fixes: f11ea22666 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.
Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.
Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>