Doom Eternal explicitly allows overallocation via this extension
but that shouldn't change anything because it's the default RADV
behavior.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4785>
By default, RADV supports overallocation by the sense that it doesn't
reject an allocation if the target heap is full.
With VK_AMD_overallocation_behaviour, apps can disable overallocation
and the driver should account for all allocations explicitly made by
the application, and reject if the heap is full.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4785>
The reason behind this is that FMASK requires CMASK and also that
FMASK for non color attachments looks unnecessary. It's currently
much easier to add this simple check because the driver tries to
always enable DCC first and if we enable FMASK only if CMASK, we
might loose some FMASK compressions.
This helps fixing some new robustness2 tests which fails because
only FMASK is enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4783>
This would be necessary for an application to figure out if the
memory was allocated using a memory type with VK_MEMORY_PROPERTY_PROTECTED_BIT.
It also allows one to determine VRAM vs. GTT etc.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>
Lots of extra coding was involved in managing them.
And for protected memory I was thinking of making a function that
goes from domain+flags to memory types, which can reuse this array.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>
On APUs, the memory is unified (all heaps are equally fast) and
apps should count all memory heaps together. But some games like
Id Tech games (Youngblood and such) don't manage memory correctly
on APUs and they spill everything when one VRAM heap is full.
Instead of spilling buffers, they should just allocate new buffers
in the second heap but it seems like these games are confused if
two memory heaps have the DEVICE_LOCAL_BIT set.
This is probably a first step towards better memory management on
APUs but there is still some work to do if we want to run most apps
with a small dedicated VRAM (256MB or so).
This gives a huge boost for Id Tech games on APUs, and doesn't
seem to reduce Feral games performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4771>
Also reverse the BO list removal loop. This way typical WSI usage
should find the entry in O(active swapchains) iterations, which
should not be a performance issues. Tested with Doom(2106) which
found the entry in 1 iteration every time.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4306>
Use it instead of the libdrm provided amdgpu_drm.h header. I used
the kernel revision from the README to get the header so the
header versions should be consistent.
Tested by removing /usr/include/libdrm/amdgpu_drm.h from my dev-machine.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4749>
It can be enabled via pEnabledFeatures or via vkPhysicalDeviceFeatures2.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4706>
With this, Doom Eternal should now run with ACO on GFX8+.
The generated 8/16-bit storage code is okay but the generated int8/int16
code is currently pretty bad but it works and apparently Doom Eternal
doesn't actually use it (even though it requires it).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>
Fixes dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 hang with ACO
(features needed for the test are implemented in a later commit)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
This is needed to prevent bounds checking issues when load 8/16-bit values
with dword loads.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
When we originally wrote spirv_to_nir we didn't have a good scalar value
union to handily use so we rolled our own thing for spec constants. Now
that we have nir_const_value, we can use that and simplify a bunch of
the spec constant logic.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
VK_SHADER_STAGE_ALL now includes all ray-tracing related stages.
Noticed while comparing vulkaninfo with some other drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4679>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
Use the f variants of the math functions if the input parameter is a
float, saves converting from float to double and running the double
variant of the math function for gaining no precision at all
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3969>
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.
Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
shuffle is implemented with shared VGPRs with ACO and Wave64.
Fixes dEQP-VK.subgroups.shuffle.framebuffer.subgroupshuffle*_vertex
with Wave64.
Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4595>
Fixes
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.
Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4593>
To workaround a crash with Wolfeinstein Younglood because the
games creates one descriptor with
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV...
I reported the problem to Machine Games, but still no answer, so
let's remove the unreachable calls (which are technically not
unreachable for buggy apps) to help gamers.
Note that AMDVLK and AMDGPU-PRO don't crash because they ignore
unsupported descriptor types.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4571>
It's unsupported because small bitsizes are still not completely
supported. It should have been disabled by default with ACO.
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4549>
For emitting RMW packets in the command stream. This new helper
will be useful for implementing extended dynamic states to only
overwrite the fields that need to be updated instead of storing
more values in the pipeline.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4531>
This gives +8% with Wolfeinstein Youngblood on my Vega64, and
according to someone else, it also improves performance with Doom
2016 and Wolfenstein 2 (and probably other ID Tech games).
This improvement is because Youngblood uses GENERAL for the main
depth-only pass and TC-compat HTILE is now enabled with GENERAL if
we know that we are outside of a render loop. This obviously also
reduces the number of HTILE decompressions from/to GENERAL.
Note that Youngblood violates the Vulkan spec regarding render loops
because they are only allowed with input attachments. Expect possible
rendering issues if apps use render loops with the wrong way (ie.
without input attachmens) because HTILE might not be coherent if
a depth-stencil texture is sampled and rendered in the same draw.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2704
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4391>
If no texture fetches happen it's useless to enable TC-compat HTILE.
Because the driver currently doesn't support TC-compat HTILE for
storage images we don't have to check.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4497>
This disables all fp16 shader control features on GFX8 because only
GFX9+ supports double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
This feature allows to use both 16-bit integers and 16-bit floats
as inputs/outputs.
This disables storageInputOutput16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.
This also reduces performance with Wolfeinstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.
We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Only GFX9+ support double rate packed math instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.
Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
This replaces emit_vertex with:
if (vertex_count < max_vertices) {
emit_vertex_with_counter vertex_count ...
vertex_count += 1
}
Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.
pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)
No pipeline-db changes on other generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
The goal of this function was to return whether a depth-stencil image
has HTILE, in comparison to radv_layout_is_htile_compressed() which
is used to know whether a depth-stencil image has HTILE compressed.
These two functions are actually similar and they have never been
used for what they were supposed to. Remove radv_layout_has_htile()
in favour of radv_layout_is_htile_compressed() for now. If it's
needed in the future, I will re-introduce this concept properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>