Ignore primitive count and indices which work differently.
Then, assign driver locations to per-vertex and per-primitive
outputs independently because we allocate separate LDS
areas for them.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Mesh shaders use NGG, but the API allows many compute shader
features such as workgroups and shared memory.
Use the appropriate NIR lowerings for these, then
call ac_nir_lower_ngg_ms.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Use the same code path as other NGG shaders, but with
a few special cases.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Generic per-primitive outputs:
They work similarly to other NGG outputs.
In the ISA they are param export instructions that are executed
on the primitive threads. These per-primitive params must be
sorted last among both mesh shader outputs and pixel shader inputs.
PS can read these inputs using the same old VINTRP instructions.
They use the same amount of LDS space as per-vertex PS inputs.
Special per-primitive outputs:
The VRS rate x, y, viewport and layer are special per-primitive
outputs which must go to the second channel of the primitive
export instruction, which is enabled by EN_PRIM_PAYLOAD.
If the PS wants to read these, they must also be exported as
a generic param.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Makes the code a little cleaner, and makes it easier to add
per-primitive PS inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
This cleans up radv_pipeline_generate_ps_inputs.
Makes the code a little cleaner, and makes it easier to add
per-primitive PS inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Use the same old outinfo structure as other NGG shaders.
Additionally, store the output primitive type.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
This will be used to determine whether a pipeline uses
mesh shading or not.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
Lower mesh shader outputs to shared memory.
At the end of the shader, read the outputs from shared memory
and export their values as NGG expects.
We allocate separate shared memory (LDS) areas for per-vertex,
per-primitive outputs, primitive indices, primitive count.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
It gets a HTTP 504 error code when trying to access a git repo.
Disable it again until this issue is resolved.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14364>
Use firstMipIdInTail directly from addrlib which calculated this
in a different way:
Original way: either dimension size of mipmap should be less than
the tile size.
Addrlib way: all dimesion size of the mipmap should be less than
the tile size and at lest one dimension size should be less than
half of the tile size, so that all following mip levels can fit
in one tile and any commit for level in the mip tail also commit
for all levels in mip tail.
Theoretically either way is OK but addrlib way needs less care
about the mip tail commit and better align with the true memory
layout given by itself.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14223>
This was added to fix dEQP-VK.api.buffer.basic.size_max_uint64 but
with VK_KHR_maintenance4 this is now considered invalid usage.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14331>
For example, if the driver clears a view of SINT with 127 and the image
format is UNORM, the hw fills "1" instead of the clear value
(127.0f / 255.0f). This CB feature is GFX9+ only.
This should fix test_typed_srv_cast_clear from vkd3d-proton once it
correctly sets the MUTABLE flag (which is still buggy as of c0a3fa8a).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5676
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14228>
This also includes storage images and storage buffers, not just color
attachments.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14311>
1. the dst stride may be too small if count=1.
2. the src stride may be too small due to the availability bit.
So lets just compute the size needed explicitly and use it.
Fixes: 90a0556c ("radv: use pool stride when copying single query results")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14242>
Uplay needs this to avoid a crash because it does an use-after-free
of a descriptor set layout. This was initially introduced by Bas to
workaround a similar issue with Baldur's Gate 3, it seems needed again.
Cc: 21.3 mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5789
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14318>
This fixes an stack-use-after-scope detect by ASAN because the
subpass is used after the loop by radv_mark_noncoherent_rb().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14317>
It's required to have a valid device memory handle and it would make
no sense to call these functions with a NULL pointer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14287>
Fix defects reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable signal_semaphore_infos going out of scope leaks the storage it points to
leaked_storage: Variable wait_semaphore_infos going out of scope leaks the storage it points to.
Fixes: 3da7d10d9b ("radv: implement vkQueueSubmit2KHR()")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14260>
These can't be encoded on GFX6/7, and combining these additions causes
CTS failures on GFX10.3.
I think the low 2 MSBs are ignored before the addition, not after, so
load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0).
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>
Make sure to clamp the global scissor to the render area.
This fixes dEQP-VK.draw.dynamic_rendering.*oversized.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14170>
For v_pk_fma_f16, we should consider all three operands, not the first
two.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 15a375b4c8 ("radv,aco: don't lower some ffma instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229>
If signalled on the same queue it is totally useless, so only wait
if we have a syncobj that is explicitly being waited on, which can
be from potentially another queue/ctx. (Ideally we'd check but there
is no way to do so currently. Might revisit when we integrate the
common sync framework)
Fixes: 7675d066ca ("radv/amdgpu: Add support for submitting 0 commandbuffers.")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14214>
This fixes
dEQP-VK.texture.filtering_anisotropy.single_level.anisotropy_*.mag_linear_min_linear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14171>
When the BO is NULL, AMDGPU will reset the PTE VA range to the initial
state. Otherwise, it will first unmap all existing VA that overlap the
requested range and then map. This seems better than using MAP/UNMAP.
This reduces stuttering in Forza Horizon 5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
This shouldn't be necessary because applications have to manage
resources and memory themselves.
This also prevented memory to be freed if an application doesn't unbind
a sparse memory object and free it, which is legal as long as the
resource isn't used afterwards.
This was introduced to unmap the sparse mappings when destroying
a virtual BO, but now that the driver uses OP_CLEAR it's no longer
needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
I copied stuff from ac_gpu_info.c until there were no Sienna Cichild or
Polaris10 fossil-db changes between real hardware and RADV_FORCE_FAMILY.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14126>
As needed on Android (as it is required) and by driconf flag otherwise.
The non-Android case would be on the host side for an Android VM.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>