DXVK's DXGI implementation can create extra instances used for
enumerating physical devices besides the games' instance. When reserving
VMIDs for SPM, the DXGI instances may snatch the VMID reservation early,
making VMID reservation for the instance that actually needs it fail.
This starts being a problem on kernels 6.18+ where only one user may
reserve a VMID at a time.
Move reserving VMIDs to SQTT initialization inside vkCreateDevice so
that only the instances that actually create logical devices try
reserving VMIDs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38746>
(cherry picked from commit a7a4abc8d8)
Conflicts:
src/amd/vulkan/radv_physical_device.c
src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
According to AV1 spec, force_integer_mv=1 on intra frames. However, VCN
FW does not expect integer mv to be set unless screen content tools are
enabled. This also aligns the code to the radeonsi logic.
(cherry-picked from commit fa1fd2413f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38812>
Previously legacy_gs_info calculated based on
gs_info->legacy_gs_info.esgs_itemsize which is calculated based on gs
input varyings.
However, when using ESO vs/tes can have outputs not read by gs, which
leads to underestimating LDS usage.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38805>
It's the number of elements. RADV exposes VK_FORMAT_R64_{UINT,SINT}
formats for texel buffers, so the maximum is 1<<29 to fit in the
32-bit bounds checking.
Fixes KHR-GL46.texture_buffer_size_clamping.* with Zink and new VKCTS
dEQP-VK.texture.misc.max_elements.*.
Cc: mesa-stable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38805>
RDR2 VRAM memory management when resizable BAR is enabled seems
incorrect because it keeps allocating VRAM without freeing anything.
This introduces a drirc option to emulate a fake carveout of 256MiB to
workaround this game bug. This also adjust memory budgets by
distributing it between visible and invisible because AMDGPU reports
the same value for both when REBAR is enabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12091
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38805>
The order of pReferenceSlots is not well-defined by spec. Instead we
need to look at the RefPicList0/1 which provides slot indices.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit ab56ce154b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Since we only support 1 L0/L1 ref, the default num refs in the PPS
should always be 0. With that there never any need to set the override
flag in the slice header (until more references are supported).
Also the ref pic list modifications should be clamped to the size of the
ref pic list.
This fixes an issue seen with dEQP-VK.video.encode.h264.i_p_b_13_*.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 2e21eec921)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has been long removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.
Cc: mesa-stable
(cherry picked from commit bc1eea90b9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Moving `ci-tron:priority:` out of the variable because an empty value
will not be authorized, and this makes it obvious if that bug ever
happens (job will not be picked up and gitlab will complain that
`ci-tron:priority:` is not a tag registered by any runner), instead of
getting picked up by any runner that will then reject (fail) the job.
(This is caused by GitLab's API not allowing tags to be enforced when
picking up jobs, resulting in jobs with missing tags being picked up by
any runner, like the bug we had with the generic fd.o runners a few
months ago.)
v2 (Martin Roukala):
* use the priority tags in all amdgpu jobs
* add missing tags in etnaviv jobs
* add missing tags in broadcom jobs
Cc: mesa-stable
(cherry picked from commit 53fe1f39a0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
When there are no color outputs in the rendering state, but color write
enable/write aren't masked out (which seems legal with
VK_EXT_dynamic_rendering_unused_attachments), the driver must emit
CB_DISABLE to disable CB rendering completely.
Otherwise, if there is also a depth/stencil attachment in the rendering
state, CB0 is always set to 32_R for RB+. That means, the pixel shader
would still export fragments but to the previously bound color
attachment.
VKCTS is missing coverage.
Fixes: 4580293ab2 ("radv: implement RB+ depth-only rendering for better perf")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14319
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 168a8d0b52)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.
This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.
Cc: mesa-stable
(cherry picked from commit 1243d575a5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This was copied from radeonsi which expected seq_force_screen_content_tools = 2
and seq_force_integer_mv = 2.
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 3858a6a696)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Disable sparse mappings on GFX7-8 due to GPU hangs in the VK CTS,
except Polaris where it happens to work "well enough" to pass
the VK CTS and run some games already.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 567e1b56ef)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Also disable the sparse binding queue and other related features.
Using sparse on GFX6-8 can cause GPU hangs at the moment.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1c8881fc60)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
VCN requires the luma/chroma VAs to be 256 aligned. On VCN5, the
collocated buffer was not 256 aligned which can cause these VAs to be
unaligned.
This fixes VVL PositiveVideoEncodeH264.Basic on VCN5.
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 8848495875)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
GTK is missing a semaphore between QueueSubmit() and QueuePresent()
causing the WSI submit to be "unordered" and to immediately signal the
semaphores (because it's missing a wait semaphore in QueuePresent()).
The workaround is to disable unordered WSI submits until GTK fixes it
properly.
Cc: "25.3"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14087
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0d9d45db4e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
For GS streamout, we need the following LDS scratch space:
- Repacking streamout vertices takes 1 dword per 4 waves per stream
(max 16 bytes for Wave64, max 32 bytes for Wave32)
- 1 dword per stream for buffer info
(16 bytes)
- 1 dword per buffer for buffer info
(16 bytes)
Previously, the space used for buffer info aliased with the
space for repacking the output vertices in ngg_gs_finale(),
and there was no barrier in between, which caused a race
condition, resulting in random failure.
Fix this by allocating a few more LDS dwords so that aliasing
is not required, which also allows us to remove an extra
workgroup barrier.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 8f99d736d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
1. The prolog needs to have a null check. Libraries don't have prologs.
2. We only need to print the shaders actually included in this pipeline.
Libraries were already printed separately.
3. The traversal shader was wrongly omitted from the output.
Cc: mesa-stable
(cherry picked from commit 73a31dafbc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
This fixes the VVL PositiveVideoDecodeAV1.* tests, which trigger error
concealment. These DPB addresses would not be normally used, but get
used by the error concealment path.
Fixes: d103b76ad6 ("radv/video: add VK_KHR_video_decode_av1 support.")
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 82d944b388)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Not the most optimal solution but 64-bit vertex attributes are rarely
used. Could still revisit if we find a real use case that matters.
This fixes recent VKCTS coverage:
dEQP-VK.pipeline.fast_linked_library.vertex_input.component_mismatch.r64g64b64.*_to_dvec2
dEQP-VK.pipeline.shader_object_.*.vertex_input.component_mismatch.r64g64b64.*_to_dvec2
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14243
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a0d607bfdb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
The shaders in question use:
(memory_load + (gl_SubgroupSize - 1)) & ~(gl_SubgroupSize - 1)
My guess is that this is supposed to be the subgroup size of whatever
produced the value, not the subgroup size in this shader.
And because in the consumer the workgroup size is 32, we use wave32.
Fixes: a2d3cbac2a ("radv: determine subgroup/wave size early")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14187
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 83e9ae2d5c)
Conflicts:
src/amd/vulkan/radv_instance.c
src/amd/vulkan/radv_instance.h
src/util/driconf.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
write_memory is used after encoding every frame to mark the feedback
buffer as ready. Only use it when write_memory can work without PCIe
atomics support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 874e02003a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The Vulkan spec says:
"VUID-vkCmdDraw-maxFragmentDualSrcAttachments-09239
If blending is enabled for any attachment where either the source
or destination blend factors for that attachment use the secondary
color input, the maximum value of Location for any output attachment
statically used in the Fragment Execution Model executed by this
command must be less than maxFragmentDualSrcAttachments"
Which means it must be disabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14190
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit b2badb2b24)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
VCN requires 64x16 alignment for HEVC. When the app requests non-aligned
resolutions, make up for it with conformance window cropping.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Cc: mesa-stable
(cherry picked from commit cef8eff74d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
This can happen with (for example) 32x2 loads with
align_mul=4,align_offset=2.
This patch does bit_size=min(bit_size,bytes) to prevent num_components
from being 0.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 52cd5f7e69 ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit b18421ae3d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>