nir_lower_clip_cull_distance_array_vars was sneakily updating
shader_info::clip/cull_distance_array_size.
This moves the gathering into a new function
nir_gather_clip_cull_distance_sizes_from_vars.
v2: remove assertions that prevented nir_lower_clip_cull_distance_array_vars
from being used with non-compact arrays
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38465>
We already did the work of transforming the ray data, no need to do it
multiple times.
Should theoretically be a lot better. However, none of the fossils
appear to use object-space ray data in anyhit/intersection shaders. :(
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38809>
This splits up radv_nir_rt_shader.c into several parts.
The first part is all ray traversal lowering for RT pipelines, located
at radv_nir_rt_traversal_shader.c. It implements building the traversal
loop, including inlined any-hit/intersection shaders (optionally as a
completely separate shader).
The second part is lowering for individual RT stages (right now,
monolithic vs. CPS-style separate compilation). Each lowering technique
lives in its own file (radv_nir_rt_stage_{monolithic,cps}.c).
Code shared between RT lowering techniques (shader inlining helpers and
storage lowering passes) gets moved into radv_nir_rt_stage_common.c.
One header, radv_nir_rt_stage.h, is the public interface for RT pipeline
stage lowering. Functions exposed to users (really just
radv_pipeline_rt.c) go there. The header for internal shared helpers is
radv_nir_rt_stage_common.c.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38809>
This seems to work just fine, except on NAVI21, NAVI22 and VANGOGH. It
might be the same issue as has_vrs_ds_export_bug but it's not documented
anywhere. That being said, the CTS tests that fail don't even export
depth or stencil from fragment shaders.
Do not enable this feature on these GPUs to be conservative.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38767>
It's the number of elements. RADV exposes VK_FORMAT_R64_{UINT,SINT}
formats for texel buffers, so the maximum is 1<<29 to fit in the
32-bit bounds checking.
Fixes KHR-GL46.texture_buffer_size_clamping.* with Zink and new VKCTS
dEQP-VK.texture.misc.max_elements.*.
Cc: mesa-stable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38140>
DXVK's DXGI implementation can create extra instances used for
enumerating physical devices besides the games' instance. When reserving
VMIDs for SPM, the DXGI instances may snatch the VMID reservation early,
making VMID reservation for the instance that actually needs it fail.
This starts being a problem on kernels 6.18+ where only one user may
reserve a VMID at a time.
Move reserving VMIDs to SQTT initialization inside vkCreateDevice so
that only the instances that actually create logical devices try
reserving VMIDs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38746>
According to AV1 spec, force_integer_mv=1 on intra frames. However, VCN
FW does not expect integer mv to be set unless screen content tools are
enabled. This also aligns the code to the radeonsi logic.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38716>
The order of pReferenceSlots is not well-defined by spec. Instead we
need to look at the RefPicList0/1 which provides slot indices.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38686>
Since we only support 1 L0/L1 ref, the default num refs in the PPS
should always be 0. With that there never any need to set the override
flag in the slice header (until more references are supported).
Also the ref pic list modifications should be clamped to the size of the
ref pic list.
This fixes an issue seen with dEQP-VK.video.encode.h264.i_p_b_13_*.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38686>
Weird that CTS did not catch that ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 11195eb8de ("vulkan: Add KHR_swapchain_maintenance1 promotions.")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38728>
aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common
because tests don't link with ACO, so linking would fail, but we want
to move the nir_opt_vectorize callback here that uses it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>
This fixes a real issue when ESO uses fbfetch output because this
was determined after instead of before.
This solution isn't the most elegant one but binding graphics shaders
earlier would require more work. Let's just handle this specific corner
case for now.
This fixes
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.shader_objects.fragment_region*
on some GPUs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>
Without this, we will report some image formats as unsupported
and the dedicated sparse binding queue won't work
when sparse support is enabled using RADV_PERFTEST=sparse
Fixes: dd90c76cea12 ("radv: Advertise sparse features pre Polaris with perftest flag")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38676>
RDR2 VRAM memory management when resizable BAR is enabled seems
incorrect because it keeps allocating VRAM without freeing anything.
This introduces a drirc option to emulate a fake carveout of 256MiB to
workaround this game bug. This also adjust memory budgets by
distributing it between visible and invisible because AMDGPU reports
the same value for both when REBAR is enabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12091
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38627>
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has been long removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488>
RADV_PERFTEST=sparse is a new option to enable experimental
support for sparse features when they aren't enabled by default:
- gfx6 supports sparse, albeit with a reduced feature set
- gfx7 supports 3D images (with non-standard block shape)
and unaligned mip sizes
- gfx8 supports the same feature set as gfx7
(Polaris behaves more stable than other gfx8, so we had
already enabled it by default on Polaris for a long time.)
We pass all dEQP-VK.*sparse* tests on gfx6-8 when running on
a single thread however it may cause hangs or failures
when executing the tests on multiple parallel jobs.
We plan to enable this by default when we deem it stable enough.
Until then, users can already test some games that use it.
Note, at the moment there are some unsolved problems in the
amdgpu kernel driver regarding sparse bindings on these GPUs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38553>
Change has_cp_dma_with_null_prt_bug to cp_dma_supports_sparse
to know when CP DMA supports sparse. CP DMA doesn't support
sparse on any gfx6-9 chip.
Sources:
- d2669628 already documented this on gfx6 in 2018
- e259f405 added a radeonsi workaround for gfx9 in 2023
- 235f70e4 added a radv workaround for Polaris in 2025
Now RADV will use compute copy and fill for sparse resources
on all gfx6-9 chips (previously only did on Polaris and newer).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38553>