This calculation needs to happen in the same loop as the
geometry/triangle id calculations in case the selected invocation is
before all invocations that were already selected.
Totals from 1269 (15.10% of 8406) affected BVHs:
compacted_size: 137581888 -> 137606464 (+0.02%); split: -0.08%, +0.10%
sah: 6496048424 -> 6496048450 (+0.00%); split: -0.00%, +0.00%
primitive_node_count: 604384 -> 604656 (+0.05%); split: -0.14%, +0.19%
Fixes: c18a7d0 ("radv: Emit compressed primitive nodes on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38462>
nir_lower_clip_cull_distance_array_vars was sneakily updating
shader_info::clip/cull_distance_array_size.
This moves the gathering into a new function
nir_gather_clip_cull_distance_sizes_from_vars.
v2: remove assertions that prevented nir_lower_clip_cull_distance_array_vars
from being used with non-compact arrays
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38465>
We already did the work of transforming the ray data, no need to do it
multiple times.
Should theoretically be a lot better. However, none of the fossils
appear to use object-space ray data in anyhit/intersection shaders. :(
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38809>
This splits up radv_nir_rt_shader.c into several parts.
The first part is all ray traversal lowering for RT pipelines, located
at radv_nir_rt_traversal_shader.c. It implements building the traversal
loop, including inlined any-hit/intersection shaders (optionally as a
completely separate shader).
The second part is lowering for individual RT stages (right now,
monolithic vs. CPS-style separate compilation). Each lowering technique
lives in its own file (radv_nir_rt_stage_{monolithic,cps}.c).
Code shared between RT lowering techniques (shader inlining helpers and
storage lowering passes) gets moved into radv_nir_rt_stage_common.c.
One header, radv_nir_rt_stage.h, is the public interface for RT pipeline
stage lowering. Functions exposed to users (really just
radv_pipeline_rt.c) go there. The header for internal shared helpers is
radv_nir_rt_stage_common.c.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38809>
This seems to work just fine, except on NAVI21, NAVI22 and VANGOGH. It
might be the same issue as has_vrs_ds_export_bug but it's not documented
anywhere. That being said, the CTS tests that fail don't even export
depth or stencil from fragment shaders.
Do not enable this feature on these GPUs to be conservative.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38767>
It's the number of elements. RADV exposes VK_FORMAT_R64_{UINT,SINT}
formats for texel buffers, so the maximum is 1<<29 to fit in the
32-bit bounds checking.
Fixes KHR-GL46.texture_buffer_size_clamping.* with Zink and new VKCTS
dEQP-VK.texture.misc.max_elements.*.
Cc: mesa-stable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38140>
DXVK's DXGI implementation can create extra instances used for
enumerating physical devices besides the games' instance. When reserving
VMIDs for SPM, the DXGI instances may snatch the VMID reservation early,
making VMID reservation for the instance that actually needs it fail.
This starts being a problem on kernels 6.18+ where only one user may
reserve a VMID at a time.
Move reserving VMIDs to SQTT initialization inside vkCreateDevice so
that only the instances that actually create logical devices try
reserving VMIDs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38746>
According to AV1 spec, force_integer_mv=1 on intra frames. However, VCN
FW does not expect integer mv to be set unless screen content tools are
enabled. This also aligns the code to the radeonsi logic.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38716>
The order of pReferenceSlots is not well-defined by spec. Instead we
need to look at the RefPicList0/1 which provides slot indices.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38686>
Since we only support 1 L0/L1 ref, the default num refs in the PPS
should always be 0. With that there never any need to set the override
flag in the slice header (until more references are supported).
Also the ref pic list modifications should be clamped to the size of the
ref pic list.
This fixes an issue seen with dEQP-VK.video.encode.h264.i_p_b_13_*.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38686>
Weird that CTS did not catch that ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 11195eb8de ("vulkan: Add KHR_swapchain_maintenance1 promotions.")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38728>
aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common
because tests don't link with ACO, so linking would fail, but we want
to move the nir_opt_vectorize callback here that uses it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>
This fixes a real issue when ESO uses fbfetch output because this
was determined after instead of before.
This solution isn't the most elegant one but binding graphics shaders
earlier would require more work. Let's just handle this specific corner
case for now.
This fixes
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.shader_objects.fragment_region*
on some GPUs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>
Without this, we will report some image formats as unsupported
and the dedicated sparse binding queue won't work
when sparse support is enabled using RADV_PERFTEST=sparse
Fixes: dd90c76cea12 ("radv: Advertise sparse features pre Polaris with perftest flag")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38676>
RDR2 VRAM memory management when resizable BAR is enabled seems
incorrect because it keeps allocating VRAM without freeing anything.
This introduces a drirc option to emulate a fake carveout of 256MiB to
workaround this game bug. This also adjust memory budgets by
distributing it between visible and invisible because AMDGPU reports
the same value for both when REBAR is enabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12091
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38627>