No change in behavior. The previous overalignment is preserved.
It sets ib_pad_dw_mask sooner.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25578>
This doesn't fix anything because gds_needed should already be TRUE
because it's initialized at pipeline bind time, but this will be needed
for skipping GDS allocation on GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25284>
According to RadeonSI, this is required for preemption, user queues,
and we only have to wait for VS after streamout which should be more
performant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25284>
Primitive restart is also applied to non-indexed draws on AMD GPUs. On
GFX11, DISABLE_FOR_AUTO_INDEX can be set but we will need a different
solution for older GPUs.
This fixes all line related flakes in CI (at least).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25639>
this patch calls the init and finish functions of the vk
runtime astc decoder. initializes emulate_astc flag. sets
up the additional plane to store decoded texture.
v2: fix _tex_dataformat() and _tex_numformat() (Chia-I Wu)
use correct function for bufferToImage (Chia-I Wu)
v3: add radv_is_layout_emulated() (Chia-I Wu)
avoid repeated pattern (Chia-I Wu)
v4: not create all pipelines on_demand (Chia-I Wu)
v5: current code does not support astc hdr (Chia-I Wu)
v6: keep luts in staging buffer only (Chia-I Wu)
v7: use 2DArray for both input and output
v8: document todo to use fp16 (Chia-I Wu)
not required to move meta init anymore (Chia-I Wu)
move astc_emulation_format to vk_texcompress_astc.h (Chia-I Wu)
v9: remove LAYOUT check (Chia-I Wu)
check on iview->vk.view_format
move setting tiled flags for astc (Chia-I Wu)
remove is format emulated check in radv_is_storage_image* (Chia-I Wu)
use LAYOUT_ASTC for if check (Chia-I Wu)
no 1D support (Chia-I Wu)
calculate start end offset in 2x blk size
v10: remove old wrong code (Chia-I Wu)
v11: use existing defined local format variable (Chia-I Wu)
dst image layout is always VK_IMAGE_LAYOUT_GENERAL (Chia-I Wu)
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24672>
It works with just one counter.
This mitigates https://gitlab.freedesktop.org/drm/amd/-/issues/2902
quite a lot when you run dEQP-VK.transform_feedback.* in parallel on
more than 16 threads with RDNA3.
For example, on my GPU the kernel reports 16 GDS OA counters which means
that if you run VKCTS with 16 threads (ie. 16 Vulkan devices are
created) it's fine. Otherwise, the kernel might report ENOMEM.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25619>
Otherwise, we have dangling BO pointers in the global BO list. Not
quite sure why this hasn't been triggered before.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25623>
This re-introduces "radv: fix alignment of DGC command buffers" and
"radv/amdgpu: fix alignment of command buffers" which were valid
changes.
IBs need to be aligned to the IB size requirement, not the number of
padded NOPs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25588>
In a scenario like:
CmdBindTransformFeedbackBuffers()
BeginTransformFeedback()
CmdDraw() --> streamout descriptors emitted
EndTransformFeedback() --> streamout descriptors emitted as 0 (disabled)
CmdDraw()
BeginTransformFeedback()
CmdDraw() --> streamout descriptor not re-emitted
EndTransformFeedback()
Fix this by re-emitting streamout descriptors when streamout is
enabled/disabled because a buffer size of 0 acts like a disable bit.
This fixes dEQP-VK.transform_feedback.simple.backward_dependency_indirect*
on NAVI31.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25583>
Gang submissions are mostly only be used for task shaders to both
submit GFX and ACE command buffers in the same submission. Though,
the gang leader (the last submitted CS) IP type should match the
queue IP type to determine which fence to signal.
But if chaining is enabled, it's possible that all GFX cmdbufs are
chained to the first GFX cmdbuf, which means the last CS is ACE and
not GFX.
Fix this by resetting chaining for GFX when a cmdbuffer has GFX+ACE.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9724
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24966>
This was completely broken, for example in the following scenario:
- submits something which needs sample positions (this creates a new
descriptor BO with sample positions)
- submits something which needs the tess rings (this creates a new
descriptor BO with tess rings but without sample positions, ie.
add_sample_positions would be FALSE)
- submits something which needs sample positions again (this won't
create a new descriptor BO because it incorrectly remembered that
sample positions were set)
Fix this by always writing the sample positions.
This should fix the following flakes:
- dEQP-VK.fragment_shading_barycentric.*.weights.pipeline_topology_dynamic.msaa_interpolate_at_sample.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolate_at_distinct_values.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolation_consistency.*
- dEQP-VK.draw.renderpass.linear_interpolation.*
- dEQP-VK.draw.dynamic_rendering.primary_cmd_buff.linear_interpolation.*
These flakes were extremely hard to reproduce!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25529>
There are some minor differences
- fix incorrectly use of device->meta_state.resolve_compute.p_layout
- when on_demand is true, the creation of ds and pipeline layouts are
also deferred
- unlike radv_meta_get_view_type, vk_texcompress_etc2_image_view_type
returns 1d/2d array image views
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25071>
Move emitting the EOP even which writes the availability bit after the
GDS copy to ensure it's available.
This should fix all GS primitives/invocations flakes in CI.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25457>
Otherwise, DGC command buffers might not be correctly aligned.
This fixes a regression with the vkd3d-proton DGC tests.
Fixes: 4f660f9937 ("ac/gpu_info: pad IBs according to ib_size_alignment")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25500>
the intent of the VkExternalMemoryProperties API is that all compatible
handle types are returned, not just the type being queried. these two
types are compatible, so return both when both are supported
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25474>
The VA needs to be adjusted, otherwise the hw always writes at offset 0.
This fixes dEQP-VK.query_pool.statistics_query.*_cq.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25406>
This is now fixed.
Hi-Fi Rush, Sonic Frontiers and Hogwarts Legacy were known broken games.
I personally reproduced the issue with Hi-Fi Rush which has been fixed
since e6735409ee ("radv: disable DCC with signedness reinterpretation
on GFX11"). I also tested Sonic Frontiers which has been fixed since
52b6886992 ("amd: update addrlib"). I didn't check Hogwarts Legacy but
I think it was also fixed by e6735409ee.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25435>
This is actually wrong. For example, if there is a DCC decompress
draw followed by a draw without any color attachments,
CB_COLOR_CONTROL.MODE is still CB_DCC_DECOMPRESS but it should be
CB_DISABLED. For some reasons, this hangs on RDNA3 (VM faults are also
reported through dmesg).
This fixes GPU hangs with Resident Evil 6, Star Wars The Old Republic
and probably more games on RDNA3.
Strictly speaking, I don't think this dynamic state optimization is
worth a try, even for other states, and I think it would be safer to
remove it completely.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9335
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8327
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9878
Fixes: c08082e861 ("radv: ignore all CB dynamic states when there is no color attachments")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25402>