radeonsi may generate empty main shader or an empty exit block
for p_end_with_regs to jump to.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
PS with epilog does not need to fix_exports. And radeonsi use
p_end_with_regs so does not have jump instruction at last.
radeonsi may also have exec restore instruction, so may break
before reach to p_end_with_regs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
radeonsi need to compact color export for ps epilog while radv does not.
radv will fill empty color slot, so won't affected by this change.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
ps needs to handle wqm:
1. main part may compute with args from prolog in wqm mode, so
prolog need to compute these args in wqm mode too.
2. prolog and main part need to end with exact exec, so next
shader part which inherit previous shader part's exec won't
do valid job for helper threads
1 need p_end_with_regs to operate in wqm mode and itself can't
be exact, otherwise some move instruction added by it won't be
in wqm mode so helper threads' compute result is not passed to
next shader part as args.
2 is done by p_end_wqm added by finish_program automatically
after p_end_with_regs.
Piglit tests can trigger the problem:
1. gl-2.1-polygon-stipple-fs
a. ps prolog call discard_if
b. ps main pass wqm exec to epilog
c. ps epilog export color for discarded pixel
2. fs-fwidth-color.shader_test
a. ps prolog need to pass args computed in wqm mode
b. set p_end_with_regs to exact will end wqm mode before
the move instructions, so helper threads's result is not
passed to next shader part
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
It's a HW init reg, not driver spec user sgpr. radv just
doesn't use it. Move it to amd common for aco ps prolog
usage.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
In a scenario like:
CmdBindTransformFeedbackBuffers()
BeginTransformFeedback()
CmdDraw() --> streamout descriptors emitted
EndTransformFeedback() --> streamout descriptors emitted as 0 (disabled)
CmdDraw()
BeginTransformFeedback()
CmdDraw() --> streamout descriptor not re-emitted
EndTransformFeedback()
Fix this by re-emitting streamout descriptors when streamout is
enabled/disabled because a buffer size of 0 acts like a disable bit.
This fixes dEQP-VK.transform_feedback.simple.backward_dependency_indirect*
on NAVI31.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25583>
This test checks the driver's reported conformance version against the
version of the CTS we're running. This check fails every few months
and everyone has to go and bump the number in every driver.
Running this check only makes sense while preparing a conformance
submission, so skip it in the regular CI.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25519>
Gang submissions are mostly only be used for task shaders to both
submit GFX and ACE command buffers in the same submission. Though,
the gang leader (the last submitted CS) IP type should match the
queue IP type to determine which fence to signal.
But if chaining is enabled, it's possible that all GFX cmdbufs are
chained to the first GFX cmdbuf, which means the last CS is ACE and
not GFX.
Fix this by resetting chaining for GFX when a cmdbuffer has GFX+ACE.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9724
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24966>
This was completely broken, for example in the following scenario:
- submits something which needs sample positions (this creates a new
descriptor BO with sample positions)
- submits something which needs the tess rings (this creates a new
descriptor BO with tess rings but without sample positions, ie.
add_sample_positions would be FALSE)
- submits something which needs sample positions again (this won't
create a new descriptor BO because it incorrectly remembered that
sample positions were set)
Fix this by always writing the sample positions.
This should fix the following flakes:
- dEQP-VK.fragment_shading_barycentric.*.weights.pipeline_topology_dynamic.msaa_interpolate_at_sample.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolate_at_distinct_values.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolation_consistency.*
- dEQP-VK.draw.renderpass.linear_interpolation.*
- dEQP-VK.draw.dynamic_rendering.primary_cmd_buff.linear_interpolation.*
These flakes were extremely hard to reproduce!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25529>
There are some minor differences
- fix incorrectly use of device->meta_state.resolve_compute.p_layout
- when on_demand is true, the creation of ds and pipeline layouts are
also deferred
- unlike radv_meta_get_view_type, vk_texcompress_etc2_image_view_type
returns 1d/2d array image views
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25071>
Move emitting the EOP even which writes the availability bit after the
GDS copy to ensure it's available.
This should fix all GS primitives/invocations flakes in CI.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25457>
It was reported that it broke Stoney. Something probably uses a suboptimal
pitch, like minigbm.
Fixes: 7d066330e0 - ac/surface: relax custom pitch requirements to any multiple of 256B on gfx10.3+
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25540>
With the commit 4f660f99 ("ac/gpu_info: pad IBs according to ib_size_alignment"),
we found kernel isn't reporting ib_base/size_alignment correctly, thus causing
VCN_DEC and JPEG functions broken. We will fix the kernel and bump the kernel
version, and now for the older kernel, we need this override.
closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9916
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25511>
clrx hasn't seen any changes since 2021. I guess the only reson to keep it is
GFX6 support.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25522>