Commit graph

16881 commits

Author SHA1 Message Date
David Rosca
8ffedebf1c radv/video: Fix encode session info for VCN3+
Last dword should be 0.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 7249d9548e)
2025-04-16 15:37:03 +02:00
David Rosca
15b2a440da radv/video: Fix msg header total size
It needs to include also codec msg size.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 34031531fc)
2025-04-16 15:37:03 +02:00
Samuel Pitoiset
3c6e241f0d radv: apply the workaround for buggy HiZ/HiS on GFX12 for DGC
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit d2da54e6f3)
2025-04-15 17:24:55 +02:00
Samuel Pitoiset
3c932e7824 radv: add a workaround for buggy HiZ/HiS on GFX12
HiZ/HiS is buggy and can cause random GPU hangs when stencil is enabled.
There are basically two alternatives but RADV follows RadeonSI and emit
a dummy RELEASE_MEM packet after every draw which should workaround the
issue and maintain performance.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12944
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 6388db03c8)
2025-04-15 17:24:05 +02:00
Samuel Pitoiset
5449bd2eb7 radv: determine if HiZ/HiS is enabled earlier on GFX12
To lower CPU overhead of the hardware workaround.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 11b6d2ba60)
2025-04-10 18:06:29 +02:00
Natalie Vock
1fdd97ea51 aco: Make private_segment_buffer/scratch_offset per-resume
We need different Temps for each resume shader, because registers aren't
preserved across resume boundaries.

This was likely fine in practice because arg registers are the same for
each shader, but resulted in invalid IR and asserts.

Fixes crashes in Indiana Jones RT with assertions enabled on GFX8.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
(cherry picked from commit 3d8db3cbbb)
2025-04-10 17:12:25 +02:00
Eric Engestrom
8614baa5f8 ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners
Otherwise, ci-tron runners with that tag could pick up jobs meant for the fdo
runners, as happened here:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719

The inverse (fdo runners picking up a job meant for a ci-tron runner) is not
possible though, as ci-tron jobs always include a `farm:$RUNNER_FARM_LOCATION`
tag, so the problem only exists in the other direction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34358>
(cherry picked from commit 6331441e24)
2025-04-10 17:12:24 +02:00
Timur Kristóf
a64edc0e3b radv: Call nir_opt_undef too after nir_opt_varyings.
Shaders may have undefined output stores after nir_opt_varyings.
These must be optimized out, otherwise they hit an assertion.

Fixes: 17f6ab28cc
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit ce2138d73a)
2025-04-10 17:12:24 +02:00
Timur Kristóf
6ebea60d73 radv: Use buffers_written mask when gathering XFB info.
We need to enable these buffers regardless of whether or not the
shader actually writes any outputs to them, otherwise we break
XFB queries.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit 15d0804670)
2025-04-10 17:12:24 +02:00
Samuel Pitoiset
ef2a5bee7b radv: fix ignoring conditional rendering with vkCmdResolveImage()
This command isn't supposed to be affected by conditional rendering.

This fixes new VKCTS coverage
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
(cherry picked from commit 4d1d6d4147)
2025-04-10 17:12:23 +02:00
Eric Engestrom
2bde9b1ef7 [25.0 only] update more ci expectations
These changes happened with no mesa code change, only infrastructure
changes, which is really weird, but to be able to move on, let's simply
document the "new normal".

(Was missed in 69d6923cdb)
2025-04-10 17:12:23 +02:00
David Rosca
b592736211 radv: Add radv_format_description to remap 10/12bit formats to 16bit
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Remapping was missing for format description which made these formats
effectively unsupported as zero format features were reported.

Fixes: 0098f8ef35 ("radv: Remap 10 and 12 bit formats to 16 bit formats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34274>
(cherry picked from commit 597f13b244)
2025-04-02 14:27:04 +02:00
Samuel Pitoiset
ccc86bd62e Revert "radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs"
This reverts commit 96e9c3fe77.

This actually causes random GPU hangs like on Phoenix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12461
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12426
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12692
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34306>
(cherry picked from commit 64e6e043b3)
2025-04-02 14:24:08 +02:00
Samuel Pitoiset
abb47924db ac/surface: fix selecting preferred alignments for HiZ/HiS on GFX12
VK_MESA_image_alignment_control is used by vkd3d-proton to set
optimal alignments for images. Though, the preferred alignment was
only applied to the surface (or the stencil aspect) but not to the HiZ
surface due to the NULL check.

This caused rendering issues because swizzle modes didn't match.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12831
Fixes: 079f55d405 ("radv: advertise VK_MESA_image_alignment_control on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34322>
(cherry picked from commit fac44c0ca0)
2025-04-02 14:18:29 +02:00
Pierre-Eric Pelloux-Prayer
ae02c9a2d5 ac/nir: fix nir_metadata value of ac_nir_lower_image_opcodes
This pass can insert new blocks so 'nir_metadata_control_flow' is not
preserved.

Fixes: eaf98b1422 ("ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34241>
(cherry picked from commit 785df1b980)
2025-04-02 11:04:13 +02:00
Samuel Pitoiset
ffa6fd4bee radv: do not trigger FCE or FMASK decompress on compute queue
A pipeline barrier which contains an image layout transition like
COLOR_ATTACHMENT_OPTIMAL -> TRANSFER_DST_OPTIMAL on compute queue
would just hang. Such a barrier is useless in practice but it's legal.

Prevent GPU hangs by skipping FCE or FMASK_DECOMPRESS when it's not
on the graphics queue.

Fixes dEQP-VK.synchronization2.layout_transition.compute_transition*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
(cherry picked from commit 086f529bbe)
2025-04-02 11:04:13 +02:00
Natalie Vock
0c3b74d562 radv/rt: Flush CP writes from the common BVH framework with INV_L2 on GFX12
a1b05991 ("radv/rt: Flush L2 after writing internal node offset on GFX12")
did this for radv-internal CP writes - we also need to do this for PLOC
sync data initialization which is done in the common framework.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
(cherry picked from commit c1e1d86bd1)
2025-04-02 11:04:13 +02:00
Samuel Pitoiset
88c7326a61 radv/meta: fix color<->depth/stencil image copies
The color format needs to be compatible with depth or stencil. Also
the depth/stencil format was incorrect when it's the source.

Fixes dEQP-VK.api.ds_color_copy.*
and VKD3D_TEST_FILTER=test_copy_texture.

Fixes: d4ff011b12 ("radv: advertise VK_KHR_maintenance8")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34142>
(cherry picked from commit 2c3b9312cc)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
6ad9455f43 radv: fix compresed depth/stencil copies on transfer queue
HTILE is always pipe aligned.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
(cherry picked from commit 114fbdc534)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
7f32247d95 radv: fix bpe for the stencil aspect of depth/stencil copies on transfer queue
Using the bpe of depth+stencil when copying the stencil aspect only
doesn't work.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
(cherry picked from commit 7b15e85b95)
2025-04-02 11:04:11 +02:00
Rhys Perry
b3991dd8fc aco/ra: fix free register counting when moving variables
info.bounds might be smaller than the bounds available for the moved
variables.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 626aa7b648 ("aco: workaround GFX9 hardware bug for D16 image instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34158>
(cherry picked from commit 80fef30531)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
7e55a04643 radv: fix creating pipeline binary from the traversal shader
rt_stage_info is NULL.

Fixes: 8802612458 ("radv: advertise VK_KHR_pipeline_binary")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34141>
(cherry picked from commit 29b3d9f0f4)
2025-04-02 11:04:11 +02:00
Daniel Schürmann
e159e0000c aco: don't assume that demote doesn't cause an empty exec mask
Totals from 188 (0.24% of 79377) affected shaders: (Navi31)
Instrs: 209239 -> 209473 (+0.11%); split: -0.01%, +0.12%
CodeSize: 1101124 -> 1101744 (+0.06%); split: -0.02%, +0.07%
Latency: 1672182 -> 1672748 (+0.03%); split: -0.11%, +0.14%
InvThroughput: 237276 -> 237546 (+0.11%); split: -0.00%, +0.12%
SClause: 5694 -> 5690 (-0.07%); split: -0.28%, +0.21%
Copies: 21685 -> 21682 (-0.01%); split: -0.12%, +0.10%
Branches: 5740 -> 5863 (+2.14%)
PreSGPRs: 7004 -> 7034 (+0.43%)
VALU: 123595 -> 123641 (+0.04%); split: -0.00%, +0.04%
SALU: 28418 -> 28411 (-0.02%); split: -0.09%, +0.06%

Fixes: f35e229fae ('aco: skip code if exec is empty')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33619>
(cherry picked from commit 69dcd5be3a)
2025-04-02 11:04:10 +02:00
Eric Engestrom
69d6923cdb [25.0 only] update ci expectations
These changes happened with no mesa code change, only infrastructure
changes, which is really weird, but to be able to move on, let's simply
document the "new normal".
2025-04-02 11:04:10 +02:00
Bas Nieuwenhuizen
9d85e7eda9 radv: Move support check out of winsys.
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
To get the right error code. Mostly shouldn't be winsys dependent
anyway, outside of the idea that if we explicitly emulate a device
we should just assume th euser knows what they're doing.

Fixes: c942d957b0 ("radv: fail to initialize when the AMD GPU generation is unsupported")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12792
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33964>
(cherry picked from commit 61feea6954)
2025-03-15 09:49:05 +01:00
Ganesh Belgur Ramachandra
63e1e2c926 amd: use 128B compression for scanout images when drm.minor <63
Fixes: 8328e575 ("ac/surface/gfx12: enable DCC 256B compressed blocks and reorder modifiers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33702>
(cherry picked from commit ba80a11b69)
2025-03-15 09:49:05 +01:00
Georg Lehmann
d00144c8f0 aco/ra: disallow vcc definitions for pseudo scalar trans instrs
Foz-DB GFX1201:
Totals from 30 (0.04% of 79600) affected shaders:
Instrs: 58843 -> 58820 (-0.04%); split: -0.10%, +0.06%
CodeSize: 302228 -> 301944 (-0.09%); split: -0.13%, +0.04%
Latency: 204566 -> 204432 (-0.07%); split: -0.09%, +0.02%
InvThroughput: 136918 -> 136919 (+0.00%); split: -0.00%, +0.00%
SClause: 1241 -> 1249 (+0.64%); split: -0.56%, +1.21%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34006>
(cherry picked from commit d1dca26941)
2025-03-15 09:49:05 +01:00
Samuel Pitoiset
0c0f4de8ad radv: emit a dummy PS state for noop FS on GFX12
It seems the hardware requires a dummy PS state with a noop FS,
otherwise it might just hang. This used to work just fine on older
gens.

Note that RadeonSI refuses to draw if VS or PS is missing and AMDVLK
seems to also always emit this state. So, this might be a bug that AMD
didn't encounter at all.

This fixes a GPU hang during loading with Ghostwire: Tokyo.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34070>
(cherry picked from commit 1e4cfd9dfa)
2025-03-15 09:49:05 +01:00
Samuel Pitoiset
ca58bc9d8f aco: do not apply OMOD/CLAMP for pseudo scalar trans instrs
This optimization seems broken because eg. v_s_log_f32 uses SGPRs
for both the source and destination but applying OMOD seems to require
VGPRs.

This fixes a GPU hang when launching Enshrouded on GFX1201.

No fossils db changes on GFX1201.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34027>
(cherry picked from commit f46830912e)
2025-03-15 09:49:04 +01:00
Dave Airlie
9fb0403e27 radv/video: don't try and send events on UVD devices.
This should fix some hangs on polaris when decode is forced on.

Fixes: 95a980b61f ("radv/video: add event support for VCN4")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34013>
(cherry picked from commit 2e3b23539e)
2025-03-15 09:49:04 +01:00
Georg Lehmann
0be9c89310 aco/gfx11.5: remove vinterp ddx/ddy path
While the idea to take advantage of the higher throughput wasn't bad,
the hardware wasn't design with this in mind and doesn't behave like expected
with constant sources.

Fixes: bee487df48 ("aco/gfx11.5+: use vinterp for fddx/fddy")
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33969>
(cherry picked from commit 3b5e537b09)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
22ba337921 radv: update conformance version
A lot of people (including me) misinterpreted the conformanceVersion
field for so long. The Vulkan spec wasn't very clear either but it's
going to be clarified soon.

VkConformanceVersion is actually unrelated to the official CTS
conformance process in Khronos. It just reports the latest CTS version
that the driver can pass, not more.

For GFX8+, RADV should be passing CTS 1.4.0.0 on all GPUs because we
validated this CTS version recently for Vulkan 1.4.

For GFX6-7, which only suppports Vulkan 1.3, RADV should also be
passing CTS 1.4.0.0, because newer versions of the CTS can be used
to validate a driver against an older version of the spec, so
it's perfectly fine to report a higher CTS version than the Vulkan version.

Newer CTS versions likely can't pass 100% due to a DGC bug that I still
need to fix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12799
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34018>
(cherry picked from commit e519e0b9e6)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
e5f0fd5626 radv/amdgpu: fix device deduplication
To correctly deduplicate device inside the winsys, it should use the
fd or amdgpu_device_handle. Using the allocated ac_drm_device as key
is obviously broken.

Not deduplicating devices breaks memory budget and a bunch of games
were broken.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12686
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12775
Fixes: a565f2994f ("amd: move all uses of libdrm_amdgpu to ac_linux_drm")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34005>
(cherry picked from commit c627097841)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
2d9d444aa7 radv: fix a GPU hang with inherited rendering and HiZ/HiS on GFX1201
With secondary command buffers, inherited rendering can be used but
it's basically impossible to know if the depth/stencil attachment
enabled HiZ/HiS. But it's required to disable WALK_ALIGN8 to avoid
GPU hangs.

This assumes that HiZ/HiS is enabled for inherited rendering as long
as a depth/stencil attachment is used. It's not the most optimal
approach but it's not supposed to hurt either.

This fixes a GPU hang with
dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.contents_secondary_cmdbuffers
and friends.

GFX1200 isn't affected because it doesn't support HiZ/HiS.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33986>
(cherry picked from commit d1a2ba57f9)
2025-03-15 09:49:03 +01:00
Natalie Vock
9bcbdbfcf2 radv/rt: Flush L2 after writing internal node offset on GFX12
Otherwise the encoder can read a stale value and make internal nodes
point into leaf space (if 0 is read).

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33985>
(cherry picked from commit a1b0599105)
2025-03-15 09:49:03 +01:00
Natalie Vock
5603cefd94 radv/rt: Guard leaf encoding by leaf node count
For empty BVHs we shouldn't emit any leaf nodes, but there is one
invocation to encode the root node. Guard leaf node encoding so that
invocation doesn't try writing any leaves.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33985>
(cherry picked from commit cdadda2d51)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
a4e6a8fb96 ac,radv: add a workaround for a hw bug with primitive restart on GFX10-GFX10.3
At least, NAVI10, NAVI21 and NAVI24 are affected by this what looks
like a hardware bug when primitive restart is changed and no context
registers are written between draws. It seems the hardware doesn't
consider primitive restart at all in this situation.

Adding SQ_NON_EVENT(0) as suggested by Marek seems to fix it reliably
without introducing any overhead. It's basically a NOP packet that adds
a small delay.

Fixes new VKCTS coverage dEQP-VK.transform_feedback.primitive_restart.*.
Also fixes this old vkd3d-proton issue.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7258
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33929>
(cherry picked from commit 0bc9d59c2e)
2025-03-15 09:49:02 +01:00
Marek Olšák
de3dd2afdf Revert "ac/nir: clamp vertex color outputs in the right place"
This reverts commit b3fc49686e.

It was a rebase failure.

Fixes: b3fc49686e

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482>
(cherry picked from commit 177c9b173e)
2025-03-15 09:49:01 +01:00
Rhys Perry
6999073da9 aco: insert dependency waits in certain situations
This seems to fix some artifacts, but we're not sure why, so it might not
be a correct or optimal solution.

fossil-db (navi31):
Totals from 28424 (35.81% of 79377) affected shaders:
Instrs: 30112910 -> 30348977 (+0.78%); split: -0.00%, +0.78%
CodeSize: 159542980 -> 160485336 (+0.59%); split: -0.00%, +0.59%
Latency: 221438396 -> 221500856 (+0.03%); split: -0.00%, +0.03%
InvThroughput: 38154231 -> 38159984 (+0.02%); split: -0.00%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33853>
(cherry picked from commit 0ec174afd5)
2025-03-15 09:49:01 +01:00
Samuel Pitoiset
5200d13a0f radv: fix re-emitting fragment output state when resetting gfx pipeline state
When switching from pipeline to shader objects.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33840>
(cherry picked from commit 7f6e28db26)
2025-03-04 20:24:03 +01:00
Daniel Schürmann
553ab18656 aco/assembler: Fix short jumps over chained branches
If we insert

   <code>
   s_branch 1
   s_branch Target

at the end of some block, and later hide an additional chained branch
after the existing one, then we have to update the 's_branch 1' to
also jump over the newly added branch.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Closes: #12673
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
(cherry picked from commit 6659db285a)
2025-02-28 22:17:35 +01:00
Hans-Kristian Arntzen
1b6da4ed52 radv: Always set 0 dispatch offset for indirect CS.
Fixes severe glitching in Avowed.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33732>
(cherry picked from commit 13a3f9a972)
2025-02-28 22:17:34 +01:00
Samuel Pitoiset
20bb982788 radv: fix missing SQTT barriers for fbfetch color/depth decompressions
SQTT layout transitions need to be inside SQTT barrier. Otherwise, this
throws an assertion in RADV and might also crash when the capture is
opened with RGP.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12664
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33719>
(cherry picked from commit 67c150bf9e)
2025-02-28 22:17:34 +01:00
Natalie Vock
ea47f98811 radv/rt: Don't allocate the traversal shader in a capture/replay range
We never write the traversal shader address out to shader group handles,
so this is not necessary. On the flipside, it can cause conflicts if the
traversal shader is allocated in a range occupied by a replayed shader.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33711>
(cherry picked from commit 14b902c825)
2025-02-27 18:37:32 +01:00
Georg Lehmann
cb09b3f624 aco/insert_exec: fix continue_or_break on gfx6-7
s_cmp_lg_u64 is gfx8+

Fixes: 115ff5f95b ("aco/insert_exec_mask: don't restore exec in continue_or_break blocks")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33715>
(cherry picked from commit c249556bf4)
2025-02-27 18:37:31 +01:00
Rhys Perry
36e1923284 ac/nir: fix tess factor optimization when workgroup barriers are reduced
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b49eab68a8 ("ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12632
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33645>
(cherry picked from commit 2a3dce1b59)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
f9c3499918 aco/ssa_elimination: insert parallelcopies for p_phi immediately before branch
Totals from 2499 (3.15% of 79377) affected shaders: (Navi31)
Instrs: 6011729 -> 6011761 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31573216 -> 31574236 (+0.00%); split: -0.00%, +0.00%
Latency: 83364734 -> 83365781 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 13545643 -> 13545783 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 302678df91)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
4118fef567 aco/insert_exec_mask: don't restore exec in continue_or_break blocks
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 115ff5f95b)
2025-02-27 18:37:29 +01:00
Daniel Schürmann
1bb39be75e aco/insert_exec_mask: Don't immediately set exec to zero in break/continue blocks
Instead, only indicate that exec should be zero and do
so in the successive helper block. This allows to insert
the parallelcopies from logical phis directly before the
branch in break and continue blocks.

Totals from 56 (0.07% of 79377) affected shaders: (Navi31)
Latency: 2472367 -> 2472422 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 253053 -> 253055 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 7f7c1d463a)
2025-02-27 18:37:28 +01:00
Daniel Schürmann
c4d64f9f83 aco/scheduler: always respect min_waves on GFX10+
It could theoretically happen that for large workgroups,
the scheduler used more registers than allowed.

No fossil changes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
(cherry picked from commit 676b39d31f)
2025-02-21 17:07:28 +01:00