Commit graph

10167 commits

Author SHA1 Message Date
Konstantin Seurer
97f71420df radv/bvh: Fix comment
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Konstantin Seurer
100616859e radv/bvh: Remove some unused variables
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Konstantin Seurer
f00b25331a radv/bvh: Make sure the AABB is written before internal_ready_count
Otherwise, the next stage can read garbage. Fixes flickering in The
Witcher 3.

Closes: #13145
Closes: #13196
Fixes: 2d48b2c ("radv: Use subgroup OPs for BVH updates on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
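The ordering requirement in the commit above (AABB written before internal_ready_count) can be illustrated with a CPU-side analogy. This is only a sketch using C11 atomics; the actual fix lives in a GPU shader and its barriers are not shown here. The struct layout and the function names init_node, publish_bounds and try_read_bounds are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not RADV's actual node layout. */
struct aabb {
   float min[3];
   float max[3];
};

struct internal_node {
   struct aabb bounds;               /* payload read by the next stage */
   atomic_uint internal_ready_count; /* signals that the payload is valid */
};

void init_node(struct internal_node *node)
{
   assert(node);
   atomic_init(&node->internal_ready_count, 0);
}

/* Producer: write the AABB first, then bump the counter with release
 * ordering so the payload is visible before the increment. */
void publish_bounds(struct internal_node *node, const struct aabb *bounds)
{
   assert(node && bounds);
   node->bounds = *bounds;
   atomic_fetch_add_explicit(&node->internal_ready_count, 1,
                             memory_order_release);
}

/* Consumer: read the AABB only once the counter says it is ready.
 * The acquire load pairs with the release increment above. */
bool try_read_bounds(struct internal_node *node, unsigned expected,
                     struct aabb *out)
{
   assert(node && out);
   if (atomic_load_explicit(&node->internal_ready_count,
                            memory_order_acquire) < expected)
      return false;
   *out = node->bounds;
   return true;
}
```

If the counter were incremented before the AABB store, a consumer that observed the new count could still read stale bounds, which matches the "garbage" described in the commit message.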
Konstantin Seurer
f42d52f922 radv: Flush L2 on GFX12 when binding an update pipeline
This is just for completeness since the flush above is probably
sufficient.

Fixes: 2d48b2c ("radv: Use subgroup OPs for BVH updates on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Hans-Kristian Arntzen
e674823d55 radv: Consider that DGC might need shader reads of predicated data.
As with the indirect draw barrier, conditional rendering access needs
similar fixups.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
2025-05-15 06:14:46 +00:00
Samuel Pitoiset
3ca2f71f3d radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

But consider conditional rendering enabled in the stateCommandBuffer,
which is used by preprocess() and in which execute() is also recorded.
If preprocess() is recorded in a different cmdbuf that is submitted
before the cmdbuf containing execute(), the fallback (i.e. alloc +
COPY_DATA) is performed afterwards. This would cause the predicate
value to always be 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
e2625fa9ca radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
69ff204422 radv: remove the optimization for equal immutable samplers
This optimization used to reduce the space allocated for descriptors
when immutable samplers are equal. However, it was basically broken:

- descriptor copies were broken for combined image samplers (or samplers)
  with equal immutable samplers because 96 bytes were copied instead of
  64 bytes (cf. the linked ticket). This could be fixed but it's not
  worth it.
- the value returned by vkGetDescriptorLayoutSupport() was broken, it
  should have been 96 with no immutable samplers (or when they aren't
  equal)

This optimization was also not applied for descriptor buffers, which are
the default for vkd3d-proton and Zink. DXVK doesn't use descriptor
buffers, but it doesn't use immutable samplers either, so basically only
native Vulkan games would be concerned.

Note that immutable samplers would still be inlined in shaders when
there is no indirect access, which should be 99.9% of use cases.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11165
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34928>
2025-05-13 16:27:22 +00:00
Samuel Pitoiset
9a07ccbc89 radv: fix emitting dynamic viewports/scissors when the count is static
In a scenario where the viewports/scissors are a dynamic state but the
count is static (ie. updated when a graphics pipeline is bound), the
driver wasn't considering that and it was re-emitting the previous
number of viewports/scissors.

This fixes a rendering issue with Blender.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34921>
2025-05-13 16:08:14 +00:00
David Rosca
5fee04bcae radv/video: Use ac_uvd_alloc_stream_handle
ac_uvd_alloc_stream_handle tries to avoid collisions in the case
when PID is not unique (eg. in sandboxes like Flatpak).

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12607
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
2025-05-13 09:36:48 +00:00
Natalie Vock
e32a90b57c radv,driconf: Add radv_force_64k_sparse_alignment config
Needed by DOOM: The Dark Ages.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34944>
2025-05-13 07:58:03 +00:00
Samuel Pitoiset
4b73d7e817 radv: fix SDMA copies for linear 96-bits formats
The hardware requires a power-of-two bpe. To achieve that, the driver
needs to adjust the pitch/offset/extent based on a texel scale factor
which only applies to 96-bit formats.

This fixes new VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34927>
2025-05-13 06:15:55 +00:00
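The texel scale adjustment described in the commit above can be sketched as follows. This is a rough illustration, not RADV's code: the struct linear_copy, its fields, and the functions texel_scale and adjust_for_sdma are hypothetical, and the sketch assumes the scaling is applied only along the x axis, treating each 12-byte texel as three 4-byte elements.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a linear copy; not RADV's actual struct. */
struct linear_copy {
   uint32_t bpe;      /* bytes per element */
   uint32_t pitch;    /* row pitch, in elements */
   uint32_t offset_x; /* x offset, in elements */
   uint32_t width;    /* extent width, in elements */
};

/* Scale factor: 3 for 96-bit (12-byte) texels, 1 otherwise. */
uint32_t texel_scale(uint32_t bpe)
{
   return bpe == 12 ? 3 : 1;
}

/* Re-express the copy in power-of-two-sized elements. */
struct linear_copy adjust_for_sdma(struct linear_copy copy)
{
   const uint32_t scale = texel_scale(copy.bpe);

   copy.bpe /= scale;
   copy.pitch *= scale;
   copy.offset_x *= scale;
   copy.width *= scale;

   /* After the adjustment, bpe is expected to be a power of two. */
   assert(copy.bpe != 0 && (copy.bpe & (copy.bpe - 1)) == 0);
   return copy;
}
```

The number of bytes copied per row stays the same (width * bpe is unchanged); only the unit in which the copy is expressed changes.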
Konstantin Seurer
2d48b2cb47 radv: Use subgroup OPs for BVH updates on GFX12
This patch changes the update code to launch 8 invocations for every
internal node. The internal nodes update their child leaf nodes using
the geometry index and primitive index stored inside the primitive node.

Processing 8 child nodes in parallel is faster than looping over them.
Moving to one dispatch that updates all nodes in one go lets us get rid
of atomics and will also enable updatable BVHs to use pair compression.

Improves Elden Ring (high settings, max RT settings, 1080p) by around
10%.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:31 +02:00
Konstantin Seurer
c6fdf11303 radv: Make radv_update_memory non-static
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:25 +02:00
Konstantin Seurer
8157f84246 radv: Refactor the update scratch layout code
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:06 +02:00
Konstantin Seurer
b2aa0647d5 radv: Use a specialized shader for in place updates
If src == dst, we only need to update aabbs for the internal nodes.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:00 +02:00
Konstantin Seurer
e1110d20f8 vulkan: Add acceleration structure update keys
The driver can use an optimized shader when src == dst.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:44:56 +02:00
Samuel Pitoiset
219a2b1e32 radv: ignore radv_zero_vram=true if zeroInitialDeviceMemory is enabled
This lets applications like vkd3d-proton take full control.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Samuel Pitoiset
21badbf336 radv: advertise VK_EXT_zero_initialize_device_memory
Only expose this extension when AMDGPU supports zerovram allocations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Samuel Pitoiset
eaf646d020 radv: implement VK_EXT_zero_initialize_device_memory
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Daniel Schürmann
fa4eb37bf6 radv: move terminate{_if} out of loops.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33479>
2025-05-09 17:20:29 +00:00
Georg Lehmann
6f4e26e54d radv/gfx12+: enable VK_KHR_shader_bfloat16
GFX11 seems to have precision issues, so don't enable the extension there for now.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
7716e63cd6 radv/nir/lower_cmat: handle bf16 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
78524837c1 radv/nir/opt_cmat: support bfloat16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
e8f5c335ff radv,aco,nir: keep the A and B base type for cmat_muladd_amd
With bfloat16, and the two fp8 formats in the future, using just the bit size
to identify the types is no longer possible.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Konstantin Seurer
c21e1776b3 radv: Use build flags instead of defines
Using the meta framework makes managing shader variants much easier.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Konstantin Seurer
33ac143779 vulkan: Introduce VK_BUILD_FLAG for specializing BVH build shaders
The advantage of using spec constants is that we do not have to include
multiple spirv binaries for multiple variants of a build stage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Samuel Pitoiset
ae6d3df139 radv,aco: dump more SQ_WAVE registers from the trap handler on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
0e73c85424 radv: fix configuring TRAP_PRESENT for compute shaders on GFX12
It no longer exists and it's been replaced by DYNAMIC_VGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
effa563bb0 radv: adjust computing the PC from the trap handler on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
4b76d04f7f radv: ignore conditional rendering with vkCmdTraceRays*
CmdTraceRays is neither a dispatch nor a draw command, which means it
shouldn't be affected by conditional rendering.

Fixes recent VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34868>
2025-05-08 19:08:10 +00:00
Samuel Pitoiset
b7d2cdd2b4 radv: ignore radv_disable_dcc_stores on GFX12
It's not necessary because DCC is completely transparent to the
userspace driver. Also it's causing issues with scanout.

This fixes rendering issues with scanout in Indiana Jones.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12924
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34859>
2025-05-08 17:17:28 +02:00
Rhys Perry
2704a30df0 radv: perform nir_opt_access before the first radv_optimize_nir
Two lowered loads might not be CSE'd after nir_lower_explicit_io if one of
them is shrunk. This doesn't happen for deref loads, but it needs the
CAN_REORDER flag first.

fossil-db (gfx1201):
Totals from 556 (0.70% of 79377) affected shaders:
MaxWaves: 14936 -> 14940 (+0.03%); split: +0.05%, -0.03%
Instrs: 2140334 -> 2140942 (+0.03%); split: -0.07%, +0.10%
CodeSize: 11137948 -> 11145416 (+0.07%); split: -0.07%, +0.13%
SpillSGPRs: 2385 -> 2527 (+5.95%); split: -0.34%, +6.29%
Latency: 12310570 -> 12305011 (-0.05%); split: -0.08%, +0.04%
InvThroughput: 2136142 -> 2135516 (-0.03%); split: -0.06%, +0.03%
VClause: 47419 -> 47420 (+0.00%); split: -0.01%, +0.01%
SClause: 58423 -> 58290 (-0.23%); split: -0.36%, +0.14%
Copies: 160626 -> 161321 (+0.43%); split: -0.25%, +0.68%
Branches: 69693 -> 69710 (+0.02%); split: -0.04%, +0.06%
PreSGPRs: 34824 -> 34945 (+0.35%); split: -0.24%, +0.58%
PreVGPRs: 28682 -> 28649 (-0.12%); split: -0.36%, +0.24%
VALU: 1080800 -> 1081171 (+0.03%); split: -0.04%, +0.08%
SALU: 353112 -> 353770 (+0.19%); split: -0.15%, +0.34%
SMEM: 81587 -> 81364 (-0.27%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
8abb787c6b radv/gfx12: use dword3 smem loads for push constants
fossil-db (gfx1201):
Totals from 5 (0.01% of 79377) affected shaders:
(no affected stats)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Marek Olšák
b960137ebf aco: remove unused aco_shader_info::tcs_offchip_layout
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>
2025-05-08 02:54:13 +00:00
Marek Olšák
f58c0cbb6a nir: split *_accessed_indirectly* bitmasks into *_read/written_indirectly*
for AMD

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>
2025-05-08 02:54:12 +00:00
Konstantin Seurer
84b9c281fe radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices
VK_ERROR_INITIALIZATION_FAILED will fail physical device enumeration.
Returning VK_ERROR_INCOMPATIBLE_DRIVER means that the driver can still
be used on supported GPUs when multiple GPUs are installed.

Cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34783>
2025-05-07 08:26:33 +02:00
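The difference between the two error codes in the commit above can be sketched with a simplified enumeration loop. This is an assumption about how an enumeration loop might treat the results, not a copy of Mesa's code. The enum values mirror the VkResult values from the Vulkan headers; the enum itself and enumerate_devices are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins that mirror VkResult values from the Vulkan headers. */
enum result {
   RESULT_SUCCESS = 0,                      /* VK_SUCCESS */
   RESULT_ERROR_INITIALIZATION_FAILED = -3, /* VK_ERROR_INITIALIZATION_FAILED */
   RESULT_ERROR_INCOMPATIBLE_DRIVER = -9,   /* VK_ERROR_INCOMPATIBLE_DRIVER */
};

/* Hypothetical loop: probe_results[i] is what creating device i returned. */
enum result enumerate_devices(const enum result *probe_results, size_t count,
                              size_t *num_usable)
{
   assert(probe_results && num_usable);
   *num_usable = 0;

   for (size_t i = 0; i < count; i++) {
      if (probe_results[i] == RESULT_ERROR_INCOMPATIBLE_DRIVER)
         continue; /* unsupported GPU: skip it and keep going */
      if (probe_results[i] != RESULT_SUCCESS)
         return probe_results[i]; /* any other error fails enumeration */
      (*num_usable)++;
   }
   return RESULT_SUCCESS;
}
```

With one unsupported and one supported GPU, returning the incompatible-driver code for the first device still leaves the second one usable, whereas an initialization failure aborts the whole enumeration.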
Samuel Pitoiset
1aa5fd5da2 radv: promote VK_EXT_robustness2 to VK_KHR_robustness2
This is a 1:1 promotion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34810>
2025-05-05 15:02:19 +00:00
Samuel Pitoiset
0684dc5fa8 radv: fix GPU hangs with image copies for ASTC/ETC2 formats on transfer queue
Emitting compute dispatches on SDMA just hangs. It might be needed
to switch to gang submit for these to work but fixing the GPU hang is
more important for now.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34805>
2025-05-05 13:50:25 +00:00
Samuel Pitoiset
1356d20042 radv: disable SINGLE clear codes to workaround a hw bug with DCC on GFX11
This fixes a very weird cache-related corruption with DCC on GFX11 due
to a hw bug according to PAL.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12932
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34790>
2025-05-05 07:07:58 +00:00
Samuel Pitoiset
55ad0fd35c radv: do not clear unwritten color attachments with dual-source blending
This is incorrect because the color format at slot 0 needs to be
replicated to the slot 1. But with dual-source blending the colors
written mask is only 0xf and this was clearing the color format at
slot 1.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13082
Fixes: e1483d022b ("radv: clear unwritten color attachments for monolithic PS earlier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34773>
2025-05-05 06:46:32 +00:00
Marek Olšák
7f0de1a512 ac: remove gfx11_emulate_clear_state
We don't use CLEAR_STATE on gfx11 anymore.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34589>
2025-05-02 18:40:11 +00:00
Philip Rebohle
4cb358f1c2 radv: Remove offset parameter from radv_make_texel_buffer_descriptor.
This is already only used in vkCreateBufferView, and causes a vkd3d-proton
test to fail with >4GB offsets since the parameter was 32-bit only.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34760>
2025-05-02 09:13:14 +00:00
Paul Gofman
96765935e8 radv/amdgpu: Fix hash key in radv_amdgpu_winsys_destroy().
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34774>
2025-05-02 07:51:23 +00:00
Ricardo Garcia
bc44d029df radv: Ignore image barrier queue families if equal
The src and dst queue family indices in an image memory barrier may
contain arbitrary values that can be ignored unless both are different.
This fixes a crash in upcoming CTS tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34691>
2025-04-29 08:15:28 +00:00
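The rule from the commit above (equal src and dst queue family indices carry no meaning) can be sketched as a small normalization step. QUEUE_FAMILY_IGNORED has the same value as VK_QUEUE_FAMILY_IGNORED in the Vulkan headers; normalize_queue_families is a hypothetical helper, not RADV's function.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as VK_QUEUE_FAMILY_IGNORED in the Vulkan headers. */
#define QUEUE_FAMILY_IGNORED (~0u)

/* Hypothetical helper: when both indices are equal, their values may be
 * arbitrary (even out of range), so replace them with the "ignored"
 * marker before anything tries to interpret them. */
void normalize_queue_families(uint32_t *src, uint32_t *dst)
{
   assert(src && dst);
   if (*src == *dst) {
      *src = QUEUE_FAMILY_IGNORED;
      *dst = QUEUE_FAMILY_IGNORED;
   }
}
```

Normalizing up front means later code never uses an arbitrary index to look up a queue family, which is the kind of access that could crash.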
Samuel Pitoiset
1fccc09abe radv: fix re-emitting VRS state when rendering begins
This state also depends on whether a VRS attachment is used.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11693
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34735>
2025-04-29 07:00:09 +00:00
Timur Kristóf
3ad385b9cc radv: Clear dirty flag for clip rects state after emitting it.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 0ba3a8b3cc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>
2025-04-24 15:13:44 +00:00
Timur Kristóf
3a05477ac6 radv: Clear dirty flag for MSAA state after emitting it.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 08918f0880
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13022
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>
2025-04-24 15:13:44 +00:00
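The two dirty-flag fixes above follow a common pattern that can be sketched like this. The flag names, struct cmd_state and emit_msaa_state are hypothetical and only illustrate the pattern; the counter stands in for actually writing registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dirty bits; not RADV's actual flag names. */
#define DIRTY_MSAA_STATE (1u << 0)
#define DIRTY_CLIP_RECTS (1u << 1)

struct cmd_state {
   uint32_t dirty;      /* bitmask of states that still need emitting */
   unsigned msaa_emits; /* counts emissions; stands in for register writes */
};

void emit_msaa_state(struct cmd_state *state)
{
   assert(state);
   if (!(state->dirty & DIRTY_MSAA_STATE))
      return; /* nothing changed since the last emit */

   state->msaa_emits++;

   /* Clear the bit once the state is emitted. Without this line the
    * state would be re-emitted on every subsequent draw. */
   state->dirty &= ~DIRTY_MSAA_STATE;
}
```

Calling emit_msaa_state() twice in a row emits only once, and unrelated dirty bits are left untouched.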
Georg Lehmann
6d2190300a radv/nir/lower_cmat: tightly pack 8bit gfx11 acc matrix
Invalid for now, but used by vkd3d-proton, where the use case is to convert
a result matrix to lower precision, followed by a store.

For 16bit accumulation matrices, GFX11 only uses 16bits per 32bit register.
RADV's coop matrix code pads the unused space with undefs and uses a vector
with twice as many elements as the matrix length. Extending that to 8bit by
leaving 24 bits unused is unnecessary, as there is no hw unit that requires
it. And in wave32, it would also result in vectors larger than NIR's limit.
So tightly pack 8bit matrices without any undef padding.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>
2025-04-24 06:37:44 +00:00
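The vector-length argument in the 8bit packing commit above can be checked with a little arithmetic. This sketch rests on assumptions that are not stated in the commit: a 16x16 matrix, a wave size of 32, and a NIR vector limit of 16 components. All names are hypothetical.

```c
#include <assert.h>

#define MATRIX_DIM 16         /* assumed 16x16 matrix */
#define WAVE_SIZE 32          /* wave32 */
#define MAX_VEC_COMPONENTS 16 /* assumed NIR vector component limit */

/* Number of matrix elements held by each invocation. */
unsigned elems_per_lane(void)
{
   return MATRIX_DIM * MATRIX_DIM / WAVE_SIZE;
}

/* Vector length when every element occupies a full 32-bit register and
 * the remaining bits are filled with undef components of the same size. */
unsigned padded_vec_len(unsigned bit_size)
{
   assert(bit_size == 8 || bit_size == 16 || bit_size == 32);
   return elems_per_lane() * (32 / bit_size);
}

/* Vector length when elements are tightly packed without padding. */
unsigned packed_vec_len(void)
{
   return elems_per_lane();
}
```

Under these assumptions, 16bit padding gives 16 components (twice the matrix length, right at the limit), 8bit padding would give 32 components (over the limit), and tight packing gives 8.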
Georg Lehmann
bbc9bc9d24 radv/nir/lower_cmat: use cmat_mul instead of duplicating hw details for type conversion
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>
2025-04-24 06:37:44 +00:00