Commit graph

19102 commits

Author SHA1 Message Date
Georg Lehmann
b2172467d1 aco/gfx10_3: work around NSA hazard
4+ dword NSA can hang if exec becomes non-zero again directly before
the instruction.

Foz-DB Navi21:
Totals from 608 (0.74% of 82161) affected shaders:
Instrs: 945138 -> 946431 (+0.14%)
CodeSize: 5171580 -> 5176864 (+0.10%)
Latency: 13356895 -> 13357113 (+0.00%)
InvThroughput: 3043234 -> 3043236 (+0.00%); split: -0.00%, +0.00%

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9852
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13981
Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38215>
2025-11-05 10:06:04 +00:00
David Rosca
bcb6e6b6e6 radv/video: Fix AV1 bidir compound encode with order_hint disabled
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37911>
2025-11-05 09:44:04 +00:00
David Rosca
96db490318 radv/video: Don't require encode FW version >= interface version
Otherwise this breaks backwards compatibility when bumping interface
version for new features.

Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37911>
2025-11-05 09:44:04 +00:00
David Rosca
1a8a8db8c5 radeonsi/vcn: Fix AV1 bidir compound encode with order_hint disabled
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37911>
2025-11-05 09:44:04 +00:00
Samuel Pitoiset
a0d607bfdb radv,aco: wait for all VMEM loads when the prolog loads large 64-bit attributes
Not the most optimal solution but 64-bit vertex attributes are rarely
used. Could still revisit if we find a real use case that matters.

This fixes recent VKCTS coverage:

dEQP-VK.pipeline.fast_linked_library.vertex_input.component_mismatch.r64g64b64.*_to_dvec2
dEQP-VK.pipeline.shader_object_.*.vertex_input.component_mismatch.r64g64b64.*_to_dvec2

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14243
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38237>
2025-11-05 07:26:45 +00:00
Samuel Pitoiset
ba5bf81aa2 aco: fix reserving VGPRs for 64-bit attributes in VS prologs
Otherwise the fetch index would be overwritten if the attribute format
is 64-bit and more than 2 components are loaded.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14242
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38237>
2025-11-05 07:26:45 +00:00
Georg Lehmann
83e9ae2d5c radv: do not report wave32 in gl_SubgroupSize for Doom Dark Ages
The shaders in question use:

(memory_load + (gl_SubgroupSize - 1)) & ~(gl_SubgroupSize - 1)

My guess is that this is supposed to be the subgroup size of whatever
produced the value, not the subgroup size in this shader.
And because in the consumer the workgroup size is 32, we use wave32.

Fixes: a2d3cbac2a ("radv: determine subgroup/wave size early")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14187

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38214>
2025-11-04 16:52:13 +00:00
Alyssa Rosenzweig
17355f716b treewide: use UTIL_DYNARRAY_INIT
Instead of util_dynarray_init(&dynarray, NULL), just use
UTIL_DYNARRAY_INIT instead. This is more ergonomic.

Via Coccinelle patch:

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray = {0};
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray;
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(&(dynarray), NULL);
    +dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(dynarray, NULL);
    +(*dynarray) = UTIL_DYNARRAY_INIT;

Followed by sed:

    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(&\(.*\), NULL)/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init( &\(.*\), NULL )/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(\(.*\), NULL)/*\1 = UTIL_DYNARRAY_INIT/g' \{} \;"

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38189>
2025-11-04 13:39:48 +00:00
Georg Lehmann
22dc06798b aco/optimizer: never unfuse fma
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This shouldn't change anything in practice, and reducing precision
if precise isn't set is weird.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38183>
2025-11-04 07:54:02 +00:00
Georg Lehmann
6610905b43 aco: allow v_fma_mix with denorms for gfx9 chips where it's fused
Only unfused mad unconditionally flushes denorms.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38183>
2025-11-04 07:54:02 +00:00
Georg Lehmann
75e7fb03a5 aco: fix v_mad_mix denorm behavior
It also flushes fp32 denorms, just like v_mad_f32.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38183>
2025-11-04 07:54:02 +00:00
Samuel Pitoiset
95b9801ad2 radv/sqtt: do not try to resize the SQTT buffer for per-submit captures
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Doesn't make sense to try that and it will fail later anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38188>
2025-11-04 06:57:38 +00:00
Daniel Schürmann
43f1ad308a radv/shader_info: repack and compact struct radv_shader_info
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
using pahole.

Reduces the size of radv_shader_info from 760 bytes to 640 bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37931>
2025-11-03 13:38:38 +01:00
Daniel Schürmann
e1bcbbf3dd radv/shader_info: rename gs_ring_info -> legacy_gs_info and use union with ngg_info
Reduces the size of radv_shader_info from 784 bytes to 760 bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37931>
2025-11-03 13:38:38 +01:00
Daniel Schürmann
9b34da3da8 radv/shader_info: use union for precomputed register values of non-overlapping stages
Reduces the size of radv_shader_info from 872 bytes to 784 bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37931>
2025-11-03 13:38:38 +01:00
Daniel Schürmann
68ab01b2f2 radv/shader_info: remove unused output_usage_mask
Reduces the size of radv_shader_info from 1000 bytes to 872 bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37931>
2025-11-03 13:38:37 +01:00
David Rosca
502f7e85e5 radv/ci: Enable video tests on navi21 and navi31
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38184>
2025-11-03 08:39:59 +00:00
David Rosca
874e02003a radv/video: Only use write_memory for encode feedback with full support
write_memory is used after encoding every frame to mark the feedback
buffer as ready. Only use it when write_memory can work without PCIe
atomics support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38184>
2025-11-03 08:39:59 +00:00
David Rosca
8e1d74bbb4 radv/video: Introduce two levels of write_memory support
Print warning when using write_memory with firmwares that require
PCIe atomics support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38184>
2025-11-03 08:39:59 +00:00
Samuel Pitoiset
f91e5c9cb7 radv: use radv_get_shader_layout() more with ESO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38106>
2025-11-03 08:00:35 +00:00
Samuel Pitoiset
e5513826aa radv: fix creating linked graphics ESOs with a compute shader
It's valid to create shader objects with linked VS/FS and with a
compute shader.

This reworks radv_CreateShadersEXT() completely to allow this. Unlinked
shaders are created first, then linked shaders.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14193
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38106>
2025-11-03 08:00:35 +00:00
Marek Olšák
5d92c92ce5 Revert ABI breakage "amd: Add user queue HQD count to hw_ip info"
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This reverts commit 56d758d321.

It broke ABI between Mesa and libdrm, causing crashes due to stack smashing.

See: https://gitlab.freedesktop.org/mesa/libdrm/-/issues/121#note_3172362

Fixes: 56d758d321
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38203>
2025-11-02 02:54:59 +00:00
Marek Olšák
9125e34372 amd: lower get_ssbo_size in ac_nir_lower_resinfo
The code for lowering get_ssbo_size will be different in future chips,
so do it in common code to reduce duplication in the future.

Lower get_ssbo_size to ssbo_descriptor_amd + nir_channel.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38097>
2025-11-02 01:42:07 +00:00
Samuel Pitoiset
cb4e0c4140 radv: add a workaround for illegal depth/stencil descriptors with No Man's Sky
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Using descriptors with both depth and stencil aspects is illegal in
Vulkan and this hangs the GPU.

Use NULL descriptors to mitigate the issue. Note that AMDVLK also
ignores them.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13325
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38180>
2025-10-31 15:46:55 +00:00
Georg Lehmann
cbf5c881a5 aco/opcodes: remove VOP3 alias for new gfx12 VOP2 opcodes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:03 +00:00
Georg Lehmann
0f54136730 aco/isel: emit vop2 v_lshlrev_b64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:03 +00:00
Georg Lehmann
7ac67e2711 aco/isel: emit vop2 v_max_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:03 +00:00
Georg Lehmann
8397b91934 aco/isel: emit vop2 v_min_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:02 +00:00
Georg Lehmann
2e120d4e26 aco/isel: emit vop2 v_mul_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:01 +00:00
Georg Lehmann
86ea462f4d aco/isel: emit vop2 v_fadd_f64 for gfx12+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:01 +00:00
Georg Lehmann
7d2325b194 aco/lower_to_hw: emit vop2 for gfx12+ fp64 reductions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>
2025-10-31 08:31:00 +00:00
Samuel Pitoiset
968fb06a94 radv,vulkan: replace VK_RENDERING_INPUT_ATTACHMENT_NO_CONCURRENT_WRITES_BIT_MESA
The new flag from maintenance10 has similar meaning.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:23 +00:00
Samuel Pitoiset
c8aaf3f5b5 radv: advertise VK_KHR_maintenance10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
14639898d0 radv: add support for controlling sRGB transfer function with resolves
Just need to use UNORM image views.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
0034f5a948 radv: allow ds<->color copies on compute/transfer queues
This should be supported just fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:22 +00:00
Samuel Pitoiset
49128926d6 radv: implement new input attachment information for dynamic rendering
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:21 +00:00
Samuel Pitoiset
18fec61c8d radv: reverse the logic for NO_CONCURRENT_WRITES_BITS_MESA
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:21 +00:00
Samuel Pitoiset
d3924f5bd6 radv: add support for depth/stencil resolves with vkCmdResolve2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:20 +00:00
Samuel Pitoiset
8306256e2a radv: allow NULL pSamplesMask with vkCmdSetSampleMaskEXT()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:19 +00:00
Samuel Pitoiset
d5d2a4ad07 radv: implement vkCmdEndRendering2KHR()
Common runtime code already does CmdEndRendering()->CmdEndRendering2().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38043>
2025-10-31 07:51:19 +00:00
Marek Olšák
9def0a6e5b ac/nir: set support_indirect_inputs/outputs in common code
This fixes mesh shader performance of RADV for GravityMark by stopping
the lowering of ClipDistance[64][4] indirect access for mesh shader outputs.

The perf improvement is 14% on Navi48.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38155>
2025-10-31 00:57:46 +00:00
Daniel Schürmann
10be538851 tree-wide: don't call nir_opt_constant_folding after nir_lower_flrp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Daniel Schürmann
ef9ecc4058 nir: add nir_imul_nuw() and nir_imul_imm_nuw() helpers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:06 +00:00
Samuel Pitoiset
b2badb2b24 radv: ignore dual-source blending when blending isn't enabled for MRT0
The Vulkan spec says:
    "VUID-vkCmdDraw-maxFragmentDualSrcAttachments-09239
     If blending is enabled for any attachment where either the source
     or destination blend factors for that attachment use the secondary
     color input, the maximum value of Location for any output attachment
     statically used in the Fragment Execution Model executed by this
     command must be less than maxFragmentDualSrcAttachments"

Which means it must be disabled.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14190
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38107>
2025-10-30 07:59:50 +00:00
Samuel Pitoiset
14667eef53 radv: fix reserving enough space for emitting the SPM setup
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
4096 is an arbitrary but large enough number to emit everything needed.

Fixes: 22d73fc077 ("amd,radv,radeonsi: add ac_emit_spm_setup()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38087>
2025-10-30 07:36:27 +00:00
Samuel Pitoiset
0dcb800a07 radv: remove some RADV_DEBUG deprecated options
They have been marked as deprecated in 25.3, so one release cycle
before they are removed completely.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38085>
2025-10-30 07:16:23 +00:00
Georg Lehmann
a17afd5edd aco/tests: add some simple fp64 modifier tests
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>
2025-10-29 17:57:53 +00:00
Georg Lehmann
a54f95c52f aco/optimizer: apply fp64 modifiers
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>
2025-10-29 17:57:53 +00:00
Georg Lehmann
62e664f8c8 aco/optimizer: fix applying 64bit neg/abs
extract is only valid for <=32bit operands.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>
2025-10-29 17:57:53 +00:00
Georg Lehmann
0c8b885e21 aco/isel: emit v_mul_f64 for fp64 fsat
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>
2025-10-29 17:57:52 +00:00