Commit graph

17763 commits

Author SHA1 Message Date
Samuel Pitoiset
22e06d65d7 radv: make sure to zero-initialize image view descriptors
This prevents a regression from the next commit which would write
garbage for combined image+sampler descriptors and that might break
capture&replay.

It seems also more robust to write zeroes than garbage overall.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35455>
2025-06-13 06:27:25 +00:00
Marek Olšák
0cbcb72869 nir/opt_vectorize_io: work around a 16-bit IO bug for RADV
If nir_opt_vectorize_io isn't called, 16-bit IO is broken.
This is a workaround to keep RADV working and consume incorrect NIR
while other drivers consume correct NIR.

Hopefully this will be removed ASAP.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:37 +00:00
Peyton Lee
75736aa494 amd/gmlib: remove the executable bit
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
remove the executable bit for all files.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35361>
2025-06-12 07:44:27 +00:00
Peyton Lee
fd1930b035 amd: add vpe_version
vpe_version describes which generation of vpe capabilities a chip has.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35361>
2025-06-12 07:44:27 +00:00
Peyton Lee
47163fa8d3 radeonsi/vpe: enhance scaling quality
add support for lanczos coefficients
which enhaces the quality of scaling down

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35361>
2025-06-12 07:44:26 +00:00
Collabora's Gfx CI Team
350eccd032 Uprev Piglit to a0a27e528f643dfeb785350a1213bfff09681950
685ea49b47...a0a27e528f

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35303>
2025-06-11 21:14:59 +00:00
Pierre-Eric Pelloux-Prayer
7280e3b2a1 radeonsi/tests: update expected results
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:28 +00:00
Pierre-Eric Pelloux-Prayer
3bcbd11a33 aco/isel: fix visit_tex handling of is_sparse
For cases when less than 4 components are read, the original code
would compute an incorrect dmask. eg: with a single component + is_sparse,
the dmask was 0x13:
  - 0x 3 = coming from nir_def_components_read
  - 0x10 = the sparse bit
While it should have at 2 bits set (1 for the color/depth, 1 for tfe).

This caused problem when expand_vector() used the dmask to generate
the final results, because the value for the sparse component was
read from the wrong index.

So after the call to emit_mimg() dmask needs to be adjusted
because the components will be stored in order, so if mask is 0x11
the tfe value would be stored at invalid index=5 (while it should
be at index=1).

This fixes KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupResidency_texture_2d_depth_component16
and KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16
with ACO.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:28 +00:00
Pierre-Eric Pelloux-Prayer
4a84ebfcb1 ac/llvm: rework component trimming in visit_tex
The referenced commit was a step in the right direction, but not
complete.

ac_build_image_opcode returns a vec<4> or a struct<vec<4>, int>
so we can simplify visit_tex. We just need to map these 4/5 values
to the expected layout from NIR.
eg: depth + TFE would produces "<d, x, x, x>, t" so it has to be
transformed into <d, t>.

nir_texop_fragment_mask_fetch_amd + sparse doesn't exist, so it's
another opportunity for simplification.

This is required to get KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16
working properly.
The same test fails with ACO so it probably needs a change in the
same area.

Fixes: c0ef2aa7f8 ("DEPENDENCY: ac/llvm: fix sparse code handling")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:28 +00:00
Pierre-Eric Pelloux-Prayer
1cc52dff05 radeonsi: allow sparse depth textures
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:28 +00:00
Pierre-Eric Pelloux-Prayer
b153188f25 amd/ci: remove references to tests that don't exist anymore
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:27 +00:00
Pierre-Eric Pelloux-Prayer
0e9ba3031e radeonsi: allow msaa sparse textures on gfx10+
The hardware doesn't support the prt layouts, but we can use normal
layouts and ac_surface_addr_from_coord to determince which pages
need to be committed.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>
2025-06-11 12:11:27 +00:00
Rhys Perry
bc2edf14d8 ac/nir: run nir_lower_vars_to_ssa after nir_lower_task_shader
nir_lower_task_shader does nir_lower_returns, so we need this if the
launch_mesh_workgroups was in control flow.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13326
Backport-to: 25.1
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35411>
2025-06-11 09:01:39 +00:00
Samuel Pitoiset
3b326abf7b radv: add capture/replay for sparse buffers and descriptor buffer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Shouldn't be super useful in practice because the normal capture/replay
BDA path should also work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35366>
2025-06-11 07:31:29 +00:00
Samuel Pitoiset
643e1c4395 radv: cleanup creating sparse buffers with capture/replay
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35366>
2025-06-11 07:31:29 +00:00
Samuel Pitoiset
74acae0ed8 radv: stop setting the address for capture/replay and non-sparse buffers
This doesn't do anything because for non-sparse buffers, a device
memory object must be bound to the buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35366>
2025-06-11 07:31:28 +00:00
Samuel Pitoiset
ee200cc0d1 radv: stop using vk_common entrypoints when not necessary
For less indirections.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35359>
2025-06-11 07:10:02 +00:00
Samuel Pitoiset
7d2f20b2fb radv: remove useless vk_common_entrypoints.h includes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35359>
2025-06-11 07:10:02 +00:00
Samuel Pitoiset
f3578973d7 radv/meta: fix using the wrong pipeline layout for ASTC decoding
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35359>
2025-06-11 07:10:01 +00:00
Eric Engestrom
0a4a47b92f radeonsi/ci: document flakes seen over the last week
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35432>
2025-06-10 13:58:25 +00:00
Eric Engestrom
9d7bd8e78d radv/ci: document flakes seen over the last week
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35432>
2025-06-10 13:58:25 +00:00
Samuel Pitoiset
6fac587aa2 radv: use 32 bytes descriptor for sampled/input attachment images on GFX11+
FMASK has been removed since GFX11+ and using 32 bytes can save a lot
of memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19621>
2025-06-10 08:49:09 +00:00
Samuel Pitoiset
2797efb12d radv: remove dead code in radv_CreateDescriptorSetLayout()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19621>
2025-06-10 08:49:09 +00:00
Valentine Burley
5ee7a4c1e9 ci: Uprev GL & GLES CTS
Update to the newest releases.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13076
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34321>
2025-06-10 07:56:46 +00:00
Georg Lehmann
f36ac8434c aco: add a readme entry for v_pk_cvt_u8_f32
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:05 +00:00
Georg Lehmann
94c191e6d9 aco: remove p_v_cvt_pk_u8_f32
Now unused.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:04 +00:00
Georg Lehmann
d95e90ab5f aco: do not use v_cvt_pk_u8_f32 for f2u8
The ISA docs don't mention this, but instead of always truncating
like other integer conversions, this opcode actually uses the single
precision rounding mode.

We could continue to use the opcode and set the rounding mode to rtz
in lower_to_hw_instrs, but I think I should just concede that f2u8
isn't worth the effort.

Fixes: 9bb10b58 ("aco: use v_cvt_pk_u8_f32 for f2u8")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:04 +00:00
Valentine Burley
519ecf372d radv/ci: Add a pre-merge vkd3d job on Raven
Introduce a new, pre-merge vkd3d-proton job on Raven.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35167>
2025-06-10 06:33:10 +00:00
Samuel Pitoiset
d98533630b radv: stop using multiview with DGC
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
DGC doesn't support multiview. The Vulkan spec says:

"VUID-vkCmdExecuteGeneratedCommandsEXT-None-11062
 If a rendering pass is currently active, the view mask must be 0."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35342>
2025-06-10 06:15:00 +00:00
Marek Olšák
edd2fc3c7f radeonsi: use AC_EXP_PARAM_UNDEFINED for clarity
The code was slightly confusing.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35392>
2025-06-10 03:31:20 +00:00
Marek Olšák
447d744833 ac/llvm: allocate LLVM PS output variables on demand
This stops relying on si_shader_info, allowing further cleanup of
si_shader_info.

radv_load_output was unused.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35392>
2025-06-10 03:31:20 +00:00
Dave Airlie
b8ac2d47e7 radv/video: add KHR_video_decode_vp9 support.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This adds the VP9 decoding support.

This was initially developed by me,

Stéphane Cerveau from Igalia did a bunch of fixes and testing,
Benjamin Cheng from AMD also helped with a few fixes and how
to program the firmware better.

This passes the current VK-GL-CTS tests.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35398>
2025-06-09 20:46:04 +00:00
Dave Airlie
4399e43ffd ac/vcn: add new firmware flag to pass uncompresed header offset.
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35398>
2025-06-09 20:46:04 +00:00
Dave Airlie
a0f4cbe6f7 amd: move vp9 probs table to common code.
This will be reused by radv eventually, so let's move it all
over to common code. It might have other users eventually,
but we can worry about that later.

Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35398>
2025-06-09 20:46:03 +00:00
Natalie Vock
a28515f096 aco/opt: Rename loop header phis
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fossil stats on top of !35269:
Totals from 133 (0.16% of 81077) affected shaders:

Instrs: 4328456 -> 4327891 (-0.01%)
CodeSize: 22890004 -> 22887732 (-0.01%); split: -0.01%, +0.00%
Latency: 28406452 -> 28404732 (-0.01%)
InvThroughput: 5361458 -> 5361153 (-0.01%)
Copies: 376788 -> 376222 (-0.15%)
VALU: 2429210 -> 2428645 (-0.02%)
VOPD: 57 -> 56 (-1.75%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35270>
2025-06-09 14:36:44 +00:00
Rhys Perry
00dd0d0dd1 aco: update VALUReadSGPRHazard comment
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>
2025-06-09 10:12:25 +00:00
Rhys Perry
a714a19e16 aco/gfx12: fix VALUReadSGPRHazard with carry-out
fossil-db (gfx1201):
Totals from 370 (0.46% of 79653) affected shaders:
Instrs: 3933639 -> 3935914 (+0.06%)
CodeSize: 20743448 -> 20752068 (+0.04%); split: -0.00%, +0.04%
Latency: 26261246 -> 26261921 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 5363675 -> 5363760 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 65f95ae74e ("aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>
2025-06-09 10:12:25 +00:00
Marek Olšák
d279d019d4 ac/nir/tess: remove parameter from and simplify hs_per_patch_output_vmem_offset
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
5734a916d6 ac: move tcs_offchip_layout into ac_shader_args
It's the same variable between radv and radeonsi, but the implementation of
the load intrinsics is very different.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
5994e08f8b ac: set LDS limit for TCS to 32K for all chips
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
fa5e07d5f7 ac/nir/tess: write TCS patch outputs to memory as vec4 stores at the end
This moves per-patch output VMEM stores to the end of the shader where they
execute only once. They are skipped if the whole workgroup discards
all patches.

If tcs_vertices_out == 1, per-patch output VMEM stores use the same lanes
as per-vertex output VMEM stores, which are aligned to 4 or 8 lanes to get
cached bandwidth for the stores.

Previously, per-patch outputs were stored to memory for every store_output
intrinsic in TCS.

Additionally, LDS is no longer allocated for per-patch outputs that are only
written and read by invocation 0, or they are written by all invocations
but not read, and don't have indirect indexing. This reduces LDS usage and
LDS traffic.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
c732306c5a ac/nir/tess: unify computing LDS output patch size, minimize LDS bank conflicts
This unifies the duplicated LDS output patch size computation between
hs_output_lds_offset and ac_nir_compute_tess_wg_info.

"+ 4" to the output patch stride minimizes LDS bank conflicts by making
the beginning of each patch start on a different LDS bank.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
37dc376395 ac/nir/tess: use if-ladder to determine valid tess level components for the vote
Checking whether every compoment is valid in tess_level_has_effect() when
prim_mode is unknown generated too many SALU. Do this instead:

    if (triangles) ...
        subgroup vote for triangles
    else if (quads) ..
        subgroup vote for quads
    else // isoline
        subgroup vote for isolines

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
2f0d9495c5 ac/nir/tess: inline mask helpers
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
10ae5b2fbf ac/nir/tess: rewrite tess level tracking, don't use LDS for more cases
This rewrites tess level value tracking to use the 2-bit masks, which
means LDS allocation is determined separately for outer and inner levels.

LDS is not allocated for tess levels that are only written by invocation 0
and never read or only read by invocation 0. If the number of output
patch vertices is 1, LDS is also not allocated for tess levels.

Tess level outputs for TES are always written as whole vec4 to get cached
bandwidth.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
9d9cfd89da ac/nir/tess: compute the number of remapped VRAM outputs in common code
This unifies it for both drivers.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
ea70060826 ac/nir/tess: stop using tes_inputs_read / tes_patch_inputs read for TCS & TES
use ac_nir_tess_io_info instead

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
c38bc4824f ac/nir/tess: apply no_varying to ac_nir_tess_io_info
This has the effect that no_varying is finally honored for per-patch
outputs, skipping VMEM stores that TES doesn't read.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
42445e271e radv,radeonsi: use ac_nir_tess_io_info for LDS size computation
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00
Marek Olšák
c678844ccb ac/nir/tess: move LDS and VMEM output masks into a new info structure
This will replace LDS and VMEM output size computations in drivers.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>
2025-06-07 16:29:39 +00:00