Commit graph

16641 commits

Author SHA1 Message Date
Timur Kristóf
85eab189ee ac/nir: Move ac_nir_opt_pack_half to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:15 +01:00
Timur Kristóf
e79c77b1ef ac/nir: Move ac_nir_gs_shader_query declaration to ac_nir_helpers.h
This is a helper function, so drivers don't need to call it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:13 +01:00
Timur Kristóf
88c951bd46 ac/nir: Move ac_nir_lower_legacy_gs to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:11 +01:00
Timur Kristóf
6dd3f53204 ac/nir: Move ac_nir_lower_legacy_vs to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:10 +01:00
Timur Kristóf
d0e71ac9cd ac/nir: Move ac_nir_lower_intrinsics_to_args to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:08 +01:00
Timur Kristóf
a0b226bafb ac/nir: Expose ac_nir_unpack_value in ac_nir_helpers.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:06 +01:00
Timur Kristóf
1181348e80 ac/nir: Move ac_nir_create_gs_copy_shader to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:04 +01:00
Timur Kristóf
1191408d4b ac: Move ac_nir_config struct to ac_nir.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:03 +01:00
Timur Kristóf
4cad0bc438 ac/nir: Rename emit_streamout to ac_nir_emit_legacy_streamout
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:01 +01:00
Timur Kristóf
015e5080e9 ac: Stop including nir.h in ac_shader_util.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:36 +01:00
Timur Kristóf
305fdfddb5 ac/nir: Move ac_set_nir_options to ac_nir.c
And rename it to ac_nir_set_options to match other functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:34 +01:00
Timur Kristóf
855de0483f ac/nir: Move ac_nir callback functions to ac_nir.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:32 +01:00
Timur Kristóf
cc0166462e ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c
And change its name to indicate that it is NIR specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:30 +01:00
Timur Kristóf
ad5c0b7103 ac/nir: Move ac_nir_lower_bit_size_callback to ac_nir.c
ac_shader_util should not concern itself with NIR stuff.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:28 +01:00
Marek Olšák
7e21b48a2e ac/nir: split ac_nir_lower_ps into 2 passes
It's split into ac_nir_lower_ps_early and ac_nir_lower_ps_late.

ac_nir_lower_ps_early doesn't generate any AMD specific intrinsics except
some system values and is mainly an optimization pass with some lowering.
The new change here is that it also eliminates output components not needed
by spi_shader_col_format.

ac_nir_lower_ps_late lowers output stores to exports and does the bc_optimize
thing.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:25 +01:00
Marek Olšák
62c184c491 ac/nir: remove broadcast_last_cbuf because it can be deduced from NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:22 +01:00
Samuel Pitoiset
94da1edbe4 radv: rename attr_ring to ge_rings
This is better naming.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32994>
2025-01-14 00:59:38 -08:00
Samuel Pitoiset
ab96333490 radv: fix configuring the attribute ring size on GFX12
The attribute ring size per SE is different than GFX11 and it was
already computed correctly in common code but RADV was using the old
GFX11 style.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32994>
2025-01-14 00:59:37 -08:00
Samuel Pitoiset
10e424f586 aco: always use ds_bpermute for shuffle/rotate on GFX12
ds_bpermute supports both 32 and 64 lanes now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32974>
2025-01-13 08:33:38 +00:00
Samuel Pitoiset
b3d4d65f5a radv: fix CP DMA clears/copies on GFX12
CP DMA on GFX12 doesn't always use L2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32971>
2025-01-13 08:07:58 +00:00
Samuel Pitoiset
603541f1a2 ac/gpu_info: add cp_dma_use_L2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32971>
2025-01-13 08:07:58 +00:00
Rhys Perry
2b10930b48 aco: use VOP3 v_mov_b16 if necessary
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32944>
2025-01-10 15:05:00 +00:00
Rhys Perry
46787fc2d0 aco/util: fix bit_reference::operator&=
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32944>
2025-01-10 15:05:00 +00:00
Timur Kristóf
dd980d2b28 radv: Only print "testing use only" message on GFX12+.
This message has been confusing users, especially now that
popular toolkits such as Gtk started using a Vulkan renderer.

Printing a message on non-conformant implementations is also
actually not required. So let's remove it.

We haven't fully finished the GFX12 implementation yet, but on
all other hardware, RADV should work just fine, and is definitely
not meant for "testing use only".

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12314
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32930>
2025-01-09 23:16:48 +00:00
Marek Olšák
e640d5a9c3 amd: vectorize SMEM loads aggressively, allow overfetching for ACO
If there is a 4-byte hole between 2 loads, they are vectorized. Example:
    load 4 + hole 4 + load 8 -> load 16
This helps GLSL uniform loads, which are often sparse. See the code for more
info.
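
As a rough illustration of the "load 4 + hole 4 + load 8 -> load 16" example above, the following is a minimal sketch, not the code from this commit. The struct load_range type, the try_merge_loads helper and the max_hole parameter are hypothetical names; the real pass presumably also checks alignment and maximum load size, which are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one scalar load: byte offset and byte size. */
struct load_range {
   uint32_t offset;
   uint32_t size;
};

/* Try to merge load b into load a (a must not start after b).
 * A gap of up to max_hole bytes between the two loads is tolerated,
 * which means the merged load overfetches the bytes in the hole.
 */
static bool
try_merge_loads(struct load_range a, struct load_range b, uint32_t max_hole,
                struct load_range *merged)
{
   assert(merged != NULL);

   if (b.offset < a.offset)
      return false;

   uint32_t a_end = a.offset + a.size;
   if (b.offset > a_end && b.offset - a_end > max_hole)
      return false; /* the hole is too large to overfetch */

   uint32_t b_end = b.offset + b.size;
   merged->offset = a.offset;
   merged->size = (a_end > b_end ? a_end : b_end) - a.offset;
   return true;
}
```

With max_hole set to 4, a 4-byte load at offset 0 and an 8-byte load at offset 8 merge into one 16-byte load, while the same pair separated by an 8-byte hole is left alone.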

RADV could get better code by vectorizing later.

radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355)
  Spilled SGPRs: 841 -> 747 (-11.18 %)
  Code Size: 67552396 -> 65291092 (-3.35 %) bytes
  Max Waves: 714439 -> 714520 (0.01 %)

This should have no effect on LLVM because ac_build_buffer_load scalarizes
SMEM, but it's improved for some reason:

radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355)
  Spilled SGPRs: 1450 -> 1282 (-11.59 %)
  Spilled VGPRs: 106 -> 107 (0.94 %)
  Scratch size: 101 -> 102 (0.99 %) dwords per thread
  Code Size: 14994624 -> 14956316 (-0.26 %) bytes
  Max Waves: 66679 -> 66735 (0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
2025-01-09 22:01:54 +00:00
Marek Olšák
abd5216ae8 ac,radeonsi: scalarize overfetching loads
There is nothing preventing ACO from generating loads with unused
components. This happens often with GLSL uniforms. Some of those loads
are partially re-vectorized after this.

radeonsi+ACO:

TOTALS FROM AFFECTED SHADERS (19564/58918)
  VGPRs: 732900 -> 728448 (-0.61 %)
  Spilled SGPRs: 429 -> 433 (0.93 %)
  Code Size: 38446004 -> 38485612 (0.10 %) bytes
  Max Waves: 305440 -> 305549 (0.04 %)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
2025-01-09 22:01:54 +00:00
Marek Olšák
58a88bbdb9 ac/nir/ngg: export positions after streamout to improve performance
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
fc73749d6c ac/nir/ngg: fold so_vertex_index * so_stride into immediate offset
Instead of using a different voffset VGPR per streamout vertex,
point voffset to the first vertex for all 3 vertices because
the stride and vertex index are constant and can be in the immediate
offset.
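
The address arithmetic described above can be sketched as follows. This is only an illustration under assumed names (so_imm_offset and so_store_address are hypothetical), showing that one voffset pointing at the first vertex is enough once vertex_index * stride moves into the constant part.

```c
#include <assert.h>
#include <stdint.h>

/* Constant part of the address: known at compile time because both the
 * vertex index (0..2) and the stride are constants.
 */
static uint32_t
so_imm_offset(uint32_t vertex_index, uint32_t stride, uint32_t comp_offset)
{
   assert(vertex_index < 3);
   return vertex_index * stride + comp_offset;
}

/* Final byte address: the same voffset (first vertex) is shared by all
 * three vertices, only the immediate offset differs.
 */
static uint32_t
so_store_address(uint32_t voffset_first_vertex, uint32_t imm_offset)
{
   return voffset_first_vertex + imm_offset;
}
```

For a stride of 16 bytes, vertex 2 and a component at byte 4, the immediate offset is 36, so the store lands 36 bytes past the first vertex.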

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
97e82af162 ac/nir/ngg: vectorize streamout stores for NGG optimally
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.
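
A simplified sketch of the gathering step, assuming 32-bit components that are already sorted by byte offset. The types xfb_comp and xfb_store and the function group_xfb_stores are hypothetical, and the rule that only contiguous components are merged is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one 32-bit XFB component at a byte offset in the vertex. */
struct xfb_comp {
   uint32_t offset;
};

/* Hypothetical: one emitted store of 1 to 4 consecutive components. */
struct xfb_store {
   uint32_t offset;
   uint32_t num_comps;
};

/* Walk the sorted components and grow the current store while the next
 * component is contiguous and the store holds fewer than 4 components.
 * Returns the number of stores written to out (out needs room for n).
 */
static unsigned
group_xfb_stores(const struct xfb_comp *comps, unsigned n,
                 struct xfb_store *out)
{
   assert(out != NULL || n == 0);

   unsigned num_stores = 0;
   for (unsigned i = 0; i < n; i++) {
      struct xfb_store *last = num_stores ? &out[num_stores - 1] : NULL;

      if (last && last->num_comps < 4 &&
          comps[i].offset == last->offset + last->num_comps * 4) {
         last->num_comps++; /* extend the current store */
      } else {
         out[num_stores].offset = comps[i].offset;
         out[num_stores].num_comps = 1;
         num_stores++;
      }
   }
   return num_stores;
}
```

Components at byte offsets 0, 4, 8, 12, 16 and 24 become one vec4 store at 0 plus two single-component stores at 16 and 24.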

It also removes the memory_modes field from VMEM stores because I don't
think it's needed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
4f2e2e10bc ac/nir: vectorize streamout stores for legacy pipeline optimally
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.

It also removes the COHERENT flag from VMEM stores because NGG streamout
doesn't use it either and I don't think it's needed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
e399f3bed9 ac/nir: sort xfb info to facilitate vectorization of xfb stores
xfb stores are not vectorized properly, which produces a random soup of
b32, b64, b96, and b128 stores.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Samuel Pitoiset
f09f31d093 ac/nir: fix a comment typo in load_subgroup_id_lowered()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>
2025-01-09 08:02:19 +00:00
Samuel Pitoiset
44ba856089 ac/nir: fix lowering subgroup ID for compute shaders on GFX12
This is lowered in backend compilers (LLVM or ACO) because it needs
to access ttmp registers which aren't exposed to NIR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>
2025-01-09 08:02:19 +00:00
Samuel Pitoiset
bc1374355b radv: program DB_RENDER_OVERRIDE correctly on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32941>
2025-01-09 07:39:23 +00:00
Rhys Perry
8ac4744706 aco/tests: fix skip_lines=True with remaining characters in matches
If the remaining character check fails, we should try a later line if
skip_lines=True. So the check has to be done earlier.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32902>
2025-01-08 15:28:37 +00:00
Friedrich Vock
71392fff25 aco: Fix dead instruction/index handling for try_insert_saveexec_out_of_loop
The loop checking if exec is overwritten didn't check for NULL
instructions, and didn't fix up reg write indices after inserting
instructions.

Fixes: fcd94a8c ("aco: move try_optimize_branching_sequence() to postRA optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32746>
2025-01-08 10:48:01 +00:00
Georg Lehmann
208d8cd715 radv: run peephole_select in optimize_nir_algebraic
Foz-DB Navi21:
Totals from 451 (0.57% of 79395) affected shaders:
MaxWaves: 8680 -> 8616 (-0.74%)
Instrs: 689610 -> 688225 (-0.20%); split: -0.21%, +0.01%
CodeSize: 3524580 -> 3521740 (-0.08%); split: -0.11%, +0.03%
VGPRs: 28512 -> 28584 (+0.25%)
Latency: 1906219 -> 1892124 (-0.74%); split: -0.91%, +0.17%
InvThroughput: 481931 -> 483570 (+0.34%); split: -0.00%, +0.34%
VClause: 10317 -> 10296 (-0.20%)
SClause: 18105 -> 18088 (-0.09%); split: -0.17%, +0.07%
Copies: 69532 -> 67579 (-2.81%); split: -2.85%, +0.04%
Branches: 21353 -> 20501 (-3.99%)
PreSGPRs: 27004 -> 27005 (+0.00%)
VALU: 436235 -> 436334 (+0.02%); split: -0.01%, +0.03%
SALU: 102349 -> 101944 (-0.40%); split: -0.61%, +0.21%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32792>
2025-01-08 09:56:39 +00:00
Marek Olšák
c20c46cf7b ac: update ATOMIC_MEM definitions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32877>
2025-01-07 20:24:19 +00:00
Pierre-Eric Pelloux-Prayer
dd11eec06b gl/spirv: update subgroup_size if GroupNonUniform is used
This is similar to what link_intrastage_shaders is doing and it
fixes the following test:
   KHR-Single-GL46.subgroups.builtin_var.compute.subgroupsize_compute

Which was failing with SPIRV but passing with GLSL, the diff being:
 - SPIRV: "subgroup_size: 1"
 - GLSL:  "subgroup_size: 2"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32698>
2025-01-07 19:32:43 +00:00
Pierre-Eric Pelloux-Prayer
dc293ffe50 radeonsi: fallback to util_blitter_draw_rectangle
The blitter VS expects coords to fit in a signed int16. When this
is not the case, use util_blitter_draw_rectangle instead.
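
The range check behind this fallback could look roughly like the sketch below. The helper name coords_fit_int16 and its parameters are hypothetical; the driver's actual condition may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if all four rectangle coordinates can be stored in a
 * signed 16-bit integer, i.e. the packed blitter VS path can be used.
 * When it returns false, the caller would take the
 * util_blitter_draw_rectangle fallback instead.
 */
static bool
coords_fit_int16(int x0, int y0, int x1, int y1)
{
   const int coords[4] = {x0, y0, x1, y1};

   for (unsigned i = 0; i < 4; i++) {
      if (coords[i] < INT16_MIN || coords[i] > INT16_MAX)
         return false;
   }
   return true;
}
```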

Since util_blitter_draw_rectangle sets vertex elements, we need
to make sure they're properly restored.

The alternative to this fallback would be to pass coordinates
unpacked (so 4 SGPRs instead of 2), but this doesn't fix the
fbo-blit-check-limits test because of a UV interpolation precision
issue.
Using 2 triangles instead of a rectangle + disabling
window_space_position helps but then this breaks some GLES3 tests,
like dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x
(which doesn't pass either if u_blitter is used for all cases).

Using a single triangle covering the whole rectangle fixes all
cases, but it then requires setting up scissors to avoid writing too
many pixels...
So, instead of adding so much complexity, let's use u_blitter
for the "large coordinates" fallback, and keep the rectangle blit
for the other cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32698>
2025-01-07 19:32:43 +00:00
Samuel Pitoiset
7f50162424 radv: fix programming WALK_ALIGN8_PRIM_FITS_ST on GFX12
This also needs to be disabled when a VRS image is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>
2025-01-07 18:56:24 +00:00
Samuel Pitoiset
d7bc370b9e radv: configure the VRS surface swizzle mode on GFX12
GFX11 allowed only one swizzle mode for the VRS image but GFX12 allows
all 2D non-linear swizzle modes and PC_SC_VRS_INFO needs to be
configured.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>
2025-01-07 18:56:24 +00:00
Samuel Pitoiset
0b53e645a0 radv: disable VRS coarse shading with 8x MSAA on GFX12
This isn't supported and the hw always clamps to 1x1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>
2025-01-07 18:56:24 +00:00
Samuel Pitoiset
f94bd67b82 aco: fix VS prologs on GFX12
MTBUF/MUBUF instructions must use zero for SOFFSET, use const_offset
instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32904>
2025-01-07 13:44:32 +00:00
Feng Jiang
701600fb11 radv/rt: Fix memleak in radv_init_header()
Fixes: f8b584d ("vulkan/runtime,radv: Add shared BVH building framework")
Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32887>
2025-01-07 09:49:56 +00:00
Samuel Pitoiset
c5fe9dcf16 ac/descriptors: fix configuring NBC views on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32892>
2025-01-07 09:15:12 +00:00
Chia-I Wu
f6332ca650 radv: use common calibrated timestamp support
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32689>
2025-01-07 03:39:29 +00:00
Martin Roukala (né Peres)
f1a6af133a radeonsi/ci: run a fraction of glcts-vangogh in pre-merge
Now that ACO has become the default on pre-RDNA GPUs, all pre-merge CI
coverage of radeonsi+LLVM has disappeared. Let's fix this by making
our post-merge glcts-vangogh-valve job run in pre-merge pipelines.

However, we are limited in vangogh capacity, so rather than running the
full glcts/piglit test suites we run a fraction of them to stay under 15
minutes of execution time on a single Steam Deck.

Suggested-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>
2025-01-06 11:55:22 +00:00
Martin Roukala (né Peres)
0c538f82bc radeonsi/ci: run on ACO changes
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>
2025-01-06 11:55:22 +00:00
Martin Roukala (né Peres)
bec7f09e76 radeonsi/ci: update the vangogh expectations
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>
2025-01-06 11:55:21 +00:00