Samuel Pitoiset
6d5a2ae928
aco: clear the current wave exception in the trap handler
...
This is required to re-enable VALU instructions in this wave. Only
float exceptions seem to be affected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960 >
2024-11-05 07:58:38 +00:00
Samuel Pitoiset
e85fc0f869
aco: fix validation for VOP1 instructions without any dest/src
...
Like v_clrexcp.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960 >
2024-11-05 07:58:38 +00:00
Samuel Pitoiset
81f4670ed6
radv,aco: dump all SGPRs from the trap handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960 >
2024-11-05 07:58:38 +00:00
Samuel Pitoiset
45d56d9395
radv: set missing shader info values for the trap handler
...
This fixes an assert in radv_precompute_registers_pgm() on GFX11
because it was considered a vertex shader.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31962 >
2024-11-05 07:11:30 +00:00
Marek Olšák
755fb7a262
amd: move Tonga and Iceland TC-compat HTILE workarounds to ac_gpu_info.c
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31910 >
2024-11-04 19:45:54 +00:00
Georg Lehmann
a7f6294f90
radv: use nir_opt_frag_coord_to_pixel_coord
...
Foz-DB Navi21:
Totals from 1648 (2.08% of 79395) affected shaders:
MaxWaves: 44918 -> 44950 (+0.07%); split: +0.09%, -0.02%
Instrs: 1004193 -> 1001179 (-0.30%); split: -0.33%, +0.03%
CodeSize: 5486412 -> 5486592 (+0.00%); split: -0.08%, +0.09%
VGPRs: 56664 -> 56552 (-0.20%); split: -0.93%, +0.73%
Latency: 15430894 -> 15435320 (+0.03%); split: -0.12%, +0.15%
InvThroughput: 3097789 -> 3092861 (-0.16%); split: -0.20%, +0.04%
VClause: 18757 -> 18793 (+0.19%); split: -0.13%, +0.32%
SClause: 34475 -> 34495 (+0.06%); split: -0.11%, +0.17%
Copies: 66195 -> 66150 (-0.07%); split: -0.88%, +0.81%
Branches: 23035 -> 23033 (-0.01%)
PreVGPRs: 42235 -> 41724 (-1.21%); split: -1.32%, +0.11%
VALU: 709730 -> 706662 (-0.43%); split: -0.47%, +0.04%
SALU: 111731 -> 111722 (-0.01%); split: -0.02%, +0.01%
VMEM: 25988 -> 25987 (-0.00%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864 >
2024-11-04 12:34:31 +00:00
Georg Lehmann
a58d2b59e9
aco: implement load_pixel_coord
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864 >
2024-11-04 12:34:30 +00:00
Georg Lehmann
42d5cb62bb
ac/llvm: implement load_pixel_coord
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864 >
2024-11-04 12:34:30 +00:00
Georg Lehmann
a2a9e93e72
radv: add support for load_pixel_coord
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864 >
2024-11-04 12:34:30 +00:00
Samuel Pitoiset
1fa0fe1e0c
aco: add support for the trap handler shader on GFX9-GFX10.3
...
This has been tested on navi21.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31926 >
2024-11-04 10:48:52 +00:00
Samuel Pitoiset
281eb14df8
aco: fix reading registers from the trap handler shader
...
It should read 32-bit values, otherwise some MSBs are 0 and
information is lost.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31926 >
2024-11-04 10:48:52 +00:00
Samuel Pitoiset
f7636b611a
radv: add a struct that describes the trap handler layout
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934 >
2024-11-01 15:40:25 +00:00
Samuel Pitoiset
49682fc0cb
radv,aco: save SQ_WAVE_GPR_ALLOC from the trap handler
...
This would be used to dump SGPRs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934 >
2024-11-01 15:40:25 +00:00
Samuel Pitoiset
31fc3199dd
radv: fix dumping the faulty shader detected by the trap handler on GFX9+
...
The most significant bits need to be cleared.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925 >
2024-11-01 15:01:35 +00:00
Samuel Pitoiset
7b4da7f736
radv: only emit the TBA/TMA registers on GFX8
...
On GFX9+, these registers are privileged and the kernel needs to
configure them.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925 >
2024-11-01 15:01:35 +00:00
Samuel Pitoiset
930395c5e4
radv: check for has_trap_handler_support instead of asserting
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925 >
2024-11-01 15:01:35 +00:00
Samuel Pitoiset
e27ba67d33
ac: add ac_gpu_info::has_trap_handler_support
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925 >
2024-11-01 15:01:35 +00:00
Samuel Pitoiset
b23cc8c1d3
radv: add missing L2 non-coherent image case for mipmaps with DCC/HTILE on GFX11
...
According to PAL, an image with DCC/HTILE and mipmaps isn't coherent
with L2 when the mip level is in the metadata mip-tail region.
This fix isn't super optimal because the driver should rely on the
subresource range to determine if the mip level is in the mip-tail,
but it's easier to backport. Upcoming commits will optimize that.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11939
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31920 >
2024-11-01 14:36:55 +00:00
David Rosca
c9ade8c3b5
radeonsi/vcn: Enable VCN4 AV1 encode WA
...
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31889 >
2024-11-01 14:05:04 +00:00
Samuel Pitoiset
01f329ec82
radv/ci: skip dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary
...
It can also hang randomly on VanGogh, let's skip it by default for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31922 >
2024-10-31 11:44:12 +00:00
Samuel Pitoiset
77e59eefc1
radv: add an option to configure the trap handler exceptions
...
This introduces RADV_TRAP_HANDLER_EXCP to configure the various
shader exceptions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902 >
2024-10-31 06:58:15 +00:00
Samuel Pitoiset
6b5a0f57ba
radv: fix configuring the memory violation exception for the compute stage
...
The compute stage has two EXCP_EN fields and the memory violation bit
is in EXCP_EN_MSB. Confirmed by writing a small test on GFX8.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902 >
2024-10-31 06:58:14 +00:00
Timur Kristóf
96b95c8427
radv: Flush L2 cache for non-L2-coherent images in EndCommandBuffer.
...
This fixes a CTS hang on Hawaii.
We previously only did a CB/DB flush,
but that doesn't include a L2 cache flush.
Also fix the comment that said this is for GFX9+.
Fixes: 7c62f6fa01
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31906 >
2024-10-30 17:46:50 +00:00
Samuel Pitoiset
7015e22cb6
ac/nir: cull triangles/lines when all W positions are zero/NaN
...
It looks like the fixed-func hardware is very slow to cull primitives
with zero pos.w but shader based culling helps a lot.
This fixes a massive performance gap with the FSR2 demo compared to
AMDGPU-PRO, +228% on RDNA2.
Based on my investigation, AMDGPU-PRO seems to always cull these
primitives. Note that disabling NGG culling with AMDGPU-PRO reports the
same performance as RADV without that fix. Also note that the FSR2
sample doesn't specify any cull mode (i.e. VK_CULL_MODE_NONE is used),
so this is the only reason PRO was culling more than RADV.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7260
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31891 >
2024-10-30 17:09:37 +00:00
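As a rough illustration of the check described in 7015e22cb6, the sketch
below tests whether every W position of a primitive is zero or NaN. The
helper name and the plain C form are assumptions for illustration only; the
actual change is in the ac/nir culling code and is not reproduced here.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the actual ac/nir lowering: returns true when
 * every vertex of a primitive has a W position that is zero or NaN, which
 * is the condition under which the commit culls the primitive. */
static bool all_w_zero_or_nan(const float *pos_w, size_t num_vertices)
{
    for (size_t i = 0; i < num_vertices; i++) {
        /* -0.0f compares equal to 0.0f, so both signed zeros count. */
        if (!(pos_w[i] == 0.0f || isnan(pos_w[i])))
            return false;
    }
    return true;
}
```

A triangle would pass three W values and a line two; a single finite
non-zero W is enough to keep the primitive.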
Samuel Pitoiset
fc0545e6a7
radv: fix wrong index in radv_skip_graphics_pipeline_compile()
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12089
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31901 >
2024-10-30 11:25:59 +00:00
Daniel Schürmann
62715984f8
aco/README: add descriptions of recently added passes
...
... and less recent ones.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888 >
2024-10-30 09:23:54 +00:00
Daniel Schürmann
21ceeb22ed
aco: move jump threading optimization into separate pass
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888 >
2024-10-30 09:23:54 +00:00
Daniel Schürmann
87a3c08df1
aco/ssa_elimination: remove some redundant checks during jump threading
...
Since phis have already been lowered to parallelcopies by this point,
there is no need to cross-check.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888 >
2024-10-30 09:23:54 +00:00
Daniel Schürmann
a6c38f706d
aco/ssa_elimination: perform jump threading after parallelcopy insertion
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888 >
2024-10-30 09:23:54 +00:00
Samuel Pitoiset
4459a1d210
radv: resize the SPM bo when it's too small
...
This used to abort (see the previous commit) when the hardware wasn't
able to sample all SPM counters because the BO was too small. The SPM
BO can now be resized like the SQTT BO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883 >
2024-10-29 18:33:17 +00:00
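The resize-instead-of-abort behaviour described in 4459a1d210 can be
sketched roughly as below. The struct, the function name and the doubling
policy are all assumptions made for illustration; they are not the RADV
data structures, and the real growth policy may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capture state; not the RADV structures. */
struct capture_buffer {
    uint64_t size; /* current buffer size in bytes */
};

/* Returns true when the sampled data fit in the buffer. Otherwise grows
 * the recorded size (doubling is an assumed policy) and returns false so
 * that the caller can reallocate and retry the capture. */
static bool capture_fits_or_grow(struct capture_buffer *buf,
                                 uint64_t bytes_needed)
{
    if (bytes_needed <= buf->size)
        return true;

    /* Start from 1 so that an empty buffer can still grow. */
    uint64_t new_size = buf->size ? buf->size : 1;
    while (new_size < bytes_needed)
        new_size *= 2;

    buf->size = new_size;
    return false;
}
```

The caller is expected to retry the capture after a false return, rather
than aborting as before.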
Samuel Pitoiset
e14511f77d
ac/spm: do not abort when the SPM BO is too small
...
It needs to be resized instead, like the SQTT BO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883 >
2024-10-29 18:33:17 +00:00
Marek Olšák
4f096b994d
ac/nir,radeonsi: use load_cull_line_viewport_xy_scale_and_offset_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
0f39d44f1b
ac/nir,radeonsi: use load_cull_small_line_precision_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
10c6f87adb
ac/nir,radeonsi: use load_cull_small_lines_enabled_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
ee452129c6
nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
2227f5be9d
nir: rename load_cull_small_primitive_precision -> triangle, add line_precision
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
0914e0d02f
nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Lu Yao
0442a6c292
ac/radeonsi: compute htile for tile mode RADEON_SURF_MODE_1D on GFX6-8
...
Computing 'htile_size/meta_size' is allowed for RADEON_SURF_MODE_1D when
RADEON_SURF_TC_COMPATIBLE_HTILE isn't set.
Without this computation, performance degrades in some scenarios.
Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31617 >
2024-10-29 16:23:51 +00:00
Georg Lehmann
938f5ec7ce
radv: use nir_opt_fragdepth
...
Cyberpunk 2077 writes unmodified depth.
Foz-DB Navi21:
Totals from 28 (0.04% of 79395) affected shaders:
Instrs: 6484 -> 6448 (-0.56%)
CodeSize: 36016 -> 35784 (-0.64%)
Latency: 58517 -> 58400 (-0.20%)
InvThroughput: 7719 -> 7717 (-0.03%)
Branches: 129 -> 119 (-7.75%)
PreVGPRs: 394 -> 372 (-5.58%)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31874 >
2024-10-29 15:15:24 +00:00
Georg Lehmann
695d2414cd
nir,radv: optimize shared atomic offsets
...
Foz-DB Navi21:
Totals from 87 (0.11% of 79395) affected shaders:
Instrs: 140877 -> 140873 (-0.00%)
CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01%
Latency: 4528171 -> 4528162 (-0.00%)
InvThroughput: 826358 -> 826349 (-0.00%)
Copies: 10888 -> 10884 (-0.04%)
VALU: 84634 -> 84630 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080 >
2024-10-29 09:31:08 +00:00
Georg Lehmann
a2baff4810
ac/llvm: handle shared atomic base offset
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080 >
2024-10-29 09:31:08 +00:00
Samuel Pitoiset
e83f91f206
radv: regroup and emit all raster related states in the same function
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31787 >
2024-10-29 07:25:34 +00:00
Samuel Pitoiset
62f51becbb
radv: track more redundant raster related registers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31787 >
2024-10-29 07:25:34 +00:00
Konstantin Seurer
0963a0a2b4
radv: Move ac_addrlib to the physical device
...
There is nothing amdgpu specific here so this does not need to be
abstracted away. max_alignment also is not used in winsys code.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31643 >
2024-10-28 20:06:38 +00:00
Semenov Herman (Семенов Герман)
1764f70ba8
radv: fix memleaks in radv_init_shader_upload_queue()
...
Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31608 >
2024-10-28 17:11:41 +00:00
Samuel Pitoiset
8300378bf3
radv: advertise VK_EXT_device_generated_commands on GFX8+
...
GFX6-7 can't really support it and it's not worth the effort anyways.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383 >
2024-10-28 16:27:35 +00:00
Samuel Pitoiset
9f8684359f
radv: implement VK_EXT_device_generated_commands
...
The major differences compared to the NV extensions are:
- support for the sequence index as push constants
- support for draw with count tokens (note that DrawID is zero for
normal draws)
- support for raytracing
- support for IES (only compute is supported for now)
- improved preprocessing support with the state command buffer param
The NV DGC extensions were only enabled for vkd3d-proton and it will
maintain both paths for a while, so they can be replaced by the EXT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383 >
2024-10-28 16:27:35 +00:00
Semenov Herman (Семенов Герман)
637a4b849a
radv: fix memleaks in radv_sqtt_reloc_graphics_shaders()
...
Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31607 >
2024-10-28 15:48:05 +01:00
Samuel Pitoiset
f7652de1f1
Revert "ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY for GFX9+"
...
This reverts commit dc5ef90547.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869 >
2024-10-28 12:47:38 +00:00
Samuel Pitoiset
0ae880c08c
Revert "radv: implement 2D views of 3D images using 2D_ARRAY descriptors on GFX9+"
...
Using view3dAs2dArray changes the tiling and it's slower (-7.5% in
Silent Hill 2 Remake) than using 3D tiling. The previous implementation
was the best one regarding performance (it's also what RadeonSI does).
Sadly it seems that sampler2DViewOf3D can't really be supported without
that but nobody really needs it apparently.
Also, view3dAs2dArray is incompatible with 2D views of sparse 3D images
because sparse 3D images require 3D tiling.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11997
This reverts commit f5805bcb8e.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869 >
2024-10-28 12:47:38 +00:00