Georg Lehmann
e136a0629d
radv/gfx11+: add rtwave32 perftest option
...
Useful for testing compiler changes and performance considerations.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27584 >
2024-02-14 17:11:01 +00:00
Samuel Pitoiset
32c1e45718
radv: fix emitting VS prologs for merged shaders compiled separately on GFX10+
...
RSRC1 isn't equal to the VS RSRC1 and both config registers need to
be re-emitted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27574 >
2024-02-13 14:01:42 +00:00
Samuel Pitoiset
6762307698
radv: cleanup radv_shader_combine_cfg_vs_tcs()
...
To match radv_shader_combine_cfg_vs_gs().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27574 >
2024-02-13 14:01:42 +00:00
Georg Lehmann
6121497228
aco/gfx11+: limit hard clauses to 32 instructions
...
https://github.com/llvm/llvm-project/pull/81287
Foz-DB Navi31:
Totals from 406 (0.52% of 78112) affected shaders:
Instrs: 585342 -> 585750 (+0.07%)
CodeSize: 3077856 -> 3079456 (+0.05%); split: -0.00%, +0.05%
Latency: 3263165 -> 3263326 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 664092 -> 664114 (+0.00%); split: -0.00%, +0.00%
VClause: 11143 -> 11537 (+3.54%)
SClause: 11878 -> 11884 (+0.05%)
Copies: 39807 -> 39815 (+0.02%)
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27569 >
2024-02-13 13:40:52 +00:00
Samuel Pitoiset
0c05bdf1c1
radv/ci: enable RADV_PERFTEST=shader_object on VEGA10
...
Renoir currently hangs in Mesa CI. Needs to be investigated.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27433 >
2024-02-13 09:14:21 +00:00
Samuel Pitoiset
bead3f2ec3
radv: allow RADV_PERFTEST=shader_object on GFX9/VEGA10
...
It's passing VKCTS on VEGA10 but for some reasons RENOIR currently
hangs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27433 >
2024-02-13 09:14:21 +00:00
Rhys Perry
926d9f1cef
radv: support minmax filter for more formats
...
Support should be the same as AMDVLK, except for these formats:
- VK_FORMAT_R4G4_UNORM_PACK8
- VK_FORMAT_A4R4G4B4_UNORM_PACK16_EXT
- VK_FORMAT_A4B4G4R4_UNORM_PACK16_EXT
- VK_FORMAT_A1B5G5R5_UNORM_PACK16_KHR
- VK_FORMAT_A8_UNORM_KHR
- VK_FORMAT_X8_D24_UNORM_PACK32
- VK_FORMAT_D24_UNORM_S8_UINT
And the various emulated compressed formats.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27551 >
2024-02-12 20:05:27 +00:00
Konstantin Seurer
fb62bffcda
radv: Wire up ac_gather_context_rolls
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27322 >
2024-02-12 14:04:24 +00:00
Konstantin Seurer
ba6d6e5ee1
amd/common: Use the correct register table for GFX10_3
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27322 >
2024-02-12 14:04:24 +00:00
Samuel Pitoiset
6cab5559f9
radv: add support for emitting TES+GS compiled separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27432 >
2024-02-12 08:09:28 +00:00
Samuel Pitoiset
dd92f5f664
radv: bind the vertex input SGPR only for relevant stages
...
Otherwise, user_data_0 is wrong if merged shaders are compiled
separately and if we have GS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27432 >
2024-02-12 08:09:28 +00:00
Samuel Pitoiset
d64d7373f3
radv: declare AC_UD_TES_STATE for separate compilation of GS on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27432 >
2024-02-12 08:09:28 +00:00
Samuel Pitoiset
e15d1ed7cb
radv: declare streamout buffers for TES+GS compiled separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27432 >
2024-02-12 08:09:28 +00:00
Samuel Pitoiset
83bc7e27a5
radv: force GS stage for TES as ES compiled separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27432 >
2024-02-12 08:09:28 +00:00
Samuel Pitoiset
b58de424f4
radv: fix RGP barrier reason for RP barriers inserted by the runtime
...
Without that, RGP is confused and it's reporting CmdPipelineBarrier()
instead of CmdRenderPassSync().
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27546 >
2024-02-12 07:50:16 +00:00
Hans-Kristian Arntzen
8b4259e69b
wsi/x11: Disable vk_xwayland_wait_ready by default on most drivers.
...
Venus is special and still requires extra waits.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27074 >
2024-02-10 11:47:22 +00:00
Georg Lehmann
36f23bf96d
aco: print exec/vcc_lo/hi for single dword access
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27455 >
2024-02-09 22:14:44 +00:00
Georg Lehmann
684014ff12
aco: print permlane16 bc/fi
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27455 >
2024-02-09 22:14:44 +00:00
Georg Lehmann
f469fda44c
aco: don't print hi() for permlane opsel
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27455 >
2024-02-09 22:14:44 +00:00
Georg Lehmann
b59f5f9c85
aco: print neg prettier for packed math
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27455 >
2024-02-09 22:14:44 +00:00
Georg Lehmann
767eb15ddc
aco/print_ir: don't use alloca for input modifiers
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27455 >
2024-02-09 22:14:44 +00:00
Georg Lehmann
cd6d9c5918
aco: don't remove branches that skip v_writelane_b32
...
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27537 >
2024-02-09 21:55:02 +00:00
Georg Lehmann
2c4980716f
aco: add packed fma dpp note to README-ISA
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27533 >
2024-02-09 21:36:53 +00:00
Georg Lehmann
e927c5004f
aco/gfx11+: disable v_pk_fmac_f16_dpp
...
Public docs are apparently wrong: https://github.com/llvm/llvm-project/pull/79598#issuecomment-1933988048
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27533 >
2024-02-09 21:36:52 +00:00
Pierre-Eric Pelloux-Prayer
6c3a294eef
radv: don't remove the blit queue from the device queues
...
I don't remember why I implemented it like this in !13959 , but
AFAICT there's no need to manually remove this queue from vk_device's
queues list.
On the other hand, this hack causes issues if syncobj timeline isn't
supported; amdgpu always support timeline, but amdgpu over virtio-gpu
doesn't.
The issue is as follow: the sequence in vk_queue.c:
case VK_QUEUE_SUBMIT_MODE_DEFERRED:
vk_queue_push_submit(queue, submit);
return vk_device_flush(queue->base.device);
Would fail to produce the expected result, because vk_device_flush would
fail to realize that the blit queue has some pending work because
"vk_foreach_queue(queue, device)" would never process the queue.
Then, the call to vk_drm_syncobj_export_sync_file() would fail, because
the syncobj handle was never used in a submit.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27412 >
2024-02-09 09:20:52 +01:00
Samuel Pitoiset
61a125647b
radv: add radv_disable_ngg_gs and enable it for Persona 3 Reload
...
Persona 3 Reload is largely affected by the way amplification works with
NGG GS and disabling it drastically improve performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27518 >
2024-02-09 07:24:16 +00:00
Samuel Pitoiset
69d734a8d5
radv: add RADV_DEBUG=nongg_gs for GFX10/GFX10.3
...
NGG GS doesn't perform well in some cases and having an option to
disable it for performance experiments is very useful.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27518 >
2024-02-09 07:24:16 +00:00
Daniel Schürmann
932b9e6a23
radv: enable VK_KHR_shader_quad_control
...
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27277 >
2024-02-09 05:32:35 +00:00
Daniel Schürmann
e546f2a55d
radv: enable VK_KHR_shader_maximal_reconvergence
...
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27277 >
2024-02-09 05:32:35 +00:00
Daniel Schürmann
2649717a36
aco: enable WQM if demote is used with maximal reconvergence
...
If otherwise no helper lanes are required by the shader, then demote
behaves like discard and immediately terminates the invocations.
With maximal reconvergence, however, we need to ensure that helper lanes
are not terminated unless the entire quad was demoted.
In order to fix this, generally enable helper lanes in this unlikely
corner case and avoid a major refactor.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27277 >
2024-02-09 05:32:35 +00:00
Philip Rebohle
7b0fd4cc05
radv: Remove dead shared variables after optimization loop.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27104 >
2024-02-08 18:27:57 +00:00
Samuel Pitoiset
63b238e84e
radv: only load 3x32-bit elements when emitting draws with mesh shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27488 >
2024-02-08 18:04:15 +00:00
Samuel Pitoiset
0296196d32
radv: remove unused radv_indirect_command_layout::state_offset
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27488 >
2024-02-08 18:04:15 +00:00
Samuel Pitoiset
7bdf9f5002
radv/ci: remove VKD3D_CONFIG=dxr11 for navi21/navi31
...
The vkd3d-proton version in CI has this enabled by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27520 >
2024-02-08 17:39:59 +00:00
Samuel Pitoiset
bde272349d
radv: add support for emitting VS+GS compiled separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Samuel Pitoiset
416b20d381
radv: force GS stage for VS as ES compiled separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Samuel Pitoiset
8ef4c049ec
radv: declare streamout buffers for VS+GS compiled separately on GFX9+
...
The shader input arguments must match.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Samuel Pitoiset
a68e19204e
radv: rework shader arguments for separate compilation of VS+GS on GFX9+
...
The shader input args must match for VS+GS compiled separately.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Samuel Pitoiset
482dbacdeb
radv/nir: lower esgs_vertex_stride for GS compiled separately on GFX9+
...
The ESGS vertex stride would be emitted at draw time using the number
of VS/TES outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Samuel Pitoiset
d777cbf66c
radv: add a new user SGPR for the ESGS ring item size
...
With shader object, when VS+GS or TES+GS are compiled separately and
the VS has written (but unused) outputs, the ESGS vertex stride
must be passed through an user SGPR. This is because when the GS is
compiled we can't know the number of ES outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27388 >
2024-02-08 13:33:34 +00:00
Rhys Perry
5f0226d82f
aco/tests: use raw strings in form_hard_clauses.nsa
...
Like cad2c0915d .
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
d59d00ebf8
aco/tests: add tests for VOPD operand swapping
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
24c02dbfa6
aco: improve printing of VOPD instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
ea92aea9f2
aco: turn v_mov_b32 into addition to create VOPD instructions
...
fossil-db (navi31, wave32):
Totals from 15655 (19.76% of 79242) affected shaders:
Instrs: 10699119 -> 10688239 (-0.10%); split: -0.11%, +0.00%
CodeSize: 61290308 -> 61288596 (-0.00%); split: -0.01%, +0.00%
Latency: 89159743 -> 89150355 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16966295 -> 16955427 (-0.06%); split: -0.07%, +0.00%
VALU: 5484626 -> 5473993 (-0.19%); split: -0.20%, +0.00%
VOPD: 1446725 -> 1457358 (+0.73%); split: +0.74%, -0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
65dfb27f8f
aco: swap operands to create VOPD instructions
...
fossil-db (navi31, wave32):
Totals from 61565 (77.69% of 79242) affected shaders:
Instrs: 34874995 -> 34439344 (-1.25%); split: -1.27%, +0.02%
CodeSize: 200611536 -> 200564028 (-0.02%); split: -0.12%, +0.10%
Latency: 242520024 -> 242120510 (-0.16%); split: -0.28%, +0.11%
InvThroughput: 50236383 -> 49588742 (-1.29%); split: -1.31%, +0.02%
VClause: 713308 -> 712902 (-0.06%); split: -0.07%, +0.01%
SClause: 1184865 -> 1184620 (-0.02%); split: -0.03%, +0.01%
VALU: 18235068 -> 17803847 (-2.36%); split: -2.37%, +0.00%
VOPD: 3930904 -> 4362125 (+10.97%); split: +10.99%, -0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
96d8b7c59c
aco: refactor create_vopd_instruction
...
Prepare for operand swapping.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Marek Olšák
72948d9ff9
radeonsi,aco: remove the VS prolog
...
The upside is that this removes 600 lines of code. The downside is
that if instance divisors are used, we will compile the VS on demand.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27120 >
2024-02-07 09:50:53 +00:00
Eric Engestrom
faad4ffe97
radv: enable VK_EXT_headless_surface on all platforms except Windows
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27448 >
2024-02-06 20:32:38 +00:00
Samuel Pitoiset
ffbd3e5b2d
radv: change the user SGPR idx of AC_UD_TES_STATE
...
When GS will be compiled separately, we will have to always declare
both VS and TES user SGPRs because we can't know the previous stage,
and the shader input arguments must match and mustn't overlap.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27431 >
2024-02-06 20:12:38 +00:00
Samuel Pitoiset
3e9815173a
radv: set the default workgroup size for VS/TES as ES
...
If shaders are linked, the optimal value would be computed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27431 >
2024-02-06 20:12:38 +00:00