Commit graph

16223 commits

Author SHA1 Message Date
Marek Olšák
a7ba36f589 ac/nir: get pass_tessfactors_by_reg from nir_gather_tcs_info
If nir_tcs_info::all_invocations_define_tess_levels is true, the pass
doesn't have to insert a barrier and use output loads to get tess level
output values. It can just use the SSA defs that are being stored (or phis
thereof) to get the tess level output values.

The remaining tcs_info fields will be used by the HS shader message.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>
2024-11-16 21:58:29 -05:00
Marek Olšák
b258a9aa4e aco: remove unused TCS fields from aco_shader_info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>
2024-11-16 21:58:26 -05:00
Samuel Pitoiset
d2960a8430 radv: consider VK_PIPELINE_STAGE_2_NONE like BOTTOM_OF_PIPE
VK_PIPELINE_STAGE_2_NONE from sync2 is similar to BOTTOM_OF_PIPE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32115>
2024-11-15 08:22:23 +00:00
Samuel Pitoiset
c08d2c40ed radv: fix ignoring src stage mask when dst stage mask is BOTTOM_OF_PIPE
Otherwise the driver doesn't synchronize if there are image layout
transitions.

This fixes rendering issues with displayable DCC (usually black squares
in the bottom of screen). This mostly happens when an application
uses a lower resolution than the screen supports and fshack
(wine/proton) which upscales images uses COMPUTE_SHADER->BOTTOM_OF_PIPE
for the barrier after a dispatch.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11547
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11600
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11789
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8705
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9890
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32115>
2024-11-15 08:22:23 +00:00
Samuel Pitoiset
45c0ef3bb4 radv: dump SPIR-V and NIR for the faulty shader detected with the trap
More logs is always better for debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32116>
2024-11-14 15:57:07 +00:00
Samuel Pitoiset
9149488a9d radv: mark live invocations when dumping VGPRS with the trap handler
Similar to UMR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32116>
2024-11-14 15:57:07 +00:00
Georg Lehmann
3e037ac2a9 aco/gfx8: use ds_swizzle_b32 rotate mode
Despite only being mentioned in the ISA docs since vega, rotate (and fft)
swizzle mode seem to exist since gfx8.

https://github.com/llvm/llvm-project/issues/28975#issuecomment-980964939

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31348>
2024-11-14 15:34:48 +00:00
David Rosca
dcfc956521 radv/video: Override pic_init_qp_minus26 in PPS
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
David Rosca
d166bb5dd1 radv/video: Use 64x16 alignment for HEVC encode
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
David Rosca
d1c1a33b35 radv/video: Avoid selecting rc layer over maximum
Vulkan spec doesn't say if this is allowed or not, but trying
to do this will hang.

Fixes: 4a19047d32 ("radv/video: Select temporal layer when encoding each frame")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
David Rosca
e941acfb9d radv/video: Report correct encodeInputPictureGranularity
Only aligned size can be encoded.

Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
David Rosca
e4ec135d8b radv/video: Fix HEVC slice control
This needs to use aligned size, otherwise it will output two
slices when the size is not 64 aligned.

Fixes: 967e4e09de ("radv/video: add h265 encode support")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
David Rosca
6a121f1507 radv/video: Fix H264 slice control
This needs to use aligned size, otherwise it will output two
slices when the size is not 16 aligned.

Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31418>
2024-11-14 07:52:56 +00:00
Samuel Pitoiset
b4b5f9eeb0 radv,aco: dump VGPRS from the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
132b7a85c7 aco: drop the second M0 operand for s_set_gpr_idx_on
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
c712555a9f aco: save/restore VGPRS on GFX8 in the trap handler shader
This will be needed for dumping VGPRs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
a77af57e83 aco: use all invocations from the current wave in the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
034014a165 aco: restore m0/exec before exiting the trap handler
Dumping VGPRs will overwrite m0 and exec and they need to be restored
if we want to return to the shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
185a165a85 aco: fix validation for v_movrels_b32 and friends
m0 is the second operand.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
40b343bbee aco: add a new variant for vop1() with two operands
For v_movrels_b32 and friends which need a second operand for m0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
f4cf6a71ed aco: use a 64-bit mov to save exec in the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:53 +00:00
Rhys Perry
7d4cc04156 radv,ac/nir: split global access using nir_lower_mem_access_bit_sizes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
17cc8a5a54 aco: remove load byte_align
8/16-bit loads given to instruction selection now always use VMEM and
scalar load instructions unless alignment easily allows a vector load.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
8fdc5d7f9f radv,ac/nir: lower sub-dword loads using nir_lower_mem_access_bit_sizes
fossil-db (navi21):
Totals from 427 (0.54% of 79395) affected shaders:
Instrs: 2939637 -> 2937224 (-0.08%); split: -0.08%, +0.00%
CodeSize: 15982272 -> 15969880 (-0.08%); split: -0.08%, +0.00%
Latency: 21128645 -> 21125738 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 5626811 -> 5626220 (-0.01%); split: -0.03%, +0.02%
SClause: 65771 -> 65731 (-0.06%); split: -0.07%, +0.00%
Copies: 243247 -> 242917 (-0.14%); split: -0.14%, +0.01%
Branches: 100089 -> 100085 (-0.00%)
PreSGPRs: 17879 -> 18118 (+1.34%)
VALU: 1899641 -> 1899278 (-0.02%)
SALU: 468508 -> 466469 (-0.44%)
SMEM: 84305 -> 84291 (-0.02%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
d3ae1842a2 aco,ac/nir: flag loads to use smem in NIR
This pass will be re-used later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
0619e4db63 nir,aco,ac/llvm: add nir_op_alignbyte_amd
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
db0cbb7e9b aco: optimize nir_op_shfr with <32 src1
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Rhys Perry
bd88c8733a ac/nir: add ACCESS_CAN_REORDER to lowered load_global_constant
fossil-db (navi21):
Totals from 39 (0.05% of 79395) affected shaders:
Instrs: 2619146 -> 2619273 (+0.00%); split: -0.00%, +0.01%
CodeSize: 14158064 -> 14158304 (+0.00%)
Latency: 17277051 -> 17274098 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 4242241 -> 4241746 (-0.01%); split: -0.01%, +0.00%
SClause: 56514 -> 57561 (+1.85%); split: -0.02%, +1.87%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>
2024-11-13 12:59:26 +00:00
Eric Engestrom
6018d15f32 radv/ci: document flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32080>
2024-11-13 12:26:49 +00:00
Samuel Pitoiset
0c77469995 aco: fix saving/restoring VGPRS in the trap handler on GFX9
When ADD_TID_ENABLE=1, DATA_FORMAT is STRIDE[14:17], so the stride
was too large.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32109>
2024-11-13 11:12:54 +00:00
Georg Lehmann
7e8a08ae77 aco: use nir_def_all_uses_ignore_sign_bit
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844>
2024-11-12 18:03:57 +00:00
Samuel Pitoiset
44fa24580f radv: optimize the pipe misaligned L2 cache invalidation on GFX11
When using the subresource range, it's possible to reduce the number
of L2 cache invalidations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a3a65c0c4 radv: pass the image subresource range to radv_{src,dst}_access_flush()
This will allow us to optimize the pipe misaligned special case for
GFX11 because only the first mip in the mip-tail needs the L2 cache
invalidation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
f7a39fac10 radv: use vk_image_view_subresource_range() when possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a8b725d03 radv: determine the first mip that is pipe misaligned on GFX10+
This will allow us to optimize the GFX11 case where not all mips are
affected by the L2 invalidation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
c5d5f2fbef radv: move the GFX11 special case for mips to radv_image_is_pipe_misaligned()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
65bb39bf96 radv: do not always invalidate L2 for GPUs with non-coherent RBs on GFX10+
According to PAL, L2 should be invalidated only for images with
DCC/HTILE even on GPUs with non-coherent RBs. In practice, most of
the images have either DCC/HTILE but this can reduce the number of L2
flushes for images without any compression.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
5e0b81413d radv: emit nir_debug_break instructions when the trap handler is enabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061>
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
2d5df46c25 aco: emit nir_intrinsic_debug_break
s_trap is used to enter the trap.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061>
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
5f79b8ea2d radv,aco: save/restore overwritten VGPRs in the trap handler shader
The trap currently doesn't return to the shader but it will be needed
for example for the debug mode.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
ccde8ecd64 radv: compute the TMA BO size instead of using a constant
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
3e88f996a5 radv: fix the TMA descriptor size
The TMA BO contains the descriptor first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
6ec0c85908 radv,aco: use the trap handler layout struct while compiling the shader
It's less error prone to rely on the layout for offsets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
6bfd92123f aco: simplify postprocessing the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
44dfeb4479 radv,aco: add a separate function to compile the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
62e335c779 radv,aco: dump more SQ_WAVE regs from the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
0cc21d0601 radv: cleanup printing SGPRS dumped from the trap handler
It's more readable like that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Georg Lehmann
ece1ab3b87 radv: run copy prop before vectorizing
Otherwise there are a lot of scalar movs between texture instructions
and alu. With those removed, the top down vectorizer has more starting
points.

Totals from 296 (0.37% of 79206) affected shaders:
MaxWaves: 5710 -> 5754 (+0.77%)
Instrs: 388051 -> 386630 (-0.37%); split: -0.46%, +0.09%
CodeSize: 2120800 -> 2117144 (-0.17%); split: -0.30%, +0.13%
VGPRs: 17496 -> 17344 (-0.87%)
Latency: 8893751 -> 8901364 (+0.09%); split: -0.10%, +0.18%
InvThroughput: 1740411 -> 1731710 (-0.50%); split: -0.57%, +0.07%
VClause: 6573 -> 6576 (+0.05%); split: -0.21%, +0.26%
SClause: 11233 -> 11209 (-0.21%); split: -0.28%, +0.07%
Copies: 31582 -> 31635 (+0.17%); split: -1.49%, +1.66%
PreSGPRs: 15878 -> 15876 (-0.01%)
PreVGPRs: 15380 -> 15274 (-0.69%)
VALU: 278528 -> 277036 (-0.54%); split: -0.65%, +0.11%
SALU: 49062 -> 49054 (-0.02%); split: -0.03%, +0.02%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32060>
2024-11-11 18:33:48 +00:00
Samuel Pitoiset
107f29c39a aco: do not reorder s_trap instructions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32055>
2024-11-11 15:46:36 +00:00
Samuel Pitoiset
30d9166d80 radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>
2024-11-11 09:34:05 +00:00