Samuel Pitoiset
034014a165
aco: restore m0/exec before exiting the trap handler
...
Dumping VGPRs will overwrite m0 and exec and they need to be restored
if we want to return to the shader.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090 >
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
185a165a85
aco: fix validation for v_movrels_b32 and friends
...
m0 is the second operand.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090 >
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
40b343bbee
aco: add a new variant for vop1() with two operands
...
For v_movrels_b32 and friends which need a second operand for m0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090 >
2024-11-13 15:27:53 +00:00
Samuel Pitoiset
f4cf6a71ed
aco: use a 64-bit mov to save exec in the trap handler shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090 >
2024-11-13 15:27:53 +00:00
Rhys Perry
7d4cc04156
radv,ac/nir: split global access using nir_lower_mem_access_bit_sizes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Rhys Perry
17cc8a5a54
aco: remove load byte_align
...
8/16-bit loads given to instruction selection now always use VMEM and
scalar load instructions unless alignment easily allows a vector load.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Rhys Perry
8fdc5d7f9f
radv,ac/nir: lower sub-dword loads using nir_lower_mem_access_bit_sizes
...
fossil-db (navi21):
Totals from 427 (0.54% of 79395) affected shaders:
Instrs: 2939637 -> 2937224 (-0.08%); split: -0.08%, +0.00%
CodeSize: 15982272 -> 15969880 (-0.08%); split: -0.08%, +0.00%
Latency: 21128645 -> 21125738 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 5626811 -> 5626220 (-0.01%); split: -0.03%, +0.02%
SClause: 65771 -> 65731 (-0.06%); split: -0.07%, +0.00%
Copies: 243247 -> 242917 (-0.14%); split: -0.14%, +0.01%
Branches: 100089 -> 100085 (-0.00%)
PreSGPRs: 17879 -> 18118 (+1.34%)
VALU: 1899641 -> 1899278 (-0.02%)
SALU: 468508 -> 466469 (-0.44%)
SMEM: 84305 -> 84291 (-0.02%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
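The percentages in the fossil-db table of the entry above read as plain relative changes between the "before" and "after" totals. The following is a minimal sketch of that arithmetic, assuming the figures are rounded to two decimals; the helper name is illustrative and is not part of Mesa or fossil-db.

```c
#include <assert.h>

/* Relative change between two counters, in hundredths of a percent,
 * rounded half away from zero. Hypothetical helper for illustration. */
static long percent_change_hundredths(long before, long after)
{
    assert(before != 0);
    double scaled = 10000.0 * (double)(after - before) / (double)before;
    return (long)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}
```

For example, the SALU line (468508 -> 466469) gives -44 hundredths, which matches the reported -0.44%, and the PreSGPRs line (17879 -> 18118) gives 134, matching +1.34%.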
Rhys Perry
d3ae1842a2
aco,ac/nir: flag loads to use smem in NIR
...
This pass will be re-used later.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Rhys Perry
0619e4db63
nir,aco,ac/llvm: add nir_op_alignbyte_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Rhys Perry
db0cbb7e9b
aco: optimize nir_op_shfr with <32 src1
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Rhys Perry
bd88c8733a
ac/nir: add ACCESS_CAN_REORDER to lowered load_global_constant
...
fossil-db (navi21):
Totals from 39 (0.05% of 79395) affected shaders:
Instrs: 2619146 -> 2619273 (+0.00%); split: -0.00%, +0.01%
CodeSize: 14158064 -> 14158304 (+0.00%)
Latency: 17277051 -> 17274098 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 4242241 -> 4241746 (-0.01%); split: -0.01%, +0.00%
SClause: 56514 -> 57561 (+1.85%); split: -0.02%, +1.87%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904 >
2024-11-13 12:59:26 +00:00
Eric Engestrom
6018d15f32
radv/ci: document flakes seen recently
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32080 >
2024-11-13 12:26:49 +00:00
Samuel Pitoiset
0c77469995
aco: fix saving/restoring VGPRs in the trap handler on GFX9
...
When ADD_TID_ENABLE=1, DATA_FORMAT is STRIDE[14:17], so the stride
was too large.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32109 >
2024-11-13 11:12:54 +00:00
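The stride problem described in the entry above can be sketched as follows. This assumes the regular STRIDE field holds the low 14 bits and that, with ADD_TID_ENABLE=1, the 4-bit DATA_FORMAT field supplies stride bits 14 to 17, as the commit message states. The function and parameter names are hypothetical; they are not taken from ACO or RADV.

```c
#include <assert.h>
#include <stdint.h>

/* Stride the hardware would see for a buffer descriptor, given the raw
 * field values. Field widths (14 and 4 bits) are assumptions based on the
 * commit message, not copied from the register headers. */
static uint32_t effective_stride(uint32_t stride_field,
                                 uint32_t data_format_field,
                                 int add_tid_enable)
{
    assert(stride_field < (1u << 14));
    assert(data_format_field < (1u << 4));

    if (!add_tid_enable)
        return stride_field;

    /* DATA_FORMAT is reinterpreted as the high stride bits. */
    return stride_field | (data_format_field << 14);
}
```

With a 4-byte stride and a leftover DATA_FORMAT value of 4, the result is 65540 bytes instead of 4, which is one way the stride could end up "too large".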
Georg Lehmann
7e8a08ae77
aco: use nir_def_all_uses_ignore_sign_bit
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844 >
2024-11-12 18:03:57 +00:00
Samuel Pitoiset
44fa24580f
radv: optimize the pipe misaligned L2 cache invalidation on GFX11
...
When using the subresource range, it's possible to reduce the number
of L2 cache invalidations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a3a65c0c4
radv: pass the image subresource range to radv_{src,dst}_access_flush()
...
This will allow us to optimize the pipe misaligned special case for
GFX11 because only the first mip in the mip-tail needs the L2 cache
invalidation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
f7a39fac10
radv: use vk_image_view_subresource_range() when possible
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a8b725d03
radv: determine the first mip that is pipe misaligned on GFX10+
...
This will allow us to optimize the GFX11 case where not all mips are
affected by the L2 invalidation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
c5d5f2fbef
radv: move the GFX11 special case for mips to radv_image_is_pipe_misaligned()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
65bb39bf96
radv: do not always invalidate L2 for GPUs with non-coherent RBs on GFX10+
...
According to PAL, L2 should be invalidated only for images with
DCC/HTILE, even on GPUs with non-coherent RBs. In practice, most
images have either DCC or HTILE, but this can reduce the number of L2
flushes for images without any compression.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
5e0b81413d
radv: emit nir_debug_break instructions when the trap handler is enabled
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
2d5df46c25
aco: emit nir_intrinsic_debug_break
...
s_trap is used to enter the trap.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
5f79b8ea2d
radv,aco: save/restore overwritten VGPRs in the trap handler shader
...
The trap handler currently doesn't return to the shader, but returning
will be needed, for example, for the debug mode.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
ccde8ecd64
radv: compute the TMA BO size instead of using a constant
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
3e88f996a5
radv: fix the TMA descriptor size
...
The TMA BO contains the descriptor first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
6ec0c85908
radv,aco: use the trap handler layout struct while compiling the shader
...
It's less error prone to rely on the layout for offsets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
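The idea behind relying on a layout struct for offsets, and computing the buffer size from it, can be sketched as below. The struct, its members and their sizes are hypothetical and chosen only for illustration; the real trap handler layout in RADV differs. The only detail taken from the log is that the descriptor comes first in the TMA BO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the memory written by the trap handler. */
struct example_trap_layout {
    uint32_t descriptor[4];  /* descriptor is stored first in the BO */
    uint32_t sq_wave_regs[8];
    uint32_t m0;
    uint32_t exec_lo;
    uint32_t exec_hi;
    uint32_t sgprs[104];
};

/* Byte offset of one dumped SGPR, derived from the struct rather than
 * from a hand-maintained constant. */
static size_t example_sgpr_offset(unsigned index)
{
    assert(index < 104);
    return offsetof(struct example_trap_layout, sgprs) +
           index * sizeof(uint32_t);
}

/* Size of the buffer, computed instead of hardcoded. */
static size_t example_bo_size(void)
{
    return sizeof(struct example_trap_layout);
}
```

Adding or reordering a member then updates every offset and the total size automatically, which is the kind of error the commits above aim to avoid.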
Samuel Pitoiset
6bfd92123f
aco: simplify postprocessing the trap handler shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
44dfeb4479
radv,aco: add a separate function to compile the trap handler shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
62e335c779
radv,aco: dump more SQ_WAVE regs from the trap handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
0cc21d0601
radv: cleanup printing SGPRs dumped from the trap handler
...
It's more readable like that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Georg Lehmann
ece1ab3b87
radv: run copy prop before vectorizing
...
Otherwise there are a lot of scalar movs between texture instructions
and alu. With those removed, the top down vectorizer has more starting
points.
Totals from 296 (0.37% of 79206) affected shaders:
MaxWaves: 5710 -> 5754 (+0.77%)
Instrs: 388051 -> 386630 (-0.37%); split: -0.46%, +0.09%
CodeSize: 2120800 -> 2117144 (-0.17%); split: -0.30%, +0.13%
VGPRs: 17496 -> 17344 (-0.87%)
Latency: 8893751 -> 8901364 (+0.09%); split: -0.10%, +0.18%
InvThroughput: 1740411 -> 1731710 (-0.50%); split: -0.57%, +0.07%
VClause: 6573 -> 6576 (+0.05%); split: -0.21%, +0.26%
SClause: 11233 -> 11209 (-0.21%); split: -0.28%, +0.07%
Copies: 31582 -> 31635 (+0.17%); split: -1.49%, +1.66%
PreSGPRs: 15878 -> 15876 (-0.01%)
PreVGPRs: 15380 -> 15274 (-0.69%)
VALU: 278528 -> 277036 (-0.54%); split: -0.65%, +0.11%
SALU: 49062 -> 49054 (-0.02%); split: -0.03%, +0.02%
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32060 >
2024-11-11 18:33:48 +00:00
Samuel Pitoiset
107f29c39a
aco: do not reorder s_trap instructions
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32055 >
2024-11-11 15:46:36 +00:00
Samuel Pitoiset
30d9166d80
radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031 >
2024-11-11 09:34:05 +00:00
Samuel Pitoiset
4d50691ae9
radv: remove unused parameter to radv_fill_nir_compiler_options()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031 >
2024-11-11 09:34:05 +00:00
Konstantin Seurer
e3cf6290e0
radv: Add RADV_DEBUG=nirdebuginfo
...
Annotates the nir shader with source locations.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:14 +00:00
Konstantin Seurer
736c8c6f23
radv: Dump nir shaders before compiling
...
It will allow adding source locations that point to the nir_string to
the shader.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:14 +00:00
Konstantin Seurer
aaf65d6219
radv: Store debug info inside radv_shader
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:14 +00:00
Konstantin Seurer
54c22656b8
radv: Add a helper for accessing the shader binary
...
Use pointers into the blob instead of hardcoding the layout everywhere.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:13 +00:00
Konstantin Seurer
69ebba82d4
aco: Pass debug information to the driver
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:13 +00:00
Konstantin Seurer
f8ef1afec8
aco: Handle nir_debug_info_instr
...
Propagated debug info using p_debug_info and Program::debug_info.
Offsets into the shader binary are gathered during assembly.
This will be useful for mapping the disassembled shader back to
nir, glsl or spirv.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:13 +00:00
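One way the offsets gathered during assembly could be used to map a position in the shader binary back to a source location is a lookup in a table sorted by offset. The sketch below assumes such a table exists; the struct and function names are hypothetical and do not reflect the actual ac_shader_debug_info or ACO data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: code starting at 'offset' in the binary comes
 * from source line 'line'. */
struct example_debug_entry {
    uint32_t offset;
    uint32_t line;
};

/* Binary search for the last entry whose offset is <= binary_offset.
 * 'entries' must be sorted by ascending offset. Returns NULL when the
 * offset lies before the first entry or the table is empty. */
static const struct example_debug_entry *
example_find_entry(const struct example_debug_entry *entries, size_t count,
                   uint32_t binary_offset)
{
    assert(entries != NULL || count == 0);

    const struct example_debug_entry *best = NULL;
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].offset <= binary_offset) {
            best = &entries[mid];
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

A disassembler front end could call this once per instruction offset to annotate each line of disassembly with its originating nir, glsl or spirv location.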
Konstantin Seurer
7dd9840128
amd: Add ac_shader_debug_info
...
This is very similar to nir_debug_info_instr but it can exist outside of
a nir shader.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298 >
2024-11-11 08:39:13 +00:00
Konstantin Seurer
c5e40a60f8
radv: Lower non-uniform access after vectorization
...
Scalar access can make nir_lower_non_uniform_access emit a lot of
waterfall loops.
Totals from 83 (0.10% of 84770) affected shaders:
Instrs: 2747926 -> 2745959 (-0.07%); split: -0.07%, +0.00%
CodeSize: 15022460 -> 14998240 (-0.16%); split: -0.16%, +0.00%
Latency: 18602932 -> 18404976 (-1.06%); split: -1.18%, +0.12%
InvThroughput: 4500730 -> 4450364 (-1.12%); split: -1.18%, +0.06%
VClause: 93651 -> 91848 (-1.93%); split: -1.93%, +0.00%
SClause: 63672 -> 63595 (-0.12%); split: -0.13%, +0.00%
Copies: 229377 -> 229896 (+0.23%); split: -0.04%, +0.27%
Branches: 107630 -> 107627 (-0.00%); split: -0.01%, +0.00%
PreSGPRs: 5247 -> 5253 (+0.11%)
PreVGPRs: 5911 -> 5903 (-0.14%); split: -0.29%, +0.15%
VALU: 1761158 -> 1761540 (+0.02%); split: -0.01%, +0.03%
SALU: 419743 -> 419783 (+0.01%); split: -0.01%, +0.02%
VMEM: 152142 -> 150208 (-1.27%)
SMEM: 80251 -> 80244 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509 >
2024-11-11 07:53:13 +00:00
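For readers unfamiliar with waterfall loops, the following is a small CPU-side simulation of the general technique, not the code that nir_lower_non_uniform_access emits. It assumes a 32-lane wave and counts how many iterations are needed to process a non-uniform index: each iteration picks the value of the first remaining lane and retires every lane that shares it. All names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 32

/* Number of waterfall iterations needed for one non-uniform access.
 * 'index' holds the per-lane value, 'exec' is the mask of active lanes. */
static unsigned waterfall_iterations(const uint32_t index[WAVE_SIZE],
                                     uint32_t exec)
{
    unsigned iterations = 0;
    uint32_t remaining = exec;

    while (remaining) {
        /* Equivalent of readfirstlane: take the lowest remaining lane. */
        unsigned first = 0;
        while (!(remaining & (1u << first)))
            first++;
        uint32_t uniform_value = index[first];

        /* Lanes with the same value execute the access in this pass. */
        uint32_t done = 0;
        for (unsigned lane = 0; lane < WAVE_SIZE; lane++) {
            if ((remaining & (1u << lane)) && index[lane] == uniform_value)
                done |= 1u << lane;
        }

        remaining &= ~done;
        iterations++;
    }
    return iterations;
}
```

Each separate non-uniform access needs its own loop of this kind, so a plausible reading of the commit is that vectorizing first merges several scalar accesses into one and therefore leaves fewer loops to emit.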
Visan, Tiberiu
d379a3a428
amd/vpelib: remove luma offset (#459)
...
[WHY]
Shader and VPE do not apply brightness adjustments in the same manner
[HOW]
Removed luma offset added in VPE
[TESTING]
Tested on real time video rendering
Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075 >
2024-11-11 13:00:54 +08:00
Visan, Tiberiu
2172ab2c2a
amd/vpelib: patch to match shader (#456)
...
[WHY]
Shader and VPE had different behavior while adjusting the brightness
[HOW]
Apply the same normalization factor
[TESTING]
Tested on real video outputs
Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075 >
2024-11-11 13:00:44 +08:00
Leder, Brendan Steve
891c4694ba
amd/vpelib: Refactor OCSC and update missing check
...
Added the missing check for 601 in the limited format check.
Refactored OCSC to use specific limited depths.
Cleaned up general color processing.
Co-authored-by: Brendan <breleder@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075 >
2024-11-11 13:00:29 +08:00
Samuel Pitoiset
437bd63265
radv,aco: dump m0 and exec from the trap handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026 >
2024-11-08 14:00:15 +00:00
Samuel Pitoiset
d1d41be43f
aco: declare phys regs for tba_hi/tma_hi
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026 >
2024-11-08 14:00:15 +00:00
Samuel Pitoiset
13bab450a2
aco: fix storing SQ_WAVE_STATUS in the trap handler shader
...
SQ_WAVE_STATUS can change inside the trap because of SCC.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026 >
2024-11-08 14:00:14 +00:00
Samuel Pitoiset
494050d2ea
aco: add a helper to dump SGPR to memory for the trap handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026 >
2024-11-08 14:00:14 +00:00
Samuel Pitoiset
8c6f2fef1b
aco: use scalar buffer stores for dumping SGPRs from the trap on GFX8
...
This avoids using any VGPRs on GFX8.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026 >
2024-11-08 14:00:14 +00:00