Commit graph

2571 commits

Author SHA1 Message Date
Georg Lehmann
2028df8757 aco: don't validate p_constaddr_addlo/p_resumeaddr_addlo operands
These can have two literals so validation would fail.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507>
2023-06-12 19:43:17 +00:00
Georg Lehmann
b9854a9097 aco: move cfg validation to its own function
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507>
2023-06-12 19:43:17 +00:00
Georg Lehmann
e5df6ee605 aco: make validation work without SSA temps
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507>
2023-06-12 19:43:17 +00:00
Friedrich Vock
9de8134410 aco: Fix assert in insert_exec_mask
This assert would trigger on unconditional demotes, because the demotes
don't remove the mask_type_global flag from the exec mask.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23594>
2023-06-12 14:20:28 +00:00
Friedrich Vock
3ea01b86f0 aco: Fix live_var_analysis assert
Fixes: 3d4f6a00b ('aco/spill: allow for disconnected CFG')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23586>
2023-06-12 13:52:03 +00:00
Friedrich Vock
7f3cfcc96a aco: Reset scratch_rsrc on blocks without predecessors
Fixes hangs with raytracing in Hellblade: Senua's Sacrifice.

Fixes: 3d4f6a00b ('aco/spill: allow for disconnected CFG')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23586>
2023-06-12 13:52:03 +00:00
Timur Kristóf
67a0f2532f aco: Mark exec write used when it writes other registers.
When an exec write isn't used but writes other registers
besides exec, and also reads exec (such as s_and_saveexec),
we would mistakenly delete the previous instruction that
writes the exec value that this instruction uses.

No Fossil DB changes on Rembrandt (GFX10.3).

Fixes: 0211e66f65
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9036
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23576>
2023-06-12 15:07:55 +02:00
Konstantin Seurer
51cd2965c7 aco/rt: Do not initialize the next shader addr
The uniform one is already set and the raygen shader isn't guarded
anymore.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23545>
2023-06-10 10:00:27 +00:00
Friedrich Vock
64fda091de aco: Lower divergent bool phis iteratively
Avoids stack overflows with really large programs.

No fossil-db changes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8760
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8701

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23531>
2023-06-09 12:39:55 +00:00
Rhys Perry
56eb831155 aco: run nir_lower_int64 after nir_opt_uniform_atomics
nir_opt_uniform_atomics can create 64-bit ALU instructions which need to
be lowered.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23502>
2023-06-09 11:18:33 +00:00
Rhys Perry
31c8c42f48 aco/tests: test that s_bfe bits is masked
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23464>
2023-06-08 11:54:45 +01:00
Rhys Perry
08064a5542 aco: mask bits source of s_bfe
The s_bfe instructions use 7 bits, not 5 like the NIR opcode requires.

No fossil-db changes (navi21).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9162
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23464>
2023-06-08 11:54:17 +01:00
Yonggang Luo
586391720b util: use uint32_t as the parameter of align function
align on negative value doesn't make sense

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23421>
2023-06-08 06:41:21 +00:00
Daniel Schürmann
defdcd2058 aco: adjust RT prolog for shader functions [disables RT]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
b33c01b00f radv/rt: set up RT shader args for separate compilation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
393f3426b6 aco: implement select_program_rt()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
f66f274304 aco: implement nir_intrinsic_load_resume_shader_address_amd
Similar to p_constaddr but targeting BBs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
03c4b5b0cc nir,amd: add nir_intrinsic_store_[scalar|vector]_arg_amd to overwrite inputs
This intrinsic must only be used at top-level CF in order
to not break SSA properties.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
1be3a558f2 radv: add remaining RT shader args for separate compilation
Also wrap RT args into struct {} rt for improved consistency
and remove some 'ray_' prefixes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Marek Olšák
6dc1ae1759 amd: drop support for LLVM 14
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23471>
2023-06-07 19:56:55 +00:00
Marek Olšák
ab5662dc61 amd: drop support for LLVM 13
We can remove the LLVM 13 Wave32 discard workaround and
SI_PROFILE_IGNORE_LLVM13_DISCARD_BUG that disabled the workaround.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23471>
2023-06-07 19:56:55 +00:00
Rhys Perry
b5c31f14a2 aco: remove memory_barrier_buffer implementation
This is no longer used.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624>
2023-06-07 13:19:42 +00:00
Rhys Perry
417990b19e aco: consider position/primitive exports around memory barriers
This is needed to create barriers which ensure stores finish before
position/primitive exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624>
2023-06-07 13:19:41 +00:00
Georg Lehmann
dfb6d3e443 aco: use v_fma_mix for f2f32 and f2f16 on gfx11 if wave64
v_fma_mix can be dual issued, trade some code size for throughput.

Foz-DB GFX1100:
Totals from 8204 (6.08% of 134864) affected shaders:
CodeSize: 89608584 -> 89693968 (+0.10%)
Latency: 160744811 -> 160699309 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 19737977 -> 19678308 (-0.30%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
2023-06-07 12:30:11 +00:00
Georg Lehmann
177dba62a1 aco: use v_add_f{16,32} with clamp for fsat
v_add can be dual issued on gfx11, v_med3 cannot.
Don't use v_add directly to still optimize omod(fsat(x)).

Foz-DB GFX1100:
Totals from 32702 (24.24% of 134913) affected shaders:
Latency: 475008203 -> 474928037 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 59226198 -> 59140787 (-0.14%); split: -0.14%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
2023-06-07 12:30:11 +00:00
Georg Lehmann
3a0bc8f007 aco/statistics: improve v_fma_mix dual issuing detection
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
2023-06-07 12:30:11 +00:00
Georg Lehmann
0f023d90c0 aco/ir: return true in hasRegClass for Operand(reg, rc)
Makes isOfType() usable after lower_to_hw_instr.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
2023-06-07 12:30:11 +00:00
Rhys Perry
53383fe8a5 aco: fix ds_sub_gs_reg_rtn validation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Fixes: 8d5cc23c18 ("aco: use gds reg when ordered xfb counter add")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23434>
2023-06-06 16:09:28 +00:00
Qiang Yu
bd88c75d4c aco,radv: remove unused gs aco shader info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23432>
2023-06-06 10:55:10 +00:00
Rhys Perry
9ae5c942da aco/tests: add discard export target tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23371>
2023-06-05 11:42:21 +00:00
Rhys Perry
21867b45c1 aco: fix has_color_exports=true for mrtz exports
V_008DFC_SQ_EXP_NULL is after V_008DFC_SQ_EXP_MRTZ.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com
Fixes: d3611af389 ("aco: support nir_export_amd with ps targets")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9135
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23371>
2023-06-05 11:42:21 +00:00
Qiang Yu
2d0e8e0258 aco: use ac_get_image_dim for array check when image intrinsic
This is to avoid missing array flag when <=GFX8 and 3D image
in which case is treated as 2D array image.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
ed97cd92dc aco: implement nir_bindless_image_fragment_mask_load_amd
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Sigend-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
50f9d644e8 aco: implement nir_xfb_counter_sub_amd
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
8d5cc23c18 aco: use gds reg when ordered xfb counter add
This is currently only used by radeonsi.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
438dcf6d0f aco/assembler: handle ds_(add|sub)_gs_reg_rtn encoding
They are different than normal DS instructions, only
use DATA[0], not use ADDR.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
460b528c9e aco: implement load buffer with ACCESS_USES_FORMAT_AMD
This is used by radeonsi for vs input load and cdna
image load emulation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:59 +00:00
Qiang Yu
12ee7eccf7 aco,radv: remove unused aco_shader_info fields
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:58 +00:00
Qiang Yu
89dab66561 aco: implement two load lds ngg intrininsic for radeonsi
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
2023-06-02 09:21:58 +00:00
Georg Lehmann
79821d7afb aco: p_start_linear_vgpr doesn't always need exec mask
Foz-DB Navi21:
Totals from 1605 (1.21% of 132657) affected shaders:
CodeSize: 14023700 -> 14020320 (-0.02%)
Instrs: 2589881 -> 2589052 (-0.03%)
Latency: 22478420 -> 22473359 (-0.02%)
InvThroughput: 3851237 -> 3851092 (-0.00%)
Copies: 215316 -> 215438 (+0.06%); split: -0.39%, +0.44%

Allows more vcmpx usage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23302>
2023-06-01 20:18:23 +00:00
Georg Lehmann
7836260af8 aco: cleanup v_cmp_class usage
It's not the best way to check for NaN.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23310>
2023-05-31 12:56:37 +00:00
Samuel Pitoiset
1947500208 aco: remove nir_intrinsic_load_barycentric_at_sample occurences
This is lowered earlier and shouldn't get there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23307>
2023-05-31 07:25:46 +00:00
Chia-I Wu
750d641ca6 aco: fix alignment check in emit_load
align_offset already takes const_offset into consideration.  Remove the
adjustment.

Fixes: 542733dbbf ("aco: add emit_load helper")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9097
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23242>
2023-05-30 16:02:34 +00:00
Rhys Perry
94958e637d aco: improve printing of s_delay_alu
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
2023-05-30 12:42:00 +00:00
Rhys Perry
54c0088629 aco: insert s_delay_alu on the linear CFG
fossil-db (gfx1100):
Totals from 10498 (7.87% of 133428) affected shaders:
Instrs: 22274711 -> 22277717 (+0.01%); split: -0.01%, +0.03%
CodeSize: 114557040 -> 114569064 (+0.01%); split: -0.01%, +0.02%
Latency: 236505186 -> 236497338 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 33425052 -> 33423876 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
2023-05-30 12:42:00 +00:00
Rhys Perry
d7f48a61ec aco: use pass_flags to recover s_delay_alu cycles
This is simpler and more accurate.

fossil-db (gfx1100):
Totals from 11678 (8.75% of 133428) affected shaders:
Instrs: 25448655 -> 25436028 (-0.05%)
CodeSize: 130364728 -> 130314220 (-0.04%)
Latency: 325247603 -> 325231064 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 45901166 -> 45900022 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
2023-05-30 12:42:00 +00:00
Rhys Perry
d9cdb3524a aco: fix update_alu(clear=true) for exports
For:
   v_mov_b32_e32 v0, 1.0
   exp mrtz v0, off, off, off
we should completely remove the ALU entry before creating the EXP's WaR entry for v0.
Otherwise, the two will be combined into an entry which will wait for
expcnt(0) for later uses of v0.

gen_alu() should also be before gen(), since gen_alu() performs the clear
while gen() creates the WaR entry.

fossil-db (gfx1100):
Totals from 3589 (2.69% of 133428) affected shaders:
Instrs: 5591041 -> 5589047 (-0.04%); split: -0.04%, +0.00%
CodeSize: 28580840 -> 28572864 (-0.03%); split: -0.03%, +0.00%
Latency: 65427923 -> 65427543 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 11109079 -> 11109065 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
2023-05-30 12:42:00 +00:00
Konstantin Seurer
03a9715a68 amd: Use the Mesa base style
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23275>
2023-05-29 21:06:12 +00:00
Rhys Perry
c9cfe7bc80 aco/tests: add fix_derivs_in_divergent_cf tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>
2023-05-25 16:29:16 +00:00
Rhys Perry
02b933981c aco/tests: improve performance of declaration parsing
Unlike \S, \w only matches characters which are valid in identifiers. This
seems to be much faster, especially for longer identifier names.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>
2023-05-25 16:29:16 +00:00