Georg Lehmann
2028df8757
aco: don't validate p_constaddr_addlo/p_resumeaddr_addlo operands
...
These can have two literals so validation would fail.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507 >
2023-06-12 19:43:17 +00:00
Georg Lehmann
b9854a9097
aco: move cfg validation to its own function
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507 >
2023-06-12 19:43:17 +00:00
Georg Lehmann
e5df6ee605
aco: make validation work without SSA temps
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507 >
2023-06-12 19:43:17 +00:00
Friedrich Vock
9de8134410
aco: Fix assert in insert_exec_mask
...
This assert would trigger on unconditional demotes, because the demotes
don't remove the mask_type_global flag from the exec mask.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23594 >
2023-06-12 14:20:28 +00:00
Friedrich Vock
3ea01b86f0
aco: Fix live_var_analysis assert
...
Fixes: 3d4f6a00b ('aco/spill: allow for disconnected CFG')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23586 >
2023-06-12 13:52:03 +00:00
Friedrich Vock
7f3cfcc96a
aco: Reset scratch_rsrc on blocks without predecessors
...
Fixes hangs with raytracing in Hellblade: Senua's Sacrifice.
Fixes: 3d4f6a00b ('aco/spill: allow for disconnected CFG')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23586 >
2023-06-12 13:52:03 +00:00
Timur Kristóf
67a0f2532f
aco: Mark exec write used when it writes other registers.
...
When an exec write isn't used but writes other registers
besides exec, and also reads exec (such as s_and_saveexec),
we would mistakenly delete the previous instruction that
writes the exec value that this instruction uses.
No Fossil DB changes on Rembrandt (GFX10.3).
Fixes: 0211e66f65
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9036
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23576 >
2023-06-12 15:07:55 +02:00
Konstantin Seurer
51cd2965c7
aco/rt: Do not initialize the next shader addr
...
The uniform one is already set and the raygen shader isn't guarded
anymore.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23545 >
2023-06-10 10:00:27 +00:00
Friedrich Vock
64fda091de
aco: Lower divergent bool phis iteratively
...
Avoids stack overflows with really large programs.
No fossil-db changes.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8760
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8701
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23531 >
2023-06-09 12:39:55 +00:00
Rhys Perry
56eb831155
aco: run nir_lower_int64 after nir_opt_uniform_atomics
...
nir_opt_uniform_atomics can create 64-bit ALU instructions which need to
be lowered.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23502 >
2023-06-09 11:18:33 +00:00
Rhys Perry
31c8c42f48
aco/tests: test that s_bfe bits is masked
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23464 >
2023-06-08 11:54:45 +01:00
Rhys Perry
08064a5542
aco: mask bits source of s_bfe
...
The s_bfe instructions use 7 bits, not 5 like the NIR opcode requires.
No fossil-db changes (navi21).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9162
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23464 >
2023-06-08 11:54:17 +01:00
Yonggang Luo
586391720b
util: use uint32_t as the parameter of align function
...
align on negative value doesn't make sense
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23421 >
2023-06-08 06:41:21 +00:00
Daniel Schürmann
defdcd2058
aco: adjust RT prolog for shader functions [disables RT]
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Daniel Schürmann
b33c01b00f
radv/rt: set up RT shader args for separate compilation
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Daniel Schürmann
393f3426b6
aco: implement select_program_rt()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Daniel Schürmann
f66f274304
aco: implement nir_intrinsic_load_resume_shader_address_amd
...
Similar to p_constaddr but targeting BBs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Daniel Schürmann
03c4b5b0cc
nir,amd: add nir_intrinsic_store_[scalar|vector]_arg_amd to overwrite inputs
...
This intrinsic must only be used at top-level CF in order
to not break SSA properties.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Daniel Schürmann
1be3a558f2
radv: add remaining RT shader args for separate compilation
...
Also wrap RT args into struct {} rt for improved consistency
and remove some 'ray_' prefixes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Marek Olšák
6dc1ae1759
amd: drop support for LLVM 14
...
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23471 >
2023-06-07 19:56:55 +00:00
Marek Olšák
ab5662dc61
amd: drop support for LLVM 13
...
We can remove the LLVM 13 Wave32 discard workaround and
SI_PROFILE_IGNORE_LLVM13_DISCARD_BUG that disabled the workaround.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23471 >
2023-06-07 19:56:55 +00:00
Rhys Perry
b5c31f14a2
aco: remove memory_barrier_buffer implementation
...
This is no longer used.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624 >
2023-06-07 13:19:42 +00:00
Rhys Perry
417990b19e
aco: consider position/primitive exports around memory barriers
...
This is needed to create barriers which ensure stores finish before
position/primitive exports.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624 >
2023-06-07 13:19:41 +00:00
Georg Lehmann
dfb6d3e443
aco: use v_fma_mix for f2f32 and f2f16 on gfx11 if wave64
...
v_fma_mix can be dual issued, trade some code size for throughput.
Foz-DB GFX1100:
Totals from 8204 (6.08% of 134864) affected shaders:
CodeSize: 89608584 -> 89693968 (+0.10%)
Latency: 160744811 -> 160699309 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 19737977 -> 19678308 (-0.30%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402 >
2023-06-07 12:30:11 +00:00
Georg Lehmann
177dba62a1
aco: use v_add_f{16,32} with clamp for fsat
...
v_add can be dual issued on gfx11, v_med3 cannot.
Don't use v_add directly to still optimize omod(fsat(x)).
Foz-DB GFX1100:
Totals from 32702 (24.24% of 134913) affected shaders:
Latency: 475008203 -> 474928037 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 59226198 -> 59140787 (-0.14%); split: -0.14%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402 >
2023-06-07 12:30:11 +00:00
Georg Lehmann
3a0bc8f007
aco/statistics: improve v_fma_mix dual issuing detection
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402 >
2023-06-07 12:30:11 +00:00
Georg Lehmann
0f023d90c0
aco/ir: return true in hasRegClass for Operand(reg, rc)
...
Makes isOfType() usable after lower_to_hw_instr.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402 >
2023-06-07 12:30:11 +00:00
Rhys Perry
53383fe8a5
aco: fix ds_sub_gs_reg_rtn validation
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Fixes: 8d5cc23c18 ("aco: use gds reg when ordered xfb counter add")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23434 >
2023-06-06 16:09:28 +00:00
Qiang Yu
bd88c75d4c
aco,radv: remove unused gs aco shader info
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23432 >
2023-06-06 10:55:10 +00:00
Rhys Perry
9ae5c942da
aco/tests: add discard export target tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23371 >
2023-06-05 11:42:21 +00:00
Rhys Perry
21867b45c1
aco: fix has_color_exports=true for mrtz exports
...
V_008DFC_SQ_EXP_NULL is after V_008DFC_SQ_EXP_MRTZ.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com
Fixes: d3611af389 ("aco: support nir_export_amd with ps targets")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9135
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23371 >
2023-06-05 11:42:21 +00:00
Qiang Yu
2d0e8e0258
aco: use ac_get_image_dim for array check when image intrinsic
...
This is to avoid missing array flag when <=GFX8 and 3D image
in which case is treated as 2D array image.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
ed97cd92dc
aco: implement nir_bindless_image_fragment_mask_load_amd
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Sigend-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
50f9d644e8
aco: implement nir_xfb_counter_sub_amd
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
8d5cc23c18
aco: use gds reg when ordered xfb counter add
...
This is currently only used by radeonsi.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
438dcf6d0f
aco/assembler: handle ds_(add|sub)_gs_reg_rtn encoding
...
They are different than normal DS instructions, only
use DATA[0], not use ADDR.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
460b528c9e
aco: implement load buffer with ACCESS_USES_FORMAT_AMD
...
This is used by radeonsi for vs input load and cdna
image load emulation.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:59 +00:00
Qiang Yu
12ee7eccf7
aco,radv: remove unused aco_shader_info fields
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:58 +00:00
Qiang Yu
89dab66561
aco: implement two load lds ngg intrininsic for radeonsi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094 >
2023-06-02 09:21:58 +00:00
Georg Lehmann
79821d7afb
aco: p_start_linear_vgpr doesn't always need exec mask
...
Foz-DB Navi21:
Totals from 1605 (1.21% of 132657) affected shaders:
CodeSize: 14023700 -> 14020320 (-0.02%)
Instrs: 2589881 -> 2589052 (-0.03%)
Latency: 22478420 -> 22473359 (-0.02%)
InvThroughput: 3851237 -> 3851092 (-0.00%)
Copies: 215316 -> 215438 (+0.06%); split: -0.39%, +0.44%
Allows more vcmpx usage.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23302 >
2023-06-01 20:18:23 +00:00
Georg Lehmann
7836260af8
aco: cleanup v_cmp_class usage
...
It's not the best way to check for NaN.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23310 >
2023-05-31 12:56:37 +00:00
Samuel Pitoiset
1947500208
aco: remove nir_intrinsic_load_barycentric_at_sample occurences
...
This is lowered earlier and shouldn't get there.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23307 >
2023-05-31 07:25:46 +00:00
Chia-I Wu
750d641ca6
aco: fix alignment check in emit_load
...
align_offset already takes const_offset into consideration. Remove the
adjustment.
Fixes: 542733dbbf ("aco: add emit_load helper")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9097
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23242 >
2023-05-30 16:02:34 +00:00
Rhys Perry
94958e637d
aco: improve printing of s_delay_alu
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Rhys Perry
54c0088629
aco: insert s_delay_alu on the linear CFG
...
fossil-db (gfx1100):
Totals from 10498 (7.87% of 133428) affected shaders:
Instrs: 22274711 -> 22277717 (+0.01%); split: -0.01%, +0.03%
CodeSize: 114557040 -> 114569064 (+0.01%); split: -0.01%, +0.02%
Latency: 236505186 -> 236497338 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 33425052 -> 33423876 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Rhys Perry
d7f48a61ec
aco: use pass_flags to recover s_delay_alu cycles
...
This is simpler and more accurate.
fossil-db (gfx1100):
Totals from 11678 (8.75% of 133428) affected shaders:
Instrs: 25448655 -> 25436028 (-0.05%)
CodeSize: 130364728 -> 130314220 (-0.04%)
Latency: 325247603 -> 325231064 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 45901166 -> 45900022 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Rhys Perry
d9cdb3524a
aco: fix update_alu(clear=true) for exports
...
For:
v_mov_b32_e32 v0, 1.0
exp mrtz v0, off, off, off
we should completely remove the ALU entry before creating the EXP's WaR entry for v0.
Otherwise, the two will be combined into an entry which will wait for
expcnt(0) for later uses of v0.
gen_alu() should also be before gen(), since gen_alu() performs the clear
while gen() creates the WaR entry.
fossil-db (gfx1100):
Totals from 3589 (2.69% of 133428) affected shaders:
Instrs: 5591041 -> 5589047 (-0.04%); split: -0.04%, +0.00%
CodeSize: 28580840 -> 28572864 (-0.03%); split: -0.03%, +0.00%
Latency: 65427923 -> 65427543 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 11109079 -> 11109065 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Konstantin Seurer
03a9715a68
amd: Use the Mesa base style
...
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23275 >
2023-05-29 21:06:12 +00:00
Rhys Perry
c9cfe7bc80
aco/tests: add fix_derivs_in_divergent_cf tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636 >
2023-05-25 16:29:16 +00:00
Rhys Perry
02b933981c
aco/tests: improve performance of declaration parsing
...
Unlike \S, \w only matches characters which are valid in identifiers. This
seems to be much faster, especially for longer identifier names.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636 >
2023-05-25 16:29:16 +00:00