Rhys Perry
185fa04baa
aco/gfx6: set glc for buffer_store_byte/short
...
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Daniel Schürmann
c452a4d1cc
aco/ra: use round robin register allocation
...
Totals from 74681 (94.06% of 79395) affected shaders: (GFX11)
MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10%
Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05%
CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05%
VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22%
Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23%
InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21%
VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27%
SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33%
Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76%
Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02%
VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19%
SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14%
VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20%
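The difference between the two search heuristics named in this series (first fit and round robin) can be sketched as below. This is only an illustration under assumptions: `RegFile`, `first_fit`, `round_robin` and the `cursor` parameter are hypothetical names and do not reflect how ACO's register allocator is actually written.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical register file: true means the register is currently occupied.
using RegFile = std::vector<bool>;

// First fit: always scan from register 0 and take the first free register.
std::optional<std::size_t> first_fit(const RegFile& regs)
{
   for (std::size_t i = 0; i < regs.size(); i++) {
      if (!regs[i])
         return i;
   }
   return std::nullopt;
}

// Round robin: start scanning right after the previous allocation and wrap
// around, so recently freed low registers are not reused immediately.
std::optional<std::size_t> round_robin(const RegFile& regs, std::size_t& cursor)
{
   for (std::size_t n = 0; n < regs.size(); n++) {
      std::size_t i = (cursor + n) % regs.size();
      if (!regs[i]) {
         cursor = (i + 1) % regs.size();
         return i;
      }
   }
   return std::nullopt;
}
```

In this toy model, round robin spreads values over more registers than first fit does, which may relate to the higher VGPR totals reported above, though the sketch does not establish that.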
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
197943ae27
aco/ra: change heuristic to first fit
...
Totals from 73175 (92.17% of 79395) affected shaders: (GFX11)
MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01%
Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15%
CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12%
VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24%
Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30%
InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25%
VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33%
SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26%
Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20%
Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01%
VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12%
SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11%
VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
d76fc005b6
aco/ra: re-use registers from killed operands
...
Totals from 77283 (97.34% of 79395) affected shaders: (GFX11)
MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02%
Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11%
CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11%
VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65%
Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52%
InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34%
VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53%
SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14%
Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64%
Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03%
VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19%
SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07%
VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
b054cfe704
aco/ra: move can_write_m0() check into get_reg_specified()
...
This way, affinities are also covered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
8e817cf52b
aco/ra: refactor get_reg_simple() with increased stride.
...
This should avoid some redundant calls.
Totals from 153 (0.19% of 79395) affected shaders: (GFX11)
Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05%
CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05%
VGPRs: 10068 -> 10348 (+2.78%)
Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11%
InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02%
VClause: 3868 -> 3870 (+0.05%)
Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34%
Branches: 6479 -> 6480 (+0.02%)
VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
1b0edf3f33
aco/ra: Fix array access when finding register for subdword variables
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
5326e033ff
aco/ra: fix handling of killed operands in compact_relocate_vars()
...
Found by inspection.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:14 +00:00
Rhys Perry
4cfb7a0c17
aco: remove support for sub-dword push constants
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480 >
2024-06-06 17:52:05 +00:00
Georg Lehmann
3fb1a64918
aco: move s_add_u32 -> s_addk_i32 optimization fully to ra
...
Having this in one place is better.
When I wrote the old code, I wasn't aware that checking the kill flag on
definitions is the same as checking for zero uses.
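A minimal sketch of the kind of check this optimization needs after register allocation follows. It assumes that s_addk_i32 adds a sign-extended 16-bit immediate to its destination register, and that its SCC result differs from that of s_add_u32, so the SCC definition must be unused. `ScalarAdd`, `fits_simm16` and `can_use_s_addk_i32` are hypothetical names, not ACO's API.

```cpp
#include <cstdint>

// Hypothetical, simplified view of an s_add_u32 after register allocation.
struct ScalarAdd {
   unsigned dst_reg;    // SGPR written by the add
   unsigned src_reg;    // SGPR read as the first operand
   uint32_t literal;    // 32-bit literal used as the second operand
   bool scc_def_killed; // kill flag on the SCC definition: it has zero uses
};

// True if the literal, read as a signed 32-bit value, fits in 16 signed bits.
bool fits_simm16(uint32_t value)
{
   int32_t s = static_cast<int32_t>(value);
   return s >= INT16_MIN && s <= INT16_MAX;
}

// The replacement is only considered when the add is in-place (same source
// and destination register), nothing reads SCC, and the literal fits.
bool can_use_s_addk_i32(const ScalarAdd& add)
{
   return add.dst_reg == add.src_reg && add.scc_def_killed &&
          fits_simm16(add.literal);
}
```

Because the in-place condition depends on assigned registers, a check like this can only be done during or after register allocation.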
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
60f3f0fdbb
aco/ra: use a switch to check vop2acc instruction support
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
fdc2fb6835
aco: move literal unswizzle opt to RA
...
Much simpler.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
c63c750380
aco/gfx11+: fix inline constants for v_pk_fmac_f16
...
On newer hardware, the hi operation reads the lo half of the inline constant.
On older hardware, it reads the hi half (zero).
I tested this on Navi31 for gfx11 and Raphael for gfx10.
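The behaviour described above can be modelled with a small sketch. It assumes a 16-bit inline constant sits in the low half of a 32-bit operand with a zero high half; `hi_lane_operand` and its `gfx11_plus` flag are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Returns the 16-bit value that the "hi" half of the packed operation sees
// when its source is a 16-bit inline constant.
uint16_t hi_lane_operand(uint16_t inline_const, bool gfx11_plus)
{
   // Assumed layout: constant in bits [15:0], zero in bits [31:16].
   uint32_t operand = inline_const;
   if (gfx11_plus)
      return static_cast<uint16_t>(operand & 0xffffu); // reads the lo half
   return static_cast<uint16_t>(operand >> 16);         // reads the hi half (zero)
}
```

For example, with the half-float constant 1.0 (0x3c00), the hi lane would see 0x3c00 on the newer hardware and 0 on the older hardware in this model.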
Foz-DB Navi31:
Totals from 4 (0.01% of 79395) affected shaders:
CodeSize: 36832 -> 36448 (-1.04%)
Latency: 20362 -> 20334 (-0.14%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
39380d475a
aco: add affinities for possible sopk optimizations
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
fac475bc25
aco: rework how affinities for acc operands are determined
...
Improve accuracy by adding a helper that's also used by
the optimization function.
Foz-DB Navi31:
Totals from 50 (0.06% of 79206) affected shaders:
CodeSize: 126148 -> 126128 (-0.02%); split: -0.05%, +0.04%
Latency: 334049 -> 334060 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 59203 -> 59205 (+0.00%)
Copies: 2011 -> 1998 (-0.65%); split: -0.75%, +0.10%
VALU: 14221 -> 14208 (-0.09%); split: -0.11%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Rhys Perry
8e475bba61
aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
1ad05d4ca8
aco: implement nir_atomic_op_ordered_add_gfx12_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
0dee5fdd3c
aco: don't combine vgpr into writelane src0
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
2a4424425a
aco/gfx12: fix s_wait_event immediate
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
c651eed1d8
aco/gfx12: implement load_subgroup_id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Georg Lehmann
818ff03865
aco: optimize branching sequence with p_create_vector exec producer
...
This happens with inverse_ballot and wave64.
Foz-DB Navi21:
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 2689 -> 2683 (-0.22%)
CodeSize: 14988 -> 14972 (-0.11%)
Latency: 20207 -> 20204 (-0.01%)
Copies: 144 -> 141 (-2.08%)
Branches: 76 -> 73 (-3.95%)
SALU: 241 -> 238 (-1.24%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502 >
2024-06-04 15:40:57 +00:00
Georg Lehmann
dcab408a6c
nir: remove unpack_half_flush_to_zero
...
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433 >
2024-05-31 09:46:35 +00:00
Mike Blumenkrantz
2aaa6ebba1
build/amd: add amd-use-llvm build option
...
This allows AMD drivers to disable LLVM support while still allowing
llvmpipe/lavapipe to be built.
By disabling LLVM support in AMD drivers, the load times for these drivers
decrease by 5-10ms.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969 >
2024-05-30 19:05:00 +00:00
Samuel Pitoiset
ce6557cc04
aco: adjust loading local invocation ID for GS on GFX12
...
It uses gs_vtx_offset[0] instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29417 >
2024-05-30 11:05:04 +00:00
Rhys Perry
ac47ee1be7
meson: remove --depfile for aco_tests
...
This isn't needed right now and probably doesn't work. glsl_scraper.py
writes to the same depfile several times.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29348 >
2024-05-30 09:44:52 +00:00
Rhys Perry
1829d74ad3
aco: fix fddx/y with uniform inf/nan input
...
inf or nan subtracted by itself is not zero.
I don't think Vulkan requires this, but this better matches NIR's constant
folding and the divergent implementation.
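The arithmetic fact behind this fix is easy to check in isolation. The sketch below is not ACO code; `self_difference` is a hypothetical helper that shows why a derivative of a uniform value cannot simply be replaced by a constant zero.

```cpp
#include <cmath>
#include <limits>

// x - x is 0 for every finite x, but NaN when x is infinity or NaN
// (IEEE 754 arithmetic, compiled without fast-math style flags).
float self_difference(float x)
{
   return x - x;
}
```

Calling `self_difference` with `std::numeric_limits<float>::infinity()` or a quiet NaN yields NaN, while any finite input yields 0.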
fossil-db (navi31):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 537 -> 588 (+9.50%)
CodeSize: 3132 -> 3380 (+7.92%)
Latency: 2806 -> 2819 (+0.46%)
InvThroughput: 286 -> 316 (+10.49%)
Copies: 24 -> 39 (+62.50%)
VALU: 262 -> 289 (+10.31%)
SALU: 33 -> 51 (+54.55%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29418 >
2024-05-29 15:18:52 +00:00
Georg Lehmann
b04d99d093
aco/optimizer: use p_create_vector to create mask when a copy can't be used
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
2b56a97374
aco/lower_to_hw: optimize split 64bit constant copies
...
Foz-DB Navi21:
Totals from 3209 (4.04% of 79395) affected shaders:
Instrs: 6502065 -> 6496612 (-0.08%)
CodeSize: 35578300 -> 35556596 (-0.06%)
Latency: 66092924 -> 66092668 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 16968953 -> 16968900 (-0.00%); split: -0.00%, +0.00%
SClause: 198651 -> 198647 (-0.00%)
Copies: 597323 -> 591872 (-0.91%)
SALU: 930918 -> 925467 (-0.59%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
5910a46101
aco/lower_to_hw: use copy_constant_sgpr for masks
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
23d88e68fc
aco: small constant copy optimizations
...
Foz-DB Navi21:
Totals from 13 (0.02% of 79395) affected shaders:
CodeSize: 93432 -> 93376 (-0.06%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
54ad07c32a
aco/lower_to_hw: add copy_constant_sgpr
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
56354c6cd7
aco: don't pass program to emit_bpermute
...
Also change the param order, because the builder typically comes first.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Konstantin Seurer
a93f95c69c
radv/rt: Remove load_rt_dynamic_callable_stack_base_amd
...
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Konstantin Seurer
432f3eb9ca
radv/rt: Track ray_launch_size reads
...
Totals from 33 (8.71% of 379) affected shaders:
Instrs: 1434025 -> 1433988 (-0.00%); split: -0.01%, +0.00%
CodeSize: 7578824 -> 7578472 (-0.00%); split: -0.01%, +0.00%
Latency: 9241632 -> 9241639 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3407014 -> 3407049 (+0.00%); split: -0.00%, +0.00%
VClause: 40399 -> 40391 (-0.02%)
SClause: 37755 -> 37760 (+0.01%); split: -0.04%, +0.05%
Copies: 169588 -> 169567 (-0.01%); split: -0.04%, +0.02%
PreSGPRs: 4323 -> 4319 (-0.09%)
VALU: 940500 -> 940484 (-0.00%); split: -0.00%, +0.00%
SALU: 220508 -> 220509 (+0.00%); split: -0.03%, +0.03%
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Konstantin Seurer
7ba8fccad3
radv/rt: Track ray_launch_id reads
...
We can expect the z-component to be unused most of the time. Avoid
preserving it in those cases.
Totals from 94 (24.80% of 379) affected shaders:
MaxWaves: 916 -> 935 (+2.07%)
Instrs: 3316697 -> 3318357 (+0.05%); split: -0.06%, +0.11%
CodeSize: 17618704 -> 17616680 (-0.01%); split: -0.09%, +0.08%
VGPRs: 11632 -> 11520 (-0.96%)
SpillSGPRs: 1139 -> 1205 (+5.79%); split: -0.35%, +6.15%
Latency: 22595907 -> 22598225 (+0.01%); split: -0.15%, +0.16%
InvThroughput: 7036479 -> 6923740 (-1.60%); split: -1.74%, +0.14%
VClause: 104325 -> 104361 (+0.03%); split: -0.16%, +0.19%
SClause: 83920 -> 83925 (+0.01%); split: -0.08%, +0.08%
Copies: 328140 -> 330687 (+0.78%); split: -0.27%, +1.05%
Branches: 134521 -> 134541 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 8753 -> 8806 (+0.61%)
PreVGPRs: 10984 -> 10937 (-0.43%)
VALU: 2149880 -> 2151318 (+0.07%); split: -0.08%, +0.15%
SALU: 499107 -> 499128 (+0.00%); split: -0.08%, +0.09%
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Rhys Perry
de07fd384d
aco/gfx12: disallow SCC and most constants for BUF SOFFSET
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
12b4bdc134
aco/gfx12: decrease max_nsa_vgprs for VSAMPLE
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
b1b3237590
aco/gfx12: remove MIMG vector affinity
...
Since GFX12 uses NSA unconditionally, there is no code size advantage to
avoiding it.
fossil-db (gfx1200):
Totals from 41700 (52.52% of 79395) affected shaders:
MaxWaves: 1063633 -> 1063623 (-0.00%); split: +0.00%, -0.00%
Instrs: 32745913 -> 32736332 (-0.03%); split: -0.10%, +0.07%
CodeSize: 177664256 -> 177623280 (-0.02%); split: -0.08%, +0.06%
VGPRs: 1668640 -> 1665280 (-0.20%); split: -0.26%, +0.06%
Latency: 248630176 -> 248803989 (+0.07%); split: -0.23%, +0.30%
InvThroughput: 51923793 -> 51958560 (+0.07%); split: -0.15%, +0.22%
VClause: 633381 -> 633594 (+0.03%); split: -0.31%, +0.34%
SClause: 1090207 -> 1090206 (-0.00%); split: -0.02%, +0.02%
Copies: 2042437 -> 2040188 (-0.11%); split: -0.53%, +0.42%
Branches: 680437 -> 680416 (-0.00%); split: -0.01%, +0.01%
VALU: 19387160 -> 19384917 (-0.01%); split: -0.06%, +0.04%
SALU: 3112590 -> 3112540 (-0.00%); split: -0.01%, +0.00%
VOPD: 5474 -> 5527 (+0.97%); split: +2.87%, -1.90%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
ef74407577
aco/gfx12: use ttmp9/ttmp7 for workgroup id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
c8123b67e0
aco/gfx12: don't create v_fmac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
e79a8219d2
aco/gfx12: sign-extend s_getpc_b64
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
ae18c88409
aco/gfx12: implement workgroup barrier
...
Same sequence LLVM uses for llvm.amdgcn.s.barrier.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
fae2a85d57
aco/gfx12: implement subgroup shader clock
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
872dda2bc5
aco: support GFX12 in insert_NOPs
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Samuel Pitoiset
3d6957268b
aco: use new common helpers for building buffer descriptors
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29268 >
2024-05-22 08:31:39 +00:00
Rhys Perry
b99c48b011
aco/lower_phis: don't create boolean loop header phis in some situations
...
If we have a loop with continue_or_break and no divergent exits, there is
no need for a loop header phi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121 >
2024-05-21 21:28:13 +00:00
Rhys Perry
4ae8a558b2
aco: remove nir_to_aco
...
This isn't used anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121 >
2024-05-21 21:28:13 +00:00
Rhys Perry
b1964f03e7
aco: use scalar phi lowering for lcssa workaround
...
This lets us use non-undef for the last operand, if necessary
(demonstrated in the test).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121 >
2024-05-21 21:28:13 +00:00
Rhys Perry
bbe4652430
aco: create lcssa phis for continue_or_break loops when necessary
...
These might not exist because adding them would decrease the quality of
divergence analysis. They are necessary for continue_or_break though, so
add them later, where they won't affect divergence analysis.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10623
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121 >
2024-05-21 21:28:13 +00:00
Rhys Perry
3fc7207f50
aco/lower_phis: create loop header phis for non-boolean loop exit phis
...
These might be necessary if continue_or_break and divergent breaks are both used:
   loop {
      if (divergent) {
         a = loop_invariant_sgpr
         break
      }
      discard_if
   }
   b = phi a
If we break because discard_if makes exec empty, but the divergent break
was only taken in previous iterations, then the phi should use "a" from
those previous iterations.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121 >
2024-05-21 21:28:13 +00:00