Commit graph

206 commits

Author SHA1 Message Date
Natalie Vock
ea66a8d1c5 aco,nir: Add support for GFX12 ds_bvh_stack_push8_pop1_rtn_b32 instruction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00
Natalie Vock
9707b30965 nir,aco: Add ds_bvh_stack_rtn
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:39 +00:00
Natalie Vock
c515f1fd58 aco: Use vector-aligned operands for image_bvh8_intersect_ray
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:38 +00:00
Georg Lehmann
d45f375a9d aco: only insert fp mode when needed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35746>
2025-07-10 13:48:50 +00:00
Natalie Vock
630913e1b4 aco: Introduce static_scratch_rsrc program member
Function callees get their scratch resource as a parameter instead of
generating it on-the-fly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
4a62b342f3 aco: Add common utility to load scratch descriptor
Also modifies the scratch descriptor to take the stack pointer into
account.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:52 +00:00
Georg Lehmann
001cd632ee aco: select float8 to fp32 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:27 +00:00
Georg Lehmann
9e6adcbca0 aco: select fp32 to float8 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:26 +00:00
Georg Lehmann
94c191e6d9 aco: remove p_v_cvt_pk_u8_f32
Now unused.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:04 +00:00
Rhys Perry
86ccceb4de aco: don't consider gfx1153 to have point sample acceleration
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:55:13 +01:00
Rhys Perry
1fdfdbaf92 aco/hard_clauses: simplify and complete get_type()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This now includes image_msaa_load and the new atomic instructions in
GFX12.

It also treats point sample accelerated MIMG as either sample or load,
like the waitcnt insertion pass. I'm not sure if that's necessary or not,
though.

No fossil-db changes (gfx1201, gfx1150 and navi31).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>
2025-06-02 10:28:10 +00:00
Rhys Perry
8764ec0230 aco: consider image_msaa_load a sample operation before gfx12
LLVM commit 62dea99a7d7df9daedbb86133f3d46699cd2728d made this instruction
a sample for all GFX levels, then with f898161bfa95723954a273a519180e070a5ccd2e
it was changed to be GFX12+. Now 34b6285735c999d2fab77b0ff8e5b497d86df3af
changed it to be all GFX levels again.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>
2025-06-02 10:28:09 +00:00
Rhys Perry
c04ef28c56 aco: rename ops_fixed_to_def to tied_defs
This is less words.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>
2025-05-20 15:40:46 +00:00
Georg Lehmann
0257644130 aco: assume sram ecc is enabled on Vega20
There are D16 load issues on Vega20 that are expected if sram ecc is enabled.
It's a professional class chip and I found mentions of it supporting ecc,
so assume it's enabled.

Maybe this could be improved by querying ecc info from the kernel, but
I'm not sure which query should be used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13189
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12393
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35045>
2025-05-20 08:24:21 +00:00
Georg Lehmann
f98d20fec6 aco: replace get_operand_size with get_operand_type
To not use constant_bits everywhere.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35020>
2025-05-19 13:05:48 +00:00
Georg Lehmann
fa3f207035 aco: swap operands without instructions
For future work.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35020>
2025-05-19 13:05:48 +00:00
Georg Lehmann
3f70433ff0 aco: add type information for operands/definitions
More information available for use in the optimizer.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29695>
2025-05-15 12:17:17 +00:00
Rhys Perry
171920ceed aco/gfx115: consider point sample acceleration
Like 15428e0d786939a5c7629a9978947c8a9112ce96 in LLVM.

fossil-db (gfx1150):
Totals from 909 (1.14% of 79653) affected shaders:
Instrs: 5840489 -> 5840705 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31133460 -> 31134296 (+0.00%); split: -0.00%, +0.00%
Latency: 52982280 -> 53438577 (+0.86%); split: -0.00%, +0.86%
InvThroughput: 10841454 -> 10942682 (+0.93%); split: -0.00%, +0.93%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935>
2025-05-14 11:22:13 +00:00
Georg Lehmann
c0e88c376a aco/optimizer: validate context data
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:25 +00:00
Georg Lehmann
906b7dbcec aco: replace novalidateir with novalidate debug option
The next commits will add more validation that's enabled by default.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:25 +00:00
Rhys Perry
6338ed44c5 aco/gfx12: increase maximum vbuffer offset
fossil-db (gfx1201):
Totals from 301 (0.38% of 79377) affected shaders:
Instrs: 2734478 -> 2728816 (-0.21%); split: -0.21%, +0.00%
CodeSize: 14347476 -> 14306568 (-0.29%)
Latency: 15508055 -> 15502202 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 2846419 -> 2842387 (-0.14%); split: -0.14%, +0.00%
VClause: 68286 -> 68101 (-0.27%); split: -0.30%, +0.03%
SClause: 49487 -> 49500 (+0.03%)
Copies: 207179 -> 206093 (-0.52%); split: -0.57%, +0.04%
Branches: 72941 -> 72942 (+0.00%); split: -0.00%, +0.00%
VALU: 1549156 -> 1544727 (-0.29%); split: -0.29%, +0.00%
SALU: 339620 -> 338989 (-0.19%); split: -0.19%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34730>
2025-04-29 17:44:41 +00:00
Rhys Perry
d987d5e341 aco/gfx12: increase maximum global/scratch offset
fossil-db (gfx1201):
Totals from 29 (0.04% of 79377) affected shaders:
Instrs: 1439082 -> 1439070 (-0.00%)
CodeSize: 7641688 -> 7641608 (-0.00%)
Latency: 9836296 -> 9836280 (-0.00%)
VALU: 799504 -> 799496 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34730>
2025-04-29 17:44:41 +00:00
Rhys Perry
02d193f058 aco/gfx12: increase maximum smem offset
fossil-db (gfx1201):
Totals from 55 (0.07% of 79377) affected shaders:
Instrs: 3203775 -> 3200809 (-0.09%)
CodeSize: 16817140 -> 16813440 (-0.02%); split: -0.04%, +0.02%
Latency: 17838315 -> 17829658 (-0.05%)
InvThroughput: 3352905 -> 3351689 (-0.04%)
SClause: 57377 -> 57273 (-0.18%)
Copies: 231006 -> 230941 (-0.03%)
SALU: 436900 -> 435234 (-0.38%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34730>
2025-04-29 17:44:41 +00:00
Rhys Perry
c26851b80b aco: increase max_const_offset_plus_one for SMEM load_global
fossil-db (gfx1201):
Totals from 1115 (1.40% of 79377) affected shaders:
Instrs: 1473805 -> 1467571 (-0.42%); split: -0.43%, +0.01%
CodeSize: 7852972 -> 7819656 (-0.42%); split: -0.44%, +0.02%
SpillSGPRs: 1632 -> 1460 (-10.54%); split: -11.27%, +0.74%
Latency: 11975762 -> 11971915 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 2496961 -> 2496448 (-0.02%); split: -0.03%, +0.01%
VClause: 25213 -> 25218 (+0.02%); split: -0.00%, +0.02%
SClause: 28822 -> 28565 (-0.89%); split: -1.41%, +0.52%
Copies: 106377 -> 105715 (-0.62%); split: -1.23%, +0.61%
Branches: 27497 -> 27473 (-0.09%)
PreSGPRs: 52071 -> 51310 (-1.46%)
VALU: 871051 -> 870694 (-0.04%); split: -0.04%, +0.00%
SALU: 186090 -> 181811 (-2.30%); split: -2.32%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34730>
2025-04-29 17:44:41 +00:00
Konstantin Seurer
978e9b670e aco,nir: Add support for new GFX12 ray tracing instructions
Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can
handle the new BVH format. Both instructions write up to 10 VGPRs so
they need to use a vec16 definition in nir.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Natalie Vock
f309d76aab aco: Add support for multiple ops fixed to defs
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Natalie Vock
d1ff9e951a aco: Fix RT VGPR limit on Navi31/32, GFX11.5, GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Since 128 is not a multiple of the VGPR allocation granule, we will
actually allocate 134 VGPRs. No reason not to use the extra 6.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34265>
2025-04-09 10:02:52 +00:00
Georg Lehmann
64cae5c48d aco: form mixed MTBUF/MUBUF clauses
This should be one clause (all of the instructions load from the same vertex buffer)

s_clause 0x2                                                ; bfa10002
tbuffer_load_format_xyzw v[8:11], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:36 ; e9c32024 80010805
tbuffer_load_format_xyzw v[12:15], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:16 ; e9c32010 80010c05
tbuffer_load_format_xyzw v[16:19], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:12 ; e9c3200c 80011005
s_clause 0x2                                                ; bfa10002
buffer_load_dwordx3 v[20:22], v5, s[4:7], 0 idxen           ; e03c2000 80011405
buffer_load_dwordx3 v[23:25], v5, s[4:7], 0 idxen offset:20 ; e03c2014 80011705
buffer_load_dwordx4 v[28:31], v5, s[4:7], 0 idxen offset:48 ; e0382030 80011c05
tbuffer_load_format_xy v[0:1], v5, s[4:7], 0 format:[BUF_FMT_8_8_UNORM] idxen offset:32 ; e8712020 80010005

Foz-DB Navi21:
Totals from 5624 (7.08% of 79395) affected shaders:
MaxWaves: 149894 -> 149898 (+0.00%)
Instrs: 3032697 -> 3034853 (+0.07%); split: -0.05%, +0.12%
CodeSize: 15907852 -> 15915752 (+0.05%); split: -0.05%, +0.10%
VGPRs: 216248 -> 216144 (-0.05%)
Latency: 10955137 -> 11008760 (+0.49%); split: -0.22%, +0.70%
InvThroughput: 2032857 -> 2033916 (+0.05%); split: -0.03%, +0.08%
VClause: 50120 -> 41778 (-16.64%); split: -16.66%, +0.02%
SClause: 62034 -> 62004 (-0.05%); split: -0.33%, +0.29%
Copies: 253836 -> 254505 (+0.26%); split: -0.17%, +0.43%
VALU: 1621606 -> 1622274 (+0.04%); split: -0.03%, +0.07%
SALU: 653251 -> 653252 (+0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
Daniel Schürmann
c1b124ab6c aco/lower_branches: properly consider exec mask needs of branch targets
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33619>
2025-03-26 08:45:11 +00:00
Daniel Schürmann
eecdb45d61 aco: consider s_cbranch_exec* instructions in needs_exec_mask()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32477>
2025-01-23 00:11:06 +00:00
Rhys Perry
5375d77488 aco: wait for scratch stores to complete before dealloc_vgprs
fossil-db (navi31):
Totals from 392 (0.49% of 79395) affected shaders:
Instrs: 5052043 -> 5054100 (+0.04%)
CodeSize: 26701200 -> 26709428 (+0.03%)
Latency: 43614861 -> 43615368 (+0.00%)
InvThroughput: 7353147 -> 7353216 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:05 +00:00
Rhys Perry
295b7d606f aco: insert NOP before dealloc_vgpr in the insert_NOPs pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:05 +00:00
Rhys Perry
0ad713ca9f aco: add waitcnt build helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:04 +00:00
Daniel Schürmann
e2705a9d85 aco: set Precolored flag before register allocation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387>
2024-10-22 12:29:18 +00:00
Georg Lehmann
977f435f4c aco/ir: add function to parse depctr waits
No Foz-DB changes on Navi31.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
27cf11dc8a aco: remove per block inf/nan/sz control
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31172>
2024-09-18 20:46:17 +00:00
Georg Lehmann
9bb10b58f3 aco: use v_cvt_pk_u8_f32 for f2u8
Foz-DB Navi31:
Totals from 42 (0.05% of 79395) affected shaders:
Instrs: 3253747 -> 3248867 (-0.15%); split: -0.15%, +0.00%
CodeSize: 16690136 -> 16661772 (-0.17%); split: -0.17%, +0.00%
VGPRs: 4176 -> 4128 (-1.15%)
Latency: 18485157 -> 18479752 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 3659404 -> 3658222 (-0.03%); split: -0.03%, +0.00%
Copies: 231891 -> 228145 (-1.62%)
VALU: 1785800 -> 1782054 (-0.21%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29532>
2024-09-02 19:49:20 +00:00
Rhys Perry
7b92e11e16 aco: forget valu delays after certain s_waitcnt_depctr/LDSDIR
fossil-db (navi31):
Totals from 55242 (69.58% of 79395) affected shaders:
Instrs: 40507666 -> 40138006 (-0.91%); split: -0.91%, +0.00%
CodeSize: 212516104 -> 211025880 (-0.70%); split: -0.70%, +0.00%
Latency: 281643258 -> 281628053 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 46370668 -> 46369637 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:01 +00:00
Daniel Schürmann
b0c8c5e42e aco: implement aco::validate_live_vars()
This is intended for passes which manually update live variables
and RegisterDemand, like e.g. the scheduler, and can be enabled
with ACO_DEBUG=validate-livevars.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30182>
2024-08-14 08:11:48 +00:00
Georg Lehmann
62fa5b9d6f aco/gfx11+: apply neg to vinterp
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30350>
2024-07-30 15:25:19 +00:00
Rhys Perry
08a4853ffd aco: add export instructions to should_form_clause
fossil-db (gfx1150):
Totals from 1 (0.00% of 79395) affected shaders:
Instrs: 906 -> 901 (-0.55%)
CodeSize: 4692 -> 4672 (-0.43%)
Latency: 1582 -> 1561 (-1.33%)

fossil-db (navi21):
Totals from 9917 (12.49% of 79395) affected shaders:
Instrs: 6120530 -> 6116252 (-0.07%)
CodeSize: 31808916 -> 31791804 (-0.05%)
Latency: 23854892 -> 23866438 (+0.05%); split: -0.01%, +0.06%
InvThroughput: 4655278 -> 4655490 (+0.00%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30241>
2024-07-23 13:14:52 +00:00
Rhys Perry
3b732340ec aco/gfx11.5: skip dealloc_vgprs for stages with exports
fossil-db (gfx1150):
Totals from 72997 (91.94% of 79395) affected shaders:
Instrs: 35047397 -> 34679136 (-1.05%)
CodeSize: 182414292 -> 180934440 (-0.81%)
Latency: 225452700 -> 225045411 (-0.18%)
InvThroughput: 40416910 -> 40395608 (-0.05%); split: -0.06%, +0.00%
Branches: 604968 -> 601660 (-0.55%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30241>
2024-07-23 13:14:52 +00:00
Georg Lehmann
6affd916b5 aco/gfx11.5: fix s_fmac acc to definition
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245>
2024-07-18 08:36:14 +00:00
Georg Lehmann
4399c7bac3 aco: add aco_opcode::p_s_cvt_f16_f32_rtne
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245>
2024-07-18 08:36:14 +00:00
Georg Lehmann
d58d0274a8 aco/gfx12: use trans s_delay_alu for pseudo scalar
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245>
2024-07-18 08:36:14 +00:00
Georg Lehmann
7fc8ad2ddd aco/ir: remove unused vopc helpers
And rename get_swapped and get_inverse to show that they should only be used for VOPC.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>
2024-06-27 08:12:30 +00:00
Rhys Perry
17f2ebe8d2 aco: use 1.5x vgprs for gfx1151 and gfx12
From LLVM.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29839>
2024-06-24 18:39:40 +00:00
Georg Lehmann
c3c398d56d aco: make local functions static in files without anonymous namespace
I don't think adding an anonymous namespace in these files is worth it
given the amount of global functions

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Friedrich Vock
15f2c9c553 aco: Limit rt stages to 128 vgprs
Totals from 35472 (7.40% of 479373) affected shaders:

MaxWaves: 206239 -> 283776 (+37.60%)
Instrs: 193922210 -> 202721106 (+4.54%)
CodeSize: 1056819972 -> 1110833680 (+5.11%); split: -0.00%, +5.11%
VGPRs: 6026704 -> 4540416 (-24.66%)
SpillSGPRs: 23742 -> 25754 (+8.47%)
SpillVGPRs: 118897 -> 2295118 (+1830.34%)
Scratch: 7201792 -> 152752128 (+2021.03%)
Latency: 2713432565 -> 3194796286 (+17.74%); split: -0.20%, +17.94%
InvThroughput: 1052131232 -> 935049835 (-11.13%); split: -16.59%, +5.46%
VClause: 6972784 -> 8716721 (+25.01%); split: -0.02%, +25.03%
SClause: 4879313 -> 4852452 (-0.55%); split: -0.88%, +0.33%
Copies: 32782141 -> 35223995 (+7.45%)
Branches: 11075847 -> 11094087 (+0.16%); split: -0.00%, +0.17%
VALU: 118525960 -> 120929058 (+2.03%)
SALU: 33924572 -> 33973293 (+0.14%); split: -0.03%, +0.17%
VMEM: 12419116 -> 17104582 (+37.73%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Rhys Perry
2a4424425a aco/gfx12: fix s_wait_event immediate
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00