Commit graph

3833 commits

Author SHA1 Message Date
Natalie Vock
df5495b934 aco/assembler: Support vector-aligned operands on DS instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00
Natalie Vock
ea66a8d1c5 aco,nir: Add support for GFX12 ds_bvh_stack_push8_pop1_rtn_b32 instruction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00
Natalie Vock
9707b30965 nir,aco: Add ds_bvh_stack_rtn
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:39 +00:00
Natalie Vock
c515f1fd58 aco: Use vector-aligned operands for image_bvh8_intersect_ray
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:38 +00:00
Natalie Vock
c279dd6e61 aco: Support vector-aligned ops fixed to defs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:38 +00:00
Natalie Vock
f17fe05e32 aco/isel: Improve vector splits for image_bvh8_intersect_ray
Using split_vector to split everything into scalars allows copy-prop to
eliminate the final p_create_vector. Considerably reduces copies and
register thrashing.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:38 +00:00
Marek Olšák
d12bc87dda aco: implement upcasting 16-bit types for 32-bit color buffers in PS epilog
This was missed when implementing the change for LLVM.

Fixes: fbbf029529 - radeonsi: enable 16-bit mediump IO for PS outputs only, and VS->PS with env var

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36112>
2025-07-15 18:28:30 +00:00
Daniel Schürmann
47ef60cbf1 aco/ra: always use bytes for register stride requirements
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
instead of a mixture between dwords and bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36053>
2025-07-14 08:45:29 +00:00
Marek Olšák
5ded4f3c7d aco: remove unused aco_symbol_lds_ngg_gs_out_vertex_base
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Georg Lehmann
92d433c54a aco: vectorize conversions from 8bit to 16bit
Massively helps emulated fp8 performance.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:15 +00:00
Georg Lehmann
7fece5592c aco: vectorize 16bit extracts
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:14 +00:00
Rhys Perry
3b9a1ce4ca aco: remove RegClass::as_subdword
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:09 +00:00
Rhys Perry
9c55b0ca20 aco: use MUBUF for global access with SGPR address on GFX7/8
This should be better than using FLAT, which only supports a VGPR address.

fossil-db (polaris10):
Totals from 159 (0.26% of 62070) affected shaders:
MaxWaves: 789 -> 803 (+1.77%)
Instrs: 234284 -> 230557 (-1.59%); split: -1.71%, +0.12%
CodeSize: 1212324 -> 1186716 (-2.11%); split: -2.23%, +0.11%
SGPRs: 10504 -> 10712 (+1.98%)
VGPRs: 10556 -> 10236 (-3.03%); split: -3.37%, +0.34%
SpillSGPRs: 579 -> 577 (-0.35%)
Latency: 3903056 -> 3875625 (-0.70%); split: -0.87%, +0.16%
InvThroughput: 3139443 -> 3114426 (-0.80%); split: -0.86%, +0.07%
VClause: 4205 -> 4433 (+5.42%); split: -0.43%, +5.85%
SClause: 4461 -> 4445 (-0.36%); split: -0.43%, +0.07%
Copies: 30889 -> 31507 (+2.00%); split: -0.29%, +2.29%
PreSGPRs: 7370 -> 7609 (+3.24%)
PreVGPRs: 8339 -> 8193 (-1.75%)
VALU: 175025 -> 170232 (-2.74%); split: -2.77%, +0.03%
SALU: 27269 -> 28532 (+4.63%); split: -0.01%, +4.64%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:08 +00:00
Rhys Perry
0094e6c32a aco: optimize lds-only or vmem-only flat access
fossil-db (polaris10):
Totals from 138 (0.22% of 62070) affected shaders:
Instrs: 233452 -> 234436 (+0.42%)
CodeSize: 1209392 -> 1213220 (+0.32%)
Latency: 3934496 -> 3928089 (-0.16%); split: -0.17%, +0.00%
InvThroughput: 3040782 -> 3038562 (-0.07%); split: -0.07%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:08 +00:00
Rhys Perry
d705b6198c aco: simplify waitcnt insertion for flat access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:08 +00:00
Rhys Perry
6396a82695 aco: return a format in lower_global_address
No fossil-db changes (navi10, pitcairn).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:07 +00:00
Rhys Perry
89c2c94147 aco: increase global constant offset limit slightly
Before, this wasn't actually the maximum value plus one.

fossil-db (navi10):
Totals from 4 (0.01% of 63207) affected shaders:
(no stat changes)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:07 +00:00
Rhys Perry
d7dcd81c77 aco/gfx6: allow both constant and gpr offset for global with sgpr address
fossil-db (pitcairn):
Totals from 81 (0.13% of 62069) affected shaders:
MaxWaves: 332 -> 335 (+0.90%)
Instrs: 150087 -> 149737 (-0.23%); split: -0.30%, +0.06%
CodeSize: 754636 -> 752712 (-0.25%); split: -0.31%, +0.06%
SGPRs: 6128 -> 6184 (+0.91%)
VGPRs: 7220 -> 7208 (-0.17%); split: -0.28%, +0.11%
SpillSGPRs: 288 -> 287 (-0.35%)
Latency: 2199197 -> 2198338 (-0.04%); split: -0.20%, +0.17%
InvThroughput: 1613474 -> 1614303 (+0.05%); split: -0.07%, +0.12%
VClause: 2905 -> 2862 (-1.48%); split: -2.34%, +0.86%
SClause: 2366 -> 2378 (+0.51%); split: -0.17%, +0.68%
Copies: 17312 -> 17264 (-0.28%); split: -1.03%, +0.76%
PreSGPRs: 5080 -> 5004 (-1.50%)
PreVGPRs: 5656 -> 5640 (-0.28%)
VALU: 114097 -> 113831 (-0.23%); split: -0.31%, +0.07%
SALU: 16004 -> 15944 (-0.37%); split: -0.41%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:06 +00:00
Rhys Perry
684943bd1f aco/gfx6: allow vgpr offset for global access with sgpr address
No reason why we can't use offen like normal buffer loads.

fossil-db (pitcairn):
Totals from 122 (0.20% of 62069) affected shaders:
MaxWaves: 521 -> 525 (+0.77%)
Instrs: 238341 -> 237228 (-0.47%); split: -0.57%, +0.10%
CodeSize: 1196260 -> 1188076 (-0.68%); split: -0.78%, +0.09%
SGPRs: 8752 -> 8760 (+0.09%); split: -0.64%, +0.73%
VGPRs: 10456 -> 10440 (-0.15%); split: -0.88%, +0.73%
Latency: 3958385 -> 3946186 (-0.31%); split: -0.38%, +0.07%
InvThroughput: 3097193 -> 3084417 (-0.41%); split: -0.42%, +0.01%
VClause: 4058 -> 4500 (+10.89%); split: -0.02%, +10.92%
SClause: 4511 -> 4500 (-0.24%); split: -0.42%, +0.18%
Copies: 31228 -> 31718 (+1.57%); split: -0.38%, +1.95%
PreSGPRs: 7211 -> 7461 (+3.47%)
PreVGPRs: 8174 -> 8147 (-0.33%); split: -0.34%, +0.01%
VALU: 174779 -> 173294 (-0.85%); split: -0.87%, +0.02%
SALU: 29138 -> 29641 (+1.73%); split: -0.09%, +1.82%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:05 +00:00
Rhys Perry
09a5af121f aco: simplify the load callback
We can put these parameters in the LoadEmitInfo instead.

No fossil-db changes (navi10, pitcairn).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:04 +00:00
Rhys Perry
101d0b791f aco: add too-large constant offset to the address instead of the offset
In case the addition with the offset overflows.

No fossil-db changes (navi10, pitcairn).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:03 +00:00
Rhys Perry
bd9a9a77fe aco: use addition helper in emit_load
No fossil-db changes (navi10, pitcairn).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:03 +00:00
Rhys Perry
8defd1bc16 aco/gfx6: disallow global access with sgpr address and two offsets
No fossil-db changes (navi10, pitcairn).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:03 +00:00
Rhys Perry
1c0d065dae aco: count flat as vmem in statistics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:03 +00:00
Georg Lehmann
d45f375a9d aco: only insert fp mode when needed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35746>
2025-07-10 13:48:50 +00:00
Georg Lehmann
46c1bd1147 aco: add a dedicated pass for better float MODE insertion
Foz-DB Navi48:
Totals from 14 (0.02% of 80251) affected shaders:
Instrs: 13998 -> 11684 (-16.53%)
CodeSize: 104464 -> 86260 (-17.43%)
Latency: 108722 -> 106667 (-1.89%)
InvThroughput: 100332 -> 100324 (-0.01%)
VClause: 621 -> 595 (-4.19%); split: -4.99%, +0.81%
VALU: 6875 -> 6871 (-0.06%)
SALU: 3256 -> 1015 (-68.83%)
VOPD: 1328 -> 1332 (+0.30%)

Removes the s_setreg spam in FSR4.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35746>
2025-07-10 13:48:50 +00:00
Daniel Schürmann
610a19cf31 aco/isel: allow to select SGPR defs for vectorized bcsel and logical operations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:37 +00:00
Daniel Schürmann
d7477111d2 aco: split vectorized bcsel and bitwise logic VGPR definitions
This has a slightly negative effect on parallel-rdp, but positively affects FSR4.

Totals from 14 (0.02% of 79839) affected shaders: (Navi48)

Instrs: 63543 -> 63646 (+0.16%); split: -0.01%, +0.17%
CodeSize: 352888 -> 353608 (+0.20%); split: -0.02%, +0.23%
Latency: 1822354 -> 1825036 (+0.15%)
InvThroughput: 364683 -> 365738 (+0.29%); split: -0.04%, +0.32%
Copies: 9299 -> 9363 (+0.69%); split: -0.11%, +0.80%
PreVGPRs: 1381 -> 1394 (+0.94%)
VALU: 34511 -> 34575 (+0.19%); split: -0.03%, +0.21%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Daniel Schürmann
764ee3a834 radv: don't lower subdword phis to scalar
Totals from 193 (0.24% of 79839) affected shaders: (Navi48)

MaxWaves: 6004 -> 6024 (+0.33%)
Instrs: 169276 -> 166784 (-1.47%); split: -3.01%, +1.53%
CodeSize: 940608 -> 915768 (-2.64%); split: -4.29%, +1.64%
VGPRs: 8012 -> 7716 (-3.69%); split: -3.99%, +0.30%
SpillVGPRs: 185 -> 0 (-inf%)
Scratch: 13568 -> 0 (-inf%)
Latency: 2159787 -> 2147084 (-0.59%); split: -2.86%, +2.28%
InvThroughput: 664022 -> 395859 (-40.38%); split: -42.59%, +2.21%
VClause: 2998 -> 2880 (-3.94%); split: -4.27%, +0.33%
SClause: 3117 -> 3120 (+0.10%)
Copies: 21290 -> 16278 (-23.54%); split: -24.74%, +1.20%
Branches: 4757 -> 4760 (+0.06%); split: -0.34%, +0.40%
PreSGPRs: 7369 -> 7378 (+0.12%); split: -0.11%, +0.23%
PreVGPRs: 4257 -> 3859 (-9.35%); split: -9.94%, +0.59%
VALU: 83173 -> 79804 (-4.05%); split: -5.68%, +1.63%
SALU: 36672 -> 37318 (+1.76%); split: -0.02%, +1.78%
VMEM: 4012 -> 3762 (-6.23%); split: -6.83%, +0.60%
SMEM: 4300 -> 4303 (+0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Daniel Schürmann
fc2fcac04e aco: allow vectorized nir_op_mov
nir_lower_phis_to_scalar() can create these with the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Daniel Schürmann
3f35b1329e aco: allow subdword vector-definitions on some VALU instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Daniel Schürmann
025306a95d aco/isel: refactor emission of bitwise logical operations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Georg Lehmann
9e8ba10447 aco/vn: remove dead instructions early
Dead p_create_vector/p_split_vector left behind by instruction selection slow down
the other passes and negatively affect extract labels in aco_optimizer.

Foz-DB GFX1201:
Totals from 964 (1.20% of 80251) affected shaders:
MaxWaves: 29206 -> 29030 (-0.60%); split: +0.08%, -0.68%
Instrs: 669369 -> 668842 (-0.08%); split: -0.16%, +0.09%
CodeSize: 3385192 -> 3383216 (-0.06%); split: -0.13%, +0.07%
VGPRs: 46788 -> 46848 (+0.13%); split: -0.85%, +0.97%
Latency: 3985660 -> 3892742 (-2.33%); split: -2.54%, +0.21%
InvThroughput: 538296 -> 536761 (-0.29%); split: -0.38%, +0.10%
VClause: 8336 -> 8418 (+0.98%); split: -0.17%, +1.15%
SClause: 17111 -> 17120 (+0.05%); split: -0.20%, +0.25%
Copies: 44393 -> 44239 (-0.35%); split: -1.25%, +0.91%
PreSGPRs: 45417 -> 45419 (+0.00%)
PreVGPRs: 30401 -> 31644 (+4.09%); split: -0.00%, +4.09%
VALU: 348282 -> 348167 (-0.03%); split: -0.15%, +0.12%
SALU: 121454 -> 121410 (-0.04%); split: -0.04%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35825>
2025-07-09 07:23:09 +00:00
Georg Lehmann
82af226690 aco: remove unused swap_srcs from emit_vop3p_instruction
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35825>
2025-07-09 07:23:09 +00:00
Georg Lehmann
96793fb0c1 aco/isel: implement 16bit vec2 shifts
The source bit size mismatch is a bit annoying, but it's still worth it to
vectorize these.

Foz-DB Navi48:
Totals from 85 (0.11% of 80251) affected shaders:
Instrs: 119073 -> 118827 (-0.21%); split: -0.21%, +0.00%
CodeSize: 669604 -> 667552 (-0.31%); split: -0.31%, +0.00%
VGPRs: 4796 -> 4736 (-1.25%)
Latency: 1907685 -> 1901983 (-0.30%); split: -0.32%, +0.02%
InvThroughput: 642603 -> 640680 (-0.30%); split: -0.33%, +0.03%
VClause: 2088 -> 2091 (+0.14%)
Copies: 18300 -> 18394 (+0.51%); split: -0.01%, +0.52%
Branches: 3452 -> 3440 (-0.35%)
VALU: 63378 -> 63144 (-0.37%); split: -0.37%, +0.00%
SALU: 23065 -> 23076 (+0.05%); split: -0.00%, +0.05%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35825>
2025-07-09 07:23:08 +00:00
Daniel Schürmann
2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Rhys Perry
34f1a8f707 aco: handle FPAtomicToDenormModeHazard
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is quite unlikely to happen, but I guess it might be possible and
it's relatively simple to work around.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35884>
2025-07-07 13:02:43 +00:00
Marek Olšák
4263b49778 ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass
This is a cleanup.

Old gs LDS layout: [es outputs][gs outputs][scratch]
Old nogs LDS layout: [xfb/cull][scratch]

New gs LDS layout: [es outputs][scratch|gs outputs]
New nogs LDS layout: [scratch|xfb/cull]

The LDS scratch is moved to the beginning of the preceding buffer in LDS,
while the addresses in that LDS buffer are offset by the scratch size.
It effectively merges the LDS scratch with the preceding buffer in LDS.
Thanks to that, we no longer need the ngg_scratch ABI and the offset
in a user SGPR.

The lowering passes now return the LDS scratch size, which is used
by the drivers to determine the final LDS size.

The ngg_lds_layout SGPR is now unused without GS in RADV.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
2025-07-02 20:27:41 +00:00
Rhys Perry
dce1d4ad4c aco/ra: fix repeated compact_linear_vgprs() in get_reg()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b7738de4f9 ("aco/ra: rework linear VGPR allocation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13431
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35838>
2025-07-02 09:26:04 +00:00
Rhys Perry
21c4400278 aco: update ctx.block when inserting discard block
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13432
Backport-to: 25.1
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35833>
2025-07-01 14:31:11 +00:00
Alyssa Rosenzweig
67237b6f1b treewide: use nir_break_if
Via Coccinelle patch:

    @@
    expression builder, condition;
    @@

    -nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, NULL);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:24 -04:00
Natalie Vock
af86cc37d5 aco/spill: Don't spill scratch_rsrc-related temps
These temps are used to create the scratch_rsrc. Spilling them will
never benefit anything, because assign_spill_slots will insert code
that keeps them live. Since the spiller assumes all spilled variables
to be dead, this can cause more variables being live than intended and
spilling to fail.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
acf29e403a aco/spill: Add a null scratch offset if no scratch_offset arg exists
Function callees' scratch_rsrc comes with the scratch offset
pre-applied.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
630913e1b4 aco: Introduce static_scratch_rsrc program member
Function callees get their scratch resource as a parameter instead of
generating it on-the-fly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
e006f68b11 aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
a5eba11657 aco/isel: Use stack pointer parameter in load/store_scratch
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:53 +00:00
Natalie Vock
4a62b342f3 aco: Add common utility to load scratch descriptor
Also modifies the scratch descriptor to take the stack pointer into
account.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:52 +00:00
Natalie Vock
cd2caa5e2b aco/spill: Use scratch stack pointer
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:52 +00:00
Natalie Vock
22624d6f12 aco: Add scratch stack pointer
Function callees shouldn't overwrite caller's stacks.
Track where to write scratch data with a stack pointer.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:52 +00:00
Natalie Vock
be89c02be5 aco: Add pseudo instr to calculate a function callee's stack pointer
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>
2025-06-26 11:02:52 +00:00