fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 20:08:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	51ba9b956a	aco: use no contract/reassoc instead of exact Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40872>	2026-04-12 17:10:29 +00:00
Georg Lehmann	b0194d0416	aco/tests: fix med3 NaN tests Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40872>	2026-04-12 17:10:28 +00:00
Rhys Perry	1619288a19	aco: ignore copykill+latekill operands in get_temp_reg_changes This is possible with two vectors which share a temporary, though I don't think it currently happens in practice. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40825>	2026-04-10 10:34:45 +00:00
Daniel Schürmann	8cb8c710fb	aco: remove remaining occurences of block_kind_continue It has no purpose anymore. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	74661ccec2	aco/lower_branches: remove handling of block_kind_continue Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	7f0709cff5	aco/opt_value_numbering: remove handling of block_kind_continue Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	a8c4b9f100	aco/lower_phis: remove handling of block_kind_continue Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	16396f2ce6	aco/insert_exec_mask: remove handling of loop continues Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	495c7271a3	aco/isel: remove handling of nir_jump_continue Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	5e89be331f	aco/lower_branches: Fix try_rotate_latch_block() Found by inspection. Fixes: `97f095f6e0` ('aco/lower_branches: Add try_rotate_latch_block() optimization') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Daniel Schürmann	60b3e5b3f0	aco/lower_branches: Don't remove branches which jump over loops Entering a loop with empty exec mask might lead to not be able to execute the break condition and lead to infinite loops. Totals from 81 (0.04% of 202440) affected shaders: (Navi48) Instrs: 3040566 -> 3040716 (+0.00%) CodeSize: 17506768 -> 17507188 (+0.00%) Latency: 16342966 -> 16345166 (+0.01%) InvThroughput: 3112932 -> 3113286 (+0.01%) Branches: 82229 -> 82365 (+0.17%) Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40628>	2026-04-10 08:51:39 +00:00
Olle Lögdahl	c69da756d1	aco/isel: added test-case for iterative cf visitor isel.cf.deep_traversal is a new ACO test that verifies that the iterative nir cf visitor allows arbitrary depth. A depth of 10000 would cause a stack overflow on x86-64 linux (4096 kB stack) for the old recursive code. This test is by default not enabled. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>	2026-04-09 13:46:23 +00:00
Olle Lögdahl	aa49d69ea0	aco/isel: use iterative visitor during traversal When iterating control-flow recursively, we always run the risk of causing a stack overflow if the control-flow depth is too large. This patch resolves this by visiting control-flow nodes in an iterative way, managing an explicit stack on the heap. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>	2026-04-09 13:46:23 +00:00
Daniel Schürmann	37e2deab74	aco/isel: Remove if_context* parameter from begin_if() / end_if() helper functions We can transparently create the context inside the functions, now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>	2026-04-09 13:46:23 +00:00
Daniel Schürmann	53836320a9	aco/isel: Remove loop_context* parameter from begin_loop() / end_loop() helper functions We can transparently create the context inside the functions, now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>	2026-04-09 13:46:22 +00:00
Olle Lögdahl	5c1dea7ee4	aco/isel: move if_context and loop_context to heap if_context and loop_context are large structs and may cause stack overflows during CF traversal. This fix moves them to the heap. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>	2026-04-09 13:46:22 +00:00
Georg Lehmann	44a061a034	aco/spill: fix mixed lds+scratch spill/reload We shouldn't increment the scratch offset while accessing LDS. Fixes: `133ef9f94b` ("aco: spill VGPRs to LDS if it doesn't further limit occupancy") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40855>	2026-04-09 07:51:52 +00:00
Rhys Perry	463e3643f2	nir: add and use block predecessor helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40242>	2026-04-08 15:06:32 +00:00
Georg Lehmann	d1ed4e1774	aco/optimizer: do not try to create 3 byte constant operands Operand::get_const will assert. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15239 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40828>	2026-04-08 09:17:26 +00:00
Georg Lehmann	792ce7ddf6	aco/isel: optimize 16/64bit non constant valu bit test By using the constant path we can combine the v_and and the v_cmp. Foz-DB GFX1201: Totals from 2 (0.00% of 205032) affected shaders: Instrs: 2833 -> 2831 (-0.07%) Latency: 27385 -> 27367 (-0.07%) InvThroughput: 1712 -> 1710 (-0.12%) VALU: 1301 -> 1299 (-0.15%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40705>	2026-04-08 08:44:20 +00:00
Natalie Vock	fded5e321d	aco: Nuke ACO-side prolog selection Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	b53dc3f052	aco/lower_to_hw_instr: Run p_init_scratch if the program has a call Callees may use scratch even if the caller doesn't. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	378c9536de	aco/isel: Fix stack_ptr synthesis info.stack_ptr.is_reg is always true. We have a stack pointer to use if and only if the program is a callee. Also, apply_scratch_offset needs to be true in a few more places. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	31e08322d7	aco/spill_preserved: Only compute preserved registers if in a callee Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Georg Lehmann	5453419086	aco/isel: use s_bitcmp1 for 1bit ubfe Avoid the s_pack at the cost of having to use scc. Foz-DB GFX1201: Totals from 1514 (0.74% of 205032) affected shaders: Instrs: 3443431 -> 3434096 (-0.27%); split: -0.27%, +0.00% CodeSize: 19062100 -> 19024320 (-0.20%); split: -0.20%, +0.00% Latency: 22343329 -> 22342802 (-0.00%); split: -0.01%, +0.01% InvThroughput: 4471707 -> 4471632 (-0.00%); split: -0.00%, +0.00% Copies: 280191 -> 279645 (-0.19%); split: -0.21%, +0.01% PreSGPRs: 71333 -> 71327 (-0.01%) VALU: 1598064 -> 1598058 (-0.00%); split: -0.00%, +0.00% SALU: 691458 -> 686437 (-0.73%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40707>	2026-03-31 10:42:33 +00:00
Karol Herbst	5bb3c9f69c	nir: rename fsin_amd and fcos_amd to a more generic name Nvidia implements both the same way as AMD does, so it makes sense to allow for code sharing here. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541>	2026-03-31 01:47:29 +02:00
Georg Lehmann	ae2968c4ec	aco: allow spilling to LDS in RT shaders without stack pointer No Foz-DB changes because most RT shaders use function calls now. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>	2026-03-27 13:08:44 +00:00
Georg Lehmann	133ef9f94b	aco: spill VGPRs to LDS if it doesn't further limit occupancy Only use LDS for VGPR spilling if we can use addtid access, to avoid having a VGPR addr. Limit to single wave workgroups, to avoid needing the wave_id for the offset. If we have a scratch stack pointer, don't use LDS at all. Limit LDS spilling to not reduce occupancy further. Note that in theory, this can still limit occupancy of other shaders running on the CU at the same time, but that's unlikely and impossible to know at this point. Removes all scratch usage in emulated FSR4 and parallel_rdp. Besides that, only a single GoW shader is affected. Foz-DB Navi31: Totals from 9 (0.01% of 114641) affected shaders: Instrs: 68863 -> 68830 (-0.05%); split: -0.07%, +0.02% CodeSize: 416108 -> 416000 (-0.03%); split: -0.05%, +0.02% LDS: 2048 -> 45056 (+2100.00%) Scratch: 261888 -> 220672 (-15.74%) Latency: 727951 -> 657155 (-9.73%); split: -9.73%, +0.00% InvThroughput: 418644 -> 383269 (-8.45%) VClause: 1506 -> 1200 (-20.32%) Copies: 10651 -> 10624 (-0.25%) VALU: 48700 -> 48684 (-0.03%) SALU: 6200 -> 6199 (-0.02%); split: -0.05%, +0.03% VMEM: 4139 -> 3589 (-13.29%) VOPD: 580 -> 574 (-1.03%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>	2026-03-27 13:08:44 +00:00
Georg Lehmann	17a9ee7152	aco/optimizer: apply dpp to v_dot before RA for gfx10.3 This is a bit unusual, as we otherwise only use the VOP2 codesize optimization opcodes in the register allocator. But unless we change the scheduler to not split v_mov_b32_dpp and v_dot, we have no other choice. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40510>	2026-03-24 09:05:40 +00:00
Emre Cecanpunar	c60e5df798	aco: drop optimizer peephole TODO comment The remaining items are either handled elsewhere or unlikely to be implemented in the optimizer. Signed-off-by: Emre Cecanpunar <emreleno@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40497>	2026-03-23 11:03:59 +00:00
Georg Lehmann	559a35dcb3	aco: skip fract for sin/cos on gfx6-8 if the src is already in range Foz-DB Polaris10: Totals from 1301 (1.86% of 69950) affected shaders: Instrs: 1447217 -> 1445610 (-0.11%); split: -0.11%, +0.00% CodeSize: 7775988 -> 7769588 (-0.08%); split: -0.08%, +0.00% SGPRs: 101712 -> 101776 (+0.06%) SpillSGPRs: 931 -> 927 (-0.43%) Latency: 16119433 -> 16115293 (-0.03%); split: -0.03%, +0.01% InvThroughput: 9605952 -> 9577042 (-0.30%); split: -0.31%, +0.01% VClause: 24591 -> 24593 (+0.01%); split: -0.01%, +0.02% SClause: 29656 -> 29655 (-0.00%) Copies: 133968 -> 134001 (+0.02%); split: -0.01%, +0.03% VALU: 1157855 -> 1156235 (-0.14%) SALU: 124626 -> 124639 (+0.01%); split: -0.00%, +0.01% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40545>	2026-03-23 09:27:32 +00:00
Marek Olšák	2283244975	nir: change export_amd intrinsics to use target instead of base Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	b75a3112fd	nir: change export_amd intrinsics to use enabled_channels instead of write_mask Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Daniel Schürmann	4b238690cb	aco/tests: add and lower loop continue constructs in all tests which use continues We are going to disallow continue statements without loop continue constructs. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Rhys Perry	e2ebcba11b	aco/tests: fix assembler/isel tests with LLVM 23 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 26.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513>	2026-03-20 10:24:06 +00:00
Rhys Perry	0826685f1b	aco/tests: fix assembler tests with LLVM 22 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 26.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513>	2026-03-20 10:24:06 +00:00
Samuel Pitoiset	639207701d	aco,radv,radeonsi: remove debug report support in ACO This doesn't seem very useful since ACO will abort and print the error messages to stderr. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40379>	2026-03-16 11:55:45 +00:00
Georg Lehmann	d7348ea501	aco/ra: don't tie definition when the operand is in a preserved reg Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:56 +00:00
Georg Lehmann	444eb3dce5	aco/ra: try to allocate registers for dot2 to allow VOPD Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:56 +00:00
Georg Lehmann	788aafba2a	aco/sched_vopd: create dot2acc from VOP3P dot2 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:56 +00:00
Georg Lehmann	47599b2c38	aco/opt_postRA: remove try_convert_fma_to_vop2 This is now done directly in the VOPD scheduler. Foz-DB GFX1201: Totals from 600 (0.52% of 114655) affected shaders: no stats changed Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:56 +00:00
Georg Lehmann	6cef434478	aco/sched_vopd: convert fma with inline constants to fmamk/fmaak This optimization was previously done in the post-RA optimizer, but it is more fitting for the vopd scheduler. Doing it here also has the benefit that we don't unnecessarily use the constant bus when VOPD can't be used. No Foz-DB changes on GFX12 until the next commit. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:56 +00:00
Georg Lehmann	1ae9931145	aco/scheld_vopd: make VOPDInfo more flexible by adding a swizzle No Foz-DB changes on GFX1201. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>	2026-03-10 14:21:55 +00:00
Georg Lehmann	08cac48170	aco/isel: skip min/max for SALU fsat if possible Foz-DB Navi48: Totals from 789 (0.95% of 82636) affected shaders: Instrs: 4144156 -> 4141345 (-0.07%); split: -0.07%, +0.00% CodeSize: 23345212 -> 23333960 (-0.05%); split: -0.05%, +0.00% Latency: 22988205 -> 22986666 (-0.01%); split: -0.01%, +0.00% InvThroughput: 4378321 -> 4377874 (-0.01%); split: -0.01%, +0.00% Copies: 302311 -> 302313 (+0.00%); split: -0.00%, +0.00% SALU: 647622 -> 645901 (-0.27%); split: -0.27%, +0.00% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Rhys Perry	82420ebc2c	aco: fix PS epilog dual-source blending with only one color output Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>	2026-03-05 09:38:23 +00:00
Marek Olšák	fae7aef5ca	ac: tidy up ac_hw_cache_flags Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>	2026-03-04 21:14:56 +00:00
Rhys Perry	5c3b5688a1	amd: rename ac_cu_info to ac_compiler_info Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>	2026-03-03 08:50:12 +00:00
Rhys Perry	8801ca188d	ac/nir: don't pass radeon_info to ac_nir_set_options Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>	2026-03-03 08:50:10 +00:00
Rhys Perry	17b18496f6	aco: perform dce for blocks skipped for process_block() We might need to DCE users of dead instructions removed by process_block(). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `9e8ba10447` ("aco/vn: remove dead instructions early") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40091>	2026-03-02 13:38:16 +00:00
Marek Olšák	f22f117d1a	amd: add meson variable idep_amd_generated_headers for all generated headers group all generated header under the same variable Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40084>	2026-02-28 05:23:59 +00:00

4348 commits