fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 15:58:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	f047a67fba	nir,aco: optimize FP16_OVFL pattern created by vkd3d-proton Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Georg Lehmann	9e6adcbca0	aco: select fp32 to float8 conversions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:26 +00:00
Georg Lehmann	3a45802514	aco/lower_to_hw: support saturating fp8 conversions Sadly AMD only made this behavior controllable with global state. We add a new pseudo opcode for this purpose and change FP16_OVFL for each instruction. Ideally we would only do it once for clauses and after ilp scheduling, but this can be improved in the future. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:25 +00:00
Georg Lehmann	65650cfef8	aco: emit float8 wmma Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:25 +00:00
Rhys Perry	325dfd809a	radv,aco: switch to shader statistics framework Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12756 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35583>	2025-06-20 09:26:58 +00:00
Rhys Perry	2cfd2d3b1d	aco/tests: add lower_branches tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:39 +00:00
Rhys Perry	c45482e652	aco: validate that preds/succs match This isn't done in validate_cfg() because that's called less frequently. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:39 +00:00
Rhys Perry	85db025cd7	aco: continue when try_remove_simple_block can't remove a predecessor We should update linear_preds so that the predecessors we can remove are actually removed. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:38 +00:00
Rhys Perry	5344abbc56	aco/lower_branches: keep blocks with multiple logical successors It might be the case that both the branch and exec mask write in a divergent branch block are removed. try_remove_simple_block() might then try to remove it, but fail because it has multiple logical successors. Instead, just skip these blocks. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.1 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:38 +00:00
Georg Lehmann	001fe8c236	aco: optimize boolean phi with empty else block We can keep the else empty by handling the phi in the "then" block. Foz-DB Navi21: Totals from 921 (1.15% of 80065) affected shaders: Instrs: 4532598 -> 4527309 (-0.12%); split: -0.12%, +0.00% CodeSize: 24498484 -> 24481780 (-0.07%); split: -0.08%, +0.01% Latency: 41016915 -> 41020477 (+0.01%); split: -0.10%, +0.11% InvThroughput: 9998405 -> 9991873 (-0.07%); split: -0.08%, +0.02% SClause: 128261 -> 128267 (+0.00%) Copies: 409949 -> 408585 (-0.33%); split: -0.36%, +0.02% Branches: 169740 -> 169222 (-0.31%); split: -0.58%, +0.27% PreSGPRs: 64408 -> 64398 (-0.02%) VALU: 2972521 -> 2972518 (-0.00%) SALU: 673844 -> 668973 (-0.72%); split: -0.72%, +0.00% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35165>	2025-06-19 07:32:43 +00:00
Georg Lehmann	88753ddd1d	aco: allow nir divergence to be printed again Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32990>	2025-06-19 07:02:20 +00:00
Samuel Pitoiset	d23de4918e	aco: add support for image f32 atomic add It's supported on GFX12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35493>	2025-06-13 08:47:59 +00:00
Pierre-Eric Pelloux-Prayer	3bcbd11a33	aco/isel: fix visit_tex handling of is_sparse For cases when less than 4 components are read, the original code would compute an incorrect dmask. E.g. with a single component + is_sparse, the dmask was 0x13: - 0x03 = coming from nir_def_components_read - 0x10 = the sparse bit, while it should have 2 bits set (1 for the color/depth, 1 for tfe). This caused problems when expand_vector() used the dmask to generate the final results, because the value for the sparse component was read from the wrong index. So after the call to emit_mimg() dmask needs to be adjusted because the components will be stored in order, so if mask is 0x11 the tfe value would be stored at invalid index=5 (while it should be at index=1). This fixes KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupResidency_texture_2d_depth_component16 and KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16 with ACO. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>	2025-06-11 12:11:28 +00:00
Georg Lehmann	f36ac8434c	aco: add a readme entry for v_pk_cvt_u8_f32 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:05 +00:00
Georg Lehmann	94c191e6d9	aco: remove p_v_cvt_pk_u8_f32 Now unused. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:04 +00:00
Georg Lehmann	d95e90ab5f	aco: do not use v_cvt_pk_u8_f32 for f2u8 The ISA docs don't mention this, but instead of always truncating like other integer conversions, this opcode actually uses the single precision rounding mode. We could continue to use the opcode and set the rounding mode to rtz in lower_to_hw_instrs, but I think I should just concede that f2u8 isn't worth the effort. Fixes: `9bb10b58` ("aco: use v_cvt_pk_u8_f32 for f2u8") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:04 +00:00
Natalie Vock	a28515f096	aco/opt: Rename loop header phis Fossil stats on top of !35269: Totals from 133 (0.16% of 81077) affected shaders: Instrs: 4328456 -> 4327891 (-0.01%) CodeSize: 22890004 -> 22887732 (-0.01%); split: -0.01%, +0.00% Latency: 28406452 -> 28404732 (-0.01%) InvThroughput: 5361458 -> 5361153 (-0.01%) Copies: 376788 -> 376222 (-0.15%) VALU: 2429210 -> 2428645 (-0.02%) VOPD: 57 -> 56 (-1.75%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35270>	2025-06-09 14:36:44 +00:00
Rhys Perry	00dd0d0dd1	aco: update VALUReadSGPRHazard comment Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>	2025-06-09 10:12:25 +00:00
Rhys Perry	a714a19e16	aco/gfx12: fix VALUReadSGPRHazard with carry-out fossil-db (gfx1201): Totals from 370 (0.46% of 79653) affected shaders: Instrs: 3933639 -> 3935914 (+0.06%) CodeSize: 20743448 -> 20752068 (+0.04%); split: -0.00%, +0.04% Latency: 26261246 -> 26261921 (+0.00%); split: -0.00%, +0.00% InvThroughput: 5363675 -> 5363760 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `65f95ae74e` ("aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>	2025-06-09 10:12:25 +00:00
Marek Olšák	80236f2367	ac/nir/tess: add if/endif for HS threads in NIR instead of ACO/LLVM This just removes the if/endif wrapping for LLVM, and hopefully the ACO change does the same thing. ACO had redundant code in endif_merged_wave_info, which is removed here. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Rhys Perry	86ccceb4de	aco: don't consider gfx1153 to have point sample acceleration Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	f10b49781d	aco: make all wait entries linear If we remove exec skips, then we can wait for an entry on all paths in the linear cfg, but not the logical cfg. fossil-db (gfx1201): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 1586 (1.99% of 79653) affected shaders: Instrs: 5118897 -> 5113206 (-0.11%); split: -0.11%, +0.00% CodeSize: 28365852 -> 28343696 (-0.08%); split: -0.08%, +0.00% Latency: 47820341 -> 47799532 (-0.04%); split: -0.09%, +0.05% InvThroughput: 9904391 -> 9908653 (+0.04%); split: -0.02%, +0.06% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	1088ac49db	aco: sometimes join linear wait entries on logical edges fossil-db (gfx1201): Totals from 1303 (1.64% of 79653) affected shaders: Instrs: 6920949 -> 6917692 (-0.05%); split: -0.06%, +0.01% CodeSize: 37112404 -> 37095728 (-0.04%); split: -0.05%, +0.01% Latency: 70471343 -> 70365986 (-0.15%); split: -0.15%, +0.00% InvThroughput: 11515673 -> 11504666 (-0.10%); split: -0.10%, +0.01% fossil-db (navi31): Totals from 1293 (1.62% of 79653) affected shaders: Instrs: 6500186 -> 6496761 (-0.05%); split: -0.06%, +0.01% CodeSize: 34562712 -> 34549236 (-0.04%); split: -0.04%, +0.01% Latency: 68604746 -> 68666532 (+0.09%); split: -0.15%, +0.24% InvThroughput: 11276591 -> 11284914 (+0.07%); split: -0.10%, +0.17% fossil-db (navi21): Totals from 811 (1.02% of 79653) affected shaders: Instrs: 4110953 -> 4108788 (-0.05%); split: -0.05%, +0.00% CodeSize: 22955984 -> 22948064 (-0.03%); split: -0.03%, +0.00% Latency: 35070231 -> 35064448 (-0.02%); split: -0.02%, +0.00% InvThroughput: 6945610 -> 6945053 (-0.01%); split: -0.01%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	c1f8537131	aco: skip waitcnt between two vmem writing different lanes fossil-db (gfx1201): Totals from 1382 (1.74% of 79653) affected shaders: Instrs: 6531704 -> 6523935 (-0.12%); split: -0.12%, +0.00% CodeSize: 34992076 -> 34933568 (-0.17%); split: -0.17%, +0.01% Latency: 70183360 -> 69616066 (-0.81%); split: -0.81%, +0.00% InvThroughput: 11155445 -> 11068667 (-0.78%); split: -0.78%, +0.00% fossil-db (navi31): Totals from 46 (0.06% of 79653) affected shaders: Instrs: 1833768 -> 1833732 (-0.00%) CodeSize: 9468788 -> 9468716 (-0.00%) Latency: 11683092 -> 11667865 (-0.13%) InvThroughput: 2274377 -> 2272872 (-0.07%) fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9649deb50e	aco: skip waitcnt between two vmem writing different halves fossil-db (gfx1201): Totals from 4 (0.01% of 79653) affected shaders: Instrs: 41374 -> 41380 (+0.01%); split: -0.01%, +0.02% CodeSize: 238912 -> 238924 (+0.01%); split: -0.01%, +0.01% Latency: 706714 -> 706410 (-0.04%) InvThroughput: 352269 -> 352118 (-0.04%) VClause: 803 -> 798 (-0.62%) fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13028 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9a38ad3ca7	aco: add wait_entry::logical_events Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	bb99de00f7	aco: add wait_entry::vm_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	b70ecfa588	aco: only join barrier_imm/barrier_events for logical edges fossil-db (gfx1201): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2904 -> 2893 (-0.38%) CodeSize: 14944 -> 14900 (-0.29%) Latency: 14703 -> 14248 (-3.09%) InvThroughput: 1237 -> 1210 (-2.18%) fossil-db (navi31): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2742 -> 2731 (-0.40%) CodeSize: 14136 -> 14092 (-0.31%) Latency: 14744 -> 14287 (-3.10%) InvThroughput: 1241 -> 1213 (-2.26%) fossil-db (navi21): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2326 -> 2315 (-0.47%) CodeSize: 12472 -> 12428 (-0.35%) Latency: 14921 -> 14465 (-3.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	62a9b4b976	aco: set vmem_types for args_pending_vmem fossil-db (gfx1201): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi31): Totals from 11 (0.01% of 79653) affected shaders: Instrs: 4543 -> 4554 (+0.24%) CodeSize: 23256 -> 23300 (+0.19%) fossil-db (navi21): Totals from 8 (0.01% of 79653) affected shaders: Instrs: 2333 -> 2341 (+0.34%) CodeSize: 12328 -> 12360 (+0.26%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Georg Lehmann	a6675f35b2	aco: clamp exponent of 16bit ldexp The hw uses only a 16bit int, but NIR's src is 32bit. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34073>	2025-06-03 06:34:18 +00:00
Rhys Perry	1fdfdbaf92	aco/hard_clauses: simplify and complete get_type() This now includes image_msaa_load and the new atomic instructions in GFX12. It also treats point sample accelerated MIMG as either sample or load, like the waitcnt insertion pass. I'm not sure if that's necessary or not, though. No fossil-db changes (gfx1201, gfx1150 and navi31). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>	2025-06-02 10:28:10 +00:00
Rhys Perry	8764ec0230	aco: consider image_msaa_load a sample operation before gfx12 LLVM commit 62dea99a7d7df9daedbb86133f3d46699cd2728d made this instruction a sample for all GFX levels, then with f898161bfa95723954a273a519180e070a5ccd2e it was changed to be GFX12+. Now 34b6285735c999d2fab77b0ff8e5b497d86df3af changed it to be all GFX levels again. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35235>	2025-06-02 10:28:09 +00:00
Samuel Pitoiset	9692ef41a3	aco: implement bitfield_extract for 8-bit/16-bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35199>	2025-05-29 12:24:59 +00:00
Samuel Pitoiset	8596150ae8	aco: implement bitfield_reverse for types other than 32-bits Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34583>	2025-05-28 09:52:12 +00:00
Daniel Schürmann	5b4d284493	aco/isel: use vector-aligned operands for image_bvh64_intersect_ray Totals from 93 (0.12% of 79377) affected shaders: (Navi48) MaxWaves: 1376 -> 1368 (-0.58%) Instrs: 3583500 -> 3581861 (-0.05%); split: -0.05%, +0.00% CodeSize: 18792300 -> 18785296 (-0.04%); split: -0.04%, +0.00% VGPRs: 8652 -> 8592 (-0.69%); split: -1.25%, +0.55% Latency: 20861347 -> 20834407 (-0.13%); split: -0.17%, +0.04% InvThroughput: 4032604 -> 4028020 (-0.11%); split: -0.14%, +0.03% VClause: 90507 -> 90525 (+0.02%); split: -0.01%, +0.03% Copies: 279429 -> 277839 (-0.57%); split: -0.58%, +0.01% Branches: 100260 -> 100251 (-0.01%) PreVGPRs: 8949 -> 8771 (-1.99%) VALU: 1955635 -> 1954053 (-0.08%); split: -0.08%, +0.00% SALU: 477347 -> 477329 (-0.00%); split: -0.01%, +0.01% VOPD: 69 -> 61 (-11.59%) Totals from 93 (0.12% of 79377) affected shaders: (Navi31) MaxWaves: 1376 -> 1374 (-0.15%) Instrs: 3442606 -> 3440344 (-0.07%); split: -0.07%, +0.00% CodeSize: 17801008 -> 17790476 (-0.06%); split: -0.07%, +0.01% VGPRs: 8652 -> 8556 (-1.11%); split: -1.25%, +0.14% Latency: 20590943 -> 20542279 (-0.24%); split: -0.27%, +0.03% InvThroughput: 3978133 -> 3969497 (-0.22%); split: -0.25%, +0.03% VClause: 91784 -> 91769 (-0.02%); split: -0.05%, +0.03% Copies: 277177 -> 275263 (-0.69%); split: -0.70%, +0.01% Branches: 100098 -> 100092 (-0.01%); split: -0.02%, +0.01% PreVGPRs: 9021 -> 8843 (-1.97%) VALU: 2001794 -> 1999893 (-0.09%); split: -0.10%, +0.00% SALU: 419504 -> 419559 (+0.01%); split: -0.01%, +0.02% VOPD: 77 -> 64 (-16.88%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Rhys Perry	c50f9541e4	aco/tests: Add tests for vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	b5382faa9c	aco/validate: validate register assignment of vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	9091c3bf5b	aco/ra: add affinities for MIMG vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	fb689f133e	aco/ra: handle register assignment of vector-aligned operands Vector-aligned operands are handled by temporarily allocating a vector-SSA value for the duration of the instruction. On completion of the register assignment, the individual operands are assigned to the reserved register space and, if necessary, parallelcopies are emitted. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	92b1154397	aco/ra: Always rename copy-kill operands, even if the temporary doesn't match This makes it independent of whether the operand already got renamed or not. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	4fad3514a9	aco/ra: only change registers of already handled operands in update_renames() Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	51a2e1eb94	aco/ra: don't use kill-flags as indicator in get_reg_create_vector() We are about to re-use this function for vector-aligned operands. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	3d8b355f22	aco/assembler: support vector-aligned operands on MIMG instructions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:17 +00:00
Daniel Schürmann	8cb1700c74	aco/print_ir: print parenthesis around vector-aligned operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:16 +00:00
Daniel Schürmann	6aabcb02a1	aco/print_ir: only print 'lateKill' if requested via print_kill flag Also only print lateKill for actually killed operands. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:16 +00:00
Daniel Schürmann	a9645fdd89	aco: introduce concept of vector-aligned Operands Operand::isVectorAligned indicates that the Operand is part of a vector consisting of multiple operands. Therefore, it must reside in a register aligned with the next Operand. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:16 +00:00
Daniel Schürmann	a4fa3935fd	aco/live_var_analysis: set same lateKill flags for same operands Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:16 +00:00
Daniel Schürmann	ee0ee282b9	aco: simplify Operand() constructor Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>	2025-05-28 09:24:16 +00:00
Rhys Perry	072e6d1ab5	aco/tests: add tests for tied definitions Some of these would have failed before the rewrite. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>	2025-05-20 15:40:47 +00:00
Rhys Perry	b341a12526	aco/ra: rewrite handling of tied definitions The old version worked by precoloring both the operand and definition to whatever register the operand was at the time. This didn't allow moving the operand/definition after precoloring them, or two tied operands with the same temporary. The new version works by temporarily making the operands late kill and creating a copy if the temporary is live-through or used as another tied operand. Then we can simply later make the operands early kill again and assign the definitions to the operand's register. This way, we can move the operand to make space and the new location will not intersect with any other definition and won't cause the operand and definition registers to mismatch. fossil-db (gfx1201): Totals from 2253 (2.84% of 79377) affected shaders: Instrs: 1634747 -> 1630799 (-0.24%); split: -0.27%, +0.03% CodeSize: 8688148 -> 8672348 (-0.18%); split: -0.20%, +0.02% VGPRs: 106500 -> 106512 (+0.01%) Latency: 11385480 -> 11382965 (-0.02%); split: -0.04%, +0.01% InvThroughput: 1754430 -> 1754326 (-0.01%); split: -0.01%, +0.00% SClause: 38954 -> 38964 (+0.03%); split: -0.01%, +0.04% Copies: 110772 -> 110800 (+0.03%); split: -0.02%, +0.04% Branches: 29093 -> 29092 (-0.00%) VALU: 902011 -> 902008 (-0.00%) SALU: 260175 -> 260203 (+0.01%); split: -0.01%, +0.02% fossil-db (navi31): Totals from 2 (0.00% of 79377) affected shaders: Latency: 1766 -> 1765 (-0.06%) InvThroughput: 3219 -> 3215 (-0.12%) fossil-db (navi21): Totals from 14 (0.02% of 79377) affected shaders: (no affected stats) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>	2025-05-20 15:40:47 +00:00
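
Note on a6675f35b2 ("aco: clamp exponent of 16bit ldexp"): the message says the hardware reads only a 16-bit integer exponent while NIR supplies a 32-bit one. If the upper bits were simply dropped, an exponent such as 65536 (low 16 bits all zero) would act like 0 instead of overflowing. The sketch below shows one way such a clamp could look in plain C. It is an illustration only, not the code from the commit, and the helper name `clamp_exp_to_i16` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper (not taken from Mesa): saturate a 32-bit exponent
 * into the signed 16-bit range, so that very large or very small
 * exponents stay very large or very small after narrowing. */
static int16_t clamp_exp_to_i16(int32_t exp)
{
    if (exp > INT16_MAX)
        return INT16_MAX;
    if (exp < INT16_MIN)
        return INT16_MIN;
    return (int16_t)exp; /* in range, so the narrowing is exact */
}
```

With this helper, 65536 maps to 32767 rather than to 0, and in-range exponents pass through unchanged.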


3777 commits