fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 00:48:07 +02:00

Author	SHA1	Message	Date
Georg Lehmann	96793fb0c1	aco/isel: implement 16bit vec2 shifts The source bit size mismatch is a bit annoying, but it's still worth it to vectorize these. Foz-DB Navi48: Totals from 85 (0.11% of 80251) affected shaders: Instrs: 119073 -> 118827 (-0.21%); split: -0.21%, +0.00% CodeSize: 669604 -> 667552 (-0.31%); split: -0.31%, +0.00% VGPRs: 4796 -> 4736 (-1.25%) Latency: 1907685 -> 1901983 (-0.30%); split: -0.32%, +0.02% InvThroughput: 642603 -> 640680 (-0.30%); split: -0.33%, +0.03% VClause: 2088 -> 2091 (+0.14%) Copies: 18300 -> 18394 (+0.51%); split: -0.01%, +0.52% Branches: 3452 -> 3440 (-0.35%) VALU: 63378 -> 63144 (-0.37%); split: -0.37%, +0.00% SALU: 23065 -> 23076 (+0.05%); split: -0.00%, +0.05% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35825>	2025-07-09 07:23:08 +00:00
Daniel Schürmann	2c51a8870d	nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() Similar to nir_lower_alu_width(), the callback can return the desired number of components for a phi, or 0 for no lowering. The previous behavior of nir_lower_phis_to_scalar() with lower_all=true can be elicited via nir_lower_all_phis_to_scalar() while the previous behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar() with NULL callback. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
Rhys Perry	34f1a8f707	aco: handle FPAtomicToDenormModeHazard This is quite unlikely to happen, but I guess it might be possible and it's relatively simple to work around. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35884>	2025-07-07 13:02:43 +00:00
Marek Olšák	4263b49778	ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass This is a cleanup. Old gs LDS layout: [es outputs][gs outputs][scratch] Old nogs LDS layout: [xfb/cull][scratch] New gs LDS layout: [es outputs][scratch|gs outputs] New nogs LDS layout: [scratch|xfb/cull] The LDS scratch is moved to the beginning of the preceding buffer in LDS, while the addresses in that LDS buffer are offset by the scratch size. It effectively merges the LDS scratch with the preceding buffer in LDS. Thanks to that, we no longer need the ngg_scratch ABI and the offset in a user SGPR. The lowering passes now return the LDS scratch size, which is used by the drivers to determine the final LDS size. The ngg_lds_layout SGPR is now unused without GS in RADV. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:41 +00:00
Rhys Perry	dce1d4ad4c	aco/ra: fix repeated compact_linear_vgprs() in get_reg() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `b7738de4f9` ("aco/ra: rework linear VGPR allocation") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13431 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35838>	2025-07-02 09:26:04 +00:00
Rhys Perry	21c4400278	aco: update ctx.block when inserting discard block Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13432 Backport-to: 25.1 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35833>	2025-07-01 14:31:11 +00:00
Alyssa Rosenzweig	67237b6f1b	treewide: use nir_break_if Via Coccinelle patch: @@ expression builder, condition; @@ -nir_push_if(builder, condition); -{ -nir_jump(builder, nir_jump_break); -} -nir_pop_if(builder, NULL); +nir_break_if(builder, condition); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>	2025-06-30 14:51:24 -04:00
Natalie Vock	af86cc37d5	aco/spill: Don't spill scratch_rsrc-related temps These temps are used to create the scratch_rsrc. Spilling them will never benefit anything, because assign_spill_slots will insert code that keeps them live. Since the spiller assumes all spilled variables to be dead, this can cause more variables being live than intended and spilling to fail. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:53 +00:00
Natalie Vock	acf29e403a	aco/spill: Add a null scratch offset if no scratch_offset arg exists Function callees' scratch_rsrc comes with the scratch offset pre-applied. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:53 +00:00
Natalie Vock	630913e1b4	aco: Introduce static_scratch_rsrc program member Function callees get their scratch resource as a parameter instead of generating it on-the-fly. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:53 +00:00
Natalie Vock	e006f68b11	aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:53 +00:00
Natalie Vock	a5eba11657	aco/isel: Use stack pointer parameter in load/store_scratch Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:53 +00:00
Natalie Vock	4a62b342f3	aco: Add common utility to load scratch descriptor Also modifies the scratch descriptor to take the stack pointer into account. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:52 +00:00
Natalie Vock	cd2caa5e2b	aco/spill: Use scratch stack pointer Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:52 +00:00
Natalie Vock	22624d6f12	aco: Add scratch stack pointer Function callees shouldn't overwrite caller's stacks. Track where to write scratch data with a stack pointer. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:52 +00:00
Natalie Vock	be89c02be5	aco: Add pseudo instr to calculate a function callee's stack pointer Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35031>	2025-06-26 11:02:52 +00:00
Daniel Schürmann	7620957193	aco/ra: always set fill_operands=true when handling operands This makes the behavior consistent and less prone to error. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35735>	2025-06-26 10:05:07 +00:00
Daniel Schürmann	ee8424d839	aco/ra: always fill moved operands when handling vector-operands update_renames() assumes that killed operands are already removed from the register file, except for precolored and copy-kill operands. When dealing with vector-operands, however, unrelated operands might also be moved, in order to make space. Fixes: `fb689f133e` ('aco/ra: handle register assignment of vector-aligned operands') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35735>	2025-06-26 10:05:07 +00:00
Samuel Pitoiset	e91029c82d	aco: consider that nir_tex_src_{coord,ddx} can be the first source Only -1 means it's not found, but 0 is still valid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35736>	2025-06-25 17:20:02 +00:00
Georg Lehmann	01d20680e2	aco/optimizer: generalize p_create_vector of split vector opt Foz-DB Navi48: Totals from 116 (0.14% of 80251) affected shaders: MaxWaves: 2965 -> 2972 (+0.24%) Instrs: 145933 -> 144632 (-0.89%); split: -0.91%, +0.02% CodeSize: 815968 -> 806512 (-1.16%); split: -1.20%, +0.04% VGPRs: 7240 -> 7144 (-1.33%); split: -1.66%, +0.33% Latency: 3065858 -> 3063802 (-0.07%); split: -0.11%, +0.05% InvThroughput: 745395 -> 743506 (-0.25%); split: -0.26%, +0.01% VClause: 3702 -> 3694 (-0.22%); split: -0.65%, +0.43% SClause: 3187 -> 3191 (+0.13%) Copies: 12716 -> 11804 (-7.17%); split: -7.42%, +0.25% Branches: 3501 -> 3503 (+0.06%) PreVGPRs: 5400 -> 5327 (-1.35%); split: -1.41%, +0.06% VALU: 76455 -> 75492 (-1.26%); split: -1.30%, +0.04% SALU: 23594 -> 23595 (+0.00%); split: -0.00%, +0.01% VOPD: 1478 -> 1527 (+3.32%); split: +4.67%, -1.35% Mostly helps FSR4. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35674>	2025-06-25 11:03:30 +00:00
Georg Lehmann	001cd632ee	aco: select float8 to fp32 conversions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Georg Lehmann	19ca4be6b0	aco/isel: fix get_alu_src with 8bit vec2 source Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Georg Lehmann	f047a67fba	nir,aco: optimize FP16_OVFL pattern created by vkd3d-proton Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Georg Lehmann	9e6adcbca0	aco: select fp32 to float8 conversions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:26 +00:00
Georg Lehmann	3a45802514	aco/lower_to_hw: support saturating fp8 conversions Sadly AMD only made this behavior controllable with global state. We add a new pseudo opcode for this purpose and change FP16_OVFL for each instruction. Ideally we would only do it once for clauses and after ilp scheduling, but this can be improved in the future. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:25 +00:00
Georg Lehmann	65650cfef8	aco: emit float8 wmma Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:25 +00:00
Rhys Perry	325dfd809a	radv,aco: switch to shader statistics framework Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12756 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35583>	2025-06-20 09:26:58 +00:00
Rhys Perry	2cfd2d3b1d	aco/tests: add lower_branches tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:39 +00:00
Rhys Perry	c45482e652	aco: validate that preds/succs match This isn't done in validate_cfg() because that's called less frequently. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:39 +00:00
Rhys Perry	85db025cd7	aco: continue when try_remove_simple_block can't remove a predecessor We should update linear_preds so that the predecessors we can remove are actually removed. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:38 +00:00
Rhys Perry	5344abbc56	aco/lower_branches: keep blocks with multiple logical successors It might be the case that both the branch and exec mask write in a divergent branch block are removed. try_remove_simple_block() might then try to remove it, but fail because it has multiple logical successors. Instead, just skip these blocks. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.1 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>	2025-06-19 10:58:38 +00:00
Georg Lehmann	001fe8c236	aco: optimize boolean phi with empty else block We can keep the else empty by handling the phi in the "then" block. Foz-DB Navi21: Totals from 921 (1.15% of 80065) affected shaders: Instrs: 4532598 -> 4527309 (-0.12%); split: -0.12%, +0.00% CodeSize: 24498484 -> 24481780 (-0.07%); split: -0.08%, +0.01% Latency: 41016915 -> 41020477 (+0.01%); split: -0.10%, +0.11% InvThroughput: 9998405 -> 9991873 (-0.07%); split: -0.08%, +0.02% SClause: 128261 -> 128267 (+0.00%) Copies: 409949 -> 408585 (-0.33%); split: -0.36%, +0.02% Branches: 169740 -> 169222 (-0.31%); split: -0.58%, +0.27% PreSGPRs: 64408 -> 64398 (-0.02%) VALU: 2972521 -> 2972518 (-0.00%) SALU: 673844 -> 668973 (-0.72%); split: -0.72%, +0.00% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35165>	2025-06-19 07:32:43 +00:00
Georg Lehmann	88753ddd1d	aco: allow nir divergence to be printed again Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32990>	2025-06-19 07:02:20 +00:00
Samuel Pitoiset	d23de4918e	aco: add support for image f32 atomic add It's supported on GFX12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35493>	2025-06-13 08:47:59 +00:00
Pierre-Eric Pelloux-Prayer	3bcbd11a33	aco/isel: fix visit_tex handling of is_sparse For cases when fewer than 4 components are read, the original code would compute an incorrect dmask. E.g. with a single component + is_sparse, the dmask was 0x13: - 0x 3 = coming from nir_def_components_read - 0x10 = the sparse bit, while it should have 2 bits set (1 for the color/depth, 1 for tfe). This caused problems when expand_vector() used the dmask to generate the final results, because the value for the sparse component was read from the wrong index. So after the call to emit_mimg() dmask needs to be adjusted because the components will be stored in order, so if mask is 0x11 the tfe value would be stored at invalid index=5 (while it should be at index=1). This fixes KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupResidency_texture_2d_depth_component16 and KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16 with ACO. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>	2025-06-11 12:11:28 +00:00
Georg Lehmann	f36ac8434c	aco: add a readme entry for v_pk_cvt_u8_f32 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:05 +00:00
Georg Lehmann	94c191e6d9	aco: remove p_v_cvt_pk_u8_f32 Now unused. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:04 +00:00
Georg Lehmann	d95e90ab5f	aco: do not use v_cvt_pk_u8_f32 for f2u8 The ISA docs don't mention this, but instead of always truncating like other integer conversions, this opcode actually uses the single precision rounding mode. We could continue to use the opcode and set the rounding mode to rtz in lower_to_hw_instrs, but I think I should just concede that f2u8 isn't worth the effort. Fixes: `9bb10b58` ("aco: use v_cvt_pk_u8_f32 for f2u8") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>	2025-06-10 07:32:04 +00:00
Natalie Vock	a28515f096	aco/opt: Rename loop header phis Fossil stats on top of !35269: Totals from 133 (0.16% of 81077) affected shaders: Instrs: 4328456 -> 4327891 (-0.01%) CodeSize: 22890004 -> 22887732 (-0.01%); split: -0.01%, +0.00% Latency: 28406452 -> 28404732 (-0.01%) InvThroughput: 5361458 -> 5361153 (-0.01%) Copies: 376788 -> 376222 (-0.15%) VALU: 2429210 -> 2428645 (-0.02%) VOPD: 57 -> 56 (-1.75%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35270>	2025-06-09 14:36:44 +00:00
Rhys Perry	00dd0d0dd1	aco: update VALUReadSGPRHazard comment Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>	2025-06-09 10:12:25 +00:00
Rhys Perry	a714a19e16	aco/gfx12: fix VALUReadSGPRHazard with carry-out fossil-db (gfx1201): Totals from 370 (0.46% of 79653) affected shaders: Instrs: 3933639 -> 3935914 (+0.06%) CodeSize: 20743448 -> 20752068 (+0.04%); split: -0.00%, +0.04% Latency: 26261246 -> 26261921 (+0.00%); split: -0.00%, +0.00% InvThroughput: 5363675 -> 5363760 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `65f95ae74e` ("aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>	2025-06-09 10:12:25 +00:00
Marek Olšák	80236f2367	ac/nir/tess: add if/endif for HS threads in NIR instead of ACO/LLVM This just removes the if/endif wrapping for LLVM, and hopefully the ACO change does the same thing. ACO had redundant code in endif_merged_wave_info, which is removed here. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Rhys Perry	86ccceb4de	aco: don't consider gfx1153 to have point sample acceleration Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	f10b49781d	aco: make all wait entries linear If we remove exec skips, then we can wait for an entry on all paths in the linear cfg, but not the logical cfg. fossil-db (gfx1201): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 1586 (1.99% of 79653) affected shaders: Instrs: 5118897 -> 5113206 (-0.11%); split: -0.11%, +0.00% CodeSize: 28365852 -> 28343696 (-0.08%); split: -0.08%, +0.00% Latency: 47820341 -> 47799532 (-0.04%); split: -0.09%, +0.05% InvThroughput: 9904391 -> 9908653 (+0.04%); split: -0.02%, +0.06% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:55:13 +01:00
Rhys Perry	1088ac49db	aco: sometimes join linear wait entries on logical edges fossil-db (gfx1201): Totals from 1303 (1.64% of 79653) affected shaders: Instrs: 6920949 -> 6917692 (-0.05%); split: -0.06%, +0.01% CodeSize: 37112404 -> 37095728 (-0.04%); split: -0.05%, +0.01% Latency: 70471343 -> 70365986 (-0.15%); split: -0.15%, +0.00% InvThroughput: 11515673 -> 11504666 (-0.10%); split: -0.10%, +0.01% fossil-db (navi31): Totals from 1293 (1.62% of 79653) affected shaders: Instrs: 6500186 -> 6496761 (-0.05%); split: -0.06%, +0.01% CodeSize: 34562712 -> 34549236 (-0.04%); split: -0.04%, +0.01% Latency: 68604746 -> 68666532 (+0.09%); split: -0.15%, +0.24% InvThroughput: 11276591 -> 11284914 (+0.07%); split: -0.10%, +0.17% fossil-db (navi21): Totals from 811 (1.02% of 79653) affected shaders: Instrs: 4110953 -> 4108788 (-0.05%); split: -0.05%, +0.00% CodeSize: 22955984 -> 22948064 (-0.03%); split: -0.03%, +0.00% Latency: 35070231 -> 35064448 (-0.02%); split: -0.02%, +0.00% InvThroughput: 6945610 -> 6945053 (-0.01%); split: -0.01%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	c1f8537131	aco: skip waitcnt between two vmem writing different lanes fossil-db (gfx1201): Totals from 1382 (1.74% of 79653) affected shaders: Instrs: 6531704 -> 6523935 (-0.12%); split: -0.12%, +0.00% CodeSize: 34992076 -> 34933568 (-0.17%); split: -0.17%, +0.01% Latency: 70183360 -> 69616066 (-0.81%); split: -0.81%, +0.00% InvThroughput: 11155445 -> 11068667 (-0.78%); split: -0.78%, +0.00% fossil-db (navi31): Totals from 46 (0.06% of 79653) affected shaders: Instrs: 1833768 -> 1833732 (-0.00%) CodeSize: 9468788 -> 9468716 (-0.00%) Latency: 11683092 -> 11667865 (-0.13%) InvThroughput: 2274377 -> 2272872 (-0.07%) fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9649deb50e	aco: skip waitcnt between two vmem writing different halves fossil-db (gfx1201): Totals from 4 (0.01% of 79653) affected shaders: Instrs: 41374 -> 41380 (+0.01%); split: -0.01%, +0.02% CodeSize: 238912 -> 238924 (+0.01%); split: -0.01%, +0.01% Latency: 706714 -> 706410 (-0.04%) InvThroughput: 352269 -> 352118 (-0.04%) VClause: 803 -> 798 (-0.62%) fossil-db (navi31): Totals from 0 (0.00% of 79653) affected shaders: fossil-db (navi21): Totals from 0 (0.00% of 79653) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13028 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	9a38ad3ca7	aco: add wait_entry::logical_events Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	bb99de00f7	aco: add wait_entry::vm_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
Rhys Perry	b70ecfa588	aco: only join barrier_imm/barrier_events for logical edges fossil-db (gfx1201): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2904 -> 2893 (-0.38%) CodeSize: 14944 -> 14900 (-0.29%) Latency: 14703 -> 14248 (-3.09%) InvThroughput: 1237 -> 1210 (-2.18%) fossil-db (navi31): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2742 -> 2731 (-0.40%) CodeSize: 14136 -> 14092 (-0.31%) Latency: 14744 -> 14287 (-3.10%) InvThroughput: 1241 -> 1213 (-2.26%) fossil-db (navi21): Totals from 3 (0.00% of 79653) affected shaders: Instrs: 2326 -> 2315 (-0.47%) CodeSize: 12472 -> 12428 (-0.35%) Latency: 14921 -> 14465 (-3.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>	2025-06-06 11:51:08 +01:00
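
The result packing described in commit 3bcbd11a33 (aco/isel: fix visit_tex handling of is_sparse) can be illustrated with a small sketch. This is not the Mesa code: the helper names `num_data_components`, `tfe_result_index`, and `packed_result_mask` are hypothetical, and the only behavior assumed is what the commit message states, namely that results are stored in order with the tfe value after the enabled color/depth components.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; this is not Mesa code.
 * Counts the enabled color/depth components in the low four dmask bits. */
static unsigned num_data_components(uint32_t dmask)
{
   assert((dmask & ~0xfu) == 0); /* only the four data bits are expected here */
   unsigned count = 0;
   for (unsigned bit = 0; bit < 4; bit++)
      count += (dmask >> bit) & 1u;
   return count;
}

/* Results are stored in order, so the residency (tfe) value follows the
 * last enabled data component. */
static unsigned tfe_result_index(uint32_t dmask)
{
   return num_data_components(dmask);
}

/* Assumed form of the adjusted mask: one bit per stored data component
 * plus one bit for the tfe value, all contiguous from bit 0. */
static uint32_t packed_result_mask(uint32_t dmask)
{
   return (1u << (num_data_components(dmask) + 1)) - 1u;
}
```

For a single color/depth component (data mask 0x1), the sketch places the tfe value at index 1 and yields a mask with two bits set, matching the counts given in the commit message.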

3799 commits