fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 22:00:13 +01:00

Author	SHA1	Message	Date
Qiang Yu	5ba68f92b4	aco: create exit block for p_end_with_regs to branch to To handle ps discard in radeonsi part mode shader. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>	2023-10-10 02:36:34 +00:00
Georg Lehmann	34d8fa6185	aco/gfx11: optimize dual source export We can combine dpp with the v_cndmask_b32. Foz-DB Navi31: Totals from 222 (0.28% of 79330) affected shaders: Instrs: 564392 -> 563373 (-0.18%); split: -0.19%, +0.01% CodeSize: 2867040 -> 2864728 (-0.08%); split: -0.09%, +0.01% Latency: 4278957 -> 4275199 (-0.09%); split: -0.09%, +0.00% InvThroughput: 586636 -> 585824 (-0.14%); split: -0.14%, +0.00% SClause: 20210 -> 20211 (+0.00%); split: -0.02%, +0.02% Copies: 39763 -> 39778 (+0.04%); split: -0.13%, +0.17% PreVGPRs: 13924 -> 13922 (-0.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25541>	2023-10-05 10:37:34 +00:00
Rhys Perry	26fce534b5	aco: shrink DPP8_instruction Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>	2023-10-04 18:53:43 +00:00
Georg Lehmann	4ea611bca0	aco: fix p_extract with v1 dst and s1 operand Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f14023666c` ("aco: Allow p_extract to have different definition and operand sizes.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25403>	2023-09-27 14:12:29 +00:00
Daniel Schürmann	040142684c	aco: make p_wqm a marker instruction without Operands/Definitions Totals from 28277 (36.93% of 76572) affected shaders: (GFX11) MaxWaves: 833930 -> 833898 (-0.00%); split: +0.01%, -0.01% Instrs: 21366950 -> 21353346 (-0.06%); split: -0.11%, +0.05% CodeSize: 112855368 -> 112610508 (-0.22%); split: -0.24%, +0.03% VGPRs: 1157748 -> 1158540 (+0.07%); split: -0.10%, +0.17% SpillSGPRs: 2465 -> 2463 (-0.08%); split: -0.16%, +0.08% Latency: 168339886 -> 168383646 (+0.03%); split: -0.10%, +0.12% InvThroughput: 25164895 -> 25158376 (-0.03%); split: -0.08%, +0.06% VClause: 347660 -> 346256 (-0.40%); split: -0.55%, +0.15% SClause: 794460 -> 799521 (+0.64%); split: -0.33%, +0.97% Copies: 1151908 -> 1148370 (-0.31%); split: -0.54%, +0.23% Branches: 359447 -> 359437 (-0.00%); split: -0.01%, +0.00% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>	2023-09-14 09:25:22 +00:00
Rhys Perry	1d29a1e2fc	aco: add adjust_bpermute_dst helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>	2023-08-23 12:36:46 +01:00
Rhys Perry	9169fbf83c	aco: clarify bpermute pseudo opcode names Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>	2023-08-23 12:36:46 +01:00
Rhys Perry	8a024c985f	aco: fix p_bpermute_gfx6's exec save/restore with wave32 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>	2023-08-23 12:36:46 +01:00
Rhys Perry	85957dd6e5	aco: fix p_bpermute_gfx6 with input at non-zero byte Same as the other bpermute pseudo instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>	2023-08-23 12:36:46 +01:00
Georg Lehmann	7a3e5dd2ec	aco: use s_bitreplicate_b64_b32 to set exec to 0xffff0000ffff0000 Foz-DB Navi21: Totals from 29 (0.02% of 132657) affected shaders: Instrs: 19342 -> 19301 (-0.21%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24561>	2023-08-09 20:29:01 +00:00
Vitaliy Triang3l Kuzmin	e0f4b52559	aco: Add Primitive Ordered Pixel Shading waitcnt rules When letting the overlapping waves enter their ordered sections, there must be no memory accesses to resources which need primitive-ordered access that are still pending, or there would be a race between the current wave and the overlapping waves. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>	2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin	a87628cd08	aco: Send MSG_ORDERED_PS_DONE where necessary If the wave has set the Primitive Ordered Pixel Shading packer ID hardware register, it must send MSG_ORDERED_PS_DONE once before the program ends. It's also safe to send the message if the packer ID register hasn't been set yet, therefore the message may be sent conservatively. For simplicity, to ensure that it's sent on all execution paths after setting the packer ID register, always sending it from a top-level block. This is required for GFX9-10.3 POPS. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>	2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin	f8e744f07f	aco: Add Primitive Ordered Pixel Shading pseudo-instructions Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>	2023-06-26 15:58:04 +00:00
Timur Kristóf	05928f4200	aco: Use ac_hw_stage instead of aco-specific HWStage. The new ac_hw_stage is going to be used by drivers as well. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>	2023-06-23 12:49:04 +00:00
Rhys Perry	cfa7eec06c	aco: don't set exec_hi for wave32 scan reductions fossil-db (wave32): Totals from 21 (0.02% of 133428) affected shaders: Instrs: 10778 -> 10712 (-0.61%) CodeSize: 56604 -> 56208 (-0.70%) Latency: 168293 -> 168251 (-0.02%) InvThroughput: 25256 -> 25253 (-0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23745>	2023-06-21 17:58:44 +00:00
Georg Lehmann	68f7c53814	aco/gfx10+: use v_cndmask with literal for reduction identity Totals from 10 (0.01% of 132657) affected shaders: CodeSize: 171576 -> 171288 (-0.17%) Instrs: 32127 -> 32055 (-0.22%) Latency: 219145 -> 219027 (-0.05%) InvThroughput: 130287 -> 130041 (-0.19%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23695>	2023-06-20 14:48:18 +00:00
Eric Engestrom	6b21653ab4	aco: reformat according to its .clang-format Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23253>	2023-06-16 19:59:52 +00:00
Daniel Schürmann	f66f274304	aco: implement nir_intrinsic_load_resume_shader_address_amd Similar to p_constaddr but targeting BBs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>	2023-06-08 00:37:03 +00:00
Rhys Perry	35c133a77b	aco: add MIMG_instruction::strict_wqm This lets us use linear VGPRs for part of the texture sample's address. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>	2023-05-25 16:29:16 +00:00
Rhys Perry	1a6a57ac96	aco: let p_start_linear_vgpr take an operand Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>	2023-05-25 16:29:16 +00:00
Qiang Yu	3c59df7318	aco: get scratch addr from symbol for radeonsi Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22727>	2023-04-28 11:33:28 +08:00
Harri Nieminen	aea48a4ff1	amd: fix typos Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22432>	2023-04-13 23:08:22 +00:00
Timur Kristóf	8e9d269da6	aco: Don't use nir_selection_control in aco_ir. We don't want to rely on any NIR structures in ACO, because we would like to avoid the need to include nir.h in aco_ir. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>	2023-04-10 20:01:28 +00:00
Timur Kristóf	54da863956	aco: Consider p_cbranch_nz as divergent branch too. A p_cbranch_nz instruction that reads exec is divergent too. Fixes: `f030b75b7d` Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>	2023-04-03 14:36:07 +00:00
Georg Lehmann	8ee1519cee	aco/to_hw_instr: use VOP1 opsel for v_mov_b16 Foz-DB GFX1100: Totals from 4661 (3.46% of 134864) affected shaders: CodeSize: 36500568 -> 36391704 (-0.30%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069>	2023-03-30 03:34:34 +00:00
Daniel Schürmann	39c828cb9f	aco: remove aco::rt_stack variable Since we initialize scratch in the RT proglog, there is no need for this variable anymore. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>	2023-03-16 01:40:30 +00:00
Daniel Schürmann	7d35bf24f6	aco: create hw_init_scratch() function for p_init_scratch lowering Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>	2023-03-16 01:40:30 +00:00
Daniel Schürmann	41ae2d0725	radv/rt: use terminate() when returning from raygen shaders Q2RTX stats: Totals from 7 (0.01% of 134913) affected shaders: CodeSize: 204712 -> 204744 (+0.02%); split: -0.06%, +0.07% Instrs: 37526 -> 37522 (-0.01%); split: -0.07%, +0.06% Latency: 950563 -> 956024 (+0.57%) InvThroughput: 187915 -> 188977 (+0.57%) Copies: 4829 -> 4763 (-1.37%) Branches: 1570 -> 1583 (+0.83%) PreSGPRs: 407 -> 400 (-1.72%) PreVGPRs: 614 -> 617 (+0.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736>	2023-03-08 16:59:41 +00:00
Georg Lehmann	097a97cc42	aco: remove VOP[123C]P? structs Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>	2023-03-07 11:53:23 +00:00
Georg Lehmann	77afe7d960	aco: treat VINTERP_INREG as VALU It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly be handled as a pure VALU instruction. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>	2023-03-07 11:53:23 +00:00
Daniel Schürmann	b338d59047	radv: unconditionally enable scratch for RT shaders Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Rhys Perry	b4383821e7	aco: don't modify exec in p_interp_gfx11 The RDNA3 ISA docs say that lds_param_load write the entire quad regardless of exec, so this isn't needed. fossil-db (gfx1100): Totals from 5291 (3.93% of 134574) affected shaders: Instrs: 4891396 -> 4789628 (-2.08%) CodeSize: 25519032 -> 25111960 (-1.60%) Latency: 36122982 -> 36074300 (-0.13%); split: -0.14%, +0.00% InvThroughput: 4162436 -> 4161424 (-0.02%); split: -0.02%, +0.00% Copies: 263862 -> 263838 (-0.01%) PreSGPRs: 225012 -> 224179 (-0.37%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21171>	2023-02-08 19:35:54 +00:00
Georg Lehmann	2b264455b5	aco: use s_pack_ll_b32_b16 for constant copies Totals from 2 (0.00% of 134913) affected shaders: CodeSize: 28636 -> 28628 (-0.03%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970>	2023-02-01 17:07:25 +00:00
Georg Lehmann	9ee9b0859b	aco: use s_bfm_64 for constant copies Foz-DB Navi21: Totals from 1025 (0.76% of 134913) affected shaders: CodeSize: 1436752 -> 1432412 (-0.30%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970>	2023-02-01 17:07:25 +00:00
Rhys Perry	c3dd1931d9	aco: allow Builder::Result to be dereferenced Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251>	2023-01-10 16:01:38 +00:00
Rhys Perry	e386523380	aco/gfx11: fix discard early exit removal optimization This optimization never happened because the NULL target was removed in GFX11. fossil-db (gfx1100): Totals from 5439 (4.04% of 134574) affected shaders: Instrs: 407865 -> 387123 (-5.09%) CodeSize: 2163340 -> 2060644 (-4.75%) Latency: 3432378 -> 3327802 (-3.05%) InvThroughput: 270133 -> 262980 (-2.65%) Branches: 8524 -> 3085 (-63.81%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20513>	2023-01-10 14:01:29 +00:00
Georg Lehmann	39b7502f04	aco: Use v_mov_b16 on GFX11. Foz-DB GFX1100: Totals from 4684 (3.47% of 134913) affected shaders: CodeSize: 41086444 -> 41043476 (-0.10%) Instrs: 8176019 -> 8175995 (-0.00%) Latency: 83792071 -> 83792023 (-0.00%) InvThroughput: 10311371 -> 10311369 (-0.00%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20369>	2023-01-03 22:49:46 +00:00
Rhys Perry	192486b7aa	aco/gfx11: export mrtz in discard early exit for non-color shaders If a shader doesn't export any color targets and instead only exports mrtz, the discard early exit block should match. Fixes artifacts on Lara in Rise of the Tomb Raider benchmark and hair in The Witcher 3 (classic). https://reviews.llvm.org/D128185 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `bc8da20dda` ("aco: export MRT0 instead of NULL on GFX11") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20345>	2022-12-16 15:35:28 +00:00
Timur Kristóf	db5c3f170f	aco: Emulate Wave64 bpermute on GFX11. Similar to emit_gfx10_wave64_bpermute, but uses the new v_permlane64_b32 instruction to swap data between wave halves. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	853e76f007	aco: Stylistic changes to emit_gfx10_wave64_bpermute. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	640e801651	aco: Split opcodes for GFX6 and GFX10 emulated bpermute. Different sequences are emitted for these, so it makes sense to have different opcodes too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Bas Nieuwenhuizen	89663828ea	aco: Don't use v_lshrrev_b64 for moves on GFX11. Looking at VOPD things, shifts are not very likely to get dual issued but plain moves are. Looking at RDNA2 v_lshrrev_b64 are half the perf of v_mov_b32 (but you need twice as many moves), so on GFX11 this likely reaches the threshold where moves are faster. Totals from 68400 (50.70% of 134906) affected shaders: CodeSize: 275489516 -> 275459536 (-0.01%); split: -0.01%, +0.00% Instrs: 51775474 -> 51991286 (+0.42%) Latency: 589884847 -> 589066439 (-0.14%); split: -0.15%, +0.01% InvThroughput: 127154986 -> 126037619 (-0.88%); split: -0.88%, +0.00% Copies: 3756157 -> 3976193 (+5.86%) Branches: 1259604 -> 1260072 (+0.04%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19633>	2022-12-02 13:25:57 +00:00
Rhys Perry	9b6ab40b3b	aco: improve do_pack_2x16() with zero constants We can skip the v_or_b32 or use an instruction smaller than v_alignbyte_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19933>	2022-12-01 21:43:28 +00:00
Rhys Perry	ce5838599d	aco/gfx11: use v_cvt_i32_i16/v_cvt_u32_u16 fossil-db (gfx1100): Totals from 52753 (39.07% of 135032) affected shaders: CodeSize: 153603860 -> 153163384 (-0.29%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19933>	2022-12-01 21:43:28 +00:00
Samuel Pitoiset	ce11c06429	aco: fix emitting DEALLOC_VGPRS in the discard block It should be emitted right before s_endpgm. Cc: 22.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19931>	2022-11-22 19:52:04 +00:00
Samuel Pitoiset	bb90d29660	aco: add p_dual_src_export_gfx11 for dual source blending on GFX11 Dual source blending must be in strict WQM mode. Cc: 22.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19643>	2022-11-16 18:35:10 +00:00
Daniel Schürmann	efc0835787	aco: move statistics enum to aco_shader_info.h to make it accessible from the driver. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19721>	2022-11-15 16:34:07 +00:00
Samuel Pitoiset	369c9b6425	aco: fix p_interp_gfx11 to not overwrite SCC s_wqm_b64 clobbers SCC. Found this while working on dual source blending. Fixes: `6113ee650a` ("aco/gfx11: fix FS input loads in quad-divergent control flow") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19747>	2022-11-15 15:57:31 +00:00
Georg Lehmann	9746ddf1d6	aco: Use s_pack_ll_b32_b16 for scalar zero extend. Foz-DB Navi21: Totals from 2403 (1.78% of 134913) affected shaders: CodeSize: 25329156 -> 25311244 (-0.07%) Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19413>	2022-11-01 18:59:53 +00:00
Rhys Perry	6113ee650a	aco/gfx11: fix FS input loads in quad-divergent control flow This is not ideal and it would be great to somehow make it better some day. fossil-db (gfx1100): Totals from 5208 (3.86% of 135032) affected shaders: MaxWaves: 127058 -> 126962 (-0.08%); split: +0.01%, -0.09% Instrs: 3983440 -> 4072736 (+2.24%); split: -0.00%, +2.24% CodeSize: 21872468 -> 22230852 (+1.64%); split: -0.00%, +1.64% VGPRs: 206688 -> 206984 (+0.14%); split: -0.05%, +0.20% Latency: 37447383 -> 37491197 (+0.12%); split: -0.05%, +0.17% InvThroughput: 6421955 -> 6422348 (+0.01%); split: -0.03%, +0.03% VClause: 71579 -> 71545 (-0.05%); split: -0.09%, +0.04% SClause: 148289 -> 147146 (-0.77%); split: -0.84%, +0.07% Copies: 259011 -> 258084 (-0.36%); split: -0.61%, +0.25% Branches: 101366 -> 101314 (-0.05%); split: -0.10%, +0.05% PreSGPRs: 223482 -> 223460 (-0.01%); split: -0.21%, +0.20% PreVGPRs: 184448 -> 184744 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19370>	2022-11-01 12:42:43 +00:00

181 commits