fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
Timur Kristóf	640e801651	aco: Split opcodes for GFX6 and GFX10 emulated bpermute. Different sequences are emitted for these, so it makes sense to have different opcodes too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Ian Romanick	eb76cee9f8	nir: Eliminate nir_op_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variations that include both ine and i2b), always lower i2b to a != 0. At this point in the series, it should be impossible for anything to generate i2b, so there /should not/ be any changes. The failing test on d3d12 is a pre-existing bug that is triggered by this change. I talked to Jesse about it, and, after some analysis, he suggested just adding it to the list of known failures. v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b. v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py. v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after nir_lower_doubles makes progress. The latter can generate b2i instructions, but nir_lower_int64 can't handle them (anymore). v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I had accidentally removed the f2b(bf2(x)) optimization. v6: Just eliminate the i2b instruction. v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused) emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this function was still used. 🤷 No shader-db changes on any Intel platform. All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141165875 -> 141165873 (-0.0%) Instructions helped: 2 Cycles in all programs: 9098956382 -> 9098956350 (-0.0%) Cycles helped: 2 The two Vulkan shaders are helped because of the "new" (('b2i32', ('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern. Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Marek Olšák	716ac4a55d	nir: replace IS_SWIZZLED flag with ACCESS_IS_SWIZZLED_AMD Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19422>	2022-12-13 20:33:05 +00:00
Marek Olšák	7998c3bdd3	nir: remove redundant SLC_AMD in favor of ACCESS_STREAM_CACHE_POLICY ACCESS_STREAM_CACHE_POLICY was added to map to SLC for AMD. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19422>	2022-12-13 20:33:05 +00:00
Samuel Pitoiset	011a0b97b2	radv,aco: move radv_ps_epilog_key to the graphics pipeline key To avoid redundant structs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20199>	2022-12-08 13:28:00 +00:00
Samuel Pitoiset	9079bd821c	radv,aco: rename color output related fields for consistency Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20199>	2022-12-08 13:28:00 +00:00
Rhys Perry	bd30adf89d	aco: apply NUW to additions for scratch access fossil-db (navi21): Totals from 52 (0.04% of 135636) affected shaders: Instrs: 79036 -> 78567 (-0.59%) CodeSize: 431188 -> 427984 (-0.74%) Latency: 1318142 -> 1313821 (-0.33%) InvThroughput: 293842 -> 292836 (-0.34%) VClause: 2555 -> 2361 (-7.59%); split: -8.06%, +0.47% Copies: 8746 -> 8767 (+0.24%); split: -0.11%, +0.35% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20117>	2022-12-06 15:23:38 +00:00
Samuel Pitoiset	da32cbb5c6	aco: fix missing uses of MRT output flags Fixes regressions on GFX6 and the RAGE2 workaround. Fixes: `a297ac10a4` ("radv,aco: stop lowering FS outputs in NIR") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20154>	2022-12-05 15:01:19 +00:00
Samuel Pitoiset	a297ac10a4	radv,aco: stop lowering FS outputs in NIR This was a bad idea because: - it diverges too much with the fragment shader epilog - it doesn't allow to implement alpha-to-coverage via MRTZ correctly - it was supposed to be used by LLVM but this never happened Reverting this back allows us to fix alpha-to-coverage via MRTZ on GFX11 easily, including for fragment shader epilogs. fossils-db (NAVI21): Totals from 20411 (15.13% of 134913) affected shaders: VGPRs: 972056 -> 971400 (-0.07%); split: -0.08%, +0.01% CodeSize: 92284804 -> 92295392 (+0.01%); split: -0.05%, +0.06% MaxWaves: 465010 -> 465166 (+0.03%); split: +0.03%, -0.00% Instrs: 17034162 -> 17034963 (+0.00%); split: -0.00%, +0.01% Latency: 252013190 -> 251971764 (-0.02%); split: -0.03%, +0.02% InvThroughput: 45859625 -> 45842556 (-0.04%); split: -0.04%, +0.01% VClause: 324627 -> 324629 (+0.00%); split: -0.03%, +0.03% SClause: 672918 -> 672826 (-0.01%); split: -0.05%, +0.04% Copies: 1172126 -> 1158152 (-1.19%); split: -1.20%, +0.01% Branches: 420602 -> 420604 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 1025441 -> 1025481 (+0.00%) PreVGPRs: 861787 -> 860650 (-0.13%); split: -0.17%, +0.03% Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>	2022-12-05 08:22:28 +00:00
Samuel Pitoiset	3be728f1d0	aco: fix indexing MRT0 alpha channel for alpha-to-coverage via MRTZ on GFX11 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>	2022-12-05 08:22:28 +00:00
Samuel Pitoiset	20856bfe0f	aco: always use 32-bit for exporting alpha-to-coverage via MRTZ on GFX11 16-bit isn't possible. Note that this is currently still broken for compressed formats because the w channel is never written to. Ported from RadeonSI ('radeonsi/gfx11: fix alpha-to-coverage with stencil or samplemask export') Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>	2022-12-05 08:22:28 +00:00
Georg Lehmann	a3beb82cf6	aco: Use wave size specific opcode for s_or in cube map coord code. Cc: mesa-stable Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20041>	2022-12-01 01:39:27 +00:00
Georg Lehmann	22be0d09a0	aco: Don't prematurely emit s_andn2. Split s_not + s_and allows more inverse comparison and s_cbranch_vccz optimizations. Foz-DB Navi21: Totals from 516 (0.38% of 134913) affected shaders: CodeSize: 7273724 -> 7273720 (-0.00%) Instrs: 1364408 -> 1364407 (-0.00%) Latency: 14604862 -> 14604858 (-0.00%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19143>	2022-11-30 18:25:15 +00:00
Rhys Perry	0cb48ec3b7	radv,aco: remove old streamout code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>	2022-11-29 14:28:11 +00:00
Rhys Perry	3a96977542	radv,aco: remove old GS copy shader code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>	2022-11-29 14:28:11 +00:00
Rhys Perry	17bd2721e6	radv,aco: implement GS copy shaders using NIR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>	2022-11-29 14:28:11 +00:00
Rhys Perry	12becb8839	radv: lower streamout in NIR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>	2022-11-29 14:28:11 +00:00
Rhys Perry	19d0403594	radv,aco: export legacy vertex outputs in NIR This new behaviour will let us insert exports in GS copy shader control flow. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>	2022-11-29 14:28:11 +00:00
Rhys Perry	3061bc792d	aco: ensure MRT0 is written with dual source blending Fixes crucible test func.shader.dualsrc_mrt0_undef on polaris10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 22.3 mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19806>	2022-11-21 15:01:56 +00:00
Yonggang Luo	887e0fdace	aco: fixes error: 'uint' was not declared in aco_instruction_selection.cpp uint is from pipe/p_compiler.h error message: ../../src/amd/compiler/aco_instruction_selection.cpp:11061:4: error: 'uint' was not declared in this scope; did you mean 'rint'? 11061 \| uint en_mask = 1; Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19676>	2022-11-19 01:37:46 +00:00
Samuel Pitoiset	50fe37070f	aco: fix FS inputs loads in WQM with 16-bit p_wqm needs to use the same size. Fixes: `16d2c7ad55` ("aco/gfx11: perform FS input loads in WQM") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19788>	2022-11-17 07:00:02 +00:00
Samuel Pitoiset	fb781bfb0a	aco: fix dual source blending on GFX11 Assembly looks similar to LLVM. Cc: 22.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19643>	2022-11-16 18:35:10 +00:00
Samuel Pitoiset	5a3cc2d453	aco: fix missing SCC for p_interp_gfx11 in emit_interp_mov_instr() Fixes: `369c9b6425` ("aco: fix p_interp_gfx11 to not overwrite SCC") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19777>	2022-11-16 15:59:56 +00:00
Samuel Pitoiset	fc193133d4	aco: adjust an assertion about nir_texop_txf_ms and GFX11 This can fail with RADV_DEBUG=nofmask. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19613>	2022-11-16 07:59:15 +01:00
Samuel Pitoiset	369c9b6425	aco: fix p_interp_gfx11 to not overwrite SCC s_wqm_b64 clobbers SCC. Found this while working on dual source blending. Fixes: `6113ee650a` ("aco/gfx11: fix FS input loads in quad-divergent control flow") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19747>	2022-11-15 15:57:31 +00:00
Rhys Perry	50073d6135	aco/gfx11: increase gfx1100/gfx1101 physical vgprs https://reviews.llvm.org/D134522 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18825>	2022-11-02 17:09:32 +00:00
Rhys Perry	a71d068fd0	radv/llvm: fix GS shaders on GFX8/9 `6698753cdb` switched our GS output stores to use MUBUF. The stride doesn't matter for the ESGS descriptor (because idxen=false and the index stride is 64), but this fixes it anyway. This also changes ACO to use MUBUF store too, since MTBUF doesn't seem to work correctly with an invalid data format in the descriptor. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `6698753cdb` ("ac/llvm: don't use tbuffer_store as a fallback for swizzled stores") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18885>	2022-11-02 12:48:01 +00:00
Rhys Perry	6113ee650a	aco/gfx11: fix FS input loads in quad-divergent control flow This is not ideal and it would be great to somehow make it better some day. fossil-db (gfx1100): Totals from 5208 (3.86% of 135032) affected shaders: MaxWaves: 127058 -> 126962 (-0.08%); split: +0.01%, -0.09% Instrs: 3983440 -> 4072736 (+2.24%); split: -0.00%, +2.24% CodeSize: 21872468 -> 22230852 (+1.64%); split: -0.00%, +1.64% VGPRs: 206688 -> 206984 (+0.14%); split: -0.05%, +0.20% Latency: 37447383 -> 37491197 (+0.12%); split: -0.05%, +0.17% InvThroughput: 6421955 -> 6422348 (+0.01%); split: -0.03%, +0.03% VClause: 71579 -> 71545 (-0.05%); split: -0.09%, +0.04% SClause: 148289 -> 147146 (-0.77%); split: -0.84%, +0.07% Copies: 259011 -> 258084 (-0.36%); split: -0.61%, +0.25% Branches: 101366 -> 101314 (-0.05%); split: -0.10%, +0.05% PreSGPRs: 223482 -> 223460 (-0.01%); split: -0.21%, +0.20% PreVGPRs: 184448 -> 184744 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19370>	2022-11-01 12:42:43 +00:00
Rhys Perry	16d2c7ad55	aco/gfx11: perform FS input loads in WQM fossil-db (gfx1100): Totals from 48184 (35.68% of 135032) affected shaders: MaxWaves: 1131876 -> 1131960 (+0.01%); split: +0.05%, -0.04% Instrs: 36755466 -> 36782290 (+0.07%); split: -0.04%, +0.11% CodeSize: 200812068 -> 200915348 (+0.05%); split: -0.04%, +0.09% VGPRs: 2163980 -> 2163828 (-0.01%); split: -0.15%, +0.14% Latency: 484174459 -> 484341018 (+0.03%); split: -0.06%, +0.09% InvThroughput: 87941284 -> 87944874 (+0.00%); split: -0.04%, +0.04% VClause: 652984 -> 653085 (+0.02%); split: -0.09%, +0.10% SClause: 1510995 -> 1528832 (+1.18%); split: -0.40%, +1.58% Copies: 1997689 -> 2001857 (+0.21%); split: -0.49%, +0.69% Branches: 676629 -> 676584 (-0.01%); split: -0.02%, +0.01% PreSGPRs: 2033070 -> 2036725 (+0.18%) PreVGPRs: 1903922 -> 1903897 (-0.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `3730be9873` ("aco: mostly implement FS input loads on GFX11") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19370>	2022-11-01 12:42:43 +00:00
Rhys Perry	b7ea47ede6	radv,aco: don't use lower_to_fragment_fetch_amd on GFX11+ FMask doesn't exist on GFX11. Have txf_ms take the fragment_fetch_amd path. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19375>	2022-10-31 16:26:30 +00:00
Rhys Perry	14a1925727	aco: don't split swizzled store_buffer_amd on GFX9+ This isn't necessary. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19228>	2022-10-31 14:33:43 +00:00
Rhys Perry	e6d26cb288	nir,ac/nir,aco,radv: replace has_input_*_amd with more general intrinsics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19228>	2022-10-31 14:33:43 +00:00
Samuel Pitoiset	db7ffa4006	aco: implement NIR intrinsics for NGG streamout Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19317>	2022-10-31 13:48:39 +00:00
Qiang Yu	13fb7f8f2c	ac/nir/ngg,ac/llvm,aco: save nogs ngg culling one lds dword TES rel patch id is <256, so we can use an existing unused LDS byte instead of extra dword. To ease the programming, change the index of repacked_arg_vars for these variables. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832>	2022-10-27 07:35:01 +00:00
Samuel Pitoiset	db573f7362	aco: add support for device clock on GFX11 According to LLVM, s_sendmsg_rtn(GET_REALTIME) should be used instead of s_memrealtime. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267>	2022-10-25 20:23:08 +02:00
Georg Lehmann	361b47b1f0	aco: Implement signed idot instructions on GFX11. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19114>	2022-10-24 19:07:16 +00:00
Samuel Pitoiset	152b90efcd	aco,radv/llvm: do not export parameters on GFX11 They will be exported through the attribute ring instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19216>	2022-10-24 07:55:06 +02:00
Georg Lehmann	d57f5c9cac	radv,aco: Lower uclz in NIR. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>	2022-10-22 11:57:23 +02:00
Georg Lehmann	058174c4de	aco: Implement [ui]find_msb_rev. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>	2022-10-22 11:57:00 +02:00
Rhys Perry	dfce433385	aco/gfx11: optimize LS/HS load_local_invocation_index fossil-db (gfx1100): Totals from 1361 (1.01% of 135032) affected shaders: Instrs: 501227 -> 500469 (-0.15%); split: -0.16%, +0.01% CodeSize: 2730012 -> 2724820 (-0.19%); split: -0.20%, +0.00% VGPRs: 63716 -> 63688 (-0.04%) Latency: 2228848 -> 2228858 (+0.00%); split: -0.00%, +0.00% InvThroughput: 878418 -> 878275 (-0.02%); split: -0.02%, +0.00% VClause: 14866 -> 14868 (+0.01%); split: -0.03%, +0.04% SClause: 16674 -> 16645 (-0.17%); split: -0.22%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19196>	2022-10-21 20:01:30 +00:00
Samuel Pitoiset	ef5fc6a764	aco: fix tcs_wave_id unpacking on GFX11 Only the first 3 bits are useful. Ported from ac/llvm. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19194>	2022-10-21 07:15:44 +00:00
Timur Kristóf	e52c2f4fca	nir, ac, aco: Add index src to load_buffer_amd/store_buffer_amd. Also modify all existing uses to pass a zero to this new src. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>	2022-10-20 20:00:50 +00:00
Timur Kristóf	b67aa87810	aco: Cleanup load_vmem_mubuf and store_vmem_mubuf functions. Remove unused arguments, clean up allow_combining vs. swizzled etc. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>	2022-10-20 20:00:50 +00:00
Timur Kristóf	c918f0934e	nir, ac, aco: Add ACCESS intrinsic index to load/store_buffer_amd. Previously, we always treated these as coherent, but now let's make this configurable. Also set all current users to ACCESS_COHERENT. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>	2022-10-20 20:00:49 +00:00
Bas Nieuwenhuizen	1252d63cc2	aco: Pre-split result of bvh64_intersect_ray_amd. Avoids later moves with extractions from the vector. Reduces VALU operation in the raytrace loop by ~6%, increasing the RT performance in Q2RTX on a 6800 XT by about ~1.3%. Suggested by Georg. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19148>	2022-10-19 21:44:48 +00:00
Timur Kristóf	d8639b7a80	aco: Allow explicitly removing jumps on GFX10+ when beneficial. "Removing jumps" in ACO means skipping the jump instruction at the beginning of a divergent branch (but still modify exec). ACO already supports implicitly removing jumps when it decides that executing a branch with empty exec mask is more beneficial than a jump. This commit adds the possibility to use this explicitly through nir_selection_control. ACO will respect this setting and remove the branch instructions when this is specified, unless it decides that this would cause bugs (eg. exp instruction). There are two cases that benefit from the new change: 1. When the application requests to "flatten" a branch (ie. remove control flow), we now respect that. 2. When the compiler stack determines that a divergent branch is always taken. v2 by Georg Lehmann: fixed applying sel_ctrl to else blocks Fossil DB stats on Navi 21: Totals from 13 (0.01% of 134906) affected shaders: CodeSize: 136616 -> 136496 (-0.09%) Instrs: 26196 -> 26166 (-0.11%) Latency: 417928 -> 417889 (-0.01%) Branches: 1241 -> 1211 (-2.42%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-By: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>	2022-10-11 15:42:54 +00:00
Georg Lehmann	cc06b7e00d	aco: Use s_pack_ll for s_bfe operand on GFX9+. Foz-DB Navi21: Totals from 1 (0.00% of 134913) affected shaders: CodeSize: 340 -> 336 (-1.18%) Instrs: 77 -> 76 (-1.30%) Latency: 1065 -> 1063 (-0.19%) InvThroughput: 4260 -> 4252 (-0.19%) Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18936>	2022-10-04 11:39:13 +00:00
Rhys Perry	39a6067635	aco/gfx11: swap ds_cmpst_* data operands According to an LLVM comment, these are swapped in GFX11. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710>	2022-09-30 20:57:02 +00:00
Rhys Perry	3730be9873	aco: mostly implement FS input loads on GFX11 Quad-divergent CF and vertex selection doesn't work, but should at least prevent crashes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>	2022-09-26 14:49:57 +00:00
Rhys Perry	a7a9aad14d	aco: limit GFX11 to 128 VGPRs for now See https://reviews.llvm.org/D128054 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>	2022-09-26 14:49:56 +00:00

867 commits