fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-20 20:20:18 +01:00

Author	SHA1	Message	Date
Daniel Schürmann	3f03db848d	aco: simplify statistics collection for copies Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Daniel Schürmann	0560831593	aco: fix register assignment for p_create_vector on GFX6/7 In case, some operand was already placed in the definition space, it could happen that it wasn't considered for live-range splits. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>	2020-06-09 21:25:38 +00:00
Rhys Perry	43e69475ad	aco: use v_xor3_b32 fossil-db (Navi): Totals from 334 (0.26% of 128321) affected shaders: CodeSize: 3345532 -> 3345484 (-0.00%); split: -0.00%, +0.00% Instrs: 624662 -> 622778 (-0.30%); split: -0.30%, +0.00% Mostly affects some parallel-rdp shaders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5357>	2020-06-08 13:20:01 +00:00
Samuel Pitoiset	6391f9ab4c	aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>	2020-06-05 16:04:06 +02:00
Samuel Pitoiset	e1523b34c2	aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7 SDWA is GFX8+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>	2020-06-05 16:04:05 +02:00
Samuel Pitoiset	ee4bc13de2	aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>	2020-06-05 16:04:03 +02:00
Samuel Pitoiset	77f08982af	aco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7 16-bit subgroup ops are implemented with 32-bit instructions on GFX6-GFX7. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>	2020-06-03 19:48:43 +02:00
Samuel Pitoiset	f31c9b4edf	aco: fix subdword copies on GFX6-GFX7 SDWA is only GFX8+. Use v_mov_b32 since the upper 16 bits don't matter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>	2020-06-03 19:48:42 +02:00
Samuel Pitoiset	a521c67d22	aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>	2020-06-03 19:48:40 +02:00
Samuel Pitoiset	6b08d269bf	aco: implement 16-bit reduce operations on GFX6-GFX7 No fp16 on GFX6-GFX7. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>	2020-06-03 19:48:37 +02:00
Oschowa	663e8cb4e6	aco: Use correct reference type in for-range-loop. Fixes a clang warning. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5228>	2020-06-02 21:31:17 +00:00
Oschowa	7b1bc460fd	aco: Don't std::move temporary object. Fixes the following clang warning: mesa/src/amd/compiler/aco_optimizer.cpp:2928:15: warning: moving a temporary object prevents copy elision [-Wpessimizing-move] ctx.uses = std::move(dead_code_analysis(program)); Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5228>	2020-06-02 21:31:17 +00:00
Oschowa	536339b0dd	aco: Don't declare 'Block' as class, but define as struct. Fixes clang warnings. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5228>	2020-06-02 21:31:17 +00:00
Timur Kristóf	045c9ffa7d	aco: Implement subgroup shuffle on GFX6-7. GFX6 and GFX7 don't have the ds_bpermute (or permute) instruction, but we would like to support subgroup shuffle on these old GPUs. So we introduce a new pseudio instruction which will be lowered to an "unrolled loop" that emulates bpermute on GFX6 and GFX7 using readlane instructions, while also respecting the exec mask thanks to v_cmpx. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>	2020-06-02 21:12:12 +00:00
Timur Kristóf	14a5021aff	aco/gfx10: Refactor of GFX10 wave64 bpermute. The emulated GFX10 wave64 bpermute no longer needs a linear_vgpr, so we don't consider it a reduction anymore. Additionally, the code is slightly reorganized in preparation for the GFX6 emulated bpermute. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>	2020-06-02 21:12:12 +00:00
Dylan Baker	a8e2d79e02	meson: use gnu_symbol_visibility argument This uses a meson builtin to handle -fvisibility=hidden. This is nice because we don't need to track which languages are used, if C++ is suddenly added meson just does the right thing. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>	2020-06-01 18:59:18 +00:00
Samuel Pitoiset	e22567089c	aco: sign-extend input/indentity for 32-bit reduce ops on GFX10 Because some 16-bit instructions are already VOP3 on GFX10, we use the 32-bit variants to remove the temporary VGPR and to use DDP with the arithmetic instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>	2020-05-29 11:20:58 +00:00
Samuel Pitoiset	83dcd1690b	aco: allow gfx10_wave64_bpermute with 8-bit/16-bit input Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>	2020-05-29 11:20:58 +00:00
Samuel Pitoiset	8ece71507d	aco: allocate a temp VGPR for some 8-bit/16-bit reduction ops on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>	2020-05-29 11:20:58 +00:00
Samuel Pitoiset	2e0ea9bcca	aco: implement 8-bit/16-bit reductions on GFX10 Some 16-bit instructions are VOP3 on GFX10 and we have to emit a 32-bit DPP mov followed by the ALU instruction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>	2020-05-29 11:20:58 +00:00
Samuel Pitoiset	75a730ced5	aco: fix register allocation for subdword instructions on GFX10 Cc: 20.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>	2020-05-29 11:20:58 +00:00
Rhys Perry	01ce7887bf	aco: fix 64-bit shared_atomic_exchange Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>	2020-05-28 10:34:03 +00:00
Rhys Perry	1f2fd9c62e	aco: don't reorder barriers in the scheduler Unless we're reordering it around a barrier of the same type No shader-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>	2020-05-28 10:34:03 +00:00
Rhys Perry	e1900ee2c7	aco: preserve more fields when combining additions into SMEM Totals from 11 (0.01% of 127638) affected shaders: Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>	2020-05-28 10:34:03 +00:00
Rhys Perry	95d5c1b8a1	aco: check instruction format before waiting for a previous SMEM store Totals from 7 (0.01% of 127638) affected shaders: CodeSize: 40336 -> 40320 (-0.04%) Instrs: 7807 -> 7803 (-0.05%) Cycles: 118588 -> 118344 (-0.21%); split: -0.23%, +0.02% SMEM: 331 -> 339 (+2.42%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `1749953ea3` ('aco/gfx10: Wait for pending SMEM stores before loads') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>	2020-05-28 10:34:03 +00:00
Rhys Perry	5ccc7c277c	aco: consider SDWA during value numbering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `23ac24f5b1` ('aco: add missing conversion operations for small bitsizes') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5164>	2020-05-28 09:55:58 +00:00
Rhys Perry	8aa98cebc1	aco: fix interaction with 3f branch workaround and p_constaddr The offset was incorrect if we inserted a nop before the p_constaddr. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5164>	2020-05-28 09:55:58 +00:00
Samuel Pitoiset	94570e87bd	aco: add support for bias/lod with texture gather Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5147>	2020-05-25 08:51:10 +02:00
Samuel Pitoiset	cecd4aad46	aco: implement nir_intrinsic_shader_clock with device scope Use s_memrealtime instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>	2020-05-24 20:37:52 +02:00
Samuel Pitoiset	2d4493ee11	aco: sign-extend the input and identity for 8-bit subgroup operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	c76595aec2	aco: use a temporary SGPR for 8-bit/16-bit literal reduction identities Otherwise, the compiler overwrites s0 which contains the exec mask. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	b3c87c52ea	aco: implement 8-bit/16-bit nir_intrinsic_quad_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	dfa62d97a0	aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	f03e56eaf0	aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	af7e2c6133	aco: validate 8-bit/16-bit VGPR operands for readfirstlane/readlane/writelane I would expect it to just work as intended and other solutions, like v_and_b32 to make sure the upper bits are 0, might have some overhead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	86e2b03e3f	aco: implement 8-bit/16-bit reductions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Samuel Pitoiset	cc79945b21	aco: declare 8-bit/16-bit reduce operations The 8-bit float variants are only for consistency but are unused. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>	2020-05-21 15:06:48 +00:00
Rhys Perry	40ed7fcc0b	aco: fix typo in insert_waitcnt's kill() No shader-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3004 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5126>	2020-05-21 13:01:41 +00:00
Daniel Schürmann	51f4b22fee	aco: don't allow unaligned subdword accesses on GFX6/7 There are no SDWA instructions which means that only full registers can be accessed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>	2020-05-21 12:07:40 +00:00
Daniel Schürmann	ae390755fe	aco: fix corner case in register allocation We mark dead operands in the register file when searching for a register for a definition. Only do so, if this space has not yet been taken by a different definition. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>	2020-05-21 12:07:40 +00:00
Daniel Schürmann	acec00eae0	aco: don't move create_vector subdword operands to unsupported register offsets Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>	2020-05-21 12:07:40 +00:00
Daniel Schürmann	5201985332	aco: restrict copying of create_vector operands to GFX9+ This improves code size for Polaris and earlier due to less register swapping Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>	2020-05-21 12:07:40 +00:00
Samuel Pitoiset	1ad9a8a884	aco: fix missing break in label_instruction() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5129>	2020-05-21 09:00:02 +02:00
Samuel Pitoiset	cc1a1da8ab	aco: fix off-by-one error with 16-bit MTBUF opcodes on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	1647e098e9	aco: implement 16-bit interp For 16-bit bank LDS (ie. Kabini/Stoney) we need a slightly different path. It's completely untested though because I don't have these chips but according to vkpipeline-db the generated assembly seems fine. Note that 16-bit I/O is currently only exposed on GFX9+ for both compiler backends. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	bbbb4057e6	aco: emit v_interp_*_f16 instructions as VOP3 instead of VINTRP This adds a separate emission path in the assembly for the 16-bit interp instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	34f2c4dc6a	aco: validate v_interp_*_f16 as VOP3 instructions instead of VINTRP 16-bit interp instructions are considered VINTRP by the compiler but they are emitted as VOP3 by the assembler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	3fba5bb9cc	aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	7ffd394605	aco: implement 8-bit/16-bit mov's with p_create_vector ACO doesn't lower 8-bit/16-bit mov's in NIR. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2997 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00
Samuel Pitoiset	860b4d16f4	aco: allow to load/store 16-bit values in VMEM for tess and geom We only have to adjust some assertions to allow storing/loading 16-bit values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>	2020-05-19 17:05:05 +00:00

... 2 3 4 5 6 ...

845 commits