Commit graph

128 commits

Author SHA1 Message Date
Rhys Perry
bd8f8dda8c aco: fix p_constaddr with a non-zero offset
Seems this broke a while ago and we never noticed.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 0af7ff49fd ("aco: lower p_constaddr into separate instructions earlier")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16460>
2022-05-23 11:52:54 +00:00
Marek Olšák
2443054932 amd: rename fishes to Navi21, Navi22, Navi23, Navi24, and Rembrandt
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16604>
2022-05-19 11:55:50 +00:00
Marek Olšák
39800f0fa3 amd: change chip_class naming to "enum amd_gfx_level gfx_level"
This aligns the naming with PAL.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pellou-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469>
2022-05-13 14:56:22 -04:00
Samuel Pitoiset
0cb1b12ec0 aco: recognize GFX11 in few places
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369>
2022-05-12 15:46:20 +00:00
Dave Airlie
04c07a2413 aco/radv: convert to aco shader info at the radv level.
This removes the radv shader info type from aco completely.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342>
2022-05-11 19:07:11 +00:00
Jason Ekstrand
1b8a43a0ba util: Remove util_cpu_detect
util_cpu_detect is an anti-pattern: it relies on callers high up in the call
chain initializing a local implementation detail. As a real example, I added:

...a Mali compiler unit test
...that called bi_imm_f16() to construct an FP16 immediate
...that calls _mesa_float_to_half internally
...that calls util_get_cpu_caps internally, but only on x86_64!
...that relies on util_cpu_detect having been called before.

As a consequence, this unit test:

...crashes on x86_64 with USE_X86_64_ASM set
...passes on every other architecture
...works on my local arm64 workstation and on my test board
...failed CI which runs on x86_64
...needed to have a random util_cpu_detect() call sprinkled in.

This is a bad design decision. It pollutes the tree with magic, it causes
mysterious CI failures especially for non-x86_64 developers, and it is not
justified by a micro-optimization.

Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding
the footgun where it fails to be called.  This cleans up Mesa's design,
simplifies the tree, and avoids a class of a (possibly platform-specific)
failures. To mitigate the added overhead, wrap it all in a (fast) atomic
load check and declare the whole thing as ATTRIBUTE_CONST so the
compiler will CSE calls to util_cpu_detect.

Co-authored-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15580>
2022-04-20 18:44:35 +00:00
Rhys Perry
63e40adf8c aco: fix disassembly of SMEM with both SGPR and constant offset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15890>
2022-04-14 20:58:36 +00:00
Daniel Schürmann
2fe005a3fe aco: remove occurences of VCC hint
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>
2022-04-13 21:52:43 +00:00
Rhys Perry
5b4e41e4db aco: don't use v_mad_mix on GFX9 if 16-bit denormals must be preserved
This probably effectively disables the v_mad_mix optimization on GFX9.

fossil-db (Vega):
Totals from 11545 (7.15% of 161366) affected shaders:
MaxWaves: 43025 -> 42780 (-0.57%); split: +0.06%, -0.63%
Instrs: 18571635 -> 18734201 (+0.88%); split: -0.00%, +0.88%
CodeSize: 96483568 -> 96611012 (+0.13%); split: -0.11%, +0.24%
SGPRs: 1079056 -> 1077616 (-0.13%); split: -0.14%, +0.01%
VGPRs: 819248 -> 821868 (+0.32%); split: -0.04%, +0.36%
SpillSGPRs: 13313 -> 12464 (-6.38%)
Latency: 293804093 -> 295046122 (+0.42%); split: -0.09%, +0.51%
InvThroughput: 110002239 -> 110994978 (+0.90%); split: -0.03%, +0.93%
VClause: 342458 -> 342596 (+0.04%); split: -0.12%, +0.16%
SClause: 648566 -> 648046 (-0.08%); split: -0.12%, +0.04%
Copies: 1728225 -> 1726679 (-0.09%); split: -0.66%, +0.57%
Branches: 552973 -> 552963 (-0.00%); split: -0.02%, +0.02%
PreSGPRs: 862360 -> 856820 (-0.64%); split: -0.69%, +0.05%
PreVGPRs: 773689 -> 776818 (+0.40%); split: -0.02%, +0.42%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15718>
2022-04-04 19:27:12 +00:00
Daniel Schürmann
b98a9dcc36 aco/optimizer: fix call to can_use_opsel() in apply_insert()
The definition index is -1.

Fixes: 54292e99c7 ('aco: optimize 32-bit extracts and inserts using SDWA ')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551>
2022-03-25 22:02:50 +00:00
Rhys Perry
177b54ebe9 aco/tests: add v_fma_mix tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769>
2022-03-17 19:04:17 +00:00
Daniel Schürmann
70aea6b41a aco/ra: refactor collect_vars() to return a sorted vector
The vector of IDs is sorted with decreasing sizes,
and by increasing assigned registers.
This decouples register assingment from ssa IDs.

Totals from 12694 (9.41% of 134913) affected shaders: (GFX10.3)
VGPRs: 757864 -> 757848 (-0.00%); split: -0.00%, +0.00%
CodeSize: 72350540 -> 72348688 (-0.00%); split: -0.02%, +0.02%
MaxWaves: 237018 -> 237020 (+0.00%); split: +0.00%, -0.00%
Instrs: 13545494 -> 13544699 (-0.01%); split: -0.03%, +0.02%
Latency: 148539203 -> 148533292 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 30319086 -> 30320382 (+0.00%); split: -0.01%, +0.01%
VClause: 326875 -> 327028 (+0.05%); split: -0.05%, +0.09%
SClause: 479833 -> 479837 (+0.00%); split: -0.00%, +0.00%
Copies: 862152 -> 860914 (-0.14%); split: -0.43%, +0.28%
Branches: 317775 -> 317777 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11526>
2022-03-14 08:32:10 +00:00
Rhys Perry
c3070773f8 aco/tests: add test for branch definition RA
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
5e3b8eeac4 aco: add test for optimizations with casts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
27f1f5537d aco/tests: implement sub-dword program inputs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
e86b88f85b aco/tests: add a bunch more building helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Daniel Schürmann
8a78706643 nir: refactor nir_opt_move
This patch is a rewrite of nir_opt_move.
Differently from the previous version, each instruction is checked
if it can be moved downwards and then inserted before the first user
of the definition. The advantage is that less insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without user in the same block
are moved towards the end.

v2: Only return true if an instruction really changed the position.
    Don't care for discards, this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657>
2022-01-12 13:41:54 +00:00
Tatsuyuki Ishi
da0412e55b aco: support DPP8
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971>
2021-12-31 20:56:39 +00:00
Rhys Perry
6afba80534 aco: don't create DPP instructions with SGPR operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2e6834d4f6 ("aco: combine DPP into VALU before RA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13976>
2021-11-30 20:11:48 +00:00
Tony Wasserka
b70e551a51 aco/tests: Assert that the requested IR is actually provided
In particular, assembly will not be provided if no disassembler is available
for the given GPU architecture.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11319>
2021-10-01 10:40:18 +02:00
Rhys Perry
8cf37fc8a8 aco/tests: add idep_amdgfxregs_h
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 9bf30c4a5c ("aco/tests: add tests for form_hard_clauses()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12017>
2021-09-24 11:53:23 +01:00
Timur Kristóf
f2e41eda9e aco: Add ability to optimize v_lshl + v_sub into v_mad_i32_i24.
Also change combine_add_lshl to use check_vop3_operands instead
of its own checks of the operands.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>
2021-09-20 12:39:03 +02:00
Rhys Perry
6ed18749de aco: allow live-range splits of linear vgprs in top-level blocks
Fixes dEQP-VK.ssbo.phys.layout.random.8bit.all_per_block_buffers.46 on
GFX8.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12172>
2021-09-17 14:36:03 +00:00
Rhys Perry
8d50385bbd aco: implement linear vgpr copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12172>
2021-09-17 14:36:03 +00:00
Rhys Perry
b1e4794f0f aco/tests: add regalloc.scratch_sgpr.create_vector
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12172>
2021-09-17 14:36:03 +00:00
Rhys Perry
f41200d289 aco/tests: fix finish_ra_test()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12172>
2021-09-17 14:36:03 +00:00
Rhys Perry
f241bd3749 aco: don't coalesce constant copies into non-power-of-two sizes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12702>
2021-09-03 14:01:27 +01:00
Daniel Schürmann
8bd7e2392b aco: preserve subdword RC when lowering p_insert/p_extract
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
73481338fe aco/print_ir: always print SDWA dst & src selections
This way, it becomes more apparent how SDWA behaves.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
0988f7b9ba aco: remove explicit dst_preserve flag
Instead, we can rely on the fact that subdword definitions
must preserve the unused bits while dword definitions either
pad or sign-extend.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
9e3ff06c38 aco: rewrite SDWA selector
This commit introduces a new struct SubdwordSel
in order to ease and clean up the usage of SDWA
selections. This includes removing the distinction
between register-allocated and fixed SDWA selections.
Instead, SDWA selections can now also access the high
bits of subdword variables. Alignment and sizes are
validated accordingly. Size, offset and sign_extend
can be evaluated via helper methods.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
cc4682ed47 aco: fix p_insert lowering with 16bit sources
The previous lowering only wrote a single byte.

Fixes: 2f94353735 ('aco: add p_extract/p_insert')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Rhys Perry
33ddbd220f aco: remove DPP when applying constants/literals/sgprs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Rhys Perry
7d95f7510f aco/tests: test copy propagation with DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Daniel Schürmann
23d5865f42 aco: refactor nir_op_imul selection
Previously, the optimization to use v_mul_lo_u16 for
32bit multiplications was done in instruction_selection.
This was moved to the optimizer to ease some case distinctions.

The mixed results are due to increased use of SDWA.

Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3)
VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01%
CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01%
Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01%
Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07%
InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06%
VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03%
SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01%
Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Rhys Perry
4a7714ab7b aco/tests: add tests for post-RA DPP combining
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
12be7c8feb aco/tests: add tests for pre-RA DPP combining
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Tony Wasserka
66e51dc474 aco: Remove use of deprecated Operand constructors
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)

Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka
4e33688f23 aco: Remove use of deprecated Operand constructors in test_to_hw_instr.cpp
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Daniel Schürmann
7a31567db3 aco/meson: remove inc_gallium from include_directories
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>
2021-07-12 21:27:31 +00:00
Daniel Schürmann
5e3297a97d aco/meson: remove unnecessary dependencies
Also moves idep_vulkan_util_headers to /tests/meson.build

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Rhys Perry
ebeda07801 aco/tests: fix 32-bit build
"call of overloaded ‘Operand(long unsigned int)’ is ambiguous"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11627>
2021-06-29 09:55:32 +00:00
Rhys Perry
c768d7d8f2 aco/tests: add SDWA tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
24418304b0 aco/tests: add tests for p_extract/p_insert lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
cf22eabc68 aco: make validate_ir() output usable in tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
bb52484df5 aco/tests: improve reporting of failed code checks
Instead of just reporting the failed statements, print where they
originated. This is useful for tests which have a number of similar
checks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Rhys Perry
9bf30c4a5c aco/tests: add tests for form_hard_clauses()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Rhys Perry
81162265b1 aco: do not clause NSA instructions
According to LLVM, this has "unpredictable results on GFX10.1".

https://reviews.llvm.org/D102211

fossil-db (Navi10):
Totals from 26690 (17.81% of 149839) affected shaders:
CodeSize: 167935160 -> 167706280 (-0.14%); split: -0.14%, +0.00%
Instrs: 31801427 -> 31744142 (-0.18%); split: -0.18%, +0.00%
Latency: 732672435 -> 732622463 (-0.01%)
InvThroughput: 163361435 -> 163357838 (-0.00%); split: -0.00%, +0.00%
VClause: 546131 -> 546903 (+0.14%); split: -0.00%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c92 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Timur Kristóf
aabe9d2f6e aco: Eliminate SALU comparison when SCC can be used instead.
For example:

s0, scc = s_and_u32 ...
scc = s_cmp_eq_u32 s0, 0
p_cbranch_sccz

is turned into:

s0, scc = s_and_u32 ...
p_cbranch_sccnz

Fossil DB results on Sienna Cichlid:

Totals from 85267 (56.91% of 149839) affected shaders:
CodeSize: 202539256 -> 202237268 (-0.15%)
Instrs: 38964493 -> 38888996 (-0.19%)
Latency: 750062328 -> 749913450 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 167408952 -> 167405157 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
a93092d0ed aco: Use s_cbranch_vccz/nz in post-RA optimization.
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.

It works on the following pattern:

vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc

The result looks like this:

vcc = v_cmp ...
p_cbranch vcc

Fossil DB results on Sienna Cichlid:

Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00