Having NIR-based instruction selection tests instead of GLSL lets us be
more precise with the input NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28301>
vk_format.h is resident under src/vulkan/util/vk_format.h, so do the rename to avoid
conflict
Only include radv_formats.h when needed.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28266>
We allocate them at the end of the register file and keep them separate
from normal VGPRs. This is for two reasons:
- Because we only ever move linear VGPRs into an empty space or a space
previously occupied by a linear one, we never have to swap a normal VGPR
and a linear one. This simplifies copy lowering.
- As linear VGPR's live ranges only start and end on top-level blocks, we
never have to move a linear VGPR in control flow.
fossil-db (navi31):
Totals from 5493 (6.93% of 79242) affected shaders:
MaxWaves: 150365 -> 150343 (-0.01%)
Instrs: 7974740 -> 7976073 (+0.02%); split: -0.06%, +0.08%
CodeSize: 41296024 -> 41299024 (+0.01%); split: -0.06%, +0.06%
VGPRs: 283192 -> 329560 (+16.37%)
Latency: 64267936 -> 64268414 (+0.00%); split: -0.17%, +0.17%
InvThroughput: 10954037 -> 10951735 (-0.02%); split: -0.09%, +0.07%
VClause: 132792 -> 132956 (+0.12%); split: -0.06%, +0.18%
SClause: 223854 -> 223841 (-0.01%); split: -0.01%, +0.01%
Copies: 559574 -> 561395 (+0.33%); split: -0.24%, +0.56%
Branches: 179630 -> 179636 (+0.00%); split: -0.02%, +0.02%
VALU: 4572683 -> 4574487 (+0.04%); split: -0.03%, +0.07%
SALU: 772076 -> 772111 (+0.00%); split: -0.01%, +0.01%
VOPD: 1095 -> 1099 (+0.37%); split: +0.73%, -0.37%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
The LLVM backend does this when lowering ordered_xfb_counter_add_amd. I
guess there is some missing dependency checking or something.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345>
These are intended to make sure that the post-RA optimizer works
correctly across control flow. The new tests emit a divergent
if-else branch (with full logical+linear CFG).
- dpp_across_cf:
Simple case of DPP optimizable across control flow. Should pass.
- dpp_across_cf_overwritten:
Similar case but the DPP source register is overwritten in CF.
This shows a bug so the test fails now (will be fixed).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16604>
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)
Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
This commit adds the skeleton of a new ACO post-RA optimizer,
which is intended to be a simple pass called after RA, and
is meant to do code changes which can only be done
after RA.
It is currently empty, the actual optimizations will be added
in their own commits. It only has a DCE pass, which deletes
some dead code generated by the spiller.
Fossil DB results on Sienna Cichlid:
Totals from 375 (0.25% of 149839) affected shaders:
CodeSize: 2933056 -> 2907192 (-0.88%)
Instrs: 534154 -> 530706 (-0.65%)
Latency: 12088064 -> 12084907 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 4433454 -> 4432421 (-0.02%); split: -0.02%, +0.00%
Copies: 81649 -> 78203 (-4.22%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
It doesn't support Navi1x and the removal enables this nice code cleanup.
v2: rebase - mareko
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>
Fix defect reported by Coverity Scan.
Uninitialized scalar variable (UNINIT)
uninit_use: Using uninitialized value ds_state.front. Field ds_state.front.writeMask is uninitialized.
Fixes: d488d0fd7b ("aco: add framework for testing isel and integration tests")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9033>
This patch delays the calculation of GPR limits in order to
precisely incorporate extra registers (VCC etc.) and shared VGPRs.
Additionally, the allocation granularity is used to set the config.
This has some effect on the reported SGPR stats.
Totals (Navi10):
SGPRs: 6971787 -> 17753642 (+154.65%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>