The RDNA ISA doc says that omod doesn't preserve -0.0 in 6.2.2. LLVM
appears to always disable omod in this situation, but clamp is unaffected.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605>
These opcodes are never used and they always write the carry-out
according to the GCN3 ISA documentation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7569>
Check for clamp, SDWA or DPP. The optimization isn't possible with SDWA
and DPP, so it would have been skipped anyway. Doing any of these with a
clamp modifier present would be incorrect.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
This is similar to AMD_DEBUG=tex, but for radv.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5734>
This is mostly copied from si_print_texture_info, with the si-specific
bits removed. Moving it into common code will allow to use it from both
radeonsi and radv.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5734>
The first operand of v_bcnt should always be a VGPR because if it's
a SGPR, isel selects s_bcnt1 but I added a sanity check to prevent
any problems.
fossils-db (Vega10):
Totals from 23 (0.02% of 139517) affected shaders:
CodeSize: 106828 -> 106664 (-0.15%)
Instrs: 20242 -> 20201 (-0.20%)
Cycles: 213112 -> 211352 (-0.83%)
VMEM: 3200 -> 3184 (-0.50%)
SMEM: 928 -> 927 (-0.11%)
Helps Control, Assassins Creeds Origins and Youngblood.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7568>
This primarily tests that:
- multiple GPUs with the same GPU modifier parameters result
in the same tiling layout.
- The size & alignment calculations don't change for a given
modifier & image parameters.
It does this primarily based on addrlib. Radeonsi has used addrlib
for the retiling of displayable DCC for a while already, so the
DCC tiling should be pretty reliable.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
Some architectures like aarch64 and ppc64el have char = unisgned char.
This breaks meta equation generation for DCC coords, as addrlib tries
to filter all the Z bits > -1 which ends up being all the Z bits > 255.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7593>
Only on GFX8-9 because GFX10 doesn't zero the upper 16 bits.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
v_mad_u32_u16 will be selected by isel to keep the range analysis
information around and to combine more v_add_u32+v_mad_u32_u16
together. When it's not possible to optimize that pattern, fallback
to v_mul_u32_u24 which is VOP2 instead of VOP3.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
To indicate that the upper 16-bits are always 0 and that optimizing
v_mad_u32_u16 to v_mul_u32_u24 is valid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
When one instruction doesn't fit into the existing labels, use
the generic one.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
This cleans up a bunch of gross sprintfs and keeps the caller from needing
to remember to ralloc_strdup. I added a couple of '"%s", name ? name :
""' to radv where I didn't fully trace through whether a non-null name was
being passed in.
I also took the liberty of adding a basic name to a few shaders (pan_blit,
unit tests)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
The extension is only exposed on ACO and LLVM 11+ because of a LLVM bug.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7234>
64-bit image atomics only work with LLVM 11+ because of a LLVM bug.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7234>
Otherwise FMASK metadata segfaults and on import we disable it ...
CC: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7358>
If we don't have a non-visible VRAM heap, we should be counting
our non-visible VRAM allocations to the visible-VRAM heap.
CC: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6827>
When I enable "Above 4G decoding" in my BIOS I still get 16 MiB of
non-visible VRAM on my 8G VRAM GPU ...
CC: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6827>
Validation errors mention the pretty-printed instruction including
operands with the reserved % character, which caused vasprintf to
expect more format arguments than aco provided.
Fixes: c2b1978aa4 ("aco: rework the way various compilation/validation errors are reported")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7442>
Fix defect reported by Coverity Scan.
Missing varargs init or cleanup (VARARGS)
missing_va_end: va_end was not called for debugPrintInput.ap.
Fixes: 69ea473eeb ("amd/addrlib: update to the latest version")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7299>
The loop variable "k" shadowed another variable in the outer scope, so
this loop had no actual effect.
Fixes: 52cc1f8237 ("aco: improve p_create_vector RA for sub-dword operands")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7427>
There are too many algebraic optimizations to be certain that one of them
couldn't create instructions which need lowering. It also creates better
code for some reason.
fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
VGPRs: 7716 -> 7672 (-0.57%)
CodeSize: 1516152 -> 1510688 (-0.36%); split: -0.38%, +0.02%
MaxWaves: 3964 -> 3982 (+0.45%)
Instrs: 269445 -> 268508 (-0.35%); split: -0.36%, +0.02%
Cycles: 37963416 -> 37912592 (-0.13%); split: -0.15%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
load/store vectorization can create 8/16-bit alu to do packing/unpacking,
which would make shader_info::bit_sizes_used out of date.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>