The TBA/TMA scalar registers are only available on GFX6-GFX8.
On GFX9+, TBA/TMA addr are stored in hardware registers and
the number of TTMP scalar registers is thus increased by 4.
Just keep in mind that tba_lo is actually ttmp0. Best would
be to support ttmp registers in RA but that's more complicated.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6384>
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
We were using the wrong conversion opcode. The high bits are also not
zero'd on GFX10, which can cause v_cvt_pk_u16_u32 to clamp.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: df645fa369 ('aco: implement VK_KHR_shader_float_controls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6346>
Sounds useful for debugging missing wait-states and for improving
detection of the faulty instruction in case of memory violations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6386>
The OpenCL image_width/height/depth functions have variants which can
take an LOD parameter. More importantly, LLVM-SPIRV-Translator always
generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero.
Given that over half the hardware out there has an LOD field for image
size queries (based on a rudimentary scan through their NIR -> whatever
code), we may as well just add the source to the NIR intrinsic. If this
is ever a problem for anyone, the lowering is pretty trivial.
I've also added asserts to everyone's drivers that should alert them if
they ever see an LOD other than zero. This will never happen with GL or
Vulkan so there's no need for panic.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>
The upcoming change will allow to report all ACO errors (or warnings)
directly to the app via VK_EXT_debug_report. This is similar to what
we already do for reporting various SPIRV->NIR errors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318>
It could happen that only the branch condition was computed in WQM
and not the branch instruction.
There is now some rendundancy which should be cleaned up.
Fixes: 3817fa7a4d ('aco: fix WQM handling in nested loops')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6260>
Fixes random failures of dEQP-VK.image.qualifiers.volatile.cube_array.r32i
and similar tests on Vega.
fossil-db (Navi):
Totals from 6 (0.00% of 135946) affected shaders:
VMEM: 1218 -> 1110 (-8.87%); split: +2.46%, -11.33%
SMEM: 174 -> 189 (+8.62%)
Copies: 84 -> 87 (+3.57%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: cd392a10d0 ('radv/aco,aco: use scoped barriers')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6174>
This should do much more trimming than shrink_load, and is a win on i965's
vec4 and nir-to-tgsi. For scalar backends like this that don't need ALU
shrinking, it still gets more load intrinsics covered.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6050>
We only need one s_bfe for a conversion with a swizzled source.
shader-db (parallel-rdp, Navi):
Totals from 487 (71.30% of 683) affected shaders:
SpillSGPRs: 3284 -> 3233 (-1.55%); split: -2.71%, +1.16%
SpillVGPRs: 2174 -> 2150 (-1.10%); split: -1.24%, +0.14%
CodeSize: 2497864 -> 2445544 (-2.09%); split: -2.11%, +0.01%
Instrs: 450613 -> 445104 (-1.22%); split: -1.27%, +0.05%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5259>
This information is exposed through shader->options->lower_int64_options.
Removing the extra arg forces drivers to initialize this field correctly.
This also allows us to check the int64 lowering options from each int64
lowering helper and decide if we should lower the instructions we
introduce.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3521>
And add some simple tests to demonstrate/test the pipeline builder and
glsl_scraper.py.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3521>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3521>
And add some "tests" to test and document currently unused features of the
framework.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3521>
NIR doesn't have atomic loads/stores, so we have to workaround that with
this for dEQP-VK.memory_model.* to pass.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>
Previously, we've set element_size == 16 which causes loads from
packed vec3 arrays to cross the boundary and return wrong data.
This patch sets element_size = 4 and splits loads into single channel.
Fixes all of dEQP-VK.subgroups.ballot_broadcast.*
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5977>