This gets us fewer comparisons in the shaders that we need to optimize
back out, and reduces backend code.
total instructions in shared programs: 11547270 -> 7219930 (-37.48%)
total full in shared programs: 334268 -> 319602 (-4.39%)
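As a quick sanity check, the percentages in the two stats lines above can be recomputed from the before/after totals. This is only a sketch; the helper name is made up for illustration and is not part of shader-db:

```python
def relative_change(before: int, after: int) -> str:
    """Format the relative change in the same style as the stats lines."""
    return f"{(after - before) / before * 100:+.2f}%"


# Totals copied from the stats lines above.
instructions = relative_change(11547270, 7219930)
full_regs = relative_change(334268, 319602)
print(instructions, full_regs)
```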
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6378>
If no options are provided, existing intrinsics are used.
If the lowering pass indicates there should be offsets used for global
invocation ID or work group ID, then those instructions are lowered to
include the offset.
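A minimal sketch of what "lowered to include the offset" means arithmetically, assuming the usual compute relationship between work group ID, local size and local invocation ID. The function and parameter names are hypothetical and are not the NIR pass's API:

```python
def zero_based_global_id(work_group_id, local_size, local_id):
    # Per component: which work group we are in, scaled by its size,
    # plus the position inside the work group.
    return tuple(wg * size + lid
                 for wg, size, lid in zip(work_group_id, local_size, local_id))


def apply_base(zero_based, base=None):
    # With no offset requested, the value is used unchanged.
    if base is None:
        return zero_based
    # Otherwise the offset is added per component.
    return tuple(value + offset for value, offset in zip(zero_based, base))


gid = zero_based_global_id((1, 0, 0), (8, 8, 1), (3, 2, 0))
print(apply_base(gid), apply_base(gid, (100, 0, 0)))
```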
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
The actual variable -> intrinsic lowering stays where it is, but
ops which convert one intrinsic to be implemented in terms of
another have moved.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
It was always fneu, but naming it fne causes confusion from time to time, so
let's rename it. Later we also want to add other unordered comparisons and
fne; this is a small preparation for that.
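To illustrate why the name matters: an unordered not-equal is true whenever either operand is NaN, while an ordered not-equal is false in that case. The sketch below models both on Python floats; the function names only mirror the opcode naming and are not NIR code:

```python
import math


def fneu(a: float, b: float) -> bool:
    # Unordered not-equal: true if the values differ OR either is NaN.
    # Python's != on floats already behaves this way.
    return a != b


def fne_ordered(a: float, b: float) -> bool:
    # Ordered not-equal: true only if neither operand is NaN and they differ.
    return a < b or a > b


nan = math.nan
print(fneu(nan, 1.0), fne_ordered(nan, 1.0))
```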
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
The OpenCL image_width/height/depth functions have variants which can
take an LOD parameter. More importantly, LLVM-SPIRV-Translator always
generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero.
Given that over half the hardware out there has an LOD field for image
size queries (based on a rudimentary scan through their NIR -> whatever
code), we may as well just add the source to the NIR intrinsic. If this
is ever a problem for anyone, the lowering is pretty trivial.
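One plausible shape for such a lowering, sketched under the assumption that the size at a given LOD follows the standard mip-level rule (halve each dimension per level, never going below 1). The helper name is hypothetical and this is not the code from any driver:

```python
def image_size_at_lod(base_size, lod):
    # Each mip level halves every dimension, clamped to a minimum of 1.
    return tuple(max(1, dim >> lod) for dim in base_size)


print(image_size_at_lod((256, 64, 1), 3))
```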
I've also added asserts to everyone's drivers that should alert them if
they ever see an LOD other than zero. This will never happen with GL or
Vulkan so there's no need for panic.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>
As the original comment says, we can't really give the user what they
want if there's a timestamp inside a GMEM renderpass, but we can give
them a better approximation of it. At least sysmem renderpasses will now
have an accurate timestamp.
Also, don't emit the WFI if it's not necessary, based on the stage
flags.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5720>
Loads, stores, clears, and resolves now happen per-view. Since we only
support multiview with sysmem rendering, we only implement this for
sysmem clears and resolves.
There aren't any tests that mix multiview and MSAA, so no coverage of
the resolve path.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5720>
The *2 here would bump into the *2 in regset, causing assertion failures
when dumping CS programs. Just set the mergedregs flag on a6xx, and don't
duplicate the mergedregs logic. If you're dealing with new HW where we
don't know if mergedregs is set, you may need to tweak the flag during
disasm setup for the stats to make sense.
Fixes: f7bd3456d7 ("freedreno: deduplicate a3xx+ disasm")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6323>
This matches what ir3.c does in the mergedregs case: just count max full
reg used. This flag is unset so far, but it will be set soon, and this keeps
our output comparable between blob and freedreno.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6323>
When building on Android 10, the following build error occurred:
'''
[ 26% 456/1718] Gen Header: libfreedreno_registers_32 <= a3xx.xml.h
FAILED: out/target/product/pinephone/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/adreno/a3xx.xml.h
/bin/bash -c "PATH=/usr/bin:\$PATH python3 external/mesa3d/src/freedreno/registers/gen_header.py external/mesa3d/src/freedreno/registers/adreno/a3xx.xml > out/target/product/pinephone/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/adreno/a3xx.xml.h"
Traceback (most recent call last):
  File "external/mesa3d/src/freedreno/registers/gen_header.py", line 470, in <module>
    main()
  File "external/mesa3d/src/freedreno/registers/gen_header.py", line 446, in main
    xml_file = sys.argv[2]
IndexError: list index out of range
'''
Aligning the build rules with meson fixes it.
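The traceback comes from the script reading sys.argv[2] while the command line in the log passes only one argument. A hedged sketch of that failure mode with a friendlier guard follows. It is not the actual gen_header.py, and what the first argument holds is not shown in the log, so it is left as a placeholder:

```python
def parse_args(argv):
    # Hypothetical stand-in for the script's argument handling: the second
    # positional argument is the XML file, as in the traceback above.
    if len(argv) < 3:
        raise SystemExit(f"usage: {argv[0]} <first-arg> <xml-file>")
    first_arg, xml_file = argv[1], argv[2]
    return first_arg, xml_file


# Two arguments parse fine; a single argument now fails with a usage
# message instead of an IndexError.
print(parse_args(["gen_header.py", "placeholder", "a3xx.xml"]))
```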
Fixes: 62ebd342 ("freedreno/registers: split header build into subdirs")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6170>
Although it's kind-of similar to "(rptN)" in the shader ISA, I called it
"xmov" to make it clear that it's completely orthogonal to "(rep)", and
you certainly can use both modifiers on the same instruction.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6368>
This adds an option to the WSI support for a software path to be
used with the Vulkan sw drivers. There are probably some changes
that could be made to improve this and use present; for now
just use put image.
v2: roll out flag across all drivers (Eric)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6082>
This involves rolling our own int packing functions, because the u_format
versions do clamping, which differs from what the VK spec requires.
This reduces the size of libvulkan_freedreno.so significantly.
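The difference between clamping and non-clamping integer packing can be sketched as below. This is an illustration only: the helper names are hypothetical, and it assumes the hand-rolled version simply truncates to the channel width, which the message does not spell out:

```python
def pack_uint_clamped(value: int, bits: int) -> int:
    # Clamping: out-of-range inputs saturate to the nearest representable value.
    return min(max(value, 0), (1 << bits) - 1)


def pack_uint_truncated(value: int, bits: int) -> int:
    # Assumed non-clamping behaviour: keep only the low `bits` bits.
    return value & ((1 << bits) - 1)


print(pack_uint_clamped(300, 8), pack_uint_truncated(300, 8))
```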
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6304>
Fixes these dEQP tests:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.2d_d32_sfloat_s8_uint_d32_sfloat_s8_uint.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6304>
Move the handling for catching asserts when we start decoding garbage
into disasm-a3xx. This way it can also cover other cases where cffdec
tries to disassemble memory, such as SP_xS_OBJ_START.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6242>
The fixes tag isn't so much because it was incorrect before, but because
I'm going to send a kernel patch to fix the typo, and that will break
old crashdec.
Fixes: 1ea4ef0d3b ("freedreno: slurp in decode tools")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6242>
Add tracking for # of instructions per category, similar to the last
patch. Also add a few other shader-db stats that were missing on the
disasm side, to make it easier to compare to shaders from cmdstream
traces.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6243>
Freedreno needs at least Lua 5.2, but the current check reports Lua as
found for 5.1, which doesn't actually work.
Fixes: caa107cb8d ("freedreno/decode: move dependencies up a level")
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6229>