This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>
The GLSL lowering of half float packing involves software conversion
to half-float; instead, use the lowering in NIR.
Both Midgard and Bifrost are already set to lower the instructions to
bit operations, but change mdg_should_scalarize so that the lowerable
split variants of the pack/unpack instructions are generated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16175>
This is to match other NIR terminology.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.
Fixes shaders with only sysvals but no UBOs when push constants are
disabled.
This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.
Fixes: c246af0dd8 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
There are up to 4 instructions in the latter stage (if a branch is included),
not 3. Bump the limit to fix memory corruption.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15147>
We always blit from a particular level, so it's a waste to compute the LOD.
This corresponds to a simple texture instruction with implement 0 LOD, which is
the optimal texturing path on Bifrost -- it maps to TEXS_2D but does not require
helper invocations.
Functional change on Bifrost: Blit shaders no longer set .computed_lod or
shader_contains_barrier.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
cont -> skip, last -> kill, and fix the special case handling. It's just an
enum. Makes the disassembly easier to read and closer to Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
To help identify problems across the compiler, print more forms of the shader
with MIDGARD_MESA_DEBUG=shaders. Roughly matches the Bifrost compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>
Commit 38800b38 changed nir_opcodes.py, but that doesn't seem to have
triggered nir_opt_algebraic.py. The change in 75ef5991 depends on
opt_algebraic lowering 16-bit versions of slt, but if opt_algebraic is
not rebuilt, this may not happen. This resulted in some people seeing
assertion failures in, for example,
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_3.step,
due to the backend seeing nir_op_slt that it didn't know how to handle.
v2: Add nir_opcodes.py to nir_algebraic_py so that all the per-driver
algebraic passes pick up the dependency too. Rename it to
nir_algebraic_depends. Suggested by Emma.
Closes: #6047
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15050>
Compute shaders might do vec3(Xbits) loads which are translated
to LD.<next_pow2(3 * X)> by the midgard compiler. This might cause
out-of-bound accesses potentially leading to pagefaults if the
access is at the end of a BO. One solution to avoid that (maybe not
the best) is to split non-log2 loads to make sure we only read what's
requested.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
In that case, the mask is specified on 32bit lanes, so we need to shift
it if it's > 0x3. The expand modifier will take care of selecting the
right side of the 32bit vector.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
Swizzling happens in 2 steps on Midgard:
1. Vector expansion/shuffling
2. Swizzling at the instruction-size granularity, but defined using
the source size. Those size are different if the source is expanded.
So, when we print 64 bit swizzles on an expanded source, we first need
to apply an offset if the high part of the 32bit vector was selected,
and then divide the result by 2 to account for vector expansion.
To sum-up, swizzling on midgard is complicated, and I'm not sure I got
it right, but it seems to print what I expect on the few compute
shaders using 64bit arithmetic I debugged.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
Even though 8-bit ALUs are not supported, we can have [un]pack_32_4x8
instructions which translate to IMOVs, and those operate on 8-bit
vectors. The problem is, the swizzling granularity is 16 bit, which
means we don't support
MOV.i8 R0.xyzw, TMP0.xxxx, R1.zyxw
and the compiler doesn't even complain, it just applies 8 bit
swizzling directly, which obviously doesn't work.
This is probably not the right way to fix that, but I thought I'd
raised the issue with a hack to fix, so we can get the discussion
started.
(Found while debugging FB store lowering on Midgard).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
The register allocator assumes instructions are executed sequentially
and allows one instruction to overwrite a portion of a register written
by a previous instruction if this portion is never used. But scalar and
vector ALUs might be executed in parallel if they are part of the same
bundle, and when such instructions write to the same portion of the
register file, the result is undefined.
Let's add intra-bundle interferences to avoid this situation.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
There is a single quirk it cares about. Pass just that, so the relevant quirk
can be made a Midgard compiler quirk and not a global Panfrost quirk.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14726>
`base` is meaningless for combined stores, so don't read it. This allows
us to remove the base index from the intrinsic and simplify the lowering
code generating the combined store intrinsic.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13714>
Needed to link the disassembler separate from the rest of the compiler,
as in out-of-tree pandecode builds. Which I haven't done for Midgard in
well over a year, enough time for this to bit rot.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14185>
This is needed to allow lowering blend operations in fragment shaders
when the blend operation uses dynamic blend constants.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>
The output swizzle defined in the render-target descriptor is ignored
when the format is RAW. In that case, we have to swap the components
when lowering FB stores/loads if we want to get the right color.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12793>
This is where it should be rather than having to pass it into the
optimisation pass every time.
It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
Similar to the fix in 6bf8e960fa ("pan/bi: Do helper termination
analysis on clauses")
Though apparently a "theoretical issue only", fixes artefacts in
DarkPlaces with both D3D9 and GL renderers.
Fixes: 9a7f0e268b ("pan/mdg: Use the helper invo analyze passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12156>
As discussed with Jason and Connor, this is probably subtly broken on
Mali T720.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>