This switches us over to Mesa's code style [1], normalizing us within the tree.
The results aren't perfect, but they bring us a hell of a lot closer to the rest
of the tree. Panfrost doesn't feel so foreign relative to Mesa with this, which
I think (in retrospect after a bunch of years of being "different") is the right
call.
I skipped PanVK because that's paused right now.
find panfrost/ -type f -name '*.h' | grep -v vulkan | xargs clang-format -i;
find panfrost/ -type f -name '*.c' | grep -v vulkan | xargs clang-format -i;
clang-format -i gallium/drivers/panfrost/*.c gallium/drivers/panfrost/*.h ; find
panfrost/ -type f -name '*.cpp' | grep -v vulkan | xargs clang-format -i
[1] https://docs.mesa3d.org/codingstyle.html
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
We don't care to support i8vec16, we just need a bit of 8-bit support to
implement format packing/unpacking in blend shaders. We're already doing
this by using the 16-bit pipe, we just need to commit to it all the way
-- reporting the correct sizes in max_bitsize_for_alu so the mask
packing logic works as intended -- and dropping the imov-specific hack
that was introduced to workaround a similar class of bugs.
With the previous patch, fixes:
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1
Fixes: 39e4b7279d ("pan/midg: Fix swizzling on 8-bit sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>
Even if we only mask a single component from the result of CSEL.vector,
in our IR we treat its semantics as vector which causes trouble with
when scheduled to a scalar unit.
The problematic bundle looks like this:
vmul.MOV.i32 R31, TMP0.xxxx, R0.yzww
sadd.MAX.i32 TMP0.y, R0.y, #65408
smul.CSEL.vector.i32 R0.y, TMP0.y, #127
As the comment in midgard.h illuminates, these CSEL instructions are
actually operating per-bit, lining up with the all-1's booleans in
Midgard. The Bifrost analogue is MUX.i32.bit, not CSEL.i32. We should
probably rename the Midgard instruction to make that clear.
Anyhoo, on the scalar unit, CSEL/MUX operates on the bottom 32-bits of
its source. That's ok for the usual r31.w case, because that's secretly
replicating to its nonexistent register, I think? But that doesn't work
with the CSEL.vector (MUX.vector) form, because the condition it's
actually muxing on is r31.x, which here is R0.y, not the intended R0.x.
Rather than adding more special cases to the already overcomplicated
scheduler (for the dubious benefit of avoiding a small shaderdb
regression), just avoid scheduling CSEL.vector to smul.
With the next patch, fixes:
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>
We can go up to 15 instructions out of order (performance fix) but we
can't go past a branch (bug fix).
Fixes: 30a393f458 ("pan/mdg: Enable out-of-order execution after texture ops")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19762>
Unconditionally lowering prevents GL drivers from natively
implementing these ops. Drivers that need lowering should set
lower_uadd_carry and lower_usub_borrow on nir_shader_compiler_options to
get the nir lowerings.
Tested with dEQP-GLES31.functional.shaders.builtin_functions.integer.*
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19704>
NIR deemphasizes nir_variable. We want to transition off it. Instead of walking
the list of variables and playing games with the GLSL types to collect varying
information, walk the list of instructions and use the I/O semantics to collect
similar information.
In addition to avoiding the reliance on nir_variable, this fixes handling of
struct varyings under certain circumstances. Such programs are compiled by the
GLES3.1 CTS but not used, so without this fix, the affected tests would regress
when precompiling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Move the pass from the Bifrost compiler to the Midgard/Bifrost common code
directory, and take advantage of it on Midgard, where it fixes the same
tests as it fixed originally on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Now we're back to a single XFB implementation for all gens. Fixes:
KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-arrays
KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-elements
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19238>
gl_PointCoord is implemented via a special attribute descriptor on Midgard. This
descriptor has an orientation bit, the orientation is driver-controlled. That
means we can map rast->sprite_coord_mode to this bit, rather than lowering in
the shader.
This is a bug fix for point sprites, which are implemented natively on Midgard
for dubious reasons and need to be flipped this way. It is also an optimization
for apps reading gl_PointCoord, removing the extra arithmetic to flip, although
the value of this is somewhat dubious.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19237>
The loop over sources has to happen for every instruction, regardless of whether
we also need to register allocate the destination. The other source loops handle
this properly, but this one was missed.
Fixes spilling failure in shaders/android/angle/aztec_ruins/16.shader_test when
the input NIR is shuffled a bit (from reordering passes).
Fixes: 129d390bd8 ("pan/mdg: Fix bound setting in RA for sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19093>
Now the state tracker's responsible to lower away for us (and the state tracker
can do it correctly, our implementation is incorrect with a strict reading of
the Gallium contract).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18658>
This brings us in line with the rest of Mesa, fixing multithreaded shader-db
reports and the Total CPU Time report at the end.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18351>
The third source exists logically but not architecturally. We still need to
print it. Caught by the assertion.
Fixes: 0ee24c46e0 ("pan/mdg: Only print 2 sources for ALU")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18338>
This might not be optimal but it matches our current behaviour and is much more
justified than the "accidental" code before. Caught by the gcc warning:
../src/panfrost/midgard/midgard_schedule.c:1227:48: warning: the comparison will
always evaluate as ‘true’ for the address of ‘writeout_branch’ will never be
NULL [-Waddress]
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18338>
This is almost always a nir_instr and updating the src of a nir_if will
have to work slightly differently in the future.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
There are a bunch of subtle details of how 32-bit sources are
zero-extended to 64-bit, how their swizzles work, how 64-bit
destinations are shrunk to 32-bit, and how those two interact. This
fixes the interactions... mostly.
Fixes umul_high, all such tests should be passing now. Unblocks idiv
lowering that depends on umul_high.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
This works around issue packing 32-bit scalar swizzles zero-extended to
64-bit, seen with the umul_high implementation. I tried for a while
figuring out the root cause (even rewrote a big chunk of disassembler)
but am still a bit lost. Nevertheless this is a safe workaround with no
performance impact (and avoids relying on NIR undefined behaviour to
implement GPU undefined behaviour), so let's do this for now to fix
umul_high.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
source_root function is deprecated in Meson version 0.56.0, so let's use
instead a current_source_dir() function, available in all Meson
versions. This also allows to deduplicate some code by declaring
commonly used string at the top meson.build file.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17974>
..and assert the other sources are null. The one place this might fail in the
future is for real FMA, but we don't support that for GL.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>
These do not convey any additional information, and fail to account for
shrinking. In particular, a 64-bit writemask with .keephi would fail to
disassemble and instead trip the assertion, since that would be the ZW
components. Just delete the broken code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>
Otherwise, we can get vec3 with u2u32 with 64-bit sources which we need lowered.
Since our current approach is "scalarize all 64-bit ops", we need to check for
conversions too.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>
The callback allows to request different vectorization factors
per instruction depending on e.g. bitsize or opcode.
This patch also removes using the vectorize_vec2_16bit option
from nir_opt_vectorize().
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13080>
NIR already has the necessary lowering, and the GLSL lowering violates
GLSL IR validation rules. Once quadop lowering was turned off, the IR
validation at the end of the compile path on DEBUG builds caught the
problem.
In order to move the lowering to NIR, though, we need to make sure that
drivers supporting these functions actually have the lowering flag set.
xfails added for t860, where apparently this tickles a variety of existing
64-bit bugs in the backend.
Fixes: #6461
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16437>
This is set to true for all drivers that have a GLSL level
of support lower than 4.00. This matches the rule for setting the
GLSL IR option EmitNoIndirectSampler.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16543>
Vulkan doesn't need nearly as many system values and would like to bake
its layout up-front instead of having it provided by the back-end
compiler.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>
In 3559efb9bf ("panfrost: Allow passing an explicit UBO index for the
sysval UBO"), an explicit UBO index was added and it was implicitly
assumed that it would be > num_ubos. This was convenient because it
meant 0, the default for designated initializers, implicitly meant
compiler-assigned. However, we're about to move the sysval UBO to 0
which breaks this assumption. Also, we don't want the back-end
compiler to even look at num_ubos since it's meaningless in Vulkan.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>
If two instructions in a single bundle both write to a spilt
destination, then we need to reuse the fill and spill instructions,
otherwise the value will be overwritten.
This and the rest of this set of Midgard bug fixes were found from a
vertex shader in Firefox WebRender that is used when a video is
clipped, for example by setting the border-radius CSS property.
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>