Commit graph

4500 commits

Author SHA1 Message Date
Eric Engestrom
aab4a260db meson: add missing dependency
Now that renderonly.h includes util/simple_mtx.h, which itself includes
valgrind.h, dep_valgrind is required by any module that includes
renderonly.h.

    In file included from ../src/gallium/auxiliary/renderonly/renderonly.h:33,
                     from ../src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c:39:
    ../src/util/simple_mtx.h:34:12: fatal error: valgrind.h: No such file or directory
       34 | #  include <valgrind.h>
          |            ^~~~~~~~~~~~
    compilation terminated.

dep_valgrind is part of idep_mesautil, which should be used instead of
copying the list of deps for each util header included (which would
have to be updated every time a util header changes its own includes),
so let's add idep_mesautil everywhere that includes renderonly.h.

Fixes: ad4d7ca833 ("kmsro: Fix renderonly_scanout BO aliasing")
Tested-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20530>
2023-01-06 15:40:39 +00:00
Alyssa Rosenzweig
980df9ede1 pan/bi: Move Bifrost specific C code to src/compiler/bifrost
The goal is to make files at the root of src/compiler/ apply to both Bifrost and
Valhall, while ISA-specific code (e.g. instruction packing) code goes in
compiler/bifrost/ or compiler/valhall/. This is what Valhall is already doing,
the Bifrost specific stuff was just grandfathered in.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
2023-01-02 17:54:49 +00:00
Alyssa Rosenzweig
551c2aadd4 pan/bi: Remove standalone compiler
This functionality is now available on Linux with drm-shim + shader-db, and I
suspect the version bundled here is broken anyway. Strictly this drops
Windows/macOS support for the known-broken frontend to the shader compiler but I
can't say I'm terribly worried about that.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
2023-01-02 17:54:48 +00:00
Alyssa Rosenzweig
1a35acd8d9 pan/bi: Rename panfrost/bifrost -> panfrost/compiler
This is the compiler for both Bifrost and Valhall, and presumably future
Mali GPUs too. Give it a more generic name so we can use the bifrost/ path for
something a bit more specific.

For historical reasons the compiler's name is still "bifrost" and uses the
prefix `bi_`. I think that's ok in the same way that i915 in the kernel supports
way more than just i915.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
2023-01-02 17:54:48 +00:00
Aleksey Komarov
dcae301828 pan/va: Fix MUX.i32 and MUX.v2i16 description. Should be:
`(A &amp; mask) | (B &amp; ~mask)`

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20441>
2022-12-28 21:36:54 +00:00
Aleksey Komarov
d14d7c49db pan/va: Fix d0 description in enum "Load lane (8-bit)"
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20441>
2022-12-28 21:36:54 +00:00
Aleksey Komarov
f102b57423 pan/va: Fix description for constant 0xFAFCFDFE: -2, -3, -4, -6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20441>
2022-12-28 21:36:54 +00:00
nihui
e584447aed panvk: Fix null pointer dereference on cmd_buffer->ops
Fixes: 84cd81e104 (panvk: Use common code for command buffer lifecycle
management)

Signed-off-by: Hui Ni <shuizhuyuanluo@126.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20406>
2022-12-26 12:57:07 +00:00
Asahi Lina
bb4aa8a3ea panfrost: Fix race condition in BO imports
When importing a BO, if it is already imported, then the handle will
alias an existing BO instance. It is possible for the existing owner to
free the BO after the import and leave a dangling handle before we get a
chance to increase the refcount, so we need to lock the BO table mutex
before importing, to make sure nobody else goes through the free path
during that window.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20403>
2022-12-25 22:04:24 +00:00
Alyssa Rosenzweig
b53fa25587 panfrost: Clang-format pan_layout.c
Messed up the "clang-format off" for this file.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Aleksey Komarov <q4arus@ya.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20431>
2022-12-23 21:43:08 -05:00
Alyssa Rosenzweig
0afd691f29 panfrost: clang-format the tree
This switches us over to Mesa's code style [1], normalizing us within the tree.
The results aren't perfect, but they bring us a hell of a lot closer to the rest
of the tree. Panfrost doesn't feel so foreign relative to Mesa with this, which
I think (in retrospect after a bunch of years of being "different") is the right
call.

I skipped PanVK because that's paused right now.

  find panfrost/ -type f -name '*.h' | grep -v vulkan | xargs clang-format -i;
  find panfrost/ -type f -name '*.c' | grep -v vulkan | xargs clang-format -i;
  clang-format -i gallium/drivers/panfrost/*.c gallium/drivers/panfrost/*.h ; find
  panfrost/ -type f -name '*.cpp' | grep -v vulkan | xargs clang-format -i

[1] https://docs.mesa3d.org/codingstyle.html

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig
a4705afe63 panfrost: Fix up some formatting for clang-format
clang-format will make a mess of these otherwise.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig
e35719be6f panfrost: Add missing #includes
Found shuffling headers with clang format.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig
90e128ae03 panfrost: Remove perfetto-specific .clang-format
We'll use the one in src/panfrost/.clang-format instead, which isn't identical
but should be good enough. This way they don't conflict with each other.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig
ee2dcdc3df panfrost: Add clang-format file
Based on freedreno settings, tweaked for panfrost's foreach macros.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>
2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig
bd83e5ddaf pan/bi: Use write masks on Valhall texture instrs
I noticed a sequence like the following in a scheduled SuperTuxKart shader:

   TEX_SINGLE.slot0 @r0:r1, ..
   LD_VAR.wait0 @r2, ...
   FMA r1, ...

Why do we stall waiting for the TEX_SINGLE instruction when it's not actually
read? Because its upper channels are *never* read, leading to a
write-after-write dependency when the register allocator puts some unrelated ALU
destination in there. By appropriately masking the texture instruction's write,
that false dependency disappears, avoiding the stall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20426>
2022-12-23 19:05:10 +00:00
Alyssa Rosenzweig
7d9c771b9b pan/va: Pack texture write masks
We'll generate nontrivial ones in a moment.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20426>
2022-12-23 19:05:10 +00:00
Alyssa Rosenzweig
f6d73ea7b4 pan/lower_framebuffer: Remove unused pack
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20420>
2022-12-23 16:27:16 +00:00
Alyssa Rosenzweig
8dd35e0ac7 pan/mdg: Remove unused disassembler functions
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20420>
2022-12-23 16:27:16 +00:00
Alyssa Rosenzweig
9cd6d0873d panfrost: Remove experimental v7-only indirect draw path
There are too many problems with indirect draws on v7 that we never got this
code path to the finish line, and none of us have a good plan (or reason) to fix
this. Proper indirect draws are only possible since v10 on Mali.

There was interest in using this path to implement indexed draws in PanVK, that
MR is stalled and it's not clear how much sense it makes to do Vulkan on
anything older than v9 or v10 at this point. This code isn't *gone*, it'll still
be in git history, but I don't see a lot of reason in keeping it in tree if it's
unused and complicating e.g. the sysval upload path of the driver.

Indirect dispatch remains supported on v7, as that path *is* working and flipped
on for end users. Indirect dispatch on v7 is considerably less complicated than
indirect draws.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20420>
2022-12-23 16:27:16 +00:00
Alyssa Rosenzweig
486c341769 panfrost: Add architecture description XML for v10
Add the GenXML hardware description for Mali architecture v10, as implemented in
Mali-G610. This is not 100% complete but it should be good enough for parity
with v9.

The XML itself is forked off of v9, with all Job Managerisms replaced with
CSFisms. This notably includes a large number of new structures defining the
instructions that run on the Command Execution Unit (CEU).

This is the first step towards supporting Mali-G610 (i.e. RK3588) upstream. Next
up will be pandecode support.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20360>
2022-12-17 20:33:39 +00:00
Yonggang Luo
9f5ace9857 panvk: Fixes -Werror,-Wunused-but-set-variable for clang-15 in panvk_descriptor_set.c
../../src/panfrost/vulkan/panvk_descriptor_set.c:67:13: error: variable 'dynoffset_idx' set but not used [-Werror,-Wunused-but-set-variable]
   unsigned dynoffset_idx = 0, img_idx = 0;
            ^

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
2022-12-16 19:02:17 +00:00
Alyssa Rosenzweig
476be5cb27 panfrost: Don't use texture format swizzles on v7
They're too restricted for AFBC. Fix up instead. There are two problems at play:

1. We can't just map the format swizzle to the pixel format ordering on v7,
   because the "reordered" values aren't allowed with compression.
2. We can't just compose the format swizzle with the API swizzle, because the
   composed swizzle is applied to the border colour, so we need to be able to
   apply an inverted swizzle to the border colour. That only works for bijective
   format swizzles.

Fortunately, there's a neat solution: decompose the format's swizzle into two
swizzles, the first mapping to a reordering that IS allowed for compression, and
the second a bijection. Then we use the allowed reordering when texturing, apply
the bijective swizzle to the API swizzle, and apply the inverse of the bijective
swizzle to the border colour. When we're sampling a border colour, what's now
happening mathematically is:

   (API swizzle o bijective swizzle)((bijective swizzle^-1)(border colour)) =
   (API swizzle o (bijective swizzle o bijective swizzle^-1))(border colour) =
   API swizzle(border colour)

which is exactly what we wanted.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
2022-12-16 18:27:47 +00:00
Alyssa Rosenzweig
f159ff530e panfrost: Allow swizzled AFBC on v9+
On v6 and earlier, the hardware supports arbitrary format swizzles for AFBC, so
there's no restriction on AFBC. On v8 and newer, the format swizzle gets applied
to the *decompressed* interchange format, so we can effectively support BGRA of
AFBC images without any special handling. (Confirmed working on v9. Obviously I
can't test on v8 but the expression is cleaner if we assume optimistically it's
like v9. Without hardware, we get to make that assumption :-p)

That just leaves v7 as the only architecture where format swizzles are
restricted for compression but there are no plane descriptor. Don't apply the
restriction to the newer parts.

This gets us AFBC of window surfaces on v9+. As the limiting case, fullscreen
glmark2-es2-wayland -btexture (1080p) in sway on Mali-G57 from 1300fps to
2353fps.

45% reduction in frame time is nothing to sneeze at.

Achoo.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
2022-12-16 18:27:47 +00:00
Alyssa Rosenzweig
cb5e417c01 panfrost: Introduce pan_afbc_mode
Introduce an enum to represent an AFBC compression mode. These modes are not
formats, on Valhall they are decoupled from the format. As such, it does not
make sense to use a pipe_format to represent them. Add an enum that we can use
in a straightforward way on Midgard and Bifrost to fallback for texture views,
and can map 1:1 to the Valhall hardware enum.

In addition to being less overloaded semantically, this lets -Wswitch kick in to
ensure that we handle all enums when translating. The straightforward
translation raises the following warnings:

../src/panfrost/lib/pan_cs.c:437:9: warning: enumeration value ‘PAN_AFBC_MODE_R5G5B5A1’ not handled in switch [-Wswitch]
  437 |         switch (panfrost_afbc_format(PAN_ARCH, format)) {
      |         ^~~~~~

...indicating that some formats were missed, leading to assertion fails "unknown
canonical AFBC format" when rendering RGB5A1, which dEQP-GLES31 does. Fixes
regressions in
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.*
on Valhall.

Given how scarce v9 hardware is, that v10 isn't upstream yet, and the offending
code was merged a week ago, this should not have actually affected anyone. At
any rate, it's a good reminder we really do need CI for v9...

Fixes: 8e125b6c15 ("panfrost: Enable AFBC of more formats")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
2022-12-16 18:27:47 +00:00
Alyssa Rosenzweig
0784adc668 panfrost: Luminance-alpha AFBC unsupported on v7+
The L8_UNORM, A8_UNORM, and L8A8_UNORM v7 formats do not support AFBC,
regardless of swizzling. We're about to lift the restrictions on swizzling with
AFBC on v7, so we'll need to handle these cases explicitly to avoid using AFBC
in these cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
2022-12-16 18:27:47 +00:00
Alyssa Rosenzweig
a3f9aa3b3e panfrost: Align WSI strides for tiled AFBC
When calculating legacy WSI strides for tiled AFBC, we need to account for the
greater alignment requirement of tiled AFBC, or importing resources will fail
later.

Since tiled AFBC is only supported on v7 and later, and AFBC of window surfaces
isn't being used on Linux on v7 and later, this probably hasn't been hit in
practice. Probably.

We're about to fix AFBC of window surfaces so we need to fix this side first.

Fixes: 0255f554f3 ("panfrost: Advertise 16x16 tiled AFBC")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
2022-12-16 18:27:47 +00:00
Alyssa Rosenzweig
a861501632 panfrost: Add tool to print supported texture formats
While all Panfrost-supported Mali GPUs support all the compressed texture
formats architecturally, the system integrator decides which formats will
actually be wired up in the production system-on-chip. In the past there may
have been legal considerations, I'm neither a lawyer nor a system integrator so
couldn't say.

It's useful for users to know which compressed texture formats are supported by
their hardware, to understand its performance characteristics (and perhaps to
buy systems that support their needs, especially if they need BCn formats which
are omitted in many Mali implementations).

To help with that, this commit adds a small standalone tool that prints which
formats are supported. It is tested so far on Mali-T860 and Mali-G57.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20086>
2022-12-14 22:48:47 +00:00
Ian Romanick
eb76cee9f8 nir: Eliminate nir_op_i2b
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b.  Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.

At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.

The failing test on d3d12 is a pre-existing bug that is triggered by
this change.  I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.

v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.

v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.

v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress.  The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).

v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.

v6: Just eliminate the i2b instruction.

v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷

No shader-db changes on any Intel platform.

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2

Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2

The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.

Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
2022-12-14 06:23:21 +00:00
Rebecca Mckeever
f381187b8f panvk: Delete panvk_CmdSetDeviceMask, panvk_GetDeviceGroupPeerMemoryFeatures
Delete panvk_CmdSetDeviceMask and panvk_GetDeviceGroupPeerMemoryFeatures
so that the vk_common_* version will be used instead. This will avoid
repeated code.

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20218>
2022-12-09 14:08:14 -06:00
Alyssa Rosenzweig
8e125b6c15 panfrost: Enable AFBC of more formats
Enable AFBC for all RGBA UNORM formats possible in v5. This does not
cover the AFBC rules for newer gens, nor for YUV.

Noticed with an uncompressed R8 UNORM texture in SuperTuxKart.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19758>
2022-12-01 02:03:15 +00:00
Alyssa Rosenzweig
c7eb6a9fbb panfrost: Enable AFBC of sRGB formats
AFBC of sRGB formats should just work. We just need to flip it on and enjoy
the improved performance.

In particular, this means that RGBA8 UNORM and RGBA8 sRGB UNORM are now
considered compatible formats for AFBC. That's a bug fix, because
GALLIUM_HUD use will act like a texture view between sRGB and linear
views. For FBOs, that will "just" result in a decompression, hurting
performance. For window system rendering with AFBC, that will cause an
assertion failure, as we cannot decompress SHARED resources.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19758>
2022-12-01 02:03:15 +00:00
Alyssa Rosenzweig
cd21cf5ab6 panfrost: Handle all RGB AFBC modes on v9
We're about to enable AFBC on more formats in the core AFBC code. The plane
descriptor packing needs to be aware of these new formats.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19758>
2022-12-01 02:03:15 +00:00
Alyssa Rosenzweig
976405907e pan/mdg: Emulate 8-bit with the 16-bit pipe
We don't care to support i8vec16, we just need a bit of 8-bit support to
implement format packing/unpacking in blend shaders. We're already doing
this by using the 16-bit pipe, we just need to commit to it all the way
-- reporting the correct sizes in max_bitsize_for_alu so the mask
packing logic works as intended -- and dropping the imov-specific hack
that was introduced to workaround a similar class of bugs.

With the previous patch, fixes:

   dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1

Fixes: 39e4b7279d ("pan/midg: Fix swizzling on 8-bit sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>
2022-12-01 00:52:53 +00:00
Alyssa Rosenzweig
261d48fc9b pan/mdg: Refuse to schedule CSEL.vector to SMUL
Even if we only mask a single component from the result of CSEL.vector,
in our IR we treat its semantics as vector which causes trouble with
when scheduled to a scalar unit.

The problematic bundle looks like this:

   vmul.MOV.i32 R31, TMP0.xxxx, R0.yzww
   sadd.MAX.i32 TMP0.y, R0.y, #65408
   smul.CSEL.vector.i32 R0.y, TMP0.y, #127

As the comment in midgard.h illuminates, these CSEL instructions are
actually operating per-bit, lining up with the all-1's booleans in
Midgard. The Bifrost analogue is MUX.i32.bit, not CSEL.i32. We should
probably rename the Midgard instruction to make that clear.

Anyhoo, on the scalar unit, CSEL/MUX operates on the bottom 32-bits of
its source. That's ok for the usual r31.w case, because that's secretly
replicating to its nonexistent register, I think? But that doesn't work
with the CSEL.vector (MUX.vector) form, because the condition it's
actually muxing on is r31.x, which here is R0.y, not the intended R0.x.

Rather than adding more special cases to the already overcomplicated
scheduler (for the dubious benefit of avoiding a small shaderdb
regression), just avoid scheduling CSEL.vector to smul.

With the next patch, fixes:

dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>
2022-12-01 00:52:53 +00:00
Jessica Clarke
750325730b panfrost/blend: Fix invalid const values leading to NIR validation errors
Using a designated initializer like this leaves padding bits, which form
part of the aliasing u64/f64 member of the union, uninitialised, but a
nir_const_value must always have the unused bits zeroed out. Thus, use
the nir_const_value_for_float helper instead like everywhere else which
will do a memset 0 for us first.

Without this, using the pan_blend shader in a build with validation
enabled fails with:

  NIR validation failed after nir_lower_vars_to_ssa
  ...
            vec4 32 ssa_58 = load_const (0x3f7cfcfd /* 0.988235 */, 0x3f7cfcfd /* 0.988235 */, 0x3f7cfcfd /* 0.988235 */, 0x3f800000 /* 1.000000 */)
  error: memcmp(val, &cmp_val, sizeof(cmp_val)) == 0 (../src/compiler/nir/nir_validate.c:976)

Fixes: 1378c67bcf ("panfrost/blend: Inline blend constants")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20071>
2022-11-30 04:49:17 +00:00
Martin Roukala (né Peres)
0cee008fee Revert "glx: Fix drawable refcounting for naked Windows"
This reverts commit 768238fdc0 which
is not only leading to memory leaks, but also reportedly breaks KDE
pretty badly.

Fixes: #7674, #7435
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19972>
2022-11-25 20:08:45 +00:00
Alyssa Rosenzweig
4b19725ee5 panfrost: Revert "Require 64-byte alignment on imports"
This reverts commit 811f8a1946. As Alpine put it
-- this is causing more problems than it's fixing. Hotfix to revert the
offending commit until a more measured fix can be implemented.

Closes: #7731
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Jan Palus
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19993>
2022-11-24 23:46:55 +00:00
Alyssa Rosenzweig
044428211c pan/mdg: Fix out-of-order execution
We can go up to 15 instructions out of order (performance fix) but we
can't go past a branch (bug fix).

Fixes: 30a393f458 ("pan/mdg: Enable out-of-order execution after texture ops")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19762>
2022-11-23 20:23:50 +00:00
Simon Ser
b74a1c8fad panfrost: drop wayland-protocols dep
wayland-protocols is not a library, it just contains a bunch of
XML files. No need to try to link to it.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19894>
2022-11-23 12:15:23 +00:00
Yonggang Luo
40a9fc57aa tree-wide: Use __func__ instead of __FUNCTION__ in non-gallium code
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19861>
2022-11-22 06:53:46 +00:00
Yonggang Luo
94886a2975 util: Move src/gallium/include/pipe/p_format.h to src/util/format/u_formats.h
Because p_format.h shared between vulkan drivers and opengl drivers

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19629>
2022-11-19 03:38:19 +00:00
David Heidelberg
17aea35c44 ci/panfrost: drop glmark2 traces, useless
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19838>
2022-11-18 14:42:32 +00:00
M Henning
f3ee9be836 glsl: Drop borrow/carry lowerings in favor of nir
Unconditionally lowering prevents GL drivers from natively
implementing these ops. Drivers that need lowering should set
lower_uadd_carry and lower_usub_borrow on nir_shader_compiler_options to
get the nir lowerings.

Tested with dEQP-GLES31.functional.shaders.builtin_functions.integer.*

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19704>
2022-11-15 21:51:04 +00:00
Yonggang Luo
660b110494 util: Move src/gallium/auxiliary/os/os_mman.h to src/util/os_mman.h
Use "util/detect_os.h" instead of "pipe/p_config.h" and "pipe/p_compiler.h"
in src/util/os_mman.h

This is a prepare to implement os_mman on windows

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19645>
2022-11-15 19:55:01 +00:00
Dave Airlie
1486b54e80 panvk: move to using common command buffer status
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16922>
2022-11-11 05:01:24 +00:00
Jason Ekstrand
84cd81e104 panvk: Use common code for command buffer lifecycle management
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16922>
2022-11-11 05:01:24 +00:00
Jason Ekstrand
2126bb6c92 panvk: Drop panvk_cmd_buffer::queue_family_index
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16922>
2022-11-11 05:01:24 +00:00
Alyssa Rosenzweig
811f8a1946 panfrost: Require 64-byte alignment on imports
While Panfrost allocates linear images with strides that are a multiple of 64
bytes, other dma-buf producers on the system may not satisfy this requirement.
However, at least on v7 and newer, any image with a regular format must have a
stride that is a multiple of 64 bytes.

This fixes a real bug in an application that created a linear R8_UNORM image
with stride 480 bytes, imported it as an EGL_image, and then tried to texture
from it with the GPU. Previously, the driver allowed this situation but it
resulted in an imprecise fault from the GPU. This patch corrects the driver to
reject the import as invalid due to the unaligned stride, ensuring we never
attempt to texture from such a resource.

To implement, we add some new layout queries to centralize knowledge about the
stride alignment requirements, and we sprinkle in asserts to show how the
invariant is upheld throughout the lifecycle of image creation to texturing.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19620>
2022-11-10 14:37:18 +00:00
Alyssa Rosenzweig
1827b4a2db panfrost: Compile indirect dispatch shader on first use
For 2D UI workloads and even most 3D workloads, the indirect dispatch shader
won't actually be needed, but we currently compile it during eglInitialize() on
every v7 application. That hurts app start-up time, especially given that this
shader doesn't hit the disk cache. We can instead defer compiling this shader
until it's actually needed, when glDispatchComputeIndirect() gets called.

The tradeoff is that the first glDispatchComputeIndirect() call will be (much)
slower than successive calls, since we need to build and compile this internal
shader. I'm unconvinced that's a problem in practice.

An app would need to call glDispatchComputeIndirect for the first time in the
middle of a scene.  2D apps never would call that, OpenCL doesn't have that, and
GL compute will have the same costs just moved around.  So it's down to a 3D
GLES3.1 app that indirectly dispatches compute for the first time time in the
middle of a scene. Which, meh? It's not entirely implausible but we have bigger
fish to fry, and this fixes a real problem (about 5% of eglInitialize time spent
building this shader that won't actually get used).

es2_info starts slightly faster with this change.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19622>
2022-11-10 14:22:56 +00:00