See 0afd691f29 ("panfrost: clang-format the tree") for why I'm doing this.
Asahi already mostly follows Mesa style so this doesn't do much. But this means
we can all stop thinking about formatting and trust the robot poets to do that
for us.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
There are no outstanding commits to these files in any branch, so they don't
need to be considered for the rebasing script. That said, they are massive and
bottleneck the rebasing script, so we'll want to split them out to keep rebasing
efficient.
(Nominally I should make the rebasing script less stupid but with these files
ignored it works pretty well.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
Fixes
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
when run with a GLES2 version.
We wire up seamless cube maps for GLES3+ only, working around an obscure
mesa/st limitation. See 6148e3aae7 ("mesa: Fix
ctx->Texture.CubeMapSeamless") for the full context.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
Lower FRAG_RESULT_DEPTH and FRAG_RESULT_STENCIL writes to a combnied zs_emit
instruction with a multisampling index. To be used in the following commit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
The hash table needs a key pointer with at least the lifetime of the
hash entry, which the key pointer we get does not have (since it is
stack-allocated by agx_build_meta). Copy it into the shader struct
itself and use that for the hash table.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
It seems triangle merging is incompatible with calculating derivatives
along primitive edges correctly. Take the appropriate NIR shader info
flags in the compiler and pass them down as a flag to the driver, so it
can set the disable triangle merging flag (formerly called "lines or
points").
TODO: Is this what macOS does when you set a sample mask there (which
apparently fixes the same bug on the Darwinia Metal backend)? Do we
also need to set this when sample masks are used?
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes Darwinia and dEQP2 projected tests.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
Probably impossible to hit in practice but let's get it right. Found when
forcing RA to use the upper half of the reg file.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
Extract out the code for unpack_64_2x32_split_x and use it for other integer
downcasts too to coalesce out a move. Pointless, but I wanted to have a little
RA fun after getting stencil export working.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
Layer strides are based on the full miptree, and even for single-layer
images macOS always allocates a full one (possibly relevant for
compression). Make sure we do the same, regardless of how many mip
levels the user asked for.
Fixes Darwinia.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
Now we support all the vertex formats! This means we don't hit u_vbuf for format
translation, which helps performance in lots of applications. By doing the
lowering in NIR, the vertex fetch code itself can be optimized by NIR (e.g.
nir_opt_algebraic) which can improve generated code quality.
In my first implementation of this, I had a big switch statement mapping format
enums to interchange formats and post-processing code. This ends up being really
unwieldly, the combinatorics of bit packing + conversion + swizzles is
enormous and for performance we want to support everything (no u_vbuf
fallbacks). To keep the combinatorics in check, we rely on parsing the
util_format_description to separate out the issues of bit packing, conversion,
and swizzling, allowing us to handle bizarro formats like B10G10R10A2_SNORM with
no special casing.
In an effort to support everything in one shot, this handles all the formats
needed for the extensions EXT_vertex_array_bgra, ARB_vertex_type_2_10_10_10_rev,
and ARB_vertex_type_10f_11f_11f_rev.
Passes dEQP-GLES3.functional.vertex_arrays.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
Long term, I think having i2i16 and i2i32 available with 8-bit sources should
make lowering the rest of 8-bit away a bit easier. Short term, this avoids
special casing 8-bit in the VBO lowering code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
The coefficient register is 16-bit so our builder will make the iter 16-bit too
(maybe not the best design...), force fp32 to match the NIR intrinsic.
Fixes glsl-fs-fragcoord-zw-ortho
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20017>
Although the metadata is possibly one byte per 8x4 block, the
logical block size for compression/allocation is a 16x16 block,
so align to that. Also align the initial dimensions to that size,
and change the minification to a simple DIV_ROUND_UP.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20031>