Commit graph

193509 commits

Author SHA1 Message Date
Job Noorman
f448cf90c8 zink/ci: add a618 flake
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
72bb4d79dc ir3/legalize: handle scalar ALU WAR hazards for a0.x
It turns out that mova executes on the normal pipeline, which means that
users of a0.x on the scalar pipeline might cause a WAR hazard with mova.

Fixes: 876c5396a7 ("ir3: Add support for "scalar ALU"")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
dead168200 ir3: make fullsync sync after shared writes
fullsync would only sync after cat4/5/6 instructions. However, since the
introduction of scalar ALU, we also need to sync after writes to shared
registers. This commit fixes this by using the is_ss/sy_producer
helpers. This should also catch all cases where (ss) is needed for WAR
hazards.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
2e40dda3cd ir3/ci: remove fixed tests from a307-fails
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
83b55a7d7c ir3: use correct bit size for bools in emit_alu
The special case for 32b bools on pre-a5xx gens was not taken into
account everywhere.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
cf395d1437 ir3: use rpt instructions for frag coord
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
6e6b338f33 ir3: add support for rpt bary.f/flat.b
These can be repeated like other instructions with one interesting
wrinkle: their immediate input location can also be repeated and its
value gets incremented by one for every repeat. They seem to be the only
instructions to support this.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
4a6d48cf4c ir3: enable load/store_const_ir3 vectorization
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
9998b65695 nir/load_store_vectorize: add load/store_const_ir3
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
db2859cb7f nir/load_store_vectorize: support stores without wrmask
Some store intrinsics (e.g., store_const_ir3) don't have a wrmask so
don't assume it always exists.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
97aefc4405 nir/load_store_vectorize: support non-byte offset
Some load/store intrinsics (e.g., load/store_const_ir3) use offsets in
units other than bytes. Until now, byte offsets were assumed in multiple
places.

This patch adds a new offset_scale field to intrinsic_info and uses it
where needed.
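
A minimal sketch of the idea, not the actual NIR code: offsets are
converted to bytes through a per-intrinsic scale before two accesses are
compared. The Access class, byte_offset and are_adjacent are hypothetical
names, and the 4-byte unit used below is only an assumed example value.

```python
from dataclasses import dataclass


@dataclass
class Access:
    offset: int        # offset in the intrinsic's own units
    offset_scale: int  # bytes per offset unit (1 for plain byte offsets)
    size_bytes: int    # number of bytes accessed


def byte_offset(access):
    """Convert an access's offset to bytes using its scale."""
    return access.offset * access.offset_scale


def are_adjacent(low, high):
    """True if 'high' starts exactly where 'low' ends, measured in bytes."""
    return byte_offset(low) + low.size_bytes == byte_offset(high)
```

With a scale of 4, offsets 0 and 1 describe adjacent 4-byte accesses;
treating the same offsets as bytes would make them look overlapping.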

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
fbd2c80671 ir3: rename @store_uniform_ir3 to @store_const_ir3
Uniforms are a legacy thing and this intrinsic was only used to store to
the const file so the new naming is less confusing.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
e0bad1dd20 ir3: replace @load_uniform by new @load_const_ir3 intrinsic
Uniforms are a legacy thing and this intrinsic was only used to load
from const registers so the new naming is less confusing.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
94c49b2cc3 ir3: add support for vectorized NIR phi nodes
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
6b611dbe79 nir/opt_vectorize: add support for phi nodes
Phi nodes are mostly handled the same way as ALU instructions: if all
sources point to the same def (which happens if they are scalar or have
been previously vectorized), combine them into a single vectorized phi
node.

There is one case where this doesn't work, however: sources that come
from a loop back-edge. Since their defs haven't been processed yet, they
are generally not the same. We could simply refuse to vectorize such
phi nodes but this could leave many values used in loops unnecessarily
scalarized.

Instead, this patch implements a simple heuristic: if all defs coming
from a back-edge have the same instruction type and, in case of ALU,
the same operation, assume they will be vectorized later. Since we
require that normal edges are vectorized already, chances are that the
back-edge can also be vectorized.
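
The heuristic described above could be sketched roughly as follows. This
is an illustration under simplifying assumptions, not the pass itself:
PhiSrc and can_vectorize_phis are hypothetical names, and every phi is
assumed to list its sources in the same predecessor order.

```python
from typing import NamedTuple, Optional


class PhiSrc(NamedTuple):
    def_id: int            # identifies the def this source points to
    instr_type: str        # e.g. "alu", "phi", "intrinsic"
    alu_op: Optional[str]  # only meaningful when instr_type == "alu"
    back_edge: bool        # True if the source comes from a loop back-edge


def can_vectorize_phis(phis):
    """phis: list of scalar phis, each a list of PhiSrc per predecessor."""
    for srcs in zip(*phis):
        if srcs[0].back_edge:
            # Defs are not processed yet: require matching instruction
            # type and, for ALU instructions, the same operation.
            kinds = {(s.instr_type, s.alu_op if s.instr_type == "alu" else None)
                     for s in srcs}
            if len(kinds) != 1:
                return False
        elif len({s.def_id for s in srcs}) != 1:
            # Normal edges must already point to the same def.
            return False
    return True
```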

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
79eb57de93 nir/opt_vectorize: process blocks in source-code order
To handle phi nodes, it's important that all sources have been processed
before processing the phi node itself. The current traversal order
(depth-first on dom_children) does not guarantee this. This patch
rewrites the pass to visit blocks in source-code order.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
b451575989 nir/opt_vectorize: prepare for multiple try_combine functions
Dispatch to different functions inside instr_try_combine to prepare for
upcoming support for phi nodes.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
e2cb646148 nir/opt_vectorize: move rewriting of uses to a function
Will be shared with upcoming support for phi nodes.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
58d18bc7a8 ir3: lower vectorized NIR instructions
Use the new repeat group builders to lower vectorized NIR instructions.
Add NIR pass to vectorize NIR before lowering.

Support for repeated instructions is added over a number of different
commits. Here's how they all tie together:

ir3 is a scalar architecture and as such most instructions cannot be
vectorized. However, many instructions support the (rptN) modifier that
allows us to mimic vector instructions. Whenever an instruction has the
(rptN) modifier set it will execute N more times, incrementing its
destination register for each repetition. Additionally, source registers
with the (r) flag set will also be incremented.

For example:

(rpt1)add.f r0.x, (r)r1.x, r2.x

is the same as:

add.f r0.x, r1.x, r2.x
add.f r0.y, r1.y, r2.x

The main benefit of using repeated instructions is a reduction in code
size. Since every iteration is still executed as a scalar instruction,
there's no direct benefit in terms of runtime. The only exception seems
to be for 3-source instructions pre-a7xx: if one of the sources is
constant (i.e., without the (r) flag), a repeated instruction executes
faster than the equivalent expanded sequence. Presumably, this is
because the ALU only has 2 register read ports. I have not been able to
measure this difference on a7xx though.

Support for repeated instructions consists of two parts. First, we need
to make sure NIR is (mostly) vectorized when translating to ir3. I have
not been able to find a way to keep NIR vectorized all the way and still
generate decent code. Therefore, I have taken the approach of
vectorizing the (scalarized) NIR right before translating it to ir3.

Secondly, ir3 needs to be adapted to ingest vectorized NIR and translate
it to repeated instructions. To this end, I have introduced the concept
of "repeat groups" to ir3. A repeat group is a group of instructions
that were produced from a vectorized NIR operation and linked together.
They are, however, still separate scalar instructions until quite late.

More concretely:
1. Instruction emission: for every vectorized NIR operation, emit
   separate scalar instructions for its components and link them
   together in a repeat group. For every instruction builder ir3_X, a
   new repeat builder ir3_X_rpt has been added to facilitate this.
2. Optimization passes: for now, repeat groups are completely ignored by
   optimizations.
3. Pre-RA: clean up repeat groups that can never be merged into an
   actual rptN instruction (e.g., because their instructions are not
   consecutive anymore). This ensures no useless merge sets will be
   created in the next step.
4. RA: create merge sets for the sources and defs of the instructions in
   repeat groups. This way, RA will try to allocate consecutive
   registers for them. This will not be forced though because we prefer
   to split up repeat groups over creating movs to reorder registers.
5. Post-RA: create actual rptN instructions for repeat groups where the
   allocated registers allow it.

The idea for step 2 is that we prefer that any potential optimizations
take precedence over creating rptN instructions as the latter will only
yield a code size benefit. However, it might be interesting to
investigate if we could make some optimizations repeat aware. For
example, the scheduler could try to schedule instructions of a repeat
group together.
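
As a rough illustration of step 5, the check for whether the allocated
registers allow a merge might look like the sketch below. It is not the
actual pass: try_merge is a hypothetical name, registers are modeled as
flat scalar indices, and the sketch only merges a whole group or
nothing, which the real pass is not required to do.

```python
def try_merge(group):
    """group: list of (dst, srcs) with registers as flat scalar indices.

    Returns (rpt, r_flags) if the group can become one rptN instruction,
    or None if the allocated registers do not allow it.
    """
    first_dst, first_srcs = group[0]
    r_flags = []
    for s in range(len(first_srcs)):
        regs = [srcs[s] for _, srcs in group]
        if all(reg == first_srcs[s] + i for i, reg in enumerate(regs)):
            r_flags.append(True)   # consecutive: use the (r) flag
        elif all(reg == first_srcs[s] for reg in regs):
            r_flags.append(False)  # same register in every repetition
        else:
            return None
    # Destinations must always be consecutive.
    if any(dst != first_dst + i for i, (dst, _) in enumerate(group)):
        return None
    return len(group) - 1, r_flags
```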

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
4c4366179b ir3: add post-RA pass to merge repeat groups into rptN instructions
For repeat groups where the register assignment allows it, merge them
into a single rptN instruction.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
c510b83a4d ir3: add pre-RA pass to clean up repeat groups
Clean up repeat groups that can never be merged into an actual rptN
instruction (e.g., because their instructions are not consecutive
anymore). This ensures no useless merge sets will be created for RA.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
4fcee235a6 ir3: make RA aware of repeat groups
Create merge sets for the sources and defs of the instructions in repeat
groups. This way, RA will try to allocate consecutive registers for
them. This will not be forced though because we prefer to split up
repeat groups over creating movs to reorder registers.

When choosing a register for a repeat group's merge set, if its merge
set is unique (i.e., only used for these repeated instructions), try to
first allocate one of their sources (for the same reason as for ALU/SFU
instructions). This also prevents us from allocating a new register
range for this merge set when the one from a source could be reused.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
a5b03fc316 ir3: add builders for repeated instructions
For every instruction builder ir3_X, this patch adds a new repeat builder
ir3_X_rpt to create a repeated version of an instruction.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
6aea957948 ir3: add backend support for repeated instructions
In order to represent repeated instructions (rptN) in ir3, this patch
introduces the concept of "repeat groups". A repeat group is a group of
instructions that were produced from a vectorized NIR operation and
linked together. They are, however, still separate scalar instructions.

Repeat groups are created by linking together multiple instructions
using a new rpt_node list. This patch adds this list as well as a number
of helper functions that can be used to create and manipulate repeat
groups.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
849005a471 ir3: print (sat) modifier of instructions
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
cd171964a6 ir3: add debug option to expand rpt instructions
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
ef162f9a6f ir3: correctly count vectorized instructions for tex prefetch
The tex prefetch heuristic simply counts the number of NIR instructions.
Since a vectorized NIR instruction expands to an ir3 instruction per
component, we have to take this into account while counting them.
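
A minimal sketch of the adjusted count, assuming each NIR instruction is
represented only by its number of components (count_ir3_instrs is a
hypothetical name, not a function from the driver):

```python
def count_ir3_instrs(num_components_per_instr):
    """Count ir3 instructions: one per component of each NIR instruction."""
    return sum(num_components_per_instr)
```

For one scalar, one vec4 and one vec2 instruction this gives 7 instead
of the 3 that a plain NIR instruction count would report.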

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
fe09ea29b9 ir3: fix counting of repeated registers
(r) registers also have their wrmask set so the instruction's rpt field
should not be taken into account.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
ddb0f5f4e6 ir3: fix wrong dstn used in postsched
Fixes: 750e6843c0 ("ir3: Rewrite postsched dependency handling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:26 +00:00
Job Noorman
28d2a27030 ir3: fix clearing merge sets after shared RA
After spilling during regular RA, merge sets need to be fixed up. To
find all merge sets, fixup_merge_sets used ra_foreach_dst. However,
after shared RA has run, shared dsts wouldn't have the IR3_REG_SSA flag
set anymore leaving their merge sets lingering. This patch fixes this by
using foreach_dst instead.

Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:26 +00:00
Job Noorman
9013e11d8c ir3: update merge set affinity in shared RA
The preferred register for merge sets was not updated after allocating
one. This caused a new merge set to be allocated for every register it
contains. This patch fixes this by reusing the update function from the
standard RA.

Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:26 +00:00
Connor Abbott
de1d36d054 ci: Uprev VK-CTS to 1.3.9.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29766>
2024-08-15 09:01:26 +00:00
Connor Abbott
bc1521e601 ci: Move two failing loader-related tests to all-skips.txt
There's no value in running these tests in CI until the loader is upgraded,
so don't force every driver to add them to their fails list.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29766>
2024-08-15 09:01:26 +00:00
Connor Abbott
f146c1d562 freedreno/ci: Combine and document failures due to test bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29766>
2024-08-15 09:01:26 +00:00
Pavel Ondračka
a1a06f386e r300: fix RGB10_A2 CONSTANT_COLOR blending
Just reverse the color order the same way we do for RGBA8.

Fixes: 910bac63df
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30656>
2024-08-15 07:02:44 +00:00
David Rosca
4b60918138 radeonsi: Don't allow DCC for encode in is_video_target_buffer_supported
This accidentally allowed DCC with format conversion, which is not supported.
Also disable EFC with VCN5 for now.

Fixes: 40c3a53fec ("radeonsi: Implement is_video_target_buffer_supported")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30562>
2024-08-15 06:26:16 +00:00
David Rosca
79ce0e3b2f frontends/va: Fix use after free with EFC
This happens when the source surface is destroyed before being used
in encoding operation. It also needs to disable EFC in this case.

Fixes: a7469a9ffd ("frontends/va: Rework EFC logic")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11653
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30562>
2024-08-15 06:26:16 +00:00
Eric Engestrom
1f34eb527c ci/build: reuse alpine llvm version to make sure it stays coherent
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30657>
2024-08-15 02:17:44 +00:00
Eric Engestrom
34aba675aa ci/container: define LLVM_VERSION in the alpine container job
Instead of allowing defining it in the job, but then not doing that.

The alternative being to delete only the dead `${LLVM_VERSION:=` and `}`
parts, but this way allows for the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30657>
2024-08-15 02:17:44 +00:00
Caio Oliveira
2150bc6d80 intel/brw: Use %td format for pointer difference
Fixes build for 32-bit, again.

Fixes: e72bf2d02f ("intel: Add executor tool")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30669>
2024-08-14 17:28:41 -07:00
Caio Oliveira
8a44b4812a intel/executor: Use PRIx64 to fix building in 32-bit
Fixes: e72bf2d02f ("intel: Add executor tool")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30668>
2024-08-14 21:41:28 +00:00
Eric Engestrom
ecad4eaeda docs: add sha256sum for 24.1.6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30667>
2024-08-14 20:48:19 +02:00
Eric Engestrom
3de0b1f7d7 docs: add release notes for 24.1.6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30667>
2024-08-14 20:30:29 +02:00
Eric Engestrom
409e4b09f7 docs: update calendar for 24.1.6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30667>
2024-08-14 20:29:56 +02:00
Eric Engestrom
3a0bb4c9fa docs: add sha256sum for 24.2.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30664>
2024-08-14 19:23:38 +02:00
Eric Engestrom
0b3a2a6285 docs: add release notes for 24.2.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30664>
2024-08-14 19:23:37 +02:00
Eric Engestrom
08c34b00df docs: update calendar for 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30664>
2024-08-14 19:09:02 +02:00
Karol Herbst
5d0c870c00 rusticl/mem: do not check against image base alignment for 1Dbuffer images
The CL cap in question is only valid for 2D images created from buffer.

Fixes: 20c90fed5a ("rusticl: added")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30655>
2024-08-14 15:33:01 +00:00
WANG Xuerui
cc2dbb8ea5 meson: Additionally probe -mtls-dialect=desc for TLSDESC support
Previously only `-mtls-dialect=gnu2` was probed, which was appropriate
for arm, x86 and x86_64, but not for newer architectures such as
aarch64, loongarch64 and riscv64 which all use `-mtls-dialect=desc`
instead. Because the driver option is not consistent across
architectures (and probably never will be), try both variants and choose the
first one that works.

While at it, rename "gnu2_*" variables to "tlsdesc_*" respectively, for
clarity.

Cc: mesa-stable
Reviewed-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Yukari Chiba <i@0x7f.cc>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30599>
2024-08-14 14:47:40 +00:00
WANG Xuerui
56f38672a2 meson: Force use of LLVM ORCJIT for hosts without MCJIT support
Although the ORCJIT codepath is fresh and relatively less tested, this
is still better than no llvmpipe at all for those newer architectures
that will not gain MCJIT support, such as LoongArch or RISC-V.

Fixes: 6f02ec5ed1 ("llvmpipe: add an implementation with llvm orcjit")
Reviewed-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Yukari Chiba <i@0x7f.cc>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30599>
2024-08-14 14:47:40 +00:00