If the caller passed in the same DRM file description, use it for sws->fd
as well. This is simpler than the previously reverted commit and also
fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/12208.
v2:
* Move fallback sws->fd assignment to proper scope, fixes CI failures.
* Remove close(sws->fd) from the amdgpu_winsys_create failure path; it can
never be a valid file descriptor != aws->fd there.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32377>
If the condition of the loop terminator is based on an unsigned value, we
can in some cases find the maximum number of possible loop trips. With the
maximum trip count known, a complex unroll can unroll the loop.
For example:
   uniform uint x;
   uint i = x;
   while (true) {
      if (i >= 4)
         break;
      i += 6;
   }
The above loop can be unrolled even though we don't know the initial
value of the induction variable because it can have at most 1 iteration.
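The bound in the example above can be sketched in C as follows. This is
only an illustration of the reasoning, not the code in the NIR loop
analysis; the helper name max_trips_unsigned and its parameters are
hypothetical. It assumes a loop of the form "if (i >= limit) break;
i += step;" with an unknown unsigned start value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on the number of times the body of
 *
 *    while (true) { if (i >= limit) break; i += step; }
 *
 * can run when the initial value of the unsigned i is unknown.
 * Returns false when no bound can be derived this way.
 */
static bool
max_trips_unsigned(uint32_t limit, uint32_t step, uint32_t *max_trips)
{
   if (step == 0)
      return false; /* i never changes, the loop may never terminate */

   if (limit == 0) {
      *max_trips = 0; /* i >= 0 always holds, the body never runs */
      return true;
   }

   /* The largest value that still enters the body is limit - 1. If adding
    * step to it can wrap around, i may drop below limit again and the
    * simple bound below no longer holds.
    */
   if (limit - 1 > UINT32_MAX - step)
      return false;

   /* Worst case is i == 0: ceil(limit / step) trips. */
   *max_trips = limit / step + (limit % step != 0);
   return true;
}
```

For limit 4 and step 6 this yields a bound of 1, matching the example.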
There were no changes with my shader-db collection. Change was inspired
by MR #31312 where builtin shader code failed to unroll.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31701>
This is mostly adapted from radv's BVH building. This defines a common
"IR" for BVH trees, two algorithms for constructing it, and a callback
that the driver implements for encoding. The framework takes care of
parallelizing the different passes, so the driver just has to split the
encoding process into "stages" and implement just one part for each
stage.
The runtime changes are:
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
The radv changes are:
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
RADV has a pipeline cache for meta shaders that can be used. It is also
required to correctly identify the pipelines as meta pipelines.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
All of these are functions that could reasonably be incorporated into a
Vulkan extension, but are currently missing. While we could in theory do
BVH building without them, using them simplifies the code significantly
and both radv and turnip can reasonably implement them.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
All functions that used to take an ir3_block as argument to append
instructions to now take an ir3_builder as argument.
Add an ir3_builder field to ir3_context and replace all uses of
ir3_context::block for creating instructions with ir3_context::build.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32088>
During instruction selection, instructions are sometimes emitted to
blocks other than the current one. For example, to predecessor blocks
for phi sources or to the first block for inputs. For those cases, a new
builder is created to emit at the end of the target block. However, if
the target block happens to be the same as the current block, the main
builder would not be updated to point past the new instructions.
Therefore, don't update the cursor when it points to the end of a block
to ensure that new instructions will always be added at the end.
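The effect of an end-of-block cursor can be illustrated with a toy
builder. All types and names below (struct block, struct builder,
builder_emit) are hypothetical stand-ins and not the ir3 API; the sketch
only assumes that a cursor either follows a specific instruction or
points at the end of a block.

```c
#include <stddef.h>

#define MAX_INSTRS 16

/* Toy block: instructions are just integer ids. */
struct block {
   int instrs[MAX_INSTRS];
   size_t count;
};

enum cursor_kind {
   CURSOR_AFTER_INDEX, /* insert after instrs[index] */
   CURSOR_BLOCK_END,   /* always insert at the current end of the block */
};

struct builder {
   struct block *block;
   enum cursor_kind kind;
   size_t index; /* only used for CURSOR_AFTER_INDEX */
};

static void
builder_emit(struct builder *b, int instr)
{
   struct block *blk = b->block;
   size_t pos = b->kind == CURSOR_BLOCK_END ? blk->count : b->index + 1;

   if (blk->count == MAX_INSTRS || pos > blk->count)
      return;

   /* Make room at pos and insert. */
   for (size_t i = blk->count; i > pos; i--)
      blk->instrs[i] = blk->instrs[i - 1];
   blk->instrs[pos] = instr;
   blk->count++;

   /* Only a cursor that follows a specific instruction is advanced. An
    * end-of-block cursor is left alone, so it stays valid even if another
    * builder appended to the same block in the meantime.
    */
   if (b->kind == CURSOR_AFTER_INDEX)
      b->index = pos;
}
```

With two end-of-block builders on the same block, emitting 1 through the
first, 2 through the second and 3 through the first again gives the
order 1, 2, 3.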
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32088>
For shadow rays (terminate on first hit), use the box with the largest ray
interval, as it maximizes the probability of finding some object in that
box. For reflection (closest hit) rays, use the midpoint instead, which
defers processing of larger boxes that contain the ray origin in favor of
smaller boxes closer to the origin.
Since the sorting mode must be uniform, when the terminate_on_first_hit
flag is divergent, we leave it as closest.
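The two heuristics can be written as a sort key over the ray interval
[t_near, t_far] of each child box. This is a conceptual sketch only:
the function names and the "smaller key is visited first" convention
are assumptions for illustration and do not describe how the mode is
actually programmed.

```c
#include <stdbool.h>

/* Hypothetical: the first-hit mode is only used when the flag is uniform
 * across the subgroup; a divergent flag falls back to closest-hit sorting.
 */
static bool
use_first_hit_sorting(bool terminate_on_first_hit, bool flag_is_uniform)
{
   return flag_is_uniform && terminate_on_first_hit;
}

/* Hypothetical sort key, smaller keys are visited first. */
static float
child_sort_key(float t_near, float t_far, bool first_hit_sorting)
{
   if (first_hit_sorting) {
      /* Shadow rays: prefer the box with the largest ray interval. */
      return -(t_far - t_near);
   }

   /* Closest-hit rays: prefer the box whose interval midpoint is nearest. */
   return 0.5f * (t_near + t_far);
}
```

A large box containing the origin, for example [0, 10], is visited before
a small box [4, 5] in first-hit mode and after it in closest-hit mode.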
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32416>
This allows `gbm_bo_get_offset()` to return the correct offset for e.g.
the second plane of a resource with the NV12 format. Crucially this
fixes direct scanout / hardware plane usage in Mutter and possibly other
clients.
While at it, also add support for stride, modifier and n_planes queries.
The latter two should not change in behavior and just save a few CPU
cycles. The stride query support in theory fixes queries for multi-plane
formats; however, in practice most/all currently used formats such as NV12,
P010 and YUV420 use the same stride for all planes.
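To show why the per-plane offset matters, here is a sketch of a simple
linear NV12 layout. The helper and struct names are hypothetical, and
the layout (chroma plane placed directly after the luma plane, a single
stride alignment) is an assumption; real drivers may add padding or use
tiled modifiers.

```c
#include <stdint.h>

struct plane_layout {
   uint32_t offset; /* byte offset of the plane within the buffer */
   uint32_t stride; /* bytes per row */
};

/* Hypothetical helper: 8-bit NV12 in a linear buffer.
 * Plane 0 is Y (1 byte per pixel), plane 1 is interleaved UV at half
 * resolution (2 bytes per 2x2 block), so both rows are width bytes wide.
 */
static void
nv12_linear_layout(uint32_t width, uint32_t height, uint32_t stride_align,
                   struct plane_layout out[2])
{
   if (stride_align == 0)
      stride_align = 1;

   /* Round the row size up to the requested alignment. */
   uint32_t stride = (width + stride_align - 1) / stride_align * stride_align;

   out[0].offset = 0;
   out[0].stride = stride;

   /* The UV plane starts after all Y rows, so its offset is non-zero. A
    * query that always returns the offset of plane 0 would hand the
    * compositor the wrong address for chroma.
    */
   out[1].offset = stride * height;
   out[1].stride = stride;
}
```

For a 1920x1080 buffer with 64-byte stride alignment, the second plane
starts at byte 2073600 and both planes share a stride of 1920.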
Cc: mesa-stable
Acked-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32282>
This is implemented differently depending on the cluster size:
- At most 8: in this case, executing brcst.active will leave the
  reduction in the last invocation of each cluster. Simply iterate the
  clusters and broadcast the last invocation to the rest.
- Otherwise, also iterate the clusters but execute the usual
  reduce_clusters_ir3 loop for each of them.
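As a reference for what the lowering has to compute, the following CPU
model mirrors the first case: the reduction ends up in the last
invocation of each cluster and is then broadcast to the rest. The
function name is hypothetical, the operation is fixed to integer add,
and all invocations are assumed to be active.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU model of a clustered add-reduce over one subgroup. */
static void
clustered_reduce_add(const uint32_t *in, uint32_t *out, size_t subgroup_size,
                     size_t cluster_size)
{
   if (cluster_size == 0)
      return;

   for (size_t base = 0; base < subgroup_size; base += cluster_size) {
      size_t end = base + cluster_size;
      if (end > subgroup_size)
         end = subgroup_size;

      /* Step 1: running sum, so the last invocation of the cluster holds
       * the full reduction.
       */
      uint32_t running = 0;
      for (size_t i = base; i < end; i++) {
         running += in[i];
         out[i] = running;
      }

      /* Step 2: broadcast the last invocation to the rest of the cluster. */
      for (size_t i = base; i < end; i++)
         out[i] = out[end - 1];
   }
}
```

With inputs 1..8 and a cluster size of 4, the first four invocations get
10 and the last four get 26.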
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
lower_boolean_reduce only supports ballot_components == 1. Fall back to
lower_scan_reduce when this is not the case.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
It might be convenient for filter implementations to have access to
extra information. This will be used, for example, by ir3 to access
compiler features.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
Like read_first_invocation but using getlast. Note that I intentionally
used the name of the ir3 instruction in the name as its semantics are
tricky to exactly describe otherwise.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
Now that we use shfl for lowering shuffle operations, the generic
lowering of scan/reduce to shuffles results in faster code than our
custom loop for 64b operations.
Note that this was measured using a micro benchmark on full subgroups.
The generic lowering might be slower when not all invocations are active
but this should be a rare case.
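For readers unfamiliar with shuffle-based reductions, the general idea
can be modelled on the CPU as a butterfly over XOR-shuffled lanes. This
is a sketch of the well-known technique, not a copy of the generic NIR
lowering; the subgroup size of 8 and the function name are assumptions,
and all lanes are treated as active.

```c
#include <stdint.h>

#define SUBGROUP_SIZE 8 /* assumed, must be a power of two */

/* Butterfly add-reduce: after log2(SUBGROUP_SIZE) steps every lane holds
 * the sum of all lanes. Each step models one shuffle-xor plus one add.
 */
static void
butterfly_reduce_add(uint64_t v[SUBGROUP_SIZE])
{
   for (unsigned offset = 1; offset < SUBGROUP_SIZE; offset <<= 1) {
      uint64_t shuffled[SUBGROUP_SIZE];

      /* Every lane reads the value of its partner lane. */
      for (unsigned i = 0; i < SUBGROUP_SIZE; i++)
         shuffled[i] = v[i ^ offset];

      for (unsigned i = 0; i < SUBGROUP_SIZE; i++)
         v[i] += shuffled[i];
   }
}
```

The number of steps is fixed by the subgroup size, independent of how
many lanes are active, which is consistent with the note above that
partially active subgroups might favor the custom loop.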
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
Some targets (e.g., ir3) don't always know the exact subgroup size.
Calculate the maximum subgroup size in that case by multiplying
ballot_components and ballot_bit_size.
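The calculation amounts to the following. The struct is a hypothetical
stand-in for the lowering options, and treating subgroup_size == 0 as
"unknown" is an assumption made for this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the subgroup lowering options. */
struct subgroup_options {
   uint8_t subgroup_size;     /* 0 is assumed to mean "not known" */
   uint8_t ballot_components; /* number of components in a ballot value */
   uint8_t ballot_bit_size;   /* bits per ballot component */
};

static unsigned
max_subgroup_size(const struct subgroup_options *opts)
{
   if (opts->subgroup_size != 0)
      return opts->subgroup_size;

   /* A ballot has one bit per invocation, so its total width bounds the
    * subgroup size.
    */
   return (unsigned)opts->ballot_components * opts->ballot_bit_size;
}
```

For example, 4 components of 32 bits give a maximum of 128 invocations.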
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
Tackles cases where mod propagation candidate ops have a restriction on one
of their sources but are commutative, thus allowing the restriction to be
worked around by swapping the sources.
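A minimal sketch of the idea, with a hypothetical two-source op
representation (struct alu_op and try_fold_mod are illustration names,
not the compiler's own data structures):

```c
#include <stdbool.h>

/* Hypothetical two-source ALU op. */
struct alu_op {
   bool commutative;
   bool src_accepts_mod[2]; /* per-slot restriction on source modifiers */
   int src[2];              /* source value ids */
   bool src_mod[2];         /* whether a modifier is applied to the slot */
};

/* Try to fold a modifier onto the value currently in slot idx. */
static bool
try_fold_mod(struct alu_op *op, unsigned idx)
{
   if (idx > 1)
      return false;

   if (op->src_accepts_mod[idx]) {
      op->src_mod[idx] = true;
      return true;
   }

   /* The slot is restricted. If the op is commutative and the other slot
    * accepts modifiers, swap the sources so the value lands there. The
    * other value must not carry a modifier itself, since it moves into
    * the restricted slot.
    */
   unsigned other = idx ^ 1;
   if (op->commutative && op->src_accepts_mod[other] && !op->src_mod[other]) {
      int tmp = op->src[idx];
      op->src[idx] = op->src[other];
      op->src[other] = tmp;
      op->src_mod[other] = true;
      return true;
   }

   return false;
}
```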
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32258>
Basic vertex/fragment shader I/O and sysval allocation rewritten to use
the new compiler/driver interface, with allocation moved entirely into
the driver.
RHW coeffs now only emitted when required.
Boilerplate support for converting formats for vs inputs/fs outputs.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32258>