Now that we have the block's final cycle value available in its state,
we don't have to subtract it at the end of a block anymore, but we can
do it at the beginning when merging it into its successor state. This
will save us one iteration over all its ready slots.
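The merge described above can be sketched roughly as follows. This is a minimal illustration, not the ir3 code: `struct delay_state`, `NUM_SLOTS` and `merge_pred_state` are hypothetical names, and the real state tracks far more than four slots.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SLOTS 4 /* hypothetical; chosen only to keep the sketch small */

/* Hypothetical per-block state: the block's final cycle plus the cycle at
 * which each slot becomes ready, both counted from the start of the block. */
struct delay_state {
   unsigned cycle;
   unsigned ready[NUM_SLOTS];
};

/* Merge a predecessor's end state into the start state of its successor.
 * The predecessor's ready slots are still expressed in its own cycle count,
 * so they are rebased onto the successor's cycle 0 while merging instead of
 * in a separate pass at the end of the predecessor. */
static void
merge_pred_state(struct delay_state *succ, const struct delay_state *pred)
{
   assert(succ != pred);

   for (unsigned i = 0; i < NUM_SLOTS; i++) {
      /* Slots that became ready before the predecessor ended need no wait. */
      unsigned remaining =
         pred->ready[i] > pred->cycle ? pred->ready[i] - pred->cycle : 0;

      /* Keep the worst case over all predecessors merged so far. */
      if (remaining > succ->ready[i])
         succ->ready[i] = remaining;
   }
}
```

Calling `merge_pred_state` once per predecessor on a zero-initialized successor state yields the successor's initial ready slots.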
Signed-off-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34108>
Having the cycle as part of the state will become convenient for two
reasons:
- It will allow us to merge the state of predecessors without having to
  normalize states at the end of blocks (i.e., we now have to subtract
  the block's final cycle value from its ready slots at the end of the
  block; having its final cycle value available in its state will allow
  us to do this when merging predecessor states at the start of the
  block).
- We can update the cycle value as part of delay/sync state update
  routines. This way, the user doesn't have to worry about which
  instructions should actually update the cycle as this logic is nicely
  encapsulated.
This is part of the preparation for making the delay/sync legalization
logic available outside of ir3_legalize.
Signed-off-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34108>
Resetting the ss/sy delays at the start of blocks would underestimate
the actual delays at runtime. Make the estimate more accurate by keeping
track of outstanding delays at the end of blocks and setting the initial
delays of blocks to the maximum of their predecessor blocks.
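As a rough sketch of the estimate, assuming each block records its outstanding ss and sy delays when it ends (`struct block_delays` and `initial_delays` are hypothetical names, not the actual ir3 ones):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the delays still outstanding at the end of a block. */
struct block_delays {
   unsigned ss;
   unsigned sy;
};

/* Initial delays of a block: the maximum over all of its predecessors.
 * A block without predecessors starts with no outstanding delay. */
static struct block_delays
initial_delays(const struct block_delays *pred_end, unsigned num_preds)
{
   assert(pred_end != NULL || num_preds == 0);

   struct block_delays start = { 0, 0 };
   for (unsigned i = 0; i < num_preds; i++) {
      if (pred_end[i].ss > start.ss)
         start.ss = pred_end[i].ss;
      if (pred_end[i].sy > start.sy)
         start.sy = pred_end[i].sy;
   }
   return start;
}
```

Taking the maximum is conservative: whichever predecessor was actually executed, the block's initial delays are at least as large as what that predecessor left outstanding.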
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34108>
This commit adds most operations to enable compute and basic draw tasks
for KeplerB (also known as Kepler 2.0, chips GK110 to GK180 or codename
NVF0-NVF1). Major aspects such as textures, surfaces, shared atomics
and scheduling still need work and will be added in other commits.
Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34329>
Kepler and earlier GPUs do not support the ISBERD instruction but have a
different VILD (Vertex Indirect Load) instruction that provides less
functionality. This commit adds support for the op in nak and nir,
needed for the upcoming encoder commit.
Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34329>
This patch changes the update code to launch 8 invocations for every
internal node. The internal nodes update their child leaf nodes using
the geometry index and primitive index stored inside the primitive node.
Processing 8 child nodes in parallel is faster than looping over them.
Moving to one dispatch that updates all nodes in one go lets us get rid
of atomics and will also enable updatable BVHs to use pair compression.
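The dispatch layout can be pictured with the following sketch. It only illustrates the index arithmetic, assuming a flat one-dimensional dispatch; the names are hypothetical and the actual update shader may map invocations differently.

```c
#include <assert.h>
#include <stdint.h>

#define CHILDREN_PER_NODE 8u

/* Hypothetical description of the work assigned to one invocation. */
struct update_target {
   uint32_t internal_node; /* index of the internal node being updated */
   uint32_t child_slot;    /* which of its 8 children this invocation handles */
};

/* Number of invocations to launch for a given number of internal nodes. */
static uint32_t
invocation_count(uint32_t num_internal_nodes)
{
   assert(num_internal_nodes <= UINT32_MAX / CHILDREN_PER_NODE);
   return num_internal_nodes * CHILDREN_PER_NODE;
}

/* Map a flat invocation index to the node and child it processes, so that
 * 8 consecutive invocations cooperate on one internal node. */
static struct update_target
map_invocation(uint32_t global_id)
{
   struct update_target t = {
      .internal_node = global_id / CHILDREN_PER_NODE,
      .child_slot = global_id % CHILDREN_PER_NODE,
   };
   return t;
}
```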
Improves Elden Ring (high settings, max RT settings, 1080p) by around
10%.
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
During the maintenance, and probably in the MR preceding the first
attempt to merge, a failing test was fixed without testing on fluster
because the farm was disabled.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34922>
The maximal length is 65535 as an offset of 16 bits is being used to encode it.
Afterwards in VIRGL_CMD0, the buf_len equals 65536, so buf_len << 16 overflows its type which is uint32_t.
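The overflow can be reproduced in isolation. The sketch below uses a hypothetical `pack_header` helper as a stand-in for a header word that keeps the length in its upper 16 bits; it is not the virgl macro itself.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ENCODABLE_LEN UINT16_MAX /* 65535: the length field is 16 bits wide */

/* Hypothetical header layout: command in the low 16 bits, length in the
 * upper 16 bits of a 32-bit word. */
static uint32_t
pack_header(uint32_t cmd, uint32_t len)
{
   assert(len <= MAX_ENCODABLE_LEN);
   return (cmd & 0xffffu) | (len << 16);
}

/* What the length field would hold if no range check were applied. */
static uint32_t
unchecked_len_field(uint32_t len)
{
   return (len << 16) >> 16;
}
```

With a length of 65536, `len << 16` equals 2^32, which wraps to 0 in a `uint32_t`, so `unchecked_len_field(65536)` returns 0 and the length is silently lost. Capping the length at 65535 keeps the shifted value within 32 bits.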
CID: 1604743 Overflowed constant
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34816>
Do not override the handle number with 0 if we fail to create a new resource.
Also make sure to store the handle consistently in a uint32_t.
CID: 1644460 Overflowed constant
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34816>
For coherent/volatile access, this would be too high for vector access.
Even when we didn't set the alignment, LLVM seemed to assume too high of
an alignment for 8/16-bit vector access.
Fixes generated_tests/cl/vload/vload-char-constant.cl
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
This assumes that the start of the load is 32-bit aligned.
For example, a vec3 16-bit store with align_offset=2 should split off the
first component, not the last.
This probably also fixed splitting with 8-bit stores.
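One way to read the vec3 example: the components before the next 32-bit boundary are the ones to split off. The helper below is a hypothetical sketch of that arithmetic, assuming `align_offset` is the byte offset of the access from a 4-byte boundary and is a multiple of the component size; it is not the lowering code itself.

```c
#include <assert.h>

/* Number of leading components to split off so that the remaining
 * components start on a 32-bit boundary. Hypothetical helper. */
static unsigned
leading_components_to_split(unsigned comp_bytes, unsigned num_comps,
                            unsigned align_offset)
{
   assert(comp_bytes == 1 || comp_bytes == 2);
   assert(align_offset % comp_bytes == 0);

   /* Bytes between the start of the access and the next 4-byte boundary. */
   unsigned bytes_to_boundary = (4 - align_offset % 4) % 4;
   unsigned lead = bytes_to_boundary / comp_bytes;

   /* Never split off more components than the access has. */
   return lead < num_comps ? lead : num_comps;
}
```

For a vec3 of 16-bit components with `align_offset=2`, this gives one leading component, leaving two components that together form an aligned 32-bit access.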
Fixes arb_copy_buffer-overlap
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
This handles basic operations where clang promotes integers to 32 bits
according to the C99 spec in OpenCL C source code.
This is its own opt_algebraic pass, because we don't wanna fight with
nir_lower_bit_size.
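To illustrate why such promoted operations can be narrowed, the sketch below compares an 8-bit add carried out through 32 bits with a model of a native 8-bit add. The function names are hypothetical and only the add case is shown; which operations the pass actually covers is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* What the promoted source computes: widen, add in 32 bits, truncate. */
static uint8_t
add_via_32bit(uint8_t a, uint8_t b)
{
   uint32_t wide = (uint32_t)a + (uint32_t)b;
   return (uint8_t)wide;
}

/* Model of a native 8-bit add: arithmetic modulo 2^8. */
static uint8_t
add_native_8bit(uint8_t a, uint8_t b)
{
   return (uint8_t)((a + b) % 256);
}

/* Exhaustively check that both forms agree for every pair of inputs. */
static void
check_narrowing_add(void)
{
   for (unsigned a = 0; a < 256; a++)
      for (unsigned b = 0; b < 256; b++)
         assert(add_via_32bit((uint8_t)a, (uint8_t)b) ==
                add_native_8bit((uint8_t)a, (uint8_t)b));
}
```

Because the low 8 bits of the sum do not depend on the upper bits of the operands, the widen/add/truncate sequence can be replaced by a single 8-bit add.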
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34641>
Storage access to images using LEA_TEX[_IMM] has limitations on some
fields in the texture descriptors, making them incompatible with the
descriptors required for texture access, specifically in the case of
non-zero levels.
This change sets up two sets of texture descriptors for image views of
storage images, then picks the correct one when writing the image view
descriptors.
Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>
We're currently not setting the v10+ width/height in the plane
descriptors. This change ensures we do.
Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>