Somehow the update of the fence value to write was dropped, so the
cmdstream that wrote the fence value would simply write zero over and
over again.
Fixes: 84d6eedd5e ("tu: Refactor the submit path")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
(cherry picked from commit 081869e591)
Shared RA might insert new defs to be handled by regular RA (e.g.,
shared spills). However, their interval offsets were not initialized
which caused their intervals to sometimes be mistakenly matched with
those containing offset 0. Fix this by calling index_merge_sets after
shared RA and modifying that function to only index new defs in that
case.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
(cherry picked from commit a0db2f9737)
With "classic" renderpasses, the VkFramebuffer's layerCount must be 1 if
multiview is enabled. We accidentally rely on this to not disable GMEM
for multiview, and possibly for other things too. Apparently the dynamic
rendering equivalent, VkRenderingInfo::layerCount, can be anything when
multiview is enabled, and some CTS tests set it to the number of views.
Sanitize it when constructing the internal framebuffer for dynamic
rendering.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34080>
(cherry picked from commit 15660caa90)
These changes happened with no mesa code change, only infrastructure
changes, which is really weird, but to be able to move on, let's simply
document the "new normal".
(Was missed in 69d6923cdb)
We were never setting has_multiview. It's not actually necessary anyway,
since we can just do the optimization we were trying to do whenever
num_views is 1 instead.
This doesn't affect the actual fragment size, which was already correct,
only gl_FragSizeEXT.
Fixes: 6f2be52487 ("tu, ir3: Handle FDM shader builtins")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33991>
(cherry picked from commit 8864ee7b0f)
When updating delays, we'd update all dst regs based on reg_elems.
However, when wrmask has gaps, this would update delays for regs that
aren't actually written. Fix this by skipping regs for which the
corresponding wrmask bit is zero.
Note that this wasn't just a performance issue but could result in
illegal code because the delay is reset to zero for tex/sfu
instructions. For example, the following (post-legalization) code was
observed in the wild:
(rpt1)add.f r1.w, (r)r2.w, (r)c3.z
sam.base0 (f32)(w)r2.x, r3.y, s#0, t#1
rcp r2.x, r2.x
Here, the add would result in a required delay for r2.x which would then
be cleared by the sam (even though it doesn't write to it), resulting in
insufficient delay before the rcp.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 61b2bd861f ("ir3: Rewrite nop insertion")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34107>
(cherry picked from commit 84dbd34332)
It is expected that inputs and prefetches are always in the first block.
However, ir3_create_empty_preamble would create blocks before the first
one, leaving inputs after the preamble. This causes issues with
(probably among others) spilling/RA where precolored inputs could
illegally reuse the spill base register.
Fixes RA validation failures on a7xx for
dEQP-VK.ray_query.multiple_ray_queries.vertex_shader
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f3026b3d3e ("ir3: add some preamble helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33977>
(cherry picked from commit c58ba21ba8)
When merging multiple instructions into one rpt instruction, the false
deps of the rpt instruction should be the union of the false deps of its
parts.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 4c4366179b ("ir3: add post-RA pass to merge repeat groups into rptN instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32454>
(cherry picked from commit 0f6ec14925)
We would set the `src` flag on the interval of reloaded sources.
However, the interval might be merged with its parent when inserted and
the parent wouldn't have this flag set. This caused the parent interval
to potentially be reused to reload later sources. Fix this by setting
the `src` flag on the top-level interval after insertion.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33810>
(cherry picked from commit 2d540b8074)
Turns out we were missing the glapi bits, making it impossible to use get
the function pointers for this extension. Whoops?!
[daniels: Squashed in a618 SkQP fails, presumably caused by these not
being skipped anymore.]
Fixes: 9f5af68995 ("mesa/main: expose `EXT_multi_draw_indirect`")
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
(cherry picked from commit fde6aeb886)
Apparently fast path cannot handle mismatched mutability and we
should use CP_BLIT which has SP_PS_2D_SRC_INFO.MUTABLEEN to signal
src mutability. Previously it was partially handled by
tu_attachment_store_mismatched_swap.
Fixes: a104a7ca1a
("tu: Handle non-identity GMEM swaps when resolving")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
(cherry picked from commit 97f851e7c5)
The fix in e7ac1094f6 to emit preamble defs in the correct block would
move the cursor of the builder that is later used to insert descriptor
prefetches, emitting them at the wrong place. Fix this by resetting the
cursor before emitting the prefetches.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: e7ac1094f6 ("ir3: rematerialize preamble defs in block dominated by sources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33399>
(cherry picked from commit 8404e7428b)
Currently the `nolrz` TU_DEBUG options are only checked during
device creation and image creation, respectively. This means that if
the options are enabled after the device/image is created, LRZ
will still be used. Similarly, if the options are disabled after the
device/image is created, LRZ will still be disabled.
This change moves the checks to the point where the LRZ is actually
used, allowing for runtime toggling of LRZ.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32906>
This adds a new environment variable, TU_DEBUG_FILE, which can be used to
enable/disable various debug options at runtime via writing to a file. This
is useful for switching between different debug options (such as toggling
between SYSMEM/GMEM) without needing to restart the application.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32906>
The division between color and d/s resolve groups appears to be
important. Not doing so causes corruption in some of games
when forcing gmem, e.g. in Arma, War Thunder.
Fixes: 25b73dff5a
("tu/a7xx: use concurrent resolve groups")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33169>
a630-gl takes just over 7 minutes on 4 DUTs, so we can safely reduce
the parallelism to 3 and stay within the time limit.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33082>
The a618-gl and a618-egl jobs are covered by a630-gl, which also
does egl testing, while a630-piglit is a more comprehensive
equivalent of a618-piglit, so we can de-duplicate these jobs.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33082>
Currently there are only 8 sm8350-hdk DUTs in LAVA, but there were
10 pre-merge jobs scheduled for them.
Add a full nightly job to cover the gaps.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33082>
This trace contains generated GL IDs from the time it was recorded,
making it invalid.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Mike Blumenkrantz <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33256>
LLVM 15 is pretty old, and notably not supported by either ANGLE nor
Skia anymore. So let's move up to LLVM 19 using packages provided by
LLVM themselves, apart from PPC and ARMv7 which don't have builds.
The Skia build now requires a bunch of new warning exclusions; hopefully
most of these are no longer needed when we can upgrade Skia shortly.
The ci-deb-repo revision has also been bumped to get us a new version of
xtensor which builds with LLVM 19, and a version of spirv-tools which
also works with LLVM 19.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Closes: mesa/mesa#11538
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33137>
cat3 instructions read their 3rd src later than their first two srcs.
Pre-a7xx, this was only supported for mad(sh) but on a7xx, it works for
all cat3 instructions.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33183>
cat3 instructions read their 3rd src later than their first two srcs.
This was implemented in two different places: once for scheduling and
once for legalization. Extract this logic in a new helper and also add
similar logic for gat/swz there (which the scheduling logic failed to
account for).
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33183>
Preamble defs were rematerialized at the end of the preamble. However,
when some of the sources were defined inside control flow, this would
lead to these sources not dominating their use. Fix this by finding the
block that is dominated by all sources and inserting the new instruction
there.
Also make sure we only de-duplicate instructions if the new instruction
is dominated by the existing one.
Fixes a NIR validation error in Devil may cry 5.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fdfe86aa52 ("ir3: Expand preamble rematerialization")
Fixes: 6a744ddebc ("ir3: Initial support for pushing globals with ldg.k")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33270>
This was accidentally deleted when rewriting to use common Vulkan
dynamic state. This meant we wouldn't correctly fall back when someone
accidentally used FDM together with multiple viewports.
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33241>
Scheduling an alias.rt right before an alias.tex causes a GPU hang.
Follow the blob and schedule all alias.rt at the end of the preamble to
prevent this from happening.
Fixes a hang in Borderlands 3 on medium or higher graphics settings.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 0aa9678d4d ("ir3: add support for alias.rt")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33238>
Due to the slow startup time of deqp-vk, the previous default of
500 tests per group caused the jobs to run up to twice as slowly
compared to using a higher number of tests per group.
Since the CTS uprev did not introduce any new leaks that caused
the entire caselist to get marked as fails, increasing the number
of tests per group is considered safe and improves efficiency.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32817>
After the fix in commit b016f218fb (ci/android: fix meson C++
cross-compiler argument detection, 2025-01-14) the cross-build meson
hacks should not be necessary anymore for the CI builds to pass.
Remove the hacks making sure that the removed arguments are all in the
regular list of arguments passed through cpp.get_supported_arguments()
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33209>
This kernel was built using b2c's linux builder without any additional
patches... aside from the usual ones I apply for b2c, so we are almost
to the point where we can run a stable kernel on it!
The next uprev should be made from gfx-ci/linux rather than being
hosted on my server, but I would like to use a released kernel for
that.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33162>