This is an old driver bug that could cause Z corruption on gfx8-11.5.
v2: handle allow_expclear differently
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>
Applications often miss emitting barriers between a shader
initializing data & another shader writing data in the same location
afterward. This is very common for UAVs (see vkd3d-proton).
Vkd3d-proton does a pretty good job as inserting missing barriers
between UAV clears & writes. But some applications also have similar
issues with custom shaders. Here we introduce an analysis pass that
recognize shaders doing clear/initialization. We'll use that
information in the following commit to insert barriers after those
shaders.
Since Gfx12.5 our HW has become a lot more sensitive to those issues
due to the introduction of an L1 untyped data cache that is not
coherent across the shader units. On Gfx20+, typed data is also L1
cacheable exposing even more issues.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40187>
This doesn't have a functional impact, but when this command is run:
$ src/intel/genxml/genxml_import.py --import
It causes the gen125_rt.xml file to be processed in the expected
order, before xe2.xml.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40217>
No reason to require a barrier either because there is already one
before doing resolves and decompressions should already be correctly
synchronized.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40135>
The source image layout must be either TRANSFER_SRC or GENERAL and the
application must emit the image layout transition. There is no reason
the source image wouldn't be readable by shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40135>
fp16 has quite the limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.
This change should have no effect for non saturating conversions.
Fixes "conversions long_sat_*half" CL CTS tests
Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
If the BO comes from a different subsystem
(args.extra_flags & DRM_PANTHOR_BO_IS_IMPORTED), we should normally
add extra DMA_BUF_IOCTL_SYNC calls around CPU accesses to ensure the
CPU mapping consistency, but this is something we never worried about
(we've always assumed exporters were exposing uncached mappings with
NOP {begin,end}_cpu_access() implementations), and it worked fine until
now.
The long term plan is to hook up DMA_BUF_IOCTL_SYNC, but this requires
more work, and we need a quick fix that can be backported easily, hence
this revert+FIXME.
Fixes: b5e47ba894 ("pan/kmod: Add new helpers to sync BO CPU mappings")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14963
Closes: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/282
Closes: https://gitlab.freedesktop.org/wayland/weston/-/issues/1101
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40201>
st->release_counter is initialized to 0, so if we happen to call
st_add_releasebuf with a non-NULL releasebuf when st->work_counter
is 0 due to wraparound in st_context_add_work, we might end up never
calling st_prune_releasebufs.
Since st_context_add_work and st_add_releasebuf both use work_counter
as a "some work was done" and don't care about the actual value, we
can remove the wraparound which will fix the buffer not being released
issue.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14955
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40168>
We can have PHIs like this: m10 = PHI u2, 3.
For these, insert_coupling_code will spill the sources but that doesn't
work properly for FAU values before this commit because bi_index_as_mem
asserts that index.type == BI_INDEX_NORMAL and we also can't look up an
FAU index in ctx->S_exit or ctx->remat.
Fixes: 6c64ad93 ("panfrost: spill registers in SSA form")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40189>
In the following arrangement the old logic leads to the following:
|
v
+----------+------------+
|block5 |
|m815 = PHI m1034, m860 |<-----------+
|343 = FMA.f32 ... | |
+----------+------------+ |
| |
+--------------+ |
| | |
v v |
+-----+ +-----+ |
|b6 | |b7,8 | |
| | | | |
+-----+ +--+--+ |
| +---+ | +---+ |
+----|b9 +-----+----|b10+---+ |
v +---+ +---+ v |
+-------+-------------+ +-------+---------+ |
|block12 | |block11 | |
|m882 = PHI m815, m860| |m860 = MEMMOV 343+--+
+---------+-----------+ +-----------------+
v
The spill of / into m860 (corresponding to 343) ends up in block11 when
insert_coupling_code(succ=block5, pred=block11) because of the memory
phi in block5. Later, in insert_coupling_code(block12, block9), we
reject inserting the spill after ca9c9957. As a result, m860 is
undefined along block5 -> block7,8 -> block9 -> block12.
When the spill position is chosen first, ctx->block is block5 so
choose_spill_position falsely returns the fallback position. The issue
can be fixed by explicitly passing the "current block".
Fixes: ca9c9957 ("pan: Avoid some redundant SSA spills")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40189>
[WHY]
3x4 matrix passed in as part of 3DLUT compound.
Due to HW limitations of coeffs, hdr_mult may need to be used.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40210>
[WHY]
VPE Visual confirm hanging with color format output due to misalignment.
VPE Visual confirm hanging with performance mode.
[HOW]
Adjust bg segment generation
Adjust per op programming and reset second pipe mux.
Signed-off-by: Roy Chan <roy.chan@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40210>
[WHY & HOW]
Currently missing RGB JFIF enum for switch case - add it in
Also updated unit test to pass
Signed-off-by: Brendan Steve Leder <BrendanSteve.Leder@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40210>