We need to update the state var locations before the
st_serialize_base_nir() calls otherwise
_mesa_optimize_state_parameters() can alter params such that
variants wont be able to find the correct match when calling
_mesa_lookup_state_param_idx().
Prior to 891d46f5 this worked because after failing to match
we would end up adding additional params back in that we had
just attempted to optimise.
Fixes: a6fcc2835e ("
st/glsl_to_nir: make sure the variant has the correct locations set")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14837
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit 6c60f423b3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
There was an assumption that if the instruction had non-native float
as a source, the first source would have such type. This doesn't
hold for Select, and the code failed in two ways
- The boolean source of Select was being converted to the non-native
float type.
- The loop that resolves the bit-size for unsized operands would
trip at `assert(i == 0)` because Select has more than one source.
Re-organize the code to track the types of the sources independently,
and fix both issues above.
Fixes: 90e1b12890 ("spirv: Add bfloat16 support to SpecConstantOp")
Fixes: 51d3c4c889 ("spirv: support float8 spec constant op")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 6affcb43a7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Only used by Convert operations, so just pass 0 from callers that
are not Convert and clarify that in the code.
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 1c3c987d5c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit c0079e09ca)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.
Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8624da56ee)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 366410e913)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.
v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a548466186)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 5454221cfb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
This is an old driver bug that could cause Z corruption on gfx8-11.5.
v2: handle allow_expclear differently
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
(cherry picked from commit 4cfe08e583)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
fp16 has quite the limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
(cherry picked from commit e1ed7de274)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.
This change should have no effect for non saturating conversions.
Fixes "conversions long_sat_*half" CL CTS tests
Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 8e8fb2ebaa)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
If the BO comes from a different subsystem
(args.extra_flags & DRM_PANTHOR_BO_IS_IMPORTED), we should normally
add extra DMA_BUF_IOCTL_SYNC calls around CPU accesses to ensure the
CPU mapping consistency, but this is something we never worried about
(we've always assumed exporters were exposing uncached mappings with
NOP {begin,end}_cpu_access() implementations), and it worked fine until
now.
The long term plan is to hook up DMA_BUF_IOCTL_SYNC, but this requires
more work, and we need a quick fix that can be backported easily, hence
this revert+FIXME.
Fixes: b5e47ba894 ("pan/kmod: Add new helpers to sync BO CPU mappings")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14963
Closes: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/282
Closes: https://gitlab.freedesktop.org/wayland/weston/-/issues/1101
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 30f1d5bab9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
st->release_counter is initialized to 0, so if we happen to call
st_add_releasebuf with a non-NULL releasebuf when st->work_counter
is 0 due to wraparound in st_context_add_work, we might end up never
calling st_prune_releasebufs.
Since st_context_add_work and st_add_releasebuf both use work_counter
as a "some work was done" and don't care about the actual value, we
can remove the wraparound which will fix the buffer not being released
issue.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14955
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 10d32feae8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We can have PHIs like this: m10 = PHI u2, 3.
For these, insert_coupling_code will spill the sources but that doesn't
work properly for FAU values before this commit because bi_index_as_mem
asserts that index.type == BI_INDEX_NORMAL and we also can't look up an
FAU index in ctx->S_exit or ctx->remat.
Fixes: 6c64ad93 ("panfrost: spill registers in SSA form")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 8a4d8d490b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
In the following arrangement the old logic leads to the following:
|
v
+----------+------------+
|block5 |
|m815 = PHI m1034, m860 |<-----------+
|343 = FMA.f32 ... | |
+----------+------------+ |
| |
+--------------+ |
| | |
v v |
+-----+ +-----+ |
|b6 | |b7,8 | |
| | | | |
+-----+ +--+--+ |
| +---+ | +---+ |
+----|b9 +-----+----|b10+---+ |
v +---+ +---+ v |
+-------+-------------+ +-------+---------+ |
|block12 | |block11 | |
|m882 = PHI m815, m860| |m860 = MEMMOV 343+--+
+---------+-----------+ +-----------------+
v
The spill of / into m860 (corresponding to 343) ends up in block11 when
insert_coupling_code(succ=block5, pred=block11) because of the memory
phi in block5. Later, in insert_coupling_code(block12, block9), we
reject inserting the spill after ca9c9957. As a result, m860 is
undefined along block5 -> block7,8 -> block9 -> block12.
When the spill position is chosen first, ctx->block is block5 so
choose_spill_position falsely returns the fallback position. The issue
can be fixed by explicitly passing the "current block".
Fixes: ca9c9957 ("pan: Avoid some redundant SSA spills")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 09e1ba28e5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
allowed precision mismatches on uniforms, however if you lower precision on
16-bit consts, then this error triggers instead.
So here we relax the type matching and just make sure we match int vs
float.
Fixes: 0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5337
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 73bc604128)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
AMD docs support this:
R5xx Acceleration v1.5 says safest handling for ZFUNC changes is to disable
HiZ except specific LESS/LEQUAL and GREATER/GEQUAL transitions.
ATI OpenGL Programming and Optimization Guide advises avoiding ALWAYS when
trying to benefit from HiZ so that would imply fglrx also disables HiZ
there.
On RV530 this fixes the following dEQPs:
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.74
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8093
(cherry picked from commit b0f019f8cf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.
This is the same bug that was fixed in v3d in the previous commit.
Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit ecb6c5d555)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.
This is adressed by flushing all pending write jobs affecting the
resource before replacing its BO.
This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previos GPU clear. Test are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 1eaa46da09)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 17b18496f6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we no longer
were able to use VA-API from meson devenv, like we could before.
Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
(cherry picked from commit 7e4744909b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().
This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier working.
This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit dad942b468)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The alpha instruction always wrote to the same rendertarget as the rgb and the
original target was ignored (surprisingly the HW docs explicitly allows rgb and
alpha to write to different targets). This makes tesseract rendering a bit
better, but there are still some remaining issues.
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Reviewed-by: Filip Gawin <filip@gawin.net>
(cherry picked from commit 87a881558f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
When building with "-Dvideo-codecs=h264dec,h265dec,av1dec" va/encode.c
won't be built but it's still required because it's used from
picture.c
Fixes: c4f05bdf60 ("frontends/va: include picture_*.c based on selected codec")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 82a51ba9b3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Among all subcommands, only gfx subcommands are bound to a query pool,
other subcommands seem to need no special handling.
In addition, if a ResetQuery is done before BeginQuery, the last
subcommand will be a event one, which fails the current assert that
assumes it's a gfx one.
Change the assertion of the subcommand being a gfx one to an addition
check of whether the subcommand is a gfx one.
This fixes crash of Vulkan CTS 1.4.5.1 test
dEQP-VK.query_pool.discard.normal.no_depth.none.discard .
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 5a497316d4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
In 1f83e73145, the pre-encode input picture size was also reduced.
However it was recently discovered that VCN FW uses the input picture
pitch as the pitch for this, which means that previous change broke
pre-encode.
Fixes: 1f83e73145 ("radeonsi/vcn: Reduce allocated size for pre-encode recon pics")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 2b2b1d405a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The logic is supposed to find the stage with the maximum constlen to
trim for each time we have to trim a stage. But by not resetting
max_constlen each time, we would "trim" the same stage repeatedly,
leaving us thinking the total is below the limit when it actually isn't.
Cc: mesa-stable
(cherry picked from commit ae8928b638)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The HW uses ViewportIndex to select which GRAS_BIN_FOVEAT offset to use.
For normal 3d draws, either the ViewportIndex equals the view/layer or
we make the offset the same for all viewports/layers, but we aren't
aware of this in the 3d path and we always use viewport 0.
Use the HW offset 0 when subtracting the HW offset. This is a bit of a
hack, but it should work. This fixes LOAD_OP_LOAD with FDM.
Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")
(cherry picked from commit 68c0031f56)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
- RGBA8888_* is a preprocessor alias for R8G8B8A8_* in u_format.yaml.
- Both entries in the format tables collide on the same enum value, and
RGBA8888 overwrites R8G8B8A8.
- The fix was reverting to the version that was in the commit
39e949434c because there is a different format
was used that did not cause any collisions.
dEQP fixes:
dEQP-VK.api.info.format_properties.r8g8b8a8_sint
dEQP-VK.api.info.format_properties.r8g8b8a8_snorm
dEQP-VK.api.info.format_properties.r8g8b8a8_uint
dEQP-VK.api.info.format_properties.r8g8b8a8_unorm
Fixes: 9f740b26a6 ("pvr: Fix bugs in the format table")
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 7c6dbb099a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
We should be returning if no GS is needed and no GS shader is bound.
This fix various segfaults introduced by the original fix.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e10f29399f ("hk: fix passthrough GS key invalidation")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 6d040df750)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
With perfetto that string is processed later leading to
use-after-free.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 413e169f45)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>