This is not safe when we have conditional spills since we could be
spilling disabled lanes with undefined values that could overwrite
valid data for those lanes from a previous spill of the same temp
that was unconditional (or that condionally enabled those same
lanes).
Fixes some Piglit OpenCL tests as well as the following OpenCL tests:
integer_divideAssign
integer_moduloAssign
integer_mad_sat
integer_ops integer_divideAssign
integer_ops integer_mad_sat
integer_ops integer_moduloAssign
integer_ops quick_char_math
integer_ops quick_short_math
math_brute_force half_powr
math_brute_force pow
math_brute_force pown
math_brute_force powr
math_brute_force rootn
Fixes: 597560e27c ('broadcom/compiler: always enable per-quad on spill operations')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>
(cherry picked from commit d1f8351f3c)
The multop instruction implicitly writes rtop which is not preserved
acrosss thread switches. We can spill the sources of the multop
(since these would happen before multop) and the destination of
umul24 (since that would happen after umul24).
Fixes some OpenCL tests when V3D_DEBUG=opt_compile_time is used to
choose a different compile configuration.
cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29909>
(cherry picked from commit 38b7f411a1)
In a scenario where two non-concurrent cmdbufs are submitted to the
compute queue and with the second one using DGCC, the driver would have
chained the CS of the first cmdbuf to the new IB created right after
the DGC IB is executed.
Found while working on DGC task shader with vkd3d-proton.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29913>
(cherry picked from commit fec9b56f17)
nir_opt_preamble sometimes adds useless expressions, in which case we
may have stc instructions and no corresponding use of the constant.
Things can go sideways when these aren't included in the constlen, so
far only observed when earlypreamble is enabled.
Fixes: ccc64b7e00 ("ir3: Plumb through store_uniform_ir3 intrinsic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29903>
(cherry picked from commit c42f6597f9)
radv has a comment in radv_meta_dcc_retile.c:
* BPE is always 4 at the moment and the rest is derived from the tilemode.
radeonsi has in si_retile_dcc:
/* We have only 1 variant per bpp for now, so expect 32 bpp. */
assert(tex->surface.bpe == 4);
This fixes ext_image_dma_buf_import-modifiers for radeonsi.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29612>
(cherry picked from commit abd048124a)
Fixes:
//exec is empty here
loop {
%1:s[16-17] = ...
if () {
break
}
%2:s[16-17] = ...
continue_or_break
}
%3 = phi %1, undef
//because of the undef, %2 can use s[16-17] and overwrite the address
load(%3:s[16-17])
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bbe4652430 ("aco: create lcssa phis for continue_or_break loops when necessary")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11333
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29838>
(cherry picked from commit e3ffc244f5)
indirect store lowering will use if/else which changes
the control flow of the shader.
Fixes: 110887de2b ("nir: Add a new pass to lower array dereferences on vectors")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29894>
(cherry picked from commit 93f790b04a)
indirect store lowering will change control flow,
so we should not preserve control flow metadate
when it's present.
Fixes: 35b8f6f40b ("nir: Add a new pass to lower array dereferences on vectors")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29894>
(cherry picked from commit 09b4ba27a3)
Since commit 18d8c3ca33 we were allocating a little more than what
we were actually using (2621440 bytes instead of 2097152, aka 0x280000
instead of 0x200000), and we were not properly marking the BO as
internal. No applications should be misbehaving because of this.
Fixes: 18d8c3ca33 ("anv: Add missing ANV_BO_ALLOC_INTERNAL")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
(cherry picked from commit 2f65acfbb8)
This will disable cases with 2D array views (which could be views to 3D
texture) but enables on regular 2D surfaces which seems to work fine.
Fixes: 70382f7f06 ("intel/isl/xe2: Enable route of Sampler LD message to LSC")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29760>
(cherry picked from commit 4a0a716b6a)
This will prevent the dynamic loader to pick the wrong function once we
rename things to the proper API names.
In any case, this should have been done all along anyway.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29855>
(cherry picked from commit be090abf2e)
Makes Cyberpunk, Hitman and Total War Warhammer 3 run on LNL.
Fixes: c9e41f25a1 ("anv: Add heaps for Xe KMD in platforms without LLC")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29775>
(cherry picked from commit 87787c4a87)
In this assert we want to enforce that if a cached buffer is created
it is a cached+coherent as Xe KMD don't support cached+incoherent.
Did not caught this issue because it only reproduces in platforms with
GPU outside of LLC.
Fixes: 9d8d5cf8c9 ("anv: Remove block promoting non CPU mapped bos to coherent")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29826>
(cherry picked from commit 73ce3143a8)
Snce the *args parameter was added it's assumed to be non-null. If it is
null then the function is going off to UB land. As such, a later check
added for args being NULL is useless, and confuses coverity.
fixes: 3a752256f5
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29664>
(cherry picked from commit e67a8dc59a)
This whole thing was just a mess and never really worked the way it was
supposed to. All drivers can support GDI interop and double-buffering
independently at this point, so just remove it.
Fixes: c432fbe5 ("wgl: Add no-gdi-single-buffered and gdi-double-buffered PFDs")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jose Fonseca <jose.fonseca@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29819>
(cherry picked from commit a02b759f41)
LLVM implements multiple workarounds for gfx11.
The problem is that they're not applied for shaders built in
parts.
LLVM will be modified to be more conservative and apply the
workaround in more places but in the meantime, add a simpler
implementation in the NIR to LLVM backend: insert a wait at
the end of each shader part.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10785
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29304>
(cherry picked from commit 14974fd097)
Makes sure that sample_functions is not modified while shaders are
running.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
(cherry picked from commit ce85f3a431)
sample_functions can be reallocated between get_sample_function and llvmpipe_clear_sample_functions_cache.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
(cherry picked from commit 5941bee017)
The intention of this block was to set one of the flags that is used
to select a PAT index but this was doing more than that.
It was promoting WB+0 way coherency BOs to WC+1 way coherency possibly
causing regression in platforms without LLC.
anv_device_get_pat_entry() return WC/writecombining if no flags is
set so we don't need this block after all.
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Fixes: a65e982b44 ("anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29769>
(cherry picked from commit 9d8d5cf8c9)
I have no idea why I flipped the order of these to checks vs. the C
code when I wrote the NIR helper. We need to deal with NaN first or
else the fmin will smash NaN to MAX_RGB9E5 and it won't get handled as
NaN.
Fixes: 9981709d8f ("nir/format_convert: Add a function to pack RGB9_E5 formats")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>
(cherry picked from commit 86aad90e2a)
[Eric]
swap `color` and `clamped`, see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793#note_2459681
This fix failure on "dEQP-VK.api.buffer.basic.size_max_uint64".
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 822478ec20 ("panvk: Move the VkBuffer logic to its own source file")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
(cherry picked from commit c0f8465fa8)
Fix a crash when a null handle is passed.
(dEQP-VK.api.null_handle.destroy_command_pool)
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: afbac1af77 ("panvk: Move the VkCommandPool logic to panvk_cmd_pool.{c,h}")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
(cherry picked from commit 716e0e1568)
CTS tests both layered and separate DPB, but radv wasn't handling
layered properly when used with the tier 2 dpb handling.
This adjusts the addresses to use the layer index for tier2.
Fixes dEQP-VK.video.decode.*layered*
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29758>
(cherry picked from commit b888946f7a)
The enums were mixed up. Code was working because they were being
used only for their numerical values.
Fixes: e666872c75 ("intel/compiler: Initial bits for DPAS instruction")
Acked-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29762>
(cherry picked from commit f982d2bb79)
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in LDS.
Fixes: c61eb54806
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
(cherry picked from commit 0f0ebd8512)
On newer devices where ZPASS_DONE events have sample count writing
abilities the firmware expects these events to come in begin-end pairs,
essentially corresponding to a typical occlusion query usage. Since this
event is also used in the autotuner we have to avoid event pairs to be
emitted in an interleaved fashion.
Additional renderpass state now tracks whether a given renderpass contains
an occlusion query. If so, autotuner will emit miscellaneous ZPASS_DONE
events in order to form its own begin-end pairs before and after the
renderpass commands.
Occlusion query behavior inside a renderpass doesn't change. But when used
outside of a renderpass, possible autotuner usage requires to again emit
ZPASS_DONE events that end up forming begin-end pairs of these events both
at the start and the end of the query.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 4e6a1f8852 ("tu/autotune: Use `CP_EVENT_WRITE7::ZPASS_DONE` on A7XX")
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
(cherry picked from commit 5653c52151)
When both an interval and some of its children would be live-in, we used
to add phis for all of them. This could lead to cases where the pressure
after spilling was higher than before.
This happens, for example, when both a split and its parent are live-in.
Before spilling, the split wouldn't add to the pressure because its
parent had already been inserted. After spilling, since we created a phi
for the split, the link with its parent would be lost and it would add
to the pressure.
Fix this by only adding phis for top-level intervals and adding splits
after them.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 6bc7cd6108)
There were a few places that used an instruction pointer to decide where
new instructions should be created. NULL was used to add them at the end
of the block. While fixing a spilling bug, a new option was needed to
add instructions at the beginning of the block. This will be much easier
to implement using cursors.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 18cd803cef)
Whenever instructions need to be created at specific locations, ir3
often passes around an instruction pointer. When set, new instructions
are added before or after it (depending on the context). When NULL, new
instructions are added at the end of the block. This whole scheme is
confusing.
This patch adds ir3_cursor and ir3_builder structs and the associated
helper functions. The API mirrors the one from nir_cursor/nir_builder.
This patch does not refactor existing code to use the new API. This will
happen in future patches.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 1972db36c6)