L3 configurations with an ALL partition of 128 ways per bank or more
cannot be represented with the normal L3ALLOC partitioning mechanism
since the "All L3 client pool" field would overflow, instead the
L3FullWayAllocationEnable bit has to be set, which causes the whole L3
to be used in a unified cache configuration.
That's precisely the configuration we're currently using on recent
platforms, but previously we were relying on the L3 config tables
being empty and the selected L3 configuration being a NULL pointer to
detect this condition. This is about change, the L3 configuration
structure will be defined for gfx12.5+ platforms since they provide
useful information about the cache hierarchy to the drivers. Instead
of checking whether the pointer is NULL in order to apply a unified L3
cache configuration, use it when there is a single ALL partition
larger than can be represented via L3ALLOC.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
as has recently been exposed by ci, there are some cases where running
lots of tests simultaneously can temporarily result in depleted vram,
which torpedos everything
as this scenario is transient (vram will very soon become available again),
it makes more sense to add some retries at fixed intervals to try soldiering
onward instead of exploding and probably blocking a merge
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25938>
This resolves a memory leak when the application drops its last reference
to the queue, but never waits explicitly.
The problem was, that the queue was refed by QueueState::last and that ref
only gets dropped on a blocking wait. This is problematic as non user
Event objects also hold a ref on the Queue they are created on, therefore
causing a cyclic ref relation.
In order to resolve it, just use a weak reference. A failure of upgrading
the Weak ref is not an issue as in this case we'd only wait on an already
destroyed or processed event. The worker thread already makes sure
everything stays in sync.
Fixes: 5b3ff7e3f3 ("rusticl/queue: overhaul of the queue+event handling")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: @LingMan <18294-LingMan@users.noreply.gitlab.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25926>
I didn't check if it's a valid vulkan SPIR-V opcode and turns out it isn't
Fixes: 82eed326f4 ("zink: support more nir opcodes")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
This is required by OpenCL who relies on flushing behavior to match the
runtimes advertized feature, but also later once rusticl does support
denorms, to flush them if applications whish to do so.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
This aliases each access as required by OpenCL. It's up to the vulkan
driver to vectorize to wider loads/stores if possible.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
It's kinda pointless to have it too big, it also causes weird shaders to
be generated and causes stack overflows in `nir_opt_gcm`.
Nothing needs big values here anyway.
Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
On NVIDIA, we can do a vote_ieq on bool in one hardware op so we don't
want that lowered. We do want to lower vote_feq and other vote_ieq,
though.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25894>
if a texture is provably idle, either by never having been used or
by exhaustively checking usage data, a texture subdata can occur
without any synchronization
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
Only align resource dimensions on creation, not when importing existing D3D resource object.
Otherwise importing the resource fails since the resource descriptor does not match the aligned
dimensions passed in the template.
Fixes: 62fded5e4f ("d3d12: Allocate d3d12_video_buffer with higher alignment for compatibility")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25913>
There are no GL CTS tests for 1darrays, relying on OpenCL CTS instead.
This patch should fix the `clEnqueueFillImage` tests in OpenCL CTS.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25630>
In function ‘generate_clipmask’,
inlined from ‘draw_llvm_generate’ at ../src/gallium/auxiliary/draw/draw_llvm.c:1975:24:
../src/gallium/auxiliary/draw/draw_llvm.c:1302:25: warning: ‘sum’ may be used uninitialized [-Wmaybe-uninitialized]
1302 | sum = lp_build_fmuladd(builder, planes,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1303 | (LLVMValueRef[]){cv_x, cv_y, cv_z, cv_w}[i], sum);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/gallium/auxiliary/draw/draw_llvm.c: In function ‘draw_llvm_generate’:
../src/gallium/auxiliary/draw/draw_llvm.c:1149:44: note: ‘sum’ was declared here
1149 | LLVMValueRef plane1, planes, plane_ptr, sum;
| ^~~
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25884>