LDS isn't divided among SIMDs, and it doesn't make sense to launch a
fraction of a compute workgroup.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33873>
We already merged it below, when the library has both fragment output
interface and fragment shader state (which is when we'd compute it
anyway). Setting it twice is probably harmless but also confusing and
useless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33902>
fd_pipe_new2 can segfault when trying to set the is_64bit flag on new
pipes. This can happen when the current GPU is not be listed in the
fd_dev_recs table because it's not supported by mesa, but is supported by
the kernel.
Add a helper function to test if the current GPU is in the supported table,
and use it in fd_pipe_new2.
Signed-off-by: Loïc Minier <loic.minier@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33830>
The fast path for kgsl_queue_submit when there are no command buffers
and only sync objects led to breakage for two reasons:
* The fast path was not properly handling duplication of the merged sync
object assigned to signalled `kgsl_syncobj`(s), which could lead to
multiple `kgsl_syncobj`s owning the same FD and consequently issues
such as double close of that FD leading to UB. This is fixed by moving
to the slow path as it always produces a timestamp sync object which
can be trivially duplicated.
* The Vulkan specification requires that drivers strictly follow the
order of submission of command buffers and consequently the order of
semaphore signal/wait operations. Since no submission was being made
to the kernel, subsequent submissions could be executed without waiting
for wait/signal operations from previous submissions to complete.
As both of these issues are fixed by moving to the slow path, this patch
removes the fast path in favor of the more correct slow path.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33894>
RC_FILE_INPUT is pretty much just a RC_FILE_TEMPORARY with an initial
value in it. So we regalloc it the same way we do normal temps, however
for unknown reasons (probably to have a bit more readable shader dumps)
we still keep the RC_FILE_INPUT type even though its the same as
temporary. This is handled correctly when emitting the machine code,
however, it was not taken into account in shader stats.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33817>
The free list must be re-initialized. Found the bug while running:
dEQP-VK.ray_tracing_pipeline.acceleration_structures.device_compability_khr.gpu_built.top
where it invokes VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT to purge
the cmd pool resources, and the next alloc still gets cache hit with the
"empty" list.
Fixes: e2c4bafccc ("venus: free query batches for VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33908>
Compared to vcn4, vcn5's implementation has changed.
It needs to apply the qp_delta directly instead of
dividing by 5.
Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33886>
If somebody needs these in the future, they can add them back, but a lot
of these extensions are very old (SUN, SGI, ...).
No code is added, though git diff is having trouble detecting that.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33794>
Shared glapi doesn't make GL functions globally available, so we have
to use the dispatch API.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33794>
Shared glapi is already statically linked with libmesa (src/mesa),
and some parts are statically linked with loaders.
Static glapi will be removed after this is merged.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33794>
This changes makes it strongly typed and gives more context.
No changes in behavior expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
Xe KMD now has support for protected memory, so lets move it
to common code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
CCS don't support MI_SET_APPID instruction, that might be the reason
some tests protected memory tests fail on CCS.
Re-enable it if a workaround/solution is found.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
If PXP initialization is not completed and application requested a
protected context the GEM_CONTEXT_CREATE will wait up to 250ms for
PXP to finish initialization but if that do not happens it will
return a error and set errno to EIO.
This patch add the missing retry handling.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33459>
All formats that support sampled images should also be suitable for
storage images.
Fixes: d970fe2e ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33459>
Migrate the panfrost-g57-gl job to a new device in LAVA,
mt8195-cherry-tomato-r2. This DUT is faster than the
mt8192-asurada-spherion-r0 device it replaces.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33865>
It is safe to attempt inserting a node into the same instruction as
successor if successor is an unconditional branch.
ppir_instr_insert_node() will take care of conflicts with ALU_COMBINE
slot
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33754>
Implement 2 optimizations for branches:
1) if unconditional branch target is a block that only has
unconditional branch, propagate the target
2) optimize following contruction:
block 1:
...
if (cond) block 3
block 2:
branch block N
block 3:
...
into
block 1:
...
if (!cond) block N
block 2:
block 3:
...
Note: optimization 1) significantly improves runtime of if ladders, but
it is not visible in shader-db because we do not track shortest/longest
path and it doesn't always create dead code (usually just a single
instruction)
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33754>
Discard block is always added to the block list after translation from NIR,
so we can just assign it an index that equals to block list size.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33754>
This saves us jumping through an entrypoing just to fetch a uint64_t.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33897>
On a7xx, const registers should be loaded via the preamble instead of
uploaded by the driver to the const state. This commit implements this
by adding a new pass that emits the consts created by ir3_cp to a
sequence of stc instructions in the preamble.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32454>