This pass uses constant folding to determine which invocation is read by shuffle
for each invocation. Then, it detects patterns in the result and uses more
a specialized intrinsic if possible.
For booleans it creates inverse_ballot.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24650>
A single ldc is probably more efficient than a 64-bit load and the pile
of math we were generating before. The only reason for the old method
was that it let us avoid indirect cbuf loads because we didn't support
them for a while. Now that we can support all cbuf loads, we can just
do an indirect 1B load and call it good.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30218>
initializing the winsys from a /dev/dri/cardX node (as discovered by
gbm) doesn't work, as the kernel abi expects a render node
thus, the winsys needs to open the card's rendernode and use that
everywhere except when importing buffers, where it has to explicitly
export from the card node and import to the rendernode
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30224>
This reproduces panvk logic of showing how much memory is available
while taking into account the address space limits we have.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Valhall can allow up to 48-bit of address space, we should reflect this
here to allow more memory to be mapped in the same address space when
possible.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Before this, CL buffers would not be attached to subsequent batches.
This fix spurious fails when running OpenCL CTS test_basic astype and
likely others.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Not the most efficient approach but functional.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Also ensure that we never emit vector with more than 4 components.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
nouveau uses the OS page size which is almost always 4096. The next
patch will make this properly queried but this version is back-portable.
Fixes: 58181b7bbc ("nvk: Bump the sparse alignment requirement on buffers to 64K")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30138>
We skip loading the tile buffer from memory if the job has flagged
a clear (job->clears) for the buffer, however, this only tracks
clears emitted via the TLB. In some cases we may need to fallback
to clearing with a draw call, in which case we also want to skip
the load.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30205>
We need to call iris_wait_syncobj() on both release and debug builds,
so take it out of the assert() call. Only assert the result.
With this patch, gnome-session finally works for me. Also steam.
Fixes: 665d30b544 ("iris: Wait for drm_xe_exec_queue to be idle before destroying it")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30195>
Xe KMD renamed XE_PERF to XE_OBSERVATION to better match with Intel
specification and avoid confusion.
This uAPI rename will land in the same kernel version that added
the uAPI being renamed.
There is no uAPI change, just renames.
Sync xe_drm.h with 63347fe031e3 ("drm/xe/uapi: Rename xe perf layer as xe observation layer").
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30027>