We had some special cases before when we could actually get some IFs on
R300 with VDPAU. Now that VDPAU is gone and everything goes through
ntt, we don't have to worry anymore. Remove the complicated logic and
just always transform KILL into KIL none.-1
No shader-db change on RV530 or RV370.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21503>
There is no UVD and the mpeg2 shader-based decoding is broken and doesn't
lead to CPU savings anyway. VDPAU output works, but there is no real
benefit so just disable VDPAU altogether so we can clean the backend a
bit and also open a way to potentially drop the mpeg2 deconding altogether
from the fronted.
Acked-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20524>
The workaround address is used as a source for push constants when
there's no resource available, that address must be 32B aligned.
This fixes invalid address being used for buffers in
3DSTATE_CONSTANT_* packets.
Now that intel_debug_write_identifiers() already add the padding,
there's no need to include extra "+ 8" to the offset.
Thanks to Xiaoming Wang that contributed to find and fix this issue.
Fixes: 2a4c361b06 ("iris: add identifier BO")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21479>
We need to emit 3DSTATE_HS for each primitive with tessellation.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21308>
Tigerlake PRM: Volume 2c: Command Reference: Registers Part 2 - Registers M through Z
RCU_MODE :: Compute Engine Enable
This bit indicates if Compute Engine (a.k.a Dual Context or Multi
Context) is enabled or not. This bit must be treated as global
control for enabling and disabling of compute engine. Hardware
allocates required resources for the compute engine based on this
bit.
....
HW reserves 4KB of URB space...
Right now no gen12 platform has Dual Context enabled in kernel side,
exposing a compute engine but that can change, so here adding
has_compute_engine to intel_device_info and only reserving URB space
if compute engine is available.
While at it also fixing the error path when pb_slabs_init() fails.
Bspec: 46034
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21031>
Fixes: 5b205ef413
r600: Store nir shaders serialized to save memory
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7faf89c3bb48 in __interceptor_realloc (/usr/lib64/libasan.so.6+0xb1b48)
#1 0x7faf7be5981d in grow_to_fit ../src/util/blob.c:67
#2 0x7faf7be5a538 in grow_to_fit ../src/util/blob.c:49
#3 0x7faf7be5a538 in blob_reserve_bytes ../src/util/blob.c:177
#4 0x7faf7be5a538 in blob_reserve_uint32 ../src/util/blob.c:190
#5 0x7faf7d248a8c in nir_serialize ../src/compiler/nir/nir_serialize.c:2109
#6 0x7faf7df4fdbb in r600_pipe_shader_create ../src/gallium/drivers/r600/r600_shader.c:401
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21443>
If the view's seqno increments, it needs to happen *before* the tex cache
key is constructed. Normally this happens when the sampler views are
bound. But if the texture backing a current sampler view is rebound we
need to handle this before the cache lookup.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21408>
Layout transitions caused by access as a various format must happen at
state bind time, before batch_draw_tracking(). Add a helper to assert
this fact.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21408>
Elsewhere we are comparing it against the seqno for the "primary" z32
buffer, so be consistent. Otherwise we'll think we need to re-validate
every time the sampler view is bound.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21408>
Since we invalidate tex cache entries if an associated pipe_resource is
rebound, we don't rely on the rsc_seqno being part of the tex cache key.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21408>
Since 13cb41f666 PIPE_BIND_SHARED was used to allocate driver internal
video buffers. These buffers are never shared, but the intent was to
get non-suballocated buffers and SHARED was used as an indirect flag.
This commit switches to PIPE_BIND_CUSTOM which isn't used anywhere else,
and is now translated as "no suballocation".
The main benefit here is that this allows these buffers to set
use_reusable_pool to true reducing the CPU overhead a lot.
For instance, running the following command on my system:
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi \
-i tears_of_steel_1080p.mov -an -c:v h264_vaapi output.mp4
takes 35 sec with this commit vs 45 sec without.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21416>
At the moment we implement constant memory as normal global memory, but
we still should limit to the actual constant buffer cap once we properly
use UBOs for that.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20161>
Some of the OpenCL tests are flaky, because they just take that long.
Builtins can generated really complex code and if we are unlucky they can
timeout.
Proper support for functions would also solve the issue, probably, but for
now increase the deqp-runner timeout so it's less of an annoyence.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20161>
There's just two places where we need any of the WSI specific vulkan
includes, the rest of Zink should do just fine with vulkan_core.h. So
let's include the win32-specific header explicitly in those two places,
and reduce the need for WSI specifics inside zink itself. Kopper
handles the rest of the WSI integration.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21441>
When establishing a texture transfer map, if there is any pending changes on the
texture, instead of trying direct map with DONTBLOCK first, just
use the upload buffer path.
Fixes piglit tests gen-teximages, arb_copy_images-formats
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
Add typeless format to the compatible format lists for shareable surfaces.
Fixes webgl benchmark crash in eglCreateImage running from firefox on Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
When an EGLImage is created from a 2D texture and used for texture sharing,
the texture surface might not have been created with the SHARED bind flag.
To allow these surfaces for sharing, this patch sets the USAGE SHARED bit
for surfaces that can be potentially used for sharing even when the SHARED
bind flag is not originally set. Instead of unconditionally enabling the
SHARED bind flag for all surfaces and unnecessarily bypass the surface cache
optimization, this patch only enables the USAGE SHARED bit for surfaces
that also have the RENDER TARGET bind flag.
When the surface handle is inquired and if the surface is currently
marked as cachable, we will need to unset the cachable bit so
the surface handle will not be recycled again.
This patch fixes an assertion in svga_resource_get_handle() when the
EGL_MESA_image_dma_buf_export extension is used in webgl benchamrk running
from firefox in Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
This hack caused problems with some dx9 tests before (due to mipgen
test using nearest filter sampling with tex coords exactly between two
texels hence being extremely sensitive to arithmetic inaccuracies),
and we can no longer distinguish this by using pixel_offset to not get
it enabled. But to pass other tests we don't really need the hack when
there's texture sampling involved anyway.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21407>
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
Blend states can require masking colour. Currently, this is handled by
nir_lower_blend, which lowers masks to a read-modify-write operation as required
on Mali hardware. However, our "tilebuffer store" instruction supports a write
mask, allowing us to write only a subset of channels to the tilebuffer. It's
more efficient to use that than to emit pointless tilebuffer loads.
Note that even without tilebuffer loads, non-opaque masks don't work with opaque
pass types. Here, we handle this with a translucent pass type, which gets HSR
to do the right thing and is consistent with the pass type used previously.
However, it's a bit heavy handed -- Apple manages to use an opaque pass type
with masking but with some unknown HSR fields twiddled. IMO reverse-engineering
those details shouldn't block this because this gets us closer to optimal (just
not all the way there) and is strictly better than what we had before.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
If radv is initialized before radeonsi, doing:
aws->fd = fd;
is incorrect because the device was initialized using the fd
passed by radv.
libdrm has a helper to query the fd used to create the device,
so use it.
We also need to init the kms_handles table in this case
because we're going to share BOs between radeonsi's fd and
the device fd.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3424
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20983>