Indeed, the object ring_bo was not freed.
For instance, this issue is triggered with:
"piglit/bin/arb_shader_image_load_store-host-mem-barrier -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 5438b19104 ("iris: enable generated indirect draws")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30975>
(cherry picked from commit 6ac3beeb85)
The mutex needs to be locked before accessing the handle table.
After 64ca0fd2f2 ("frontends/va: Allocate surface buffers on demand")
the issue is now much more likely to happen and can be reproduced when
transcoding using ffmpeg.
Cc: mesa-stable
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30935>
(cherry picked from commit fccf31c231)
Conflicts:
src/gallium/frontends/va/image.c
The int filter enable bit in that state depends on both sampler view
and sampler state, so the state need to be re-evaluated and emitted
not only on sampler view changes, but also when the sampler state
changes.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30605>
(cherry picked from commit 8fc977cb29)
The bug report has a sequence that looks like this:
* set tex as framebuffer
* dispatch a compute shader that doesn't use tex
* dispatch a compute shader that uses it
Since we were updating the counters at step 2, step 3 failed to realize
that calling si_make_CB_shader_coherent was needed.
While at it, this commit splits the draw call tracking counter in 2: one
for CB, one for DB.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11638
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30591>
(cherry picked from commit bfcee149ed)
Tested with upstream drm-next kernel:
commit 6d0ebb3904853d18eeec7af5e8b4ca351b6f9025
Merge: 8bdb468dd7a5 b5d4657e192b
Author: Dave Airlie <airlied@redhat.com>
Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Ref: drm-next 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement")
Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30056>
(cherry picked from commit e32989a698)
This is a single bit field.
Fixes: d46162981a ("vulkan/video: add h264 headers encode")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Tested-by: Bernhard C. Schrenk <clemy@clemy.org>
Reviewed-by: Bernhard C. Schrenk <clemy@clemy.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30968>
(cherry picked from commit af425a63f7)
This fixes test_execute_indirect_state_vbo_offsets, a new vkd3d-proton
test.
The drawid/base_instance bits were cleared by mistake.
Fixes: e59a16bbb8 ("radv: use an indirect draw when IBO isn't updated as part of DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30971>
(cherry picked from commit 8873382703)
This reverts 2fc396ae75 ("intel/dev: Disable LNL PCI IDs on Mesa 24.2
(require INTEL_FORCE_PROBE)") from the 24.2 branch.
Mesa's 2fc396ae75 was needed because, although we were compatible
with Linux 6.11, this kernel change which will appear in Linux 6.12
would break our previous LNL support, causing the driver to fail to
load:
7108b4a589cd ("drm/xe/uapi: Expose SIMD16 EU mask in topology query")
Since d8b5ee8d65 ("intel/dev: Support new topology type with SIMD16
EUs"), this issue was resolved, but we were told we had to wait for
the kernel to remove force_probe before they would guarantee backwards
compatibility with uapi changes.
Since drm-next 6d0ebb390485 ("Merge tag 'drm-intel-next-2024-08-29' of
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next"), the
upstream drm kernel has now removed force_probe for LNL. Ref:
9c57bc08652a ("drm/xe/lnl: Drop force_probe requirement")
Tested with upstream drm kernel:
commit 6d0ebb3904853d18eeec7af5e8b4ca351b6f9025
Merge: 8bdb468dd7a5 b5d4657e192b
Author: Dave Airlie <airlied@redhat.com>
Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30959>
Don't claim to support extendedDynamicState3SampleLocationsEnable on pre-A650 GPUs,
which can't advertise VK_EXT_sample_locations.
Fixes dEQP-VK.info.device_mandatory_features on A6xx Gen 1 and Gen 2.
Fixes: 84726da2f4 ("tu: Implement extendedDynamicState3SampleLocationsEnable")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30730>
(cherry picked from commit 98d52cf292)
The storage allocated was always the same for both the ordinary texture
result data as well as the residency info. However, the former can be
float vector, whereas the latter is always int vector.
At least some llvm versions/builds will assert on this mismatch when
storing the data.
While here, also cut unnecessary zero initialization (lp_build_alloca()
already explicitly does this).
Fixes: 6168317b84 (lavapipe: Implement shaderResourceResidency)
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30878>
(cherry picked from commit 9b717596b2)
We put minSampleShading in the nvk_shader and [de]serialize that to/from
the binary so it also needs to go in the hash. We could also plumb the
pipeline state through to the deserialize callback but that's quite a
stretch and this literally only affects minSampleShading which is a
rarely used feature.
Fixes: 813b253939 ("nvk: Switch to shader objects")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30914>
(cherry picked from commit 5f402f3aae)
The rehash we're doing here is a bit of a hack but it's a back-portable
hack. We'll fix it properly in following commits.
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>
(cherry picked from commit c0191b20de)
Setting it to the same value as (or higher than) the job timeout
effectively bypasses the safety mechanism.
Let's change it to `job timeout - 5min`.
Fixes: f39ffc6911 ("ci/etnaviv: Get the gc2000_piglit manual job mostly working.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30800>
(cherry picked from commit b978d3eb54)
It was set to 170min, which made sense when the job timeout was 3h, but
then 4bb564f40d ("broadcom/ci: add more jobs to test with rpi5")
lowered the job timeout to 2h without lowering the test timeout to match.
Fixes: 4bb564f40d ("broadcom/ci: add more jobs to test with rpi5")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30800>
(cherry picked from commit aac9c74a83)
We could be trying to extract a D/UD from a Q/UQ, for example. We were
ignoring the top 32-bits, which is incorrect.
Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
(cherry picked from commit da395e6985)
This function never expands a type - it only narrows it. As such, we
don't need to ever sign extend to fill additional new bits. I think
this code was left over from earlier versions of my optimization pass
that was buggy and trying to handle cases it should not have.
Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
(cherry picked from commit 51c85e0363)
This should be done after reads are checked and
sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version
also skipped checking the reads if the write check passed.
fossil-db (navi31):
Totals from 193 (0.24% of 79395) affected shaders:
Instrs: 3212435 -> 3212735 (+0.01%)
CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01%
Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
(cherry picked from commit 61e73c2323)
We observed that Venus on ANV on JSL platform has some cacheline flush
issue. The overflush shows up as:
1. There're 2 threads venus bliting the feedback buffers suballocated
from the same backing device memory, back to back.
2. On thread A, flushing the feedback buffer for cpu read is placed
behind flushing a shader storage buffer for cpu read.
3. On thread B, flushing a different feedback buffer with the same
backing device memory (different offset bound to) can kick the
feedback buffer flush in (2) earlier than it should be flushed.
4. As a result, CPU polling thread for thread B results would see venus
feedback buffer update earlier than shader storage buffer results
being updated, breaking Venus sync primitives optimization.
During investigation, a solid workaround for JSL platform is to force
Venus to align up to 128 bytes for feedback buffer suballocation while
the default is at 64 bytes.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30879>
(cherry picked from commit 7941d705c3)
When barriers are used in invalid shaders with non-uniform control flow
we might get a hang. Forcing 32-wide group can help by making it more
probable that barrier instruction is executed by at least one channel
in each thread, and thus hang will be avoided. This shouldn't affect
Xe2+, where active-thread-only barriers are used anyway.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11497
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
(cherry picked from commit 7e52b67801)