After a refactor last year, the noibs option stopped working
because it hits an assertion when empty IBs are submitted.
Emit a single large NOP packet to avoid submitting empty IBs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>
(cherry picked from commit 132a61c6b7)
The hardware actually compares a pair of 64-bit values, rather than
comparing a single value against zero like we previously assumed.
This wasn't an issue in most cases before because if the buffer is
zero-initialized the previous code happens to work. If we get a
buffer with garbage in it though we would run into issues.
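As a rough illustration of why zero-initialized buffers hid the problem, the following C sketch models both interpretations. The function names and the "render when the two values differ" direction are assumptions made for this example; they are not taken from the hardware documentation or from the NVK source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old assumption: a single 64-bit value is tested against zero. */
static bool cond_render_single(const uint64_t buf[2])
{
   return buf[0] != 0;
}

/* Hypothetical model of the pair compare: render when the two adjacent
 * 64-bit values differ. */
static bool cond_render_pair(const uint64_t buf[2])
{
   return buf[0] != buf[1];
}
```

When the second 64-bit word happens to be zero, both functions agree. Once it holds garbage, the two interpretations can give opposite answers.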
Fixes: 80eac1337d ("nvk: Always copy conditional rendering value before compare")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13821
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37153>
(cherry picked from commit 90ac7d13dc)
nvk_cmd_pool_free_gart_mem_list frees this buffer, so we need to clear
the pointers to it in order to avoid a use after free.
Fixes: 07c70c77de ("nvk: add cond render upload buffer.")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37153>
(cherry picked from commit eaa547f6f2)
The vulkan spec says that we should ignore memoryOffset when
VkBindImageMemorySwapchainInfoKHR is present. wsi common assumes that we
bind the wsi image at offset 0, so set the offset to 0. This change
aligns with common wsi, and also obeys dedicated alloc requirement.
Fixes: f887116c49 ("turnip: adopt wsi_common_get_memory")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
(cherry picked from commit cef48af271)
Fixes: 5f757bb95c ("nir: Make the load_store_vectorizer provide align_mul + align_offset.")
This was found while trying to narrow bit_size and num_components to uint8_t.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37042>
(cherry picked from commit 949a056934)
We were testing some conditions in the wrong order, so spilled
registers were being printed as if they were uniforms. This is
incorrect, but only subtly so, and led to confusion.
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37092>
(cherry picked from commit e3552c427e)
The intention of the code was to allow PHI values to be propagated
if they were in registers (as opposed to in memory). As written, though,
values were never propagated. I think this typo was due to some
debug code that wasn't removed properly.
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37092>
(cherry picked from commit d482b6ca68)
Secondary command buffers with RENDER_PASS_CONTINUE_BIT don't reset
rp_trace, and without reset we get garbage tracepoints.
Fixes garbage sysmem_clear_all tracepoints in some games running
through DXVK.
Fixes: 630380349b ("tu: Give renderpass events a separate trace buffer")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37161>
(cherry picked from commit 482e0d0d1e)
These two properties report how the interaction between MSAA coverage
and occlusion queries works. We need to report the correct value here,
otherwise applications might misbehave.
Fixes: 5ee3c10d1e ("panvk: advertise vulkan 1.4 on v10+")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37136>
(cherry picked from commit 166d650c10)
The non-compute end flag should be INTEL_DS_TRACEPOINT_FLAG_END_OF_PIPE.
This fixes the broken anv utrace for anything non-compute that can
potentially overlap (execute in parallel).
Fixes: 6281b207db ("anv: add tracepoints timestamp mode for empty dispatches")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37155>
(cherry picked from commit c0e51bcf24)
ENABLE_DRM_AMDGPU must be defined when amdgpu_virtio is enabled;
otherwise, vdrm and amdgpu_virtio will have different definitions of
struct virgl_renderer_capset_drm. As a result, on amdgpu_virtio side,
the content of struct vdrm_device will be corrupted.
Thanks Honglei Huang <honglei1.huang@amd.com> for pointing out the
different definitions of struct virgl_renderer_capset_drm.
Cc: mesa-stable
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37023>
(cherry picked from commit 5736280730)
Pre-rasterization stages need a CS stall if they need to wait on the
flushes from a PIPE_CONTROL.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37132>
(cherry picked from commit f262865a90)
musl removed the LFS64 APIs like mmap64(), which were intended to be a
transitional measure multiple decades ago, causing a build failure
here. Since virtio-gpu sizes and offsets are 64-bit, we do still want
to make sure that we're using 64-bit mmap here, so I've added
-D_FILE_OFFSET_BITS=64, which will ensure that off_t is always 64-bit
in gfxstream guest, and which is generally the modern solution here.
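A minimal sketch of what the define buys us, assuming a POSIX libc. The helper name map_blob is hypothetical and is not part of gfxstream. In a real build the macro comes from the compiler command line rather than from the source file.

```c
/* Normally passed as -D_FILE_OFFSET_BITS=64 by the build system. It has to
 * be visible before the first system header is included. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Fail the build early if off_t is still 32-bit. */
_Static_assert(sizeof(off_t) == 8, "off_t must be 64-bit");

/* Hypothetical helper: map `size` bytes of `fd` at a 64-bit offset using
 * plain mmap() instead of the removed mmap64(). */
static void *map_blob(int fd, size_t size, off_t offset)
{
   return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
}
```

The static assertion turns a silent truncation of large offsets on 32-bit targets into a compile-time error.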
With this change, I am able to build gfxstream with musl.
Fixes: fec8e296a3 ("Make VirtGpu* interfaces")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37086>
(cherry picked from commit 6f8cdd8a3c)
With the existing union setup, only the first 8 bytes are initialized
properly for UBOs, yet the UBO size is 16, and all 16 bytes are copied
to applications. This leads to broken capture-replay since the
descriptor payload is no longer invariant.
Fix this by ensuring all union members are 16 bytes, which then get
properly initialized with the designated initializers.
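The idea can be sketched as follows. All struct and field names here are hypothetical stand-ins, not the actual NVK descriptor layout; the point is only that an explicit padding member makes the smaller payload cover all 16 bytes, so a designated initializer zeroes the tail.

```c
#include <stdint.h>

/* Hypothetical 8-byte UBO payload, padded to the full descriptor size. */
struct ubo_desc {
   uint64_t addr_and_size;
   uint64_t pad; /* explicit member, so initializers zero it */
};

/* Hypothetical payload that already uses all 16 bytes. */
struct ssbo_desc {
   uint64_t addr;
   uint32_t size;
   uint32_t zero;
};

union buffer_desc {
   struct ubo_desc ubo;
   struct ssbo_desc ssbo;
};

_Static_assert(sizeof(struct ubo_desc) == 16, "UBO payload must be 16 bytes");
_Static_assert(sizeof(struct ssbo_desc) == 16, "SSBO payload must be 16 bytes");
_Static_assert(sizeof(union buffer_desc) == 16, "descriptor must be 16 bytes");

static union buffer_desc make_ubo_desc(uint64_t addr_and_size)
{
   /* Members not named in a designated initializer are zero-initialized,
    * so `pad` is 0 and every byte of the payload is deterministic. */
   return (union buffer_desc) { .ubo = { .addr_and_size = addr_and_size } };
}
```

Two descriptors built from the same input then compare equal byte for byte, which is what capture-replay relies on.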
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 8b5835af31 ("nvk: Use bindless cbufs on Turing+")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37053>
(cherry picked from commit f28f72a5a2)
8-bit bit_count cannot simply use the masked result of a 16-bit
bit_count. Make sure it is properly lowered to a 16-bit bit_count.
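To see why masking the result is not enough, here is a small C model. It assumes the 8-bit value lives in the low half of a 16-bit register whose upper bits are not guaranteed to be zero; the function names are made up for this sketch and do not correspond to ir3 or NIR code.

```c
#include <stdint.h>

/* Portable 16-bit population count. */
static unsigned popcount16(uint16_t v)
{
   unsigned n = 0;
   for (; v; v >>= 1)
      n += v & 1u;
   return n;
}

/* "Run the 16-bit op, then mask the result": stray upper bits of the
 * source are counted too, so the answer can be wrong. */
static uint8_t bit_count8_masked_result(uint16_t reg)
{
   return (uint8_t)(popcount16(reg) & 0xff);
}

/* Clear the upper half of the source first, then count. */
static uint8_t bit_count8_lowered(uint16_t reg)
{
   return (uint8_t)popcount16(reg & 0x00ff);
}
```

For a register holding 0xff01, the first variant counts nine bits while the 8-bit value 0x01 has only one bit set.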
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 8aa2cad5df ("ir3: lower relevant 8-bit ALU ops in nir_lower_bit_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37116>
(cherry picked from commit 603d6fe240)
This fixes graphics artifacts that occur with a particular shader.
This 'heuristic' matches a few very similar shaders but should provide better
performance than the current fix of turning off caching for all shaders.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35929>
(cherry picked from commit 4035520ca9)
The highest possible values that can be represented with
16/12/10 bits are 65535/4095/1023, not 65536/4096/1024.
In order to ensure 1023 maps to 65535 in the Sx10 case
we thus need to multiply by 65535 / 1023 ~= 64.06158
instead of 64.
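The arithmetic can be checked with a small integer sketch. This is an illustration in C with made-up function names, not the shader code touched by the commit, which applies the factor as a floating-point multiply.

```c
#include <stdint.h>

/* Old behaviour: multiply by 64. The maximum input 1023 only reaches
 * 1023 * 64 = 65472. */
static uint16_t expand10_by_64(uint16_t v)
{
   return (uint16_t)(v * 64u);
}

/* Scale by 65535 / 1023 with rounding to nearest, so that 1023 maps
 * to 65535. */
static uint16_t expand10_full_range(uint16_t v)
{
   return (uint16_t)((v * 65535u + 511u) / 1023u);
}
```

With the plain factor of 64, full-scale input falls 63 steps short of the 16-bit maximum; the corrected factor closes that gap while still mapping 0 to 0.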
Fixes: a166d7609f ("gles: Add support for 10/12/16 bit SW decoder YCbCr formats")
Suggested-by: Benjamin Otte <otte@redhat.com>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37077>
(cherry picked from commit 1772380307)
On Android, the Vulkan loader implements KHR_swapchain and owns both the
surface and swapchain handles. On non-Android, common wsi implements the same
and owns the same. So in both cases, the drivers are unable to handle
vkGet/SetPrivateData calls on either a surface or a swapchain.
Inspired by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37043
Cc: mesa-stable
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Ryan Zhang <ryan.zhang@nxp.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37064>
(cherry picked from commit 6e1c2e4d83)
Apply any outstanding accumulated PC bits before we proceed to build the
acceleration structure.
There are two reasons for this:
- some of the data accessed by the build might need to be flushed
  as a result of a previous barrier
- the scratch buffer might get reused between builds
Cc: mesa-stable
Closes: #13711
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36951>
(cherry picked from commit 90daa80d1d)
If the implementation does not actually replay the VA, it must return 0
to not violate:
"If the memory object was allocated with a non-zero value of
opaqueCaptureAddress, the return value must be the same address."
Fixes RenderDoc capture replay, which asserts on this spec rule
being followed.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37090>
(cherry picked from commit b7a0f0215f)
Missing 32-bit entry point in GLSL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 2ce20170 ("mesa: Add support for GL_EXT_shader_clock")
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36041>
(cherry picked from commit d9b388af27)
The original lower_phis_to_regs_block() is a little too clever. It
crawls up the predecessor tree until it finds a cross edge and places
the register writes as deep as it can. This breaks nak_nir_lower_cf().
Say you have a shader like...
    con %0 = load_uniform()
    con loop {
        if div {
        } else {
        }
        break;
    }
    con %1 = phi %0
The original lower_phis_to_regs_block() will turn it into
    con %0 = load_uniform()
    con %r = decl_reg();
    con loop {
        if div {
            reg_store(%r, %0)
        } else {
            reg_store(%r, %0)
        }
        break;
    }
    con %1 = reg_load(%r)
We then convert it into unstructured control-flow and run regs_to_ssa()
to get our phis back, which lowers each of the registers we inserted to
a phi tree. When we try to recover divergence information on phis by
looking at their sources, this works fine if each source maps directly
to a reg_store() which maps directly to a phi in the original IR.
However, because the reg_store() instructions are placed deeper, it may
introduce false divergence.
Switch to the simple version of nir_lower_phis_to_regs_block() which
places reg writes directly in phi predecessor blocks. We could probably
be more conservative and just avoid placing writes to uniform regs in
divergent control-flow but it's more robust to make the load/store_reg
intrinsics match the original phis directly.
This fixes some shaders in Horizon: Zero Dawn Remastered
Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>
(cherry picked from commit c6e831ac44)
Right now it tries to place reg_write instructions as far up the
predecessor chain as possible. This is useful for a bunch of the passes
that call it since it ensures they don't get placed in dead blocks or in
single successors and things like that. But it screws up NAK's control
flow lowering so we need the option to turn it off and make the pass
place the reg_write instructions in the most obvious place possible.
Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>
(cherry picked from commit 26e32417b9)