Actually compile and cache the no-op fragment shader but remove it from
the pipeline if we determine it's a no-op. This way we always have it
even if it's not strictly needed.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
(cherry picked from commit 73b3efcd59)
Starting with Ivy Bridge, we implement alpha-to-coverage by writting
gl_SampleMask with a pattern based on alpha. This will show up in
wm_prog_data::uses_omask so we don't need to look at the key.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
(cherry picked from commit 9fe6caf4e7)
In wsi_destroy_image(), if buffer_blit_queue then
do not call extra free. This will fix assert in
debug release and accessing out of allocated memory.
Fixes:
7bd5aa111c
("vulkan/wsi: add a private transfer pool to exec the DRI_PRIME blit")
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset's avatarSamuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16649>
(cherry picked from commit 1a8b03732f)
The kms_swrast path calls this callback via the dri2 paths,
not flushing caused artifacts when running inside a VM or on hw
in weston/gnome-shell.
Fixes: 6bbbe15a78 ("Reinstate: llvmpipe: allow vertex processing and fragment processing in parallel")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16715>
(cherry picked from commit c219ca3fb7)
According to the Vulkan spec 21.4 "Conditional Rendering",
only clearing attachments with vkCmdClearAttachments is subject to
conditional rendering.
Subpass clear and vkCmdClearColorImage / vkCmdClearDepthStencilImage
should always be executed even if it happens in a
conditional rendering block.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16654>
(cherry picked from commit 55466ca506)
MEC (the compute queue firmware) does not support real
predication, so we have to emulate that using COND_EXEC
packets before each dispatch.
Additionally, COND_EXEC doesn't have an inverted mode, so
in order to support inverted mode conditional rendering, we
allocate a new piece of memory in which we invert the condition.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6533
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16653>
(cherry picked from commit 85a4c5b351)
When vulkan enables depth bias, enable it for all 3 prim types
in gallium.
This fixes:
dEQP-VK.draw.renderpass.depth_bias.depth_bias_*
and
one zink test in CI
Cc: mesa-stable
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16701>
(cherry picked from commit 4896e136b6)
It's meaningful for this intrinsic and so does not add noise to the
lowering pass.
(Although dual-source writes must be to RT 0, depth and stencil
writes, which store_combined_output_pan is also used for, can still be
done with MRT enabled.)
Fixes: 5c168f09eb ("nir: Eliminate store_combined_output_pan BASE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>
(cherry picked from commit 9f9ed959bd)
Seems this broke a while ago and we never noticed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 0af7ff49fd ("aco: lower p_constaddr into separate instructions earlier")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16460>
(cherry picked from commit bd8f8dda8c)
D3D12 fences are capable of handling binary operations, but the
current dzn_sync implementation doesn't match vk_sync expectations
when sync objects are used to back semaphores. In that case, the wait
operation is supposed to set the sync object back to an unsignaled
state after the wait succeeded, but there's no way of knowing what
the sync object is used for, and this implicit-reset behavior is not
expected on fence objects, which also use the sync primitive.
That means we currently have a semaphore implementation that works
only once, and, as soon as the semaphore object has been signaled it
stays in a signaled state until it's destroyed.
We could extend the sync framework to pass an
implicit-reset-after-wait flag, but, given no one else seems to
need that, it's probably simpler to drop the binary sync
capability and rely on the binary-on-top-of-timeline emulation provided
by the core.
Fixes: a012b21964 ("microsoft: Initial vulkan-on-12 driver")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16629>
(cherry picked from commit 1eaba553e2)
This is required by lower_tg4_offsets which split one
sparseTextureGatherOffsetsARB call to four sparseTextureGatherOffsetARB
calls and merge their resisident results into one.
Fixes: ee040a6b63 ("radeonsi: enable ARB_sparse_texture2")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16599>
(cherry picked from commit cc4d5b1666)
Move can take in a vector and write a scalar, depending on the swizzle. We need
to handle this case. Split out mov and pack_32_2x16 so we can specify correct
behaviour for both. Also drop unused 1-bit boolean stuff which obscured the fix.
Fixes: 76cea8e27b ("panfrost: Fix pack_32_2x16 implementation")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>
(cherry picked from commit 9924e6f291)
Piglit tests fbo-generatemipmap-3d RGB9_E5 and
fbo-generatemipmap-cubemap array RGB9_E5 hit assert when debug_flush
is active. Increase the debug map depth to 64.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
(cherry picked from commit c63346eb69)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
SVGAv3 changes the PCI id due to differences in how PCI configuration
is handled - removal of VRAM and FIFO PCI resources, switch to MMIO
registers and MSI/MSI-X IRQ support but the 3D commands remain largely
the same.
This enables 3D/graphics acceleration support on SVGAv3.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 16019ff7cc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
SVGA device always supports direct maps which are preferable in all cases
because they avoid temporary surfaces and extra transfers. Furthermore
DMA transfers on devices with GB objects have undefined timing semantics.
Also the DMA transfers can not work on SVGAv3 because the device lacks
VRAM to be able to perform them.
Fix the last paths still using DMA transfers to make sure they're never
used on GB enabled configs. This fixes gnome-shell startup on SVGAv3.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Michael Banack <banackm@vmware.com>
(cherry picked from commit e5306d190a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
Flushing the command queue before mapping a resource is not enough
to guaruantee that the mapped content is not stale. We have to finish
to make sure that the gb readback actually updated the guest surface.
This fixes races in direct maps (map reads raced with gb readbacks)
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
(cherry picked from commit c7b0309723)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
svga used to use vmx backdoor directly to send logs to the host.
This functionality has been implemented in vmwgfx 2.17, but
to make sure we still work with old kernels the functionality
to use the backdoor directly has been kept.
There's no reason to port that code to arm since vmwgfx
implements it and arm64 (or other new platforms) would
depend on vmwgfx versions a lot newer than 2.17, so everywhere
but on x86/x64 it's fine to assume vmwgfx always support the host
logging ioctls.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
(cherry picked from commit 71a749bc7b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
This reverts commit 68ef895674.
When trying out transcode_astc=true with BPTC on Asphalt 9, we observed
very poor image quality - to the point that basic UI icons were blocky,
and buttons with a black border had smeared pixels on the edges. Using
DXT5 had no such issues.
I originally suspected there was a bug in the BPTC encoder, but I now
believe the issue is deeper than that. The commit that introduced the
encoder, 17cde55c53, says:
"The compressor is written from scratch and takes a very simple
approach. It always uses a single mode of the BPTC format (4 for
unorm and 3 for half-floats) and picks the two endpoints by dividing
the texels into those which have more or less than the average
luminance of the block and then calculating an average color of the
texels within each division.
It's probably not really sensible to try to use BPTC compression at
runtime because for example with the Nvidia offline compression tool
it can take in the order of an hour to compress a full-screen image.
With that in mind I don't think it's worth having a proper compressor
in Mesa and this approach gives reasonable results for a usage that
is basically a corner case."
In other words, the reason our BPTC compressor was so fast is that it
only implements one of the modes and does a low quality approximation.
This honestly should probably be improved somewhat, but the original
use case was for online-compression, the uncommon but mandatory OpenGL
feature where you can supply uncompressed data and trust the driver to
compress it for you (at unknown and uncontrolled quality and speed).
Unfortunately, the compressor as it stands is simply not usable for
transcoding ASTC data where we want to preserve the underlying image
quality as much as possible.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16566>
(cherry picked from commit c54555c496)