The shaders in question use:
(memory_load + (gl_SubgroupSize - 1)) & ~(gl_SubgroupSize - 1)
My guess is that this is supposed to be the subgroup size of whatever
produced the value, not the subgroup size in this shader.
And because in the consumer the workgroup size is 32, we use wave32.
Fixes: a2d3cbac2a ("radv: determine subgroup/wave size early")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14187
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 83e9ae2d5c)
Conflicts:
src/amd/vulkan/radv_instance.c
src/amd/vulkan/radv_instance.h
src/util/driconf.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Align with gallium side. When fixed-function blending is not available,
the internal blend shader is used. This is handled by a single ST_TILE
in the blend shader with the current sample ID, which requires sample
shading enablement.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 763d2418b8)
CI fails removed from cherry-pick as the file doesn't exist on stable,
and the main branch change has only removals.
Conflicts:
src/panfrost/ci/panfrost-g925-fails.txt
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Subpasses can have different view masks, although this isn't often used.
So we can't use the view mask of the last subpass when deciding what to
store, instead we have to use the same used_views field that's used by
loads and clears.
Noticed by upcoming tests for VK_QCOM_multiview_per_view_render_areas.
Cc: mesa-stable
(cherry picked from commit c0b5c04b84)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
It's not just used for clears, it was already used for loads and it
needs to be used for stores too so clear_views was a confusing name.
Cc: mesa-stable
(cherry picked from commit 6c3ed74ed2)
Conflicts:
src/freedreno/vulkan/tu_pass.cc
src/freedreno/vulkan/tu_pass.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Some implementations can emit tracepoints when copying u_trace
buffers. It's important to reserve the slots we want to copy into
before emitting the copies so that both processes don't clash with one
another.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit df5f92d114)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
If renderpass has D/S attachment, but pipeline has D/S as UNDEFINED,
D/S should be properly disabled for the pipeline. The easiest way is to
ensure that D/S state is valid when pipeline's D/S format is UNDEFINED.
So we always create VkPipelineDepthStencilStateCreateInfo.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 2798ef7bfd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The handling in dedup_srcs was incorrect because it would apply the
modifier from srcs[i] to the LUT without removing the modifier from the
instruction. We can fix and simplify this code by removing all modifiers
before the dedup_srcs() call, which we were doing immediately after the
call anyway.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13966
Fixes: 66c9c40f68 ("nak: Handle modifiers in dedup_srcs() in opt_lop()")
Reviewed-by: Seán de Búrca <sdeburca@fastmail.net>
Reviewed-by: Lorenzo Rossi <git@rossilorenzo.dev>
(cherry picked from commit 041216e605)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The value depends on the tgsi_interpolate_loc which is not constant for
the loop. llvm should be able to cse in cases where they are the same.
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit aa28fcb610)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The size queries for images do not use function pointers so we need to
be careful that width, height and depth are 0.
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d6dd96e1c7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Fixes import of planar formats like NV12 in gtk4. Allows
`gst-launch-1.0 v4l2src ! gtk4paintablesink` to use vulkan instead of
falling back to OpenGL.
Closes: #14217
Cc: mesa-stable
Signed-off-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 83b97379dc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
We need to handle plane offsets everywhere. I noticed this broken before but
didn't realize it was a GL driver issue. Fix is easy, wrote this on my sofa
while waking up in the morning.
Fixes gst-launch-1.0 v4l2src ! glimagesink
Note that cheese & snapshot both still hang for some reason due to
libgstpipewire, but the Mesa side should be fine now.
Closes: #14217
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
(cherry picked from commit aa9f937116)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
No shader-db changes on any Intel platform.
fossil-db:
Skylake
Intel(R) HD Graphics 530 (SKL GT2)
Totals:
Cycle count: 57669758527 -> 57669757913 (-0.00%); split: -0.00%, +0.00%
Totals from 10 (0.00% of 1736875) affected shaders:
Cycle count: 274949 -> 274335 (-0.22%); split: -0.36%, +0.14%
This change is likely due to subtle differences of different registers
being allocated.
In addition, fossils/google-meet-clvk/BgBlur.1f58fdf742c27594.1.foz and
fossils/google-meet-clvk/Relight.1f58fdf742c27594.1.foz stopped failing
EU validation on Gfx9 platforms.
Closes: #14171
Fixes: e7b7d572b3 ("intel/fs/ra: Re-arrange interference setup")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 3e6af6c5bb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
write_memory is used after encoding every frame to mark the feedback
buffer as ready. Only use it when write_memory can work without PCIe
atomics support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 874e02003a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The llvm::orc::ThreadSafeContext object wraps an llvm::Context and keeps
its reference.
As we are no longer able to squeeze out Context from ThreadSafeContext
in LLVM 21, do not let ThreadSafeContext create Context implicitly for
LLVM 21, instead explicitly create Context and then remember it.
This also eliminates the code creating a Context that is never disposed.
Fixes: cd129dbf8a ("gallivm: support LLVM 21")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
(cherry picked from commit cc60a7a39d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
If the AR is loaded from a register changing that register in a loop was
resulting in a scheduling failure because the AR load was made dependend
on a later instruction. Fix the dependencies by only using dependencies on
older instruuctions in the same block.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14114
Fixes: d21054b4bc ("r600/sfn: Add pass to split addess and index register loads")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 43d9765e35)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The logic here before was wrong. In the case where the set is the same,
it would avoid the flush but then re-initialize anyway, loosing the
dirty information and causing us not to actually flush out all the
descriptors.
Fixes: 1f0fda22f7 ("nvk: Flush descriptor set maps")
(cherry picked from commit 2f6b3b6b91)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The Vulkan spec says:
"VUID-vkCmdDraw-maxFragmentDualSrcAttachments-09239
If blending is enabled for any attachment where either the source
or destination blend factors for that attachment use the secondary
color input, the maximum value of Location for any output attachment
statically used in the Fragment Execution Model executed by this
command must be less than maxFragmentDualSrcAttachments"
Which means it must be disabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14190
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit b2badb2b24)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Struct virgl_renderer_capset_drm has a varying size depending on whether
AMDGPU driver is enabled or not. This breaks offset of struct vdrm_device
members for non-AMD drivers when Mesa is built with multiple native context
drivers including the AMD driver. Place varying capsets in the end struct
vdrm_device to mitigate the issue.
Fixes: 5736280730 ("virtio/vdrm: add ENABLE_DRM_AMDGPU for c_args")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
(cherry picked from commit bd8377bb04)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
v2: - Correctly test in multi-slot split whether the group has kill if
we want to add a multi-slot op.
- update group_has_predicate if an according vector op was added
Fixes: 359bfc3138 ("r600/sfn: make sure that kill and update pred are not in the same group")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 317345cc98)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Intel & AMD Direct3D drivers modify their rounding behaviour for texturing to
match Direct3D expectations. Such behaviour is not conformant in Vulkan, and
Intel hardware lacks a reasonable way to get NVIDIA's behaviour (which uniquely
works for Vulkan & Direct3D). The second best choice is to use
Direct3D-compatible behaviour for Proton (via driconf) and our current
Vulkan-conformant behaviour everywhere else. Given the APIs diverge and there is
no Vulkan extension to control the behaviour explicitly, driconf'ing on the
engineName is the reasonable solution.
anv already has a anv_force_filter_addr_rounding driconf option to force
Direct3D behaviour for certain Direct3D titles. Here we simply apply it to all
D3D10+ titles, aligning us with the Windows driver.
Note that D3D9 does not have this behaviour. We therefore use standard Vulkan
behaviour for D3D9 to avoid breaking D3D9 titles, even though the engineName is
the same as D3D10+.
This is the same solution radv uses, they call it radv_disable_trunc_coord. We
could unify the driconf entries later.
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38098#note_3166306
for a more detailed analysis, as well as the linked references:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27337https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25911https://github.com/HansKristian-Work/vkd3d-proton/pull/1884
This fixes misrendering in piles of Direct3D games run on anv via Proton,
including Assassin's Creed Valhalla.
Cc: mesa-stable
Closes: #13886
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Co-authored-by: Calder Young <cgiacun@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit 7a71952762)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The arr::in_bounds field was set unconditionally for every deref created
for a chain. For struct derefs, which don't have this field, this would
write to an unused memory location, which is probably why this never
caused issues.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f19cbe98e3 ("nir,spirv: Preserve inbounds access information")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 0ac55b786a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
VCN requires 64x16 alignment for HEVC. When the app requests non-aligned
resolutions, make up for it with conformance window cropping.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Cc: mesa-stable
(cherry picked from commit cef8eff74d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>