Previously, we update the sfb dst slot upon vn_SignalSemaphore so that
vn_GetSemaphoreCounterValue can poll just the feedback slot itself.
However, that can race with pending sfb cmds that are going to update
the slot value, ending up with stuck sync progression.
This change fixes it by disallowing vn_SignalSemaphore to touch the sfb
dst slot. To ensure counter query being monotonic, vn_GetSemaphoreCounterValue
now takes the greater of signaled counter and the sfb counter read.
Test with dEQP-VK.synchronization* group:
- w/o this: stuck shows up within 2 min with 8 parallel deqp runs
- with this: no stuck for multiple full runs of the same
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14304
Fixes: 5c7e60362c ("venus: enable timeline semaphore feedback")
(cherry picked from commit 829bd406c0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This is forgotten when advertising the corresponding extension, which
leads to inconsistency, thus fail of
dEQP-VK.api.info.vulkan1p2.feature_extensions_consistency CTS testcase.
Enable the corresponding feature too. I ran all CTS tests with
"mirror_clamp_to_edge" in name, which are all skipped with NotSupported
before (because of the feature being not advertised), and gain
3695/11140 Pass with the remaining ones still NotSupported (no Fail).
This also makes the feature extension consistency CTS testcase Pass too.
Fixes: 4d34c07b7a ("pvr: advertise VK_KHR_sampler_mirror_clamp_to_edge")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit ab9e148bfb)
Conflicts:
src/imagination/vulkan/pvr_physical_device.c
(File has been renamed since branchpoint)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This utilizes the RGBX format faking logic from e8cd7a30 to enable
PIPE_FORMAT_R10G10B10X2_UNORM renderer support using swizzling.
This format is needed for better HDR rendering support in the iris driver, to
support the Proton / Wine DXGI implementation, which requires an RGBA ordered
renderer for its Vulkan implementation. This in turn requires the Wayland
display to support both alpha and opaque formats. The check currently fails,
since only PIPE_FORMAT_R10G10B10A2_UNORM is exposed when Gallium (iris) is
the DRI Wayland renderer.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 858364be71)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Prevents failures with fp16 in lavapipe and Zink on lavapipe when
"nir/lower_flrp: Check and set shader_info::flrp_lowered" is
applied. Lowering with an incomplete mask on the first call to
nir_lower_flrp will prevent later calls (with the complete mask) from
doing anything.
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 341e2d3283)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Dynamic descriptors are mapped an array of offsets provided through
vkCmdBindDescriptorSets*() commands.
When pipelines are compiled with independent sets layouts, the
implementation might have to do additional runtime calculation to
figure out what offset in the contiguous array maps to what dynamic
descriptor in the pipeline layout.
For graphics pipelines you can always compute that information when
binding the shaders. There is always a limited amount of shaders (5
max).
For ray tracing pipelines, there could be lots of shaders to process
at every pipeline binding call. Besides there is no interface from the
runtime to the driver to list all the shaders used at the moment.
So do that tracking in the runtime and pass the information down to
the driver through the cmd_set_rt_state() vfunc.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 69a04151db ("vulkan/runtime: add ray tracing pipeline support")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 5c53c6e693)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Otherwise, the following error is observed:
src/util/cache_ops_x86_clflushopt.c:40:22:
error: arithmetic on a pointer to void is a GNU extension [-Werror,-Wgnu-pointer-arith]
40 | void *end = start + size;
| ~~~~~ ^
src/util/cache_ops_x86_clflushopt.c:44:9:
error: arithmetic on a pointer to void is a GNU extension [-Werror,-Wgnu-pointer-arith]
44 | p += cpu_caps->cacheline;
| ~ ^
This works with GNU extension enabled, but does lead to warnings
with Clang.
v2: Add to trial_c + trial_cpp checks (Erik)
v3: use c_msvc_compat_args to avoid fixing other instances of this issue (Erik)
Fixes: 555881e574 ("util/cache_ops: Add some cache flush helpers")
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 14cfe14626)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The order of pReferenceSlots is not well-defined by spec. Instead we
need to look at the RefPicList0/1 which provides slot indices.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit ab56ce154b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Since we only support 1 L0/L1 ref, the default num refs in the PPS
should always be 0. With that there never any need to set the override
flag in the slice header (until more references are supported).
Also the ref pic list modifications should be clamped to the size of the
ref pic list.
This fixes an issue seen with dEQP-VK.video.encode.h264.i_p_b_13_*.
Cc: mesa-stable
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 2e21eec921)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The subslice IDs provided by the SR0.0 EU register are not adjusted to account
for fusing, so the upper bound max_scratch_ids can vary from device to device
depending on what specific slices were fused during manufacturing.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
(cherry picked from commit c0d809820f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This ensures that g0 is reserved for spilling since there is going to be
spilling.
Fixes: 8bca7e520c ("intel/brw: Only force g0's liveness to be the whole program if spilling")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 1fc2f52d36)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has been long removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.
Cc: mesa-stable
(cherry picked from commit bc1eea90b9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
If we're using the singleton, we need to add to it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 4b9aa9dc91)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Not having the uses_printf will drop the printf info in serialization.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 67faf6dfbd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Kopper is not supported on Android, and attempting to use it breaks zink
on the platform.
Disable kopper automatically when running on Android, fixing zink without
`LIBGL_KOPPER_DISABLE`.
Fixes: 3294cad341 ("egl: Rename dri2_detect_swrast() and also detect kopper")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14331
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
(cherry picked from commit 8a1ea724b4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
In the WAIT_ALL case in spin_wait_for_sync_file(), we were returning the
moment we saw the first success. However, this isn't a wait-all, it's a
bad wait-any. We should instead just continue on to check the next sync
until we've ensured that every sync in the array has a sync file. The
only reason this wasn't blowing up in our face is because it only
affects non-timeline drivers (pretty rare these days) and because most
of the places where we use WAIT_PENDING on non-timeline drivers is to
guard a sync file export and those typically have only a single sync in
the array.
Cc: mesa-stable
Reviewed-by: Gurchetan Singh <gurchetansingh@google.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
(cherry picked from commit e4e619d685)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The shader-db functionality was interfering with the error
filters.
Two new options are added: R600_DEBUG=shaderdb and
R600_DEBUG=precompile. The option precompile is added
to maintain the compatibility with the shader-db repository.
This change fixes 22 of these tests:
deqp-gles31/functional/debug/error_filters/case_.*: warn pass
deqp-gles31/functional/debug/error_groups/case_.*: warn pass
Fixes: 28d6a5af25 ("r600: Add shader precompile and shader-db support.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit f005c0b5ad)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Without these fixes, H.265 streams using long-term references would
fail to decode correctly as the decoder wouldn't distinguish between
short-term and long-term reference frames.
Fixes: 896f95a37e ("vulkan/video: fix h265 decoding with LT enabled.")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 01de6ac134)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Moving `ci-tron:priority:` out of the variable because an empty value
will not be authorized, and this makes it obvious if that bug ever
happens (job will not be picked up and gitlab will complain that
`ci-tron:priority:` is not a tag registered by any runner), instead of
getting picked up by any runner that will then reject (fail) the job.
(This is caused by GitLab's API not allowing tags to be enforced when
picking up jobs, resulting in jobs with missing tags being picked up by
any runner, like the bug we had with the generic fd.o runners a few
months ago.)
v2 (Martin Roukala):
* use the priority tags in all amdgpu jobs
* add missing tags in etnaviv jobs
* add missing tags in broadcom jobs
Cc: mesa-stable
(cherry picked from commit 53fe1f39a0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Fixes this test on Xe2+:
INTEL_DEBUG=no32 ./deqp-vk -n dEQP-VK.spirv_assembly.instruction.maint9_vectorization.bit_field_u_extract.result_v16i-base_v16i-offset_s64u-count_s16i
Generate invalid code for that platform:
and(16) g37<1>UW g65<16,4,4>UW 0x000fUW { align1 1H I@5 };
ERROR: Invalid register region for source 0. See special restrictions section.
Several helpers like has_subdword_integer_region_restriction() do not
see the final type of the source, so compute it early.
Maybe new_src could be used in more cases. Being conservative for now.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 8f9acc0150)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
When probing on generic Linux platforms, the loading of d3d12 and
the first init of could fail, but the error returned causes a
loader warning to be printed.
Use the correct error return to stop this.
Cc: mesa-stable
(cherry picked from commit c00b66fa71)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The option's description is:
> Whether to use LLVM for the Gallium draw module, if LLVM is included.
Let's disable it right away if LLVM is disabled, to avoid some
configurations from failing.
Cc: mesa-stable
(cherry picked from commit 37c7d19e46)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Prior to commit b8b38d38b1 ("meson: reinstate LLVM requirement for r300
and enforce it for i915 too") it was possible to build and use r300 for
architectures that do not have LLVM (e.g., alpha).
The only SWTCL chips are integrated graphics in x86 systems, and are not
available in discrete cards.
Fixes: b8b38d38b1 ("meson: reinstate LLVM requirement for r300 and enforce it for i915 too")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4235c39a9a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Swizzle can include PIPE_SWIZZLE_0/_1 (4 and 5) which result in indexing
beyond the channel array.
Reported-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Fixes: 76e350671f ("freedreno/a6xx: Sysmem clear fixes")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
(cherry picked from commit f0465ced7f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The versioning scheme changed in v45.0 (the previous version was
3.48.0). As such, this version check would wrongly accept e.g. 48.0.
Fixes: e9341568fa ("meson: require sysprof-capture-4 >= 4.49.0")
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit ad14942300)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Sadly this probably won't change anything in terms of perf as the CCS
engine has a bunch of other restrictions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 243c01c703 ("anv/iris: implement Wa_18040903259")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit 07b7de35cc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
When there are no color outputs in the rendering state, but color write
enable/write aren't masked out (which seems legal with
VK_EXT_dynamic_rendering_unused_attachments), the driver must emit
CB_DISABLE to disable CB rendering completely.
Otherwise, if there is also a depth/stencil attachment in the rendering
state, CB0 is always set to 32_R for RB+. That means, the pixel shader
would still export fragments but to the previously bound color
attachment.
VKCTS is missing coverage.
Fixes: 4580293ab2 ("radv: implement RB+ depth-only rendering for better perf")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14319
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 168a8d0b52)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
vkQueueBeginDebugUtilsLabelEXT and vkQueueEndDebugUtilsLabelEXT
require queue to be externally synchronized, which means these functions
require the lock. Unfortunately, there's no guarantee that the debug
markers will be matched in the multithreaded case, but I suppose this is
better than crashing.
Fixes: 015eda4a41 ("zink: deduplicate VkDevice and VkInstance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 80db8171de)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
We currently only create one queue per queue family on the device. The
device can be shared between multiple zink_screens, so having one lock
per screen can still lead to multiple locks per queue. Fix this by
allocating queue_lock along with the device.
This fixes an issue that was causing crashes with nvk+zink and
QtWebEngine with QTWEBENGINE_FORCE_USE_GBM=1 This can be reproduced by
resizing the window in either:
* anki - https://apps.ankiweb.net/ or
* Qt's simplebrowser example
https://doc.qt.io/qt-6/qtwebengine-webenginewidgets-simplebrowser-example.html
which would then cause this dmesg error:
nouveau 0000:01:00.0: anki[92007]: Failed to find syncobj (-> in): handle=40
along with a context loss.
With VK_LOADER_LAYERS_ENABLE=VK_LAYER_KHRONOS_validation we would additionally
get warnings like:
Validation Error: [ UNASSIGNED-Threading-MultipleThreads-Write ] | MessageID = 0xa05b236e
vkQueueSubmit(): THREADING ERROR : object of type VkQueue is simultaneously used in current thread 139824449189568 and thread 139823901816512
Objects: 1
[0] VkQueue 0x557a666783e0
Fixes: 015eda4a41 ("zink: deduplicate VkDevice and VkInstance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9acce36652)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
We were overflowing an array during bifrost disassembly. This was
only a problem if the user explicitly set an environment variable,
so unlikely to occur in casual use, and also only could be triggered
in very specific, dense code. But we still should get this right!
The specific CTS test that caused the assert is:
'dEQP-VK.graphicsfuzz.stable-quicksort-for-loop-with-injection'
with environment variable `BIFROST_MESA_DEBUG=shaders`. One of the
shaders has a clause with 6 constants (the maximum) and this overflowed
the array because we assume we always have an extra slot (used for
modifier processing).
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 65ba14519e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
GL_EXT_texture_buffer_object requires support for alpha, luminance,
luminance-alpha and intensity formats. If we can't support those, we
can't enable the extension.
Fixes: 45ca7798dc ("glsl: handle interactions between EXT_gpu_shader4 and texture extensions")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6f2b8c3f61)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Most of the time, we remember to check for both extensions. But in one
case, it seems we forgot the GLES extension. Whoops.
Let's switch to a helper here, so we don't have to repeat the logic over
and over again.
Fixes: b4c0c514b1 ("mesa: add OES_texture_buffer and EXT_texture_buffer support")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9d5e0c1ad2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
There's a single underlying bo mapping shared by the initial alloc here
and the later import of the same. The mapping size has to be initialized
with the real size of the created blob resource, since the app can query
the exported native handle size for re-import. e.g. lseek dma-buf size
Similar to virtgpu_bo_create_from_device_memory, the app can do multiple
imports with different sizes for suballocation. So on the initial
import, the mapping size has to be initialized with the real size of the
backing blob resource.
Backport-to: 25.3
(cherry picked from commit 0afc408cb9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
If the allocation originates from the same instance, the tracker map
size follows the allocationSize. After export and re-import, mapping the
whole dma-buf can exceed the original map size. This change backs out
the offending changes.
Test: dEQP-VK.api.external.memory.*.suballocated.host_visible.*
Fixes: 442f242a49 ("venus: requests whole blob mem size for non-dedicated import")
(cherry picked from commit c259ea24ee)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.
This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.
Cc: mesa-stable
(cherry picked from commit 1243d575a5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
When no workgroup size is specified we try to run with the most optimal one
possible. However we didn't take into account that we shouldn't run a
workgroup of higher dimensionality than requested by the application.
Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
(cherry picked from commit d46be8fbf2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
There were two issues:
1. The global_work_offset parameter is optional but we errored on NULL
2. We didn't return the reqd_work_group_size when set on the kernel.
Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
(cherry picked from commit 810dca450c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
_math_matrix_is_dirty() should only be used to decide if we need to
run _math_matrix_analyse(). We already decided that we had a new
texture matrix when we called _mesa_update_texture_matrices() so
we need to set _TexMatEnabled correctly otherwise we might
incorrectly return _NEW_FF_VERT_PROGRAM | _NEW_FF_FRAG_PROGRAM in
the following if-statement.
Fixes: ec978e002f ("mesa: only update fixed-func programs on texture matrix enablement changes")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14286
Reviewed-by: Emma Anholt <emma@anholt.net>
(cherry picked from commit b0047be0c2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The VMA of VkDeviceMemory has to accomodate all the resources that can
be bound to it. For sparse images it's 64KiB alignment, for other
tiled images it's 4KiB. But we also have a workaround that requires a
64KiB alignment for Tile4 images.
The initial version of the slab allocator missed the 4KiB alignment.
This fix adds the workaround handling too.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: dabb012423 ("anv: Implement anv_slab_bo and enable memory pool")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 401b2066b0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
SB_ID(LS) is currently equal to zero, so this is not a behavior change,
but worth setting it explicitly for clarity and in case the sb
assignments change.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 885805560f ("panvk/csf: fix case where vk_meta is used before PROVOKING_VERTEX_MODE_LAST")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit ebbf05f9d2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
We check fn_set_fbds_provoking_vertex_stride == 0 to determine whether a
previous function variant has already been allocated, so this value must
be initialized to zero before we start the loop. We could fix this by
explicitly initializing just that field, but I figure it's simpler and
safer to just zero-initialize the whole struct.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 885805560f ("panvk/csf: fix case where vk_meta is used before PROVOKING_VERTEX_MODE_LAST")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit e899bc8be8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
send.ugm (1|M0) r125 r0 null:0 0x0 0x0200651F {$9} // wr:1+0, rd:0; fence invalid flush type scoped to tile
When destination of Send(s) is not null, the response length must not be 0.
Should only affect DG2 products.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 4816318887)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The flag mega_fetch should be set on rv770 for a
read scratch operation (as written in the r700
documentation p357). Without this flag, read scratch
does not work and a gpu hang could be triggered.
Here are the tests fixed:
shaders/glsl-predication-on-large-array: fail pass
spec/glsl-1.10/execution/temp-array-indexing/glsl-fs-giant-temp-array: fail pass
spec/glsl-1.10/execution/temp-array-indexing/glsl-vs-giant-temp-array: fail pass
spec/glsl-1.30/execution/fs-large-local-array: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec2: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec3: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec4: fail pass
spec/glsl-1.30/execution/fs-multiple-large-local-arrays: fail pass
Fixes: 9c48a139b0 ("r600g: Support emitting scratch ops")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit f8de09a811)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This code was no longer needed after switching to os_read_file, but I
accidentally left it around, whoops!
Fixes: 49183bfb79 ("pan/bi: use os_read_file-helper")
CID: 1665295
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit d77279fa9b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
The PANFROST_JM_CTX_PRIORITY values aren't bitmasks, but enum values.
But the kernel interface uses the BIT()-macro on them, so we need to do
the same. We don't have the macro, but it's trivial to do this with a
bitshift instead.
Fixes: f04dbf0bc0 ("pan/kmod: query and cache available context priorities from KMD")
CID: 1666511
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 37a7a157e8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
This was copied from radeonsi which expected seq_force_screen_content_tools = 2
and seq_force_integer_mv = 2.
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 3858a6a696)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Disable sparse mappings on GFX7-8 due to GPU hangs in the VK CTS,
except Polaris where it happens to work "well enough" to pass
the VK CTS and run some games already.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 567e1b56ef)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Also disable the sparse binding queue and other related features.
Using sparse on GFX6-8 can cause GPU hangs at the moment.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1c8881fc60)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
We need to make sure the data part returned by sampler messages is
always aligned to a physical register. Just like the residency data
lives in a single physical register after the data.
Lowering a vec3 16bits per components led to a half a physical
register allocation which then confused the descriptor lowering
(expecting physical register units).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 295734bf88 ("intel/fs: fix residency handling on Xe2")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12794
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 61d6aea401)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
To avoid incompatibility between the compiler implementations used by
the driver and the renderer, seq_cst ordering is picked here, which has
required a full mfence instruction. Then the renderer side acquire is
ensured to be ordered after the cache flush of ring cs updates.
Perf wise, there's no regression in headless vkmark runs. In theory,
the overhead introduced here weighs trivially as compared to the ring
cs encode/decode part. So we should go for better robustness.
Test: venus on windows guest works with renderer on Linux
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14277
(cherry picked from commit 07d059f3e2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
With VK_EXT_unused_attachments, we may have a case where the FS writes
to attachments 0 and 1, both have valid formats and are enabled, yet the
renderpass only has 1 color attachment. In this case we would set
RB_PS_MRT_CNTL to 2, but since we never emitted RB_MRT_BUF_INFO[1] and
so on, we would get garbage attachment info from the last render pass
and end up writing to an attachment that doesn't exist.
Fix this by disabling attachments that are unused. We can't move setting
RB_PS_MRT_CNTL to emitting when we emit color RT state, because then we
have the inverse problem of a FS that writes to attachments 0 and 1, a
renderpass that has 2 attachments, but a blend state that only includes
1 attachment (and therefore disables color writes for attachment 1). At
least one side (blending or RT emission) has to assume that the other
side may have more RTs enabled and disable the rest of the RTs up to
MAX_RTS.
Fixes: c2eb768eb2 ("tu: Expose VK_EXT_dynamic_rendering_unused_attachments")
(cherry picked from commit 6064e3a7d8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
Either we need to save this pointer or toss it.
==146166==ERROR: AddressSanitizer: heap-use-after-free on address 0x7bfe77013920 at pc 0x7b9e6fd5b978 bp 0x7ffc30ef18e0 sp 0x7ffc30ef18d8
READ of size 4 at 0x7bfe77013920 thread T0
#0 0x7b9e6fd5b977 in get_header ../src/util/ralloc.c:83
#1 0x7b9e6fd5b977 in ralloc_parent ../src/util/ralloc.c:382
#2 0x7b9e6fd5b977 in reralloc_size ../src/util/ralloc.c:198
#3 0x7b9e6fd5b977 in reralloc_array_size ../src/util/ralloc.c:241
#4 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21
#5 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33
#6 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126
#7 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42
#8 0x7b9e6ff998e4 in nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200
#9 0x7b9e6f8aab4d in st_finalize_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:735
#10 0x7b9e6f0afb14 in st_create_common_variant ../src/mesa/state_tracker/st_program.c:858
#11 0x7b9e6f0be2d3 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:973
#12 0x7b9e6f0bf9cf in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1478
#13 0x7b9e6f0bf9cf in st_finalize_program ../src/mesa/state_tracker/st_program.c:1596
#14 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633
#15 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#16 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#17 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#18 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
#19 0x7f9e7893dd65 in GOMP_parallel (/lib64/libgomp.so.1+0xdd65) (BuildId: 9cc501fdca53b5d4ab094f709486781c98573bc9)
#20 0x000000400d6a in main /home/alyssa/shader-db/run.c:689
#21 0x7f9e78011574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
#22 0x7f9e78011627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
#23 0x000000401014 in _start (/home/alyssa/shader-db/run+0x401014) (BuildId: a83b8d830cc265be3f54ea3e7a21a0fb5156624b)
0x7bfe77013920 is located 0 bytes inside of 64-byte region [0x7bfe77013920,0x7bfe77013960)
freed by thread T0 here:
#0 0x7f9e782e5beb in free.part.0 (/usr/lib64/libasan.so.8+0xe5beb) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
#1 0x7b9e6fd5bc39 in unsafe_free ../src/util/ralloc.c:319
#2 0x7b9e6fd5bc39 in ralloc_free ../src/util/ralloc.c:264
#3 0x7b9e70063d81 in nir_sweep ../src/compiler/nir/nir_sweep.c:219
#4 0x7b9e6f0bf499 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1585
#5 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633
#6 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#7 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#8 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#9 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
previously allocated by thread T0 here:
#0 0x7f9e782e5e4b in realloc.part.0 (/usr/lib64/libasan.so.8+0xe5e4b) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
#1 0x7b9e6fd5a883 in resize ../src/util/ralloc.c:167
#2 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21
#3 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33
#4 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126
#5 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42
#6 0x7b9e6ff998e4 in nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200
#7 0x7b9e6f8b0ede in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:550
#8 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#9 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#10 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#11 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
Fixes: 17876a00af ("nir: Add a faster lowest common ancestor algorithm")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 65fcdf4c81)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
VCN requires the luma/chroma VAs to be 256 aligned. On VCN5, the
collocated buffer was not 256 aligned which can cause these VAs to be
unaligned.
This fixes VVL PositiveVideoEncodeH264.Basic on VCN5.
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 8848495875)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
GTK is missing a semaphore between QueueSubmit() and QueuePresent()
causing the WSI submit to be "unordered" and to immediately signal the
semaphores (because it's missing a wait semaphore in QueuePresent()).
The workaround is to disable unordered WSI submits until GTK fixes it
properly.
Cc: "25.3"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14087
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0d9d45db4e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
CTA-861-G section 6.9.1 Static Metadata Type 1 declares that zero values
for different groups of HDR Metadata properties are allowed, including
zero nits values for max display mastering luminance, max content light
level, max frame-average light level and min display mastering luminance.
A zero value is meant to be treated by the video sink as "undefined" /
"unknown", and handled accordingly. This is common for dynamically
generated visual content.
The is_hdr_metadata_legal() function in the Vulkan/WSI/Wayland HDR backend
currently declares HDR light level metadata as invalid if the mastering
display min_luminance and max_luminance light levels are set to the legal
level of zero nits. This causes valid HDR metadata as set by the client
via vkSetHdrMetadata() to be not sent to the compositor.
Fix this by skipping checks that don't apply if min_luminance or
max_luminance are zero. If max_luminance is zero then we skip sending
of mastering display min/max luminance to Wayland, as sending a a
max_luminance <= min_luminance would trigger a protocol error. All
other valid data is still send, ie. color primaries, white-point,
content light levels.
Fixes: cb7726bb2c ("vulkan/wsi: validate HDR metadata to not cause protocol errors")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Co-authored-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Xaver Hugl <xaver.hugl@kde.org>
(cherry picked from commit 490f05f82c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
The RGBA4/BGRA4 formats had the PAN_BIND_STORAGE_IMAGE set, but we
cannot support that.
Fixes: d95423686f ("pan/format: Add PAN_BIND_STORAGE_IMAGE flag")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 15868cf6e9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
This was mapped to RG16F, while R16F should be correct.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 1e2ca4dad6)
Conflicts:
src/panfrost/ci/panfrost-g610-fails.txt
src/panfrost/ci/panfrost-g610-flakes.txt
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Add resource table and index check to instruction equality function.
This prevents CSE from mistakenly eliminating LEA_BUF_IMM instructions
that load from different resources, but with the same buffer offset.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 00b5275fe8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Previously the type info for nested values was copied from the source
operand, rather than propagating the new type from the destination
operand.
Fixes: 4c363acf94 ("vtn: Allow for OpCopyLogical with different but compatible types")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 7ac1f7777d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
this code is invalid after the refcounting rework
Fixes: b3133e250e - gallium: add pipe_context::resource_release to eliminate buffer refcounting
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 7d22e4c7ba)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
This is what happens when you leave MR unreviewed for months.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit c4e2878537)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
For GS streamout, we need the following LDS scratch space:
- Repacking streamout vertices takes 1 dword per 4 waves per stream
(max 16 bytes for Wave64, max 32 bytes for Wave32)
- 1 dword per stream for buffer info
(16 bytes)
- 1 dword per buffer for buffer info
(16 bytes)
Previously, the space used for buffer info aliased with the
space for repacking the output vertices in ngg_gs_finale(),
and there was no barrier in between, which caused a race
condition, resulting in random failure.
Fix this by allocating a few more LDS dwords so that aliasing
is not required, which also allows us to remove an extra
workgroup barrier.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 8f99d736d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
1. The prolog needs to have a null check. Libraries don't have prologs.
2. We only need to print the shaders actually included in this pipeline.
Libraries were already printed separately.
3. The traversal shader was wrongly omitted from the output.
Cc: mesa-stable
(cherry picked from commit 73a31dafbc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Per ARB_vertex_program spec result registers are 4-component and initially
undefined, and the FF fragment program expects its intputs to be
4-component too. So, if the client's vertex program does not write the
whole vector it will cause misrenderings unless the same client also
supplies fragment program that expects less than 4 componens.
This commit adds a workaround that initializes results to vec4(0, 0, 0, 1)
which seems to be an expected behavior for such clients.
Cc: mesa-stable
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit f03432c81a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Use tristate for the aligned setting, otherwise it is always
first disabled which contributes to the condition if we set the
new stride active.
v2: set ByteStride in dword units and take secondary cmdbuf
in to account (Lionel)
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
(cherry picked from commit 2741ddd75a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
In ray tracing dispatch, we have dispatch.threads set to 0 since we
calculate the local_size_x/y/z based on the launch sizes.
This change takes 0 threads into an account and returh the TG size 8 in
such scenarios. Before this change, we were setting TG size to 2.
Fixes: 0c4e1c9efc ("intel/common: Add helper for compute thread group dispatch size")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 16f66ffe55)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
With the current stack configuration the rv770 seems to be unable
to go beyond three with the "vs-output-array-float-index-wr-before-gs.shader_test"
test. Anyway, the value four seems to be sufficient for the other tests.
This issue was triggered on rv770, for instance, with:
"piglit/bin/shader_runner tests/spec/glsl-1.50/execution/variable-indexing/gs-output-array-float-index-wr.shader_test -auto -fbo"
"piglit/bin/shader_runner tests/spec/glsl-1.50/execution/variable-indexing/vs-output-array-float-index-wr-before-gs.shader_test -auto -fbo"
Fixes: 713edb5998 ("r600/sfn: handle the IF predicate in the scheduler")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit ae049f6fea)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Drivers already have to track this workaround, so remove the logic
from Blorp and let the driver manage this.
Also in Anv don't accumulate this workaround, emit it directly in
place right after COMPUTE_WALKER. Accumulating can be problematic when
you want to dispatch concurrent compute shaders that do not need any
cache flush interaction (typical example with the internal
simple_shader framework).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3e0ad0176b ("anv: Emit state cache invalidation after every compute dispatch")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit c478b6355a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
CmdWriteAccelerationStructuresPropertiesKHR writes the data with MI
commands, we no longer dispatch shaders to write the properties.
As a result, we don't need to flush untyped cache.
Fixes: f0e18c475b ("intel: remove GRL/intel-clc")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 14194e59a4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
This fixes the VVL PositiveVideoDecodeAV1.* tests, which trigger error
concealment. These DPB addresses would not be normally used, but get
used by the error concealment path.
Fixes: d103b76ad6 ("radv/video: add VK_KHR_video_decode_av1 support.")
Reviewed-by: David Rosca <david.rosca@amd.com>
(cherry picked from commit 82d944b388)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
The base address used for bounds checking the entry was wrong. Directly
pass the end_of_entry address instead.
Fixes: db4bcd48d7 ("panvk: Fix IUB decode")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 89293120f0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
The needs_temp_copy() function was incorrectly identifying some
depth/stencil formats as needing RGB<->RGBA conversion.
VK_FORMAT_D32_SFLOAT_S8_UINT maps to PIPE_FORMAT_Z32_FLOAT_S8X24_UINT,
which has 3 channels (F32 depth, UP8 stencil, X24 padding). The
component count check (== 3) was matching this as an RGB color format,
causing depth/stencil images to incorrectly use the RGB conversion path.
Add an explicit vk_format_is_depth_or_stencil() check before the
component count test to ensure depth/stencil formats always use the
direct copy path.
Fixes: f97b51186f ("anv: intermediate RGB <-> RGBX copy for HIC")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 0be53b2ed8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
At least some drivers need a full modeset to change the Colorspace
property or to en-/disable HDR mode. E.g., at least amdgpu-kms as
tested under Linux 6.8 on Polaris needs it. Otherwise the atomic
commit for disabling HDR in _wsi_display_cleanup_state() will fail,
and the connector stays stuck in HDR mode after vkDestroySwapchainKHR().
Fixes: 1ed78dd7ec ("wsi/display: Clean up DRM hdr/color state on swapchain destruction")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
(cherry picked from commit ba82d36dce)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
For a selected non-default imageColorSpace during swapchain creation,
make sure that proper HDR setup also works even if a client app does not
explicitly call vkSetHdrMetadataEXT() in time.
Assign the EDID provided metadata here, so the 1st atomic commit will
set Colorspace and HDR metadata properties on the connector, to make sure
HDR or other wide color gamut modes get enabled.
Without this, the chain->color_outcome_serial would stay at zero and
the properties would not ever get assigned during drm_atomic_commit(),
leaving HDR disabled on the display sink.
Fixes: 13137393f6 ("wsi/display: Expose HDR10 colorspace based on EDID")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
(cherry picked from commit 19b2e3b81b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
CTA-861-G section 6.9.1 Static Metadata Type 1 declares that zero values
for different groups of HDR Metadata properties are allowed, including
zero nits values for max display mastering luminance, max content light
level, max frame-average light level and min display mastering luminance.
A zero value is meant to be treated by the video sink as "undefined" /
"unknown", and handled accordingly. This is common for dynamically
generated visual content.
Therefore don't assert on some minimum nits level > 0, but only check for
a non-negative level.
Fixes: b4176393a0 ("wsi/display: Implement VK_EXT_hdr_metadata on KHR_display swapchain")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
(cherry picked from commit 19dc09aded)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
For AHB VkBuffer import, the allocationSize comes from the raw external
AHB props query and it can be larger than the underlying buffer memory
requirement. So we must respect the allocationSize for the actual mem
import to support mapping the whole AHB size, and the dedicated buffer
info has to be stripped to obey the spec.
Test: CtsNativeHardwareTestCases no longer crashes on debug build panvk
Fixes: 66bbd9eec8 ("panvk: implement AHB image deferred init and memory alloc")
Tested-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 4ec2a921d3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
We're doing the same in vk_pipeline_precomp_shader_create().
Also fixes valgrind warning due to uninitialized fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit fc6d17a290)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Always copy parameters that are not guarded by a flag, zero init
the structs if not provided by application.
Fixes vk_layer_validation_tests PositiveVideoEncode*.GetEncodedSessionParams
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 6a1c6ab95b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
We skip the stall emission for STATE_BASE_ADDRESS since this one can
be skipped on Gfx12.5+ and instead add a new sba tracepoint that has
valid timestamps.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0147908a89 ("anv: predicate emission of STATE_BASE_ADDRESS")
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
(cherry picked from commit cff047280a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Some new CTS tests have geometry shader looking like this :
void main()
{
gl_Position = gl_in[0].gl_Position;
EmitVertex();
EndPrimitive();
// <-- some storage buffer write
}
The generate shader has :
- a message to write the position
- a message to write to the storage buffer
- a final message to end the thread
This generates an empty EOT URB messages which is apparently not legal
(simulation complains, HW hangs) :
send(8) nullUD g126UD nullUD 0x04088007 0x00000000
urb MsgDesc: offset 0 SIMD8 write masked mlen 2 ex_mlen 0 rlen 0 { align1 1Q A@1 EOT };
Instead emit a write with actual data and the mask set at 0 to discard
the effect :
mov(8) g127<1>UD 0x00000000UD { align1 WE_all 1Q };
mov(8) g125<1>UD 0x00000000UD { align1 1Q };
send(8) nullUD g126UD g125UD 0x04088007 0x00000040
urb MsgDesc: offset 0 SIMD8 write masked mlen 2 ex_mlen 1 rlen 0 { align1 1Q A@1 EOT };
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit ff57c31696)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
This change:
1. Move size validation within sparse binding, but not escape to
non-sparse code path.
2. Error out if sparse is requested on unsupported platforms.
Fixes: d747c4a874 ("lavapipe: Implement sparse buffers and images")
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit e0acc5c2b4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
Not the most optimal solution but 64-bit vertex attributes are rarely
used. Could still revisit if we find a real use case that matters.
This fixes recent VKCTS coverage:
dEQP-VK.pipeline.fast_linked_library.vertex_input.component_mismatch.r64g64b64.*_to_dvec2
dEQP-VK.pipeline.shader_object_.*.vertex_input.component_mismatch.r64g64b64.*_to_dvec2
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14243
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a0d607bfdb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38432>
The shaders in question use:
(memory_load + (gl_SubgroupSize - 1)) & ~(gl_SubgroupSize - 1)
My guess is that this is supposed to be the subgroup size of whatever
produced the value, not the subgroup size in this shader.
And because in the consumer the workgroup size is 32, we use wave32.
Fixes: a2d3cbac2a ("radv: determine subgroup/wave size early")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14187
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 83e9ae2d5c)
Conflicts:
src/amd/vulkan/radv_instance.c
src/amd/vulkan/radv_instance.h
src/util/driconf.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Align with gallium side. When fixed-function blending is not available,
the internal blend shader is used. This is handled by a single ST_TILE
in the blend shader with the current sample ID, which requires sample
shading enablement.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 763d2418b8)
CI fails removed from cherry-pick as the file doesn't exist on stable,
and the main branch change has only removals.
Conflicts:
src/panfrost/ci/panfrost-g925-fails.txt
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Subpasses can have different view masks, although this isn't often used.
So we can't use the view mask of the last subpass when deciding what to
store, instead we have to use the same used_views field that's used by
loads and clears.
Noticed by upcoming tests for VK_QCOM_multiview_per_view_render_areas.
Cc: mesa-stable
(cherry picked from commit c0b5c04b84)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
It's not just used for clears, it was already used for loads and it
needs to be used for stores too so clear_views was a confusing name.
Cc: mesa-stable
(cherry picked from commit 6c3ed74ed2)
Conflicts:
src/freedreno/vulkan/tu_pass.cc
src/freedreno/vulkan/tu_pass.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Some implementations can emit tracepoints when copying u_trace
buffers. It's important to reserve the slots we want to copy into
before emitting the copies so that both processes don't clash with one
another.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit df5f92d114)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
If renderpass has D/S attachment, but pipeline has D/S as UNDEFINED,
D/S should be properly disabled for the pipeline. The easiest way is to
ensure that D/S state is valid when pipeline's D/S format is UNDEFINED.
So we always create VkPipelineDepthStencilStateCreateInfo.
CC: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 2798ef7bfd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The handling in dedup_srcs was incorrect because it would apply the
modifier from srcs[i] to the LUT without removing the modifier from the
instruction. We can fix and simplify this code by removing all modifiers
before the dedup_srcs() call, which we were doing immediately after the
call anyway.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13966
Fixes: 66c9c40f68 ("nak: Handle modifiers in dedup_srcs() in opt_lop()")
Reviewed-by: Seán de Búrca <sdeburca@fastmail.net>
Reviewed-by: Lorenzo Rossi <git@rossilorenzo.dev>
(cherry picked from commit 041216e605)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The value depends on the tgsi_interpolate_loc which is not constant for
the loop. llvm should be able to cse in cases where they are the same.
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit aa28fcb610)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The size queries for images do not use function pointers so we need to
be careful that width, height and depth are 0.
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d6dd96e1c7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Fixes import of planar formats like NV12 in gtk4. Allows
`gst-launch-1.0 v4l2src ! gtk4paintablesink` to use vulkan instead of
falling back to OpenGL.
Closes: #14217
Cc: mesa-stable
Signed-off-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 83b97379dc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
We need to handle plane offsets everywhere. I noticed this broken before but
didn't realize it was a GL driver issue. Fix is easy, wrote this on my sofa
while waking up in the morning.
Fixes gst-launch-1.0 v4l2src ! glimagesink
Note that cheese & snapshot both still hang for some reason due to
libgstpipewire, but the Mesa side should be fine now.
Closes: #14217
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
(cherry picked from commit aa9f937116)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
No shader-db changes on any Intel platform.
fossil-db:
Skylake
Intel(R) HD Graphics 530 (SKL GT2)
Totals:
Cycle count: 57669758527 -> 57669757913 (-0.00%); split: -0.00%, +0.00%
Totals from 10 (0.00% of 1736875) affected shaders:
Cycle count: 274949 -> 274335 (-0.22%); split: -0.36%, +0.14%
This change is likely due to subtle differences of different registers
being allocated.
In addition, fossils/google-meet-clvk/BgBlur.1f58fdf742c27594.1.foz and
fossils/google-meet-clvk/Relight.1f58fdf742c27594.1.foz stopped failing
EU validation on Gfx9 platforms.
Closes: #14171
Fixes: e7b7d572b3 ("intel/fs/ra: Re-arrange interference setup")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 3e6af6c5bb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
write_memory is used after encoding every frame to mark the feedback
buffer as ready. Only use it when write_memory can work without PCIe
atomics support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 874e02003a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The llvm::orc::ThreadSafeContext object wraps an llvm::Context and keeps
its reference.
As we are no longer able to squeeze out Context from ThreadSafeContext
in LLVM 21, do not let ThreadSafeContext create Context implicitly for
LLVM 21, instead explicitly create Context and then remember it.
This also eliminates the code creating a Context that is never disposed.
Fixes: cd129dbf8a ("gallivm: support LLVM 21")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
(cherry picked from commit cc60a7a39d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
If the AR is loaded from a register changing that register in a loop was
resulting in a scheduling failure because the AR load was made dependend
on a later instruction. Fix the dependencies by only using dependencies on
older instruuctions in the same block.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14114
Fixes: d21054b4bc ("r600/sfn: Add pass to split addess and index register loads")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 43d9765e35)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The logic here before was wrong. In the case where the set is the same,
it would avoid the flush but then re-initialize anyway, loosing the
dirty information and causing us not to actually flush out all the
descriptors.
Fixes: 1f0fda22f7 ("nvk: Flush descriptor set maps")
(cherry picked from commit 2f6b3b6b91)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The Vulkan spec says:
"VUID-vkCmdDraw-maxFragmentDualSrcAttachments-09239
If blending is enabled for any attachment where either the source
or destination blend factors for that attachment use the secondary
color input, the maximum value of Location for any output attachment
statically used in the Fragment Execution Model executed by this
command must be less than maxFragmentDualSrcAttachments"
Which means it must be disabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14190
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit b2badb2b24)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Struct virgl_renderer_capset_drm has a varying size depending on whether
AMDGPU driver is enabled or not. This breaks offset of struct vdrm_device
members for non-AMD drivers when Mesa is built with multiple native context
drivers including the AMD driver. Place varying capsets in the end struct
vdrm_device to mitigate the issue.
Fixes: 5736280730 ("virtio/vdrm: add ENABLE_DRM_AMDGPU for c_args")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
(cherry picked from commit bd8377bb04)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
v2: - Correctly test in multi-slot split whether the group has kill if
we want to add a multi-slot op.
- update group_has_predicate if an according vector op was added
Fixes: 359bfc3138 ("r600/sfn: make sure that kill and update pred are not in the same group")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 317345cc98)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
Intel & AMD Direct3D drivers modify their rounding behaviour for texturing to
match Direct3D expectations. Such behaviour is not conformant in Vulkan, and
Intel hardware lacks a reasonable way to get NVIDIA's behaviour (which uniquely
works for Vulkan & Direct3D). The second best choice is to use
Direct3D-compatible behaviour for Proton (via driconf) and our current
Vulkan-conformant behaviour everywhere else. Given the APIs diverge and there is
no Vulkan extension to control the behaviour explicitly, driconf'ing on the
engineName is the reasonable solution.
anv already has a anv_force_filter_addr_rounding driconf option to force
Direct3D behaviour for certain Direct3D titles. Here we simply apply it to all
D3D10+ titles, aligning us with the Windows driver.
Note that D3D9 does not have this behaviour. We therefore use standard Vulkan
behaviour for D3D9 to avoid breaking D3D9 titles, even though the engineName is
the same as D3D10+.
This is the same solution radv uses, they call it radv_disable_trunc_coord. We
could unify the driconf entries later.
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38098#note_3166306
for a more detailed analysis, as well as the linked references:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27337https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25911https://github.com/HansKristian-Work/vkd3d-proton/pull/1884
This fixes misrendering in piles of Direct3D games run on anv via Proton,
including Assassin's Creed Valhalla.
Cc: mesa-stable
Closes: #13886
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Co-authored-by: Calder Young <cgiacun@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit 7a71952762)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38268>
The arr::in_bounds field was set unconditionally for every deref created
for a chain. For struct derefs, which don't have this field, this would
write to an unused memory location, which is probably why this never
caused issues.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f19cbe98e3 ("nir,spirv: Preserve inbounds access information")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 0ac55b786a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
VCN requires 64x16 alignment for HEVC. When the app requests non-aligned
resolutions, make up for it with conformance window cropping.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Cc: mesa-stable
(cherry picked from commit cef8eff74d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
While accounting for an input register's merge set when resetting the
file start after the preamble, we implicitly assume that the allocated
register is the preferred one by asserting that the register's merge set
offset is not smaller than its physreg (to prevent an underflow).
However, inputs are not guaranteed to have their preferred register
allocated which causes the assert to get triggered.
Fix this by only taking the whole merge set into account for inputs that
actually got their preferred register allocated.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 9d4ba885bb ("ir3/ra: make main shader reg select independent of preamble")
(cherry picked from commit f84d85790e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
Wrapping jump instructions that are located inside ifs can break SSA
invariants because the else block no longer dominates the merge block.
Repair the SSA to make the validator happy again.
Cc: mesa-stable
(cherry picked from commit 50e65dac79)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
We may require a bigger more than 16KiB to handle the image copy.
We now always allocate a buffer to handle it properly fixing the
remaining failures on VKCTS 1.4.4.0 for HIC.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Mary Guillemard <mary@mary.zone>
(cherry picked from commit d37ba302d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
We were assuming that every formats used for HIC had a block widgh and
height of 1x1.
This is wrong for compressed formats like BC5, ASTC, ect.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Eric Engestrom <eric@igalia.com>
(cherry picked from commit 887f06a966)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
We don't support it, everyone dropped support for that, let's not expose it.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Mary Guillemard <mary@mary.zone>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 7e636d52f1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
Coverity notices that `nir_get_io_index_src_number` could return -1, and
that we use it to index an array. It cannot understand that -1 only
happens for unhandled enum values, but all of these are handled. Add an
assert to help it out.
CID: 1667234
Fixes: 37a9c5411f ("brw: serialize messages on Gfx12.x if required")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a5b9f428f9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
hk_heap is called during command buffer recording, which may be
concurrent, so writing dev->heap without synchronization is a data race.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
(cherry picked from commit bca29b1c92)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
We need to always use the FMT6_Z24S8_AS_R8G8B8A8 format for GMEM even if
UBWC is disabled, as already done for the 2d store path. Because we
use the pre-baked RB_MRT_BUF_INFO register value, this means we have to
override it.
Cc: mesa-stable
(cherry picked from commit 9417ce287c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
This can happen if we resolve to a resolve attachment and then use that
resolve attachment as an input attachment in a later subpass. We don't
need to put it in GMEM, but it's still considered "written" because
input attachment reads need a dependency after the resolve.
MSRTSS input attachment tests effectively created such a scenario after
lowering to transient multisample attachments and inserting resolves.
Cc: mesa-stable
(cherry picked from commit d491a79027)
Conflicts:
src/freedreno/vulkan/tu_pass.cc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
VK_KHR_maintenance9 requires that vertex attributes in shaders which map
to vertex attributes that aren't bound at the API return a consistent
value. In order to do this, we need toemit SET_VERTEX_ATTRIBUTE_A, even
for unused attributes. The RGBA32F format was chosen to ensure we
return (0, 0, 0, 0) from unbound attributes.
Fixes: 7692d3c0e1 ("nvk: Advertise VK_KHR_maintenance9")
(cherry picked from commit d39221cef3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
Fixes the following building error happening with clang:
../src/util/os_file.c:291:29: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct epoll_event evt = {};
^
1 error generated.
Fixes: 17e28652 ("util: mimic KCMP_FILE via epoll when KCMP is missing")
Cc: "25.3"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 7bbbfa6670)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
The index of each RT is the remapped color attachment index, so we have
to use the remapped indices when telling the HW the number of RTs.
This fixes KHR-GLES3.framebuffer_blit.scissor_blit on ANGLE once we
enabled VK_EXT_multisampled_render_to_single_sampled, which switched
ANGLE to using dynamic rendering with
VK_KHR_dynamic_rendering_local_read.
Fixes: d50eef5b06 ("tu: Support color attachment remapping")
(cherry picked from commit 8d276e0d70)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
brw_prog_tcs_data::instances can be divided by vertices per threads on
earlier generations.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a91e0e0d61 ("brw: add support for separate tessellation shader compilation")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit e450297ea9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
It's not really clear whether or not it should use gamma 2.2 or the piece-wise
transfer function, or how clients would use it for wider gamut in general.
Currently no compositors I know of support ext_srgb, so this shouldn't affect
applications in practice.
Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Fixes: 4b663d56 ("vulkan/wsi: implement support for VK_EXT_hdr_metadata on Wayland")
(cherry picked from commit 14fcf145e3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
Apparently various tessellation parameters come specified from
TESS_EVAL stage in GLSL while they come from the TESS_CTRL stage in
HLSL.
We switch to store the tesselation params more like shader_info with 0
values for unspecified fields. That let's us merge it with a simple OR
with values from from tcs/tes and the resulting merge can be used for
state programming.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a91e0e0d61 ("brw: add support for separate tessellation shader compilation")
Fixes: 50fd669294 ("anv: prep work for separate tessellation shaders")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit f3df267735)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
It must not be larger than maxInlineUniformBlockSize.
Fixes VKCTS 1.4.4.0's
dEQP-VK.api.maintenance3_check.support_count_inline_uniform_block*.
Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
(cherry picked from commit fd2fa0fbc9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
Based on RADV.
The Vulkan spec says:
"If bindingCount is zero or if this structure is not included in
the pNext chain, the VkDescriptorBindingFlags for each descriptor
set layout binding is considered to be zero. Otherwise, the
descriptor set layout binding at
VkDescriptorSetLayoutCreateInfo::pBindings[i] uses the flags in
pBindingFlags[i]."
Fixes dEQP-VK.api.maintenance3_check.* in VKCTS 1.4.4.0.
Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
(cherry picked from commit 17e25b4983)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38167>
nir_lower_compute_system_values will attempt to lower
load_workgroup_size unless workgroup_size_variable is set. For precomp
shaders, the workgroup size is set statically for each entrypoint by
nir_precompiled_build_variant. Because we call
lower_compute_system_values early, it sets the workgroup size to zero.
Temporarily setting workgroup_size_variable while we are still
processing all the entrypoints together inhibits this.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 20970bcd96 ("panfrost: Add base of OpenCL C infrastructure")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit a410d90fd2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
This can happen with (for example) 32x2 loads with
align_mul=4,align_offset=2.
This patch does bit_size=min(bit_size,bytes) to prevent num_components
from being 0.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 52cd5f7e69 ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit b18421ae3d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
This was missed in the original maintenance9 MR.
Fixes the flakes in test
dEQP-VK.synchronization2.op.single_queue.event.write_ssbo_compute_read_ssbo_compute.buffer_16384_maintenance9
Fixes: 7692d3c0 ("nvk: Advertise VK_KHR_maintenance9")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 28fbc6addb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
With untyped pointer we might get a deref_cast with a 0 ptr_stride. But we
were supposed to ignore the stride information on the pointer anyway, so
let's do that properly now.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Fixes: 05dca16143 ("nak: extract nir_intrinsic_cmat_load lowering into a function")
(cherry picked from commit 3bbf3f7826)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
Those are normally uniform always, but for the purpose of fused
threads handling, we need to check their sources.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ca1533cd03 ("nir/divergence: add a new mode to cover fused threads on Intel HW")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 255d1e883d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
I didn't test "nvk: Fix maxVariableDescriptorCount with iub" as
thoroughly as I should have and it regressed
dEQP-VK.api.maintenance3_check.descriptor_set because we were then
violating the requirement that maxPerSetDescriptors describes a limit
that's guaranteed to be supported (and reported as supported in
GetDescriptorSetLayoutSupport).
That commit was also based on a misreading of nvk_nir_lower_descriptors.c
where I thought that the end offset of an inline uniform block needed to
be less than the size of a UBO. That is not the case - on closer
inspection that code gracefully falls back to placing IUBs in globablmem
if necessary. So, we can afford to be less strict about our IUB sizing
and only require that IUBs follow the existing limit imposed by
maxInlineUniformBlockSize.
Fixes: ff7f785f09 ("nvk: Fix maxVariableDescriptorCount with iub")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 77cd629b34)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
Fix a regression from an unfortunate typo.
Fixes: 48e8d6d207 ("panfrost, panvk: The size of resource tables needs to be a multiple of 4.")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 387f75f43d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
Should only set once outside the multidraw loop so that per draw can
patch its own own desc attribs when needed.
Fixes: a5a0dd3ccc ("panvk: Implement multiDrawIndirect for v10+")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 800c4d3430)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
Mesh shader workgroups always have the same amount of subgroups.
When the API workgroup size is the same as the real workgroup
size, this is a small optimization (using a constant instead of
a shader arg).
When the API workgroup size is smaller than the real workgroup
size (eg. when the number of output vertices or primitves is
greater than the API workgroup size on RDNA 2), this fixes a
potential bug because num_subgroups would return the "real"
workgroup size instead of the API one.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
(cherry picked from commit d20049b430)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
The object buf is referenced at the beginning of the
r600_draw_rectangle() function and should be freed
at the end. This issue was introduced with cbb6e0277f.
Fixes: cbb6e0277f ("r600: stop using util_set_vertex_buffers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit 3b1e3a40a8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
The info was moved to radeon_info, but it was only set for the amdgpu
kernel driver. It was uninitialized for radeon.
Fixes: d82eda72a1 - ac/gpu_info: move HS info into radeon_info
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit f5b648f6d3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
df1876f615 ("nir: Mark negative re-distribution on fadd as imprecise")
fixed the fadd case by marking it as imprecise. This commit fixes the
ffma case for the same reason.
However, "imprecise" isn't necessary and nowadays we have "nsz" which is
more accurate here. Use that for both fadd and ffma.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 62795475e8 ("nir/algebraic: Distribute source modifiers into instructions")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit ad421cdf2e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
The intermediate buffer between the 2 images is linear, its stride
should be a function of the tile's logical width.
Normally this should map to the values reported by ISL except for
TileW where for some reason it was decided to report 128 for TileW
instead of the actual 64 size (see isl_tiling_get_info() ISL_TILING_W
case)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 77fb8fb062)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
After a recent change, `piglit-traces.sh` automatically sets the caching
proxy, so update the docs to reflect this.
Also update the name of the variable from `FDO_HTTP_CACHE_URI` to
`LAVA_HTTP_CACHE_URI`.
Fixes: fa74e939bf ("ci/piglit: automatically use LAVA proxy")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 28e73a6239)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>
This reverts commit a219308867.
It's failing most of the tests on Anv :
$ ./deqp-vk -n dEQP-VK.wsi.xlib.maintenance1.scaling.*
Test run totals:
Passed: 88/2422 (3.6%)
Failed: 576/2422 (23.8%)
Not supported: 1758/2422 (72.6%)
Warnings: 0/2422 (0.0%)
Waived: 0/2422 (0.0%)
The only passing tests seem to be with this pattern :
dEQP-VK.wsi.xlib.maintenance1.scaling.*.same_size_and_aspect
(cherry picked from commit 2baa3b8c06)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38010>