Inverse_ballot result is undefined if the input is not dynamically uniform.
And sinking out of loops might make the input divergent.
Fixes: 18a0ff137f ("nir: sink/move inverse_ballot like moves")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
(cherry picked from commit 91f8e32a85)
Same reason as for load_ubo.
Fixes: d199d65c3a ("nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
(cherry picked from commit 1ec3cc2aed)
The intention here is to detect ALU hardware instructions, but not
virtual instructions that haven't been explicitly whitelisted.
For some reason we had arbitrarily hardcoded 128 here, but our virtual
opcodes don't start at 128. They start at NUM_BRW_OPCODES. So, use
that instead.
This prevents regressions later when we delete some opcodes, shifting
some virtual opcodes into the 72-128 range.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
(cherry picked from commit ab0b9b6792)
- More robust.
- Handles properly UBO cases, needed for proper OpenCL support (rusticl).
- Resolved KHR-GL46.gpu_shader_fp64.fp64.max_uniform_components failure.
Fixes: f5ce806ed7 ("freedreno/ir3: Add wide load/store lowering")
Reviewed-by: Rob Clark <robdclark@freedesktop.org>
Co-authored-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30961>
(cherry picked from commit 78a121b8cf)
Looks like some leftovers from a debugging session.
Fixes: 97f6a62f7e ("pan/kmod: Add a backend for panthor")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30969>
(cherry picked from commit 2149a04de3)
Currently vdpau driver and dri driver are two separate libraries, when
radeonsi is enabled both libraries contain amdgpu winsys. radeonsi
needs shared winsys to ensure sync between OpenGL and VDPAU works.
Build vdpau driver into libgallium to avoid having two instances of amdgpu
winsys.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31079>
(cherry picked from commit b6faf586e6)
In c1a7d520f3, we disabled AUX usage for imported images when they are
using an explicit modifier that doesn't support it.
We need to do the same when the modifier is picked by the driver,
otherwise the memory requirements reported for an exported image don't
match those we report for import.
Fixes: c1a7d520f3 ("anv: Disable aux if the explicit modifier lacks it")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31051>
(cherry picked from commit e4ee0a2ce1)
Prevent the accidental passing of 0 components or bits, as it makes no sense.
Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Suggested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31103>
(cherry picked from commit 6bf7b5bcd8)
This API should allow encoding these back to back into the same
buffer, so handle it properly.
Cc: mesa-stable
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31086>
(cherry picked from commit 7531f6fd9c)
[Eric: drop anv changes, as 3ec8f7f995 "anv/video: initial
support for h264 encoding" is not in 24.2]
Although we don't want to rely on hwconfig for devinfo->verx10 == 120,
due to the dependence on closed source software, we do check to see if
hwconfig reports different values in the DEVINFO_HWCONFIG macro.
Matt was seeing this warning on 8086:a7a0:
> MESA: warning: INTEL_HWCONFIG_TOTAL_PS_THREADS (128) != devinfo->max_threads_per_psd (64)
Reported-by: Matt Turner <mattst88@gmail.com>
Fixes: 3e4f73b3a0 ("intel/dev: Update hwconfig => max_threads_per_psd for Xe2")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31077>
(cherry picked from commit c5c349a690)
If there's difference between scan_inst dest type and inst src type we
should be more careful, because difference in signedness can cause
incorrect results after the propagation.
Updated ror-default.trace hash, as the change fixes misrendering there.
Fixes: b23432c5 ("intel/fs: Fix a cmod prop bug when the source type of a mov doesn't match the dest type of scan_inst")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30998>
(cherry picked from commit fa51595c7f)
As sampler view can be used multiple times, do not attempt to rebind if
it was already bound.
This fixes a crash when replaying half-life-2-v2.trace.
Backport-to: 24.2
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31049>
(cherry picked from commit 8338e2082e)
This reverts commit 0b85476d86.
When mapping a BO in v3d, the map keeps forever until freeing the BO. If
later the map is required again, we reuse the map instead of doing the
map from scratch.
This saves calling map/unmap continuously, as well as a mechanism to
keep control of the map usage, like a reference count.
Thus, when reallocating a BO, if it is mapped it just means the map was
used in the past, but not necessarily it is in use right now.
The reverted commit was causing performance regressions in multiple
applications, reducing from 60fps to 5fps.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11783
Backport-to: 24.2
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31049>
(cherry picked from commit c84be162a1)
This was missing and causing dynamic state to remain dirty once set.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 1f57aae4e4 ("panvk: Move vkCmdDraw* functions to their own file")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31089>
(cherry picked from commit e47e94f9f2)
The following things needed to be fixed:
- lp_build_tiled_sample_offset has to use the block size of the
underlying 3D texture. Remaining calculations use the dimensions of
the view since the z-coordinate is undefined and has to be ignored.
- The layer offset must be calculated using llvmpipe_get_texel_offset.
Fixes: d747c4a ("lavapipe: Implement sparse buffers and images")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31038>
(cherry picked from commit 1ad1b356fc)
This will be useful for lavapipe since the Z-coordinate has to be
ignored for 2D view accesses.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31038>
(cherry picked from commit aa072b137b)
For example, if the application wants to create linked shaders with
VS/TCS/TES but the next stage of TES isn't present, the driver would
have to compile shader variants (ie. nextStage could be NONE,TES,GS).
This isn't supported by RADV without re-compiling all shaders twice,
and it's also likely less optimal than compiling unlinked shaders.
This fixes recent CTS
dEQP-VK.shader_object.link.linked_linked_linked_unlinked_unlinked.*
Fixes: 37d7c2172b ("radv: add support for creating/destroying shader objects")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31062>
(cherry picked from commit 868272c2e9)
There are two bugs:
- VK_KHR_maintenance5 added VkBufferUsageFlags2CreateInfoKHR, so
checking for pCreateInfo->usage is incomplete
- this was also missing the usage flag for descriptor buffer with samplers
This fixes recent VKCTS coverage in
dEQP-VK.binding_model.descriptor_buffer.*.
Fixes: 059391b631 ("radv: use 32bit va range for sparse descriptor buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31054>
(cherry picked from commit eab5b453cd)
v3d_resource_from_handle when importing a DRM_FORMAT_MOD_INVALID
considered that if we had a render-only device the resource layout was
linear and if we didn't have render-only the resource layout was tiled.
This change honors the resource creation with the SCANOUT flag
independently of the availability of the render-only for the
DRM_FORMAT_MOD_INVALID modifier.
It also fixes most of the failing piglit text for:
spec@ext_image_dma_buf_import@ext_image_dma_buf_import.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11594
Cc: mesa-stable
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30946>
(cherry picked from commit 5fed6bee19)
Otherwise OOB layers may render to the wrong layer in the depth image.
While we're here, add the same layer count asserts for color images.
Fixes: 9345b95346 ("nvk: Bind 3D depth/stencil images as 2D arrays")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31033>
(cherry picked from commit e533484d06)
This also fixes a bug where we were potentially emitting copy commands
after we'd called nvk_cmd_buffer_push() but before finishing the current
push. Rust would have caught this...
Fixes: bca2f13dd8 ("nvk: enable rendering to DRM_FORMAT_MOD_LINEAR images")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31033>
(cherry picked from commit d7d0287237)
Until now the RD dumps were stored in files on a per-device basis, using
the device index but assuming only one Vulkan instance is active. With
multiple active instances, different devices separated across those
instances could end up storing RD dumps into files with the same name.
tu_instance struct now has an index member variable that's assigned upon
creation with an incrementally-increasing global counter value. RD dump
output name now also contains this instance index, avoiding the described
naming collisions.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: f9c4e25483 ("freedreno: add fd_rd_output facilities for gzip-compressed RD dumps")
Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30977>
(cherry picked from commit 4c359eae01)
In commit fe3d90aedf ("intel/fs/xe2+: Fix calculation of spill message
width for Xe2 regs.") we aligned the width of scratch messages to
physical register sizes (32B prior to Xe2, 64B for Xe2+).
But our spilling offsets are computed using the register allocations
sizes which are in units of 32B. That means on Xe2, you can end up
spilling a virtual register allocated at 32B (which we use for surface
state computations with exec_all) and then the spilling of that
register will be emitted in SIMD16, having the upper 8 lanes
overwriting the next spilled register.
We could potentially limit spills to SIMD8 messages on Xe2 (only
writing 32B of data), but we're also unlikely to have all 32B virtual
register spilled next to one another. And if not tightly packed, we
would have 64B registers stored on 2 different cachelines which sounds
inefficient.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: fe3d90aedf ("intel/fs/xe2+: Fix calculation of spill message width for Xe2 regs.")
Backport-to: 24.2
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30983>
(cherry picked from commit aa494cbacf)