Command buffer private object destroy callbacks receive a 64-bit integer, so
their signature should respect that to avoid alignment issues when passing
pointers. This is the same thing we were already doing for color pipelines,
but now for D/S pipelines too.
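A minimal sketch of the idea, assuming a hypothetical callback type that
receives the private object as a 64-bit integer. All names below are
illustrative and are not the driver's actual API:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callback type: the private object travels as a 64-bit
 * integer, so the parameter is uint64_t even on 32-bit builds. */
typedef void (*example_destroy_cb)(uint64_t obj, void *user_data);

/* Hypothetical private object owned by a command buffer. */
struct example_pipeline {
   int id;
};

/* Widen a pointer to the 64-bit value the callback expects. */
static uint64_t
example_handle_from_ptr(void *ptr)
{
   return (uint64_t)(uintptr_t)ptr;
}

/* The callback matches the 64-bit signature and narrows the value back
 * to a pointer through uintptr_t before using it. */
static void
example_destroy_pipeline(uint64_t obj, void *user_data)
{
   struct example_pipeline *p = (struct example_pipeline *)(uintptr_t)obj;
   int *destroyed_id = user_data;

   *destroyed_id = p->id;
   free(p);
}
```

Declaring the parameter as a pointer instead would make the caller and the
callee disagree on the argument size on 32-bit builds, which is the kind of
mismatch this change avoids.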
Fixes crash on 32-bit build with:
dEQP-VK.synchronization2.op.single_queue.fence.write_clear_attachments_read_copy_image_to_buffer.image_128x128_d16_unorm
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33463>
A transition to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR with a non-WSI image was
seen in a gfxrecon-replay case, which ends up hitting weird assertions
later.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33027>
This makes it possible to switch a batch of shaders to ACO at once. It is
meant for quickly finding the problematic shader when using ACO, instead
of switching shaders one by one when there are too many of them.
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33440>
The proper name for the Meson options file changed to meson.options in
Meson 1.1. Since we don't support older versions of Meson anyway, let's
just rename the options file to the new name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33445>
This adds a proper interface for reporting shader compile failures.
They are propagated to the GLSL linker.
Reporting errors from finalize_nir will be deprecated.
Fixes: dae57e184a
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33341>
Most of those were likely fixed by the unconditional nir_opt_varyings,
since we are less likely to run out of input/output slots.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33341>
Now that we have a more powerful host, we started getting new flakes.
Let's document them!
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446>
3-plane YUV 444 and 16-bit 3-plane YUV are not supported natively by
the HW. Report these formats as unsupported since we may want to switch
to native YUV support in the future.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
This will make it easier to get the feature flags per plane for
multiplane formats.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
We were not correctly following VUID-VkImageCreateInfo-format-01577:
If format is not a multi-planar format, and flags does not
include VK_IMAGE_CREATE_ALIAS_BIT, flags must not contain
VK_IMAGE_CREATE_DISJOINT_BIT.
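The rule quoted above can be sketched as a small predicate. This is only an
illustration: the helper name is hypothetical, and the flag macros are local
stand-ins (with values assumed to mirror the Vulkan headers) so the sketch
builds without them:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for VK_IMAGE_CREATE_DISJOINT_BIT and
 * VK_IMAGE_CREATE_ALIAS_BIT; the values are assumptions. */
#define EXAMPLE_IMAGE_CREATE_DISJOINT_BIT 0x00000200u
#define EXAMPLE_IMAGE_CREATE_ALIAS_BIT    0x00000400u

/* Hypothetical helper: returns true when the create flags respect
 * VUID-VkImageCreateInfo-format-01577. */
static bool
example_disjoint_flag_is_valid(bool format_is_multiplanar, uint32_t flags)
{
   /* Multi-planar formats may always be disjoint. */
   if (format_is_multiplanar)
      return true;

   /* Single-plane formats may carry the bit only together with ALIAS. */
   if (flags & EXAMPLE_IMAGE_CREATE_ALIAS_BIT)
      return true;

   return !(flags & EXAMPLE_IMAGE_CREATE_DISJOINT_BIT);
}
```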
Fixes: 412c2863 ("panvk: Enable multiplane images and image views")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
Multiple sampler planes (one for luma, one for chroma) are needed to
support CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT.
Multiple texture descriptors (one per plane) are needed for the
downsampling in nir_vk_lower_ycbcr_tex() to work in panvk.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
This will help set things up for multiplane samplers and textures.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
This will help set things up for multiplane samplers.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
Place the view plane at index 0 for single-plane views of multiplane
formats. Does not apply to YCbCr views of multiplane images since
view->vk.aspects for those will contain the full set of plane aspects.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
Since the binding value can be any 32-bit number, we cannot assume that
it fits in 27 bits. We need 64-bit keys to accommodate a 32-bit binding.
This will also provide more bits to store the subdesc id, which will be
needed for multiplane texture and sampler descriptors.
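As a rough sketch of what a 64-bit key buys us, assume a hypothetical layout
with the binding in the low 32 bits and the subdesc id in the bits above it.
The actual layout used by the driver may differ:

```c
#include <stdint.h>

/* Hypothetical key layout: bits 31:0 hold the binding, bits 63:32 hold
 * the subdesc id. */
static inline uint64_t
example_desc_key(uint32_t binding, uint32_t subdesc)
{
   return ((uint64_t)subdesc << 32) | binding;
}

/* Recover the binding from a key. */
static inline uint32_t
example_key_binding(uint64_t key)
{
   return (uint32_t)key;
}

/* Recover the subdesc id from a key. */
static inline uint32_t
example_key_subdesc(uint64_t key)
{
   return (uint32_t)(key >> 32);
}
```

With this layout any 32-bit binding round-trips unchanged, and two
subdescriptors of the same binding still get distinct keys.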
Fixes: 7bea6f86 ("panvk: Overhaul the Bifrost descriptor set implementation")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
This function updates the data of a u64 hash_table entry and is safe to
use inside a hash_table_u64_foreach() loop.
Fixes: 7bea6f86 ("panvk: Overhaul the Bifrost descriptor set implementation")
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
Since copies happen one plane at a time, we can handle multiplanar copies
like color copies. The user gets to decide the format to use for each
plane, but the pipeline type and the optimal tile size apply to the
whole image.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
The logical send lowering code sets these, and is the code which
-should- set these.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
FS_OPCODE_INTERPOLATE_AT_{SAMPLE,SHARED_OFFSET} never have an mlen set.
They are lowered to SHADER_OPCODE_SEND in logical send lowering, at
which point they acquire an mlen, but cease to be those opcodes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
We teach lower_logical_sends to lower these to SHADER_OPCODE_SEND
and drop all the corresponding generator and eu_emit code.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
We're about to start lowering these in the IR, at which point the
scheduler will see SEND instructions with fence messages. Previously,
we handled those in the generator, and didn't handle the virtual opcodes
here, letting them fall through to the default case of 14 cycles.
These new numbers are completely fabricated, matching the times we have
for atomic operations. This is basically what we did for LSC atomics.
While it may not be accurate, it's at least better than 14 cycles.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
Memory fences do not refer to an element of a binding table. Rather,
the reason we had "BTI" in these opcodes was to distinguish what in
modern terms are called UGM (untyped memory data cache) vs. SLM
(cross-thread shared local memory) fences.
Icelake and older platforms used the "data cache" SFID for both
purposes, distinguishing them by having a special binding table
index, 254, meaning "this is actually SLM access". This is where
the notion that fences had BTIs came in. (In fact, prior to Icelake,
separate SLM fences were not a thing, so BTI wasn't used there either.)
To avoid confusion about BTI being involved, we choose a simpler lie: we
have Icelake SLM fences target GFX12_SFID_SLM (like modern platforms
would), even though it didn't really exist back then. Later lowering
code sets it back to the correct Data Cache SFID with magic SLM binding
table index. This eliminates BTI everywhere and an unnecessary source.
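The late fix-up described above could look roughly like the following
sketch. The enum, struct, and function names are hypothetical, and the
platform check is reduced to a single boolean; only the magic index 254
comes from the text above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the shared function IDs involved. */
enum example_sfid {
   EXAMPLE_SFID_DATA_CACHE,
   EXAMPLE_SFID_UGM,
   EXAMPLE_SFID_SLM,
};

/* Binding table index meaning "this is actually SLM access". */
#define EXAMPLE_SLM_BTI 254u

struct example_fence_target {
   enum example_sfid sfid;
   bool has_bti;
   uint32_t bti;
};

/* The IR always names SLM directly; platforms without a separate SLM
 * SFID get the data cache SFID plus the magic BTI at lowering time. */
static struct example_fence_target
example_lower_fence(enum example_sfid ir_sfid, bool has_separate_slm_sfid)
{
   struct example_fence_target t = { ir_sfid, false, 0 };

   if (ir_sfid == EXAMPLE_SFID_SLM && !has_separate_slm_sfid) {
      t.sfid = EXAMPLE_SFID_DATA_CACHE;
      t.has_bti = true;
      t.bti = EXAMPLE_SLM_BTI;
   }
   return t;
}
```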
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
For some reason, we were using UW type for the destination of memory
fences at the generator level, while in the IR we selected UD.
There are some comments in the documentation for the message about it
writing the notification register to the destination, which is 32-bit.
Prior to Xe2, bits 31:16 were Reserved/MBZ. But on Xe2, all 32 bits
are populated with actual data.
I don't know whether this will fix anything in practice, but it seems
like a better plan to use UD. Often we used UW types to avoid having
the destination region of sends span too many registers, but we're in
SIMD1 here, so it shouldn't matter.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
brw_memory_fence() overrides the instructions generated by the
MEMORY_FENCE or INTERLOCK opcodes to be force_writemask_all with
exec_size == 1. But the IR was emitting it in SIMD8 (regardless
of dispatch width). Instead, just emit the IR as SIMD1/NoMask so
the IR matches what we actually generate. Have size_written indicate
that the entire destination is written, however, as it is ultimately
going to be a SEND that writes a whole register.
We were also using a UD register for the source of
FS_OPCODE_SCHEDULING_FENCE when the generator overrides it to UW,
so just specify UW in the IR as well so that they line up.
Also add validation for MEMORY_FENCE/INTERLOCK that we've done the
exec_size and masking right in the IR.
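A minimal sketch of such a validation check, using a hypothetical
instruction struct rather than the real IR types:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the opcodes and the instruction type. */
enum example_opcode {
   EXAMPLE_OP_MEMORY_FENCE,
   EXAMPLE_OP_INTERLOCK,
   EXAMPLE_OP_OTHER,
};

struct example_inst {
   enum example_opcode opcode;
   unsigned exec_size;
   bool force_writemask_all;
};

/* Fences and interlocks must already be SIMD1/NoMask in the IR, since
 * that is what gets generated for them. Other opcodes are not checked
 * by this sketch. */
static bool
example_validate_fence(const struct example_inst *inst)
{
   switch (inst->opcode) {
   case EXAMPLE_OP_MEMORY_FENCE:
   case EXAMPLE_OP_INTERLOCK:
      return inst->exec_size == 1 && inst->force_writemask_all;
   default:
      return true;
   }
}
```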
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
We can just specify this as a source to the logical FB read/write
opcodes. Notably FB reads had no sources before; now they have one.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
Rather than using a bit in the generic fs_inst data structure, we can
simply set a source on our logical FB write messages. (We already do
so for many other cases.)
In the repclear shader, setting this wasn't actually having an effect,
as we were setting it on a SHADER_OPCODE_SEND message which ignored it.
(We had already correctly set the bit in the message descriptor.)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
This was used for legacy depth passthrough on older hardware. Gfx9+,
which is the only hardware brw supports these days, doesn't actually
have dst depth as part of the message.
It sure looks like we were setting it though...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
We already have logical pixel interpolator messages that get lowered
to send messages. We can just add an extra boolean source to those
opcodes rather than sticking an opcode-specific boolean in the generic
fs_inst data structure.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>
brw_lower_logical_sends can just check for the TEX_LOGICAL_SRC_SHADOW_C
source; we don't need a generic instruction bit for this. We used to
have one because this was handled in the generator for older hardware
before the advent of logical opcode lowering.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>