Like last_subpass, add a per-view mask of what subpass first uses an
attachment. This is required for optimizing out some barriers later.
Note that this requires us to do another loop over the subpasses.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36985>
Rather than OR-ing all subpass dependencies together in the Vulkan layer,
pass an array of barriers down to the drivers and allow them to do the
OR-ing if needed.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36985>
Transient attachments have been in Vulkan since 1.0, and are a way to
avoid allocating memory for attachments that can be stored entirely in
tile memory. The driver exposes a memory type with LAZILY_ALLOCATED_BIT,
and apps use this type to allocate images with TRANSIENT_ATTACHMENT
usage, which are restricted to color/depth/stencil/input attachment
usage. The driver is supposed to then delay allocating memory until it
knows that one of the images bound to the VkDeviceMemory must have
actual backing memory.
Implement this using the "lazy VMA" mechanism added earlier. We reserve
an iova range for lazy BOs, and only allocate them if we chose sysmem
rendering or there is a LOAD_OP_LOAD/STORE_OP_STORE. Because we never
split render passes and force sysmem instead, we don't have to deal with
the additional complexity of that here and just allocate everything.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
Up until now tu_device_memory (turnip's VkDeviceMemory) was a thin
wrapper around tu_bo (the GEM BO), so when binding an image to a
VkDeviceMemory we could just store the BO. But now we have to skip
allocating the BO unless we need to for lazily-allocated memory, and the
tracking for that needs to happen at the API level instead of the
kernel/GEM level, so store the tu_device_memory instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
Add an extremely limited form of sparse where zeroing memory is not
supported and only one BO can be fully bound to the sparse VMA
immediately when it's created. This can be implemented on drm/msm even
without VM_BIND, by just reserving the iova range. However kgsl doesn't
let us control iova offsets, so we have to use "real" sparse support to
implement it. In effect this lets us reserve an iova range and then
"lazily" allocate the BO. This will be used for transient allocations in
Vulkan when we have to fallback to sysmem.
As part of this we add skeleton sparse VMA support to virtio, which is
just enough for lazy VMAs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
Reserve an iova range separately before allocating a BO. This reduces
the size of the critical section under the VMA lock and paves the way
for lazy BOs, where iova initialization is separated out.
While we're here, shrink the area where the VMA mutex is applied when
importing a dma-buf. AFAICT it's not useful to lock the entire function,
only the VMA lookup and zombie BO handling.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
When enabling
dEQP-VK.renderpass2.dedicated_allocation.attachment_allocation.grow.17,
we see a hang on a618 when a draw is immediately followed by a blit
without anything in between. The draw and clear are writing completely
different surfaces.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
Wayland EGL applications often set SwapInterval 0 and then use surface
frame callbacks independently to pace their updates. These applications
don't actually want the display to tear, and we do not currently use
wayland tearing_control protocol on EGL to enable tearing behaviour.
Using a tearing presentation mode in Vulkan will lead to unexpected
behaviour for these applications.
Use MAILBOX on wayland instead.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34842>
On Valhall, we can end up with a lot of convertions for 8-bit and 16-bit
values.
However, since Valhall, we have access to a lot more swizzles on widen
sources.
The idea of this pass is to propagate replicate swizzle usages to
simplify things.
We do not attempt to propagate MKVEC.v2i16 as it is already handled by
bi_lower_swizzle.
This changes the following:
9 = V2S8_TO_V2S16 !7.b0
11 = IADD.v2s16 !9.h00, u4
88 = MKVEC.v2i8 11.b0, u256.b0, u256
13 = IMUL.v4i8 !88.b0, 8.b0
14 = V2S8_TO_V2S16 !13.b0
15 = IADD.v2s16 14.h00, !11.h00
89 = MKVEC.v2i8 !15.b0, u256.b0, u256
17 = IMUL.v4i8 !89.b0, !8.b0
Into this:
11 = IADD.v2s16 !7.b0, u4
13 = IMUL.v4i8 11.b0, 8.b0
15 = IADD.v2s16 13.b0, !11.h00
17 = IMUL.v4i8 !15.b0, !8.b0
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37167>
We are going to do more things here that will likely benefit from
possibly running multiple time.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37167>
src/intel/perf/xe/intel_perf.c:420:27: warning: implicit conversion from enumeration type 'enum drm_xe_eu_stall_property_id' to different enumeration type 'enum drm_xe_oa_property_id' [-Wenum-conversion]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37172>
In theory this is of no consequence, but letting the scheduler run
it's normal algorithm is better.
Fixes: a6b6ce84f7 ("r600/sfn: Prepare scheduler to handle WaitAck instructions")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37232>
It seems emmitting more than four MEM_RAT instructions in
a row may lead to a GPU hang, so start scheduling free
instructions earlier.
The problem is triggered by
KHR-GL45.compute_shader.shared-max
Fixes: fc40002de7 ("r600/sfn: Simplify scheduling")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37232>
We don't (yet) have the ability to say that more than one value is
acceptable, and the yaml module we use complains about duplicate keys,
so let's just comment them out as it doesn't care about missing keys and
doesn't remove comments.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37215>
The definitive linux kernel counterpart update is now merged.
The linux kernel commit dc5c742f41c0a1c2b14e4357c752851e015f3bcd
(v6.17-rc1) set the new r600 drm version requirement. This change
updates the drm version accordingly to 2.51.0.
Fixes: 9e1180b335 ("r600: implement ARB_indirect_parameters")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37118>
The function resource_copy_region is expected to have a memcpy()
like behavior. Some formats, like r11g11b10_float, do not preserve
the original raw values. This was the problem. This change
updates r600_resource_copy_region() to use the generic copy path
for those formats.
This change takes into account how 8235d3aa19 "radeonsi: preserve
NaNs in draw-based resource_copy_region" proceeds.
This change fixes the remaining khr-gl and deqp-gles31 copy_image tests. This
change was tested on palm, barts and cayman. Here are the tests fixed:
khr-gl4[2-6]/texture_view/view_classes: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_2d_array_src_format_rgb9_e5_dst_target_texture_2d_array_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_2d_array_src_format_rgb9_e5_dst_target_texture_3d_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_2d_array_src_format_rgb9_e5_dst_target_texture_rectangle_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_3d_src_format_rgb9_e5_dst_target_texture_2d_array_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_3d_src_format_rgb9_e5_dst_target_texture_3d_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_3d_src_format_rgb9_e5_dst_target_texture_rectangle_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_rectangle_src_format_rgb9_e5_dst_target_texture_2d_array_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_rectangle_src_format_rgb9_e5_dst_target_texture_3d_dst_format_r11f_g11f_b10f: fail pass
khr-gl4[3-6]/copy_image/functional_src_target_texture_rectangle_src_format_rgb9_e5_dst_target_texture_rectangle_dst_format_r11f_g11f_b10f: fail pass
deqp-gles31/functional/copy_image/non_compressed/viewclass_16_bits/.*: fail pass
deqp-gles31/functional/copy_image/non_compressed/viewclass_32_bits/.*: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37117>
If we never get rate control info, we would treat it as rate control
disabled and use QP value from slice info which is always 0 for default
rate control.
Cc: mesa-stable
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37156>
We need to store the driver location when we add it to the state
param list. Currently this code only works because
st_nir_assign_uniform_locations() later adds duplicate params but
this will be fixed in a following patch.
Unfortunatly we could not simply move nir_lower_clip to the state
tracker like we did in the reset of this MR as it is also called
directly from some drivers. Also to avoid making nir depend on the
gl_parameter defintions we simply loop over the results of the lowering
and fix up the locations.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This previously worked because the driver locations would later
be set when st_nir_assign_uniform_locations() was called for a
second time but we will be skipping the extra call in a later patch.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This is gl specific and a following fix will add more gl specific
params so here we move it to the st to avoid filling nir.h with
more junk.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This previously worked because the driver locations would later
be set when st_nir_assign_uniform_locations() was called for a
second time but we will be skipping the extra call in a later patch.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This is gl specific and a following fix will add more gl specific
params so here we move it to the st to avoid filling nir.h with
more junk.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
We need to store the driver location when we add it to the state
param list. Currently this code only works because
st_nir_assign_uniform_locations() later adds duplicate params but
this will be fixed in a following patch.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This previously worked because the driver locations would later
be overwritten when st_nir_assign_uniform_locations() was called
for a second time but we will be skipping the extra call in a
later patch.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>
This will allow us to fix bugs with driver_location in following
patches while keeping code to a minimum.
For now we always set uniform packing to false, this will be
corrected as we go in following patches for now it doesn't matter
as the driver location is later overwritten.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037>