Patch converts MI_LOAD_REGISTER_MEM, MI_LOAD_REGISTER_IMM to use
mi_builder in CmdBeginTransformFeedbackEXT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31502>
Since the last changes for EGL_EXT_device_drm_render_node, the
can_present_on_device callback now may receive the render device.
With that, the v3dv implementation may return that this device cannot
be used for presentation.
In particular, this callback is used for x11 wsi, and when through
XWayland it does now get the render device. On x11 wsi, this makes the
swapchain operate on blit mode. The blit mode introduces additional
unneeded overhead on wsi and runs through a different path which
currently causes rendering issues (in particular also with Zink).
Allowing both devices to match in the callback returns all wsi to
operate on the native mode and fixes the issues above.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31490>
The descriptors should be deterministic as long as the memory address it's
assigned to is equal. Enable it by just advertising the feature and putting
a dummy capture replay data requirement of 1 (0 is not permitted).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19952>
Move pan_blitter.{c,h} to the gallium driver and rename it
pan_fb_preload to reflect the fact it's not a generic blitter framework.
While at it, get rid of the remaining generic blitting bits and pick
better names for objects related to the preload stuff in
panfrost_{device,screen}.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
This has several advantages over using pan_blitter for that:
- we can catch allocation failures and flag the command buffer invalid
- we can re-use the vk_meta_device object list to keep track of our
preload shaders
- we can re-use surface descriptors instead of re-emitting them every
time a preload is done
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Will be needed if we want to re-use pre-emitted texture payloads in the
FB preload path.
With this in place, we no longer need the src_iview in the resolve info
struct.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Once we've specialized the framebuffer preload logic in panvk, this
will prevent re-emission of texture descriptors in the preload path.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Now that vk_meta can keep track of VkShaderEXT objects, we can keep
our blend shaders in panvk_device::meta and get rid of our custom
hash-table.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Blend and framebuffer preload shaders will be created as internal
shaders and added to the vk_meta object list.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Useful to debug copy-related issues.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
vk_image_view_type_to_sampler_dim() and vk_image_view_type_is_array()
can be useful to driver-specific meta shaders.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Add VK_META_OBJECT_KEY_DRIVER_OFFSET to define an offset for
driver-specific key types.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
PanVK has a few internal shaders that don't fit in the vk_meta
compute/graphics pipeline model. Teaching vk_meta about VkShaderEXT
allows us to keep track of those internal shaders without using yet
another hash table.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
Models C0 and D0 support these opcodes too.
total instructions in shared programs: 10869461 -> 10856992 (-0.11%)
instructions in affected programs: 1467666 -> 1455197 (-0.85%)
helped: 6012
HURT: 1413
Instructions are helped.
total threads in shared programs: 431014 -> 431010 (<.01%)
threads in affected programs: 8 -> 4 (-50.00%)
helped: 0
HURT: 2
total uniforms in shared programs: 5432771 -> 5430909 (-0.03%)
uniforms in affected programs: 183047 -> 181185 (-1.02%)
helped: 976
HURT: 128
Uniforms are helped.
total max-temps in shared programs: 2235272 -> 2234069 (-0.05%)
max-temps in affected programs: 38163 -> 36960 (-3.15%)
helped: 1262
HURT: 168
Max-temps are helped.
total spills in shared programs: 4331 -> 4363 (0.74%)
spills in affected programs: 964 -> 996 (3.32%)
helped: 6
HURT: 47
total fills in shared programs: 6527 -> 6622 (1.46%)
fills in affected programs: 2047 -> 2142 (4.64%)
helped: 6
HURT: 47
total sfu-stalls in shared programs: 15807 -> 15935 (0.81%)
sfu-stalls in affected programs: 787 -> 915 (16.26%)
helped: 71
HURT: 172
Sfu-stalls are HURT.
total inst-and-stalls in shared programs: 10885268 -> 10872927 (-0.11%)
inst-and-stalls in affected programs: 1469423 -> 1457082 (-0.84%)
helped: 5998
HURT: 1417
Inst-and-stalls are helped.
total nops in shared programs: 184280 -> 185612 (0.72%)
nops in affected programs: 10000 -> 11332 (13.32%)
helped: 311
HURT: 1193
Nops are HURT.
The results show a reduction in register pressure, but an increase in
spills, which looks contradictory. This is because for some reason, this
optimization makes the NIR scheduler produce code for some shaders in Godot
that cause additional spilling, but the problem seems to be exclusive to
Godot shaders and not really related to the optimization itself but to
how the NIR scheduler works. Excluding Godot shaders we actually see a
decrease in spills and a slightly larger improvement in instruction
counts:
total instructions in shared programs: 10720106 -> 10707621 (-0.12%)
instructions in affected programs: 1375316 -> 1362831 (-0.91%)
helped: 5948
HURT: 1364
Instructions are helped.
total threads in shared programs: 428248 -> 428244 (<.01%)
threads in affected programs: 8 -> 4 (-50.00%)
helped: 0
HURT: 2
total spills in shared programs: 3729 -> 3712 (-0.46%)
spills in affected programs: 451 -> 434 (-3.77%)
helped: 6
HURT: 0
total fills in shared programs: 4738 -> 4714 (-0.51%)
fills in affected programs: 564 -> 540 (-4.26%)
helped: 6
HURT: 0
Comparing only shaders from Godot:
total instructions in shared programs: 149355 -> 149371 (0.01%)
instructions in affected programs: 92350 -> 92366 (0.02%)
helped: 64
HURT: 49
Inconclusive result (value mean confidence interval includes 0).
total max-temps in shared programs: 16477 -> 16472 (-0.03%)
max-temps in affected programs: 180 -> 175 (-2.78%)
helped: 5
HURT: 0
Max-temps are helped.
total spills in shared programs: 602 -> 651 (8.14%)
spills in affected programs: 513 -> 562 (9.55%)
helped: 0
HURT: 47
total fills in shared programs: 1789 -> 1908 (6.65%)
fills in affected programs: 1483 -> 1602 (8.02%)
helped: 0
HURT: 47
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
These represent fmov variants for max0 and sat6 unpack modifiers. The variants
for max0 also exist in the add alu, but sat6 is exclusive to the mul alu.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
V3D can use these too.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
AV1 has no startcode, so the packed header will not contain any
additional data at the beginning of the buffer that we would need
to skip to get to the OBU header.
Also ignore OBU types that we are not parsing.
Fixes assertions with libva-utils/av1encode.
Fixes: 5edbecb856 ("frontends/va: adding va av1 encoding functions")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31359>
The error message in the linker that checked
gl_MaxCombinedClipAndCullDistances would never be issued because the
compiler was already doing the check. I think the compiler might have
been done this way in the original commit d656736b as the linker
only sets the size when the clip/cull outputs are written so the
piglit test for this wouldn't have been triggered as it does not
write to the outputs.
Here we move the error to the compiler and fix things up so the
correct messages are triggered.
Fixes: d656736bbf ("glsl: Add arb_cull_distance support (v3)")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31471>
When creating a non-compressed view of a compressed image, we need to
divide the extent by the image block size not the view block size.
Fixes: 8ddc527ba4 ("vk/image: Fix the view extent of uncompressed views of compressed images")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31439>
When creating a surface with a non-compressed format that's backed
by a resource with a compressed format, we need to adjust the surface
size accordingly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31439>
Unlike 3D image views, 2D array image views created by meta_copy only
contain the layers needed for rendering, so we shouldn't consider the
base layer an image offset.
Fixes: 07c6459cd8 ("vk/meta: Add copy/fill/update helpers")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31439>
We need to add the asan failures to skip list because otherwise all the
tests run in the same group will be marked as fail too.
Fixes: e120176c58 ("ci: uprev VKCTS to 1.3.9.2")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31477>
This avoids sprinkling those all over the code base. Debug breakpoints
are put in there too.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Backport-to: 24.2
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31481>
This is implemented for us by vk_renderpass.c, so let's enable it and
get rid of the skips caused by a NULL dereference on
->CreateRenderPass2(), even though that's a CTS bug.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31462>
Replace the less-than test by a less-or-equal when checking the
dynamic buffer count.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31462>
As soon as we call RUN_IDVS, we need a valid tiler descriptor. Let's
kill the needs_tiling optimization until we have a proper way of
knowing when draws can be skipped or when IDVS jobs can be replaced
by plain compute.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31462>