usually inlining is optimal for cpu drivers since the majority of
time is spent in the shaders, and any amount of reduction to shader code
will be optimal
if, however, the shaders are still really big after inlining, this improvement
will be negated by the insane amount of time spent doing stupid llvm optimizer
passes, so check post-inline size to see whether it exceeds a size threshold
lavapipe release build - 1700% improvement
* spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec4-index-rd-after-barrier
before: 142.15s user 0.42s system 99% cpu 2:23.14 total
after: 8.60s user 0.07s system 99% cpu 8.677 total
fixes#6647
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16977>
it's important to get the full mask in order to accurately provide
dependency info
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17111>
by passing through the draw buffers, more accurate barriers can be generated
to ensure synchronization for both the draw buffer scopes and
descriptor binding scopes
ref #6597
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17111>
this clamps the layer count to the smallest number of layers rather
than the largest
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17061>
(cherry picked from commit f74df6205d)
Conflicts:
src/gallium/drivers/zink/zink_framebuffer.h
having this so far down in the BO allocation path meant that slabs were
getting demoted to device-local and then stored as BAR in the pb cache,
which broke the cache pretty badly
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17093>
(cherry picked from commit 5750050432)
the flags in vk_domain_from_heap() are the MINIMUM required flags for
the heap, but there may be other flags present, so ensure those are picked
up as expected
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17093>
(cherry picked from commit d63e04e583)
flush_region is relative to the map, so passing in the offsets again
breaks the flush (if the flush hook is implemented correctly)
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17095>
(cherry picked from commit 28d23c9e26)
If multisample is enabled and alpha testing happens, the
branch can jump out of the fragment shader before the other
samples are generated. Just don't take the branch optimisation
post alpha test if multisample is enabled.
This should fix some rendering bugs in kicad with multisample
enabled.
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17049>
(cherry picked from commit 6e5d126a65)
The prior behavior can ignore certain failure result, and might also
clean up queues that are never initialized.
Fixes: ddd7533055 ("venus: initial support for queue/fence/semaphore")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16731>
(cherry picked from commit cb8dfa4966)
The failure path was never hit though, and will not either.
Fixes: 65abd1d4ae ("venus: implement vn_buffer_cache_entries_create")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16731>
(cherry picked from commit 01a0bfc3f9)
This was an optimization done a while ago that doesn't seem to be having
much of an impact anymore, and on the other hand, causes all sorts of
breakage with queries, as many of our HW counters don't get incremented
when rasterization is disabled.
This fixes a bunch of issues Zink has with ANV, but more importantly, it
fixes upcoming CTS tests:
dEQP-VK.transform_feedback.primitives_generated_query.*.empty_frag.*
dEQP-VK.transform_feedback.primitives_generated_query.*.no_attachment.*
dEQP-VK.transform_feedback.primitives_generated_query.*.color_write_disable_*
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17038>
(cherry picked from commit 4666ef720e)
Conflicts:
src/gallium/drivers/zink/ci/zink-anv-tgl-fails.txt
CI file was deleted as it doesn't exist on 22.1
These are normally only set once because it's constant across the entire
renderpass, but they're trashed by the 3d store path because it needs to
store to CCU instead of GMEM. Therefore we need to save/restore them. Do
it in a way compatible with #5181.
Fixes: b157a5d ("tu: Implement non-aligned multisample GMEM STORE_OP_STORE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17058>
(cherry picked from commit cba6da2b21)
Even though image views for attachments must use the identity swizzle,
there are cases where we have to add in our own swizzle, in particular
for D24S8 when the view is depth-only/stencil-only. Therefore we have to
reset it to the identity, similar to what we do with input attachments.
Fixes: b157a5d ("tu: Implement non-aligned multisample GMEM STORE_OP_STORE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17058>
(cherry picked from commit 705c0d0373)
Conflicts:
src/freedreno/ci/freedreno-a630-fails.txt
Instead of just checking for the variables to match, check that the
entire deref up to the interface type matches.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
(cherry picked from commit 8492e78f9d)
Instead of having a bunch of mode checks as special cases, assert that
the modes equal and then switch on the mode. This should make the
special cases a bit easier to understand. Handling of `a_var == b_var`
looks redundant now but it won't be in the next patch.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
(cherry picked from commit 0ad2dfe942)
This will let us use it to compare only the first part of a pair of
deref paths and continue the comparison later.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
(cherry picked from commit 130d9d80db)
Instead of incrementing pointers, use an integer index. This makes it
clear that we always increment them together. It'll also make the next
change a bit easier. We use a pointer to an integer because the next
patch is going to let us abort the walk and we want to be able to
continue where we left off.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
(cherry picked from commit 7ebcdada00)
Whether it's coherent should be irrelevant and the ACCESS_RESTRICT check
above should consider all cases aliasing unless NIR makes it clear they're
not.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
(cherry picked from commit cb5c1bcb7c)
this was correct for 64bit loads and manually converted 32bit loads (e.g., bindless),
but it was broken for the case where 64bit was not supported, as the offset wasn't
being correctly adjusted
break out the offset division to hopefully make this a little clearer
Fixes: 150d6ee97e ("zink: move all 64-32bit shader load rewriting to nir pass")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
(cherry picked from commit bbe5136658)
Conflicts:
src/gallium/drivers/zink/ci/zink-tu-a630-fails.txt
CI file removed as it doesn't exist in 22.1
This context needs to be locked before usage, and flushed after.
If it's forgotten, radeonsi may crash (eg #6666).
To avoid this kind of error, introduce 2 helpers.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17032>
(cherry picked from commit bda1c081bd)
emit_barrier_impl() was still checking the nir_var_uniform flag to
detect images, which is no longer correct.
Fixes: cfdc7ee066 ("spirv: Use nir_var_mem_image")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16926>
(cherry picked from commit 303175cfec)
This field copy was missing in
vk_get_physical_device_core_1_1_property_ext().
Fixes: 19ff5019b7 ("vulkan: Add helpers for filling exts for core features and properties.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16926>
(cherry picked from commit b78d3ebe72)
this isn't always 32
Fixes: 2d745904ca ("zink: add a gently mangled version of the d3d12 cubemap -> array compiler pass")
fixes:
dEQP-GL45-ES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_*
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17008>
(cherry picked from commit 87c7e75721)
each sampler is 1 driver location, so use the base variable
Fixes: 2d745904ca ("zink: add a gently mangled version of the d3d12 cubemap -> array compiler pass")
fixes:
dEQP-GL45-ES31.functional.shaders.opaque_type_indexing.sampler.const_expression.*.samplercubearray
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17008>
(cherry picked from commit de6af39534)