Tessellation evaluation shaders have a single convergent URB handle
(for the common patch data) used by all lanes. Every other stage has
separate IO handles in each lane.
Thanks to Alyssa Rosenzweig for catching this bug.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280>
We can avoid the stalls from subc switches by avoiding using the copy engine
during vkCmdBeginConditionalRenderingEXT. Implement this by loading the
cond render value using the MME, since the hardware doesn't have a
suitable 32-bit comparison itself.
This brings the Sascha Willems conditionalrender demo
from 1661 to 8334 fps on my Blackwell system with all meshes disabled.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40277>
If the gallium context does not support `native_fence_fd`, we can still
support sync fd export/import by exporting -1 as sync_fd in vulkan.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211>
If dmabuf export is supported we can now mark them as compatible handle
types. Additionally we can always store the backed_fd for export.
v2 (zzyiwei): hide opaque fd compat with dmabuf export behind udmabuf
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211>
Zink assumes both import and export are available when the dmabuf
extension is advertised, so lavapipe has to hide the extension from zink
unless it supports both.
Together with the prior commit, zink-on-lvp in the CI env without
udmabuf no longer tests against fake dmabuf support.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211>
Lavapipe relies on true udmabuf support for dmabuf export allocation.
This change aligns the behavior with both llvmpipe_allocate_memory_fd
and llvmpipe_import_memory_fd.
Fixes: 7d0a631f20 ("llvmpipe: export dmabuf caps for kms_swrast")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40211>
More fallout from f2a59fdea6.
is_not_zero now always returns whether the result is a floating point zero.
When combined with the fp denorm handling that will be added to
floating point range analysis, this is false for many sensible integer values.
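A minimal sketch of why the float view is wrong for integers, assuming
denorms are treated as zero by the range analysis. The helper names
below are hypothetical and are not NIR functions; they only show that
small integer bit patterns are denormal when read as a 32-bit float.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as an IEEE-754 float. */
static float bits_as_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Hypothetical: "not zero" as seen by a float analysis that treats
 * denormals as zero (an assumption made for this illustration). */
static bool float_view_is_not_zero(uint32_t bits)
{
   int cls = fpclassify(bits_as_float(bits));
   return cls != FP_ZERO && cls != FP_SUBNORMAL;
}

/* Hypothetical: "not zero" as an integer consumer expects it. */
static bool int_view_is_not_zero(uint32_t bits)
{
   return bits != 0;
}
```

Every integer from 1 to 0x007fffff has a zero exponent field, so the
two views disagree for all of them.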
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>
This is not NaN-correct.
Also make the pattern 32-bit only, because the constant is hard-coded to
FLT_MAX.
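A rough illustration of both problems, assuming the replacement is a
compare-and-select minimum against FLT_MAX (the exact pattern is not
quoted here, so that shape is an assumption). The helper names are
hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Assumed shape of the replacement: min(x, FLT_MAX) written as a
 * compare and select. A NaN input fails the comparison, so the NaN
 * is silently replaced by FLT_MAX. */
static float min_with_flt_max(float x)
{
   return x < FLT_MAX ? x : FLT_MAX;
}

/* The same constant applied to a 64-bit value: FLT_MAX is far below
 * DBL_MAX, so finite doubles get clamped as if they were infinite. */
static double min64_with_flt_max(double x)
{
   return x < (double)FLT_MAX ? x : (double)FLT_MAX;
}
```

With these helpers, min_with_flt_max(NAN) yields FLT_MAX rather than
NaN, and min64_with_flt_max(1e300) yields FLT_MAX even though 1e300 is
a finite double.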
Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>
There was duplicated code to set unscaled_input_fragcoord and a read
from VK_ATTACHMENT_UNUSED attachment, which incorrectly updated
builder->unscaled_input_fragcoord.
ubsan:
tu_pipeline.cc:4734:44: runtime error: load of value 127, which is not a valid value for type 'bool'
Seen in:
dEQP-VK.renderpasses.renderpass1.custom_resolve.monolithic.stencil_only_s8
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264>
D/S tests are disabled if the subpass doesn't declare D/S as being used.
When resolving D/S via a draw call, test/write has to be enabled.
Fixes D/S tests from:
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.*
Fixes: 5a3b0ce461 ("tu: avoid incorrect pipeline draw state for disabled depth/stencil attachments")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264>
For dynamic renderpasses we create a fake second subpass,
which is used by CmdBeginCustomResolveEXT. However,
CmdBeginCustomResolveEXT doesn't trigger tile stores, and
attachments didn't know they should be stored after the fake
custom resolve subpass.
Fixes: 520e3f3a47 ("tu: Implement VK_EXT_custom_resolve")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40264>
The way it is, this optimization is too aggressive and may generate
code way worse than the original. Remove it from here, so drivers
consuming the generated SPIR-V will be able to make their own
more-informed decisions later.
Let's follow the same strategy of nir_load_liblc.c and just set the
limit to 0.
For indirect copies in Anv (not merged yet), block compressed formats
require some expensive divisions, so I put them all inside 'if'
statements that should never run on normal formats. This optimization
made us always run all the divisions all the time, tanking the
performance of the shader on small copies.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40020>
There are now no regressions in CTS (including no VVL errors) with
optimal keys on Turnip, so enable it by default.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40049>
Split texkill_cond into texkill_unary (single source) and texkill_binary
(two sources) variants. Update the compiler to use ISA_OPC_TEXKILL_UNARY for
discard emission since it only uses a single source operand.
Fixes: 081efcd68d ("etnaviv: isa: Split texkill into concrete bitset variants")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40262>
Use INT_MIN instead of INT_MAX for underflow.
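As a generic illustration (not the NIR opcode definitions themselves),
a saturating signed add/sub clamps to INT_MAX on overflow and to
INT_MIN on underflow. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Saturating signed add: the range checks avoid signed overflow UB. */
static int sat_add(int a, int b)
{
   if (b > 0 && a > INT_MAX - b)
      return INT_MAX; /* overflow */
   if (b < 0 && a < INT_MIN - b)
      return INT_MIN; /* underflow: clamp to INT_MIN, not INT_MAX */
   return a + b;
}

/* Saturating signed subtract, same idea with the signs swapped. */
static int sat_sub(int a, int b)
{
   if (b < 0 && a > INT_MAX + b)
      return INT_MAX; /* overflow */
   if (b > 0 && a < INT_MIN + b)
      return INT_MIN; /* underflow: clamp to INT_MIN, not INT_MAX */
   return a - b;
}
```

For example, sat_sub(INT_MIN, 1) and sat_add(INT_MIN, -1) both return
INT_MIN.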
Fixes: cc4b50b023 ("nir/opcodes: use u_overflow to fix incorrect checks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252>
When emitting the bin size, take into account per-view bin merging we
may have done that expands the size of the bin in GMEM by reusing the
right eye data for the left eye.
This fixes resolves getting clipped by the smaller bin size when using
the resolve engine. Before now we weren't using the resolve engine with
FDM, and for now we only do the merging when GMEM is enabled, so it
wasn't an issue.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39868>
With subsampled images we need access to the immutable samplers, even
though they've already been written when creating the descriptor set.
Previously we only kept a pointer to them in the template in the push
descriptor case where we needed to write them together with the image.
Refactor the descriptor template path to be more like the normal path
and always save the immutable samplers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39868>
As of now we always emit resolves during the subpass when they happen,
so we can just use that subpass's viewMask. But that won't work for
subsampled images, where we need to insert metadata and aprons for any
view resolved to after the renderpass is finished. Collect all the
resolve views for use with subsampled images.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39868>