It's pretty spammy and since the whole purpose of queries is to report
support, why bother with errors?
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38661>
Otherwise we get a lot of individual x/y/z stores to tesslevels when
we should really just be storing the whole thing at once.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
The newly rewritten remap_tess_levels_legacy will have already lowered
anything it cares about to URB intrinsics. So the generic remapping
pass won't see them, as it operates on generic input/output intrinsics.
This also drops some of the callback boilerplate we needed temporarily.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
This unifies the dynamic (SSO) and fixed (linked together) versions.
We emit piles of NIR as if we were doing the dynamic version, but
replace the tess config field access with constant values. It all
should optimize away back to something reasonable. We lower these
directly to URB read/write intrinsics.
It also rewrites the dynamic version to directly read/write the URB
rather than going through temporaries. The old version was broken
in that tessellation control shader invocations can technically use
the shared output area for cross-invocation data sharing with barriers,
although doing so using the built-in tesslevel patch outputs is very
unlikely.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
We were using this for indirect loads of the shader input thread
payload, but there's no reason we can't use it for constant access
too. In this case we can just MOV from the ATTR file directly
without a special opcode that turns into MOV_INDIRECT later.
We also allow it to load multiple components now. This is useful
for say, returning vec4 pushed inputs. And, we allow it in more
stages than just the fragment stage.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
We're going to change the intrinsic to a load(...) which puts "load" in
the name. Also, it's just more consistent with our usual terminology.
We also rename the corresponding backend opcode so they remain matched.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
These are a ton more convenient. When the TCS and TES were linked
together, the legacy layouts were a hassle, but didn't impose any
significant cost. With unlinked TCS and TES, the legacy layouts
involve significant runtime code for scrambling the data, whereas
the reversed layouts are substantially less overhead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
st/nir lowers this for iris, and brw_link_shaders lowers this for anv,
but for unlinked tessellation control / evaluation shaders, the lowering
was not happening for TCS.
Just do it unconditionally when lowering TCS outputs and TES inputs.
This lets the remapping code just assume vectors all the time, rather
than getting single component stores with nir_intrinsic_component set
(which came from nir_lower_io lowering compact arrays).
This also requires changes to the dynamic unlinked TCS/TES lowering to
temporaries, which needs to use vectors rather than arrays with this
change. That code is going away in future patches anyway, but this
keeps it going for now to avoid interim breakage.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
We now call remap_tess_levels before remap_non_header_patch_urb_offsets.
The latter already excludes tess levels anyway, so the order shouldn't
matter.
This paves the way for remap_tess_levels to skip handling some header
values in certain cases, because with reversed layouts, many of them no
longer need any special handling and we can just let the generic pass
handle them.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
Just have a single remap_tess_levels that does either the
statically-known-primitive or the dynamic (unlinked) mode.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
Our current legacy patch header layout handling doesn't actually care
which is which slot, and remaps everything to its correct spot anyway.
For using the newer "reversed" patch header layouts, it will be more
convenient to have outer as slot 0, and inner as slot 1, as that just
works with no special remapping needed for both quads and triangles
(but unfortunately isolines are still a pain).
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
These work the same regardless of stage.
v2 (Ken): Rebase, move from mesh to all stages, add reorderable load
variant, allow channel masks to be non-constant even on Xe2.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
Be consistent with lowering that happens after, so that it gets a full
vector register and can stride into it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
Noticed renderdoc complaining about our size :
RDOC 692028: [18:08:18] vk_core.cpp(2272) - Warning -
VkPhysicalDeviceDescriptorBufferPropertiesEXT.imageCaptureReplayDescriptorDataSizeis too large at 32
(must be <= 16), can't support capture of VK_EXT_descriptor_buffer
Since we only need 2 pointers (main + private), we can shrink this to
16bytes. The 1/2 planes have a relative offset from the base.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38625>
Allows us to better determine if we need Z/W payload delivery.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36392>
anv sets device->uses_ex_bso on verx10 >= 125 and then sets the
compiler->extended_bindless_surface_offset to that.
iris was not setting anything. However, LSC_ADDR_SURFTYPE_SS used for
scratch on Gfx12.5 is bindless, and Xe2 uses ExBSO for all UGM access,
so we need to be setting this.
Just set it in the compiler so both drivers have it set.
Fixes piglit arb_tessellation_shader-tes-gs-max-output -small -scan 1 50
on iris.
Fixes: 80c89909f3 ("brw: fixup immediate bindless surface handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38645>
If the library_path is just a basename like `libvulkan_lvp.so`, then we
can share the same JSON manifest like `lvp_icd.json` between all of the
architectures, like we already do for Vulkan layers. The library will
be looked up in the dynamic linker's default search path in this case,
and in practice will be found in `${libdir}`. This is how the Mesa's
EGL driver and Vulkan layers work, how Mesa is packaged in Debian 13,
and also how the Nvidia proprietary driver works; it makes installation
simpler for distros, especially on multiarch systems like Debian and
the freedesktop.org SDK.
However, if we want a separate manifest per architecture in order to
be able to write the full path into it, we still need per-architecture
filename disambiguation like `lvp_icd.x86_64.json`.
We presumably still want a separate per architecture on Windows, because
the concept of a single monolithic `${libdir}` is less common there, and
it can also be helpful during development when setting `$VK_DRIVER_FILES`
to force the use of a specific driver installed in a non-default location.
Use the following parameter to passed to vk_icd_gen:
'--icd-lib-path', vulkan_icd_lib_path,
'--icd-filename', icd_file_name,
output : 'virtio_icd.' + vulkan_manifest_suffix,
and the output is passed by '--out', '@OUTPUT@',
so we can detect vulkan_manifest_per_architecture from the --out parameter in script.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13745
Signed-off-by: Simon McVittie <smcv@collabora.com>
Co-authored-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37314>
In 80c89909f3 ("brw: fixup immediate bindless surface handling") I
forgot that we have a special usage for the only _SS surface (the
scratch surface).
Because it's only delivered in the 31:10 bits of R0 and because we
want to minimize the amount of shader instructions for scratch
messages, the surface offset in shifted right by the driver to align
things properly for the 31:6 extended descriptor format.
This is unfortunately incompatible with the full 32bit format of
ExBSO. So this surface type currently cannot be considered bindless.
We might revisit later if we start using _SS surfaces for other
things.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 80c89909f3 ("brw: fixup immediate bindless surface handling")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38618>
Without these fixes, H.265 streams using long-term references would
fail to decode correctly as the decoder wouldn't distinguish between
short-term and long-term reference frames.
Fixes: 896f95a37e ("vulkan/video: fix h265 decoding with LT enabled.")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38571>
Fixes this test on Xe2+:
INTEL_DEBUG=no32 ./deqp-vk -n dEQP-VK.spirv_assembly.instruction.maint9_vectorization.bit_field_u_extract.result_v16i-base_v16i-offset_s64u-count_s16i
Generate invalid code for that platform:
and(16) g37<1>UW g65<16,4,4>UW 0x000fUW { align1 1H I@5 };
ERROR: Invalid register region for source 0. See special restrictions section.
Several helpers like has_subdword_integer_region_restriction() do not
see the final type of the source, so compute it early.
Maybe new_src could be used in more cases. Being conservative for now.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38548>
The libvulkan_intel does not need the decoder when buildtype=release
where the debugging is disabled.
However, the decoder implementation is decided by the dep_expat
which may be turned on by like -Dtools=intel and the binary size
of libvulkan_intel increase unexpectedly.
This change creates the stub dependency and decide the exact
decoder dependency of libvulkan_intel by the buildtype.
Test: meson setup builddir -D build-tests=true -Dbuildtype=release --reconfigure && ninja -C builddir && cd builddir && meson test
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andy Hsu <hwandy@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38569>
There are cases where the freedreno `crashdec` program will not be
built, but will still be used. By making dep_lua a disabler, we move
closer to being able to have those tests automatically disabled when
crashdec isn't built.
Acked-by: Rob Clark <rob.clark@oss.qualcomm.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38579>
This is the last level layer of emission, we want the tracking to be
added above that, so that when flushing of previously accumulated
reasons happens, another pointless reason isn't added.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38542>
This is strictly a GL thing. iris can manage it in its own program keys
without polluting the compiler with stuff nobody else cares about.
We can also drop a lot of padding that was introduced in commit
a18835a9ca which doesn't appear to be
necessary.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38556>
Sadly this probably won't change anything in terms of perf as the CCS
engine has a bunch of other restrictions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 243c01c703 ("anv/iris: implement Wa_18040903259")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38484>
Saves unncessary PC and stall during encode phase.
Thanks to Felix for pointing out that CCS always needs a CS stall once
we add a pipe control, that will kill the performance for BVH
construction.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38513>
Up to 4 reasons can be saved and displayed. Previously, we were
only displaying one reason for Perfetto.
Co-authored-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38500>
--create-unlinked also creates entrypoints for the functions, and
obviates the need to create a dummy entrypoint. This is one step closer
to removing glsl2spirv and aligns us with other users of glslang.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38088>
Our backend compiler explains the limits as :
32 bytes for the patch header (tessellation factors)
480 bytes for per-patch varyings (a varying component is 4 bytes and
gl_MaxTessPatchComponents = 120)
16384 bytes for per-vertex varyings (a varying component is 4 bytes,
gl_MaxPatchVertices = 32 and
gl_MaxTessControlOutputComponents = 128)
In all that's :
* 32 patches * 128 components (counting tessellation factors)
* 32 vertices * 128 components
8192 total components.
I'm not sure why the limit was set so low, maybe leftover from older platforms?
Bump the limit to something like competition.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38523>
Trying to cut down apply_pipeline_layout a bit and also allowing some
reuse for a new extension.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38495>