CP is very slow on GFX12 and parsing the packet header is the main
bottleneck. Using paired context regs reduce the number of packet
headers and it should be more optimal.
It doesn't seem worth when only one context reg is emitted (one packet
header and same number of DWORDS) or when consecutive context regs are
emitted (would increase the number of DWORDS).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
This will allow it to be used after ir3_imm_const_to_preamble so that we
don't have to do the propagation of immediates manually there.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34397>
This patch does a couple of things to make CL integration with drivers
as seamless as possible:
- We pull in opencl-c.h and opencl-c-base.h to stop relying on system
headers.
- Parts of libcl.h are moved to new headers that are incomplete CL-safe
variants of libc headers.
- A couple of util headers are changed to remove now unnecessary
__OPENCL_VERSION__ guards and make more headers CL safe.
- Drivers now include src/compiler/libcl and use headers like
macros.h,u_math.h instead of libcl.h.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>
On X.org startup libgallium crashes on x32 hosts inside
blake3_hash_many_sse41(), most likely because of the different pointer
size. Disable SIMD blake implementation if x32 is detected.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34453>
This allows us to create temporary VGRFs that are larger than
MAX_VGRF_SIZE(devinfo), which will be split eventually. They may not
be split on the initial pass, because we may need LOAD_PAYLOAD lowering,
copy propagation, and so on to occur first. So we allow registers to
exceed that size initially.
The "Register allocation relies on split_virtual_grfs()" assertion in
brw_reg_allocate.cpp still asserts that all VGRFs which reach the
register allocator have been properly split.
One case where this is useful is for vectorizing convergent block loads.
We create temporaries to splat the SIMD1 values out to SIMD(N), which
can lead to some very large temporaries. However, copy propagation and
so on ultimately eliminate these and they'll get split down to proper
sizes or elided entirely in the end.
(Note: both this and the prior commits from this merge request are
needed to close the linked issue.)
Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12324
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
Post-RA scheduling doesn't use liveness analysis, so we continue using
MAX_VGRF_SIZE(devinfo). But for pre-RA scheduling, we now use
live->max_vgrf_size.
This helps get us to a place where we can emit arbitrarily large VGRFs
early on in compilation, but which will be split and cleaned up prior to
register allocation. It may also allocate smaller arrays in practice
since MAX_VGRF_SIZE(devinfo) assumes the worst case scenario for things
we actually could need to allocate.
Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
We're already looking at this data to calculate the per-component
vars_from_vgrf[] and vgrf_from_vars[] mappings, so just record the
largest VGRF size while we're here. This will allow passes to size
arrays based on the actual size needed, rather than hardcoding some
fixed size. In many cases, MAX_VGRF_SIZE(devinfo) is larger than
necessary, because e.g. vec5 sparse sampling results aren't used.
Not hardcoding this means we can also temporarily handle very large
VGRFs which we know will be split eventually, without having to
increase the maximum which is ultimately used for RA classes.
Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
Sync with:
commit cf05922d63e2ae6a9b1b52ff5236a44c3b29f78c
Merge: a82866fbecca6 bfef148f3680e
Author: Dave Airlie <airlied@redhat.com>
Merge tag 'drm-intel-gt-next-2025-03-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34457>
When nir_opt_varyings doesn't make progress the first time,
it should not be necessary to call it a second time.
No Fossil DB changes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
This is to stop calling nir_shader_gather_info repeatedly for
some stages, and also as a pre-requisite to the work in the next commits.
No Fossil DB changes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
VK_KHR_sampler_ycbcr_conversion is a dependency from the
VK_EXT_image_drm_format_modifier spec. panvk arch < 10 still
doesn't support it, so VK_EXT_image_drm_format_modifier should
not be exposed.
Otherwise, a Vulkan validation error is triggered for users of
VK_EXT_image_drm_format_modifier and it may cause applications
to fail to create a device.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34458>
Not used anymore, just use the existing ADL definitions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>