Also replace the magic number 0x10 with AluOp::t to make it easier to
understand what is tested.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Assisted-by: Copilot (auto mode)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41945>
This is never accessed, but the spec requires that mesh shaders
can declare it (which implicitly accesses it because of LLVM branching).
fixes dEQP-VK.mesh_shader.ext.misc.payload_not_accessed
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41963>
Compiling with clang produces a -Wunused-const-variable warning in the
weight codec vendored from Arm's Regor compiler:
src/gallium/drivers/ethosu/mlw_codec/source/mlw_decode.cpp:313:15:
warning: unused variable 'INITIAL_BLOCKS' [-Wunused-const-variable]
This warning is emitted only by clang, not by GCC, in the same vendored
mlw_codec sources whose other warnings are already suppressed at the
build-config level. Extend the existing cpp_args with
-Wno-unused-const-variable rather than patching the imported source, so
the files stay pristine for clean re-vendoring.
Fixes: d66d2c05d3 ("ethosu: Switch to the weight encoder from Regor")
Assisted-by: Claude Code (Claude Opus 4.8)
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42070>
R32_SINT/UINT and R32G32_SINT/UINT are sampled as float, so a missing
channel is filled with a float default. A channel-expanding integer blit
then gets a float 1.0 alpha instead of integer 1, which fails 8 cases of
dEQP-GLES3.functional.fbo.blit.conversion such as rg32i_to_rgba8i.
The hardware support for integer texturing is unclear from RE and the
feature databases, so enable it on halti5 GPUs as a conservative
starting point.
Fixes: 64c7cdcae5 ("etnaviv: add missing formats")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41884>
A 64bpp color render target that is fast-cleared and then partially rendered
came back with its high 32-bit word wrong: the low clear word was duplicated
into both halves.
TS was never told the render target is 64bpp, so its fast-clear filled tiles
from a 32-bit clear value.
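A minimal sketch of the symptom, assuming a fast clear simply writes
clear words per pixel. The helper name and the clear value are
hypothetical and do not come from the etnaviv code:

```python
import struct

def fill_pixel(clear_value: int, ts_knows_64bpp: bool) -> bytes:
    """Model the 8 bytes a fast clear writes for one 64bpp pixel."""
    lo = clear_value & 0xFFFFFFFF
    hi = (clear_value >> 32) & 0xFFFFFFFF
    if ts_knows_64bpp:
        # Both clear words are used, so the pixel gets the full value.
        return struct.pack("<II", lo, hi)
    # Only a 32-bit clear value is used: the low word lands in both halves.
    return struct.pack("<II", lo, lo)

# Hypothetical 64-bit clear value, chosen only to have distinct halves.
CLEAR = 0x3C00000000003800
```

With ts_knows_64bpp=False the high half of every cleared pixel repeats
the low clear word, which is the corruption described above.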
Helps with at least the following CTS:
- dEQP-GLES3.functional.fbo.blit.default_framebuffer.rgba16f_nearest_scale_blit_from_default
- dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.7
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41909>
gcc reports two warnings in the weight encoder vendored from Arm's
Regor compiler (Ethos-U Vela):
src/gallium/drivers/ethosu/mlw_codec/source/mlw_encode.cpp:224:9:
warning: variable 'common_val' set but not used
[-Wunused-but-set-variable]
src/gallium/drivers/ethosu/mlw_codec/source/mlw_encode.cpp:426:33:
warning: comparison of integer expressions of different signedness:
'int' and 'long unsigned int' [-Wsign-compare]
These files are imported verbatim from upstream, so rather than patch
the vendored source (which would diverge from upstream and be lost on
the next re-import) silence the two known warnings at build-config
level via cpp_args on the mlw_codec static library. The suppression is
kept narrow so that any other warning introduced by a future re-vendor
still surfaces.
Fixes: d66d2c05d3 ("ethosu: Switch to the weight encoder from Regor")
Assisted-by: Claude Code (Claude Opus 4.8)
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42064>
The generated api_trace.c passes GLhandleARB arguments to _mesa_debug()
with a "%u" conversion. On macOS GLhandleARB is unsigned long, so this
triggers -Werror=format and breaks the build:
api_trace.c:5307:52: error: format specifies type 'unsigned int' but
the argument has type 'GLhandleARB' (aka 'unsigned long')
Cast the value to unsigned int to match the "%u" conversion on all
platforms. GL handles fit in 32 bits, and on Linux GLhandleARB is
already unsigned int, so behavior is unchanged.
Fixes: 9f7f5a27a7 ("mesa/main: Auto-generate MESA_VERBOSE=api trace dispatch")
Assisted-by: Claude Code (Claude Opus 4.8)
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42059>
On later a6xx and a7xx, round-robin does not work properly when there
are more than 8 active waves from the same dispatch in the same uSP. We
have to clamp the register usage to a minimum to guarantee there aren't
more waves. There is a problem for very large workgroups, which will
have to be solved the same way as the problem with deep control flow,
through implementing ReuseGPRMode.
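The clamp can be illustrated with a toy model, assuming the number of
resident waves is just the register file size divided by the per-wave
register footprint. The register file size used here is a made-up
number, and real hardware allocation granularity is not modeled:

```python
def waves_that_fit(regfile_size: int, regs_per_wave: int) -> int:
    """Toy occupancy model: how many waves fit in the register file."""
    return regfile_size // regs_per_wave

def min_regs_per_wave(regfile_size: int, max_waves: int) -> int:
    """Smallest per-wave footprint so that at most max_waves waves fit."""
    # We need regfile_size // regs <= max_waves, which holds exactly when
    # regs * (max_waves + 1) > regfile_size.
    return regfile_size // (max_waves + 1) + 1

# Hypothetical register file of 96 units and the limit of 8 waves.
MIN_REGS = min_regs_per_wave(96, 8)
```

A shader that would use fewer registers than MIN_REGS is treated as if
it used MIN_REGS, so a ninth wave can never become resident.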
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41562>
This will signal shaders that require concurrent workgroup dispatch,
until we get a proper Vulkan extension.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41562>
It seems we weren't actually using the opcode, but be consistent with
the other place we call OpExtInst handlers.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41562>
According to the SPIR-V spec OpExtInst cannot appear before types,
constants, and global variable declarations. We were handling it anyway,
which is wrong.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41562>
This encapsulates the forward progress guarantee required by PRAGMATA,
the so-called "occupancy bounded execution" over workgroups. On
Adreno we need to be aware of this and compile the shader differently.
There isn't yet a Vulkan extension for this, so we will set this via a
hack in coordination with vkd3d-proton.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41562>
Convert the two nightly panfrost trace replay jobs to @anholt's new GPU
trace snapshot comparison tool.
This allows running a few traces on t860 that couldn't be replayed
before.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42018>
With the trace plumbing in place, fill in the wrappers from
gl_and_es_API.xml so the trace tracks new entrypoints.
Generated-by: Claude
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41147>
Reintroduce VERBOSE_API as a bit that will cause a parallel trace
dispatch table to be installed at context create. The table itself is
populated in a follow-up commit. This commit only wires the TLS-publish
indirection via _mesa_set_dispatch(..).
With Trace NULL (the common case), the helper is equivalent to a plain
_mesa_glapi_set_dispatch, so there is no behaviour change yet.
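A rough model of the indirection, assuming the helper publishes the
trace table when one is installed and the requested table otherwise.
The Context class and the published field are stand-ins for the real
context and TLS pointer, not Mesa code:

```python
class Context:
    def __init__(self, trace_table=None):
        # Parallel trace dispatch table; None in the common case.
        self.Trace = trace_table
        # Stand-in for the thread-local dispatch pointer.
        self.published = None

def set_dispatch(ctx, table):
    """Publish the trace table if present, else the requested table."""
    ctx.published = ctx.Trace if ctx.Trace is not None else table
    return ctx.published

real_table = {"glClear": "real"}
trace_table = {"glClear": "trace wrapper"}
```

With Trace left as None, set_dispatch() publishes exactly the table it
was given, which matches the "no behaviour change" statement.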
Assisted-by: Claude
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41147>
This covers some drivers which expose KHR_display and EXT_present_timing.
Based on Emma Anholt's work from 2025, rebased on current Mesa 26.2-devel,
tiny compile fixes and docs/features updates by Mario Kleiner.
See MR 38472 for reference of Emma's work, based on Keith's work.
Tested locally on AMD Polaris for radv, Intel Kabylake for anv, and on
Mesa CI's VK-CTS VK_GOOGLE_display_timing test case for AMD radv,
Intel anv, Qualcomm Adreno tu.
Original code of Emma is
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Update of docs/features.txt + new_features.txt updates is
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
This adds the common plumbing and support for the VK_GOOGLE_display_timing
extension. A followup commit will enable it for KHR_display. It should
also optionally work for suitable other backends like Wayland and X11
on suitable Wayland and X11 servers, if those servers and backends
mostly support VK_EXT_present_timing (minus the relative timing support,
which is not needed for this extension to work). However, fully conformant
use on Wayland or X11 is not possible, as the extension lacks the ability
to report per-VkSurfaceKHR timing capabilities. Therefore the extension
should only be enabled for Wayland or X11 via explicit opt-in, not by
default.
The extension provides two things:
1) Detailed information about when frames are displayed, including
slack time between GPU execution and display frame.
2) Absolute time control over swapchain queue processing. This allows
the application to request frames be displayed at specific
absolute times, using the same timebase as that used in 1).
It is implemented on top of the VK_EXT_present_timing extension
infrastructure.
This code is inspired by Emma Anholt's work from late 2025, which itself
is based on Keith Packard's original work from 2018. Only a few lines of
their code are left though after an almost complete rewrite on top of
EXT_present_timing. Specifically, the calculation of .earliestPresentTime
and .presentMargin in fixed refresh rate (FRR) mode is based on Keith's
original math, and the followup commit for driver enable is a modified
version of Emma's commit.
See MR 38472 for reference of Emma's work, based on Keith's work.
The final implementation as a whole has so far been successfully tested
on an AMD Polaris GPU (radv), an Intel Kabylake GPU (anv), and Mesa CI
for direct display mode on AMD radv, Intel anv, and Qualcomm Adreno turnip.
Both VK_EXT_present_timing and VK_GOOGLE_display_timing can be enabled
at the same time on a VkDevice, but only one of the extensions can be
used on a given swapchain for that device. If both extensions are enabled
on a device and VK_EXT_present_timing is requested on some swapchains, it
will be used on those swapchains, whereas on all other swapchains the
VK_GOOGLE_display_timing will be used.
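The per-swapchain selection described above can be sketched as follows.
The function and parameter names are illustrative only and do not
mirror the WSI code:

```python
def swapchain_timing_mode(ext_enabled, google_enabled, swapchain_requests_ext):
    """Pick the timing extension that is active on one swapchain."""
    if ext_enabled and swapchain_requests_ext:
        # The swapchain asked for VK_EXT_present_timing, so it wins.
        return "VK_EXT_present_timing"
    if google_enabled:
        # Every other swapchain on the device uses the GOOGLE extension.
        return "VK_GOOGLE_display_timing"
    return None
```

Only one extension is ever returned for a given swapchain, even when
both are enabled on the device.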
On drivers which don't support queue timestamps, reported values for
earliestPresentTime are identical to actualPresentTime, and presentMargin
is reported as zero, which is a reasonable fallback behaviour. Currently
drivers with this limitation would be pvr, panvk and kk.
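A sketch of the fallback, with a simplified FRR path next to it for
contrast. The FRR arithmetic here is an assumption made for
illustration (step back whole refresh cycles towards the queue done
time) and is not the exact math of this commit:

```python
from typing import NamedTuple, Optional

class PastPresentTiming(NamedTuple):
    actual_present_time: int
    earliest_present_time: int
    present_margin: int

def past_timing(actual: int, queue_done: Optional[int], refresh: int):
    """All times in nanoseconds; queue_done is None without queue timestamps."""
    if queue_done is None:
        # Fallback from the text: earliest == actual, margin == 0.
        return PastPresentTiming(actual, actual, 0)
    # Assumed FRR model: whole refresh cycles the frame could have been earlier.
    cycles_early = max(actual - queue_done, 0) // refresh
    earliest = actual - cycles_early * refresh
    return PastPresentTiming(actual, earliest, max(earliest - queue_done, 0))
```

For example, past_timing(1000, None, 16) reports the present as
neither early nor late, which is the harmless default an application
sees on drivers without queue timestamps.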
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
This is a tiny simplification, which should not make any practical
difference, except for slightly simpler enablement of
VK_GOOGLE_display_timing in a followup commit, and dropping one
line of code.
The time_domain can always be assigned, even if present_timing
is not enabled (and neither is GOOGLE_display_timing), because
1. The field isn't used if none of these extensions is enabled.
2. The field would default to a valid initial value of zero ==
VK_TIME_DOMAIN_DEVICE_KHR anyway.
3. Even if wp_presentation_feedback is unavailable, and therefore
presentation_clock_id has an "unknown clock" value of -1, the
mapping through clock_id_to_vk_time_domain(-1) would again map
to VK_TIME_DOMAIN_DEVICE_KHR.
In other words, with or without the dropped if statement, the field
gets a nominally valid value, and a value that does not get used can
do no harm.
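Point 3 can be modeled like this. The function body is an assumed
stand-in for clock_id_to_vk_time_domain(), written only to show the
default case. The clock ids are the Linux values and the time domain
values are the Vulkan enum values:

```python
# Linux clock ids.
CLOCK_MONOTONIC = 1
CLOCK_MONOTONIC_RAW = 4

# VkTimeDomainKHR values.
VK_TIME_DOMAIN_DEVICE_KHR = 0
VK_TIME_DOMAIN_CLOCK_MONOTONIC_KHR = 1
VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_KHR = 2

def clock_id_to_vk_time_domain(clock_id: int) -> int:
    """Map a presentation clock id to a time domain, defaulting to DEVICE."""
    known = {
        CLOCK_MONOTONIC: VK_TIME_DOMAIN_CLOCK_MONOTONIC_KHR,
        CLOCK_MONOTONIC_RAW: VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_KHR,
    }
    # Unknown clocks, including the "unknown clock" value -1, map to DEVICE.
    return known.get(clock_id, VK_TIME_DOMAIN_DEVICE_KHR)
```

Since the unknown clock maps to the same value as a zero-initialized
field, the unconditional assignment cannot produce an invalid domain.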
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
wsi_swapchain_present_timing_sample_query_pool() queries the timestamp
queue_done_time, corresponding to VK_EXT_present_timing's present stage
VK_PRESENT_STAGE_QUEUE_OPERATIONS_END_BIT_EXT.
The time domain of returned timestamps depends on the number of
available bits for queue timestamps. The same timestamp is needed
as input to calculate timestamps and headroom for some elements
returned by VK_GOOGLE_display_timing, but that extension always
requires timestamps in the host time domain. To allow use of this
function also for a future implementation of GOOGLE_display_timing
in a followup commit, add a flag that asks to always return queue
timestamps in the host time domain.
The flag is set to false to keep current behaviour for use by
VK_EXT_present_timing.
Also clamp a returned queue_done_time in the host time domain to the
provided upper_bound, which is useful for the upcoming
GOOGLE_display_timing.
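The effect of the new flag and the clamp can be sketched as below. The
signature, the domain strings and the conversion callback are
hypothetical and only model the behaviour described in this message:

```python
def sample_queue_done_time(device_ts, device_to_host, default_domain,
                           force_host_domain, upper_bound):
    """Return (queue_done_time, time_domain) for one sampled timestamp."""
    domain = "host" if force_host_domain else default_domain
    if domain == "host":
        # Host-domain results are clamped to the caller's upper bound.
        return min(device_to_host(device_ts), upper_bound), "host"
    # Device-domain results are returned unchanged, as before.
    return device_ts, "device"

# Made-up device-to-host conversion, used only for the example.
def to_host(ticks):
    return ticks * 10 + 5
```

With force_host_domain=False the result is whatever the default domain
dictates, so existing VK_EXT_present_timing callers see no change.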
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
The spec allows requesting a present at a specific target time or duration
without actually storing + querying any present records about completion
time. In other words, it allows VkPresentTimingInfoEXT.presentStageQueries == 0.
In this case, skip allocation and processing of a timing history record,
but still assign a VkPresentTimingInfoEXT.targetTime for timed present.
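A small model of the fixed control flow, assuming the timing history is
a plain list. The names are illustrative and not taken from the WSI
code:

```python
def queue_timed_present(history, target_time, present_stage_queries):
    """Record history only when stages are queried; always keep the target."""
    if present_stage_queries != 0:
        # A record is needed only when the application wants results back.
        history.append({"stages": present_stage_queries,
                        "target_time": target_time})
    # The target time applies to the present in both cases.
    return target_time
```

A present with presentStageQueries == 0 therefore leaves the history
untouched but is still scheduled for its target time.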
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Fixes: 47d69664d8 ("vulkan/wsi: Add common infrastructure for EXT_present_timing.")
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
- Queueing a present with VkPresentTimingsInfoEXT in the .pNext chain of
VkPresentInfoKHR, but VkPresentTimingsInfoEXT.pTimingInfos == NULL is
allowed and must not crash, just no-op.
- VkPresentTimingInfoEXT.targetTime == 0 means to ignore targetTime and
to simply present as soon as possible. This is achieved by setting
info->targetTime == 0 ==> target_time = 0. Make sure target_time also
stays 0 if targetTimeDomainPresentStage is set to
VK_PRESENT_STAGE_QUEUE_OPERATIONS_END_BIT_EXT, i.e. skip the device->cpu
conversion via wsi_swapchain_present_convert_device_to_cpu(), as that
might map a zero info->targetTime device time to a non-zero cpu
target_time.
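Both cases can be sketched together. The dictionary layout, the flag
value and the conversion callback are assumptions made for
illustration, and a target_time of 0 stands for "present as soon as
possible":

```python
# Illustrative flag value standing in for
# VK_PRESENT_STAGE_QUEUE_OPERATIONS_END_BIT_EXT.
QUEUE_OPERATIONS_END = 0x1

def resolve_target_time(timing_infos, index, device_to_cpu):
    """Return the cpu target_time for one present, or 0 for 'no target'."""
    if timing_infos is None:
        # pTimingInfos == NULL: nothing to do, and nothing to dereference.
        return 0
    info = timing_infos[index]
    if info["targetTime"] == 0:
        # Zero means "ignore targetTime"; never run it through conversion.
        return 0
    if info["targetTimeDomainPresentStage"] == QUEUE_OPERATIONS_END:
        return device_to_cpu(info["targetTime"])
    return info["targetTime"]

# Made-up conversion that would turn a zero device time into non-zero.
def to_cpu(device_time):
    return device_time + 1000
```

The early return on targetTime == 0 is the important part: to_cpu(0)
would yield 1000 here and turn an untimed present into a timed one.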
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Fixes: 47d69664d8 ("vulkan/wsi: Add common infrastructure for EXT_present_timing.")
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
Some hw + kms driver combos do not support vblank related functions
at all, i.e. no drmCrtc[Get/Queue]Sequence() ioctl, no crtc sequence
events, no vblank count of pageflip completion reported in pageflip events.
Most notable among the Vulkan drivers that support present_timing is
Asahi Linux on Apple Silicon Macs, which has no such support: only
pageflip events with a valid flip timestamp are supported.
To deal with this, we detect lack of vblank support and instead
use the current "vrr timing" path, which doesn't use vblanks, but
absolute time and timed waits. This also required a slight restructuring
of the setup logic.
Also fix semantics of requested relative timed presents via
VK_PRESENT_TIMING_INFO_PRESENT_AT_RELATIVE_TIME_BIT_EXT. The
spec states that the given target time should be relative to
the most recently presented image on a swapchain, and that if
no such image was presented yet (during the first present on
a swapchain), the relative target present time should be ignored.
Take care of this by tracking vblank count and time of the most
recent completed swapchain present separately from the most recent
known vblank count and time of the connector. Choose the swapchain's
most recent present vblank data as baseline for relative timed
presents, to optimally implement spec semantics, but the connector's
vblank data for absolute timed presents to minimize rounding errors
and drift when converting between time and vblank cycle counts.
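The baseline choice can be sketched as follows, assuming each baseline
is a (vblank_count, time) pair and that None means the swapchain has
not completed a present yet. The names are hypothetical:

```python
def pick_baseline(relative, swapchain_last_present, connector_last_vblank):
    """Return the (vblank_count, time) baseline, or None to ignore the target."""
    if relative:
        # Relative targets count from the swapchain's own last present.
        # Before the first present this is None and the target is ignored.
        return swapchain_last_present
    # Absolute targets use the connector's most recent vblank data.
    return connector_last_vblank

def relative_target(baseline, relative_time):
    """Absolute target for a relative present, or 0 to present immediately."""
    return 0 if baseline is None else baseline[1] + relative_time
```

On the first present of a swapchain, pick_baseline(True, None, ...)
returns None, so the relative target is dropped as the spec requires.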
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Fixes: 5e2814c8a4 ("wsi/display: Implement present timing on KHR_display.")
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>