fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-10 03:28:18 +02:00

Author	SHA1	Message	Date
Mario Kleiner	2beb0c8820	wsi/common: Allow VK_EXT_present_timing present without presentStageQueries. Spec allows to request a present at a specific target time or duration without actually storing + querying any present records about completion time. Iow. it allows VkPresentTimingInfoEXT.presentStageQueries == 0. In this case, skip allocation and processing of a timing history record, but still assign a VkPresentTimingInfoEXT.targetTime for timed present. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Fixes: `47d69664d8` ("vulkan/wsi: Add common infrastructure for EXT_present_timing.") Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Mario Kleiner	784a41cb8b	wsi/common: Small compliance fixes for VK_EXT_present_timing. - Queueing a present with VkPresentTimingsInfoEXT in the .pNext chain of VkPresentInfoKHR, but VkPresentTimingsInfoEXT.pTimingInfos == NULL is allowed and must not crash, just no-op. - VkPresentTimingInfoEXT.targetTime == 0 means to ignore targetTime and to simply present as soon as possible. This is achieved by setting info->targetTime == 0 ==> target_time = 0. Make sure target_time stays also 0 if targetTimeDomainPresentStage is set to VK_PRESENT_STAGE_QUEUE_OPERATIONS_END_BIT_EXT, ie. skip the device->cpu conversion via wsi_swapchain_present_convert_device_to_cpu(), as that might map a zero info->targetTime device time to a non zero cpu target_time. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Fixes: `47d69664d8` ("vulkan/wsi: Add common infrastructure for EXT_present_timing.") Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Mario Kleiner	d7b23e9f3a	wsi/display: Deal with vblank-less systems for VK_EXT_present_timing. Some hw + kms driver combos do not support vblank related functions at all, ie. no drmCrtc[Get/Queue]Sequence() ioctl, no crtc sequence events, no vblank of pageflip completion reported in pageflip events. Most notable under the present_timing supported Vulkan drivers is Asahi Linux on Apple Silicon Macs, with no such support: Only pageflip events with a valid flip timestamp are supported. To deal with this, we detect lack of vblank support and instead use the current "vrr timing" path, which doesn't use vblanks, but absolute time and timed waits. This also required a slight restructuring of the setup logic. Also fix semantics of requested relative timed presents via VK_PRESENT_TIMING_INFO_PRESENT_AT_RELATIVE_TIME_BIT_EXT. The spec states that the given target time should be relative to the most recently presented image on a swapchain, and that if no such image was presented yet (during the first present on a swapchain), the relative target present time should be ignored. Take care of this by tracking vblank count and time of the most recent completed swapchain present separately from the most recent known vblank count and time of the connector. Choose the swapchain most recent present vblank data as baseline for relative timed presents, to optimally implement spec semantics, but the connectors vblank data for absolute timed presents to minimize rounding errors and drift when converting between time and vblank cycle counts. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Fixes: `5e2814c8a4` ("wsi/display: Implement present timing on KHR_display.") Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Mario Kleiner	b4a1756fd1	wsi/display: Add workaround for all-zero valued pageflip events. Both local testing and the Mesa CI's VK-CTS VK_GOOGLE_display_timing test cases show some oddity of amdgpu-kms driven AMD gpu's wrt. VK_EXT_present_timing and upcoming VK_GOOGLE_display_timing: The very first present (atomic commit / pageflip) after a full modeset commit which turns the associated output / connector from fully off to on (powering up the display hw) will not send a regular pageflip completion event, but instead send a pageflip completion event after display hw programming is completed, with all-zero vblank sequence count and present timestamp. This would cause invalid timestamps for this very first present reported to clients, and trips up the VK-CTS VK_GOOGLE_display_timing conformance tests, because the first present is signalled as completed before it was even queued. This failure can be observed with AMD gpu's in the CI, but not with Intel or Qualcomm gpu CI, where CTS is successful. Note this quirk doesn't happen for regular modesets on an already running output, ie. one with at least one active hw plane. It does happen for the CTS, as it seems to start from a powered off output. Work around this AMD quirk: 1. Detect a pageflip event with all zero frame count and timestamp. 2. Try to query the count and timestamp of the most recent vblank, as a likely good substitute for the "completed" pageflip, given that pageflip and vblank counts and timestamps must always match for the vblank of actual flip completion. 3. If the query should fail or also report non-sensical values, e.g., completed before queued, fall back to current system time as a ok'ish result. Note that during my local testing on AMD Polaris11 with DCE-11.2 display engine this 3rd case was not ever observed, and 2 did a good job. This is just a fallback for the fallback. 
For reference, after digging through lots of amdgpu DC Linux source code, the relevant decision code for deciding for a regular pageflip event dispatched from the pageflip completion interrupt handler is to be found by searching for the call site of the function prepare_flip_isr(). The fallback code for the special "full modeset to power on the display engine and skip regular pageflip event" is the call site of the function drm_send_event_locked(). Successfully tested on AMD Polaris 11, DCE 11.2 display engine, and also by Mesa CI's VK-CTS VK_GOOGLE_display_timing test cases for direct display mode on AMD gpu's. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Fixes: `5e2814c8a4` ("wsi/display: Implement present timing on KHR_display.") Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Mario Kleiner	f7fb4dd5ff	wsi/display: Improve connector->last_nsec timestamping. This improves reliability of VK_EXT_present_timing on wsi/display and should be backported to Mesa 26.1-rc. When latching connector->last_nsec timestamps from either a drmCrtcGetSequence() query, or from a vblank sequence event timestamp extracted as part of wsi_display_sequence_handler() -> wsi_display_fence_event_handler() call sequence, the vblank timestamps are in nanosecond precision/granularity, whereas latched timestamps from a pageflip completion event in a call to wsi_display_page_flip_handler2() are in increments of 1000 nsecs, based on microsecond precision/granularity timestamps. All timestamp sources are based on the same DRM/KMS timestamps, but the different interfaces/api's expose those in different precision. This could cause a connector timestamp from the sequence path in nanoseconds to be overwritten by a new timestamp from a pageflip completion event that is truncated down to the next lowest microsecond, causing time in connector->last_nsec to go backwards by up to 999 nsecs. A MAX2 operator prevents this. Additionally, this also updates connector->last_nsec from a successful Vulkan client call to vkGetSwapchainCounterEXT(), allowing for a potentially more recent and thereby accurate connector->last_nsec timestamp to be used as baseline for scheduling timed FRR presents via VK_GOOGLE_display_timing or VK_EXT_present_timing. This is an improvement originally made by Keith Packard in his original VK_GOOGLE_display_timing KHR_Display implementation, just forward ported by myself, adding a slightly more descriptive comment in the code. See MR 38472 for reference of Emma's work, based on Keith's work. The code from the original commit was... 
Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Fixes: `5e2814c8a4` ("wsi/display: Implement present timing on KHR_display.") Cc: mesa-stable Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Mario Kleiner	d22e0844ef	wsi/display: Expose VK_FORMAT_B8G8R8A8_UNORM before VK_FORMAT_B8G8R8A8_SRGB Some wsi/display VK-CTS test cases, e.g., for VK_GOOGLE_display_timing, select swapchain imageUsage flags which are incompatible with the color format VK_FORMAT_B8G8R8A8_SRGB that was returned as the first ("default") swapchain image color format by vulkan/wsi/display, but not properly validated for compatibility by the CTS test cases. This ends badly - with a crash due to assert(), also in Mesa's CI pipeline, e.g., ../src/vulkan/wsi/wsi_common_drm.c:710: wsi_configure_native_image: Assertion `!"Failed to find a supported modifier! This should never " "happen because LINEAR should always be available"' failed. Reorder VK_FORMAT_B8G8R8A8_UNORM into the first slot, as this is safe to use, and make VK_FORMAT_B8G8R8A8_SRGB a safe second. This should be fine, as the spec doesn't mandate VK_FORMAT_B8G8R8A8_SRGB or any specific format be first, and vulkan/wsi/wayland regularly exposes other formats on various Wayland compositors. The macOS Khronos MoltenVK Vulkan ICD also uses unorm first ordering, as seem to do common MS-Windows Vulkan ICD's. I assume that apps which really want to specifically test SRGB color formats will explicitly select such a format, so no harm is done by reordering. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>	2026-06-05 10:21:51 +00:00
Zan Dobersek	fbdc5814ad	tu/kgsl: initialize dump bo state in kgsl_bo_init sooner In kgsl_bo_init(), tu_dump_bo_init() should be called for tu_bo after it's initialized and before it's possibly mapped, since the mapping can fail and cause kgsl_bo_finish() to call tu_dump_bo_del() for tu_bo with an improperly initialized dump_bo_list_idx, leading to crashes. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41983>	2026-06-05 09:39:36 +00:00
Pierre-Eric Pelloux-Prayer	bed8008c9d	ac/parse_ib: initialize data variables to 0 Avoids "warning: ‘data1’ may be used uninitialized" messages. Fixes: `2aec2e8dba` ("ac/parse_ib: Add VCN decode queue parsing") Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41966>	2026-06-05 09:16:57 +00:00
Pierre-Eric Pelloux-Prayer	c68e4d229b	radeonsi: use aux context locks in si_destroy_screen Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41966>	2026-06-05 09:16:57 +00:00
Pierre-Eric Pelloux-Prayer	9487c1b0e9	radeonsi: consolidate aux context creation into si_get_aux_context si_create_context checks contexts that need recreation but only destroy them rather than creating them. Creation now belongs to a single function: si_get_aux_context. Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41966>	2026-06-05 09:16:57 +00:00
Pierre-Eric Pelloux-Prayer	3b3181a14d	radeonsi: fix sdma copy for gfx10 The shared sdma code used the "sdma_supports_compression" field from info but radeonsi code still relied on gfx level checks. Fixes: `f5ecc5ffd5` ("ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()") Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41966>	2026-06-05 09:16:57 +00:00
Alexander Slobodeniuk	7e040483ec	radeonsi: fix conformance window emission in the SPS If the vaapi application submits SPS with pic_width_in_luma_samples not aligned to be divisible by 64, the driver overwrites it to an aligned value. But if it does so, then it should also recalculate the conformance window. Example from real life: gstreamer vah265enc built with libva < 1.21 or vaapih265enc transcoding a video of width == 854 gst-launch-1.0 uridecodebin uri=https://media.w3.org/2010/05/sintel/trailer.mp4 ! vaapih265enc ! filesink location=out.h265 The code uploads an SPS with pic_width_in_luma_samples == 864, and the driver overwrites it to 896. The conformance window provided in the SPS was 10 : 864 - 10 = 854. So after encoding the output width results in a wrong value: 896 - 10 = 886 Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41997>	2026-06-05 08:58:56 +00:00
Samuel Pitoiset	e984014d56	turnip: declare common VK drirc options using the helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41843>	2026-06-05 09:14:45 +02:00
Samuel Pitoiset	64e63051dc	anv: declare common VK drirc options using the helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41843>	2026-06-05 09:14:45 +02:00
Samuel Pitoiset	4e436fbd3d	radv: declare common VK drirc options using the helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41843>	2026-06-05 09:14:45 +02:00
Samuel Pitoiset	38ce035860	util/drirc_gen: add a function to declare common VK options Similar to WSI options. It's possible to ignore options that aren't implemented by a driver and to set different default values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41843>	2026-06-05 09:14:45 +02:00
Samuel Pitoiset	9237656171	util: remove declared but unused DRIC_CONF_VK_REQUIRE_ASTC Only used by RADV and ANV and options are auto-generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41843>	2026-06-05 09:14:45 +02:00
Alessandro Astone	e84e9dc582	gallivm: Fix armhf build against LLVM 22 StringMapIterator<bool> became StringMapIterBase<bool, false /* IsConst */>; Use `auto` to handle either case. Reviewed-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40161>	2026-06-05 05:47:39 +00:00
Karol Herbst	a6172f19a0	vtn/opencl: fix edge case behavior for tanpi Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41726>	2026-06-04 21:39:02 +00:00
Karol Herbst	1e9b1075b6	vtn/opencl: fix edge case behavior for sinpi Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41726>	2026-06-04 21:39:02 +00:00
Karol Herbst	8c109381ed	vtn/opencl: fix edge case behavior for cospi Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41726>	2026-06-04 21:39:01 +00:00
Karol Herbst	3531a05fdd	vtn/opencl: convert libclc workaround handling to a switch statement Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41726>	2026-06-04 21:39:00 +00:00
Georg Lehmann	3a815a5969	nir: preserve infinities and signed zero during atan2 For zero inputs, we end up with intermediate infinities from frcp(0.0). The final output is not infinity though, so this has to be well defined even when applications don't request preserving infinities on their own. Also preserve signed zeros to make the sign of the infinities well defined. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15585 Cc: mesa-stable Tested-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41953>	2026-06-04 21:02:11 +00:00
Emma Anholt	c634bf11ce	drm-shim/freedreno: Report VM_BIND support. This lets me replay .rdcs traced with sparse support so I can look at their shaders. We don't have to do anything with the iovas being bound, since we don't execute anything. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37323>	2026-06-04 20:17:38 +00:00
Emma Anholt	053c8025a2	drm-shim/freedreno: report a 48-bit address space. Not all GPUs would support this, but it shouldn't affect our shader compiles, and it does mean that we can support replay of traces with userspace iova allocation buffer addresses on larger GPUs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37323>	2026-06-04 20:17:36 +00:00
Emma Anholt	2b89d3da70	drm-shim/freedreno: Provide a dummy set of UBWC config params. This reduces noise from drm-shim with current tu. These values don't affect shader compiles, so just pick some values from 750. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37323>	2026-06-04 20:17:35 +00:00
Emma Anholt	37ae0e4255	drm-shim: Include the hex of the driver ioctl for unimplemented ioctls. Some headers #define them in hex, so make it easier to look up which one isn't implemented. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37323>	2026-06-04 20:17:34 +00:00
Marek Olšák	1375ba209d	ac: add basic HTILE dword printing Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42004>	2026-06-04 19:55:19 +00:00
Marek Olšák	9f3af96552	ac/surface: print the modifier in ac_surface_print_info Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42004>	2026-06-04 19:55:19 +00:00
Benjamin Cheng	a989ca8c8f	mesa/st: run the lower_opcodes pass for draw shaders Fixes: `5eb0136a3c` ("mesa/st: when creating draw shader variants, use the base nir and skip driver opts") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15304 Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41302>	2026-06-04 19:29:57 +00:00
Benjamin Cheng	a4a862a605	draw: Add lower_opcodes NIR pass Gallivm runs shaders that are originally compiled with another backend's compiler options, which may have optimizations that introduce opcodes that gallivm does not support. Add a pass to lower these. Assisted-by: Claude Opus 4.6 Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41302>	2026-06-04 19:29:57 +00:00
Faith Ekstrand	364b5f806c	compiler/rust/smallvec: Optimize extend() Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42005>	2026-06-04 18:09:19 +00:00
Yiwei Zhang	4e8595da21	venus: let resource_create_blob wait for mem alloc Previously, the mem alloc wait barrier is via a separate renderer submission (e.g. execbuf for virtgpu backend). In fact, we can leverage the cmd payload in resource_create_blob to avoid the extra submission. This would help downstream win32 backend as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42003>	2026-06-04 16:33:02 +00:00
Yiwei Zhang	77b73d8595	venus: update create_from_device_memory to take a cmd payload This is to leverage drm_virtgpu_resource_create_blob::cmd for expressing the blob mem host resource dependency in the virtgpu backend, which can avoid the execbuf. Similar for vtest backend. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42003>	2026-06-04 16:33:02 +00:00
Job Noorman	2b37a0b410	vulkan: use consistent module hashing for pipeline stages Currently, when hashing a pipeline stage, the final hash is different when the module is passed as VkPipelineShaderStageCreateInfo::module (the module's hash is hashed) or as a VkShaderModuleCreateInfo in its pNext chain (the module's code is hashed). This causes unnecessary cache misses. To prevent this, hash the code first in the latter case and add that hash to the stage's hash. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42014>	2026-06-04 16:01:55 +00:00
Job Noorman	0a60a53c81	vulkan: add vk_shader_module_hash helper Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42014>	2026-06-04 16:01:55 +00:00
Hyunjun Ko	bea1212ee7	anv/video: Change size of the cached array of recently decoded AV1 frames. Current size of prev_refs is 8, which just means the size of ref-frames but needs to be aligned with full size of dpb, which is 9. Also prev_refs is now indexed by dpb slot and holds the last intra frame written to that slot. This fixes visible artifacts on AV1 streams that mix super-res and non-super-res frames in a hierarchical reference structure. Closes: mesa/mesa#15503 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>	2026-06-04 15:43:54 +00:00
Hyunjun Ko	11c8930e2b	anv/video: define ANV_VIDEO_AV1_MAX_DPB_SLOTS This is prep work for the following fix. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>	2026-06-04 15:43:54 +00:00
Hyunjun Ko	6875286159	anv/video: Add to check size mismatch during motion field estimation. Due to super resolution size can change so we need to keep coded size and check whether the change happens during motion field estimation. Closes: mesa/mesa#15503 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>	2026-06-04 15:43:54 +00:00
Natalie Vock	1a8953c956	radv: Dump printf buffer after detecting a GPU hang This allows us to use printf debugging when the GPU hangs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41961>	2026-06-04 15:22:07 +00:00
Natalie Vock	c8518581bf	radv/rt: Don't overwrite bvh_base at the start of the traversal loop This may delete existing pointer flags coming from the instance if the traversal loop is exited and then restarted, as is done with ray queries. Fixes geometry being incorrectly culled due to FLIP_FACING flags going missing. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41965>	2026-06-04 14:55:30 +00:00
Karmjit Mahil	10c914693d	freedreno/computerator: Remove VLA giving a build warning ``` ../src/freedreno/computerator/main.cc:327:24: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension] 327 \| uint64_t results[num_perfcntrs]; \| ^~~~~~~~~~~~~ ../src/freedreno/computerator/main.cc:327:24: note: read of non-const variable 'num_perfcntrs' is not allowed in a constant expression ../src/freedreno/computerator/main.cc:206:13: note: declared here 206 \| unsigned num_perfcntrs = 0; \| ^ ``` Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42017>	2026-06-04 14:38:43 +00:00
Jose Maria Casanova Crespo	28e584b687	v3dv: enable lowered shaderFloat16/Int16/Int8 + VK_KHR_shader_float16_int8 V3D 7.1 now exposes shaderFloat16, shaderInt8, shaderInt16 and VK_KHR_shader_float16_int8. Partial native Float16 support is already available. But the rest of sub-32-bit ALU operations are widened to 32-bit by nir_lower_bit_size in v3d_lower_nir(); conversion and pack operations are kept at their native bit width so the QPU's 16-bit pack/unpack paths on mul/mov can be used. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo	4c5b0fa7f4	v3d: emit packed-f16 ALU ops natively on V3D 7.1 Keep f16 fadd/fsub/fmul/fmin/fmax/fneg/fabs at 16-bit through nir_lower_bit_size on V3D 7.1+ and emit the matching VF* op in nir_to_vir, instead of widening to f32 with f16<->f32 round-trip movs that pack-fold can absorb into hints. The native path saves the absorption overhead in f16-heavy shaders. Only the lower half of each VF* result is consumed; the upper half is computed but unused. New VIR helpers vir_VFADD, vir_VFSUB, vir_VFCMP, vir_VFMIN, vir_VFMUL, vir_VFMOV, vir_VFABS, vir_VFNEG, vir_VFNAB were added. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo	16856adff5	broadcom/qpu: expose V3D 7.1 packed-f16 instructions Add the V3D 7.1+ 2x16-bit f16 add-pipe ops (VFADD/VFSUB/VFCMP and the sign-manipulation family VFMOV/VFABS/VFNEG/VFNAB), wire VFMAX into v3d71_add_ops, and complete the V3D 7.1 decode/encode for VFMIN/VFMAX/VFMUL. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo	5a575cca8e	v3d: improve liveness analysis for packed partial writes The liveness analysis treated any output-pack write (D.l / D.h) as a partial definition, refusing to mark the variable as defined in the block. That extended live ranges all the way to the top of the program for every f16 temporary, artificially increasing register pressure. D.l/h only modifies the written bits, leaving the unwritten half bits preserved. So a pack write is a full definition whenever no consumer ever observes the unwritten half, or when both halves are written before the variable is used. This scans every instruction into a per-temp read-flag array (TEMP_READ_LO / TEMP_READ_HI, with FULL = LO \| HI) by inspecting each source's input unpack. And recognizes two patterns as full definitions: * Both PACK_L and PACK_H written unconditionally in the same block. * The instruction's pack writes the half that covers every observed read of the variable across the program (the unwritten half is never read). Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo	66ac3b55af	v3d: widen sub-32-bit subgroup arithmetic and vote ops nir_lower_subgroups lowers reduce/scan to a tree of shuffle + ALU chains over the source data type. When the source is sub-32-bit (int8, int16, float16, or vector forms) those new ALU ops escape the bit_size widening done earlier in v3d_lower_nir, leaving the QPU codegen to emit raw min/max/etc. on 32-bit channel registers whose upper bits are unspecified. The result is wrong reductions for signed integer min/max (the upper bits make a signed int8 look like a positive int32), wrong unsigned reductions (high-bit garbage mixes into the result), and wrong f16 reductions. Re-run nir_lower_bit_size after nir_lower_subgroups so the generated sub-32-bit ALU ops are widened with the correct sign/zero extension on inputs and the matching narrow on outputs. Also widen vote_feq/vote_ieq when the source operand is sub-32-bit: the V3D backend emits ALLFEQ/ALLEQ on full 32-bit channels (it does not use yet the f16 vfcmp/vfmin/vfmax HW path), so the comparison input must be 32-bit. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo	54de903ae4	v3dv: lower flrp16 for consistency with flrp32 flrp32 is already lowered; mirror it for flrp16 so V3D's f16 ALU path doesn't see an unsupported flrp@16 leftover after bit_size widening. No measurable test impact on the current f16 sweep, but matches the f32 behaviour and keeps the lowering surface consistent across bit sizes. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:38 +00:00
Jose Maria Casanova Crespo	0a5200d051	v3d: move nir_lower_frexp after nir_lower_bit_size The frexp lowering decomposes frexp into bit manipulation (fabs, ushr, iand, ior) that relies on implicit float-to-int bit reinterpretation. When lowered at 16-bit, the subsequent nir_lower_bit_size pass widens float operations with f2f32 (changing the bit pattern to IEEE fp32) and integer operations with u2u32 (zero-extending 16-bit bits). This breaks the reinterpretation: ushr on the fabs result gets f2f32-widened float bits instead of the original fp16 bit pattern, causing the sign bit to leak into the exponent extraction for negative inputs. Moving nir_lower_frexp into v3d_lower_nir after nir_lower_bit_size. This way frexp decomposition operates at 32-bit where float and integer operations share the same bit width, and the bit manipulation masks use the correct IEEE fp32 constants. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:38 +00:00
Jose Maria Casanova Crespo	cac92fecac	broadcom/qpu: support output pack on itof/utof itof and utof natively support packing the f32 result to f16 (.l/.h), but the encode/decode paths fell through to the default case and rejected any non-NONE pack, breaking nir_op_i2f16 / nir_op_u2f16 codegen with "Failed to pack instruction: itof rfN.l". Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>	2026-06-04 13:29:38 +00:00
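
The width arithmetic in commit 7e040483ec (864 aligned up to 896, visible width 854) can be checked with a small sketch. This is not the radeonsi code: `align_up` and `recalc_right_crop` are hypothetical names, and the crop is counted in luma samples exactly as the commit message does, whereas real SPS conformance window offsets are expressed in chroma units.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: given the width and right crop submitted by the
 * application (both in luma samples), return the right crop that keeps the
 * visible width unchanged once the coded width has been aligned up. */
static uint32_t recalc_right_crop(uint32_t app_width, uint32_t app_crop,
                                  uint32_t alignment)
{
   assert(app_crop <= app_width);
   uint32_t visible = app_width - app_crop;          /* 864 - 10 = 854 */
   uint32_t coded = align_up(app_width, alignment);  /* 864 -> 896 */
   return coded - visible;                           /* 896 - 854 = 42 */
}
```

With the numbers from the commit message, keeping the stale crop of 10 yields 896 - 10 = 886, while the recalculated crop of 42 restores 896 - 42 = 854.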

223789 commits
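
The timestamp regression described in commit f7fb4dd5ff can be illustrated with a minimal sketch. It assumes only what the commit message states: sequence timestamps carry nanosecond precision, pageflip event timestamps carry microsecond precision, and a MAX2 keeps the latched value from moving backwards. The function names are hypothetical and the local `MAX2` macro is a stand-in, not Mesa's own definition.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for a two-argument maximum macro. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical: convert a pageflip event timestamp (seconds plus
 * microseconds) to nanoseconds. The sub-microsecond part is lost. */
static uint64_t pageflip_event_to_nsec(uint64_t sec, uint64_t usec)
{
   assert(usec < 1000000u);
   return sec * 1000000000ull + usec * 1000ull;
}

/* Hypothetical: latch a new timestamp without letting the stored value
 * go backwards. */
static uint64_t latch_last_nsec(uint64_t last_nsec, uint64_t new_nsec)
{
   return MAX2(last_nsec, new_nsec);
}
```

For example, a sequence timestamp of 1000000999 ns followed by a pageflip event for the same vblank reported as 1 s and 0 us would drop the stored value to 1000000000 ns with a plain assignment; with the maximum, the stored value stays at 1000000999 ns.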