fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-09 23:08:18 +02:00

Author	SHA1	Message	Date
Marek Olšák	c42e4a2fba	ac/nir/lower_ps_early: remove obsolete comment Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41768>	2026-06-02 20:38:04 +00:00
Marek Olšák	ea5352b7d7	radv: ignore color attachment samples for ps_iter_samples Sample shading is only affected by the number of rasterization samples. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41768>	2026-06-02 20:38:04 +00:00
Gleb Popov	6ae0114b05	Rename the CACHE_LINE_SIZE define to MESA_CACHE_LINE_SIZE The former clashes with a define with the same name that comes from FreeBSD base headers. Closes #5737 Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41906>	2026-06-02 20:04:20 +00:00
Karol Herbst	2ea31794f0	rusticl/program: wrap compiler option parsing Reviewed-by: @LingMan Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41905>	2026-06-02 19:30:23 +00:00
Karol Herbst	01de0ff26f	rusticl/program: store log as a CString Reviewed-by: @LingMan Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41905>	2026-06-02 19:30:23 +00:00
Karol Herbst	d7674245e2	rusticl/kernel: store kernel names as CString There is no reason to have this as a Rust string. Reviewed-by: @LingMan Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41905>	2026-06-02 19:30:23 +00:00
Karol Herbst	8711a88d96	rusticl/util: add Traits to help with usage of CString Reviewed-by: @LingMan Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41905>	2026-06-02 19:30:22 +00:00
Karol Herbst	4fa798516e	meson: enable more rust 2024 lints Reviewed-by: @LingMan Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41905>	2026-06-02 19:30:22 +00:00
Emma Anholt	aa26be6ea9	tu: Move to using drirc_gen. Slight reduction in boilerplate (driconf.h definition, tu_device.cc parsing, tu_device.h instance definitions), plus validation that you don't typo between 00-turnip-defaults.conf and tu_device.cc parsing. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Emma Anholt	ae2cc693f8	util/drirc_gen: Make the header usable from C++. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Emma Anholt	2595e4a972	util/drirc_gen: Move the common VK WSI options to a core helper function. I didn't want to copy and paste this again for tu. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Emma Anholt	3ad3bb33d1	util/drirc_gen: Reduce manual importing of functions. I wanted to use another common function, and having to manually import it felt silly given that we're a drirc_gen user. But we are limited because we don't have the common code in the system path at startup, so we can't just import it at module level. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Emma Anholt	a514d1e7ec	radv/drirc_gen: Clean up the dependency handling. This matches Intel -- we get a dependency on the conf file from it appearing as a command arg, and drirc_gen.py from it being in the inputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Emma Anholt	2ac718c1b8	util/drirc_gen: Add a little documentation of what this does. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41877>	2026-06-02 18:49:48 +00:00
Ian Romanick	dcfc90a8fc	nir/algebraic: Convert bcsel of addition to addition of b2i or b2f Recent changes to continue handling in loops results in many cases of loop { ... if (...) { do_continue = true; // was continue; } i = do_continue ? i : i + 1; } I noticed this while investigating mesa#15154. Unfortunately, this doesn't fix the performance regressions noted in that issue. One fragment shader in XCOM: Enemy Unknown doesn't like this change. :( v2: Drop _nsz from a couple bcsel patterns where it is not needed. Suggested by Georg. v3: Drop ~ from the last two fadd patterns. Suggested by Georg. Update expected checksum for plot3d-v2.trace on many platforms. shader-db: All Iris platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17089936 -> 17086837 (-0.02%) instructions in affected programs: 864928 -> 861829 (-0.36%) helped: 696 / HURT: 110 total cycles in shared programs: 864096306 -> 863913752 (-0.02%) cycles in affected programs: 345726340 -> 345543786 (-0.05%) helped: 620 / HURT: 196 total spills in shared programs: 3318 -> 3319 (0.03%) spills in affected programs: 14 -> 15 (7.14%) helped: 0 / HURT: 1 total fills in shared programs: 1604 -> 1606 (0.12%) fills in affected programs: 28 -> 30 (7.14%) helped: 0 / HURT: 1 total sends in shared programs: 876852 -> 876850 (<.01%) sends in affected programs: 6 -> 4 (-33.33%) helped: 2 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 914468779 -> 914215874 (-0.03%); split: -0.03%, +0.00% CodeSize: 12885732160 -> 12881939568 (-0.03%); split: -0.04%, +0.01% Cycle count: 100100279922 -> 100096866800 (-0.00%); split: -0.05%, +0.04% Spill count: 3459786 -> 3459693 (-0.00%); split: -0.01%, +0.01% Fill count: 4909835 -> 4909177 (-0.01%); split: -0.04%, +0.03% Max live registers: 191819298 -> 191822052 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48511264 -> 48510608 (-0.00%); split: +0.00%, -0.00% Non SSA regs after NIR: 136334891 -> 136301926 (-0.02%); split: -0.03%, +0.00% Totals 
from 37416 (1.87% of 2003390) affected shaders: Instrs: 53346249 -> 53093344 (-0.47%); split: -0.48%, +0.01% CodeSize: 775396384 -> 771603792 (-0.49%); split: -0.60%, +0.11% Cycle count: 32275003526 -> 32271590404 (-0.01%); split: -0.14%, +0.13% Spill count: 569304 -> 569211 (-0.02%); split: -0.05%, +0.03% Fill count: 620240 -> 619582 (-0.11%); split: -0.31%, +0.21% Max live registers: 6712048 -> 6714802 (+0.04%); split: -0.01%, +0.05% Max dispatch width: 893344 -> 892688 (-0.07%); split: +0.10%, -0.17% Non SSA regs after NIR: 7191473 -> 7158508 (-0.46%); split: -0.49%, +0.03% Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 985625036 -> 985366432 (-0.03%); split: -0.03%, +0.00% CodeSize: 16446268768 -> 16442606864 (-0.02%); split: -0.03%, +0.01% Cycle count: 91278956920 -> 91272371300 (-0.01%); split: -0.07%, +0.06% Spill count: 3713935 -> 3714003 (+0.00%); split: -0.00%, +0.00% Fill count: 5001514 -> 5001259 (-0.01%); split: -0.03%, +0.02% Max live registers: 120736970 -> 120738919 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 37827808 -> 37829472 (+0.00%); split: +0.01%, -0.00% Non SSA regs after NIR: 160606595 -> 160573270 (-0.02%); split: -0.02%, +0.00% Totals from 38664 (1.71% of 2265137) affected shaders: Instrs: 53621392 -> 53362788 (-0.48%); split: -0.49%, +0.01% CodeSize: 932994544 -> 929332640 (-0.39%); split: -0.52%, +0.13% Cycle count: 24442489628 -> 24435904008 (-0.03%); split: -0.25%, +0.22% Spill count: 550952 -> 551020 (+0.01%); split: -0.02%, +0.03% Fill count: 525010 -> 524755 (-0.05%); split: -0.27%, +0.23% Max live registers: 3594805 -> 3596754 (+0.05%); split: -0.01%, +0.07% Max dispatch width: 510928 -> 512592 (+0.33%); split: +0.47%, -0.14% Non SSA regs after NIR: 7652247 -> 7618922 (-0.44%); split: -0.46%, +0.03% Tiger Lake, Ice Lake, and Skylake had similar results. 
(Tiger Lake shown) Totals: Instrs: 997905938 -> 997771670 (-0.01%); split: -0.01%, +0.00% CodeSize: 13990460928 -> 13988346016 (-0.02%); split: -0.02%, +0.00% Cycle count: 83465002175 -> 83456829524 (-0.01%); split: -0.02%, +0.01% Spill count: 3815020 -> 3814879 (-0.00%); split: -0.01%, +0.00% Fill count: 6561078 -> 6560768 (-0.00%); split: -0.01%, +0.00% Max live registers: 121468149 -> 121468160 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 37914400 -> 37914624 (+0.00%); split: +0.00%, -0.00% Non SSA regs after NIR: 155941530 -> 155944033 (+0.00%); split: -0.00%, +0.00% Totals from 27771 (1.22% of 2273117) affected shaders: Instrs: 31224666 -> 31090398 (-0.43%); split: -0.44%, +0.01% CodeSize: 450250800 -> 448135888 (-0.47%); split: -0.57%, +0.10% Cycle count: 15045135658 -> 15036963007 (-0.05%); split: -0.13%, +0.08% Spill count: 406812 -> 406671 (-0.03%); split: -0.05%, +0.01% Fill count: 391210 -> 390900 (-0.08%); split: -0.10%, +0.02% Max live registers: 2592759 -> 2592770 (+0.00%); split: -0.02%, +0.02% Max dispatch width: 383888 -> 384112 (+0.06%); split: +0.23%, -0.17% Non SSA regs after NIR: 4221402 -> 4223905 (+0.06%); split: -0.01%, +0.07% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:14 +00:00
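The rewrite described in dcfc90a8fc can be illustrated with a minimal sketch. It models the loop-counter update from the commit message, `i = do_continue ? i : i + 1`, and an equivalent form that adds a converted boolean instead of selecting. The helper names `bcsel` and `b2i` mirror the NIR opcode names, but these Python functions are hypothetical stand-ins, not the patterns added to nir/algebraic, and Python integers do not wrap the way fixed-width NIR integers do.

```python
def bcsel(cond, a, b):
    """Select a when cond is true, otherwise b (stand-in for NIR's bcsel)."""
    return a if cond else b


def b2i(cond):
    """Convert a boolean to the integer 0 or 1 (stand-in for NIR's b2i)."""
    return 1 if cond else 0


def next_i_select(i, do_continue):
    """Counter update as written in the commit message: select between i and i + 1."""
    return bcsel(do_continue, i, i + 1)


def next_i_add(i, do_continue):
    """Equivalent update: add 1 only when do_continue is false."""
    return i + b2i(not do_continue)
```

Both functions return the same value for every input, which is why the select can be replaced by an addition.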
Ian Romanick	daa38c1972	nir/opt_if: Merge if-statements with inverted conditions Cases like if (x) { ... } else { ... } if (!x) { ... } else { ... } should be merged. I don't know why Ice Lake is affected differently by this commit. v2: Add implementation of srcs_equal_or_logical_inverse after bad rebase. That's what I get for rushing out an MR right before lunch. Noticed by Georg. shader-db: Lunar Lake No changes. All other Iris platforms had similar results. (Meteor Lake shown) total cycles in shared programs: 882310108 -> 882311504 (<.01%) cycles in affected programs: 74306 -> 75702 (1.88%) helped: 4 HURT: 2 helped stats (abs) min: 2.0 max: 38.0 x̄: 11.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.29% x̄: 0.09% x̃: 0.02% HURT stats (abs) min: 720.0 max: 720.0 x̄: 720.00 x̃: 720 HURT stats (rel) min: 5.27% max: 5.27% x̄: 5.27% x̃: 5.27% 95% mean confidence interval for cycles value: -163.75 629.08 95% mean confidence interval for cycles %-change: -1.21% 4.61% Inconclusive result (value mean confidence interval includes 0). fossil-db: All Intel platforms except Ice Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 914554534 -> 914546744 (-0.00%); split: -0.00%, +0.00% CodeSize: 12887129264 -> 12886823808 (-0.00%); split: -0.00%, +0.00% Send messages: 40220826 -> 40219429 (-0.00%); split: -0.00%, +0.00% Cycle count: 100101810976 -> 100101804762 (-0.00%); split: -0.00%, +0.00% Spill count: 3459811 -> 3459786 (-0.00%); split: -0.00%, +0.00% Fill count: 4909877 -> 4909835 (-0.00%); split: -0.00%, +0.00% Max live registers: 191837229 -> 191838000 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48514400 -> 48514336 (-0.00%) Non SSA regs after NIR: 136346777 -> 136343948 (-0.00%); split: -0.00%, +0.00% Totals from 1937 (0.10% of 2003486) affected shaders: Instrs: 3013550 -> 3005760 (-0.26%); split: -0.39%, +0.13% CodeSize: 43169072 -> 42863616 (-0.71%); split: -0.81%, +0.10% Send messages: 183171 -> 181774 (-0.76%); split: -0.82%, +0.06% Cycle count: 126864798 -> 126858584 (-0.00%); split: -0.67%, +0.67% Spill count: 7354 -> 7329 (-0.34%); split: -0.45%, +0.11% Fill count: 5547 -> 5505 (-0.76%); split: -0.88%, +0.13% Max live registers: 296895 -> 297666 (+0.26%); split: -0.04%, +0.30% Max dispatch width: 41856 -> 41792 (-0.15%) Non SSA regs after NIR: 545672 -> 542843 (-0.52%); split: -1.15%, +0.63% Ice Lake Totals: Instrs: 996341606 -> 996312120 (-0.00%); split: -0.00%, +0.00% CodeSize: 12563695936 -> 12563195200 (-0.00%); split: -0.00%, +0.00% Send messages: 45911343 -> 45909063 (-0.00%); split: -0.00%, +0.00% Cycle count: 82819362995 -> 82818778468 (-0.00%); split: -0.00%, +0.00% Spill count: 2935451 -> 2935452 (+0.00%); split: -0.00%, +0.00% Fill count: 5034267 -> 5034281 (+0.00%); split: -0.00%, +0.00% Max live registers: 124672355 -> 124672961 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 41330808 -> 41330672 (-0.00%) Non SSA regs after NIR: 160790466 -> 160785863 (-0.00%); split: -0.01%, +0.00% Totals from 2163 (0.09% of 2327905) affected shaders: Instrs: 4164788 -> 4135302 (-0.71%); split: -0.80%, +0.09% CodeSize: 53351344 -> 
52850608 (-0.94%); split: -0.95%, +0.01% Send messages: 271164 -> 268884 (-0.84%); split: -0.84%, +0.00% Cycle count: 145818114 -> 145233587 (-0.40%); split: -0.66%, +0.26% Spill count: 7819 -> 7820 (+0.01%); split: -0.32%, +0.33% Fill count: 7191 -> 7205 (+0.19%); split: -0.57%, +0.76% Max live registers: 192403 -> 193009 (+0.31%); split: -0.08%, +0.40% Max dispatch width: 34728 -> 34592 (-0.39%) Non SSA regs after NIR: 570874 -> 566271 (-0.81%); split: -1.49%, +0.68% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:14 +00:00
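The merge described in daa38c1972 can be sketched as follows. The two functions record which branch bodies run, so their traces can be compared. The labels A to D and the function names are hypothetical, and the sketch assumes the condition `x` is not modified between the two if-statements; it does not reproduce the actual nir/opt_if implementation.

```python
def two_ifs(x):
    """Original shape: an if on x followed by an if on the inverted condition."""
    trace = []
    if x:
        trace.append("A")
    else:
        trace.append("B")
    if not x:
        trace.append("C")
    else:
        trace.append("D")
    return trace


def merged_if(x):
    """Merged shape: the second if's else-body joins the first then-branch,
    and its then-body joins the first else-branch."""
    trace = []
    if x:
        trace.append("A")
        trace.append("D")
    else:
        trace.append("B")
        trace.append("C")
    return trace
```

For both values of `x` the two functions produce the same trace, while the merged form evaluates only one branch.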
Ian Romanick	e8cef4725d	nir/opt_if: use nir_def_replace() instead of nir_def_rewrite_uses() Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:13 +00:00
Ian Romanick	4a37fda884	nir: Use nir_instr_remove_v in nir_def_replace The non _v version sets up and returns a nir_cursor that isn't used. Skip that work by calling nir_instr_remove_v directly. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:13 +00:00
Job Noorman	56a8225742	ir3: use a1.x addressing for ldg.k with dst 256 Its dst uses 8 bits so cannot encode 256. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `e6529b54c0` ("ir3: add support for the ldg.k a1.x addressing mode") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:13 +00:00
Rob Herring (Arm)	0972ef7d33	ethosu: Add performance counter debug output Add simple performance counter support as debug output. This is enough to measure NPU cycles for networks. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40269>	2026-06-02 17:07:08 +00:00
Tomeu Vizoso	83d0646d79	teflon/tests: make tflite stubs fail loudly with diagnostics Be more explicit when the stubs are used, even if that shouldn't happen any more. Reviewed-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40269>	2026-06-02 17:07:08 +00:00
Tomeu Vizoso	b33f0cc7fc	teflon/tests: avoid loading build-tree tensorflow-lite stub at runtime It can be quite confusing to see the tests failing to load models without knowing why. To avoid making people waste time with strace, link with the stubs at build time but look for the actual implementation at run time. Reviewed-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40269>	2026-06-02 17:07:08 +00:00
Silvio Vilerino	31fe3637a2	mediafoundation: check for AUTO slice/tile only capable hardware Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41947>	2026-06-02 16:50:00 +00:00
Silvio Vilerino	ce05976385	d3d12: Support video encode AUTO slice/tile only capable hardware Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41947>	2026-06-02 16:50:00 +00:00
Samuel Pitoiset	2e55357b33	radv: fix DGC with conditional rendering and task+mesh shaders While I was investigating some task+mesh random GPU hangs in CI, I finally found a sequence that caused a test failure: dEQP-VK.dgc.ext.graphics.mesh.conditional_rendering.general.classic_bind_with_count_buffer_condition_false_with_task_shader dEQP-VK.dgc.ext.graphics.mesh.token_draw_count.monolithic_with_task_shader Executing these two tests in a row caused the second one to always fail (tested on NAVI33). After investigating I figured out that only the DGC GFX IB was predicated (with IB2) and the DGC ACE IB was always running, although without any mesh draws to consume the task output. It seems the hardware is confused if another task+mesh draw is dispatched after that, and this could cause failures or GPU hangs. Fix this by resetting the number of DGC sequences to 0 when conditional rendering is used. This is the only option to emulate conditional rendering with DGC and ACE. This also likely fixes DGC+RT on compute queue. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41939>	2026-06-02 16:23:18 +00:00
Rhys Perry	1301eef21d	radv: fix usage of radv_nir_cmat_length This should be after we finalize desc.use. Fixes FSR4 on RDNA3. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `ca0496bc26` ("radv: use load_deref_transpose_amd for transposed cooperative matrix loads") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41922>	2026-06-02 15:57:14 +00:00
Mary Guillemard	eb90c4b718	nvk: Do not report task and mesh stages as supported on pre-Turing We should not report support for subgroup ops or DGC for mesh stages on pre-Turing. Signed-off-by: Mary Guillemard <mary@mary.zone> Reported-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `145b8540e5` ("nvk: Advertises VK_EXT_mesh_shader") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41962>	2026-06-02 15:09:43 +00:00
Daniel Schürmann	c0e83629d8	anti-lag: rework wait time calculation The extension allows for a simplified API where pPresentationInfo is NULL and the second call to vkAntiLagUpdateAMD() is omitted which makes it necessary to separate frames on vkQueuePresentKHR(). The second main difference is that the wait time is now based on the previous input stage plus the average frame time. This greatly smooths frame pacing. v2: - measure the GPU frame time directly - Only try to evaluate frames which are likely to complete within the waiting time - Calculate the average absolute deviation of the total frame time and use that to determine the slack time v3: - move frame separation to vkQueuePresentKHR() - tightened frame pacing aiming for at most 1ms overlap Acked-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41727>	2026-06-02 14:21:20 +00:00
Jakob Sinclair	3a3099369f	gallium: fix type size in z24_unorm_packed_pack_z_32unorm util_format_z24_unorm_packed_pack_z_32unorm was accidentally writing 48-bits instead of 24 since it used a 16-bit integer pointer instead of an 8-bit pointer. This could cause a segfault if the function was used, but it is currently unused. Fixes: `18f352090d` ("util/format: Add a Z24_UNORM_PACKED format") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41288>	2026-06-02 13:49:08 +00:00
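The size mismatch described in 3a3099369f can be sketched with a byte buffer. One helper stores a 24-bit depth value as three 8-bit elements; the other stores the same three values as 16-bit elements, which touches six bytes. The conversion from 32-bit to 24-bit by a right shift, the little-endian byte order, and both function names are assumptions made for illustration and are not taken from the Mesa source.

```python
import struct


def pack_z24_bytes(dst, offset, z32):
    """Write the top 24 bits of z32 as three 8-bit elements (3 bytes)."""
    z24 = z32 >> 8
    dst[offset:offset + 3] = z24.to_bytes(3, "little")


def pack_z24_halfwords(dst, offset, z32):
    """Write the same three values through 16-bit elements (6 bytes, 48 bits)."""
    z24 = z32 >> 8
    struct.pack_into("<3H", dst, offset,
                     z24 & 0xFF, (z24 >> 8) & 0xFF, (z24 >> 16) & 0xFF)


# Two 3-byte texels side by side, pre-filled with a marker value.
good = bytearray(b"\xaa" * 6)
bad = bytearray(b"\xaa" * 6)
pack_z24_bytes(good, 0, 0x12345678)      # neighbouring texel is left intact
pack_z24_halfwords(bad, 0, 0x12345678)   # neighbouring texel is overwritten
```

With 16-bit elements the write runs past the 3-byte texel, which at the end of a buffer would be an out-of-bounds access.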
Jakob Sinclair	0925c68adf	pan: Support lowering D24X8 to D24 On Mali, HW does not advertise support for writing D24X8 with AFBC enabled but when AFBC is enabled for a D24X8 image, we can lower it to just D24. In Panfrost we keep the external format as Z24X8 but the internal format as Z24 packed. The driver already handles setting up a new resource in the external format when mapping to CPU since AFBC resources can't be mapped directly anyways. For PanVK we return the Z24 packed format for D24X8 with AFBC and the Z24X8 format otherwise. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41288>	2026-06-02 13:49:08 +00:00
Jakob Sinclair	44362461b5	vulkan/meta: Don't issue a full drawcall for clears The Vulkan meta path used to issue a full drawcall with a rectangle primitive for cmdClear*Image which is not optimal if the device has support for HW clears. This commit enables the meta path to skip the full drawcall if the driver supports it by setting VK_ATTACHMENT_LOAD_OP_CLEAR on the attachment and letting the driver handle setting up a clear pipeline. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41288>	2026-06-02 13:49:08 +00:00
Jakob Sinclair	cdf5703eb1	panvk: Remove unnecessary functions panvk_get_image_layout_transition_handler returns the same zero struct in both paths so it can simply be removed. This also means that transition_image_layout_sync_scope and cmd_transition_image_layout can be removed as they are always NOPs. Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41288>	2026-06-02 13:49:07 +00:00
Samuel Pitoiset	b0ee9510d7	radv: advertise VK_KHR_device_fault Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41648>	2026-06-02 15:18:34 +02:00
Samuel Pitoiset	7dcd2a4c87	radv: implement VK_KHR_device_fault Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41648>	2026-06-02 15:18:22 +02:00
Pavel Ondračka	f6b06ea3de	nir/algebraic: prevent ffract optimization on lowered ffloor ffloor(a) is lowered as a - ffract(a). dEQP expects that for example ffloor(a) == 1.0 for every a between 1.0 and 2.0. This worked fine, but the new ffract(a + b(is_integral)) -> ffract(a) rule broke this. Specifically, dEQP-GLES2.functional.shaders.struct.uniform.equal_fragment checks that ffloor(a + 1.0) == 1.0 for every a between 0.0 and 1.0. However this is not exactly true once the ffract(a + 1.0) is lowered to ffract(a). Prevent this by marking ffract from ffloor lowering as exact so that the recently introduced ffract(a + b(is_integral)) -> ffract(a) rule does not trigger. Fixes: `c6aaafa3` ("nir: add lowering for ffloor") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15562 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41882>	2026-06-02 12:03:09 +00:00
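The rounding problem described in f6b06ea3de can be reproduced with a small sketch. It uses Python double-precision floats rather than shader precision, and the function names are hypothetical; only the lowering `ffloor(a) = a - ffract(a)` and the `ffract(a + b) -> ffract(a)` rule come from the commit message. The input `2**-53` is chosen because `1.0 + 2**-53` rounds to exactly `1.0`.

```python
import math


def ffract(x):
    """Fractional part, x - floor(x)."""
    return x - math.floor(x)


def ffloor_lowered(x):
    """ffloor lowered as x - ffract(x), as in the commit message."""
    return x - ffract(x)


def ffloor_after_rule(a, b):
    """The same lowering of ffloor(a + b) after ffract(a + b) has been
    replaced by ffract(a), which the rule does when b is integral."""
    return (a + b) - ffract(a)


a = 2.0 ** -53                      # a is between 0.0 and 1.0
exact = ffloor_lowered(a + 1.0)     # the sum rounds to 1.0, fract is 0.0
rewritten = ffloor_after_rule(a, 1.0)
```

Here `exact` is `1.0`, while `rewritten` is the largest double below `1.0`, because the rounding already absorbed by `a + 1.0` is subtracted again through `ffract(a)`.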
Mary Guillemard	8ec56cc0cc	docs/nvk: Add some notes about mesh shading and ISBE layout Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:32 +00:00
Mary Guillemard	145b8540e5	nvk: Advertises VK_EXT_mesh_shader Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:32 +00:00
Mary Guillemard	cdb0dea462	nvk: Lower mesh and task shaders Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:32 +00:00
Mary Guillemard	cf29933de1	nvk: Only lower shared memory for compute shaders It's a no-op on other stages so let's not run this. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:32 +00:00
Mary Guillemard	424c466cdb	nvk: Do not set lower_cs_local_index_to_id With task/mesh shaders, we need that lowering to not happen. Move to conditionally lower local invocation index with nir_lower_compute_system_values_options in case of compute shader. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:32 +00:00
Mary Guillemard	50b3ae08e7	nak: Implement mesh and task shader stages Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	368a6693bc	nvk: Implement mesh draw commands Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	96ade67e2b	nvk: Add support for mesh and task shader binding Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	3286990481	nvk: Prepare cbuf for mesh shader support Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	3348003735	nvk: Prepare nvk_shader for GS header upload for mesh shaders This defines a structure and a way to upload the GS header when present. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	1f31ec46be	nak: Add a lowering pass for shared memory atomics in mesh stages This adds a new lowering pass for shared memory atomics that will be used for mesh/task stages. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	b95dbc64bf	nir,nak: Add match_any_nv NVIDIA hardware has an instruction allowing you to retrieve the mask of active threads matching the same source value as the current invocation. This is going to be used by shared memory lowering for mesh / task stages on NVK. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Danylo Piliaiev	d88c183785	tu: Disable FS in some cases even when FS explicitly writes D/S For example, the FS may write gl_SampleMask while color writes are masked out and there is no depth attachment. Note that the proprietary driver still considers more state when disabling the FS, such as the depth test being disabled, and thus disables the FS in cases where we do not. However, I think that is too much of a stretch unless we find some real workload needing it. This change also allows disabling an FS that has discard. This requires being careful around occlusion queries, since when one is enabled, we cannot disable an FS that can discard. Found via gpu-ratemeter bench: vk.pix.noaa.output.color+z+samplemask.colormask=0 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41857>	2026-06-02 11:56:26 +02:00
Maíra Canal	c4a1d9583c	etnaviv/ml: derive stride-2 destriding offsets from padding The destriding lowering hard-coded a special case for weight_width == 5 with a fallback "+1" branch that was only correct for 3x3 kernels. Replace it with formulas derived from TFLite's SAME-padding rule for stride 2: The half-resolution expansion applied to the reshuffle output and to the strided_to_normal() input is: weight_width / 2 which gives 1 for 3x3, 2 for 5x5, and 3 for 7x7 kernels. The reshuffle window start offset is: (weight_width + input_width % 2 - 2) / 2 This folds the previous odd-input fixup into the same expression and preserves the existing 3x3 and 5x5 behavior while extending the lowering to wider odd kernels such as 7x7. Fixes Models.Op/inception_000, which uses Inception V1's Conv2d_1a_7x7, in the Teflon test suite. Signed-off-by: Maíra Canal <mairacanal@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41774>	2026-06-02 08:13:11 +00:00
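The two formulas quoted in c4a1d9583c can be evaluated with a short sketch. The function names are hypothetical, and the sketch assumes the divisions are truncating integer divisions as in C; Python's `//` gives the same result for the odd kernel widths of 3 and above that the commit discusses.

```python
def half_res_expansion(weight_width):
    """Half-resolution expansion: weight_width / 2 with integer division."""
    return weight_width // 2


def reshuffle_window_start(weight_width, input_width):
    """Reshuffle window start offset:
    (weight_width + input_width % 2 - 2) / 2 with integer division."""
    return (weight_width + input_width % 2 - 2) // 2


# Expansion for 3x3, 5x5 and 7x7 kernels, as listed in the commit message.
expansions = [half_res_expansion(w) for w in (3, 5, 7)]
```

The `input_width % 2` term adds one to the numerator for odd input widths, which is how the odd-input fixup is folded into the same expression.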
David Rosca	c24e4085a1	radeonsi/mm: Set PIPE_RESOURCE_FLAG_UNMAPPABLE for buffers This creates the BO with AMDGPU_GEM_CREATE_NO_CPU_ACCESS for buffers that we don't map. Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41850>	2026-06-02 07:06:50 +00:00

223507 commits