fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 11:28:05 +02:00

Author	SHA1	Message	Date
Eric Engestrom	aa2d16c80a	docs: add sha sum for 26.0.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40780>	2026-04-03 14:18:06 +00:00
Eric Engestrom	902f016612	docs: add release notes for 26.0.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40780>	2026-04-03 14:18:06 +00:00
Eric Engestrom	c4bd563e6e	docs: update calendar for 26.0.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40780>	2026-04-03 14:18:06 +00:00
Arkady Shlykov	b1d379eeba	anv: Add control over divergent atomics fusion opt via driconf anv_enable_opt_divergent_atomics driconf option supported values: 1 - fuse buffer divergent atomics 2 - fuse image divergent atomics Enabled for titles: Total War: WARHAMMER III The Elder Scrolls IV: Oblivion Remastered Call of Duty: Black Ops III Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631>	2026-04-03 12:17:01 +00:00
Arkady Shlykov	7f7ba20cca	brw: Implement divergent atomics fusion optimization (single message approach) For an atomic with a divergent addr generates a CFG grouping the same addrs values together and emits a single atomic with fused data covering the subgroup. Lanes with other addr values perform a default atomic. Co-authored-by: Jhanani Thiagarajan <jhanani.thiagarajan@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631>	2026-04-03 12:17:01 +00:00
Lionel Landwerlin	fab6f84126	brw: make the program key available on pass_tracker Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631>	2026-04-03 12:17:01 +00:00
Valentine Burley	d15ba8d14a	turnip/ci: Add Android job with ANGLE on a618 This is a Cuttlefish-based Android job running with DRM native context, using Turnip and ANGLE. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Valentine Burley	57d55f8c09	ci/android: Update Cuttlefish build The new version updates the default Mesa version to 26.1.0-devel. This is used for booting the VM, after which point the drivers are replaced by the ones built in the Mesa CI pipeline. Fixes GPU faults with ANGLE on Turnip. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Valentine Burley	1b8018389b	ci/android: Enable virtio freedreno KMD support Enable the virtio freedreno kernel mode driver in the debian-android build. This will be used by Cuttlefish virtual machines. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Valentine Burley	d4cd93854a	ci/android: Refactor replacing Vulkan drivers Setting the VK_DRIVER variable for lavapipe jobs simplifies the driver replacement logic while keeping all existing paths working. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Valentine Burley	9548bf86f4	ci/android: Add 5-minute timeout to Cuttlefish launch Cuttlefish usually boots within 2-3 minutes, and this ensures logs are saved if the boot process hangs or fails. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Valentine Burley	338a96f0a0	venus/ci: Move android-angle-venus-tu-a618 to sc7180-trogdor-kingoftown Huge thanks to Laura and Doug for updating the EC and AP firmware, and for switching the network adapter across all trogdor Chromebooks, enabling them to boot Cuttlefish. Also limit the concurrency to 6 for now. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40010>	2026-04-03 09:25:14 +00:00
Lionel Landwerlin	30fda4488f	nir: divergence analysis support for image_heap_load_param_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `22b16d54ab` ("nir: add heap variant of load_param_intel") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40777>	2026-04-03 07:49:24 +00:00
Job Noorman	893d3caf7b	tu: use nir_opt_varyings_bulk for linking Replace the use of nir_link_opt_varyings/nir_compact_varyings for linking with the new nir_opt_varyings linker using the nir_opt_varyings_bulk helper. This moves all the NIR lowering up to nir_lower_io (tu_lower_nir) to the pre-linking stage since nir_opt_varyings expects lowered IO. Totals from 38233 (21.69% of 176258) affected shaders: MaxWaves: 522636 -> 522578 (-0.01%); split: +0.14%, -0.15% Instrs: 15111014 -> 15062812 (-0.32%); split: -0.71%, +0.39% CodeSize: 31555448 -> 31530676 (-0.08%); split: -0.70%, +0.62% NOPs: 2605163 -> 2582030 (-0.89%); split: -2.38%, +1.49% MOVs: 519056 -> 511167 (-1.52%); split: -4.88%, +3.36% COVs: 244091 -> 243317 (-0.32%); split: -0.55%, +0.23% Full: 463796 -> 463307 (-0.11%); split: -0.47%, +0.36% (ss): 390558 -> 386374 (-1.07%); split: -3.07%, +2.00% (sy): 180298 -> 179347 (-0.53%); split: -1.55%, +1.02% (ss)-stall: 1485337 -> 1473362 (-0.81%); split: -3.92%, +3.11% (sy)-stall: 5441818 -> 5375690 (-1.22%); split: -2.99%, +1.78% Preamble Instrs: 3707325 -> 3724339 (+0.46%); split: -0.38%, +0.84% Early Preamble: 29397 -> 29392 (-0.02%); split: +0.10%, -0.12% Cat0: 2883908 -> 2860585 (-0.81%); split: -2.16%, +1.35% Cat1: 765447 -> 757066 (-1.09%); split: -3.46%, +2.36% Cat2: 5664380 -> 5663562 (-0.01%); split: -0.51%, +0.49% Cat3: 4393358 -> 4386474 (-0.16%); split: -0.27%, +0.12% Cat4: 443624 -> 443546 (-0.02%); split: -0.03%, +0.01% Cat5: 427389 -> 427239 (-0.04%); split: -0.27%, +0.24% Cat6: 173632 -> 164362 (-5.34%); split: -5.36%, +0.02% Cat7: 359276 -> 359978 (+0.20%); split: -1.33%, +1.53% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	99713d0c53	tu: extract NIR lowering to a separate function As a preparation for a later commit moving NIR lowering to before shader linking. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	59a547cba4	ir3: call nir_io_add_intrinsic_xfb_info after IO lowering Needed by nir_opt_varyings. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	3ae9f0cd0a	ir3: call nir_lower_io_vars_to_temporaries for GS outputs Divergence analysis doesn't allow load_output on GS outputs so make sure they are lowered away. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	c27f0406b0	ir3: fix handle_partial_const with vectorized src Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `50a91fbf87` ("freedreno/ir3: cleanup "partially const" ubo srcs") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	adfc1086c9	nir/recompute_io_bases: fix num_slots for per_view outputs per_view outputs use one slot per enabled view. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	a72704d0fb	nir/gather_info: clear interpolation qualifiers before gathering Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Fixes: `66740d9c91` ("nir: gather interpolation qualifiers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	2403e88a76	nir/gather_info: gather per_view info Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	273fd18b89	nir/opt_varyings: fix alu def cloning nir_builder_alu_instr_finish_and_insert initialized the def's bit_size and num_components so we should set them afterwards. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Fixes: `c66967b5cb` ("nir: add nir_opt_varyings, new pass optimizing and compacting varyings") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Job Noorman	d56d35aa76	nir/opt_varyings_bulk: add data parameter to optimize callback Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40651>	2026-04-03 08:18:08 +02:00
Timothy Arceri	27b56314ee	radeonsi: add Gun Godz workaround This is another game based on the old YoYo engine Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15209 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40757>	2026-04-03 00:01:32 +00:00
Sagar Ghuge	19f39910a9	anv/bvh: Drop atomic on instance_count Thanks to Konstantin for pointing out that we really don't need atomics here. We can use the IR offset to get the slot and keep stuffing the instance address in it. Header already writes the instance count for us. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40734>	2026-04-02 22:14:11 +00:00
Rob Herring (Arm)	e100ca7c86	ethosu: Move ethosu_allocate_feature_map() to ethosu_lower.c Now that all callers of ethosu_allocate_feature_map() are in ethosu_lower.c, move it there too. Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40719>	2026-04-02 21:04:25 +00:00
Rob Herring (Arm)	dd10897c5d	ethosu: Drop 2nd allocation of IFM and OFM The IFM and OFM were already allocated by the call to allocate_feature_maps() in ethosu_lower_convolution(). Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40719>	2026-04-02 21:04:25 +00:00
Rob Herring (Arm)	e89a672ab7	ethosu: Fix U85 AvgPool for greater than 8x8 kernel sizes The U85 uses average mode for kernel sizes less than or equal to 8x8 and sum mode for larger (in either dimension) kernel sizes. According to the U85 TRM, the average and sum modes have the following constraints: average - Average pooling up to 8x8, inbuilt scale only sum - Sum or average pooling, per-channel, or global scale Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40719>	2026-04-02 21:04:25 +00:00
Jason Macnak	cabc55e9a5	gfxstream: fix submit to not hold lock when calling into encoder ... as this can lead to a deadlock with the following sequence: Time1: guest-thread-1: vkDestroyImageView() called Time2: VkEncoder grabs seqno 1 Time3: guest-thread-2: vkQueueSubmit() called Time4: ResourceTracker::on_vkQueueSubmitTemplate() locks mLock for using `info_VkFence` Time5: ResourceTracker::on_vkQueueSubmitTemplate() calls enc->vkQueueWaitIdle() Time6: VkEncoder grabs seqno 2 Time7: VkEncoder sends the vkQueueWaitIdle with seqno 2 via ASG to host Time8: VkEncoder waits for the `VkResult` from the host via `stream->read()` Time9: guest-thread-1: VkEncoder calls sResourceTracker->destroyMapping() ->mapHandles_VkImageView((VkBuffer*)&buffer); which calls ResourceTracker::unregister_VkImageView() ResourceTracker::unregister_VkImageView() tries to lock mLock to erase the info struct !!! DEADLOCKED HERE !!! guest-thread-1 is stuck waiting on mLock (currently locked by guest-thread-2) before it would `stream->flush();` to finishing sending the vkDestroyImageView() command to the host and potentially ping its corresponding host-render-thread-1. guest-thread-2 is stuck waiting on the result from host-render-thread-2 but host-render-thread-2 won't progress until host-render-thread-1 finishes seqno 1 which needs guest-thread-1 to finish sending/pinging. Android equivalent change ag/39258728 for b/498964194 Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader open maps pan/zoom/etc for a couple minutes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40767>	2026-04-02 20:20:52 +00:00
Casey Bowman	007be58ade	intel/ds: Modify rejection threshold to scale with requested sample period Previously, we only checked if the hardware duration was greater than the requested sample period by 1000 ns. This can lead the hardware duration to be rejected and use the next cycle, which is double the size of the current duration. At larger requested sample size, this can mean getting a hardware duration of 1.7 ms for a requested sample period of 1 ms. To fix this, we'll scale the check so that it uses 67% of the requested sample period as the reject threshold. This way, if the hardware duration is below 67%, it's guaranteed to be within 100%-133% of the requested sample period on the next hardware interval. Signed-off-by: Casey Bowman <casey.g.bowman@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40735>	2026-04-02 18:56:16 +00:00
Caio Oliveira	0bf3aaedb1	brw: Always use split send in generator Instead of generating special single source send in some cases, always use the split send (called SENDS pre-Xe, and the only option in Xe). Having code-path for single source was relevant for old Gfx versions, but for Gfx9+ split send is always available. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40755>	2026-04-02 18:31:02 +00:00
Danylo Piliaiev	3335e707e1	tu: u_trace usage fixes before u_trace refactoring - We won't be able to rely on u_trace_fini leaving u_trace in valid state, so u_trace_init should be called after it. - There probably was a double-free of u_trace_submission_data. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40728>	2026-04-02 16:20:09 +00:00
Rob Clark	75fad9e2c4	tu/kgsl: Add UBWC_5 and UBWC_6 support Handle the two additional UBWC versions used on gen8. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40758>	2026-04-02 15:55:55 +00:00
Zan Dobersek	d574bf0d64	tu/kgsl: bump msm_kgsl.h header Update the msm_kgsl.h header up to the d45f9faad921 kgsl commit. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40758>	2026-04-02 15:55:55 +00:00
Marek Olšák	27a4c58745	nir/tests: test nir_opt_varyings with sysvals Test that view_index is moved, and sample_mask_in isn't. Acked-by: Pierre-Eric Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40664>	2026-04-02 14:38:56 +00:00
Marek Olšák	f3e208ee6c	nir/opt_varyings: move expressions with view_index into preceding shaders Example: Before: VS output0 = v0 VS output1 = v1 FS output = gl_ViewIndex == 0 ? input0 : input1; After: VS output0 = gl_ViewIndex == 0 ? v0 : v1; FS output = input0; Acked-by: Pierre-Eric Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40664>	2026-04-02 14:38:56 +00:00
Marek Olšák	92cf9af827	nir: factor out nir_system_value_from_instr from nir_opt_varyings Acked-by: Pierre-Eric Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40664>	2026-04-02 14:38:56 +00:00
Marek Olšák	bfc75c0641	nir: return a failure value from nir_system_value_from_intrinsic We need to be able to check whether an intrinsic loads a sysval. Acked-by: Pierre-Eric Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40664>	2026-04-02 14:38:56 +00:00
Rob Clark	e6af9524b0	freedreno/a6xx: Fix blit fmt check The commit that introduced 9_9_9_E5 RB support mistakenly broke fake-format blits (such as compressed formats, etc). Re-order the logic to restore fake-format blits. Fixes iova fault in manhattan. Not to mention inadvertently falling off of the A2D path for a lot of blits. Fixes: `9dc3410512` ("tu: Add support for VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 color attachments") Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40754>	2026-04-02 14:21:03 +00:00
Silvio Vilerino	b83a931cb1	d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40756>	2026-04-02 13:54:02 +00:00
Silvio Vilerino	9f4d3267c9	d3d12: Fix video fence leak and double assign Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40756>	2026-04-02 13:54:02 +00:00
Wenfeng Gao	aa5398689b	mediafoundation: Fix the frame number validation logic for motion hint The external move region frame number was continuously generated. However, the current POC was reset based on IDR. Modified the logic of validation and logged a warning in case of mismatch. Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40756>	2026-04-02 13:54:02 +00:00
Sergi Blanch Torne	bae86c3118	ci: fix envvar default value With `047bb6b8` on !35740, when GIT_STRATEGY is not defined, the scripts can fail where we use `set -u` to raise an error when unset variables. Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40759>	2026-04-02 09:09:52 +00:00
Pavel Ondračka	33864e569e	st/bitmap: release the temporary bitmap sampler view st_cb_bitmap appends a temporary bitmap sampler view to the sampler view array passed to set_sampler_views(). `1a5c660ef5` changed this path to only release the extra YUV views returned by st_get_sampler_views(), but the temporary bitmap view is created locally and is not part of extra_sampler_views. It therefore stopped being released so release the temporary bitmap sampler view explicitly after drawing the bitmap quad. Fixes: `1a5c660ef5` ("st/bitmap: only release YUV samplerviews") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40694>	2026-04-02 08:34:54 +00:00
Karol Herbst	72e9f9a760	nak: add algebraic patterns to improve MUFU.F16 Doesn't really help many shaders, but I've seen a couple that turn from MUFU into F2F(MUFU.F16(F2F)). Though this might be as well a limitation of related code, e.g. returning F32 from TEX, and not use TEX.F16 instead. Totals: CodeSize: 8662337424 -> 8662336960 (-0.00%) Static cycle count: 4718044491 -> 4718044554 (+0.00%); split: -0.00%, +0.00% Totals from 7 (0.00% of 1163204) affected shaders: CodeSize: 236480 -> 236016 (-0.20%) Static cycle count: 2108061 -> 2108124 (+0.00%); split: -0.01%, +0.01% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:57 +00:00
Karol Herbst	9cc2cd843b	nak: enable MUFU.F16 on Turing and newer Totals from 1427 (0.12% of 1163204) affected shaders: CodeSize: 18599616 -> 18495424 (-0.56%); split: -0.56%, +0.00% Number of GPRs: 91579 -> 91571 (-0.01%) SLM Size: 14144 -> 14140 (-0.03%) Static cycle count: 96164214 -> 96075886 (-0.09%); split: -0.13%, +0.04% Spills to memory: 2677 -> 2681 (+0.15%) Fills from memory: 2677 -> 2681 (+0.15%) Max warps/SM: 48868 -> 48872 (+0.01%) Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:57 +00:00
Karol Herbst	c7ff7c7d40	nak: add hw_test for MUFU.F16 Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:57 +00:00
Karol Herbst	d031365f7c	nak: support MUFU.F16 Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:57 +00:00
Karol Herbst	3d94841bba	nak: remove OpF2F::dst_high It was dead code Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:57 +00:00
Karol Herbst	67bfbc7535	nak: rework swizzling on scalar FP16 ops Instructions that take a F16 value can generally select which component to read from. This lets us get rid of some PRMTs. This also cleans up partial support for it for F2F and streamlines everything into a uniform model as previously it wasn't wired up generally and copy prop didn't always propagate the swizzle through. This also makes it unnecessary to apply a Xx swizzle to scalar FP16 sources. Totals from 907 (0.08% of 1163204) affected shaders: CodeSize: 40856816 -> 40843408 (-0.03%); split: -0.03%, +0.00% Static cycle count: 20898101 -> 20895619 (-0.01%); split: -0.01%, +0.00% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40392>	2026-04-02 01:10:56 +00:00
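
Note on cabc55e9a5 (gfxstream): the deadlock in that commit message comes from holding mLock across a blocking encoder call. Below is a minimal Python sketch of the general remedy, which is to copy the needed state while holding the lock and release it before the blocking call. It is not the gfxstream code: the class `ResourceTrackerSketch`, its fields, and the `wait_idle` callback are hypothetical stand-ins for ResourceTracker, its info structs, and enc->vkQueueWaitIdle().

```python
import threading


class ResourceTrackerSketch:
    """Hypothetical stand-in for a tracker that guards per-handle info with one lock."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of mLock
        self.fence_info = {"fence-1": {"external": True}}

    def unregister(self, handle):
        # Called from other code paths (e.g. object destruction); needs the lock.
        with self.lock:
            self.fence_info.pop(handle, None)

    def submit(self, fence, wait_idle):
        # Copy what we need while holding the lock...
        with self.lock:
            info = dict(self.fence_info.get(fence, {}))
        # ...and only call the potentially blocking function after releasing it,
        # so code that needs the lock can still make progress meanwhile.
        return wait_idle(info)
```

In this sketch, a `wait_idle` callback that itself ends up calling `unregister()` completes normally. If `submit()` held the lock across the call, the same callback would block forever, because `threading.Lock` is not reentrant.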


220677 commits
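
Note on e89a672ab7 (ethosu): that commit message states that the U85 uses average mode for kernel sizes up to 8x8 and sum mode when either dimension is larger. The rule can be sketched in Python as below. The function name `u85_avgpool_mode` and the string return values are illustrative assumptions, not identifiers from the driver.

```python
def u85_avgpool_mode(kernel_height: int, kernel_width: int) -> str:
    """Pick the pooling mode per the rule quoted in the commit message.

    Hypothetical helper: "average" for kernels up to 8x8, "sum" as soon as
    either dimension exceeds 8.
    """
    if kernel_height < 1 or kernel_width < 1:
        raise ValueError("kernel dimensions must be positive")
    if kernel_height <= 8 and kernel_width <= 8:
        return "average"
    return "sum"
```

For example, an 8x8 kernel selects average mode, while 9x1 and 1x9 kernels both select sum mode.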