fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 22:38:06 +02:00

Author	SHA1	Message	Date
Sagar Ghuge	8a990b5a1c	intel/genxml: Added dispatch timeout counter extended field Since field is split in between multiple fields, we have to manually write the values and refer to Bspec 43851 for exact values. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40733>	2026-04-24 01:38:20 +00:00
Emma Anholt	01cb024922	ci/intel: Switch over to the new tool for restricted traces. The new tool has much better image diffing presentation (thanks to Danilo's work on turnip's private trace CI), better performance, flake checking within a single run, parallelized downloads along with replays, system monitoring for replay debug (OOMs especially), and DXVK support (I've added a few traces, but not most of the collection because I didn't want to block on stabilizing this job with everything). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41115>	2026-04-23 22:54:12 +00:00
Sagar Ghuge	e65e62b17f	intel/genxml: Disable compute walker mid-thread preemption On Xe, we have this bit reversed. It's called Thread preemption Disable. On Xe2+ (Bspec 56590), it's called Thread preemption with option enabled/disabled. AFAIK, we don't support mid-thread preemption. This patch set values properly according to bspec. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41120>	2026-04-23 19:24:41 +00:00
Lionel Landwerlin	b3fe0cb34e	anv: expose VK_KHR_shader_constant_data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40741>	2026-04-23 19:02:27 +00:00
Tapani Pälli	c105366165	drirc/anv: add flag to disable VK_EXT_subgroup_size_control This can be used to workaround problem cases with application controlled subgroup size. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40813>	2026-04-23 13:16:05 +00:00
Iván Briano	c5edb90046	anv: silence warning ../src/intel/vulkan/genX_init_state.c: In function ‘gfx9_CreateSampler’: ../src/intel/vulkan/genX_init_state.c:1507:40: warning: ‘border_color_offset’ may be used uninitialized [-Wmaybe-uninitialized] 1507 | sampler_state.BorderColorPointer = border_color_offset; Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41116>	2026-04-22 16:17:35 -07:00
GKraats	3c01e6139a	hasvk: unbreak assert format != ISL_FORMAT_UNSUPPORTED Format is set to ISL_FORMAT_UNSUPPORTED at anv_get_format_plane at src/intel/vulkan_hasvk/anv_formats.c, because Ivy Bridge does not support enough 24 and 48-bits formats. Problem solved by checking format after the call. Signed-off-by: GKraats <vd.kraats@hccnet.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40237>	2026-04-22 20:35:25 +00:00
Valentine Burley	d982092865	anv/ci: Revert ADL VKCTS job to stable 6.17 kernel Xe is unstable on 6.19; revert to the previous stable kernel. https://gitlab.freedesktop.org/mesa/mesa/-/jobs/97945843 https://gitlab.freedesktop.org/mesa/mesa/-/jobs/97944526 Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41112>	2026-04-22 19:29:43 +00:00
Caio Oliveira	26ef12f7c1	brw: Use brw prefix to LSC helpers tied to brw Mapping from BRW ops to LSC ops. And the len() helpers that use the REG_SIZE as unit -- which is a BRW convention. Acked-by: Iván Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41006>	2026-04-22 18:25:41 +00:00
Caio Oliveira	9329da6d88	brw: Don't set saturate for SYNC instruction This helper might be used by another instruction emission, which itself might have set the saturate bit in the default state. This might result in the SYNC being created already with saturate bit set. Since SYNC doesn't have saturate, clear that field instead of sometimes having it set. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41005>	2026-04-22 16:06:42 +00:00
Lionel Landwerlin	6031d52393	anv: implement VK_EXT_primitive_restart_index Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40776>	2026-04-22 08:52:57 +00:00
Samuel Pitoiset	9d17a7bdb4	spirv,treewide: rework specialization constant With SPV_KHR_constant_data, it's allowed to specialize array of constants. RustiCL changes are from Karol Herbst <kherbst@redhat.com>. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>	2026-04-22 06:57:55 +00:00
Sagar Ghuge	12f81eaa88	anv: Enable dynamic stack ID control on Xe3+ This patch enables dynamic stack ID control on Xe3+. Programmed values are the recommended settings from the Bspec. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41066>	2026-04-22 01:48:18 +00:00
Sagar Ghuge	acecc0f1b3	intel/genxml: Update xml for dynamic stack ID control fields Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41066>	2026-04-22 01:48:18 +00:00
Sagar Ghuge	620835926d	brw: Pass write back register for ray query messages DG2 (Bspec 47937) has the same programming note as Xe2+: "When this bit is set in the header, Trace Ray Message behaves like a Ray Query. This message requires a write-back message indicating RayQuery for all valid Rays (SIMD lanes) have completed." So this patch just passes a write-back destination register when we have a ray query message. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41039>	2026-04-21 23:16:09 +00:00
José Roberto de Souza	64bc538f5e	intel/brw: Explicitly upcast UB to UW for SHR with vector immediates HW does not allow instructions with vector immediates to cross a GRF boundary if it has a stride. Under register pressure, the register allocator may place a temporary register across such a boundary. To resolve this, we now explicitly emit a MOV to upcast the UB payload into a UW VGRF. This ensures the SHR instruction operates on a dense, well-aligned region that satisfies hardware alignment constraints. Below is the portion of the shader exhibiting this issue: Native code for unnamed fragment shader GLSL6 (src_hash 0x9c84a007) (sha1 48745e7dae90d08f8a9bbe4dbf837de23440c841f0344e669cb8af9df79bce58) SIMD32 shader: 44 instructions. 0 loops. 354 cycles. 0:0 spills:fills, 2 sends, scheduled with mode latency-sensitive. Promoted 0 constants. GRF registers: 22. Non-SSA regs (after NIR): 11. Compacted 800 to 800 bytes (0%) mov(1) f1<1>UW g0.30<0,1,0>UW { align1 WE_all 1N }; mov(1) f1.1<1>UW g1.30<0,1,0>UW { align1 WE_all 1N I@1 }; mov(32) g2<2>UW g0.20<2,8,0>UW { align1 WE_all }; mov(32) g4<2>UW g0.21<2,8,0>UW { align1 WE_all }; mov(32) g8<2>UW g1.20<2,8,0>UW { align1 WE_all }; mov(32) g10<2>UW g1.21<2,8,0>UW { align1 WE_all }; mov(16) g12<4>UB g0.60<1,8,0>UB { align1 1H }; mov(16) g13<4>UB g1.60<1,8,0>UB { align1 2H }; add(32) g0<1>UW g2<16,8,2>UW 0x01000100V { align1 WE_all I@6 }; add(32) g1<1>UW g4<16,8,2>UW 0x01010000V { align1 WE_all I@6 }; add(32) g2<1>UW g8<16,8,2>UW 0x01000100V { align1 WE_all I@6 }; add(32) g3<1>UW g10<16,8,2>UW 0x01010000V { align1 WE_all I@6 }; shr(16) g4<1>UW g12<32,8,4>UB 0x76543210V { align1 1H I@6 }; mov(16) g14.32<4>UB g13<32,8,4>UB { align1 2H I@6 }; sync nop(1) null<0,1,0>UB { align1 WE_all 1N I@6 }; mov(16) g5<1>UW g0<16,8,2>UW { align1 1H }; sync nop(1) null<0,1,0>UB { align1 WE_all 1N I@6 }; mov(16) g0<1>UW g1<16,8,2>UW { align1 1H }; sync nop(1) null<0,1,0>UB { align1 WE_all 5N I@6 }; mov(16) g5.16<1>UW g2<16,8,2>UW { align1 2H }; 
sync nop(1) null<0,1,0>UB { align1 WE_all 5N I@6 }; mov(16) g0.16<1>UW g3<16,8,2>UW { align1 2H }; shr(16) g4.16<1>UW g14.32<32,8,4>UB 0x76543210V { align1 2H I@5 }; ERROR: Invalid register region for source 0. See special restrictions section. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40856>	2026-04-21 22:51:45 +00:00
Eric R. Smith	4ae192a3d9	glsl, spirv: Improve accuracy of asin() and acos() The polynomial used for asin_expr() was suboptimal (and its source was not documented). A better approximation is found in the _Handbook_of_Mathematical_Functions_ by Abramowitz and Stegun, which is used in Nvidia's Cg toolkit. However, while this approximation gives a good absolute error bound, its relative error exceeds the 4096 ulp allowed by the Vulkan spec. Taking a page from the spirv implementation of asin(), we implement a piecewise approximation where a Taylor series is used for small values of |x|. This patch also harmonizes the GLSL and Vulkan implementations by moving the implementation to common code (nir_builder). Running tests on asin() with a grid of 64000 samples between 0.0 and +1.0, the original asin() at 32 bits has: ``` glsl spirv RMSE: 1.756451e-04 1.609091e-04 worst abs error: 3.904104e-04 at 0.937001 3.904104e-04 at 0.937001 worst ulp error: 11800 at 6.2499e-05 3826 at 0.841331 ``` whereas the new implementation has for both: ``` RMSE: 2.528056e-05 worst abs error: 4.962087e-05 at 0.451149 worst ulp error: 2379 at 0.215106 ``` Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40862>	2026-04-21 21:10:22 +00:00
Jordan Justen	fa784fffd0	brw: Don't set header_size at init since it will be re-set in later code Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Ref: `efcba73b49` ("brw: switch to new sampler payload description scheme") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:41 +00:00
José Roberto de Souza	26525ac7ae	anv: Move code to load color border to memory to a function Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:41 +00:00
José Roberto de Souza	83d75a0384	anv: Move init and finish of state pools to its own functions Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:41 +00:00
José Roberto de Souza	a4c22baeb4	anv: Move VMA heaps init and finish of vma heaps to anv_va.c Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:40 +00:00
José Roberto de Souza	32f3d6486c	anv: Change fill_inline_params() first parameter from struct GENX(COMPUTE_WALKER_BODY) to uint32_t * This will make this function more generic allowing us to use it for COMPUTE_WALKER_2. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:40 +00:00
Lionel Landwerlin	b0c17357db	intel/ci: update expectation for RPL This fails everywhere, but CI only runs this test on RPL. A CTS fix has been merged in main. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39451>	2026-04-21 16:29:14 +00:00
Lionel Landwerlin	eda83bc2b6	anv: add a pass to realign global loads on DX CBV resources CBV resources are supposed to be 256B aligned (D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT). vkd3d-proton puts CBV addresses in the push constant data and does global loads on them. Unfortunately those loads don't have a 256B alignment value on them. So when looking at what we can promote to HW push buffers, we can't consider them. This change introduces a detection pass for CBV resources (according to vkd3d-proton devs those are 64KiB in size) and realigns the loads to be 256B aligned. This is only enabled on DX emulation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39451>	2026-04-21 16:29:14 +00:00
Lionel Landwerlin	bba428ce3f	anv: promote push constant pointers to push buffers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39451>	2026-04-21 16:29:14 +00:00
Lionel Landwerlin	0539f26065	brw: track push constants shader stats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39451>	2026-04-21 16:29:14 +00:00
Sagar Ghuge	7a627fa8f3	anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921 StackSizePerRay is the RTDispatchGlobals::AsyncStackSize and DisableRTGlobalsKnownValues is to interpret how many Max BVH levels we need to use. It's not relevant to Vulkan, since we have just 2 fixed BVH levels. Fixes: `cb423ee6` ("anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921") Fixes: `c1a44e8d` ("anv: force StackIDControl value for Wa_14021821874") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41012>	2026-04-21 01:38:34 +00:00
Alyssa Rosenzweig	fd46a48ccc	jay/ra: only use stride=4 temps SIMD16: Totals from 56 (2.12% of 2647) affected shaders: Instrs: 541831 -> 542004 (+0.03%); split: -0.40%, +0.44% CodeSize: 8597680 -> 8597248 (-0.01%); split: -0.45%, +0.44% SIMD32: Totals: Instrs: 4858179 -> 4734713 (-2.54%); split: -2.78%, +0.24% CodeSize: 78651424 -> 76667440 (-2.52%); split: -2.76%, +0.24% Totals from 1108 (41.86% of 2647) affected shaders: Instrs: 4241312 -> 4117846 (-2.91%); split: -3.18%, +0.27% CodeSize: 68753152 -> 66769168 (-2.89%); split: -3.16%, +0.27% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:12 +00:00
Alyssa Rosenzweig	1f62da938b	jay/ra: drop memory copy reordering No shader-db changes, and no longer required for correctness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:12 +00:00
Alyssa Rosenzweig	45845ea7f2	jay/ra: use accumulator for stride=4 swaps SIMD16: Totals: Instrs: 2767930 -> 2767190 (-0.03%) CodeSize: 44327408 -> 44312304 (-0.03%); split: -0.04%, +0.00% Totals from 142 (5.36% of 2647) affected shaders: Instrs: 658928 -> 658188 (-0.11%) CodeSize: 10514512 -> 10499408 (-0.14%); split: -0.16%, +0.01% SIMD32: Totals: Instrs: 4884039 -> 4858179 (-0.53%) CodeSize: 79079008 -> 78651424 (-0.54%); split: -0.54%, +0.00% Totals from 761 (28.75% of 2647) affected shaders: Instrs: 3803274 -> 3777414 (-0.68%) CodeSize: 61707728 -> 61280144 (-0.69%); split: -0.70%, +0.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:12 +00:00
Alyssa Rosenzweig	489f883277	jay/ra: use accumulator for memory swaps SIMD16: Totals from 34 (1.28% of 2647) affected shaders: Instrs: 427731 -> 434349 (+1.55%); split: -0.03%, +1.58% CodeSize: 6773248 -> 6881136 (+1.59%); split: -0.04%, +1.63% Number of spill instructions: 1833 -> 1700 (-7.26%) Number of fill instructions: 2095 -> 1944 (-7.21%) SIMD32: Totals from 621 (23.46% of 2647) affected shaders: Instrs: 3663406 -> 3739089 (+2.07%); split: -0.62%, +2.68% CodeSize: 59392464 -> 60624704 (+2.07%); split: -0.61%, +2.68% Number of spill instructions: 52115 -> 50109 (-3.85%); split: -3.90%, +0.05% Number of fill instructions: 53864 -> 51355 (-4.66%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:11 +00:00
Alyssa Rosenzweig	2e5fd6da42	jay/ra: use accumulator for memory copies SIMD16: Totals from 34 (1.28% of 2647) affected shaders: Instrs: 424527 -> 427731 (+0.75%); split: -0.03%, +0.78% CodeSize: 6720896 -> 6773248 (+0.78%); split: -0.04%, +0.82% Number of spill instructions: 1967 -> 1833 (-6.81%) Number of fill instructions: 2247 -> 2095 (-6.76%) SIMD32: Totals: Instrs: 4691989 -> 4808356 (+2.48%); split: -0.46%, +2.94% CodeSize: 76011248 -> 77884320 (+2.46%); split: -0.46%, +2.92% Number of spill instructions: 54223 -> 52115 (-3.89%); split: -4.08%, +0.19% Number of fill instructions: 56519 -> 53864 (-4.70%) Totals from 606 (22.89% of 2647) affected shaders: Instrs: 3509511 -> 3625878 (+3.32%); split: -0.61%, +3.93% CodeSize: 56909488 -> 58782560 (+3.29%); split: -0.61%, +3.90% Number of spill instructions: 54223 -> 52115 (-3.89%); split: -4.08%, +0.19% Number of fill instructions: 56519 -> 53864 (-4.70%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:11 +00:00
Alyssa Rosenzweig	7d2a88a9e5	jay/ra: don't reserve registers when not spilling No changes at SIMD16. At SIMD32: Totals: Instrs: 4691895 -> 4691989 (+0.00%); split: -0.03%, +0.03% CodeSize: 76010880 -> 76011248 (+0.00%); split: -0.03%, +0.03% Number of spill instructions: 54369 -> 54223 (-0.27%) Number of fill instructions: 56668 -> 56519 (-0.26%) Totals from 71 (2.68% of 2647) affected shaders: Instrs: 75963 -> 76057 (+0.12%); split: -1.67%, +1.79% CodeSize: 1229792 -> 1230160 (+0.03%); split: -1.71%, +1.74% Number of spill instructions: 146 -> 0 (-inf%) Number of fill instructions: 149 -> 0 (-inf%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:11 +00:00
Alyssa Rosenzweig	e5bf153d4f	jay/lower_post_ra: drop old 2<-->8 lowering this XOR based lowering is no longer needed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:10 +00:00
Alyssa Rosenzweig	915af8e121	jay/lower_post_ra: remove SWAP macro Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:10 +00:00
Alyssa Rosenzweig	4c5ad7a832	jay/register_allocate: start using accumulators this lets us lower away 8<-->2 copies/swaps in a faster, more straightforward way by (ab)using accumulators. I think as an edge case this plays nicely enough with my plans to profit from accs for normal fma-heavy code. SIMD16: Totals: Instrs: 2761525 -> 2758108 (-0.12%) CodeSize: 44222384 -> 44167168 (-0.12%) Totals from 33 (1.25% of 2647) affected shaders: Instrs: 422130 -> 418713 (-0.81%) CodeSize: 6713680 -> 6658464 (-0.82%) SIMD32: Totals: Instrs: 4911601 -> 4691895 (-4.47%) CodeSize: 79553984 -> 76010880 (-4.45%) Totals from 947 (35.78% of 2647) affected shaders: Instrs: 4143501 -> 3923795 (-5.30%) CodeSize: 67174592 -> 63631488 (-5.27%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:10 +00:00
Alyssa Rosenzweig	53c1c076a8	jay: validate non-SSA accumulators just enough for us to do parallel copy lowering with them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:09 +00:00
Alyssa Rosenzweig	28cf0f52c1	jay/to_binary: handle packing accumulators Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:09 +00:00
Alyssa Rosenzweig	aa37d8b248	jay/print: deal with bare r0 copies Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:09 +00:00
Kenneth Graunke	e55af8793f	jay: Add missing ROR case Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:09 +00:00
Alyssa Rosenzweig	6c862b1951	jay: fix SEL types SEL.f32 flushes denorms but SEL.u32 does not. That means changing the type of the SEL is only justified if we know we're used as a float. This fixes miscompilation in cases like: ieq(1, bcsel(a, fneg(b), c)) Previously we'd be too greedy and form (a) SEL.f32 t, -b, c cmp.u32 t, 1 But that would inadvertently flush c which is an integer here. So just set the type based on what we're used as. Some regressions due to is_only_used_as_float not seeing through phis (..could probably be fixed?). Totals: Instrs: 2760796 -> 2761525 (+0.03%); split: -0.06%, +0.08% CodeSize: 44244128 -> 44222384 (-0.05%); split: -0.13%, +0.08% Totals from 945 (35.70% of 2647) affected shaders: Instrs: 1968645 -> 1969374 (+0.04%); split: -0.08%, +0.11% CodeSize: 31721968 -> 31700224 (-0.07%); split: -0.17%, +0.11% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:09 +00:00
Alyssa Rosenzweig	b5898a418b	jay: relax mov type check prevents regression with next patch which turns u32 into s32. Totals: Instrs: 2764288 -> 2760796 (-0.13%) CodeSize: 44299920 -> 44244128 (-0.13%); split: -0.13%, +0.00% Totals from 193 (7.29% of 2647) affected shaders: Instrs: 255455 -> 251963 (-1.37%) CodeSize: 4160400 -> 4104608 (-1.34%); split: -1.34%, +0.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:07 +00:00
Alyssa Rosenzweig	1b648326ac	jay: refuse to propagate ADDRESS copies at least until we have address RA.. Totals: Instrs: 2764282 -> 2764288 (+0.00%) CodeSize: 44299872 -> 44299920 (+0.00%) Totals from 2 (0.08% of 2647) affected shaders: Instrs: 4215 -> 4221 (+0.14%) CodeSize: 67456 -> 67504 (+0.07%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:07 +00:00
Alyssa Rosenzweig	56ffad0c3a	jay: call DCE an extra time Totals: Instrs: 2767235 -> 2765908 (-0.05%); split: -0.10%, +0.05% CodeSize: 44349488 -> 44328688 (-0.05%); split: -0.10%, +0.06% Totals from 347 (13.11% of 2647) affected shaders: Instrs: 718067 -> 716740 (-0.18%); split: -0.39%, +0.20% CodeSize: 11626032 -> 11605232 (-0.18%); split: -0.39%, +0.21% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:06 +00:00
Alyssa Rosenzweig	d85eb51e17	jay/register_allocate: don't depend on indexing this can get messed up by optimizations. Totals: Instrs: 2768612 -> 2764317 (-0.16%); split: -0.29%, +0.13% CodeSize: 44367648 -> 44300352 (-0.15%); split: -0.28%, +0.13% Totals from 867 (32.75% of 2647) affected shaders: Instrs: 1694745 -> 1690450 (-0.25%); split: -0.47%, +0.22% CodeSize: 27387648 -> 27320352 (-0.25%); split: -0.46%, +0.21% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:06 +00:00
Alyssa Rosenzweig	a964f321a5	jay: don't print internal without the flag Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:06 +00:00
Alyssa Rosenzweig	3a73c76373	jay: fix spiller coupling code Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:05 +00:00
Alyssa Rosenzweig	cd6c5a2f90	jay: improve spiller debug Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:05 +00:00
Alyssa Rosenzweig	d637554418	jay: fix simd32 deswizzle Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:05 +00:00
Alyssa Rosenzweig	f728e3cb05	jay: test logic op fusing Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41064>	2026-04-20 22:32:04 +00:00

15908 commits