fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 12:58:09 +02:00

Author	SHA1	Message	Date
Iván Briano	3448f3ce4a	intel/brw: add load_coverage_mask_intel intrinsic We'll need the raw coverage mask provided to the fragment shader in a future patch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Caleb Callaway <caleb.callaway@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38879>	2026-05-11 18:15:49 +00:00
Caio Oliveira	46cd7b6e28	brw: Move brw_prog_data_init to a different file The generator code will be reworked, remove this unrelated function from there. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41458>	2026-05-10 00:07:15 +00:00
Caio Oliveira	2273533504	brw: Fix some indentation in brw_generator.cpp Will reduce noise in later changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41459>	2026-05-09 16:40:32 -07:00
Kenneth Graunke	2729b1608f	brw: Limit SIMD width based on NIR rather than first backend compile I originally added this mechanism to have the first (SIMD8) compile note that certain features were in use which would prevent SIMD16/32 from compiling, so we could skip the work of trying those. But these days, there aren't many cases, and the ones we have are easily detectable based on the NIR. We can detect it earlier without even having to do the SIMD8 compile. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Kenneth Graunke	c5928d40ae	brw: Drop dead code from dispatch limit check for dual source blending We checked that ver is 11 or 12. It can't be >= 20. This is dead code. Dual source blending on Xe2 does not have native SIMD32 RT write message support, but SIMD splitting is currently lowering it to low/high SIMD16 message pairs when using SIMD32 dispatch. I'm not aware of any of the hardware errata from previous platform still applying. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Kenneth Graunke	599d26db00	brw: Set prog_data::dual_src_blend from NIR outputs written bitfield Simpler and set earlier. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Kenneth Graunke	afb97ff2af	brw: Switch FS outputs to semantic IO and FRAG_RESULT_DUAL_SRC_BLEND The new FRAG_RESULT_DUAL_SRC_BLEND option is easier to work with than looking for FRAG_RESULT_DATA0 with an index of 1. This also means we no longer care about the dual source blend index, and can just use the FRAG_RESULT location. That cascades to meaning we no longer have to store a tuple in driver_location. And, if we just need location, we can avoid populating that at all and use nir_io_semantics to get it. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Kenneth Graunke	fbaa5ad0c3	iris: Implement force_dual_color_blend_by_location via NIR We can just have iris look at its own program key and change the fragment shader output variable's location/index in the NIR. By doing this before lowering fragment shader outputs, the rest of the output lowering does the right thing, and the backend no longer has to consider hacks for broken OpenGL apps. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Rhys Perry	ec59b59b97	nir: rename nir_src_parent_instr to nir_src_use_instr sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f` sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f` sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f` There are two kinds of "parent" in relation to a src/def: - the instruction where the def or src's def is defined - the instruction which the src is a part of and where the def is used Clarify that the parent here is where the src's def is used, not where it's defined. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>	2026-05-06 17:09:22 +00:00
Lionel Landwerlin	c30a4d4fdb	anv/brw/nir: fix wa_18019110168 Several things were wrong : - incorrect offset in the FS push constant data - incorrect encoding of the 32bit values with 2 fields (remap table offset & provoking vertex) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31384>	2026-05-06 09:49:41 +00:00
Lionel Landwerlin	25bc517ef5	brw: add heap support to brw_lower_storage_image Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Lionel Landwerlin	5ec7d31e20	brw/lower_texel_address: add heap support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39478>	2026-05-05 18:21:16 +00:00
Calder Young	4120ae4963	brw: Avoid vectorizing loads in NIR if it could extend into a different page Took inspiration from RADV to make nir_opt_load_store_vectorize robust against page faults, by checking the align_offset and align_mul to see if any extra components could be overlapping into a different page. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Calder Young	3ac6233655	brw: Avoid rounding every convergent block load up to a full register To simplify things, our backend rounds convergent block loads up to a full register. This causes page faults with the scratch page disabled since the address is not always aligned to a register size. Loading smaller blocks is slightly more difficult because the SEND instruction can only write back a multiple of full registers, even if the actual data is smaller. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>	2026-05-01 19:51:41 +00:00
Caio Oliveira	1ebc14bcb9	brw: Stop tracking inline parameter usage in prog_key/prog_data Since inline parameter is the last field of the thread payload, the backend can always assume they may exist. They won't affect the position of other payload fields and the register allocator will reuse any unused space. In Anv, also update EmitInlineParameter for Task/Mesh/CS to reflect previous changes in inline parameter setup. Remove/Update some stale comments since we are here. Finally, remove the prog_key/prog_data bits that tracked whether inline data or a push address was needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41230>	2026-04-30 16:39:22 +00:00
Caio Oliveira	e1745e0bd9	brw: Fix max_dispatch_width collection for CS with variable size The intention of the original commit was to make all the shaders report the same max_dispatch_width. When CS has multiple variants, this was not happening as expected. Fixes: `2acc2f18ea` ("intel/compiler: report max dispatch width statistic") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41209>	2026-04-29 15:52:04 +00:00
Ian Romanick	e301817753	brw: Don't lower phis involved in DPAS instructions to scalar On my Arc A380 (DG2), this more than doubles the performance of Jeff Bolz's cooperative matrix benchmark. With llama.cpp modified to use cooperative matrix on DG2, performance is improved by 37%. Closes: #15311 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Matt Corallo <git@bluematt.me> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41172>	2026-04-27 18:09:16 +00:00
Ian Romanick	09b43966ba	brw: Lower all phis to scalar The next commit will cause some very specific phis to not be lowered to scalar, and that's the reason the callback is used instead of nir_lower_all_phis_to_scalar. It's worth noting that the comment in nir_lower_phis_to_scalar.c specifically calls out Deus Ex as the reason some phis should not be lowered. At least on current BRW, zero shaders from Deus Ex trace were affected for spills or fills on any Intel platform. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17050005 -> 17051449 (<.01%) instructions in affected programs: 41032 -> 42476 (3.52%) helped: 29 / HURT: 159 total cycles in shared programs: 876411976 -> 876433702 (<.01%) cycles in affected programs: 1455550 -> 1477276 (1.49%) helped: 40 / HURT: 150 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 916599633 -> 916694854 (+0.01%); split: -0.00%, +0.01% CodeSize: 14705971792 -> 14708302384 (+0.02%); split: -0.00%, +0.02% Send messages: 40870114 -> 40870113 (-0.00%) Cycle count: 102360965889 -> 102364169753 (+0.00%); split: -0.00%, +0.01% Spill count: 3460669 -> 3460240 (-0.01%) Fill count: 4988325 -> 4987891 (-0.01%) Max live registers: 192914542 -> 192918153 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48848112 -> 48848128 (+0.00%) Non SSA regs after NIR: 141633613 -> 141671589 (+0.03%); split: -0.00%, +0.03% Totals from 5713 (0.28% of 2010434) affected shaders: Instrs: 5215921 -> 5311142 (+1.83%); split: -0.09%, +1.91% CodeSize: 88940784 -> 91271376 (+2.62%); split: -0.20%, +2.82% Send messages: 284751 -> 284750 (-0.00%) Cycle count: 275671864 -> 278875728 (+1.16%); split: -0.74%, +1.90% Spill count: 857 -> 428 (-50.06%) Fill count: 845 -> 411 (-51.36%) Max live registers: 667776 -> 671387 (+0.54%); split: -0.86%, +1.40% Max dispatch width: 160416 -> 160432 (+0.01%) Non SSA regs after NIR: 1127904 -> 1165880 (+3.37%); split: -0.10%, +3.47% 
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Matt Corallo <git@bluematt.me> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41172>	2026-04-27 18:09:16 +00:00
Alyssa Rosenzweig	bccaeb28bb	brw/nir_lower_cs_intrinsics: do some math at 16-bit There are less than 2^16 lanes within a threadgroup, so it is safe to do all math at 16-bit. This allows us to use 16-bit integer division which is much faster than 32-bit integer division (in terms of the lowerings). In a "hello world" kernel with variable wg size, simd32 goes 72 inst -> 57 inst on jay and 82 -> 67 inst on brw. OTOH it's a loss for non-variable wg size, so do it only there to avoid unwelcome stats regresions on Vulkan. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41084>	2026-04-24 17:13:24 +00:00
Caio Oliveira	0422165d9a	brw: Remove various unused fields These are a mix of fields whose last used was removed or fields that were never used, possibly because they remained in a patch while the rest of the code changed before landing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41139>	2026-04-24 15:04:25 +00:00
Caio Oliveira	26ef12f7c1	brw: Use brw prefix to LSC helpers tied to brw Mapping from BRW ops to LSC ops. And the len() helpers that use the REG_SIZE as unit -- which is a BRW convention. Acked-by: Iván Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41006>	2026-04-22 18:25:41 +00:00
Caio Oliveira	9329da6d88	brw: Don't set saturate for SYNC instruction This helper might be used as by another instruction emission, which itself might have set the saturate bit in the default state. This might result in the SYNC being created already with saturate bit set. Since SYNC doesn't have saturate, clear that field instead of sometimes having it set. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41005>	2026-04-22 16:06:42 +00:00
Sagar Ghuge	620835926d	brw: Pass write back register for ray query messages For DG2 (Bspec 47937) has the same programming note as of Xe2+, "When this bit is set in the header, Trace Ray Message behaves like a Ray Query. This message requires a write-back message indicating RayQuery for all valid Rays (SIMD lanes) have completed." So this patch is just passing a write back destination register when we have ray query message. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41039>	2026-04-21 23:16:09 +00:00
José Roberto de Souza	64bc538f5e	intel/brw: Explicitly upcast UB to UW for SHR with vector immediates HW does not allow instructions with vector immediates to cross a GRF boundary if it has a stride. Under register pressure, the register allocator may place a temporary register across such a boundary. To resolve this, we now explicitly emit a MOV to upcast the UB payload into a UW VGRF. This ensures the SHR instruction operates on a dense, well-aligned region that satisfies hardware alignment constraints. Below is the portion of the shader exhibiting this issue: Native code for unnamed fragment shader GLSL6 (src_hash 0x9c84a007) (sha1 48745e7dae90d08f8a9bbe4dbf837de23440c841f0344e669cb8af9df79bce58) SIMD32 shader: 44 instructions. 0 loops. 354 cycles. 0:0 spills:fills, 2 sends, scheduled with mode latency-sensitive. Promoted 0 constants. GRF registers: 22. Non-SSA regs (after NIR): 11. Compacted 800 to 800 bytes (0%) mov(1) f1<1>UW g0.30<0,1,0>UW { align1 WE_all 1N }; mov(1) f1.1<1>UW g1.30<0,1,0>UW { align1 WE_all 1N I@1 }; mov(32) g2<2>UW g0.20<2,8,0>UW { align1 WE_all }; mov(32) g4<2>UW g0.21<2,8,0>UW { align1 WE_all }; mov(32) g8<2>UW g1.20<2,8,0>UW { align1 WE_all }; mov(32) g10<2>UW g1.21<2,8,0>UW { align1 WE_all }; mov(16) g12<4>UB g0.60<1,8,0>UB { align1 1H }; mov(16) g13<4>UB g1.60<1,8,0>UB { align1 2H }; add(32) g0<1>UW g2<16,8,2>UW 0x01000100V { align1 WE_all I@6 }; add(32) g1<1>UW g4<16,8,2>UW 0x01010000V { align1 WE_all I@6 }; add(32) g2<1>UW g8<16,8,2>UW 0x01000100V { align1 WE_all I@6 }; add(32) g3<1>UW g10<16,8,2>UW 0x01010000V { align1 WE_all I@6 }; shr(16) g4<1>UW g12<32,8,4>UB 0x76543210V { align1 1H I@6 }; mov(16) g14.32<4>UB g13<32,8,4>UB { align1 2H I@6 }; sync nop(1) null<0,1,0>UB { align1 WE_all 1N I@6 }; mov(16) g5<1>UW g0<16,8,2>UW { align1 1H }; sync nop(1) null<0,1,0>UB { align1 WE_all 1N I@6 }; mov(16) g0<1>UW g1<16,8,2>UW { align1 1H }; sync nop(1) null<0,1,0>UB { align1 WE_all 5N I@6 }; mov(16) g5.16<1>UW g2<16,8,2>UW { align1 2H }; 
sync nop(1) null<0,1,0>UB { align1 WE_all 5N I@6 }; mov(16) g0.16<1>UW g3<16,8,2>UW { align1 2H }; shr(16) g4.16<1>UW g14.32<32,8,4>UB 0x76543210V { align1 2H I@5 }; ERROR: Invalid register region for source 0. See special restrictions section. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40856>	2026-04-21 22:51:45 +00:00
Jordan Justen	fa784fffd0	brw: Don't set header_size at init since it will be re-set in later code Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Ref: `efcba73b49` ("brw: switch to new sampler payload description scheme") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41035>	2026-04-21 19:23:41 +00:00
Lionel Landwerlin	0539f26065	brw: track push constants shader stats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39451>	2026-04-21 16:29:14 +00:00
Sagar Ghuge	7a627fa8f3	anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921 StackSizePerRay is the RTDispatchGlobals::AsyncStackSize and DisableRTGlobalsKnownValues is to interpret how many Max BVH levels we need to use. It's not relevant to Vulkan, since we have just 2 fixed BVH levels. Fixes: `cb423ee6` ("anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921") Fixes: `c1a44e8d` ("anv: force StackIDControl value for Wa_14021821874") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41012>	2026-04-21 01:38:34 +00:00
Tapani Pälli	8736d1a9a6	intel/compiler: implement macl part of Wa_18035690555 Fixes: `3ab9145393` ("intel/compiler: implement dummy mov for Wa_18035690555") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40941>	2026-04-19 11:39:10 +00:00
Lionel Landwerlin	a84c12414c	brw: don't support frontfacing ternary optimization on != 32bit Fix shader compilation on Crimson Desert : 16 %1995 = b32csel %1992, %1993, %1994 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40931>	2026-04-13 20:32:06 +00:00
Lionel Landwerlin	46d42b63da	brw: add support for < 32bit io values Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40821>	2026-04-13 14:01:04 +00:00
Kenneth Graunke	7468261d3d	intel/nir: Make intel_nir_lower_sparse work for either brw or jay Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	5c4cae2661	brw: lower ifind_msb for Jay I'm not sure brw wants the common lowering but Jay definitely does. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	3afd572d8f	brw: add Jay-specific SIMD selection rule In the future this might even do something clever. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	f33454e35a	brw: disable nir_opt_uniform_atomics for Jay While Jay supports subgroups, efficient reductions are TODO so it's probably better not to run this pass yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	4778fc1ab7	brw: disable hw generate local ID for jay Jay will need more work to handle these payloads properly especially in SIMD32. For now just disable the optimization for Jay for correctness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	e8b9f26561	brw: subgroup lowering for jay Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Kenneth Graunke	2780a327fa	intel: add INTEL_JAY environment variable Add a new environment variable controlling which shader stages use the experimental compiler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Kenneth Graunke	09089fdd13	nir: Add nir_texop_sparse_residency[_txf]_intel operations These lowered versions map to what Jay can deal with. The hardware is more flexible but we're not due to data model restrictions. We choose to lower to get us off the ground, we can revisit later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	181611786c	brw: round up block components this ensures we don't see vec5 @load_ssbo_uniform_block_intel which requires special backend handling, instead rounding up in NIR to vec8 which the LSC can do. affects dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec3_lowp_compute. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>	2026-04-10 09:16:43 +00:00
Alyssa Rosenzweig	70e246d7bc	brw: chop up unaligned access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>	2026-04-10 09:16:43 +00:00
Alyssa Rosenzweig	c1e1cc9b01	brw: lower mem access sizes even for UBOs Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>	2026-04-10 09:16:43 +00:00
Alyssa Rosenzweig	9d82888383	brw: lower 16-bit mulh Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40877>	2026-04-10 09:16:42 +00:00
Kenneth Graunke	0b99c88337	nir, brw: lower scratch in NIR This will let us share a common scratch swizzling between brw and jay. Changes by Ken: - Use an immediate SIMD width when known so we don't need to re-lower - Switch to load_simd_width_intel because it may not match info->api_subgroup_size on Vulkan without VK_EXT_subgroup_size_control - Stop using DWord Scattered Write messages for scratch. These take an offset in DWords, and our offsets are now always in bytes. This also means that we no longer create MEMORY_OPCODE_* IR with inconsistent units of either bytes or dwords. Yikes. We use byte scattered messages now. fossil-db stats on Battlemage: Instrs: 500477504 -> 500450056 (-0.01%); split: -0.01%, +0.00% CodeSize: 7807432368 -> 7806786192 (-0.01%); split: -0.01%, +0.00% Cycle count: 62404008370 -> 62398437734 (-0.01%); split: -0.01%, +0.00% Fill count: 546690 -> 546695 (+0.00%); split: -0.00%, +0.00% Max live registers: 141257956 -> 141258100 (+0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 72350283 -> 72336544 (-0.02%) Totals from 99 (0.01% of 1581969) affected shaders: Instrs: 366593 -> 339145 (-7.49%); split: -7.58%, +0.09% CodeSize: 6425936 -> 5779760 (-10.06%); split: -10.06%, +0.00% Cycle count: 2412009876 -> 2406439240 (-0.23%); split: -0.26%, +0.03% Fill count: 19675 -> 19680 (+0.03%); split: -0.02%, +0.04% Max live registers: 17600 -> 17744 (+0.82%); split: -0.09%, +0.91% Non SSA regs after NIR: 37894 -> 24155 (-36.26%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Alyssa Rosenzweig	140616d26a	brw: scalarize even 64-bit scratch access No, I don't know how this worked before, thanks for asking. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Alyssa Rosenzweig	15b11635a2	brw: Move intel_nir_opt_peephole_imul32x16 later in compilation (Split by Ken out of a patch authored by Alyssa.) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Kenneth Graunke	e5598166b0	brw: Have brw_nir_apply_key call brw_nir_lower_simd for all stages brw_nir_apply_key typically knows the dispatch width (it's fixed for geometry stages, and we clone the NIR for compute and mesh shaders). For compute/mesh, this was the very next thing called. For the others, if we know the width, there's no reason not to lower it. Scratch lowering will start using load_simd_width_intel soon, so we need it to work in all stages. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Kenneth Graunke	765d74eebe	brw: Set nir->info.{min,max}_subgroup_size in brw_nir_apply_key This records the actual SIMD width we selected for the shader, in all cases except fragment shaders, where we don't know it yet. MR 37258 notes that "Backends can update [these fields] when they make new decisions about the subgroup size" - which is what we now do. Note that nir->info.api_subgroup_size may be different than min/max subgroup size on Vulkan prior to SPV1.6/VK_EXT_subgroup_size_control, so we do not alter that. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Kenneth Graunke	d7d2d7aceb	brw: Support load_simd_width_intel for fragment shaders This lets us emit NIR code based on the SIMD size. For non-fragment stages, we'll replace it with a constant and optimize, but for FS, we delay it until the backend. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Kenneth Graunke	cac9f670d1	intel/compiler: Use nir_static_workgroup_size helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Tapani Pälli	3ab9145393	intel/compiler: implement dummy mov for Wa_18035690555 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37804>	2026-04-09 07:30:01 +00:00


440 commits