fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 00:28:08 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	ef2a95a40a	brw: move brw_can_coherent_fb_fetch to a C header this isn't C++ brw code, it's just a devinfo query. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>	2026-03-02 12:44:42 +00:00
Alyssa Rosenzweig	d6d1dc5822	brw: move brw_nir_pack_vs_input to brw_nir.c It's just a pass like the others. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>	2026-03-02 12:44:42 +00:00
Caio Oliveira	df4042371f	anv: Set PIPELINE_SELECT systolic mode based on shader usage For Gfx125 workloads that use systolic mode, this might mean an extra PIPELINE_SELECT when flipping between a compute shader that uses the mode and another that doesn't use the mode (or vice-versa). Reviewed-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>	2026-02-26 19:05:56 +00:00
Caio Oliveira	ffc3219d57	brw: Add lowering for nir_cmat_call_op_per_element_op Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39904>	2026-02-26 18:45:20 +00:00
Caio Oliveira	63e1592f8d	brw/scoreboard: Don't track dependencies for UNDEFs Dependencies in UNDEFs were already not propagated by update_inst_scoreboard(), since the instruction there was considered neither ordered nor unordered, and was also not used to resolve implicit dependencies. The generator was already ignoring any baked dependency, but for cases where UNDEF had two dependencies, a sync nop would be generated -- which would be redundant with a later sync nop. Since we know UNDEFs have no dependencies, stop treating them specially when trimming dependencies. This patch removes this particular class of redundant sync nops. No functional change is expected. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39875>	2026-02-26 06:54:48 +00:00
Lionel Landwerlin	487586fefa	anv: implement inline parameter promotion from push constants Push constants on bindless stages of Gfx12.5+ don't get the data delivered in the registers automatically. Instead the shader needs to load the data with SEND messages. Those stages do get a single InlineParameter 32B block of data delivered into the EU. We can use that to promote some of the push constant data that has to be pulled otherwise. The driver will try to promote all push constant data (app + driver values) if it can; if it can't, it'll try to promote only the driver values (usually a shader will only use a few driver values). If even the driver values won't fit, give up and don't use the inline parameter at all. LNL internal fossil-db: Totals from 315738 (20.08% of 1572649) affected shaders: Instrs: 155053691 -> 154920901 (-0.09%); split: -0.09%, +0.00% CodeSize: 2578204272 -> 2574991568 (-0.12%); split: -0.15%, +0.02% Send messages: 8235628 -> 8184485 (-0.62%); split: -0.62%, +0.00% Cycle count: 43911938816 -> 43901857748 (-0.02%); split: -0.05%, +0.03% Spill count: 481329 -> 473185 (-1.69%); split: -1.82%, +0.13% Fill count: 405617 -> 399243 (-1.57%); split: -1.86%, +0.28% Max live registers: 34309395 -> 34309300 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 8298224 -> 8299168 (+0.01%) Non SSA regs after NIR: 18492887 -> 17631285 (-4.66%); split: -4.73%, +0.08% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Lionel Landwerlin	7f19814414	brw/nir: handle inline_data_intel more like push_data_intel It's pretty much the same mechanism, except it's a different register location. With this change we gain indirect loading support. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Caio Oliveira	922e3c75cf	brw: Explicitly set group=0 in generator for SYNC used in workaround Instead of using whatever group was set by the previous instruction. No behavior change, just normalizes what we generate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>	2026-02-20 17:11:59 +00:00
Caio Oliveira	4382d51cd0	brw: Make brw_builder::uniform() ignore previous group The `group()` helper creates the new builder "relative" to the existing one, so this was resulting in some uniform instructions having a non-zero channel offset ("group") -- which was surprising and had no practical effect. Normalize to always use group = 0. No change in behavior expected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>	2026-02-20 16:50:41 +00:00
Ian Romanick	da1fd9786b	elk/cmod: Don't propagate from CMP to ADD if there is a write between If either source of the CMP is modified before an appropriate ADD is found, the ADD and the CMP will not have the same result. No shader-db changes on any ELK platform. I suspect the problematic cases only occur after scheduling has rearranged instructions. This is likely the reason BRW didn't experience this problem until `09450faf`. Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:55 +00:00
Ian Romanick	bdbfe8de4d	elk/cmod: Don't propagate from CMP to possible Inf + (-Inf) This is a backport of BRW `e26270249b`. shader-db: All Intel platforms had similar results. (Broadwell shown) total instructions in shared programs: 18623918 -> 18624594 (<.01%) instructions in affected programs: 125179 -> 125855 (0.54%) helped: 0 / HURT: 139 total cycles in shared programs: 957073100 -> 957072484 (<.01%) cycles in affected programs: 16534168 -> 16533552 (<.01%) helped: 42 / HURT: 68 Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:54 +00:00
Ian Romanick	d1614cd6db	brw/cmod: Don't propagate from CMP to ADD if there is a write between If either source of the CMP is modified before an appropriate ADD is found, the ADD and the CMP will not have the same result. shader-db: Lunar Lake total instructions in shared programs: 17098815 -> 17098818 (<.01%) instructions in affected programs: 1187 -> 1190 (0.25%) helped: 0 / HURT: 3 total cycles in shared programs: 876858960 -> 876858968 (<.01%) cycles in affected programs: 6878 -> 6886 (0.12%) helped: 0 / HURT: 1 Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown) total instructions in shared programs: 20034973 -> 20034984 (<.01%) instructions in affected programs: 4599 -> 4610 (0.24%) helped: 0 / HURT: 11 total cycles in shared programs: 881033088 -> 881033108 (<.01%) cycles in affected programs: 57872 -> 57892 (0.03%) helped: 0 / HURT: 5 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 918873064 -> 918873269 (+0.00%) CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00% Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00% Totals from 205 (0.01% of 2011421) affected shaders: Instrs: 290415 -> 290620 (+0.07%) CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03% Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02% Closes: #14874 Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:54 +00:00
José Roberto de Souza	39ec9e3448	intel/brw: Add and call brw_lsc_supports_base_offset() in places that checks for support of this feature Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>	2026-02-19 16:53:03 +00:00
José Roberto de Souza	91c5744e25	intel/brw: Use computed push constants size in brw_assign_urb_setup() It was already computed in brw_shader::assign_curb_setup() so we can use it in brw_assign_urb_setup(). There was a mismatch between assign_curb_setup() and brw_assign_urb_setup() when push_sizes were not a multiple of REG_SIZE: the first one was aligning every push_size before summing them, while brw_assign_urb_setup() was only aligning the sum of all push_sizes. By luck, the only places that did not have a push_size aligned to REG_SIZE only had one push_size, so this was not an issue. So this also fixes the mismatch and adds an assert to catch any future mismatch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>	2026-02-19 16:53:03 +00:00
Alyssa Rosenzweig	5386e93865	brw: use data helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>	2026-02-19 14:47:11 +00:00
Kenneth Graunke	1478329c53	iris: Move ALT mode handling from brw to iris We just read this from the NIR and store it in iris_compiled_shader, there's no reason for the backend compiler to be involved. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:51:00 +00:00
Kenneth Graunke	b985494d6f	iris: Create our own enums for system values These days, our system value concept is just about iris_program communicating to iris_state which values to upload into a UBO. Nowhere in that process is the backend compiler involved, so it doesn't make sense for there to be brw/elk mechanisms. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:51:00 +00:00
Kenneth Graunke	53c5798194	iris: Move passthrough TCS generation out of brw and into iris iris needs this, but anv does not, and it's just a small wrapper around common NIR lowering anyway. This also removes some brw/elk splitting. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:50:59 +00:00
Kenneth Graunke	341687a019	brw: Drop extra validation from TCS passthrough creation nir_create_passthrough_tcs already validates the result, we don't need to validate a second time. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:50:59 +00:00
Kenneth Graunke	a8481295d8	brw: Only lower system values for passthrough TCS This cuts 75 passes that do nothing useful. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:50:58 +00:00
Rhys Perry	f44de53586	nir: only set fp_math_ctrl if meaningful Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>	2026-02-18 14:04:22 +00:00
Kenneth Graunke	add69407c7	brw: Use memset for initializing varying/slot maps Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>	2026-02-16 15:15:38 -08:00
Kenneth Graunke	19d9e10f4d	brw: Drop VUE header values and position from wm_prog_data->inputs The FS doesn't read these from the VUE so we don't care about them. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>	2026-02-16 15:15:36 -08:00
Kenneth Graunke	5e48094d72	brw: Drop BRW_VARYING_SLOT_PAD and brw_varying_slot enum In elk, we tried to store our own "driver" enum values after Mesa's VARYING_SLOT_MAX. In brw, we eliminated all of these except for an unnecessary "BRW_VARYING_SLOT_PAD" value. This was used for empty slots, so vue_map::slot_to_varying[] could store something. This patch replaces BRW_VARYING_SLOT_PAD with -1. Our "driver" enum values overlapped with VARYING_SLOT_PATCH0, leading to unnecessary headaches. Now gl_varying_slot_name_for_stage will do the right thing for both regular and patch varyings. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>	2026-02-16 15:15:35 -08:00
Kenneth Graunke	16ab31f358	brw: Use NUM_TOTAL_VARYING_SLOTS instead of VARYING_SLOT_TESS_MAX This is a bit larger, but also clearer. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>	2026-02-16 15:15:34 -08:00
Kenneth Graunke	3dbeaf18c8	iris: Defeature native two-sided color support This drops native support for legacy GL's two-sided color feature in favor of lowering it via nir_lower_two_sided_color(). Instead of having a whole bunch of state management hassle to set up the SBE unit to swizzle between the COL and BFC VUE slots, and have it transparently deliver one or the other to the fragment shader, we simply deliver both and insert a conditional select there: (is-front-facing ? front color : back color) This also works even for > 16 varyings, where swizzling via the SBE unit isn't viable. zink, asahi, freedreno, lima, panfrost, r600, v3d, and vc4 all use this lowering rather than having native support. Only four games in our shader-db even use this feature. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38121>	2026-02-16 15:15:33 -08:00
Kenneth Graunke	e0fc4a7c54	brw: Drop brw_compiler option from brw_no_indirect_mask() It's unused. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:49 +00:00
Kenneth Graunke	c2df854359	brw: Make a devinfo temporary in lower_mem_access_bitsizes Less typing. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:49 +00:00
Kenneth Graunke	f873cfd7a0	brw: Pass devinfo to lower_bit_size, not compiler We only need devinfo. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:48 +00:00
Kenneth Graunke	1df2158f50	brw: Delete use_bindless_sampler_offset flag No drivers use this. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:48 +00:00
Kenneth Graunke	4bdef9824a	anv, brw: Consolidate ex_bso bits to a static devinfo inline If we have extended bindless surface offset (ExBSO) support, we want to use it. Consolidate the anv_physical_device and brw_compiler bits into a single static inline that takes devinfo. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:47 +00:00
Kenneth Graunke	aa939db0c5	iris: Move recompile debugging to work on iris program keys iris decides to do recompiles or not based on its own program keys, not the brw or elk keys. So, it makes sense to handle the "why did we have to recompile a new variant" debugging based on those keys as well. It also unifies the code, eliminating a brw/elk split, so it's actually less code. Additionally, this was the only remaining user of the brw code, so we can delete that, resulting in even larger cleanups. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:42 +00:00
Kenneth Graunke	d013ef4c0f	brw: Make use_tcs_multi_patch a static inline taking devinfo This simplifies some iris wrapping for multiple compilers and also saves some space in the brw_compiler singleton. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:42 +00:00
Kenneth Graunke	9531c6b89e	brw: Make indirect_ubos_use_sampler a static inline bool taking devinfo Having the named field allowed us to indicate that our code conditions are referring to the specific decision about how we handle indirect UBOs, rather than some other arbitrary hardware change. Still, there's no need to store this in a singleton struct - we can easily have a static inline bool that does the devinfo check for us. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:42 +00:00
Kenneth Graunke	de03b38daa	intel/elk, hasvk: Drop indirect_ubos_use_sampler option and DP code This is always set to true for elk platforms. No need for the option. crocus also assumes that we take the sampler path. hasvk had support for both paths (leftover from when the driver still supported Gfx12). We started using HDC messages for indirect UBO access on Tigerlake (Gfx12.x) because of cache reworks that made it more viable. On all prior platforms, we used the sampler because it has additional L1/L2 caches that the dataport lacks. Additionally, Ivybridge and nearby platforms had notoriously slow L3 access in some very common cases. Note that we do use the dataport for constant-offset UBO access, since we can combine many reads into larger block loads. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:42 +00:00
Ian Romanick	df704bd38e	elk: Call nir_opt_algebraic_late in elk_postprocess_nir Make sure that lowering undone in elk_nir_optimize are reapplied. No shader-db or fossil-db changes on any Intel platform. This is most likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't fossil-db test either of those platforms. I tried doing a similar thing here as is done in BRW (previous commit), but that caused a couple Haswell shaders to fall off a performance cliff: total spills in shared programs: 8247 -> 8311 (0.78%) spills in affected programs: 6 -> 70 (1066.67%) helped: 0 / HURT: 2 total fills in shared programs: 8558 -> 8910 (4.11%) fills in affected programs: 6 -> 358 (5866.67%) helped: 0 / HURT: 2 Fixes: `442daeb54a` ("nir/opt_algebraic: use fcanonicalize") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>	2026-02-14 02:06:59 +00:00
Ian Romanick	11b96a84b0	brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts Move the call to nir_opt_algebraic_late after the last time brw_nir_optimize might be called. nir_opt_algebraic_distribute_src_mods works together with the late algebraic optimizations, so move it also. shader-db: Lunar Lake total instructions in shared programs: 17081222 -> 17080842 (<.01%) instructions in affected programs: 419931 -> 419551 (-0.09%) helped: 545 / HURT: 826 total cycles in shared programs: 878437752 -> 879236226 (0.09%) cycles in affected programs: 506003142 -> 506801616 (0.16%) helped: 3091 / HURT: 3189 LOST: 18 GAINED: 16 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19994270 -> 19993231 (<.01%) instructions in affected programs: 490499 -> 489460 (-0.21%) helped: 660 / HURT: 800 total cycles in shared programs: 882498776 -> 882834186 (0.04%) cycles in affected programs: 477858602 -> 478194012 (0.07%) helped: 3458 / HURT: 3564 total fills in shared programs: 4371 -> 4370 (-0.02%) fills in affected programs: 7 -> 6 (-14.29%) helped: 1 / HURT: 0 LOST: 28 GAINED: 10 Tiger Lake, Ice Lake, and Skylake had similar results. 
(Tiger Lake shown) total instructions in shared programs: 19943849 -> 19942782 (<.01%) instructions in affected programs: 467384 -> 466317 (-0.23%) helped: 655 / HURT: 796 total cycles in shared programs: 860085674 -> 861410289 (0.15%) cycles in affected programs: 426900998 -> 428225613 (0.31%) helped: 3250 / HURT: 3441 LOST: 19 GAINED: 14 fossil-db: Lunar Lake Totals: Instrs: 926472091 -> 926204838 (-0.03%); split: -0.04%, +0.01% CodeSize: 14845921056 -> 14842776112 (-0.02%); split: -0.10%, +0.08% Send messages: 41459570 -> 41459574 (+0.00%); split: -0.00%, +0.00% Cycle count: 104481085069 -> 104583692712 (+0.10%); split: -0.14%, +0.24% Spill count: 3454651 -> 3457340 (+0.08%); split: -0.15%, +0.23% Fill count: 4958779 -> 4958487 (-0.01%); split: -0.46%, +0.45% Max live registers: 193805970 -> 193839002 (+0.02%); split: -0.00%, +0.02% Max dispatch width: 49114416 -> 49113776 (-0.00%); split: +0.01%, -0.01% Non SSA regs after NIR: 142953905 -> 142800740 (-0.11%); split: -0.12%, +0.01% Totals from 420256 (20.80% of 2020128) affected shaders: Instrs: 448571327 -> 448304074 (-0.06%); split: -0.09%, +0.03% CodeSize: 7312002800 -> 7308857856 (-0.04%); split: -0.21%, +0.17% Send messages: 17716494 -> 17716498 (+0.00%); split: -0.00%, +0.00% Cycle count: 52178854998 -> 52281462641 (+0.20%); split: -0.28%, +0.48% Spill count: 2945654 -> 2948343 (+0.09%); split: -0.17%, +0.26% Fill count: 4404768 -> 4404476 (-0.01%); split: -0.51%, +0.51% Max live registers: 60875448 -> 60908480 (+0.05%); split: -0.01%, +0.06% Max dispatch width: 9455280 -> 9454640 (-0.01%); split: +0.04%, -0.04% Non SSA regs after NIR: 60542740 -> 60389575 (-0.25%); split: -0.28%, +0.02% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 1000081384 -> 999726726 (-0.04%); split: -0.05%, +0.01% CodeSize: 16764458080 -> 16761624256 (-0.02%); split: -0.09%, +0.07% Subgroup size: 27599528 -> 27599544 (+0.00%) Send messages: 45538933 -> 45538951 (+0.00%); split: -0.00%, +0.00% Cycle count: 93303830912 -> 93370118192 (+0.07%); split: -0.19%, +0.26% Spill count: 3739306 -> 3739719 (+0.01%); split: -0.22%, +0.23% Fill count: 5089719 -> 5083626 (-0.12%); split: -0.56%, +0.44% Max live registers: 122041364 -> 122055848 (+0.01%); split: -0.00%, +0.01% Max dispatch width: 38117296 -> 38127200 (+0.03%); split: +0.06%, -0.03% Non SSA regs after NIR: 164296197 -> 164299306 (+0.00%); split: -0.01%, +0.01% Totals from 338754 (14.82% of 2285730) affected shaders: Instrs: 452723479 -> 452368821 (-0.08%); split: -0.10%, +0.03% CodeSize: 7861878032 -> 7859044208 (-0.04%); split: -0.19%, +0.16% Subgroup size: 16 -> 32 (+100.00%) Send messages: 17050010 -> 17050028 (+0.00%); split: -0.00%, +0.00% Cycle count: 52881801997 -> 52948089277 (+0.13%); split: -0.33%, +0.46% Spill count: 3271458 -> 3271871 (+0.01%); split: -0.25%, +0.26% Fill count: 4628422 -> 4622329 (-0.13%); split: -0.61%, +0.48% Max live registers: 30738902 -> 30753386 (+0.05%); split: -0.01%, +0.06% Max dispatch width: 4787264 -> 4797168 (+0.21%); split: +0.47%, -0.26% Non SSA regs after NIR: 61748026 -> 61751135 (+0.01%); split: -0.03%, +0.03% Tiger Lake Totals: Instrs: 1011068379 -> 1010977290 (-0.01%); split: -0.03%, +0.02% CodeSize: 14197751744 -> 14197683040 (-0.00%); split: -0.07%, +0.07% Send messages: 46431228 -> 46431220 (-0.00%); split: -0.00%, +0.00% Cycle count: 85066526419 -> 85085088071 (+0.02%); split: -0.16%, +0.18% Spill count: 3853750 -> 3855185 (+0.04%); split: -0.15%, +0.19% Fill count: 6716746 -> 6719594 (+0.04%); split: -0.25%, +0.29% Max live registers: 122307387 -> 122326083 (+0.02%); split: -0.00%, +0.02% Max dispatch width: 38009632 -> 38003280 (-0.02%); split: +0.03%, -0.05% Non SSA regs after 
NIR: 158403572 -> 158415390 (+0.01%); split: -0.01%, +0.02% Totals from 277728 (12.17% of 2281577) affected shaders: Instrs: 349206856 -> 349115767 (-0.03%); split: -0.07%, +0.05% CodeSize: 5042621104 -> 5042552400 (-0.00%); split: -0.20%, +0.20% Send messages: 13132243 -> 13132235 (-0.00%); split: -0.00%, +0.00% Cycle count: 36183327716 -> 36201889368 (+0.05%); split: -0.38%, +0.43% Spill count: 2210072 -> 2211507 (+0.06%); split: -0.26%, +0.33% Fill count: 4188439 -> 4191287 (+0.07%); split: -0.39%, +0.46% Max live registers: 24956695 -> 24975391 (+0.07%); split: -0.02%, +0.09% Max dispatch width: 3948832 -> 3942480 (-0.16%); split: +0.32%, -0.48% Non SSA regs after NIR: 45616425 -> 45628243 (+0.03%); split: -0.04%, +0.06% Ice Lake Totals: Instrs: 1009584306 -> 1009411757 (-0.02%); split: -0.02%, +0.01% CodeSize: 12593466880 -> 12592958096 (-0.00%); split: -0.01%, +0.01% Send messages: 47274203 -> 47274171 (-0.00%); split: -0.00%, +0.00% Cycle count: 84920281455 -> 84914027301 (-0.01%); split: -0.05%, +0.04% Spill count: 2988523 -> 2986191 (-0.08%); split: -0.14%, +0.07% Fill count: 5296078 -> 5288737 (-0.14%); split: -0.21%, +0.07% Max live registers: 125429384 -> 125444786 (+0.01%); split: -0.00%, +0.02% Max dispatch width: 41269072 -> 41267312 (-0.00%); split: +0.03%, -0.03% Non SSA regs after NIR: 163223895 -> 163236623 (+0.01%); split: -0.01%, +0.02% Totals from 243818 (10.45% of 2334244) affected shaders: Instrs: 296953759 -> 296781210 (-0.06%); split: -0.08%, +0.02% CodeSize: 3643224480 -> 3642715696 (-0.01%); split: -0.04%, +0.03% Send messages: 11518671 -> 11518639 (-0.00%); split: -0.00%, +0.00% Cycle count: 33065548412 -> 33059294258 (-0.02%); split: -0.13%, +0.11% Spill count: 1346515 -> 1344183 (-0.17%); split: -0.32%, +0.15% Fill count: 2537906 -> 2530565 (-0.29%); split: -0.43%, +0.14% Max live registers: 21476776 -> 21492178 (+0.07%); split: -0.02%, +0.09% Max dispatch width: 3727288 -> 3725528 (-0.05%); split: +0.31%, -0.35% Non SSA regs after 
NIR: 41050474 -> 41063202 (+0.03%); split: -0.04%, +0.07% Skylake Totals: Instrs: 513573157 -> 513462971 (-0.02%); split: -0.02%, +0.00% CodeSize: 5950280672 -> 5950001392 (-0.00%); split: -0.01%, +0.00% Send messages: 24909757 -> 24909758 (+0.00%); split: -0.00%, +0.00% Cycle count: 57636102242 -> 57634726342 (-0.00%); split: -0.03%, +0.03% Spill count: 627286 -> 627241 (-0.01%); split: -0.01%, +0.00% Fill count: 837888 -> 837804 (-0.01%); split: -0.01%, +0.00% Max live registers: 87272271 -> 87284192 (+0.01%); split: -0.00%, +0.02% Max dispatch width: 32278832 -> 32271800 (-0.02%); split: +0.02%, -0.04% Non SSA regs after NIR: 87387713 -> 87387614 (-0.00%); split: -0.00%, +0.00% Totals from 177432 (10.30% of 1722906) affected shaders: Instrs: 127170648 -> 127060462 (-0.09%); split: -0.10%, +0.01% CodeSize: 1443406368 -> 1443127088 (-0.02%); split: -0.03%, +0.01% Send messages: 5444220 -> 5444221 (+0.00%); split: -0.00%, +0.00% Cycle count: 15423028495 -> 15421652595 (-0.01%); split: -0.10%, +0.10% Spill count: 235844 -> 235799 (-0.02%); split: -0.03%, +0.01% Fill count: 333783 -> 333699 (-0.03%); split: -0.03%, +0.01% Max live registers: 13765573 -> 13777494 (+0.09%); split: -0.01%, +0.10% Max dispatch width: 3086880 -> 3079848 (-0.23%); split: +0.24%, -0.47% Non SSA regs after NIR: 17623772 -> 17623673 (-0.00%); split: -0.00%, +0.00% Fixes: `442daeb54a` ("nir/opt_algebraic: use fcanonicalize") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>	2026-02-14 02:06:59 +00:00
Ian Romanick	5af0b8bd09	brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline Make sure that lowering undone in brw_nir_optimize are reapplied. No shader-db changes on any Intel platform. Why are there fossil-db changes on platforms that don't support ray tracing? Lunar Lake Totals: Instrs: 926636441 -> 926636313 (-0.00%); split: -0.00%, +0.00% Send messages: 41510729 -> 41510723 (-0.00%); split: -0.00%, +0.00% Cycle count: 104509492613 -> 104509490569 (-0.00%); split: -0.00%, +0.00% Max live registers: 193792922 -> 193792890 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 150091934 -> 150092170 (+0.00%); split: -0.00%, +0.00% Totals from 10 (0.00% of 2020428) affected shaders: Instrs: 8142 -> 8014 (-1.57%); split: -3.14%, +1.57% Send messages: 192 -> 186 (-3.12%); split: -7.29%, +4.17% Cycle count: 131892 -> 129848 (-1.55%); split: -6.93%, +5.38% Max live registers: 1442 -> 1410 (-2.22%); split: -3.05%, +0.83% Non SSA regs after NIR: 950 -> 1186 (+24.84%); split: -26.95%, +51.79% Meteor Lake Totals: Instrs: 1000805547 -> 1000805543 (-0.00%); split: -0.00%, +0.00% Cycle count: 93131592265 -> 93131619619 (+0.00%); split: -0.00%, +0.00% Max live registers: 122081268 -> 122081244 (-0.00%); split: -0.00%, +0.00% Totals from 16 (0.00% of 2286241) affected shaders: Instrs: 18652 -> 18648 (-0.02%); split: -1.39%, +1.37% Cycle count: 369520 -> 396874 (+7.40%); split: -2.94%, +10.34% Max live registers: 1350 -> 1326 (-1.78%); split: -4.15%, +2.37% DG2 Totals: Instrs: 999834626 -> 999834651 (+0.00%); split: -0.00%, +0.00% Send messages: 45719398 -> 45719403 (+0.00%); split: -0.00%, +0.00% Cycle count: 93118238139 -> 93118269557 (+0.00%); split: -0.00%, +0.00% Max live registers: 122098944 -> 122098936 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 169413734 -> 169413661 (-0.00%); split: -0.00%, +0.00% Totals from 13 (0.00% of 2286795) affected shaders: Instrs: 18799 -> 18824 (+0.13%); split: -1.04%, +1.18% Send messages: 492 -> 497 (+1.02%); 
split: -2.44%, +3.46% Cycle count: 352838 -> 384256 (+8.90%); split: -1.08%, +9.98% Max live registers: 1237 -> 1229 (-0.65%); split: -2.91%, +2.26% Non SSA regs after NIR: 2191 -> 2118 (-3.33%); split: -20.86%, +17.53% Tiger Lake Totals: Instrs: 1011816778 -> 1011816714 (-0.00%); split: -0.00%, +0.00% Send messages: 46515289 -> 46515285 (-0.00%); split: -0.00%, +0.00% Cycle count: 85148902406 -> 85148894668 (-0.00%); split: -0.00%, +0.00% Max live registers: 122362180 -> 122362172 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 38036160 -> 38036176 (+0.00%) Non SSA regs after NIR: 160317521 -> 160317649 (+0.00%); split: -0.00%, +0.00% Totals from 6 (0.00% of 2282318) affected shaders: Instrs: 9204 -> 9140 (-0.70%); split: -1.43%, +0.74% Send messages: 258 -> 254 (-1.55%); split: -3.10%, +1.55% Cycle count: 287652 -> 279914 (-2.69%); split: -3.29%, +0.60% Max live registers: 552 -> 544 (-1.45%); split: -2.90%, +1.45% Max dispatch width: 48 -> 64 (+33.33%) Non SSA regs after NIR: 914 -> 1042 (+14.00%); split: -14.00%, +28.01% Ice Lake Totals: Instrs: 1012203285 -> 1012203249 (-0.00%); split: -0.00%, +0.00% Send messages: 47358859 -> 47358858 (-0.00%); split: -0.00%, +0.00% Cycle count: 85112165276 -> 85112171905 (+0.00%); split: -0.00%, +0.00% Max live registers: 125545002 -> 125544992 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 41335696 -> 41335656 (-0.00%) Non SSA regs after NIR: 166448597 -> 166448602 (+0.00%); split: -0.00%, +0.00% Totals from 13 (0.00% of 2335519) affected shaders: Instrs: 16486 -> 16450 (-0.22%); split: -1.67%, +1.46% Send messages: 368 -> 367 (-0.27%); split: -4.89%, +4.62% Cycle count: 347643 -> 354272 (+1.91%); split: -1.34%, +3.25% Max live registers: 1104 -> 1094 (-0.91%); split: -3.80%, +2.90% Max dispatch width: 192 -> 152 (-20.83%) Non SSA regs after NIR: 2100 -> 2105 (+0.24%); split: -21.76%, +22.00% Skylake Totals: Instrs: 504548665 -> 504548057 (-0.00%); split: -0.00%, +0.00% Send messages: 24479148 -> 24479118 (-0.00%); 
split: -0.00%, +0.00% Cycle count: 57575198140 -> 57575179256 (-0.00%); split: -0.00%, +0.00% Max live registers: 85570671 -> 85570575 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 85097646 -> 85098486 (+0.00%); split: -0.00%, +0.00% Totals from 22 (0.00% of 1703671) affected shaders: Instrs: 19866 -> 19258 (-3.06%); split: -3.72%, +0.66% Send messages: 464 -> 434 (-6.47%); split: -8.19%, +1.72% Cycle count: 250854 -> 231970 (-7.53%); split: -9.23%, +1.70% Max live registers: 2024 -> 1928 (-4.74%); split: -5.53%, +0.79% Non SSA regs after NIR: 2498 -> 3338 (+33.63%); split: -8.33%, +41.95% Fixes: `442daeb54a` ("nir/opt_algebraic: use fcanonicalize") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>	2026-02-14 02:06:59 +00:00
Ian Romanick	fd29183901	elk: Use F16TO32 for nir_op_f2f32 of float16 source This matches the behavior of nir_op_unpack_half_2x16_split_x. Gfx7 uses a special opcode for this conversion. Fixes numerous assertion failures in shader-db on Ivy Bridge and Haswell. I am not sure why this was never encountered previously. Fixes: `609c46cf23` ("nir/lower_alu_width: emit f2f32 for unpack_half_2x16") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>	2026-02-14 02:06:59 +00:00
Alyssa Rosenzweig	bd5ebbb2f8	brw: drop buggy SLM optimization This was incorrect for OpenCL due to the possibility of variable shared memory existing despite shared_size == 0. Fortunately the optimization it was trying to do should be done in NIR via nir_opt_barrier_modes so we can just drop the brw code and move on with our merry lives. Fixes OpenCL tests on Iris: non_uniform_work_group non_uniform_3d_barriers basic async_strided_copy_local_to_global Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39795>	2026-02-13 20:28:28 +00:00
Lionel Landwerlin	1f1f484570	brw/iris: move ubo range analysis pass to iris Anv isn't using this pass anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:26 +00:00
Lionel Landwerlin	d1a1e98e4e	brw: handle non-GRF aligned pushed UBO masking Right now all the drivers align push data to GRF (32B pre Xe2, 64B post Xe2) but the push constant delivery mechanism can actually pack 32B ranges so alignment is not required. Of course we need the push UBO masking to deal with unaligned pushed ranges. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Calder Young <cgiacun@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:25 +00:00
Lionel Landwerlin	c1c9048dbf	anv: add a couple of surfaces to read descriptors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:25 +00:00
Sagar Ghuge	1fb8435b77	nir: Add nir_resource_intel_internal entry Will use the load/store_ssbo with nir_resource_intel_internal later in this series. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:22 +00:00
Lionel Landwerlin	2ef29502ed	brw: enable ex_bso for LSC_SS Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:22 +00:00
Lionel Landwerlin	9bb152c9a9	brw: make PULL_CONSTANT opcodes more like MEMORY opcodes Using binding & binding_type sources. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:22 +00:00
Matt Turner	14c65322e8	elk/cse: use copies in `operands_match` instead of in-place modification `operands_match` was modifying instruction source operands in-place (through the `elk_fs_reg *src` pointer member) and relying on a save/restore pattern to undo the modifications. Work on local copies instead, which is simpler and avoids mutating shared state in a comparison function. Fixes: `47c4b38540` ("i965/fs: Allow CSE to handle MULs with negated arguments.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>	2026-02-11 18:43:03 +00:00
Matt Turner	93f39f87c4	elk/cse: fix `operands_match` corrupting non-IMM register data The MUL case in `operands_match` was reading and writing the `.f` union member unconditionally, even when the register's `.file != IMM`. In that case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the `fabsf()` call could corrupt the `.nr` by clearing bit 31. Guard all `.f` accesses with `.file == IMM` checks. Fixes: `47c4b38540` ("i965/fs: Allow CSE to handle MULs with negated arguments.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>	2026-02-11 18:43:03 +00:00
Matt Turner	b302faad8b	brw/cse: use copies in `operands_match` instead of in-place modification `operands_match` was modifying instruction source operands in-place (through the `brw_reg *src` pointer member) and relying on a save/restore pattern to undo the modifications. Work on local copies instead, which is simpler and avoids mutating shared state in a comparison function. Fixes: `47c4b38540` ("i965/fs: Allow CSE to handle MULs with negated arguments.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>	2026-02-11 18:43:02 +00:00
Matt Turner	f5e0f63216	brw/cse: fix `operands_match` corrupting non-IMM register data The MUL case in `operands_match` was reading and writing the `.f` union member unconditionally, even when the register's `.file != IMM`. In that case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the `fabsf()` call could corrupt the `.nr` by clearing bit 31. Guard all `.f` accesses with `.file == IMM` checks. Fixes: `47c4b38540` ("i965/fs: Allow CSE to handle MULs with negated arguments.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>	2026-02-11 18:43:02 +00:00
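
The four `operands_match` commits above describe a union-aliasing bug: taking `fabsf()` of `.f` on a register that is not an immediate clears bit 31 of whatever shares that storage. A minimal sketch of the failure mode and of the guard, assuming a simplified register type: `toy_reg`, `FILE_GRF`, `FILE_IMM`, `abs_unguarded` and `abs_guarded` are hypothetical names for illustration and do not reflect the real `brw_reg` / `elk_fs_reg` layout.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical, simplified register. NOT Mesa's real brw_reg/elk_fs_reg. */
enum reg_file { FILE_GRF, FILE_IMM };

struct toy_reg {
   enum reg_file file;
   union {
      float f;      /* meaningful only when file == FILE_IMM */
      uint32_t nr;  /* meaningful otherwise; shares storage with f */
   };
};

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Unguarded: takes |f| even for non-immediates. On an IEEE-754 float this
 * clears the sign bit, which is bit 31 of the aliased nr field. */
static struct toy_reg abs_unguarded(struct toy_reg r)
{
   r.f = fabsf(r.f);
   return r;
}

/* Guarded: only touches .f when the register really is an immediate.
 * Taking the argument by value also means the caller's register is never
 * modified, so no save/restore is needed. */
static struct toy_reg abs_guarded(struct toy_reg r)
{
   if (r.file == FILE_IMM)
      r.f = fabsf(r.f);
   return r;
}
```

With a non-immediate whose `nr` is `0x80000005`, `abs_unguarded` should hand back `nr == 0x00000005` on an IEEE-754 target, while `abs_guarded` leaves it untouched. Because both helpers work on a by-value copy, the register passed in is unchanged either way, which is the same idea as comparing local copies instead of mutating and restoring the instruction's sources.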

4985 commits