fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 08:18:11 +02:00

Author	SHA1	Message	Date
Marek Olšák	c0ac992a2a	Remove mesa-sha1.h Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	53c64973e8	Inline _mesa_sha1_compute/format, remove the other unused ones _mesa_sha1_format has a few remaining uses, so it's moved to build_id.c, which is its last user. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	699f9d7066	Inline _mesa_sha1_init/update/final functions Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	a965ada6ee	Inline mesa_sha1, SHA1_CTX Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	0da88d237a	Inline SHA1_DIGEST_STRING_LENGTH Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	110632f702	Inline SHA1_DIGEST_LENGTH Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Georg Lehmann	ec331cc48a	nir: replace lower_ldexp with has_ldexp I can't be bothered to fix all the backends that don't set lower_ldexp, and only two backends have ldexp anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Iván Briano	fd556e54f6	brw: do not omit RT writes if dual_src_blend is on Dual source blending when one of the sources is not written to leaves those values undefined, but the other should still be valid. By omitting unwritten outputs, we ended up not writing anything at all for the case that OUT1 is written to but OUT0 is undefined. Fixes new CTS tests: dEQP-VK.pipeline..blend.dual_source.undefined_output.first Cc: mesa-stable Signed-off-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40357>	2026-03-19 23:38:40 +00:00
Caio Oliveira	dcba49d7ef	intel/compiler: Handle shuffle_*_intel intrinsics in bit size lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>	2026-03-17 17:21:52 +00:00
Kenneth Graunke	9f77991751	brw: Simplify mark_last_urb_write_with_eot() Just tag the last instruction, drop useless dead code elimination. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	4bfa7a602c	brw: Don't emit HALT_TARGET for VS/TCS/TES/GS This isn't needed and will allow simplifications in the next patch. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	2b6c6f8130	brw: Lower TCS single patch invocation ID calculations in NIR This is a bit less code and also drops one more TCS-specific thing from the "run" function. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	66fbfe7bf3	brw: Fix single patch thread dispatch masks in NIR Arguably a little more code but it brings us a bit closer to not needing separate per-stage "run" functions. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	4a9aa3ecc4	brw: Combine brw_assign_*_urb_setup() into one function They all do exactly the same thing, except that GS multiplies by an extra factor, and TCS has urb_read_length == 0 so it skips one line. No need for four copies. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	7d463a45f7	brw: Simplify GS load_invocation_id handling Just return the register instead of having multiple functions stash the register in an array of registers. Way too much hoopla here. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Kenneth Graunke	9933882182	brw: Purge source_depth_to_render_target This was used for Gfx4-5. Since then, we're just passing around a boolean that nobody wants. Even if someone did, a better plan is to just check nir->info directly. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>	2026-03-12 21:40:37 +00:00
Sagar Ghuge	cb423ee636	anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921 WA states that we need to allocate maximum number of stackIDs per DSS from RT_DISPATCH_GLOBALS to 2048. We can still throttle/control the CFE_STATE::StackID to be in range specified by the field. This does impact performance having CFE_STATE::stackIDs capped to 2K by default. More the outstanding ray queries, larger the working set and have more impact on cache hit rate. This affect performance on Xe2+ onwards: * Boundary Benchmark: 36.2% * Solar Bay extreme: 9.8% * Hitman world of assassination: 3.9% Fixes: `c1a44e8d43` ("anv: force StackIDControl value for Wa_14021821874") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40310>	2026-03-10 22:41:54 +00:00
Lionel Landwerlin	f508c6acbb	brw/nir: improve shader_indirect_data_intel handling Use is_scalar to know if we can do transpose loading. Also enable vectorization if 2 intrinsics share the same source (it means the only difference is the base). Fixes: `e14d6b535c` ("brw/nir: add new intrinsics to load data from the indirect address") Tested-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>	2026-03-10 18:24:04 +00:00
Paulo Zanoni	85751506ab	elk: don't use instr->const_index[] directly From what I understand, use of const_index[] by the driver is dangerous and should be avoided, as commits such as `a6330ed4d0` ("nir: add ACCESS to load_uniforms") may result in the indexes changing, breaking the driver. Switch to using the parameter names in order to make the code more future-proof. For elk_fs_nir.cpp and elk_vec4_tes.cpp we can verify in the generated nir_intrinsics.c that the wanted value is actually nir_intrinsic_base(). For elk_nir.c, according to Caio Oliveira: "The code is checking for certain load/store via the is_input() and is_output() checks a few lines above. I've checked all them have BASE at 0." Thanks to Ian Romanick for his guidance regarding this patch. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39438>	2026-03-10 01:03:42 +00:00
Ian Romanick	ffd4497e48	brw/asm: Don't drop accumulator number in the assembler Previously "acc1" or "acc2" would be stored as acc0. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:39 +00:00
Ian Romanick	1ae7a82811	brw: Fix encoding of accumulator sources of 3-source instructions Previously the accumulator was always forced to be acc0. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:39 +00:00
Ian Romanick	6531c425a0	brw/emit: Src1 can be accumulator on Gfx12.5 and newer v2: Add Bspec reference number. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:39 +00:00
Ian Romanick	c3a5b62c08	brw/validate: Perform more 3-src validation in brw_validate instead of brw_eu_emit v2: s/Lake/Ice Lake/ in a comment. Noticed by Caio. Add a missing Xe2 Bspec reference number. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:39 +00:00
Ian Romanick	1f45e33072	brw/validate: Implicit read of accumulator cannot also have explicit read v2: Add Bspec reference number. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:38 +00:00
Ian Romanick	8a6de2d973	brw/validate: Eliminate duplicate integer multiply validation I think two MRs must have crossed in the mail so to speak. Keep Caio's formatting and error message, and keep my PRM quote. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40226>	2026-03-09 19:21:38 +00:00
Ian Romanick	64c60582b5	elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT shader-db: Broadwell total instructions in shared programs: 18607516 -> 18607530 (<.01%) instructions in affected programs: 2095 -> 2109 (0.67%) helped: 0 / HURT: 8 total cycles in shared programs: 955704436 -> 955702925 (<.01%) cycles in affected programs: 34299 -> 32788 (-4.41%) helped: 2 / HURT: 6 All Haswell and older platforms had similar results. (Haswell shown) total instructions in shared programs: 16989200 -> 16989201 (<.01%) instructions in affected programs: 461 -> 462 (0.22%) helped: 0 / HURT: 1 total cycles in shared programs: 946537070 -> 946537035 (<.01%) cycles in affected programs: 16378 -> 16343 (-0.21%) helped: 1 / HURT: 0 Test: piglit!1100 Reported-by: Georg Lehmann Fixes: `ca675b73d3` ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40284>	2026-03-09 18:41:55 +00:00
Ian Romanick	6c6c6ce054	brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT This optimization was added in October 2013, and the error was only just now discovered. Removing the SEL.G.SAT optimization affected zero shader-db shaders, and it affected 9 fossil-db shaders for instruction size only. I haven't checked to see if any of the hurt shaders are helped by !39987. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17093041 -> 17093055 (<.01%) instructions in affected programs: 2072 -> 2086 (0.68%) helped: 0 / HURT: 8 total cycles in shared programs: 876739578 -> 876739154 (<.01%) cycles in affected programs: 18946 -> 18522 (-2.24%) helped: 2 / HURT: 6 fossil-db: Lunar Lake Totals: Instrs: 906230557 -> 906240487 (+0.00%); split: -0.00%, +0.00% CodeSize: 14498856128 -> 14499003168 (+0.00%); split: -0.00%, +0.00% Send messages: 40667184 -> 40667205 (+0.00%); split: -0.00%, +0.00% Cycle count: 104068494103 -> 104068561943 (+0.00%); split: -0.00%, +0.00% Max live registers: 189570192 -> 189570204 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48157648 -> 48157552 (-0.00%) Non SSA regs after NIR: 139823587 -> 139823016 (-0.00%); split: -0.00%, +0.00% Totals from 9172 (0.46% of 1985212) affected shaders: Instrs: 10774709 -> 10784639 (+0.09%); split: -0.00%, +0.09% CodeSize: 177868384 -> 178015424 (+0.08%); split: -0.08%, +0.17% Send messages: 311154 -> 311175 (+0.01%); split: -0.00%, +0.01% Cycle count: 232471392 -> 232539232 (+0.03%); split: -0.15%, +0.18% Max live registers: 1243549 -> 1243561 (+0.00%); split: -0.00%, +0.01% Max dispatch width: 196672 -> 196576 (-0.05%) Non SSA regs after NIR: 509663 -> 509092 (-0.11%); split: -0.19%, +0.08% Test: piglit!1100 Reported-by: Georg Lehmann Fixes: `ca675b73d3` ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40284>	2026-03-09 18:41:55 +00:00
Lionel Landwerlin	9f2215b480	anv/brw: remove push constant load emulation from the backend compiler Anv is responsible for much of how the data is accessed (where is the push constant base pointer located, etc...), so move the memory load there. Fossildb on LNL : Totals from 135931 (8.65% of 1572134) affected shaders: Instrs: 68518228 -> 67142101 (-2.01%); split: -2.05%, +0.05% CodeSize: 1123507040 -> 1092022560 (-2.80%); split: -2.88%, +0.08% Subgroup size: 32 -> 16 (-50.00%) Send messages: 4401584 -> 4402565 (+0.02%); split: -0.02%, +0.04% Cycle count: 4626573038 -> 4619434858 (-0.15%); split: -0.89%, +0.74% Spill count: 451759 -> 452407 (+0.14%); split: -0.43%, +0.57% Fill count: 374513 -> 377440 (+0.78%); split: -0.76%, +1.54% Max live registers: 15788042 -> 15791399 (+0.02%); split: -0.05%, +0.08% Max dispatch width: 3349408 -> 3346192 (-0.10%); split: +0.09%, -0.19% Non SSA regs after NIR: 9477038 -> 9498328 (+0.22%); split: -0.27%, +0.50% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	e14d6b535c	brw/nir: add new intrinsics to load data from the indirect address This address is delivered on Gfx12.5+ in compute/mesh/task shaders from the command stream instruction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	7b1533414a	brw/nir: enable constant offsets for global_constant_uniform_block_intel Will be useful to retain the base offset added in `0e9453291c` ("brw: improve push constant loading using base offsets") once we move push constant data loading into NIR. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	d7c64af78e	brw: use scalar build for immediate offsets Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Ian Romanick	8624da56ee	brw: Also check for ADDRESS file in update_for_reads Like accumulators and ARF address registers, the virtual address registers are not tracked in a way the defs analysis can know about. This could actually be fixed, but that is future work. Fixes: `b110b06447` ("brw: introduce a new register type for the address register") Suggested-by: Lionel Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>	2026-03-05 00:02:51 +00:00
Ian Romanick	366410e913	brw: Use brw_reg_is_arf in update_for_reads brw_reg::nr encodes both which ARF it is and which instance of that ARF. In other words, nr for acc0 and acc2 have some bits that say BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test would only detect acc0. Fixes: `0d144821f0` ("intel/brw: Add a new def analysis pass") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>	2026-03-05 00:02:51 +00:00
Ian Romanick	a548466186	brw: Don't mark_invalid in update_for_reads for non-VGRF destination This can occur if NULL or an accumulator is an explicit destination. update_for_reads still needs to process the sources. v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one place. Fixes: `0d144821f0` ("intel/brw: Add a new def analysis pass") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>	2026-03-05 00:02:50 +00:00
Alyssa Rosenzweig	ef2a95a40a	brw: move brw_can_coherent_fb_fetch to a C header this isn't C++ brw code, it's just a devinfo query. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>	2026-03-02 12:44:42 +00:00
Alyssa Rosenzweig	d6d1dc5822	brw: move brw_nir_pack_vs_input to brw_nir.c It's just a pass like the others. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>	2026-03-02 12:44:42 +00:00
Caio Oliveira	df4042371f	anv: Set PIPELINE_SELECT systolic mode based on shader usage For Gfx125 workloads that use systolic mode, this might mean an extra PIPELINE_SELECT when flipping between a compute shader that use the mode and another that doesn't use the mode (or vice-versa). Reviewed-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>	2026-02-26 19:05:56 +00:00
Caio Oliveira	ffc3219d57	brw: Add lowering for nir_cmat_call_op_per_element_op Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39904>	2026-02-26 18:45:20 +00:00
Caio Oliveira	63e1592f8d	brw/scoreboard: Don't track dependencies for UNDEFs Dependencies in UNDEFs were already not propagated by update_inst_scoreboard(), since the instruction there was considered neither ordered nor unordered; and also not being used to resolve implicit dependencies. The generator was already ignoring any baked dependency but for cases where UNDEF had two dependencies, a sync nop would be generated -- which would be redundant with a later sync nop. Since we know UNDEFs have no dependencies, stop treating them specially when trimming dependencies. This patch removes this particular class of redundant sync nops. No functional change is expected. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39875>	2026-02-26 06:54:48 +00:00
Lionel Landwerlin	487586fefa	anv: implement inline parameter promotion from push constants Push constants on bindless stages of Gfx12.5+ don't get the data delivered in the registers automatically. Instead the shader needs to load the data with SEND messages. Those stages do get a single InlineParameter 32B block of data delivered into the EU. We can use that to promote some of the push constant data that has to be pulled otherwise. The driver will try to promote all push constant data (app + driver values) if it can, if it can't it'll try to promote only the driver values (usually a shader will only use a few driver values). If even the drivers values won't fit, give up and don't use the inline parameter at all. LNL internal fossil-db: Totals from 315738 (20.08% of 1572649) affected shaders: Instrs: 155053691 -> 154920901 (-0.09%); split: -0.09%, +0.00% CodeSize: 2578204272 -> 2574991568 (-0.12%); split: -0.15%, +0.02% Send messages: 8235628 -> 8184485 (-0.62%); split: -0.62%, +0.00% Cycle count: 43911938816 -> 43901857748 (-0.02%); split: -0.05%, +0.03% Spill count: 481329 -> 473185 (-1.69%); split: -1.82%, +0.13% Fill count: 405617 -> 399243 (-1.57%); split: -1.86%, +0.28% Max live registers: 34309395 -> 34309300 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 8298224 -> 8299168 (+0.01%) Non SSA regs after NIR: 18492887 -> 17631285 (-4.66%); split: -4.73%, +0.08% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Lionel Landwerlin	7f19814414	brw/nir: handle inline_data_intel more like push_data_intel It's pretty much the same mechanism, except it's a different register location. With this change we gain indirect loading support. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Caio Oliveira	922e3c75cf	brw: Explicitly set group=0 in generator for SYNC used in workaround Instead of using whatever group was set by the previous instruction. No behavior change, just normalizes what we generate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>	2026-02-20 17:11:59 +00:00
Caio Oliveira	4382d51cd0	brw: Make brw_builder::uniform() ignore previous group The `group()` helper creates the new builder "relative" to the existing one, so this was resulting in some uniform instructions having a non-zero channel offset ("group") -- which was surprising and had no practical effect. Normalize to always use group = 0. No change in behavior expected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>	2026-02-20 16:50:41 +00:00
Ian Romanick	da1fd9786b	elk/cmod: Don't propagate from CMP to ADD if there is a write between If either source of the CMP is modified before an appropriate ADD is found, the ADD and the CMP will not have the same result. No shader-db changes on any ELK platform. I suspect the problematic cases only occur after scheduling has rearranged instructions. This is likely the reason BRW didn't experience this problem until `09450faf`. Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:55 +00:00
Ian Romanick	bdbfe8de4d	elk/cmod: Don't propagate from CMP to possible Inf + (-Inf) This is a backport of BRW `e26270249b`. shader-db: All Intel platforms had similar results. (Broadwell shown) total instructions in shared programs: 18623918 -> 18624594 (<.01%) instructions in affected programs: 125179 -> 125855 (0.54%) helped: 0 / HURT: 139 total cycles in shared programs: 957073100 -> 957072484 (<.01%) cycles in affected programs: 16534168 -> 16533552 (<.01%) helped: 42 / HURT: 68 Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:54 +00:00
Ian Romanick	d1614cd6db	brw/cmod: Don't propagate from CMP to ADD if there is a write between If either source of the CMP is modified before an appropriate ADD is found, the ADD and the CMP will not have the same result. shader-db: Lunar Lake total instructions in shared programs: 17098815 -> 17098818 (<.01%) instructions in affected programs: 1187 -> 1190 (0.25%) helped: 0 / HURT: 3 total cycles in shared programs: 876858960 -> 876858968 (<.01%) cycles in affected programs: 6878 -> 6886 (0.12%) helped: 0 / HURT: 1 Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown) total instructions in shared programs: 20034973 -> 20034984 (<.01%) instructions in affected programs: 4599 -> 4610 (0.24%) helped: 0 / HURT: 11 total cycles in shared programs: 881033088 -> 881033108 (<.01%) cycles in affected programs: 57872 -> 57892 (0.03%) helped: 0 / HURT: 5 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 918873064 -> 918873269 (+0.00%) CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00% Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00% Totals from 205 (0.01% of 2011421) affected shaders: Instrs: 290415 -> 290620 (+0.07%) CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03% Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02% Closes: #14874 Fixes: `020b0055e7` ("i965/fs: Propagate conditional modifiers from compares to adds") Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>	2026-02-19 21:28:54 +00:00
José Roberto de Souza	39ec9e3448	intel/brw: Add and call brw_lsc_supports_base_offset() in places that checks for support of this feature Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>	2026-02-19 16:53:03 +00:00
José Roberto de Souza	91c5744e25	intel/brw: Use computed push constants size in brw_assign_urb_setup() It was already computed in brw_shader::assign_curb_setup() so we can use it in brw_assign_urb_setup(). There was a mismatch between assign_curb_setup() and brw_assign_urb_setup() when push_sizes were not a multiple of REG_SIZE: the first one was aligning every push_size before summing them, while brw_assign_urb_setup() was only aligning the sum of all push_sizes. By luck, the only places that did not have a push_size aligned to REG_SIZE only had one push_size, so this was not an issue. So this also fixes the mismatch and adds an assert to catch any future mismatch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39817>	2026-02-19 16:53:03 +00:00
Alyssa Rosenzweig	5386e93865	brw: use data helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>	2026-02-19 14:47:11 +00:00
Kenneth Graunke	1478329c53	iris: Move ALT mode handling from brw to iris We just read this from the NIR and store it in iris_compiled_shader, there's no reason for the backend compiler to be involved. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>	2026-02-19 02:51:00 +00:00
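
Note on commit 91c5744e25: the message describes two ways of sizing push constant ranges that agree for a single range but can diverge with several. The sketch below is not Mesa code. The helper names and the 32-byte REG_SIZE value are assumptions chosen only to illustrate the arithmetic.

```c
#include <assert.h>

/* Hypothetical register size in bytes (an assumption for this sketch). */
#define REG_SIZE 32u

/* Round v up to the next multiple of a; a must be a power of two. */
static unsigned align_up(unsigned v, unsigned a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Strategy 1: align every range size first, then add them up. */
static unsigned sum_of_aligned(const unsigned *sizes, int n)
{
    unsigned total = 0;
    for (int i = 0; i < n; i++)
        total += align_up(sizes[i], REG_SIZE);
    return total;
}

/* Strategy 2: add the raw range sizes, then align the total once. */
static unsigned aligned_sum(const unsigned *sizes, int n)
{
    unsigned total = 0;
    for (int i = 0; i < n; i++)
        total += sizes[i];
    return align_up(total, REG_SIZE);
}
```

With one unaligned 40-byte range both strategies return 64. With two 40-byte ranges the first returns 128 and the second returns 96, which is the kind of disagreement the commit's new assert is meant to catch.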

5019 commits