fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 13:58:05 +02:00

Author	SHA1	Message	Date
Marek Olšák	a965ada6ee	Inline mesa_sha1, SHA1_CTX Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	0da88d237a	Inline SHA1_DIGEST_STRING_LENGTH Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	110632f702	Inline SHA1_DIGEST_LENGTH Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:27 +00:00
Marek Olšák	2283244975	nir: change export_amd intrinsics to use target instead of base Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	b75a3112fd	nir: change export_amd intrinsics to use enabled_channels instead of write_mask Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	f9a10c46fa	nir/inline_uniforms: track visited state per component This prevents an instruction from being marked inlinable or non-inlinable when only a subset of components meet that condition. This might only be relevant for non-scalar ALU. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:40 +00:00
Marek Olšák	d9a2fac925	nir/inline_uniforms: update comments Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:40 +00:00
Marek Olšák	3b004ec60b	nir/inline_uniforms: rename new_num -> new_num_uniforms Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:39 +00:00
Marek Olšák	727d663f79	nir/inline_uniforms: rename num_offsets -> num_uniforms Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:39 +00:00
Timothy Arceri	06fc27b5a4	nir: test loop analyze sets exact trip flags correctly Introduces new test helper to create loop with multiple terminators and tests some scenarios to make sure exact trip flags are set correctly. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>	2026-03-21 11:46:14 +00:00
Timothy Arceri	82b474c3fb	nir: remove is_only_uniform_src() restriction Loop analysis seems to have assumed we needed a const here to be a useful loop, however this isn't true so drop the restriction. This allows the optimisation from `6ca81adffc` to become more powerful. Shader-db results radeonsi: TOTALS FROM AFFECTED SHADERS (19/168079) SGPRS: 904.00 -> 848.00 (-6.19 %) VGPRS: 712.00 -> 684.00 (-3.93 %) Spilled SGPRs: 0.00 -> 0.00 (0.00 %) Spilled VGPRs: 0.00 -> 0.00 (0.00 %) Private memory VGPRs: 0.00 -> 0.00 (0.00 %) Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread Code Size: 80340.00 -> 92980.00 (15.73 %) bytes Max Waves: 236.00 -> 238.00 (0.85 %) Outputs: 0.00 -> 0.00 (0.00 %) Patch Outputs: 0.00 -> 0.00 (0.00 %) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>	2026-03-21 11:46:14 +00:00
Daniel Schürmann	4ca0eb9f54	nir: validate that loop continue statements always link to continue constructs Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	94f959972d	nir: ensure that loop continue statements always link to continue constructs Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	0089d81fb3	nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue We are going to disallow continue statements without loop continue constructs. Replaced with a test that checks that the optimization is not applied in the absence of actual work after the conditional break. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	ff8c8858dc	nir/lower_goto_ifs: Add and lower loop continue constructs We are going to disallow continue statements without loop continue constructs. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	f159669cf3	nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	31af989270	nir/lower_continue_constructs: Simplify loops before lowering continue constructs The idea is inspired by LLVM's LoopSimplify pass. Before lowering continue constructs, the pass now also lowers all continue statements, leaving only the trivial continue. This ensures that loops will always only have one back-edge. Totals from 396 (0.47% of 84383) affected shaders: (Navi48) Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12% CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13% Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06% InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03% VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06% SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02% Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14% Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15% PreSGPRs: 34371 -> 34399 (+0.08%) PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05% VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15% SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32% VMEM: 22181 -> 22213 (+0.14%) SMEM: 41370 -> 41376 (+0.01%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Mary Guillemard	c6d8f7ce0c	nir/dead_cf: Add missing load_global_nv handling This was missing when this intrinsic was added. Fix some issue with FSI lowering and probably more. Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `e779538ad2` ("nir: add nvidia IO intrinsics") Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:35 +00:00
Mary Guillemard	bb6fc8cc20	nir/dead_cf: Add missing load_global_bounded handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `caa0854da8` ("nir: plumb load_global_bounded") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Mary Guillemard	6013667d61	nir/dead_cf: Add missing load_ssbo_ir3 handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `0092edfec0` ("nir/dead_cf: Do not remove loops with loads that can't be reordered") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Connor Abbott	ec37fed52b	tu, ir3, nir: Plumb through driver param for alpha-to-coverage We will need this when alpha-to-coverage is dynamic and we need to emulate it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:49 +00:00
Connor Abbott	22a061fb91	nir: Use better calculation for alpha-to-coverage mask The old calculation depended on the sample count, and gave subpar results for 8x MSAA with standard sample locations. The new calculation is based on the Intel pass, with some changing of the constants so that the sample count is always proportional to alpha for 2xMSAA and 4xMSAA and the addition of rotating the sample mask based on the pixel. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:48 +00:00
Georg Lehmann	643dd510d4	nir/opt_algebraic: optimize b2f(a) * b When the multiplication is only used by fadd, it's not a clear win because of potential fma fusion. Totals from 8015 (6.99% of 114655) affected shaders: MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01% Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04% CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06% VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01% SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50% Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04% InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04% VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09% SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01% Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15% Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01% PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07% PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02% VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02% SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16% VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	d2b37b667e	nir/opt_algebraic: optimize more fmulz(1.0, a) remains If dxvk's opencoded fmulz gets partially constant folded, it leaves this mess behind. It's important to do this before the more general fmul+b2f patterns added in the next commit, because they change the signed zero behavior in a way that can't be optimized back. Foz-DB Navi48: Totals from 36 (0.03% of 114655) affected shaders: Instrs: 16513 -> 15706 (-4.89%) CodeSize: 99756 -> 95760 (-4.01%) Latency: 45165 -> 44151 (-2.25%) InvThroughput: 8344 -> 7886 (-5.49%) VClause: 395 -> 401 (+1.52%) Copies: 639 -> 634 (-0.78%) PreSGPRs: 1158 -> 1154 (-0.35%) PreVGPRs: 1227 -> 1225 (-0.16%) VALU: 11310 -> 10769 (-4.78%) SALU: 813 -> 809 (-0.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	3ad142d4d7	nir/search: never insert movs for alu uses This means we respect the pattern order better because simple replacements like bcsel(False, a, b) -> b no longer insert movs that can block more specialized patterns. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	1626df7a90	nir: rework nir_alu_src_is_trivial_ssa to take an alu src Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	b96c42c916	nir/opt_algebraic: optimize more near useless bcsel Foz-DB Navi48: Totals from 327 (0.29% of 114655) affected shaders: Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01% CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00% Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01% InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01% Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02% Branches: 15598 -> 15571 (-0.17%) VALU: 262391 -> 261969 (-0.16%) SALU: 268112 -> 267699 (-0.15%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	6cfe6eaa79	nir/opt_algebraic: create ldexp from exp2 ldexp uses the full width VALU path, exp2 the transcendental SIMD8. Foz-DB Navi21: Totals from 729 (0.64% of 114627) affected shaders: MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02% Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00% CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00% VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05% Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00% InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00% VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02% SClause: 20520 -> 20509 (-0.05%) Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01% Branches: 22028 -> 22029 (+0.00%) PreVGPRs: 21601 -> 21602 (+0.00%) VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00% SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Georg Lehmann	ec331cc48a	nir: replace lower_ldexp with has_ldexp I can't be bothered to fix all the backends that don't set lower_ldexp, and only two backends have ldexp anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Faith Ekstrand	3418525a82	pan/bi: Lower VS outputs in NIR Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:32 +00:00
Lorenzo Rossi	43ffcf06f4	pan/bi,nir: Divide memory_access from segments Valhall removed Bifrost's memory segments and added in its place memory access. Those were bolted on reserved bits as "pseudo-segments" and the emitter would catch these and emit the right memory access. This commit cleans it up a bit by making memory_access available directly and exposing it to NIR (this will be useful later). Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00
Lorenzo Rossi	c730e41ed5	pan/bi: Add is_psiz_store flag in bi_instr This removes the previous hack that searched the psiz write by looking for 16-bit stores with the correct pseudo segment. We also add a new intrinsic that mimics global stores but tags psiz writes; this will be used later in the series. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00
Faith Ekstrand	de338dc908	pan,nir: Rework converted_mem_pan intrinsics First, rename them to make them a bit more clear. They act on global memory so they should be _global and they map to ld/st_cvt so _cvt is nice and obvious. Second, they don't need IO semantics as they're not IO. But they do need ACCESS so that we can better control things like CAN_REORDER. Third, add a src_type to store_global_cvt even though it won't be used just yet because we'll want it for lowering VS stores. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:29 +00:00
Faith Ekstrand	d2f430bea9	pan/bi: Add new FS input load intrinsics Unlike load[_interpolated]_input, which has to deal with all sorts of ABI nonsense between driver and compiler, these new intrinsics are dumber than bricks. They're literally just the HW ops as NIR intrinsics. These will allow us do the lowering in NIR and put the driver in total control over what goes down what path. Among other things, a driver could choose to lower some things to ld_var and others to ld_var_buf. Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:28 +00:00
Georg Lehmann	57c05f72f9	nir/opt_large_constants: only use 16bit float alu when supported Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	5f37788ae9	nir/opt_large_constants: handle floating point power of two fractions Foz-DB Navi48: Totals from 365 (0.32% of 114655) affected shaders: MaxWaves: 10020 -> 10016 (-0.04%) Instrs: 486252 -> 486097 (-0.03%); split: -0.21%, +0.18% CodeSize: 2629536 -> 2628452 (-0.04%); split: -0.19%, +0.14% VGPRs: 19884 -> 19896 (+0.06%); split: -0.06%, +0.12% SpillSGPRs: 210 -> 212 (+0.95%) Latency: 3818610 -> 3765549 (-1.39%); split: -1.50%, +0.11% InvThroughput: 598445 -> 596281 (-0.36%); split: -0.58%, +0.22% VClause: 10053 -> 9698 (-3.53%); split: -3.54%, +0.01% SClause: 17548 -> 17334 (-1.22%); split: -1.24%, +0.02% Copies: 43196 -> 42249 (-2.19%); split: -2.34%, +0.14% Branches: 16695 -> 16628 (-0.40%); split: -0.47%, +0.07% PreSGPRs: 17988 -> 17971 (-0.09%) PreVGPRs: 13552 -> 13520 (-0.24%) VALU: 244842 -> 246611 (+0.72%); split: -0.02%, +0.74% SALU: 79163 -> 77778 (-1.75%); split: -2.05%, +0.30% VMEM: 13468 -> 13084 (-2.85%) SMEM: 23571 -> 23393 (-0.76%) VOPD: 8384 -> 8372 (-0.14%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	372c1a23dc	nir/opt_large_constants: support negative small constants Foz-DB Navi48: Totals from 511 (0.45% of 114655) affected shaders: MaxWaves: 14554 -> 14552 (-0.01%) Instrs: 767577 -> 768334 (+0.10%); split: -0.17%, +0.27% CodeSize: 4171036 -> 4181400 (+0.25%); split: -0.10%, +0.35% VGPRs: 27676 -> 27724 (+0.17%) SpillSGPRs: 144 -> 183 (+27.08%) Latency: 4053919 -> 4027092 (-0.66%); split: -0.88%, +0.22% InvThroughput: 817990 -> 819490 (+0.18%); split: -0.21%, +0.39% VClause: 11573 -> 11172 (-3.46%); split: -3.47%, +0.01% SClause: 14418 -> 14579 (+1.12%); split: -0.46%, +1.57% Copies: 71638 -> 71365 (-0.38%); split: -1.54%, +1.16% Branches: 20212 -> 20425 (+1.05%); split: -0.39%, +1.44% PreSGPRs: 21765 -> 21743 (-0.10%); split: -0.23%, +0.12% PreVGPRs: 19475 -> 19307 (-0.86%); split: -0.91%, +0.05% VALU: 411365 -> 413642 (+0.55%); split: -0.02%, +0.57% SALU: 126940 -> 125411 (-1.20%); split: -1.53%, +0.32% VMEM: 20574 -> 20062 (-2.49%) SMEM: 23724 -> 23677 (-0.20%); split: -0.25%, +0.05% VOPD: 19838 -> 19847 (+0.05%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	a9f3efcae0	nir/opt_large_constants: optimize small vector constant arrays Foz-DB Navi48: Totals from 2956 (2.58% of 114655) affected shaders: MaxWaves: 85080 -> 85110 (+0.04%) Instrs: 5167735 -> 5170572 (+0.05%); split: -0.12%, +0.17% CodeSize: 28882716 -> 28867340 (-0.05%); split: -0.14%, +0.08% VGPRs: 164484 -> 164616 (+0.08%); split: -0.09%, +0.18% SpillSGPRs: 612 -> 611 (-0.16%) Latency: 35017837 -> 34391146 (-1.79%); split: -1.80%, +0.01% InvThroughput: 6336245 -> 6323807 (-0.20%); split: -0.49%, +0.29% VClause: 112504 -> 111117 (-1.23%); split: -1.32%, +0.09% SClause: 121125 -> 117618 (-2.90%); split: -3.04%, +0.15% Copies: 392203 -> 384977 (-1.84%); split: -1.88%, +0.04% Branches: 155578 -> 155376 (-0.13%); split: -0.13%, +0.01% PreSGPRs: 127654 -> 127205 (-0.35%); split: -0.39%, +0.04% PreVGPRs: 112486 -> 112449 (-0.03%); split: -0.04%, +0.00% VALU: 2577362 -> 2586379 (+0.35%); split: -0.00%, +0.35% SALU: 889569 -> 888472 (-0.12%); split: -1.01%, +0.89% VMEM: 167203 -> 165750 (-0.87%) SMEM: 190438 -> 187313 (-1.64%) VOPD: 194411 -> 194344 (-0.03%); split: +0.01%, -0.04% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	f782524c36	nir/opt_large_constants: enable small constant optimization for non trivial strides Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:17 +00:00
Georg Lehmann	568b96f8b2	nir/opt_large_constants: set fp_math_ctrl for bit exact results Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:17 +00:00
Georg Lehmann	e810382a1e	nir/opt_large_constants: don't add constants implemented with ALU to the constant data Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:16 +00:00
Konstantin Seurer	581df90a89	nir/tests: Test nir_opt_large_constants Tests a whole bunch of cases that can be turned into literals. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:16 +00:00
Timothy Arceri	87ae5cab94	mesa: add force_explicit_uniform_loc_zero workaround Allows a uniform name to be passed to force_explicit_uniform_loc_zero allowing us to set that uniform to an explicit location of zero. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>	2026-03-18 07:28:07 +00:00
Caio Oliveira	f07138f244	spirv: Lower ShuffleUpINTEL and ShuffleDownINTEL to intrinsics Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>	2026-03-17 17:21:52 +00:00
Caio Oliveira	a2cbdfbde3	nir: Add intrinsics for ShuffleUpINTEL and ShuffleDownINTEL Move lowering to nir_lower_subgroups. At some point Intel backend might want to skip that and lower at the backend IR boundary, but for now lowering always applies. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>	2026-03-17 17:21:52 +00:00
Caio Oliveira	b494faa12d	spirv: Remove dead code in subgroup instruction handling This codepath had a bug (always setting `elems[0]`) since it was last reworked, but there's no subgroup instruction that uses this helper and supports Composites, so it can be replaced with an assert. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40356>	2026-03-17 15:32:36 +00:00
Erik Faye-Lund	5127568b98	compiler/nir: use common ycbcr math Let's use the common code, so we have a single place to update in case we want to add features etc. Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40175>	2026-03-17 15:00:54 +00:00
Connor Abbott	c13bdaaa40	vtn: Fix vtn_mediump_upconvert_value() with transposed matrices We can produce a transposed value sometimes, and we have to make sure that val->transposed is also updated when that happens. Noticed by inspection after the previous commit. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40017>	2026-03-16 18:33:54 +00:00
Connor Abbott	048d2a0c68	vtn: Fix vtn_mediump_downconvert_value() for transposed matrices We forgot to set the actual value. This meant that whenever we actually needed to use the transposed matrix we would immediately segfault. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40017>	2026-03-16 18:33:54 +00:00
Mike Blumenkrantz	fbf3305c1b	nir/print: print per_vertex for variables Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40412>	2026-03-16 14:42:11 +00:00

11900 commits