fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 22:18:18 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	0089d81fb3	nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue We are going to disallow continue statements without loop continue constructs. Replaced with a test that checks that the optimization is not applied in the absence of actual work after the conditional break. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	ff8c8858dc	nir/lower_goto_ifs: Add and lower loop continue constructs We are going to disallow continue statements without loop continue constructs. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	f159669cf3	nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	31af989270	nir/lower_continue_constructs: Simplify loops before lowering continue constructs The idea is inspired by LLVM's LoopSimplify pass. Before lowering continue constructs, the pass now also lowers all continue statements, leaving only the trivial continue. This ensures that loops will always only have one back-edge. Totals from 396 (0.47% of 84383) affected shaders: (Navi48) Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12% CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13% Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06% InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03% VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06% SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02% Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14% Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15% PreSGPRs: 34371 -> 34399 (+0.08%) PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05% VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15% SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32% VMEM: 22181 -> 22213 (+0.14%) SMEM: 41370 -> 41376 (+0.01%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Mary Guillemard	c6d8f7ce0c	nir/dead_cf: Add missing load_global_nv handling This was missing when this intrinsic was added. Fix some issue with FSI lowering and probably more. Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `e779538ad2` ("nir: add nvidia IO intrinsics") Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:35 +00:00
Mary Guillemard	bb6fc8cc20	nir/dead_cf: Add missing load_global_bounded handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `caa0854da8` ("nir: plumb load_global_bounded") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Mary Guillemard	6013667d61	nir/dead_cf: Add missing load_ssbo_ir3 handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `0092edfec0` ("nir/dead_cf: Do not remove loops with loads that can't be reordered") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Connor Abbott	ec37fed52b	tu, ir3, nir: Plumb through driver param for alpha-to-coverage We will need this when alpha-to-coverage is dynamic and we need to emulate it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:49 +00:00
Connor Abbott	22a061fb91	nir: Use better calculation for alpha-to-coverage mask The old calculation depended on the sample count, and gave subpar results for 8x MSAA with standard sample locations. The new calculation is based on the Intel pass, with some changing of the constants so that the sample count is always proportional to alpha for 2xMSAA and 4xMSAA and the addition of rotating the sample mask based on the pixel. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:48 +00:00
Georg Lehmann	643dd510d4	nir/opt_algebraic: optimize b2f(a) * b When the multiplication is only used by fadd, it's not a clear win because of potential fma fusion. Totals from 8015 (6.99% of 114655) affected shaders: MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01% Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04% CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06% VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01% SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50% Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04% InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04% VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09% SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01% Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15% Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01% PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07% PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02% VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02% SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16% VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	d2b37b667e	nir/opt_algebraic: optimize more fmulz(1.0, a) remains If dxvk's opencoded fmulz gets partially constant folded, it leaves this mess behind. It's important to do this before the more general fmul+b2f patterns added in the next commit, because they change the signed zero behavior in a way that can't be optimized back. Foz-DB Navi48: Totals from 36 (0.03% of 114655) affected shaders: Instrs: 16513 -> 15706 (-4.89%) CodeSize: 99756 -> 95760 (-4.01%) Latency: 45165 -> 44151 (-2.25%) InvThroughput: 8344 -> 7886 (-5.49%) VClause: 395 -> 401 (+1.52%) Copies: 639 -> 634 (-0.78%) PreSGPRs: 1158 -> 1154 (-0.35%) PreVGPRs: 1227 -> 1225 (-0.16%) VALU: 11310 -> 10769 (-4.78%) SALU: 813 -> 809 (-0.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	3ad142d4d7	nir/search: never insert movs for alu uses This means we respect the pattern order better because simple replacements like bcsel(False, a, b) -> b no longer insert movs that can block more specialized patterns. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	1626df7a90	nir: rework nir_alu_src_is_trivial_ssa to take an alu src Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	b96c42c916	nir/opt_algebraic: optimize more near useless bcsel Foz-DB Navi48: Totals from 327 (0.29% of 114655) affected shaders: Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01% CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00% Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01% InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01% Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02% Branches: 15598 -> 15571 (-0.17%) VALU: 262391 -> 261969 (-0.16%) SALU: 268112 -> 267699 (-0.15%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	6cfe6eaa79	nir/opt_algebraic: create ldexp from exp2 ldexp uses the full width VALU path, exp2 the transcendental SIMD8. Foz-DB Navi21: Totals from 729 (0.64% of 114627) affected shaders: MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02% Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00% CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00% VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05% Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00% InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00% VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02% SClause: 20520 -> 20509 (-0.05%) Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01% Branches: 22028 -> 22029 (+0.00%) PreVGPRs: 21601 -> 21602 (+0.00%) VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00% SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Georg Lehmann	ec331cc48a	nir: replace lower_ldexp with has_ldexp I can't be bothered to fix all the backends that don't set lower_ldexp, and only two backends have ldexp anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Faith Ekstrand	3418525a82	pan/bi: Lower VS outputs in NIR Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:32 +00:00
Lorenzo Rossi	43ffcf06f4	pan/bi,nir: Divide memory_access from segments Valhall removed Bifrost's memory segments and added in its place memory access. Those were bolted on reserved bits as "pseudo-segments" and the emitter would catch these and emit the right memory access. This commit cleans it up a bit by making memory_access available directly and exposing it to NIR (this will be useful later). Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00
Lorenzo Rossi	c730e41ed5	pan/bi: Add is_psiz_store flag in bi_instr This removes the previous hack that searched the psiz write by looking for 16-bit stores with the correct pseudo segment. We also add a new intrinsic that mimics global stores but tags psiz writes, this will be used later in the series. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00
Faith Ekstrand	de338dc908	pan,nir: Rework converted_mem_pan intrinsics First, rename them to make them a bit more clear. They act on global memory so they should be _global and they map to ld/st_cvt so _cvt is nice and obvious. Second, they don't need IO semantics as they're not IO. But they do need ACCESS so that we can better control things like CAN_REORDER. Third, add a src_type to store_global_cvt even though it won't be used just yet because we'll want it for lowering VS stores. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:29 +00:00
Faith Ekstrand	d2f430bea9	pan/bi: Add new FS input load intrinsics Unlike load[_interpolated]_input, which has to deal with all sorts of ABI nonsense between driver and compiler, these new intrinsics are dumber than bricks. They're literally just the HW ops as NIR intrinsics. These will allow us to do the lowering in NIR and put the driver in total control over what goes down what path. Among other things, a driver could choose to lower some things to ld_var and others to ld_var_buf. Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:28 +00:00
Georg Lehmann	57c05f72f9	nir/opt_large_constants: only use 16bit float alu when supported Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	5f37788ae9	nir/opt_large_constants: handle floating point power of two fractions Foz-DB Navi48: Totals from 365 (0.32% of 114655) affected shaders: MaxWaves: 10020 -> 10016 (-0.04%) Instrs: 486252 -> 486097 (-0.03%); split: -0.21%, +0.18% CodeSize: 2629536 -> 2628452 (-0.04%); split: -0.19%, +0.14% VGPRs: 19884 -> 19896 (+0.06%); split: -0.06%, +0.12% SpillSGPRs: 210 -> 212 (+0.95%) Latency: 3818610 -> 3765549 (-1.39%); split: -1.50%, +0.11% InvThroughput: 598445 -> 596281 (-0.36%); split: -0.58%, +0.22% VClause: 10053 -> 9698 (-3.53%); split: -3.54%, +0.01% SClause: 17548 -> 17334 (-1.22%); split: -1.24%, +0.02% Copies: 43196 -> 42249 (-2.19%); split: -2.34%, +0.14% Branches: 16695 -> 16628 (-0.40%); split: -0.47%, +0.07% PreSGPRs: 17988 -> 17971 (-0.09%) PreVGPRs: 13552 -> 13520 (-0.24%) VALU: 244842 -> 246611 (+0.72%); split: -0.02%, +0.74% SALU: 79163 -> 77778 (-1.75%); split: -2.05%, +0.30% VMEM: 13468 -> 13084 (-2.85%) SMEM: 23571 -> 23393 (-0.76%) VOPD: 8384 -> 8372 (-0.14%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	372c1a23dc	nir/opt_large_constants: support negative small constants Foz-DB Navi48: Totals from 511 (0.45% of 114655) affected shaders: MaxWaves: 14554 -> 14552 (-0.01%) Instrs: 767577 -> 768334 (+0.10%); split: -0.17%, +0.27% CodeSize: 4171036 -> 4181400 (+0.25%); split: -0.10%, +0.35% VGPRs: 27676 -> 27724 (+0.17%) SpillSGPRs: 144 -> 183 (+27.08%) Latency: 4053919 -> 4027092 (-0.66%); split: -0.88%, +0.22% InvThroughput: 817990 -> 819490 (+0.18%); split: -0.21%, +0.39% VClause: 11573 -> 11172 (-3.46%); split: -3.47%, +0.01% SClause: 14418 -> 14579 (+1.12%); split: -0.46%, +1.57% Copies: 71638 -> 71365 (-0.38%); split: -1.54%, +1.16% Branches: 20212 -> 20425 (+1.05%); split: -0.39%, +1.44% PreSGPRs: 21765 -> 21743 (-0.10%); split: -0.23%, +0.12% PreVGPRs: 19475 -> 19307 (-0.86%); split: -0.91%, +0.05% VALU: 411365 -> 413642 (+0.55%); split: -0.02%, +0.57% SALU: 126940 -> 125411 (-1.20%); split: -1.53%, +0.32% VMEM: 20574 -> 20062 (-2.49%) SMEM: 23724 -> 23677 (-0.20%); split: -0.25%, +0.05% VOPD: 19838 -> 19847 (+0.05%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	a9f3efcae0	nir/opt_large_constants: optimize small vector constant arrays Foz-DB Navi48: Totals from 2956 (2.58% of 114655) affected shaders: MaxWaves: 85080 -> 85110 (+0.04%) Instrs: 5167735 -> 5170572 (+0.05%); split: -0.12%, +0.17% CodeSize: 28882716 -> 28867340 (-0.05%); split: -0.14%, +0.08% VGPRs: 164484 -> 164616 (+0.08%); split: -0.09%, +0.18% SpillSGPRs: 612 -> 611 (-0.16%) Latency: 35017837 -> 34391146 (-1.79%); split: -1.80%, +0.01% InvThroughput: 6336245 -> 6323807 (-0.20%); split: -0.49%, +0.29% VClause: 112504 -> 111117 (-1.23%); split: -1.32%, +0.09% SClause: 121125 -> 117618 (-2.90%); split: -3.04%, +0.15% Copies: 392203 -> 384977 (-1.84%); split: -1.88%, +0.04% Branches: 155578 -> 155376 (-0.13%); split: -0.13%, +0.01% PreSGPRs: 127654 -> 127205 (-0.35%); split: -0.39%, +0.04% PreVGPRs: 112486 -> 112449 (-0.03%); split: -0.04%, +0.00% VALU: 2577362 -> 2586379 (+0.35%); split: -0.00%, +0.35% SALU: 889569 -> 888472 (-0.12%); split: -1.01%, +0.89% VMEM: 167203 -> 165750 (-0.87%) SMEM: 190438 -> 187313 (-1.64%) VOPD: 194411 -> 194344 (-0.03%); split: +0.01%, -0.04% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:18 +00:00
Georg Lehmann	f782524c36	nir/opt_large_constants: enable small constant optimization for non trivial strides Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:17 +00:00
Georg Lehmann	568b96f8b2	nir/opt_large_constants: set fp_math_ctrl for bit exact results Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:17 +00:00
Georg Lehmann	e810382a1e	nir/opt_large_constants: don't add constants implemented with ALU to the constant data Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:16 +00:00
Konstantin Seurer	581df90a89	nir/tests: Test nir_opt_large_constants Tests a whole bunch of cases that can be turned into literals. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>	2026-03-19 06:59:16 +00:00
Caio Oliveira	a2cbdfbde3	nir: Add intrinsics for ShuffleUpINTEL and ShuffleDownINTEL Move lowering to nir_lower_subgroups. At some point Intel backend might want to skip that and lower at the backend IR boundary, but for now lowering always applies. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>	2026-03-17 17:21:52 +00:00
Erik Faye-Lund	5127568b98	compiler/nir: use common ycbcr math Let's use the common code, so we have a single place to update in case we want to add features etc. Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40175>	2026-03-17 15:00:54 +00:00
Mike Blumenkrantz	fbf3305c1b	nir/print: print per_vertex for variables Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40412>	2026-03-16 14:42:11 +00:00
Georg Lehmann	85021cb5f0	nir/algebraic/tests: invert all excluded fp_math_ctrl flags Randomly thought about that, but of course only after marge was already done. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:50 +00:00
Georg Lehmann	98ff0a394a	nir/opt_algebraic: move some fsat patterns next to the other fsat patterns I almost missed that they already exist multiple times. No Foz-DB changes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:50 +00:00
Georg Lehmann	607f26814f	nir/opt_algebraic: remove manual patterns that optimize flt([0.0, 1.0], 0.0) Range analysis can figure this out. No Foz-DB changes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:50 +00:00
Georg Lehmann	530bb4278c	nir/opt_algebraic: remove manual pattern that removes fmax(..., 0.0) Range analysis will figure this out. No Foz-DB changes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:50 +00:00
Georg Lehmann	4d176c8ea5	nir/opt_algebraic: turn fabs(a) into fneg(a) if a is not positive fneg is usually more optimizable. Foz-DB Navi48: Totals from 214 (0.19% of 114655) affected shaders: Instrs: 694279 -> 694155 (-0.02%); split: -0.02%, +0.00% CodeSize: 3749268 -> 3748024 (-0.03%); split: -0.03%, +0.00% VGPRs: 18252 -> 18264 (+0.07%) Latency: 5453691 -> 5453503 (-0.00%); split: -0.00%, +0.00% InvThroughput: 1024436 -> 1024314 (-0.01%); split: -0.01%, +0.00% VALU: 453136 -> 453041 (-0.02%); split: -0.02%, +0.00% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:50 +00:00
Georg Lehmann	d77c2a1ece	nir/opt_algebraic: take advantage of range helpers including nnan No Foz-DB changes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389>	2026-03-16 13:03:49 +00:00
Alyssa Rosenzweig	373358da45	nir/opt_sink: sink pack_64_2x32_split This comes up in lowered load_ubo sequences (observed in OpenCL test test_api min_max_parameter_size). Hopefully the pack gets coalesced, it's like nir_op_vec2 on most backends, so it should usually be ok to sink even though the register pressure heuristic will reject it. Allowing it to sink allows the UBO load to sink. Intel's backend scheduler can optimize the relevant sequences locally but there should still be a win here for global load sinking. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40267>	2026-03-13 17:03:00 +00:00
Alyssa Rosenzweig	507e7a04bf	nir/opt_sink: sink Intel UBO loads Acts like load_ubo, handle it in the same path. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40267>	2026-03-13 17:03:00 +00:00
Rhys Perry	3c67225afa	nir/range_analysis: cache results of non-alu fp class queries The dense array should be much faster than the previous hash table. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40346>	2026-03-13 15:38:55 +00:00
Rhys Perry	84eeecf822	nir/range_analysis: use a dense array ministat (nir_analyze_fp_class): Difference at 95.0% confidence -201983 +/- 1064.87 -9.31575% +/- 0.0468505% (Student's t, pooled s = 1257.67) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40346>	2026-03-13 15:38:54 +00:00
Rhys Perry	cebf60e059	nir/range_analysis: use uint16_t for sparse array elements ministat (nir_analyze_fp_class): Difference at 95.0% confidence -4484.55 +/- 1288.68 -0.205419% +/- 0.0589514% (Student's t, pooled s = 1521.99) This should also use less memory. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40346>	2026-03-13 15:38:54 +00:00
Georg Lehmann	cadc74b5e2	nir/search_helpers: assume float sources without preserve flag can't be inf/nan For example, this should let us avoid needing one pattern with is_a_number and one with nnan. Foz-DB Navi48: Totals from 3564 (3.11% of 114655) affected shaders: Instrs: 8256755 -> 8255042 (-0.02%); split: -0.02%, +0.00% CodeSize: 43143184 -> 43123192 (-0.05%); split: -0.05%, +0.00% VGPRs: 268252 -> 268240 (-0.00%) Latency: 218890225 -> 218881157 (-0.00%); split: -0.00%, +0.00% InvThroughput: 31044516 -> 31042297 (-0.01%); split: -0.01%, +0.00% VClause: 96074 -> 96067 (-0.01%); split: -0.01%, +0.00% SClause: 218042 -> 218037 (-0.00%); split: -0.00%, +0.00% Copies: 508677 -> 508661 (-0.00%); split: -0.01%, +0.01% Branches: 148570 -> 148569 (-0.00%) PreSGPRs: 228110 -> 228082 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 231996 -> 231982 (-0.01%) VALU: 4516327 -> 4515321 (-0.02%); split: -0.02%, +0.00% SALU: 1353696 -> 1353590 (-0.01%); split: -0.01%, +0.00% VMEM: 182189 -> 182179 (-0.01%) SMEM: 344771 -> 344756 (-0.00%) VOPD: 29463 -> 29438 (-0.08%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:10 +00:00
Georg Lehmann	19fa9bd152	nir/tests: test algebraic patterns with maximum fp_math_ctrl This means we don't run into undefined behavior when testing nan/inf inputs. Also make sure that patterns using is_only_used_as_float are signed zero correct. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:10 +00:00
Georg Lehmann	aad2b9bfc7	nir/opt_algebraic: be more strict when optimizing fcmp(a + #b, #c) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:10 +00:00
Georg Lehmann	624313d35d	nir/opt_algebraic: lower ninf fisfinite correctly Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:09 +00:00
Georg Lehmann	15eadc1253	nir/lower_frexp: preserve fp_math_ctrl Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:09 +00:00
Faith Ekstrand	f2f792996d	Revert "nir: Add a type parameter to nir_lower_point_size()" This reverts commit `6ee4ea5ea3`. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>	2026-03-12 22:59:13 +00:00
Faith Ekstrand	ceacec4cc9	nir: Allow 8-bit vertex output stores These can never come from the API but there's a few cases where panvk wants them. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>	2026-03-12 22:59:13 +00:00

7272 commits