fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 22:28:06 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	df515cfb5b	nir: make nir_variable::descriptor_set a 32-bit variable With descriptor heap there is no limit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:25 +00:00
Lionel Landwerlin	302194a566	nir: improve deref_instr_get_variable So we can get through all the casting inserted by heaps. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:23 +00:00
Faith Ekstrand	e7e601f113	nir: Add tex sources for descriptor heaps We also add a new boolean which indicates that the texture op uses an embedded sampler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:22 +00:00
Faith Ekstrand	f117b81435	nir: Add intrinsics for descriptor heaps Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:22 +00:00
Faith Ekstrand	c29d8dd4ff	nir: Add sampler and resource heap system values Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:20 +00:00
Kenneth Graunke	0e143ae663	nir: Add nir_texop_resinfo_intel This is a combination of txs and query_levels in a single vec4 result. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40451>	2026-03-29 12:53:09 +00:00
Georg Lehmann	e7077e8f5c	nir/lower_non_uniform_access: fix fusing loops for same index but different array variable struct nu_handle is hashed and deduplicated using struct nu_handle_key, which ignored parent_deref. That means all instructions will use the first parent_deref when rewriting the sources. Avoid this by not including the parent deref in the struct, and instead querying it when needed. Fixes: `4d09cd7fa5` ("nir/lower_non_uniform_access: Group accesses using the same resource") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15173 Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40654>	2026-03-29 08:31:51 +00:00
Lorenzo Rossi	c0e0591999	pan/compiler: Replace frag_coord_zw_pan with var_special_pan Just a bit cleaner, and we can unify point size too. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>	2026-03-27 19:23:02 +00:00
Georg Lehmann	0d8e2354ed	nir: add fp_math_ctrl to convert_alu_types Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	35ca85176c	nir: add fp_math_ctrl to cmat alu ops Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	9cba104e11	nir/opt_fp_math_ctrl: use ddx/ddy fp_math_ctrl No Foz-DB changes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	85ff60e68a	nir/opt_uniform_subgroup: use ddx/ddy fp_math_ctrl Foz-DB Navi48: Totals from 16 (0.01% of 139781) affected shaders: Instrs: 12432 -> 11597 (-6.72%) CodeSize: 66204 -> 62440 (-5.69%) Latency: 77168 -> 76132 (-1.34%) InvThroughput: 8942 -> 8332 (-6.82%) VClause: 302 -> 290 (-3.97%) SClause: 207 -> 201 (-2.90%) Copies: 553 -> 517 (-6.51%) PreVGPRs: 589 -> 577 (-2.04%) VALU: 8007 -> 7473 (-6.67%) SALU: 1057 -> 900 (-14.85%) VMEM: 407 -> 395 (-2.95%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	5d2be211ea	nir: add fp_math_ctrl to ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:49 +00:00
Georg Lehmann	854911aeab	nir: add fp_math_ctrl as intrinsic index Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:49 +00:00
Georg Lehmann	d2be2fd4c1	nir/opt_fp_math_ctrl: ignore ffract input sign of zero ffract(-0.0) = fract(+0.0) = +0.0 Foz-DB Navi48: Totals from 23 (0.01% of 205040) affected shaders: Instrs: 12036 -> 11836 (-1.66%) CodeSize: 58392 -> 57716 (-1.16%); split: -1.19%, +0.03% Latency: 57532 -> 57204 (-0.57%); split: -0.61%, +0.04% InvThroughput: 10399 -> 10217 (-1.75%) VClause: 72 -> 70 (-2.78%) Copies: 324 -> 335 (+3.40%) PreVGPRs: 640 -> 646 (+0.94%) VALU: 8561 -> 8364 (-2.30%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:49 +00:00
Robert Mader	44fa9c8326	nir/lower_tex: Reinstate LSB to MSB shift lower_sx10_external and lower_sx12_external are used for LSB aligned formats such as DRM_FORMAT_S010, which are typically used by software decoders. Unlike MSB aligned 10/12 bit formats used by hardware decoders such as P010 they need to manually get "shifted" in order to correctly map to the 0-1 range. In the commit mentioned below the corresponding code got removed, probably because it got confused with similar sounding code in the common path - and because we don't have tests on the CI for the affected formats yet. Note: the formats in question are not yet supported in Vulkan. Fixes: `5127568b98` ("compiler/nir: use common ycbcr math") Signed-off-by: Robert Mader <robert.mader@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40561>	2026-03-26 09:05:40 +00:00
Faith Ekstrand	60acd4da12	nir: Support primitive_id in lower_sysvals_to_varyings Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40512>	2026-03-25 03:11:56 +00:00
Mel Henning	e46f596325	nir/mem_access_bit_sizes: Handle global_bounded Fixes: `f7ad45e5fc` ("nak: support has_load_global_bounded on turing and newer") Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40577>	2026-03-24 18:55:30 +00:00
Mel Henning	f9a847114d	nir/lower_io: Add global_bounded to io_offset_src along with constant and offset variants Fixes: `f7ad45e5fc` ("nak: support has_load_global_bounded on turing and newer") Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40577>	2026-03-24 18:55:30 +00:00
Kenneth Graunke	0bbb48afb4	nir: Add is_sparse flag to texture builders This sets the is_sparse flag on the resulting nir_tex_instr and the resulting def to be one component larger. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>	2026-03-24 16:06:27 +00:00
Faith Ekstrand	3f870d62b0	nir: Consider if uses in nir_def_all_uses_* They check for if uses and want to return false but nir_foreach_use() means the if uses are never seen. Cc: mesa-stable Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37481>	2026-03-23 19:29:42 +00:00
Marek Olšák	353fe94c0e	Rename SHA1 words to BLAKE3 Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>	2026-03-23 07:03:28 +00:00
Marek Olšák	2283244975	nir: change export_amd intrinsics to use target instead of base Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	b75a3112fd	nir: change export_amd intrinsics to use enabled_channels instead of write_mask Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	f9a10c46fa	nir/inline_uniforms: track visited state per component This prevents an instruction from being marked inlinable or non-inlinable when only a subset of components meet that condition. This might only be relevant for non-scalar ALU. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:40 +00:00
Marek Olšák	d9a2fac925	nir/inline_uniforms: update comments Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:40 +00:00
Marek Olšák	3b004ec60b	nir/inline_uniforms: rename new_num -> new_num_uniforms Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:39 +00:00
Marek Olšák	727d663f79	nir/inline_uniforms: rename num_offsets -> num_uniforms Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>	2026-03-21 17:55:39 +00:00
Timothy Arceri	06fc27b5a4	nir: test loop analyze sets exact trip flags correctly Introduces new test helper to create loop with multiple terminators and tests some scenarios to make sure exact trip flags are set correctly. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>	2026-03-21 11:46:14 +00:00
Timothy Arceri	82b474c3fb	nir: remove is_only_uniform_src() restriction Loop analysis seems to have assumed we needed a const here to be a useful loop, however this isn't true so drop the restriction. This allows the optimisation from `6ca81adffc` to become more powerful. Shader-db results radeonsi: TOTALS FROM AFFECTED SHADERS (19/168079) SGPRS: 904.00 -> 848.00 (-6.19 %) VGPRS: 712.00 -> 684.00 (-3.93 %) Spilled SGPRs: 0.00 -> 0.00 (0.00 %) Spilled VGPRs: 0.00 -> 0.00 (0.00 %) Private memory VGPRs: 0.00 -> 0.00 (0.00 %) Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread Code Size: 80340.00 -> 92980.00 (15.73 %) bytes Max Waves: 236.00 -> 238.00 (0.85 %) Outputs: 0.00 -> 0.00 (0.00 %) Patch Outputs: 0.00 -> 0.00 (0.00 %) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>	2026-03-21 11:46:14 +00:00
Daniel Schürmann	4ca0eb9f54	nir: validate that loop continue statements always link to continue constructs Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	94f959972d	nir: ensure that loop continue statements always link to continue constructs Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	0089d81fb3	nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue We are going to disallow continue statements without loop continue constructs. Replaced with a test that checks that the optimization is not applied in absence of actual work after the conditional break. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	ff8c8858dc	nir/lower_goto_ifs: Add and lower loop continue constructs We are going to disallow continue statements without loop continue constructs. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	f159669cf3	nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Daniel Schürmann	31af989270	nir/lower_continue_constructs: Simplify loops before lowering continue constructs The idea is inspired by LLVM's LoopSimplify pass. Before lowering continue constructs, the pass now also lowers all continue statements, leaving only the trivial continue. This ensures that loops will always only have one back-edge. Totals from 396 (0.47% of 84383) affected shaders: (Navi48) Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12% CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13% Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06% InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03% VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06% SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02% Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14% Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15% PreSGPRs: 34371 -> 34399 (+0.08%) PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05% VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15% SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32% VMEM: 22181 -> 22213 (+0.14%) SMEM: 41370 -> 41376 (+0.01%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>	2026-03-21 07:42:55 +00:00
Mary Guillemard	c6d8f7ce0c	nir/dead_cf: Add missing load_global_nv handling This was missing when this intrinsic was added. Fix some issue with FSI lowering and probably more. Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `e779538ad2` ("nir: add nvidia IO intrinsics") Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:35 +00:00
Mary Guillemard	bb6fc8cc20	nir/dead_cf: Add missing load_global_bounded handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `caa0854da8` ("nir: plumb load_global_bounded") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Mary Guillemard	6013667d61	nir/dead_cf: Add missing load_ssbo_ir3 handling Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `0092edfec0` ("nir/dead_cf: Do not remove loops with loads that can't be reordered") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>	2026-03-20 20:19:34 +00:00
Connor Abbott	ec37fed52b	tu, ir3, nir: Plumb through driver param for alpha-to-coverage We will need this when alpha-to-coverage is dynamic and we need to emulate it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:49 +00:00
Connor Abbott	22a061fb91	nir: Use better calculation for alpha-to-coverage mask The old calculation depended on the sample count, and gave subpar results for 8x MSAA with standard sample locations. The new calculation is based on the Intel pass, with some changing of the constants so that the sample count is always proportional to alpha for 2xMSAA and 4xMSAA and the addition of rotating the sample mask based on the pixel. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:48 +00:00
Georg Lehmann	643dd510d4	nir/opt_algebraic: optimize b2f(a) * b When the multiplication is only used by fadd, it's not a clear win because of potential fma fusion. Totals from 8015 (6.99% of 114655) affected shaders: MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01% Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04% CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06% VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01% SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50% Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04% InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04% VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09% SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01% Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15% Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01% PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07% PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02% VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02% SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16% VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	d2b37b667e	nir/opt_algebraic: optimize more fmulz(1.0, a) remains If dxvk's opencoded fmulz gets partially constant folded, it leaves this mess behind. It's important to do this before the more general fmul+b2f patterns added in the next commit, because they change the signed zero behavior in a way that can't be optimized back. Foz-DB Navi48: Totals from 36 (0.03% of 114655) affected shaders: Instrs: 16513 -> 15706 (-4.89%) CodeSize: 99756 -> 95760 (-4.01%) Latency: 45165 -> 44151 (-2.25%) InvThroughput: 8344 -> 7886 (-5.49%) VClause: 395 -> 401 (+1.52%) Copies: 639 -> 634 (-0.78%) PreSGPRs: 1158 -> 1154 (-0.35%) PreVGPRs: 1227 -> 1225 (-0.16%) VALU: 11310 -> 10769 (-4.78%) SALU: 813 -> 809 (-0.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	3ad142d4d7	nir/search: never insert movs for alu uses This means we respect the pattern order better because simple replacements like bcsel(False, a, b) -> b no longer insert movs that can block more specialized patterns. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	1626df7a90	nir: rework nir_alu_src_is_trivial_ssa to take an alu src Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	b96c42c916	nir/opt_algebraic: optimize more near useless bcsel Foz-DB Navi48: Totals from 327 (0.29% of 114655) affected shaders: Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01% CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00% Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01% InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01% Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02% Branches: 15598 -> 15571 (-0.17%) VALU: 262391 -> 261969 (-0.16%) SALU: 268112 -> 267699 (-0.15%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>	2026-03-20 08:50:41 +00:00
Georg Lehmann	6cfe6eaa79	nir/opt_algebraic: create ldexp from exp2 ldexp uses the full width VALU path, exp2 the transcendental SIMD8. Foz-DB Navi21: Totals from 729 (0.64% of 114627) affected shaders: MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02% Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00% CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00% VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05% Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00% InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00% VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02% SClause: 20520 -> 20509 (-0.05%) Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01% Branches: 22028 -> 22029 (+0.00%) PreVGPRs: 21601 -> 21602 (+0.00%) VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00% SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Georg Lehmann	ec331cc48a	nir: replace lower_ldexp with has_ldexp I can't be bothered to fix all the backends that don't set lower_ldexp, and only two backends have ldexp anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>	2026-03-20 08:15:08 +00:00
Faith Ekstrand	3418525a82	pan/bi: Lower VS outputs in NIR Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:32 +00:00
Lorenzo Rossi	43ffcf06f4	pan/bi,nir: Divide memory_access from segments Valhall removed Bifrost's memory segments and added in its place memory access. Those were bolted on reserved bits as "pseudo-segments" and the emitter would catch these and emit the right memory access. This commit cleans it up a bit by making memory_access available directly and exposing it to NIR (this will be useful later). Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00

7304 commits