7312 commits

Author SHA1 Message Date
Samuel Pitoiset
c4e3380187 nir,treewide: add nir_image_intrinsic_type
We have 4 image intrinsic variants now. This enum is useful for
nir_rewrite_image_intrinsic() and it will be used by other NIR passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40709>
2026-03-31 09:10:27 +00:00
Samuel Pitoiset
9d059a60f5 nir: introduce nir_descriptor_type for Vulkan like descriptors
This removes a Vulkan dependency in NIR core.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40670>
2026-03-31 07:16:20 +00:00
Karol Herbst
b7ca34db13 nir: unvendor ac_nir_lower_sin_cos
So we can use it for Nvidia.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541>
2026-03-31 01:47:31 +02:00
Karol Herbst
5bb3c9f69c nir: rename fsin_amd and fcos_amd to a more generic name
Nvidia implements both the same way as AMD does, so it makes sense to
allow for code sharing here.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541>
2026-03-31 01:47:29 +02:00
Georg Lehmann
1b6ed1b34e nir,radv: lower shadow compare gather to 16bit
The output is 1.0 or 0.0 anyway, so there are no precision issues.
For hardware that has v_fma_mix_f32, the inserted conversions should be free
in most cases.

Foz-DB Navi21:
Totals from 1393 (0.68% of 205005) affected shaders:
MaxWaves: 40612 -> 40660 (+0.12%)
Instrs: 571239 -> 570266 (-0.17%); split: -0.19%, +0.02%
CodeSize: 2933912 -> 2979304 (+1.55%); split: -0.00%, +1.55%
VGPRs: 50504 -> 50256 (-0.49%)
Latency: 9883143 -> 9879335 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 2591073 -> 2570721 (-0.79%); split: -0.79%, +0.00%
VClause: 11600 -> 11551 (-0.42%); split: -0.43%, +0.01%
SClause: 26644 -> 26641 (-0.01%)
Copies: 31434 -> 30556 (-2.79%); split: -3.14%, +0.34%
PreVGPRs: 41762 -> 41509 (-0.61%)
VALU: 405533 -> 404655 (-0.22%); split: -0.24%, +0.03%
SALU: 55576 -> 55575 (-0.00%); split: -0.02%, +0.02%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40685>
2026-03-30 18:54:22 +00:00
Georg Lehmann
027503cac2 nir/lower_tex: fix lowering 16bit textureGatherOffsets
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40685>
2026-03-30 18:54:22 +00:00
Rhys Perry
213470b477 util: allow any key for hash tables
We sometimes use this with non-pointer keys.

This removes a footgun at the cost of a larger entry size on 32-bit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40318>
2026-03-30 11:29:44 +00:00
Konstantin Seurer
b127c11be9 spirv,nir: Preserve more information about the descriptor type
Descriptor heap mappings need the information to selectively apply
mappings (descriptor type masks).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:25 +00:00
Samuel Pitoiset
df515cfb5b nir: make nir_variable::descriptor_set a 32-bit variable
With descriptor heap there is no limit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:25 +00:00
Lionel Landwerlin
302194a566 nir: improve deref_instr_get_variable
So we can get through all the casting inserted by heaps.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:23 +00:00
Faith Ekstrand
e7e601f113 nir: Add tex sources for descriptor heaps
We also add a new boolean which indicates that the texture op uses an
embedded sampler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:22 +00:00
Faith Ekstrand
f117b81435 nir: Add intrinsics for descriptor heaps
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:22 +00:00
Faith Ekstrand
c29d8dd4ff nir: Add sampler and resource heap system values
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>
2026-03-30 06:51:20 +00:00
Kenneth Graunke
0e143ae663 nir: Add nir_texop_resinfo_intel
This is a combination of txs and query_levels in a single vec4 result.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40451>
2026-03-29 12:53:09 +00:00
Georg Lehmann
e7077e8f5c nir/lower_non_uniform_access: fix fusing loops for same index but different array variable
struct nu_handle is hashed and deduplicated using struct nu_handle_key, which ignored
parent_deref. That means all instructions will use the first parent_deref when rewriting
the sources.

Avoid this by not including the parent deref in the struct, and instead querying it
when needed.
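
The failure mode described above can be reproduced in miniature. The sketch below is not Mesa code: `HandleKey`, `Handle`, `group_buggy` and `group_fixed` are hypothetical stand-ins, assuming only what the message states (the key ignores the parent deref, so deduplicated entries keep whichever parent was seen first).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleKey:
    # Hypothetical stand-in for struct nu_handle_key: only the index is hashed.
    index: str


@dataclass
class Handle:
    # Hypothetical stand-in for struct nu_handle before the fix.
    key: HandleKey
    parent_deref: str


def group_buggy(accesses):
    """Deduplicate by key; every access inherits the first handle's parent."""
    table = {}
    rewritten = []
    for index, parent in accesses:
        key = HandleKey(index)
        handle = table.setdefault(key, Handle(key, parent))
        rewritten.append((index, handle.parent_deref))
    return rewritten


def group_fixed(accesses):
    """Deduplicate by key only; take the parent from each access when rewriting."""
    seen = set()
    rewritten = []
    for index, parent in accesses:
        seen.add(HandleKey(index))
        rewritten.append((index, parent))
    return rewritten


# Same index, two different array variables.
accesses = [("i", "array_a"), ("i", "array_b")]
```

With the buggy grouping both accesses end up pointing at `array_a`; the fixed grouping still shares one key but keeps each access on its own array.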

Fixes: 4d09cd7fa5 ("nir/lower_non_uniform_access: Group accesses using the same resource")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15173
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40654>
2026-03-29 08:31:51 +00:00
Lorenzo Rossi
c0e0591999 pan/compiler: Replace frag_coord_zw_pan with var_special_pan
Just a bit cleaner, and we can unify point size too.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>
2026-03-27 19:23:02 +00:00
Georg Lehmann
0d8e2354ed nir: add fp_math_ctrl to convert_alu_types
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
35ca85176c nir: add fp_math_ctrl to cmat alu ops
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
9cba104e11 nir/opt_fp_math_ctrl: use ddx/ddy fp_math_ctrl
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
85ff60e68a nir/opt_uniform_subgroup: use ddx/ddy fp_math_ctrl
Foz-DB Navi48:
Totals from 16 (0.01% of 139781) affected shaders:
Instrs: 12432 -> 11597 (-6.72%)
CodeSize: 66204 -> 62440 (-5.69%)
Latency: 77168 -> 76132 (-1.34%)
InvThroughput: 8942 -> 8332 (-6.82%)
VClause: 302 -> 290 (-3.97%)
SClause: 207 -> 201 (-2.90%)
Copies: 553 -> 517 (-6.51%)
PreVGPRs: 589 -> 577 (-2.04%)
VALU: 8007 -> 7473 (-6.67%)
SALU: 1057 -> 900 (-14.85%)
VMEM: 407 -> 395 (-2.95%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
5d2be211ea nir: add fp_math_ctrl to ddx/ddy
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
Georg Lehmann
854911aeab nir: add fp_math_ctrl as intrinsic index
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
Georg Lehmann
d2be2fd4c1 nir/opt_fp_math_ctrl: ignore ffract input sign of zero
ffract(-0.0) = fract(+0.0) = +0.0
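
A quick check of that identity, assuming ffract(x) is evaluated as x - floor(x) with an IEEE 754 floor that preserves the sign of a zero input. `ieee_floor` and `ffract` are illustrative helpers, not NIR functions.

```python
import math


def ieee_floor(x: float) -> float:
    """floor() returning a float; a zero result takes the sign of the input."""
    f = float(math.floor(x))
    return math.copysign(0.0, x) if f == 0.0 else f


def ffract(x: float) -> float:
    """Assumed definition: x - floor(x)."""
    return x - ieee_floor(x)


# copysign(1.0, z) exposes the sign bit of a zero result.
sign_for_negative_zero = math.copysign(1.0, ffract(-0.0))
sign_for_positive_zero = math.copysign(1.0, ffract(0.0))
```

Both signs come out positive, so under this definition the result does not depend on the sign of a zero input.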

Foz-DB Navi48:
Totals from 23 (0.01% of 205040) affected shaders:
Instrs: 12036 -> 11836 (-1.66%)
CodeSize: 58392 -> 57716 (-1.16%); split: -1.19%, +0.03%
Latency: 57532 -> 57204 (-0.57%); split: -0.61%, +0.04%
InvThroughput: 10399 -> 10217 (-1.75%)
VClause: 72 -> 70 (-2.78%)
Copies: 324 -> 335 (+3.40%)
PreVGPRs: 640 -> 646 (+0.94%)
VALU: 8561 -> 8364 (-2.30%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
Robert Mader
44fa9c8326 nir/lower_tex: Reinstate LSB to MSB shift
lower_sx10_external and lower_sx12_external are used for
LSB aligned formats such as DRM_FORMAT_S010, which are typically
used by software decoders. Unlike MSB aligned 10/12 bit formats
used by hardware decoders such as P010 they need to manually
get "shifted" in order to correctly map to the 0-1 range.
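
To see why the shift matters, here is a small integer model, assuming 10-bit samples stored in a 16-bit container that is read back as a 16-bit normalized value. The helper names are made up for illustration, and the real lowering works on floating-point values in the shader, so the exact scale factor it applies is not shown here.

```python
def sample_unorm16(raw: int) -> float:
    """Model of reading a 16-bit container as a normalized float."""
    return raw / 65535.0


def lsb_to_msb(raw: int, bits: int) -> int:
    """Move an LSB-aligned sample of `bits` bits to the top of 16 bits."""
    return raw << (16 - bits)


max10 = (1 << 10) - 1  # largest 10-bit sample, 1023

unshifted = sample_unorm16(max10)                # about 0.0156
shifted = sample_unorm16(lsb_to_msb(max10, 10))  # about 0.999
```

Without the shift a full-scale 10-bit sample reads as roughly 1.6% of the range; after it, the value lands where an MSB-aligned format such as P010 would put it.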

In the commit mentioned below the corresponding code got removed,
probably because it got confused with similar sounding code in
the common path - and because we don't have tests on the CI for the
affected formats yet.

Note: the formats in question are not yet supported in Vulkan.

Fixes: 5127568b98 ("compiler/nir: use common ycbcr math")
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40561>
2026-03-26 09:05:40 +00:00
Faith Ekstrand
60acd4da12 nir: Support primitive_id in lower_sysvals_to_varyings
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40512>
2026-03-25 03:11:56 +00:00
Mel Henning
e46f596325 nir/mem_access_bit_sizes: Handle global_bounded
Fixes: f7ad45e5fc ("nak: support has_load_global_bounded on turing and newer")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40577>
2026-03-24 18:55:30 +00:00
Mel Henning
f9a847114d nir/lower_io: Add global_bounded to io_offset_src
along with constant and offset variants

Fixes: f7ad45e5fc ("nak: support has_load_global_bounded on turing and newer")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40577>
2026-03-24 18:55:30 +00:00
Kenneth Graunke
0bbb48afb4 nir: Add is_sparse flag to texture builders
This sets the is_sparse flag on the resulting nir_tex_instr and the
resulting def to be one component larger.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>
2026-03-24 16:06:27 +00:00
Faith Ekstrand
3f870d62b0 nir: Consider if uses in nir_def_all_uses_*
These helpers check for if uses and want to return false for them, but
they iterate with nir_foreach_use(), so the if uses are never seen.
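
A toy model of the bug, assuming (as the message implies) that the plain use iterator filters out if uses. `Use`, `foreach_use` and the two predicates are hypothetical names, not NIR API.

```python
from dataclasses import dataclass


@dataclass
class Use:
    is_if: bool            # True when the value feeds an if condition
    is_alu: bool = False   # True when the user is an ALU instruction


def foreach_use(uses):
    """Model of an iterator that silently skips if uses."""
    return (u for u in uses if not u.is_if)


def all_uses_are_alu_buggy(uses):
    for u in foreach_use(uses):
        if u.is_if:        # dead check: if uses were already filtered out
            return False
        if not u.is_alu:
            return False
    return True


def all_uses_are_alu_fixed(uses):
    for u in uses:         # iterate over every use, including if uses
        if u.is_if or not u.is_alu:
            return False
    return True


uses = [Use(is_if=False, is_alu=True), Use(is_if=True)]
```

The buggy predicate reports that all uses are ALU even though one use is an if condition; the fixed one returns false.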

Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37481>
2026-03-23 19:29:42 +00:00
Marek Olšák
353fe94c0e Rename SHA1 words to BLAKE3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
2283244975 nir: change export_amd intrinsics to use target instead of base
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Marek Olšák
b75a3112fd nir: change export_amd intrinsics to use enabled_channels instead of write_mask
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Marek Olšák
f9a10c46fa nir/inline_uniforms: track visited state per component
This prevents an instruction from being marked inlinable or non-inlinable
when only a subset of components meet that condition.

This might only be relevant for non-scalar ALU.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:40 +00:00
Marek Olšák
d9a2fac925 nir/inline_uniforms: update comments
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:40 +00:00
Marek Olšák
3b004ec60b nir/inline_uniforms: rename new_num -> new_num_uniforms
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:39 +00:00
Marek Olšák
727d663f79 nir/inline_uniforms: rename num_offsets -> num_uniforms
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:39 +00:00
Timothy Arceri
06fc27b5a4 nir: test loop analyze sets exact trip flags correctly
Introduces a new test helper to create a loop with multiple terminators
and tests some scenarios to make sure exact trip flags are set
correctly.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>
2026-03-21 11:46:14 +00:00
Timothy Arceri
82b474c3fb nir: remove is_only_uniform_src() restriction
Loop analysis seems to have assumed we needed a const here to be
a useful loop, however this isn't true so drop the restriction.

This allows the optimisation from 6ca81adffc to become more powerful.

Shader-db results radeonsi:

TOTALS FROM AFFECTED SHADERS (19/168079)
  SGPRS: 904.00 -> 848.00 (-6.19 %)
  VGPRS: 712.00 -> 684.00 (-3.93 %)
  Spilled SGPRs: 0.00 -> 0.00 (0.00 %)
  Spilled VGPRs: 0.00 -> 0.00 (0.00 %)
  Private memory VGPRs: 0.00 -> 0.00 (0.00 %)
  Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread
  Code Size: 80340.00 -> 92980.00 (15.73 %) bytes
  Max Waves: 236.00 -> 238.00 (0.85 %)
  Outputs: 0.00 -> 0.00 (0.00 %)
  Patch Outputs: 0.00 -> 0.00 (0.00 %)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>
2026-03-21 11:46:14 +00:00
Daniel Schürmann
4ca0eb9f54 nir: validate that loop continue statements always link to continue constructs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
94f959972d nir: ensure that loop continue statements always link to continue constructs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
0089d81fb3 nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue
We are going to disallow continue statements without
loop continue constructs.

Replaced with a test that checks that the optimization is not
applied in the absence of actual work after the conditional break.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
ff8c8858dc nir/lower_goto_ifs: Add and lower loop continue constructs
We are going to disallow continue statements without
loop continue constructs.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
f159669cf3 nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
31af989270 nir/lower_continue_constructs: Simplify loops before lowering continue constructs
The idea is inspired by LLVM's LoopSimplify pass. Before
lowering continue constructs, the pass now also lowers
all continue statements, leaving only the trivial continue.
This ensures that loops will always only have one back-edge.
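
The single back-edge property can be illustrated on a plain successor map. This is only a sketch of the general LoopSimplify-style idea, not the NIR pass: the CFG encoding, `merge_back_edges`, `back_edges` and the `latch` block name are all assumptions made for the example.

```python
def merge_back_edges(cfg, header, loop_blocks, latch="latch"):
    """Route every in-loop edge to `header` through one new latch block."""
    out = {}
    for block, succs in cfg.items():
        if block in loop_blocks:
            out[block] = [latch if s == header else s for s in succs]
        else:
            out[block] = list(succs)  # loop entry edges are left alone
    out[latch] = [header]             # the only remaining back-edge
    return out


def back_edges(cfg, header, loop_blocks):
    """List the (block, header) edges that originate inside the loop."""
    return sorted((b, header) for b in loop_blocks if header in cfg.get(b, []))


cfg = {
    "pre": ["header"],
    "header": ["a", "exit"],
    "a": ["header", "b"],   # early continue
    "b": ["header"],        # trivial continue at the end of the body
    "exit": [],
}
loop = {"header", "a", "b"}

simplified = merge_back_edges(cfg, "header", loop)
```

Before the rewrite the loop has two back-edges (from `a` and `b`); afterwards only `latch` jumps back to the header.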

Totals from 396 (0.47% of 84383) affected shaders: (Navi48)
Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12%
CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13%
Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06%
InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03%
VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06%
SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02%
Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14%
Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15%
PreSGPRs: 34371 -> 34399 (+0.08%)
PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05%
VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15%
SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32%
VMEM: 22181 -> 22213 (+0.14%)
SMEM: 41370 -> 41376 (+0.01%)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Mary Guillemard
c6d8f7ce0c nir/dead_cf: Add missing load_global_nv handling
This was missing when this intrinsic was added.
Fixes some issues with FSI lowering and probably more.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e779538ad2 ("nir: add nvidia IO intrinsics")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:35 +00:00
Mary Guillemard
bb6fc8cc20 nir/dead_cf: Add missing load_global_bounded handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: caa0854da8 ("nir: plumb load_global_bounded")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:34 +00:00
Mary Guillemard
6013667d61 nir/dead_cf: Add missing load_ssbo_ir3 handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 0092edfec0 ("nir/dead_cf: Do not remove loops with loads that can't be reordered")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:34 +00:00
Connor Abbott
ec37fed52b tu, ir3, nir: Plumb through driver param for alpha-to-coverage
We will need this when alpha-to-coverage is dynamic and we need to
emulate it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
2026-03-20 18:09:49 +00:00
Connor Abbott
22a061fb91 nir: Use better calculation for alpha-to-coverage mask
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with some changing of the constants so that
the sample count is always proportional to alpha for 2xMSAA and 4xMSAA
and the addition of rotating the sample mask based on the pixel.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
2026-03-20 18:09:48 +00:00
Georg Lehmann
643dd510d4 nir/opt_algebraic: optimize b2f(a) * b
When the multiplication is only used by fadd, it's not a clear win
because of potential fma fusion.
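
The message does not spell out the replacement pattern. One plausible reading is that the multiply by a 0.0/1.0 factor becomes a select, which the sketch below models. `b2f`, `original` and `selected` are illustrative names, and the real rule presumably has to account for the floating-point cases where the two forms differ.

```python
import math


def b2f(a: bool) -> float:
    """Boolean to float: 1.0 for true, 0.0 for false."""
    return 1.0 if a else 0.0


def original(a: bool, b: float) -> float:
    return b2f(a) * b


def selected(a: bool, b: float) -> float:
    """Assumed rewritten form: pick b or zero instead of multiplying."""
    return b if a else 0.0


finite_inputs = [(True, 2.5), (False, 2.5), (True, -3.0), (False, -3.0)]
agree_on_finite = all(original(a, b) == selected(a, b) for a, b in finite_inputs)

# The forms diverge for non-finite b when a is false: 0.0 * inf is NaN.
diverges_on_inf = math.isnan(original(False, math.inf))
```

The two forms agree on finite inputs (up to the sign of zero) but not on infinities or NaN, which is why such a rewrite needs the relevant float controls to allow it.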

Totals from 8015 (6.99% of 114655) affected shaders:
MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01%
Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04%
CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06%
VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01%
SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50%
Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04%
VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09%
SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01%
Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15%
Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07%
PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02%
VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02%
SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16%
VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00