Commit graph

326 commits

Author SHA1 Message Date
Connor Abbott
bd821b9a17 nir, tu: Add and use load_frag_coord_gmem_ir3
We used load_frag_coord_unscaled_ir3 for loading the fragment coord for
input attachments in GMEM, where the normal scaling for gl_FragCoord
shouldn't be used. However with custom resolve a different scaling will
apply to attachments in GMEM. Separate "unscaled" from "gmem" and rename
the NIR options, in preparation for this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38451>
2025-12-08 20:44:45 +00:00
Marek Olšák
e14f8ee0e4 nir/has_divergent_loop: require divergence metadata, check all function impls
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
instead of forcing callers to call nir_divergence_analysis

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38597>
2025-12-03 20:14:18 +00:00
Karol Herbst
626c6b35f0 nak: add Movm
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37998>
2025-11-26 14:09:37 +00:00
Kenneth Graunke
792762617a brw: Rename read_attribute_payload_intel to load_attribute_payload_intel
We're going to change the intrinsic to a load(...) which puts "load" in
the name.  Also, it's just more consistent with our usual terminology.

We also rename the corresponding backend opcode so they remain matched.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:58 +00:00
Kenneth Graunke
f1ab64ad74 nir: add new intrinsics to load/store from URB on intel
We add several new intrinsics for accessing URB handles:

- load_urb_output_handle_intel
- load_urb_input_handle_intel
- load_urb_input_handle_intel_indexed

The latter is used by stages like TCS and GS where each input control
point has a unique handle.  The index is which ICP to read from.  The
others are for most stages, where all inputs or outputs are accessed
via a single handle.

Then we have URB load and store operations, split for Xe2+ (URB via LSC)
and earlier (HDC OWord messages):

- load_urb_vec4_intel
- load_urb_lsc_intel
- store_urb_vec4_intel
- store_urb_lsc_intel

The legacy vec4 variants take a handle and a 128-bit OWord offset as
sources.  Additionally, stores take a set of channel enables to mask
off and avoid writing vec4 components.  We don't use the WRITE_MASK
const-index as our channel enables are not required to be constant.

The Xe2+ LSC variants are simpler.  Handles are byte offsets into the
URB memory region, and offsets are expressed in bytes.  So we simply
add them into a single "address" source.  We don't support writemasks
here, as they aren't really necessary with the better addressability.
(Plus, the store_cmask operations work significantly differently than
the previous HDC OWord messages).  We will lower disjoint writemasks
to multiple stores.

Based on earlier code by Lionel Landwerlin.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:54 +00:00
Dave Airlie
26eaba935d nir: add a cmat call instruction type.
This adds a new instruction type to handle cooperative matrix calls.

This clones the call instr, drops callee, and adds a single metadata
slot and a call operation (dummy only for now).

(Not NACKed by Alyssa)

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>
2025-11-17 23:33:58 +00:00
Konstantin Seurer
de32f9275f treewide: add & use parent instr helpers
We add a bunch of new helpers to avoid the need to touch >parent_instr,
including the full set of:

* nir_def_is_*
* nir_def_as_*_or_null
* nir_def_as_* [assumes the right instr type]
* nir_src_is_*
* nir_src_as_*
* nir_scalar_is_*
* nir_scalar_as_*

Plus nir_def_instr() where there's no more suitable helper.

Also an existing helper is renamed to unify all the names, while we're
churning the tree:

* nir_src_as_alu_instr -> nir_src_as_alu

..and then we port the tree to use the helpers as much as possible, using
nir_def_instr() where that does not work.

Acked-by: Marek Olšák <maraeo@gmail.com>

---

To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm
taking this opportunity to clean up a lot of NIR patterns.

Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
2025-11-12 21:22:13 +00:00
Faith Ekstrand
0e9fcb33c3 nir: Add a couple panfrost sysvals to divergence analysis
Fixes: 2af6e4beeb ("pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex}")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Lars-Ivar Hesselberg Simonsen
b3b6fba548 nir: Add pan intrinsics for texel buffer access
Will be used by panfrost to access texel buffers.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37007>
2025-11-07 17:03:53 +00:00
Konstantin Seurer
b962063d72 nir: Remove nir_parallel_copy_instr
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36483>
2025-11-04 18:51:51 +00:00
Marek Olšák
3e2c11597a nir: add nir_intrinsic_ssbo_descriptor_amd for lowering get_ssbo_size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38097>
2025-11-02 01:42:07 +00:00
Lionel Landwerlin
255d1e883d nir/divergence: fix handling of intel uniform block load
Those are normally uniform always, but for the purpose of fused
threads handling, we need to check their sources.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ca1533cd03 ("nir/divergence: add a new mode to cover fused threads on Intel HW")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
2025-10-21 06:13:10 +00:00
Aitor Camacho
f711c3afed nir: Add KosmicKrisp required utilities
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37520>
2025-10-20 16:22:00 +00:00
Daniel Schürmann
fad10b91a6 nir/divergence: don't assume that load_sample_positions_amd is always uniform
Sample positions aren't uniform when the sample id is divergent.
This was a regression when we started lowering fragment shader
barycentrics in NIR.

Fixes: 7f444fc72c ("nir: add nir_intrinsic_load_sample_positions_amd")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:10 +00:00
Lionel Landwerlin
ca1533cd03 nir/divergence: add a new mode to cover fused threads on Intel HW
The Intel Gfx12.x generation of GPU has an architecture feature called
EU fusion in which 2 subgroups run lock step. A typical case where
this happens is a compute shader with 1x1x1 local workgroup size and a
dispatch command of 2x1x1. In that case 2 threads will be run in lock
step for each of the workgroup.

This has been the sources of some troubles in the backend because one
subgroup can run with all lanes disabled, requiring care for SEND
messages using the NoMask flag (execution regardless of the lane mask).

We found out that other things are happening when 2 subgroups run
together :
  - the HW will use the surface/sampler handle from only one subgroup
  - the HW will use the sampler header from only one subgroup

So one of the fused subgroup can access the wrong surface/sampler if
the value is different between the 2 subgroups and that can happen
even with subgroup uniform values.

Fortunately we can flag SEND instructions to disable the fusion
behavior (most likely at a performance cost).

This change introduce a new divergence mode that tries to compute
things divergent between subgroups so that we can flag instructions
accordingly.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>
2025-10-10 11:19:39 +00:00
Marek Olšák
3fe651f607 nir: remove load_smem_amd
replaced by load_global_amd + ACCESS_SMEM_AMD

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>
2025-10-08 08:54:11 +00:00
Daniel Schürmann
7593667b0a nir/divergence_analysis: check ACCESS_SMEM_AMD
Revert "nir/divergence: make smem load_global_amd uniform"

This reverts commit 2d0f93631c.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>
2025-10-08 08:53:55 +00:00
Rhys Perry
8fba196164 nir: assume non-atomic loads don't tear
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
2025-10-07 17:41:30 +00:00
Kenneth Graunke
25cb6dfbf7 nir: Add load_simd_width_intel to divergence analysis
For some reason we missed adding this.  This prevents some asserts
from triggering when I call divergence analysis at certain points
in an upcoming patch.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>
2025-09-30 19:44:02 +00:00
Simon Perretta
c3325b22d8 pco: image atomics support
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:12 +00:00
Lionel Landwerlin
afea98593e nir: add a new intrinsic for load dynamic tessellation config
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>
2025-09-05 07:46:15 +00:00
Rhys Perry
2d0f93631c nir/divergence: make smem load_global_amd uniform
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 14:55:13 -04:00
Dave Airlie
c38170452d nir: add nir_intrinsic_cmat_load_shared_nv
This maps to NAK's OpLdsm

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36363>
2025-08-28 16:09:07 +02:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Georg Lehmann
2d16f457c5 nir: add ACCESS_SKIP_HELPERS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>
2025-08-12 08:56:37 +00:00
John Anthony
000bd3046d nir,spirv: Add support for SPV_ARM_core_builtins
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019>
2025-08-07 11:46:33 +02:00
John Anthony
a68a825aad nir,agx: unvendor core_id_agx
core_id will be used by SPV_ARM_core_builtins

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019>
2025-08-07 11:46:33 +02:00
Qiang Yu
4847e0b380 all: rename gl_shader_stage_uses_workgroup to mesa_shader_stage_uses_workgroup
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Alyssa Rosenzweig
bcf1a1c20b treewide: use nir_def_block
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -definition->parent_instr->block
    +nir_def_block(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Alyssa Rosenzweig
4f1bafa6d5 nir: drop load_sample_id_no_per_sample
unused now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36429>
2025-07-30 22:13:23 +00:00
Marek Olšák
6a85761c4c nir/divergence_analysis: simplify nir_vertex_divergence_analysis
by reusing nir_divergence_analysis_impl.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36099>
2025-07-24 18:41:38 +00:00
Alyssa Rosenzweig
3f795a2b8d nir/divergence_analysis: handle more AGX
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36265>
2025-07-23 14:15:57 +00:00
Rhys Perry
8fd5266b69 nir/divergence: ignore boolean phis for ignore_undef_if_phi_srcs
The only user of this option (ACO) doesn't support this for boolean phis.

fossil-db (navi21):
Totals from 1208 (1.51% of 79825) affected shaders:
Instrs: 826592 -> 823201 (-0.41%); split: -0.41%, +0.00%
CodeSize: 4228296 -> 4224280 (-0.09%); split: -0.11%, +0.01%
Latency: 3030803 -> 3028410 (-0.08%); split: -0.08%, +0.01%
InvThroughput: 578588 -> 578693 (+0.02%); split: -0.00%, +0.02%
VClause: 19500 -> 19494 (-0.03%)
Copies: 60914 -> 57589 (-5.46%); split: -5.47%, +0.01%
PreVGPRs: 50759 -> 50774 (+0.03%)
VALU: 528582 -> 528671 (+0.02%); split: -0.00%, +0.02%
SALU: 121134 -> 117646 (-2.88%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 25.1
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13455
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13509
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36005>
2025-07-21 08:27:01 +00:00
Alyssa Rosenzweig
24c708564f nir: add bindless_sampler_agx intrinsic
to facilitate pushing on AGX.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:17 +00:00
Mary Guillemard
90438bae51 nir: Add NVIDIA-specific muladd intrinsics
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777>
2025-07-15 23:34:31 +00:00
Natalie Vock
9707b30965 nir,aco: Add ds_bvh_stack_rtn
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:39 +00:00
Dave Airlie
2273b6c46a nak: add divergent attribute and wrapper for nir_load_sysval_nv
This wraps the sysval load in a builder where we can add
proper divergence for ctaid later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105>
2025-07-15 19:07:11 +00:00
Alyssa Rosenzweig
ee26938faf nir,agx: switch to bindless_image_agx intrinsic
this is more explicit than vec2's and hence has fewer footguns. in particular
it's easier to handle with preambles in a sane way.

modelled on what ir3 does.

there's probably room for more clean up but for now this unblocks what I want to
do.

stats don't seem concerning.

Totals from 692 (1.29% of 53701) affected shaders:
MaxWaves: 441920 -> 442112 (+0.04%)
Instrs: 1588748 -> 1589304 (+0.03%); split: -0.05%, +0.08%
CodeSize: 11487976 -> 11491620 (+0.03%); split: -0.04%, +0.07%
ALU: 1234867 -> 1235407 (+0.04%); split: -0.06%, +0.10%
FSCIB: 1234707 -> 1235249 (+0.04%); split: -0.06%, +0.10%
IC: 380514 -> 380518 (+0.00%)
GPRs: 117292 -> 117332 (+0.03%); split: -0.08%, +0.11%
Preamble instrs: 314064 -> 313948 (-0.04%); split: -0.05%, +0.01%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
Mel Henning
94f4fc12ea nir/divergence_analysis: Add NV_shader_sm_builtins
Fixes crucible func.nv.shader-sm-builtins.q0

Fixes: a3839dbb90 ("nak: Change divergence analysis pass order")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36011>
2025-07-09 16:47:28 +00:00
Marek Olšák
4263b49778 ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass
This is a cleanup.

Old gs LDS layout: [es outputs][gs outputs][scratch]
Old nogs LDS layout: [xfb/cull][scratch]

New gs LDS layout: [es outputs][scratch|gs outputs]
New nogs LDS layout: [scratch|xfb/cull]

The LDS scratch is moved to the beginning of the preceding buffer in LDS,
while the addresses in that LDS buffer are offset by the scratch size.
It effectively merges the LDS scratch with the preceding buffer in LDS.
Thanks to that, we no longer need the ngg_scratch ABI and the offset
in a user SGPR.

The lowering passes now return the LDS scratch size, which is used
by the drivers to determine the final LDS size.

The ngg_lds_layout SGPR is now unused without GS in RADV.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
2025-07-02 20:27:41 +00:00
Alyssa Rosenzweig
16b53d356a nir: add rasterization_stream sysval
for plumbing transformFeedbackRasterizationStreamSelect (in turn for exercising
more CTS and proving out my design).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Matt Turner
102d7409ef nir: Add convert_cmat_intel intrinsic
This intrinsic will be used to implement matrix type and layout
conversions in the backend compiler.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Alyssa Rosenzweig
caa0854da8 nir: plumb load_global_bounded
this lets the backend implement bounded loads (i.e. robust SSBOs) in a way
that's more clever than a full branch. similar idea to
load_global_constant_bound which should eventually be merged into this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>
2025-06-26 16:41:53 +00:00
Lionel Landwerlin
16fca611d7 nir: add new intel ssbo intrinsics
Similar to ir3 ones, to optimize offsets in the backend.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
8ea124f877 nir/divergence: add missing intel intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:21 +00:00
Alyssa Rosenzweig
5795c8595f nir: model dynamic uniform layout on hk
add some new intrinsics so we can defer lowering until we have the information.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35658>
2025-06-20 18:46:13 +00:00
Faith Ekstrand
9f9cde04ec nir: Add a new load_input_attachment_coord intrinsic
This hoists all the annoyance of figuring out the current pixel's input
attachment coordinates to the driver.  The pass still deals with all the
annoyance of turning an image instruciton into a texture instruction but
it gives the driver more control over the position.  For most drivers,
this will be something like ivec3(int(gl_FragCoord.xy), gl_Layer) or
similar, some drivers need something more nuanced.  Turnip, for
instance, needs unscaled coordinates for some attachments and NVK
doesn't really want gl_Layer or gl_ViewIndex for the layer.  It's better
to just have a new system value that drivers can make what they want.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
2025-06-19 02:14:04 +00:00