Commit graph

244 commits

Author SHA1 Message Date
Connor Abbott
91f19bcbe0 ir3: Plumb through two-dimensional UAV loads
There is native support for D3D-style untyped UAVs, which are an unsized
array of "records."

This will be needed for acceleration structures, because normal SSBO
descriptors aren't large enough to cover all the 128-byte instance
descriptors for the maximum number of instances (2**24).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447>
2025-01-20 01:22:23 +00:00
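The size argument in the commit above (91f19bcbe0) can be checked with a little arithmetic. This is a minimal sketch: the 128-byte descriptor size and the 2**24 instance count come from the commit message, while `fits_in_range` and its `max_range` parameter are hypothetical, since the message does not state the actual range limit of a normal SSBO descriptor.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the commit message. */
#define INSTANCE_DESC_SIZE UINT64_C(128)      /* bytes per instance descriptor */
#define MAX_INSTANCES      (UINT64_C(1) << 24) /* 2**24 instances */

/* Total bytes needed to cover every instance descriptor. */
static uint64_t instance_buffer_bytes(void)
{
   return INSTANCE_DESC_SIZE * MAX_INSTANCES;
}

/* Hypothetical helper: does a buffer of `bytes` fit in a descriptor whose
 * range field can express at most `max_range` bytes? The real limit is
 * hardware-specific and is not given in the commit message. */
static int fits_in_range(uint64_t bytes, uint64_t max_range)
{
   assert(max_range > 0);
   return bytes <= max_range;
}
```

The product is 2**31 bytes (2 GiB), so any descriptor whose range tops out below that cannot cover the full instance array.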
Alyssa Rosenzweig
7bc9bbcc6e nir/lower_printf: support dynamic buffer size
this is required for vtn_bindgen2 where we don't know the buffer size until
the driver-specific code paths, but we need to lower printf (to hash format
strings) in common code. so defer the buffer size decision to an intrinsic.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33067>
2025-01-17 18:09:45 +00:00
Daniel Schürmann
d2f52e61c2 nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks
The important information is whether a loop has a uniform number
of iterations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28627>
2025-01-08 13:33:54 +01:00
Benjamin Lee
6f541e2016 panfrost: add intrinsic to load frag coord at a barycentric
This is needed for noperspective lowering, where we need to multiply the
varying value by gl_FragCoord.w at the same barycentric as the varying.
Normal nir_load_frag_coord_zw instructions are lowered to the new
intrinsic on bifrost with the pan_lower_frag_coord_zw pass.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>
2025-01-03 07:04:05 +00:00
Georg Lehmann
15d754fefa nir: add load_front_face_fsign
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>
2024-12-30 22:31:34 +00:00
Caterina Shablia
f4fcfa8016 pan,nir: introduce load_attribute_pan
load_attribute_pan is a panfrost-specific intrinsic for loading
vertex attributes. Takes explicit vertex and instance IDs which
we need in order to implement vertex attribute divisor with
non-zero base instance on v9+.

Passes which are used by panvk are modified to be aware of
load_attribute_pan.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32039>
2024-12-18 08:33:16 +00:00
Benjamin Lee
becb014d27 nir: treat per-view outputs as arrayed IO
This is needed for implementing multiview in panvk, where the address
calculation for multiview outputs is not well-represented by lowering to
nir_intrinsic_store_output with a single offset.

The case where a variable is both per-view and per-{vertex,primitive} is
now unsupported. This would come up with drivers implementing
NV_mesh_shader or using nir_lower_multiview on geometry, tessellation,
or mesh shaders. No drivers currently do either of these. There was some
code that attempted to handle the nested per-view case by unwrapping
per-view/arrayed types twice, but it's unclear to what extent this
actually worked.

ANV and Turnip both rely on per-view outputs being assigned a unique
driver location for each view, so I've added an option to configure that
behavior rather than removing it.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
2024-12-09 20:31:49 +00:00
Job Noorman
e6c63a88fb nir: add read_getlast_ir3 intrinsic
Like read_first_invocation but using getlast. Note that I intentionally
used the name of the ir3 instruction in the name as its semantics are
tricky to exactly describe otherwise.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31731>
2024-11-29 16:22:47 +00:00
Caterina Shablia
7ca8c19246 Revert "nir: introduce instance_index system value"
This reverts commit b9be1f1f20.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32332>
2024-11-28 07:53:01 +00:00
Caterina Shablia
b9be1f1f20 nir: introduce instance_index system value
The semantics of this newly introduced system value match
Vulkan's InstanceIndex exactly, and are equivalent to
instance_id + base_instance.

Some hardware, such as Mali Valhall or later, only provides
instance id offset by base_instance. Introducing a new system
value to represent this, rather than handling the mismatch
when lowering to BIR lets us use NIR to eliminate redundant
arithmetic that would follow from mismatched semantics, e.g.
instance_id could be lowered to instance_index - base_instance,
so expressions such as instance_id + base_instance would be
optimized to a simple instance_index.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32158>
2024-11-19 09:18:47 +00:00
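The algebra described in the commit above (b9be1f1f20, later reverted by 7ca8c19246) can be illustrated in plain C. This is a sketch only: the function names are hypothetical and stand in for NIR system values, and unsigned 32-bit arithmetic is assumed so that the subtraction and addition cancel even on wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Vulkan-style InstanceIndex as described in the commit message. */
static uint32_t instance_index(uint32_t instance_id, uint32_t base_instance)
{
   return instance_id + base_instance;
}

/* Hypothetical lowering for hardware that only provides instance_index:
 * instance_id is recovered by subtracting base_instance. */
static uint32_t lowered_instance_id(uint32_t instance_index_hw,
                                    uint32_t base_instance)
{
   return instance_index_hw - base_instance;
}

/* A shader expression "instance_id + base_instance" after that lowering.
 * The subtract and add cancel, which is the redundancy an optimizer
 * could remove. */
static uint32_t shader_expr(uint32_t instance_index_hw, uint32_t base_instance)
{
   uint32_t result =
      lowered_instance_id(instance_index_hw, base_instance) + base_instance;
   assert(result == instance_index_hw);
   return result;
}
```

Because `shader_expr` always returns its first argument, an algebraic optimizer can replace the whole expression with a single read of the instance index.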
Marek Olšák
9d043e138d nir: add nir_clear_divergence_info, use it in nir_opt_varyings
nir_opt_varyings computes vertex divergence, which isn't exactly expected
by any other passes.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>
2024-11-05 14:13:40 +00:00
Alyssa Rosenzweig
506b9a5ff5 nir/divergence_analysis: add AGX atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31909>
2024-10-30 19:04:32 +00:00
Marek Olšák
ee452129c6 nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
2227f5be9d nir: rename load_cull_small_primitive_precision -> triangle, add line_precision
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
0914e0d02f nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Daniel Schürmann
95ed72922e nir/divergence: Don't assume that LCSSA phis are not loop-invariant
Since we check for loop-invariance, we don't have to unconditionally
flag LCSSA phis as divergent in presence of divergent breaks.
This ensures consistency, with or without LCSSA form.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c5f142a695 nir/divergence: skip expensive nir_src_is_divergent() check in most cases
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
0eff03d385 nir/divergence: calculate divergence without requiring LCSSA form
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
d34d2f8fa8 nir: consider loop invariance in nir_src_is_divergent()
By doing so, this function does not require LCSSA form anymore
in order to provide correct results.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
1a55d6c23b nir/divergence: Introduce and set nir_def::loop_invariant
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c0b3d7a916 nir/divergence: require nir_metadata_block_index
This allows for fast checks whether some value is defined inside a loop.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c8348139fd nir: change signature of nir_src_is_divergent()
Now, it takes nir_src * instead of nir_src.
Also move the implementation to nir_divergence_analysis.c.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
421b42637d nir: remove nir_update_instr_divergence()
This function has obscure limitations.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c25c63ebc0 nir/divergence: separately indicate whether loops have divergent continues or breaks
bool nir_loop_is_divergent(nir_loop *) replaces the previous
loop->divergent indicator.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
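One way to picture the split introduced by the commit above (c25c63ebc0) is a pair of flags with a combining helper. This is a hypothetical sketch, not Mesa's code: `struct sketch_loop`, its field names, and both helpers are assumptions. The second helper reflects the reasoning of d2f52e61c2 listed earlier in this log, which treats divergent breaks as the indicator of a non-uniform iteration count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nir_loop; the field names are assumptions. */
struct sketch_loop {
   bool divergent_continue; /* some invocations may continue while others do not */
   bool divergent_break;    /* some invocations may break while others do not */
};

/* Plays the role of the old single "divergent" indicator: true if
 * either kind of divergent jump is present. */
static bool sketch_loop_is_divergent(const struct sketch_loop *loop)
{
   assert(loop != NULL);
   return loop->divergent_continue || loop->divergent_break;
}

/* Under this sketch's assumption, only divergent breaks make the number
 * of iterations differ between invocations. */
static bool sketch_loop_has_uniform_trip_count(const struct sketch_loop *loop)
{
   assert(loop != NULL);
   return !loop->divergent_break;
}
```

Keeping the two flags separate lets a caller distinguish a loop that is divergent only because of continues from one whose iteration count actually differs between invocations.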
Lionel Landwerlin
97b17aa0b1 brw/nir: rework inline_data_intel to work with compute
This intrinsic was initially dedicated to mesh/task shaders, but the
mechanism it exposes also exists in the compute shaders on Gfx12.5+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>
2024-10-17 19:35:59 +00:00
Job Noorman
4556b18f51 nir: add shuffle_{xor,up,down}_uniform_ir3 intrinsics
These are like shuffle_{xor,up,down} except they expect a dynamically
uniform index. This is necessary since the ir3 shfl instruction does not
work with a divergent index.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31501>
2024-10-16 22:05:10 +00:00
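The uniform-index requirement from the commit above (4556b18f51) can be modelled on the CPU by treating a subgroup as an array of lanes. This is a rough sketch under stated assumptions: `SUBGROUP_SIZE`, the function names, and the choice to return 0 for out-of-range source lanes are all hypothetical and say nothing about how the ir3 shfl instruction behaves in those cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUBGROUP_SIZE 8 /* hypothetical subgroup width for the sketch */

/* True if every lane supplies the same index (dynamically uniform). */
static bool index_is_uniform(const uint32_t index[SUBGROUP_SIZE])
{
   for (int lane = 1; lane < SUBGROUP_SIZE; lane++) {
      if (index[lane] != index[0])
         return false;
   }
   return true;
}

/* Model of shuffle_xor with a uniform index: lane i reads the value held
 * by lane (i ^ index). The assert encodes the precondition that the
 * index does not vary across lanes. */
static void shuffle_xor_uniform(const uint32_t value[SUBGROUP_SIZE],
                                const uint32_t index[SUBGROUP_SIZE],
                                uint32_t out[SUBGROUP_SIZE])
{
   assert(index_is_uniform(index));
   for (uint32_t lane = 0; lane < SUBGROUP_SIZE; lane++) {
      uint32_t src = lane ^ index[0];
      /* Out-of-range source lanes yield 0 in this sketch only. */
      out[lane] = src < SUBGROUP_SIZE ? value[src] : 0;
   }
}
```

A caller with a possibly divergent index would have to take a different path, since the precondition in `shuffle_xor_uniform` would not hold.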
Rhys Perry
67ad7359ff nir/divergence_analysis: disable phi undef optimization by default
If the backend does not implement this too, or some other future transform
modifies the phi so that this isn't the case (replace the phi with a
bcsel or replace undef with zero), then it will not actually be uniform.

This keeps it enabled to some degree for RADV/ACO.

fossil-db (navi31):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 195008 -> 195282 (+0.14%)
CodeSize: 1012592 -> 1015884 (+0.33%)
Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15%
InvThroughput: 460681 -> 460964 (+0.06%)
Copies: 13508 -> 13516 (+0.06%)
Branches: 5244 -> 5412 (+3.20%)
PreVGPRs: 5092 -> 5096 (+0.08%)
VALU: 116177 -> 116197 (+0.02%)
SALU: 23449 -> 23785 (+1.43%)

fossil-db (navi21):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 164471 -> 164981 (+0.31%)
CodeSize: 883988 -> 888420 (+0.50%)
Latency: 4074287 -> 4082043 (+0.19%)
InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06%
Branches: 5262 -> 5430 (+3.19%)
PreVGPRs: 5100 -> 5104 (+0.08%)
VALU: 116375 -> 116381 (+0.01%)
SALU: 23589 -> 23925 (+1.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211>
2024-10-10 14:59:26 +00:00
Georg Lehmann
e0bcab953d nir: add amd shared append/consume
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>
2024-09-19 16:21:47 +00:00
Alyssa Rosenzweig
4941d71846 nir/divergence_analysis: handle load_agx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30981>
2024-09-02 23:27:14 +00:00
Ian Romanick
c160ed212e nir/divergence: resource_intel is less divergent than you thought
When the non_uniform flag is not set, the result is never divergent.

v2: Remove redundant assignment to is_divergent. Suggested by Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:30 +00:00
Karol Herbst
fc88f04ba1 vtn, nir: handle OpImageQueryLevels on images
This is needed for cl_khr_mipmap_image, specifically the OpenCL C
function get_image_num_mip_levels.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30834>
2024-08-27 15:06:17 +00:00
Lionel Landwerlin
2158fe2ae2 nir/divergence: add missing load_constant_base_ptr
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30712>
2024-08-27 01:33:52 +00:00
Konstantin Seurer
ce24486ee4 nir: Introduce nir_debug_info_instr
Adds a new instruction type that stores metadata that might be useful
for debugging purposes. Passes must ignore these instructions when
making decisions.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18903>
2024-08-25 10:26:33 +00:00
Lionel Landwerlin
cf986dd589 nir: remove unused intel intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>
2024-08-22 19:44:40 +00:00
Lionel Landwerlin
fbafa9cabd intel/nir: remove load_global_const_block_intel intrinsic
load_global_constant_uniform_block_intel is equivalent in terms of
loading, then for the predicate we just do a bcsel afterward in places
where that is required.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30659>
2024-08-16 11:12:39 +00:00
Job Noorman
e0bad1dd20 ir3: replace @load_uniform by new @load_const_ir3 intrinsic
Uniforms are a legacy thing and this intrinsic was only used to load
from const registers so the new naming is less confusing.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Alyssa Rosenzweig
f04ae930d9 nir,agx: add "active threads in subgroup" intrinsic
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
2024-08-12 18:45:58 -04:00
Alyssa Rosenzweig
0566e9a51f nir/divergence_analysis: handle derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
340831dbcc nir/divergence_analysis: handle AGX stuff
bunch of vendor intrinsics, plus some standard intrinsics used in weird shader
stages.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30488>
2024-08-06 11:48:18 -04:00
Marek Olšák
b2d32ae246 nir: add nir_intrinsic_load_per_primitive_input, split from io_semantics flag
Instead of having 1 bit in nir_io_semantics indicating a per-primitive
FS input, add a dedicated intrinsic for it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:16 +00:00
Georg Lehmann
2d3f536174 aco,nir: add dpp16_shift_amd intrinsic
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24650>
2024-07-17 15:04:38 +00:00
Marek Olšák
1b2cd628b8 nir: rename ordered_xfb_counter_add_gfx12_amd -> ordered_add_loop_gfx12_amd
because it can also be used by compute.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30063>
2024-07-13 01:32:48 +00:00
Connor Abbott
ec37e65a2d ir3: Introduce elect_any_ir3
For preambles, we don't actually care which invocation we get, so we
don't have to enable helper invocations when the preamble uses "getone."
Introduce a new intrinsic with the right semantics and plumb it through.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29914>
2024-06-26 17:40:15 +00:00
Faith Ekstrand
7e3d157bee nak,nir: Drop r2ur_nv in favor of as_uniform
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29737>
2024-06-15 06:14:27 +00:00
Faith Ekstrand
b107240474 nir: Add some new _nv intrinsics
The ldc_nv and ldcx_nv intrinsics correspond to the index and bindless
forms of NVIDIA's LDC instruction, respectively.  ldc_nv is pretty much
load_ubo without some of the unnecessary constant bits while ldcx_nv
takes a 64-bit bindless handle instead of an index.  The other two give
us a little control over register allocation at the NIR level to ensure
that LDCX handles are placed in uniform registers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
2024-06-13 20:43:45 +00:00
Konstantin Seurer
a93f95c69c radv/rt: Remove load_rt_dynamic_callable_stack_base_amd
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619>
2024-05-28 12:23:45 +00:00
Lionel Landwerlin
2be28ee58a nir: add a base offset for printf indexing
This will allow a driver to use a single table of printf strings
across all shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
2024-05-15 13:13:37 +00:00
Lionel Landwerlin
8d336f069e nir/divergence: add missing load_printf_buffer_address
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
2024-05-15 13:13:37 +00:00
Juan A. Suarez Romero
87cd11ecd2 nir,v3d: rename tlb_color_v3d intrinsic
As this is intended to be used also by VC4, change the suffix to
something more convenient, like tlb_color_brcm.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29119>
2024-05-13 10:44:17 +00:00
Marek Olšák
b06a71b3cd nir: add streamout intrinsics for AMD GFX12
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28889>
2024-04-30 17:17:25 +00:00