Commit graph

5518 commits

Author SHA1 Message Date
Timothy Arceri
d681cf96fb nir/glsl: set deref cast mode during function inlining
See code comment for details.

Issue: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11535
Fixes: c6c150b4cd ("glsl_to_nir: support conversion of opaque function params")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30338>
2024-08-19 23:54:49 +00:00
Rob Clark
563ec4754a nir/opt_loop: Don't peel initial break if loop ends in break
A loop that looks like:

   loop {
      do_work_1();
      if (cond) {
         break;
      } else {
      }
      do_work_2();
      break;
   }

We can't pull that break ahead of do_work_1() after hoisting the initial
do_work_1() out of the loop.  So bail in this case.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11711
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30702>
2024-08-17 14:27:02 +00:00
Ian Romanick
198d8d9c03 nir/algebraic: Improve some find_lsb and ifind_msb patterns
These patterns were observed in shaders from parallel-rdp.

No shader-db changes on any Intel platform.

fossil-db:

Meteor Lake, DG2, Ice Lake had Skylake similar results. (Meteor Lake shown)
Totals:
Instrs: 152535883 -> 152535673 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17112406110 -> 17122827810 (+0.06%); split: -0.01%, +0.07%
Spill count: 78525 -> 78523 (-0.00%)
Fill count: 148132 -> 148127 (-0.00%); split: -0.01%, +0.00%
Max live registers: 31855320 -> 31855314 (-0.00%)

Totals from 206 (0.03% of 633223) affected shaders:
Instrs: 797124 -> 796914 (-0.03%); split: -0.03%, +0.00%
Cycle count: 4716743323 -> 4727165023 (+0.22%); split: -0.05%, +0.27%
Spill count: 18781 -> 18779 (-0.01%)
Fill count: 31381 -> 31376 (-0.02%); split: -0.03%, +0.01%
Max live registers: 31872 -> 31866 (-0.02%)

Tiger Lake
Totals:
Instrs: 150560465 -> 150560343 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15482372893 -> 15479328542 (-0.02%); split: -0.02%, +0.00%
Fill count: 103509 -> 103512 (+0.00%)
Max live registers: 31760378 -> 31760374 (-0.00%)

Totals from 199 (0.03% of 632445) affected shaders:
Instrs: 679513 -> 679391 (-0.02%); split: -0.02%, +0.00%
Cycle count: 4258406125 -> 4255361774 (-0.07%); split: -0.09%, +0.02%
Fill count: 30609 -> 30612 (+0.01%)
Max live registers: 30502 -> 30498 (-0.01%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30650>
2024-08-16 14:52:04 +00:00
Lionel Landwerlin
fbafa9cabd intel/nir: remove load_global_const_block_intel intrinsic
load_global_constant_uniform_block_intel is equivalent in terms of
loading, then for the predicate we just do a bcsel afterward in places
where that is required.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30659>
2024-08-16 11:12:39 +00:00
Timothy Arceri
08b93c841a nir: make static assert more flexible
The static assert used in encode deref modes used the fact there was
less than 16 modes that we wanted to compress as an opportunity to reuse
MODE_ENC_GENERIC_BIT as it just happened to represent 16. However if we
add more than 16 modes i.e need to compress to 6 bits not 5 bits then
MODE_ENC_GENERIC_BIT becomes 32 and the logic in the assert breaks.

Instead we more precisely make sure MODE_ENC_GENERIC_BIT is large
enough to fit all but the last 4 generic modes and that the last 4 modes
defined in the enum are in fact the 4 generic modes.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30654>
2024-08-15 23:02:20 +00:00
Matt Turner
c437f2e79c nir/tests: Add tests for opt_if_merge
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30629>
2024-08-15 20:34:54 +00:00
Matt Turner
d2e6be94ae nir: Skip opt_if_merge when next_if has block ending in a jump
Similar to commit 6cef804067 ("nir/opt_if: fix opt_if_merge when
destination branch has a jump"), we shouldn't combine if statements when
the second if-then-else has a block that ends in a jump.

This fixes a case where opt_if_merge combines

    if (cond) {
        [then-block-1]
    } else {
        [else-block-1]
    }

    if (cond) {
        [then-block-2]
    } else {
        [else-block-2]
    }

where `then-block-2` or `else-block-2` ends in a jump. The phi nodes
following the control flow will be incorrectly updated to have an input
from a block that is not a predecessor.

Fixes: 4d3f6cb973 ("nir: merge some basic consecutive ifs")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30629>
2024-08-15 20:34:54 +00:00
Job Noorman
9998b65695 nir/load_store_vectorize: add load/store_const_ir3
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
db2859cb7f nir/load_store_vectorize: support stores without wrmask
Some store intrinsics (e.g., store_const_ir3) don't have a wrmask so
don't assume it always exists.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
97aefc4405 nir/load_store_vectorize: support non-byte offset
Some load/store intrinsics (e.g., load/store_const_ir3) use offsets in
units other than bytes. Currently, byte offsets were assumed in multiple
places.

This patch adds a new offset_scale field to intrinsic_info and uses it
were needed.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
fbd2c80671 ir3: rename @store_uniform_ir3 to @store_const_ir3
Uniforms are a legacy thing and this intrinsic was only used to store to
the const file so the new naming is less confusing.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
e0bad1dd20 ir3: replace @load_uniform by new @load_const_ir3 intrinsic
Uniforms are a legacy thing and this intrinsic was only used to load
from const registers so the new naming is less confusing.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
6b611dbe79 nir/opt_vectorize: add support for phi nodes
Phi nodes are mostly handled the same way as ALU instructions: if all
sources point to the same def (which happens if they are scalar or have
been previously vectorized), combine them into a single vectorized phi
node.

There is one case where this doesn't work, however: sources that come
from a loop back-edge. Since their defs haven't been processed yet, they
are generally not the same. We could simply refuse to vectorize such
phi nodes but this could leave many values used in loops unnecessarily
scalarized.

Instead, this patch implements a simple heuristic: if all defs coming
from a back-edge have the same instructions type and, in case of ALU,
the same operation, assume they will be vectorized later. Since we
require that normal edges are vectorized already, chances are that the
back-edge can also be vectorized.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
79eb57de93 nir/opt_vectorize: process blocks in source-code order
To handle phi nodes, it's important that all sources have been processed
before processing the phi node itself. The current traversal order
(depth-first on dom_children) does not guarantee this. This patch
rewrites the pass to visit blocks in source-code order.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
b451575989 nir/opt_vectorize: prepare for multiple try_combine functions
Dispatch to different functions inside instr_try_combine. To prepare for
upcoming support for phi nodes.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Job Noorman
e2cb646148 nir/opt_vectorize: move rewriting of uses to a function
Will be shared with upcoming support for phi nodes.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28341>
2024-08-15 12:07:27 +00:00
Alyssa Rosenzweig
749205fe06 pan/bi: switch to derivative intrinsics
rewrote most of the impl but shrug.

regresses code gen for mediump but I'm not too bothered given the lackluster
perf of fp16 on bifrost :(

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30567>
2024-08-14 01:34:54 +00:00
Alyssa Rosenzweig
e754e54f88 nir: model AGX explicit coordinate intrinsics
I don't know what Apple calls these, so we're using the name "explicit
coordinates".

AGX has instructions for loading/stores register <---> tilebuffer ---> storage
images. Usually these are used in the fragment shader and end-of-tile shader to
implement colour attachments, with implicitly specified coordinates based on the
shader stage. However they can also be used in compute shaders with explicitly
specified coordinates ("imageblocks" in Apple parlance). Model this in NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
2024-08-12 18:46:31 -04:00
Alyssa Rosenzweig
f04ae930d9 nir,agx: add "active threads in subgroup" intrinsic
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
2024-08-12 18:45:58 -04:00
Alyssa Rosenzweig
16cadc04f3 nir/opt_reassociate_bfi: use alu_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
2643b3cfbf nir/lower_packing: use alu_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
6e39379183 nir/opt_idiv_const: use alu_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
b6daa35d9d nir/scale_fdiv: use alu_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
d2780d871b nir/lower_alu: use alu_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
9b07550908 treewide: use nir_shader_alu_pass
@def@
        typedef bool;
        typedef nir_builder;
        typedef nir_instr;
        typedef nir_def;
        identifier fn, instr, intr, x, builder, data;
        @@

        static fn(nir_builder* builder,
        -nir_instr *instr,
        +nir_alu_instr *intr,
        ...)
        {
        (
        -   if (instr->type != nir_instr_type_alu)
        -      return false;
        -   nir_alu_instr *intr = nir_instr_as_alu(instr);
        |
        -   nir_alu_instr *intr = nir_instr_as_alu(instr);
        -   if (instr->type != nir_instr_type_alu)
        -      return false;
        )

        <...
        (
        -instr->x
        +intr->instr.x
        |
        -instr
        +&intr->instr
        )
        ...>

        }

        @pass depends on def@
        identifier def.fn;
        expression shader, progress;
        @@

        (
        -nir_shader_instructions_pass(shader, fn,
        +nir_shader_alu_pass(shader, fn,
        ...)
        |
        -NIR_PASS_V(shader, nir_shader_instructions_pass, fn,
        +NIR_PASS_V(shader, nir_shader_alu_pass, fn,
        ...)
        |
        -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn,
        +NIR_PASS(progress, shader, nir_shader_alu_pass, fn,
        ...)
        )

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Alyssa Rosenzweig
cc1f092b62 nir: add nir_shader_alu_pass
after the smashing success of nir_shader_intrinsics_pass, let's add the ALU
version to help the odd non-algebraic ALU lowering pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30582>
2024-08-10 13:40:21 +00:00
Marek Olšák
1d66acf993 nir: add ACCESS_KEEP_SCALAR, preventing vectorization
The comment explains the reason.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Georg Lehmann
48acf9d358 nir/lower_int64: replace uadd_sat with ior for find_lsb64 and ufind_msb64
Using ior here is equivalent to using uadd_sat, but works for every driver
and shouldn't hurt anywhere.

I forgot to fix this up when fixing up some vvl errors with zink.

Fixes crashes with the integer_ctz CL CTS tests in zink.

Fixes: 39ec184db6 ("zink: lower 64 bit find_lsb, ufind_msb and bit_count")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30535>
2024-08-09 15:09:57 +00:00
Timur Kristóf
10dcf1fca6 nir: Remove unused nir_assign_linked_io_var_locations.
The only user of this pass was RADV.
Considering that driver locations are deprecated, nobody should
write new code relying on this pass.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Alyssa Rosenzweig
530498cb83 treewide: use new-style derivative builders
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
09c61d0e4c nir/schedule: handle derivative intrinsics
load bearing for broadcom

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
038bb53456 nir/instr_set: allow derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
0566e9a51f nir/divergence_analysis: handle derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
66724e28ac nir/opt_constant_folding: handle derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
e0cc041674 nir/lower_wpos_ytransform: handle intrinsic ddx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
9f9f96d2f9 nir/gather_info: handle derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
c7fbdc6b0c nir/opt_peephole_select: allow derivatives
match the old behaviour.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Alyssa Rosenzweig
24b722a692 nir: add derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Iván Briano
7fce39484e nir: add pass to convert ViewIndex to DeviceIndex
Used to implement VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT_KHR.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30329>
2024-08-07 19:09:55 +00:00
Georg Lehmann
b6d3f666ab nir/peephole_select: ignore masked/quad swizzle without fetch_inactive
Without fetch_inactive, these instructions need to return 0 for inactive lanes
and peephole_select changes which instructions are inactive.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30540>
2024-08-07 20:21:05 +02:00
Zan Dobersek
7fd5f76393 nir/lower_vars_to_scratch: calculate threshold-limited variable size separately
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.

nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
2024-08-07 14:32:28 +00:00
Alyssa Rosenzweig
796b3ab23d nir/opt_peephole_select: allow speculatable load constant
this is useful on AGX when soft fault is enabled.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30501>
2024-08-06 20:01:37 +00:00
Alyssa Rosenzweig
340831dbcc nir/divergence_analysis: handle AGX stuff
bunch of vendor intrinsics, plus some standard intrinsics used in weird shader
stages.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30488>
2024-08-06 11:48:18 -04:00
Alyssa Rosenzweig
d99c2ef059 nir/opt_uniform_atomics: add fs atomics predicated? flag
on agx (and mali), we predicate atomics on "if (!helper)", so doing so again in
this pass is redundant. and would cause a problem since we'd then have to lower
the "is helper inv?" flag late. so just skip the extra lowering code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30488>
2024-08-06 11:48:17 -04:00
Rhys Perry
810808b778 nir/opt_uniform_atomics: require block index metadata
is_atomic_already_optimized() uses this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30518>
2024-08-06 15:04:21 +00:00
Karol Herbst
14ea102175 nir: add load_global_size intrinsic
There is no need to compute it in the shader as the result is known at
runtime already.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Tested-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30467>
2024-08-01 17:43:42 +00:00
Timothy Arceri
298633e365 nir: set disallow_undef_to_nan for legacy ARB asm programs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11389
Fixes: 861d274453 ("nir: replace undef only used by ALU opcodes with 0 or NaN")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30419>
2024-08-01 02:28:24 +00:00
Christian Gmeiner
26474f8d4a nir_lower_mem_access_bit_sizes: Support load_kernel_input
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30407>
2024-07-30 06:51:22 +00:00
Timothy Arceri
017770ff14 nir: add nir_tex_src_{sampler,texture}_deref_intrinsic
To be used as a placeholder until after function inlining so we can
replace function params with bindless handles if needed.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30315>
2024-07-29 00:06:10 +00:00
Timothy Arceri
ef13ff00d1 nir: create validate_tex_src_texture_deref() helper
Will be used in a following patch.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30315>
2024-07-29 00:06:10 +00:00