Commit graph

8107 commits

Author SHA1 Message Date
Caio Oliveira
40ba00238b compiler/types: Tidy up the asserts in get_*_instance functions
Use the local variable in the assertions, move them out the critical region.
No behavior change.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279>
2023-06-15 03:43:46 +00:00
Caio Oliveira
efbbdeffc0 compiler/types: Be consistent when naming array element/size
The element type passed is different than the array type and it is not
a "base type" in the glsl_type sense, so pick a name that reflects that.
Also stick to a single name for the array_size.

Just renames, no behavior change.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23279>
2023-06-15 03:43:46 +00:00
Jesse Natalie
83f741124b nir_lower_returns: Mark assert-only var as ASSERTED
Fixes: 5d238c0c ("nir_lower_returns: Optimize phis before beginning the pass")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23634>
2023-06-15 03:09:29 +00:00
Ian Romanick
de60b463d7 nir/algebraic: Simplify various trivial bfi
These are mostly just obvious patterns that somebody will eventually
want to add.

DG2, Tiger Lake, Ice Lake, Skylake, Broadwell, and Haswell had similar
results (Ice Lake shown)
total instructions in shared programs: 20570033 -> 20570026 (<.01%)
instructions in affected programs: 7363 -> 7356 (-0.10%)
helped: 6 / HURT: 0

total cycles in shared programs: 902118781 -> 902118854 (<.01%)
cycles in affected programs: 419132 -> 419205 (0.02%)
helped: 4 / HURT: 2

DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819500 -> 152819380 (-0.00%)
Cycles: 15014627187 -> 15014624437 (-0.00%)

Totals from 115 (0.02% of 662497) affected shaders:
Instrs: 28963 -> 28843 (-0.41%)
Cycles: 404582 -> 401832 (-0.68%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
541e7eb389 nir/algebraic: Optimize some u2f of bfi
v2: Fix a copy-and-paste bug s/('find_lsb', a)/a/ in the patterns. See
piglit!819.

DG2, Tiger Lake, Ice Lake, Skylake, and Broadwell had similar results (Ice Lake shown)
total instructions in shared programs: 20570063 -> 20570033 (<.01%)
instructions in affected programs: 452 -> 422 (-6.64%)
helped: 30 / HURT: 0

total cycles in shared programs: 902118723 -> 902118781 (<.01%)
cycles in affected programs: 1762 -> 1820 (3.29%)
helped: 0 / HURT: 29

DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819969 -> 152819500 (-0.00%)
Cycles: 15014628652 -> 15014627187 (-0.00%); split: -0.00%, +0.00%

Totals from 469 (0.07% of 662497) affected shaders:
Instrs: 7644 -> 7175 (-6.14%)
Cycles: 31787 -> 30322 (-4.61%); split: -4.90%, +0.29%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
6603948a7a nir/algebraic: Lower some bfi with two constant sources
All Haswell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907054 -> 19906882 (<.01%)
instructions in affected programs: 8103 -> 7931 (-2.12%)
helped: 52 / HURT: 0

total cycles in shared programs: 855779334 -> 855781791 (<.01%)
cycles in affected programs: 724201 -> 726658 (0.34%)
helped: 38 / HURT: 7

total sends in shared programs: 1039308 -> 1039302 (<.01%)
sends in affected programs: 162 -> 156 (-3.70%)
helped: 2 / HURT: 0

No shader-db changes on any older Intel platforms.

All Intel platforms had similar restuls. (Ice Lake shown)
Totals:
Instrs: 153117340 -> 152825222 (-0.19%); split: -0.19%, +0.00%
Cycles: 15011904351 -> 15014072944 (+0.01%); split: -0.04%, +0.05%
Send messages: 7711509 -> 7711421 (-0.00%)
Spill count: 100745 -> 99907 (-0.83%); split: -0.85%, +0.02%
Fill count: 203684 -> 202459 (-0.60%); split: -0.62%, +0.02%
Scratch Memory Size: 4403200 -> 4376576 (-0.60%)

Totals from 18603 (2.81% of 662496) affected shaders:
Instrs: 5258303 -> 4966185 (-5.56%); split: -5.56%, +0.00%
Cycles: 447391388 -> 449559981 (+0.48%); split: -1.29%, +1.77%
Send messages: 559231 -> 559143 (-0.02%)
Spill count: 5009 -> 4171 (-16.73%); split: -17.17%, +0.44%
Fill count: 8769 -> 7544 (-13.97%); split: -14.33%, +0.36%
Scratch Memory Size: 194560 -> 167936 (-13.68%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
83bd87c558 nir: Add optimization pass to reassociate some bfi instructions
The needs of this pass are ever so slightly more than what
nir_opt_algebraic can do. :( Specifically, it needs to be able to look
at the relationship of constant values used in an expression tree.

v2: Add nir_mov_alu to handle swizzles on the original sources.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Lionel Landwerlin
4ee1a8bb9c nir: add a load_global_constant uniform intel variant
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
5ae8a78d8c intel/fs: make use of load_ubo_uniform_block_intel
The principle is the same as the load_ssbo_uniform_block_intel.
Whenever we see a uniform offset, load the data only once in GRFs to
reduce register pressure.

Iris shader-db run on DG2 :

total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs)   min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel)   min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.

total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs)   min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel)   min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.

total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28

total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28

total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.

LOST:   26
GAINED: 183

shader-db on a number of VK/DX titles on DG2 :

 PERCENTAGE DELTAS  Shaders   Instrs    Cycles
 age_of_wonders_III 1928      +0.02%    -0.19%

 PERCENTAGE DELTAS       Shaders   Instrs    Cycles  Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width
 assassins_creed_odyssey 2119      +1.12%    -0.42%      -0.03%        -0.29%       -9.10%     -4.26%         -0.64%             +0.65%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Spill count Fill count Max live registers
 aztec_ruins_high  269       -0.05%    -0.45%     -0.29%     -7.27%         -0.33%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Max live registers Max dispatch width
 dark_souls_3_dxvk_g2 1420      +0.09%    +0.24%        +0.21%             +0.12%

(stats look bad, but it's just one shader affected)
 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Spill count Fill count Scratch Memory Size Max live registers
 fallout_4_dxvk_g2 1638      +0.67%    +8.32%    +16.02%     +7.17%         +100.00%            +0.48%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Send messages Spill count Fill count Max live registers Max dispatch width
 red_dead_redemption2 5969      +0.16%    -0.04%      -0.04%       +0.01%     +0.05%         -0.20%             +0.04%

 PERCENTAGE DELTAS          Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 rise_of_the_tomb_raider_g2 12129     +2.19%    +1.36%      -1.23%          -0.36%             +2.04%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers
 shooter-game      693       +0.07%    -0.89%      -0.09%          -0.09%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 talos_g2          1140      +0.37%    +3.80%      -0.86%          -0.67%             +0.19%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Max live registers Max dispatch width
 total_war_warhammer2 477       +0.25%    +0.66%        -0.17%             +0.10%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 witcher_3_dxvk_g2 1074      +0.75%   -10.45%      -0.15%          -0.16%             -0.16%

 PERCENTAGE DELTAS      Shaders   Instrs    Cycles  Send messages Max live registers
 wolfenstein_youngblood 1111      +0.52%    +0.66%      -0.59%          -0.03%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
4a23a5a904 nir: add a new ubo uniform loading intrinsic for intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Alyssa Rosenzweig
12eb23530b nir: Remove non-scoped barriers
Nothing uses them anymore.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:11 +00:00
Alyssa Rosenzweig
df51464cac nir: Remove handling for non-scoped barriers
Nothing generates them so this is all dead.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:11 +00:00
Alyssa Rosenzweig
c7232be537 nir/tests: Use scoped barriers internally
Test what drivers actually use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
1d4a59448c treewide: Remove use_scoped_barrier
It is now set by all relevant drivers and not checked anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
7173cbccbf nir: Assume use_scoped_barrier
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
5dfa8e4537 vtn: Assume use_scoped_barrier
True for all backends supporting barriers. This lets us collapse lots of code,
since scoped_barriers are based on the SPIR-V definition.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
c696fc4392 glsl: Assume use_scoped_barrier
True for all backends supporting barriers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
09b5e2a786 vtn: Handle atomic counter semantics
This can happen for GLSL-environment SPIR-V.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Suggested-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Jesse Natalie
92dcaf7deb dxil: Remove custom SSBO lowering
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:37 +00:00
Jesse Natalie
ecfbc16f61 dxil: Delete load_ubo_dxil intrinsic
Instead of splitting unaligned UBO loads while still using derefs,
and then lowering load_ubo to load_ubo_dxil in lower_loads_stores_to_dxil,
use lower_mem_access_bit_sizes and lower_ubo_vec4 to handle load size and
alignment restrictions while converting to load_ubo_vec4 instead, which
has the same semantics as load_ubo_dxil.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3842
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
f121d8fe12 microsoft/compiler: Un-lower shared/scratch to derefs
Derefs have index-based access semantics, which means we don't need
custom intrinsics to encode an index instead of a byte offset.

Remove the "masked" store intrinsics and just emit the pair of atomics
directly. This massively reduces duplication between scratch, shared,
and constant, while also moving more things into nir so more optimizations
can be done.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
f9b0382faf microsoft/compiler: Emit const accesses as load_deref
There's a few changes in here that are very inter-related.

First, we stop lowering load_deref on shader_temp to load_ptr_dxil,
and just leave it as load_deref. In order for that to work, we need
the derefs to be in a shape that's acceptable to DXIL, so the only
current producer of shader_temp loads (the CLC frontend) needs to
run some lowering passes on them first.

The DXIL backend is augmented to just write out deref indices while
walking a deref chain, which will get combined in the load op into
a GEP instruction. For non-mesh/raytracing shaders, these are required
to be single-level scalar arrays, but the complexity here is preparation
for when we don't need to do that anymore.

Additionally, the const lookups are changed from using a hash table
to just putting an index on the variable.

All of this together is enough to enable the authored-forever-ago test
which uses indirect array access into a const packed struct. The
load_ptr_dxil handling didn't deal with packed structs / unaligned
accesses, but now that we're in a logical address space with derefs
instead of physical, there's no alignment to deal with anymore and
the fact that it's packed goes out the window.

This removes one custom DXIL intrinsic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
fba82797d7 nir: Optimize unpacking 16 bit values that were originally packed
I was seeing u2u64 still in my final shader after pack/unpack were
lowered, which sounds to me like some other optimizations are missing
for detecting the post-lowering pack/unpack patterns, but let's at
least add some patterns for the simple cases.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
663d957480 nir: Fix constant expression for unpack_64_4x16
Cc: Mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
c70d94a889 nir_lower_mem_access_bit_sizes: Support unaligned stores via a pair of atomics
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8282
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
082eba6165 nir_lower_mem_access_bit_sizes: Move options into a struct
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
4217353e2d nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback
We'd like to use this callback to adjust loads and stores from things
that are unsupported to things that are supported, but if the input
is already supported, we'd prefer not to change it. Rather than making
up a bit size that'd work and doing a bunch of pack/unpack bit math,
only return a different bit size if the input one doesn't work for us
(i.e. can't load enough memory or just an unsupported size entirely).

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
e77fe70b1e nir_lower_ubo_vec4: Delete an invalid assert
This pass handles 16-component 8-bit loads, 8-component 16-bit loads,
and 2-component 64-bit loads. The number of components for the fallback
case doesn't need to be 4.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
bb311ce370 nir: Allow atomics as non-complex uses for var-splitting passes
The var splitting pass can rearrange the variables as long as their
position in memory doesn't matter. For block-arranged variables,
or things like memcpys or casts, the layout matters, but atomics
don't imply anything about the layout of the overall variable, so
don't treat them as "complex" for this use case.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
cf9ea94958 nir_split_struct_vars: Support more modes and constant initializers
Idiomatic DXIL has constants contained within global variables rather
than a big blob of data. Doing this allows us to have 16-bit and 64-bit
data as well, where normally bitcasts would be disallowed on variable
GEP chains.

Unfortunately, DXIL validation requires SOA to be turned into AOS,
which means we need to split structs. We want to be able to run this
on nir_var_mem_constant variables which have constant initializers,
so add a bit of logic to handle that case, and relax the mode validation.
There's nothing special about the modes it was set up to handle.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
c0e41e9b3e vtn: Set is_null_constant
Note that pointers are not considered to be nir null constants, since
a null pointer value might not be 0s.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
4edfb67fd4 nir: Add is_null_constant to nir_constant
Indicates that the values contained within are 0s, regardless of
type. Enables some optimizations.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
009d2de88f nir_opt_constant_folding: Fix nir_deref_path leak
Cc: Mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Alyssa Rosenzweig
5c1d614256 nir: Add interleave_agx instruction
While this is a generic bit twiddling ALU instruction, it's especially useful
for address calculations, since the architecture's tiled textures use Morton
coding within the tiles.

This will be used when lowering image_texel_address on AGX, as part of the image
atomics implementation. I don't know if there's any other neat uses I could
detect with opt_algebraic, this doesn't seem like an operation a shader would
open-code... Maybe useful for BVH building or something...

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
2023-06-12 20:09:53 +00:00
Alyssa Rosenzweig
d1b94a11bd nir/lower_tex: Use nir_steal_tex_src
The find-remove-use pattern is quite natural for texture lowering :)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
2023-06-12 20:09:53 +00:00
Alyssa Rosenzweig
36e779e4a9 nir/builder: Add steal_tex_src helper
I have this in the AGX compiler but I want to use it in more places.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
2023-06-12 20:09:53 +00:00
Sviatoslav Peleshko
08e95f8f8e nir/lower_shader_calls: Fix cursor if broken after nir_cf_extract() call
Fixes: e2dadda3 ("Revert "nir/lower_shader_calls: put inserted instructions into a dummy block")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8978
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22884>
2023-06-11 00:29:49 +00:00
Mykhailo Skorokhodov
40042ed25a nir: Rematerialize derefs after opt_dead_cf
Adding `nir_rematerialize_derefs_in_use_blocks_impl`
solves some cases when 'opt_dead_cf()' generates
a phi instruction for the first argument
of the `deref_store` intrinsic.

Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Lionel Landwerlin's avatarLionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6742
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22983>
2023-06-09 21:35:21 +00:00
Alyssa Rosenzweig
5a55ef2fd1 nir: Add AGX atomic intrinsics
This is a piece of cake with unified atomics :-) This will let us do our
addressing math tricks nice and easily.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>
2023-06-09 12:06:00 +00:00
Caio Oliveira
4f9a23e339 spirv: Use vtn_translate_scope for OpReadClockKHR
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23508>
2023-06-08 21:21:47 +00:00
Caio Oliveira
089a0cf4ef spirv: Refactor and rename scope translation helper
This will make the change from nir_scope to mesa_scope
later less noisy.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23508>
2023-06-08 21:21:47 +00:00
Karol Herbst
90b8666ff2 clc: relax spec constant validation
Multiple values can have multiple spec constants assigned and vtn handles
this just fine. So just drop that assert as we need it to run SyCL
kernels.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9037
Fixes: a699844ffb ("microsoft/clc: Parse SPIR-V specialization consts into metadata")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23512>
2023-06-08 17:22:47 +00:00
Yonggang Luo
1eda220f18 compiler: use align instead glsl_align and remove glsl_align
#include "util/u_math.h" when necessary to call align function

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23421>
2023-06-08 06:41:21 +00:00
Yonggang Luo
4134f9ac09 util: Do not use align as variable name
Because align is also a function in u_math.h

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23421>
2023-06-08 06:41:21 +00:00
Daniel Schürmann
be9f4a80b8 nir: add nir_intrinsic_resume_shader_address_amd
This intrinsic returns a pointer to the end of the shader
and is intended for stitched binaries.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Daniel Schürmann
03c4b5b0cc nir,amd: add nir_intrinsic_store_[scalar|vector]_arg_amd to overwrite inputs
This intrinsic must only be used at top-level CF in order
to not break SSA properties.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
2023-06-08 00:37:03 +00:00
Caio Oliveira
0f54621564 compiler/types: Make key in subroutine_name more effective
Use the string itself as a key for searching -- and the internal
allocated name as a key when storing.

Because record_key_hash doesn't consider the name field, which is
the only used field for a SUBROUTINE type, the hash key was always
the same for all types.  Using the name fixes this.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23277>
2023-06-07 23:01:59 +00:00
Samuel Pitoiset
98bb7e10e7 nir: add nir_intrinsic_load_rasterization_primitive_amd
For VK_KHR_fragment_shader_barycentric, AMD needs to know the primitive
topology in the fragment shader but with fast-link GPL this is unknown
at compile time and it needs to be passed dynamically.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16742>
2023-06-07 14:40:35 +00:00
Samuel Pitoiset
0358a23012 nir: add nir_intrinsic_load_provoking_vtx_amd
Will be used to load provoking vertex info from the hardware to
determine the provoking vertex ID.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16742>
2023-06-07 14:40:35 +00:00
Yonggang Luo
b687fa4ccb vulkan: move nir_convert_ycbcr into vulkan runtime
This only used by vulkan drivers and depends on vulkan util, so do the move to decouple
nir from vulkan utils

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23444>
2023-06-07 08:42:03 +00:00