Commit graph

3921 commits

Author SHA1 Message Date
Rob Clark
a8e84f50bc nir: Add helper to create passthrough TCS shader
Based on si_create_passthrough_tcs() as that seemed the most generic of
the various different backend driver implementations.  Uses the
load_tess_level_outer_default and load_tess_level_inner_default
intrinsics to load the gl_TessLevelOuter and gl_TessLevelInner values,
so driver will somehow need to implement those to load the values set
by pipe_context::set_tess_state() or similar.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
2022-10-24 21:39:38 +00:00
Alyssa Rosenzweig
80de33cf6a nir/opt_preamble: Move load_texture_base_agx
nir_opt_preamble will be crucial to optimize out the lowering for array
textures on AGX, which involves this AGX-specific sysval.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>
2022-10-22 14:48:04 -04:00
Georg Lehmann
741dbadae0 nir: Fix ifind_msb_rev constant folding.
For example if src0 is 0x80000000 we should return 1, not 0.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Fixes: a5747f8ab3 ("nir: add opcodes for *find_msb_rev and lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:57:55 +02:00
Georg Lehmann
125741dbae nir/opt_algebraic: Optimize various find_msb_rev patterns.
From dxvk, dxil-spirv, fxc, dxc and others.

Totals from 177 (0.13% of 134913) affected shaders:
CodeSize: 1079504 -> 1059872 (-1.82%)
Instrs: 195381 -> 192269 (-1.59%)
Latency: 3664137 -> 3631951 (-0.88%)
InvThroughput: 599479 -> 585675 (-2.30%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:57:33 +02:00
Georg Lehmann
7505be3497 nir/opt_algebraic: Add an option to lower uclz.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:57:10 +02:00
Georg Lehmann
1e552b9c95 nir/opt_algebraic: Mirror optimizations for find_msb_rev.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
2022-10-22 11:56:44 +02:00
Emma Anholt
22f7f167cd nir/opt_phi_precision: Fix missing swizzles when narrowing phi srcs.
This NIR:

	vec4 32 ssa_169 = phi block_1: ssa_168, block_2: ssa_138
	vec1 16 ssa_209 = f2fmp ssa_169.x
	vec1 16 ssa_210 = f2fmp ssa_169.y
	vec1 16 ssa_211 = f2fmp ssa_169.z
	vec1 16 ssa_212 = f2fmp ssa_169.w
	vec4 16 ssa_213 = vec4 ssa_209, ssa_210, ssa_211, ssa_212
	intrinsic store_output (ssa_213, ssa_171) (base=0, wrmask=xyzw /*15*/, component=0, src_type=float16 /*144*/, io location=4 slots=1 mediump /*8388740*/, xfb() /*0*/, xfb2() /*0*/)

would turn into:

	vec4 32 ssa_169 = phi block_1: ssa_168, block_2: ssa_138
	vec4 16 ssa_216 = phi block_1: ssa_214, block_2: ssa_215
	vec1 16 ssa_209 = f2fmp ssa_169.x
	vec1 16 ssa_210 = f2fmp ssa_169.y
	vec1 16 ssa_211 = f2fmp ssa_169.z
	vec1 16 ssa_212 = f2fmp ssa_169.w
	vec4 16 ssa_213 = vec4 ssa_216.x, ssa_216.x, ssa_216.x, ssa_216.x
	intrinsic store_output (ssa_213, ssa_171) (base=0, wrmask=xyzw /*15*/, component=0, src_type=float16 /*144*/, io location=4 slots=1 mediump /*8388740*/, xfb() /*0*/, xfb2() /*0*/)

ignoring the swizzles from the f2fmp srcs.  Fixes failures in
dEQP-GLES2.functional.shaders.random.all_features.fragment.20 on
turnip+ANGLE.

Fixes: c7b935962b ("nir: Add pass to lower phi precision")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19179>
2022-10-22 03:06:31 +00:00
Timur Kristóf
e52c2f4fca nir, ac, aco: Add index src to load_buffer_amd/store_buffer_amd.
Also modify all existing uses to pass a zero to this new src.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
2022-10-20 20:00:50 +00:00
Timur Kristóf
c918f0934e nir, ac, aco: Add ACCESS intrinsic index to load/store_buffer_amd.
Previously, we always treated these as coherent, but now let's make
this configurable. Also set all current users to ACCESS_COHERENT.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
2022-10-20 20:00:49 +00:00
Rhys Perry
1ae73bc076 nir/algebraic: optimize b<<a + c<<a
fossil-db (navi21):
Totals from 248 (0.18% of 135636) affected shaders:
Instrs: 85836 -> 85611 (-0.26%); split: -0.27%, +0.00%
CodeSize: 481304 -> 480332 (-0.20%); split: -0.21%, +0.00%
Latency: 9596559 -> 9596152 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1423707 -> 1423670 (-0.00%)
SClause: 3872 -> 3874 (+0.05%)
PreSGPRs: 5034 -> 5038 (+0.08%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19137>
2022-10-20 18:57:23 +00:00
Samuel Pitoiset
09033c7b22 nir: add nir_intrinsic_load_ring_attr_{offset}_amd
These intrinsics will be used to lower NGG attributes to memory on
GFX11.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19173>
2022-10-20 15:59:44 +00:00
Qiang Yu
58e006b174 nir,ac/llvm,radv: add nir_intrinsic_load_provoking_vtx_in_prim_amd
For radeonsi which load this from arg.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19166>
2022-10-20 06:53:56 +00:00
Karol Herbst
d7156e5d9c nir/lower_cl_images: set binding
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19151>
2022-10-19 06:27:20 +00:00
Alyssa Rosenzweig
78adf44839 nir/lower_io: Set interpolated_input dest_type
...even for non-pixel interpolation, for consistency. Otherwise backends get
funny intrinsics with interpolateAt:

   vec4 32 ssa_4 = intrinsic load_interpolated_input (ssa_3, ssa_2) (base=1, component=0, dest_type=invalid /*0*/, io location=33 slots=1 /*161*/)

We know it'll be a float, but backends shouldn't need to special case this. (Or
maybe interpolated_input shouldn't have a dest_type index. I'd be ok with that
resolution too. But having one and not setting it consistently is wrong.)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19085>
2022-10-18 21:08:54 +00:00
Yonggang Luo
a9da108c6b nir: No need redefine snprintf anymore in nir.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18685>
2022-10-18 03:16:00 +00:00
Alyssa Rosenzweig
2c7be4d421 nir: Usher nir_normalize_cubemap_coords into 2022
I stumbled upon this old NIR pass (still in use by intel and broadcom)
and noticed how most of the code was NIR boilerplate that we have
helpers for. Rewrite the pass to use all the helpers.

v2: Fix cube map arrays.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18754>
2022-10-17 20:46:24 +00:00
Alyssa Rosenzweig
fc5c671e87 nir: Fix nir_fmax_abs_vec_comp
This failed to take fabs of the first component, implementing an unintended
formula that would return the right results in some common cases but is wrong in
general:

   max { x, |y|, |z| }

instead of the intended

   max { |x|, |y|, |z| }

Reexpress the implementation to make correctness obvious.

Fixes: 272e927d0e ("nir/spirv: initial handling of OpenCL.std extension opcodes")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18754>
2022-10-17 20:46:24 +00:00
Alyssa Rosenzweig
ac2964dfbd nir: Be smarter fusing ffma
If there is a single use of fmul, and that single use is fadd, it makes
sense to fuse ffma, as we already do. However, if there are multiple
uses, fusing may impede code gen. Consider the source fragment:

   a = fmul(x, y)
   b = fadd(a, z)
   c = fmin(a, t)
   d = fmax(b, c)

The fmul has two uses. The current ffma fusing is greedy and will
produce the following "optimized" code.

   a = fmul(x, y)
   b = ffma(x, y, z)
   c = fmin(a, t)
   d = fmax(b, c)

Actually, this code is worse! Instead of 1 fmul + 1 fadd, we now have 1
fmul + 1 ffma. In effect, two multiplies (and a fused add) instead of
one multiply and an add. Depending on the ISA, that could impede
scheduling or increase code size. It can also increase register
pressure, extending the live range.

It's tempting to gate on is_used_once, but that would hurt in cases
where we really do fuse everything, e.g.:

   a = fmul(x, y)
   b = fadd(a, z)
   c = fadd(a, t)

For ISAs that fuse ffma, we expect that 2 ffma is faster than 1 fmul + 2
fadd. So what we really want is to fuse ffma iff the fmul will get
deleted. That occurs iff all uses of the fmul are fadd and will
themselves get fused to ffma, leaving fmul to get dead code eliminated.
That's easy to implement with a new NIR search helper, checking that all
uses are fadd.

shader-db results on Mali-G57 [open shader-db + subset of closed]:

total instructions in shared programs: 179491 -> 178991 (-0.28%)
instructions in affected programs: 36862 -> 36362 (-1.36%)
helped: 190
HURT: 27

total cycles in shared programs: 10573.20 -> 10571.75 (-0.01%)
cycles in affected programs: 72.02 -> 70.56 (-2.02%)
helped: 28
HURT: 1

total fma in shared programs: 1590.47 -> 1582.61 (-0.49%)
fma in affected programs: 319.95 -> 312.09 (-2.46%)
helped: 194
HURT: 1

total cvt in shared programs: 812.98 -> 813.03 (<.01%)
cvt in affected programs: 118.53 -> 118.58 (0.04%)
helped: 65
HURT: 81

total quadwords in shared programs: 98968 -> 98840 (-0.13%)
quadwords in affected programs: 2960 -> 2832 (-4.32%)
helped: 20
HURT: 4

total threads in shared programs: 4693 -> 4697 (0.09%)
threads in affected programs: 4 -> 8 (100.00%)
helped: 4
HURT: 0

v2: Update trace checksums for virgl due to numerical differences.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18814>
2022-10-15 17:47:31 +00:00
Gert Wollny
2e50bf19cd nir: move fusing csel and comparisons to opt_late_algebraic
With that simple comparisons are cleaned up properly.

This helps with some tesselation shaders on r600.

Shader-db stats R600/Cayman:

--------------------------------------------------------------
total dw in shared programs: 1621806 -> 1620884 (-0.06%)
dw in affected programs: 41650 -> 40728 (-2.21%)
helped: 211
HURT: 4
helped stats (abs) min: 2 max: 26 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.30% max: 9.68% x̄: 2.87% x̃: 2.52%
HURT stats (abs)   min: 2 max: 8 x̄: 5.00 x̃: 5
HURT stats (rel)   min: 0.23% max: 1.67% x̄: 1.02% x̃: 1.09%
95% mean confidence interval for dw value: -4.81 -3.77
95% mean confidence interval for dw %-change: -3.03% -2.57%
Dw are helped.

total gprs in shared programs: 41192 -> 41182 (-0.02%)
gprs in affected programs: 731 -> 721 (-1.37%)
helped: 53
HURT: 45
helped stats (abs) min: 1 max: 3 x̄: 1.23 x̃: 1
helped stats (rel) min: 5.88% max: 40.00% x̄: 16.56% x̃: 14.29%
HURT stats (abs)   min: 1 max: 2 x̄: 1.22 x̃: 1
HURT stats (rel)   min: 7.69% max: 40.00% x̄: 19.42% x̃: 20.00%
95% mean confidence interval for gprs value: -0.37 0.16
95% mean confidence interval for gprs %-change: -3.92% 3.85%
Inconclusive result (value mean confidence interval includes 0).

total alu_groups in shared programs: 203677 -> 203632 (-0.02%)
alu_groups in affected programs: 2876 -> 2831 (-1.56%)
helped: 68
HURT: 30
helped stats (abs) min: 1 max: 4 x̄: 1.46 x̃: 1
helped stats (rel) min: 0.84% max: 25.00% x̄: 7.48% x̃: 5.41%
HURT stats (abs)   min: 1 max: 6 x̄: 1.80 x̃: 1
HURT stats (rel)   min: 1.98% max: 33.33% x̄: 10.09% x̃: 5.61%
95% mean confidence interval for alu_groups value: -0.81 -0.11
95% mean confidence interval for alu_groups %-change: -4.20% <.01%
Alu_groups are helped.

total loops in shared programs: 72 -> 72 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cf in shared programs: 88230 -> 88233 (<.01%)
cf in affected programs: 71 -> 74 (4.23%)
helped: 1
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.89% max: 33.33% x̄: 17.14% x̃: 16.67%
95% mean confidence interval for cf value: -0.51 1.71
95% mean confidence interval for cf %-change: -24.20% 38.29%
Inconclusive result (value mean confidence interval includes 0).

total stack in shared programs: 3827 -> 3827 (0.00%)
stack in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

Total CPU time (seconds): 45.32 -> 41.69 (-8.01%)
--------------------------------------------------------------

v2: Simplify replacement pattern (Rhys Perry)
v3: fix ws (Alexander Orzechowski)
v4: move the original lowering to opt_late_algebraic and
    drop cleanup code (Alyssa)
v5: Add shader-sb stats (Alyssa)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18970>
2022-10-14 13:08:15 +00:00
Lionel Landwerlin
eec49374b0 nir: fix NIR_DEBUG=validate_ssa_dominance
validate_ssa_def_dominance() asserts :

   validate_assert(state, !BITSET_TEST(state->ssa_defs_found, def->index));

Because the previous validation lefts bits set when it processed the
IR.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18966>
2022-10-14 10:36:56 +03:00
Alyssa Rosenzweig
f4b03ea6dc nir/lower_system_values: Fix cs_local_index_to_id with variable workgroups
In that case we need to use the sysval. That sysval can be optimized anyway in
the nonvariable case. Fixes test_basic.get_linear_ids on panfrost.

Fixes: 998d84fca5 ("nir/lower_system_values: Support lowering more intrinsics")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18662>
2022-10-13 21:25:23 +00:00
Georg Lehmann
00a8be3414 nir: Print nir_selection_control_divergent_always_taken.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:54 +00:00
Timur Kristóf
c0d0a7c176 nir: Add selection control enum for always taken divergent branches.
The new enum is called nir_selection_control_divergent_always_taken,
and it's almost the same as nir_selection_control_flatten.
The main difference between the two is that "flatten" represents
a choice made by the application but "divergent_always_taken" may
be applied by the compiler stack when it thinks this is beneficial.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:54 +00:00
Timur Kristóf
a2ec843727 nir: Document the flatten/dont_flatten selection control options.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>
2022-10-11 15:42:53 +00:00
Gert Wollny
9ebe893a61 nir_lower_to_source_mods: Don't sneek in an abs modifier from parent
If the abs source modifiers is not supported for the current
instruction because it is an instruction with three sources we
may still see a parent mov that has the `abs` modifier. In this
case we must not propagate that abs modifier from that parent
instructions.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7350

Fixes: cd73b6174b
    nir/lower_to_source_mods: Stop turning add, sat, and neg into mov

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18902>
2022-10-04 08:36:57 +02:00
Emma Anholt
594b638d4f nir/vars_to_ssa: Always do OOB load/store removal.
We elminated OOB loads while renaming vars to SSA.  However, if the OOB
load only appeared after some other passes had constant folded, there may
be no renaming work to do, at which point we'd leave the OOB load deref
around without renaming it or deleting it.  For vc4, this was quite a
surprise and caused a regression when we stop eliminating some OOB
accesses at the GLSL level.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18466>
2022-10-03 17:18:31 +00:00
Emma Anholt
6c38797101 nir/nir_opt_copy_prop_vars: Don't leak dynarray memory during the pass.
It was swept at the end, but it meant that in shaders with lots of copies
available at the start of lots of if statements, you'd blow up memory
usage.

turnip memory consumption on dEQP-VK.ssbo.layout.random.scalar.75 drops
from 1.4GB to 110MB, and runtime from 19s to 17s.

Fixes: #7361
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18891>
2022-10-03 15:33:21 +00:00
SoroushIMG
1e8e785a07 nir: allow to fine tune unrolling for loops with soft fp64 ops
Lowered fp64 ops can blow up the loop bodies while still being
suitable for unrolling.
Allow for using different parameters to unroll loops with
soft fp64.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18863>
2022-09-30 17:07:37 +00:00
SoroushIMG
121f30005f nir: track whether a loop contains soft fp64 ops
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18863>
2022-09-30 17:07:37 +00:00
Georg Lehmann
bfb12a3b6a nir/opt_algebraic: Optimize more (a cmp b ? a : b) to min/max.
Foz-DB Navi21:
Totals from 112 (0.08% of 134913) affected shaders:
CodeSize: 1618384 -> 1618172 (-0.01%); split: -0.06%, +0.04%
Instrs: 307695 -> 307535 (-0.05%); split: -0.05%, +0.00%
Latency: 3590228 -> 3589658 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 563692 -> 563447 (-0.04%); split: -0.05%, +0.01%
Copies: 24541 -> 24519 (-0.09%); split: -0.10%, +0.01%
Branches: 13480 -> 13468 (-0.09%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18548>
2022-09-30 11:10:52 +00:00
Kenneth Graunke
c9d399604e st/mesa: Optionally call nir_vectorize_tess_levels()
This lets us vectorize gl_TessLevel{Inner,Outer} writes, using a pass
developed for RADV.  Not all backends are prepared to handle this, so
we make it optional.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>
2022-09-27 18:17:47 -07:00
Mike Blumenkrantz
52edd8f764 nir/opt_undef: add a pass to clean up 64bit undefs
somehow 64bit lowering creates patterns like

vec1 64 ssa_1 = undefined
ssa_2 = unpack_64_2x32_split_x ssa_1

and then the 64bit value is never otherwise used. for this case, rewriting
the unpack to just be a 32bit undef allows the 64bit undef to be optimized
out, avoiding spec violations

fixes #6945

SoroushIMG <soroush.kashani@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18728>
2022-09-27 18:38:25 +00:00
Timothy Arceri
40c32dfbb1 nir/loop_analyze: remove cost of redundant selects
If we know that a select will be eliminated once the loop is
unrolled than we don't need to count the instruction towards the
cost of the loop.

This change helps 2 loops unroll in an xcom enemy unknown shader
that is loaded full of these redundant selects.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18587>
2022-09-27 00:31:47 +00:00
Timothy Arceri
13d0ae593b nir/loop_analyze: delay instruction cost calculation
Here we move the calculation of the instruction cost of the loop
after we have processed other information such as finding the
induction variables. This is useful because we can use this further
information to find instructions that will be eliminated if the
loop was to unroll and therefore give them a cost of 0.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18587>
2022-09-27 00:31:47 +00:00
Marcin Ślusarz
2a723f7a8d nir: use nir_shader_instructions_pass in nir_split_per_member_structs
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
67fe9ae5c3 nir: use nir_shader_instructions_pass in nir_split_var_copies
No functional changes.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
9dcff3ea53 nir: use nir_shader_instructions_pass in nir_lower_samplers
No functional changes.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
865d959090 nir: use nir_shader_instructions_pass in nir_lower_interpolation
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
6e0bcc1c4d nir: use nir_metadata_none instead of its value
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
dd51dedefd nir: use nir_shader_instructions_pass in nir_lower_frexp
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
a87070937d nir: use nir_shader_instructions_pass in nir_lower_fb_read
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
2e410c4c05 nir: use nir_shader_instructions_pass in nir_lower_drawpixels
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
00b0de5c83 nir: use nir_shader_instructions_pass in nir_lower_clip_halfz
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
c6e4641a21 nir: use nir_shader_instructions_pass in nir_lower_clip_disable
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
a1efa18dfe nir: use nir_shader_instructions_pass in nir_lower_clamp_color_outputs
Changes:
- removal of lower_state (not needed anymore)
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
  is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
c7078fe4e0 nir: use nir_shader_instructions_pass in nir_lower_64bit_phis
No functional changes.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
ea39efe9b8 nir: use nir_shader_instructions_pass in nir_lower_bool_to_int32
No functional changes.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
d28833d60f nir: use nir_shader_instructions_pass in nir_lower_bool_to_float
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
4a9e2dc1e9 nir: use nir_shader_instructions_pass in nir_lower_bool_to_bitsize
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00
Marcin Ślusarz
5beeb3c1db nir: use nir_shader_instructions_pass in nir_lower_alu
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress
- only metadata of the current function is invalidated (invalidation on
  one function was leaking to successive functions because "progress"
  was not reset)

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12282>
2022-09-26 11:13:03 +00:00