Commit graph

4525 commits

Author SHA1 Message Date
Rhys Perry
8649bde78f nir/opt_intrinsic: optimize quad vote
Optimizes a quadAll()/quadAny() pattern created by dxil-spirv:
7adc87d4de

dxil-spirv can't use clustered reductions because they are not guaranteed
to include helper invocations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23621>
2023-06-27 18:53:50 +00:00
Rhys Perry
58f8e0e2a0 nir,aco: add INCLUDE_HELPERS index to reduce intrinsic
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23621>
2023-06-27 18:53:50 +00:00
Rhys Perry
48674a1799 nir/peephole_select: allow some invocation broadcast intrinsics
fossil-db (navi21):
Totals from 3 (0.00% of 133428) affected shaders:
Instrs: 2074 -> 2083 (+0.43%)
CodeSize: 10596 -> 10692 (+0.91%)
Latency: 75754 -> 75946 (+0.25%)
InvThroughput: 16900 -> 16975 (+0.44%)
Copies: 312 -> 309 (-0.96%)
Branches: 150 -> 132 (-12.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23621>
2023-06-27 18:53:49 +00:00
Alyssa Rosenzweig
069cca9d66 treewide: Remove unused builders
-Wunused-variables kicks in now that it can see through the init.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig
173b9ee69a treewide: Use nir_builder_create more
perl -p0e 's/nir_builder_init\(&([^,]*), /\1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig
815efcdf7e nir: Use nir_builder_create
perl -p0e 's/nir_builder ([^;]*);\s*nir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig
e5410f9b00 nir: Add nir_builder_create returning nir_builder
More ergonomic.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Konstantin Seurer
ddb7cf7a25 nir/builder_opcodes: Remove nir_build_ prefixed helpers
This patch decreases the size of nir_builder_opcodes.h from 14292 loc to
13763 loc.

nir_build_ versions are still needed if the nir_ is a custom helper.
Intrinsics which need such a helper have to be added to
build_prefixed_intrinsics.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23858>
2023-06-27 17:37:54 +00:00
Konstantin Seurer
400645a565 nir: Use nir_ instead of nir_build_ helpers
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23858>
2023-06-27 17:37:54 +00:00
Alyssa Rosenzweig
c24b753378 nir/lower_blend: Optimize masked out RTs
While debugging KHR-GLES31.core.draw_buffers_indexed.color_masks, the noise from
piles of store_output(load_output) instructions got in the way. Optimize it out.

This does not fix the test, but if this case ever happened in a real app it
would improve performance. This is only load bearing on Asahi (and PanVK?),
since Panfrost wouldn't call nir_lower_blend at all in this case.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>
2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig
f318cab4a1 nir: Add lower_frag_coord_to_pixel_coord pass
We've open coded this in a few backends.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>
2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig
c7067660b2 nir: Add pixel_coord, frag_coord_zw intrinsics
On some architectures, gl_FragCoord.xy is available as an integer but
gl_FragCoord.zw requires interpolation. Add dedicated intrinsics so we can
lower it all in NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>
2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig
6689c678fe nir/lower_locals_to_regs: Add bool bitsize knob
GLSL booleans (and hence bool derefs) may be translated either as 1-bit or
32-bit NIR registers, depending whether the backend uses nir_lower_bool_to_int32
or not. Add a knob for this and choose the right type for different backends.

Fixes nir_validate failure on
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_bvec3 run under
lavapipe. That test indexes into a bvec3 array, and gallivm first lowers bools
and then lowers derefs to registers, resulting in random 1-bit booleans mixed in
with 32-bit bools.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
2023-06-26 08:22:06 -04:00
Alyssa Rosenzweig
5c8f21412f nir/lower_bool_to_int32: Fix progress reporting
If we only lower parameters, that's still progress. Technically.

Fixes: 6a29cb2654 ("nir/lower_bool_to_int32: add support for lowering functions.")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
2023-06-26 08:22:03 -04:00
Yonggang Luo
5b29463746 nir: Add function nir_function_set_impl
This function is added for create strong relationship between
nir_function_impl and nir_function.

So that nir_function->impl->function == nir_function is always true when
(nir_function->impl != NULL && nir_function->impl != NIR_SERIALIZE_FUNC_HAS_IMPL)

And indeed this invariant is already done in functions validate_function and validate_function_impl
of nir_validate

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>
2023-06-24 14:48:47 +00:00
Alyssa Rosenzweig
942c206cd1 nir: Add discard_agx intrinsic
sample_mask_agx corresponds directly to the hardware's 2-source instruction, but
it's hard to use correctly and even harder to legalize after the fact, since
it's responsible for not only discard but also late depth/stencil testing. For
our various high-level lowering passes, it's easier to use a one-source discard
(where we don't have to worry about sample masks), which the compiler will
internally lower to the two-source instruction. Introduce such an instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
2023-06-23 17:37:41 +00:00
Diederik de Haas
231fa269ea treewide: spelling fixes
Debian's lintian tool flagged some spelling issues:
assumtion -> assumption
unkown -> unknown
memeber -> member
sucess -> success
perfomance -> performance

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23618>
2023-06-23 12:20:59 +00:00
Faith Ekstrand
f278b30e94 nir/opt_if: Use block_ends_in_jump
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23782>
2023-06-22 19:55:49 +00:00
Alyssa Rosenzweig
7ddfc43fdf nir: Remove integer and 64-bit modifiers
Now that Intel and R600 both do their own modifier propagation, the only
backends that still lower modifiers in NIR are:

* nir-to-tgsi
* lima
* etnaviv
* a2xx

The latter 3 backends do not support integers, and certainly do not support
fp64. So they don't use these.

TGSI in theory supports integer negate modifiers but NTT doesn't use them, so
they're unused there too.

Since they're unused, we remove NIR support for integer and 64-bit modifiers,
leaving only 16/32-bit float modifiers. This will reduce the scope needed for a
replacement to NIR modifiers, being pursued in !23089.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23782>
2023-06-22 19:55:49 +00:00
Pavel Ondračka
b4ca45911d nir_opt_algebraic: don't use i32csel without native integer support
Otherwise nir_lower_int_to_float (or specifically nir_gather_ssa_types)
will fail to recognize we already have float constants and converts them
again.

Example from spec/glsl-1.10/execution/vs-loop-array-index-unroll.shader_test
with r300 driver (after enabling has_fused_comp_and_csel).

impl main {
        block block_0:
        /* preds: */
        vec1 32 ssa_0 = load_const (0x00000000 = 0.000000)
        vec4 32 ssa_1 = intrinsic load_input (ssa_0) (base=0, component=0, dest_type=float32, io location=VERT_ATTRIB_POS slots=1)      /* gl_Vertex */
        vec3 32 ssa_2 = load_const (0x00000000, 0x3e800000, 0x3f800000) = (0.000000, 0.250000, 1.000000)
        vec3 32 ssa_3 = load_const (0x00000000, 0x3f000000, 0x3f800000) = (0.000000, 0.500000, 1.000000)
        vec3 32 ssa_4 = load_const (0x00000000, 0x3f400000, 0x3f800000) = (0.000000, 0.750000, 1.000000)
        vec2 32 ssa_5 = load_const (0x00000000, 0x3f800000) = (0.000000, 1.000000)
        vec1 32 ssa_6 = load_const (0x3f800000 = 1.000000)
        vec1 32 ssa_7 = intrinsic load_ubo_vec4 (ssa_0, ssa_0) (access=0, base=0, component=0)
        vec4 32 ssa_8 = load_const (0x00000000, 0x00000001, 0x00000002, 0x00000003) = (0.000000, 0.000000, 0.000000, 0.000000)
        vec4  1 ssa_9 = ilt ssa_8, ssa_7.xxxx
        vec3 32 ssa_10 = bcsel ssa_9.www, ssa_5.xyy, ssa_4
        vec3 32 ssa_11 = bcsel ssa_9.zzz, ssa_10, ssa_3
        vec3 32 ssa_12 = bcsel ssa_9.yyy, ssa_11, ssa_2
        vec3 32 ssa_15 = i32csel_gt ssa_7.xxx, ssa_12, ssa_6.xxx
        vec4 32 ssa_14 = fsat ssa_15.xyxz
        intrinsic store_output (ssa_14, ssa_0) (base=1, wrmask=xyzw, component=0, src_type=float32, io location=VARYING_SLOT_COL0 slots=1, xfb(), xfb2())       /* gl_FrontColor */
        intrinsic store_output (ssa_1, ssa_0) (base=0, wrmask=xyzw, component=0, src_type=float32, io location=VARYING_SLOT_POS slots=1, xfb(), xfb2()) /* gl_Position */
        /* succs: block_1 */
        block block_1:
}

and after nir_lower_int_to_float

impl main {
        block block_0:
        /* preds: */
        vec1 32 ssa_0 = load_const (0x00000000 = 0.000000)
        vec4 32 ssa_1 = intrinsic load_input (ssa_0) (base=0, component=0, dest_type=float32, io location=VERT_ATTRIB_POS slots=1)      /* gl_Vertex */
        vec3 32 ssa_2 = load_const (0x00000000, 0x4e7a0000, 0x4e7e0000) = (0.000000, 1048576000.000000, 1065353216.000000)
        vec3 32 ssa_3 = load_const (0x00000000, 0x4e7c0000, 0x4e7e0000) = (0.000000, 1056964608.000000, 1065353216.000000)
        vec3 32 ssa_4 = load_const (0x00000000, 0x4e7d0000, 0x4e7e0000) = (0.000000, 1061158912.000000, 1065353216.000000)
        vec2 32 ssa_5 = load_const (0x00000000, 0x4e7e0000) = (0.000000, 1065353216.000000)
        vec1 32 ssa_6 = load_const (0x4e7e0000 = 1065353216.000000)
        vec1 32 ssa_7 = intrinsic load_ubo_vec4 (ssa_0, ssa_0) (access=0, base=0, component=0)
        vec4 32 ssa_8 = load_const (0x00000000, 0x3f800000, 0x40000000, 0x40400000) = (0.000000, 1.000000, 2.000000, 3.000000)
        vec4  1 ssa_9 = flt ssa_8, ssa_7.xxxx
        vec3 32 ssa_10 = bcsel ssa_9.www, ssa_5.xyy, ssa_4
        vec3 32 ssa_11 = bcsel ssa_9.zzz, ssa_10, ssa_3
        vec3 32 ssa_12 = bcsel ssa_9.yyy, ssa_11, ssa_2
        vec3 32 ssa_13 = fcsel_gt ssa_7.xxx, ssa_12, ssa_6.xxx
        vec4 32 ssa_14 = fsat ssa_13.xyxz
        intrinsic store_output (ssa_14, ssa_0) (base=1, wrmask=xyzw, component=0, src_type=float32, io location=VARYING_SLOT_COL0 slots=1, xfb(), xfb2())       /* gl_FrontColor */
        intrinsic store_output (ssa_1, ssa_0) (base=0, wrmask=xyzw, component=0, src_type=float32, io location=VARYING_SLOT_POS slots=1, xfb(), xfb2()) /* gl_Position */
        /* succs: block_1 */
        block block_1:
}

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23704>
2023-06-22 07:25:44 +00:00
Mike Blumenkrantz
402ae3b132 nir/lower_tex: ignore saturate for txf ops
saturate is used for GL_CLAMP emulation, and GL_CLAMP cannot be used
with txf

ref #9226

cc: mesa-stable

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23750>
2023-06-21 23:13:50 +00:00
Michel Zou
badb85edb8 util: reinstate ENUM_PACKED
gets rid of warning: 'gcc_struct' attribute ignored [-Wattributes] introduced by !23338

Fixes: 86532fa21d ("util: Use the gcc_struct attribute for packed structures in mingw")
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23478>
2023-06-21 21:51:59 +00:00
Caio Oliveira
af9be8c024 nir/print: Print whether the shader is internal or not
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23756>
2023-06-21 00:01:10 +00:00
Caio Oliveira
59a72570b6 compiler: Move spirv into a module of its own
For historical reasons, nir and vtn were compiled together,
and a bunch of vtn specific targets were defined in
src/compiler/meson.build.

Now that we can, make src/compiler/spirv produce an internal
library that depends on NIR, and is used by the drivers/tools.
Also move the vtn specific targets into that directory's
meson.build.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23668>
2023-06-20 16:18:08 +00:00
Caio Oliveira
cb588d5d6e compiler/clc: Move related NIR passes to the common mesa clc
These were historically in the spirv+nir combo, but the common mesa clc
is a better home for them.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Nora Allen <blackcatgames@protonmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23667>
2023-06-20 03:43:41 +00:00
Caio Oliveira
1f3869ed4e nir/print: Use mesa_scope_name() function to print scopes
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>
2023-06-19 23:29:26 +00:00
Caio Oliveira
59cc77f0fa compiler: Move from nir_scope to mesa_scope
Just moving the enum and performing renames, no behavior change.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>
2023-06-19 23:29:26 +00:00
Iago Toral Quiroga
0de89e10ba nir/lower_tex: handle lower_tg4_offsets with lower_tg4_broadcom_swizzle
This pass uses a safe iterator so it can't lower new instructions that are
injected as part of the lowering, which is exactly what lower_tg4_offsets
does, and if lower_tg4_broadcom_swizzle is also set then we need to lower
these new instructions. Handle this by running the pass twice when both
are set: the first pass will only handle lower_tg4_offsets and
the second pass (which will see the new tg4 instructions produced with
lower_tg4_offsets) will process the remaining options, including
lower_tg4_broadcom_swizzle.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23616>
2023-06-19 08:13:06 +00:00
Iago Toral Quiroga
65353814a3 nir/lower_tex: copy missing fields when creating copy of tex instruction
This is missing both texture and sampler indices.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23616>
2023-06-19 08:13:06 +00:00
Alyssa Rosenzweig
fc3bf53a65 nir/builder: Add ubitfield_extract_imm helper
We have a ubfe_imm helper that creates ubfe ops. Not all drivers support ubfe,
however, as it requires SM5 semantics. A few drivers support oly
ubitfield_extract. They should still get the convenience of an _imm helper, so
add a symmetric helper.

It might be nice to unify these helpers into a single helper that asserts its
inputs do not overflow (such that the two ops become equivalent) and emits
either ubfe or ubitfield_extract depending on the underlying driver. That is
left for future work as it's unclear exactly what naming/semantics we want.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23351>
2023-06-15 13:08:41 -04:00
Erik Faye-Lund
3a64e3425f nir: add and use nir_imod_imm
Just a short-hand, really. Makes the code a bit easier to read.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:49 +00:00
Erik Faye-Lund
e1f4c79288 nir: add and use nir_fdiv_imm
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:49 +00:00
Erik Faye-Lund
590e191e77 nir: use nir_imm_{true,false}
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:48 +00:00
Erik Faye-Lund
9e5cd02fae nir: isub -> iadd_imm
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:48 +00:00
Erik Faye-Lund
8b03a54bcd nir: use more imm-helpers
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23461>
2023-06-15 13:34:48 +00:00
Erik Faye-Lund
2a71e332aa nir: use new immediate comparison helpers
There's plenty of places we can use these new and shiny helpers, so
let's clean up the code a bit.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460>
2023-06-15 13:33:58 +02:00
Erik Faye-Lund
f7bf0c774f nir: add nir_[fui]gt_imm and nir_[fui]le_imm helpers
These are similar to the nir_{cmp}_imm variants we already have, except
they negate the condition (apart from equality) and flip the arguments.
The reason we need this, is that we don't have all comparison directions
that would be required to always pass the immediate in the second
argument.

This allows us to create any comparison with an immediate without
having to manually create the immediate value.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460>
2023-06-15 13:33:58 +02:00
Jesse Natalie
83f741124b nir_lower_returns: Mark assert-only var as ASSERTED
Fixes: 5d238c0c ("nir_lower_returns: Optimize phis before beginning the pass")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23634>
2023-06-15 03:09:29 +00:00
Ian Romanick
de60b463d7 nir/algebraic: Simplify various trivial bfi
These are mostly just obvious patterns that somebody will eventually
want to add.

DG2, Tiger Lake, Ice Lake, Skylake, Broadwell, and Haswell had similar
results (Ice Lake shown)
total instructions in shared programs: 20570033 -> 20570026 (<.01%)
instructions in affected programs: 7363 -> 7356 (-0.10%)
helped: 6 / HURT: 0

total cycles in shared programs: 902118781 -> 902118854 (<.01%)
cycles in affected programs: 419132 -> 419205 (0.02%)
helped: 4 / HURT: 2

DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819500 -> 152819380 (-0.00%)
Cycles: 15014627187 -> 15014624437 (-0.00%)

Totals from 115 (0.02% of 662497) affected shaders:
Instrs: 28963 -> 28843 (-0.41%)
Cycles: 404582 -> 401832 (-0.68%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
541e7eb389 nir/algebraic: Optimize some u2f of bfi
v2: Fix a copy-and-paste bug s/('find_lsb', a)/a/ in the patterns. See
piglit!819.

DG2, Tiger Lake, Ice Lake, Skylake, and Broadwell had similar results (Ice Lake shown)
total instructions in shared programs: 20570063 -> 20570033 (<.01%)
instructions in affected programs: 452 -> 422 (-6.64%)
helped: 30 / HURT: 0

total cycles in shared programs: 902118723 -> 902118781 (<.01%)
cycles in affected programs: 1762 -> 1820 (3.29%)
helped: 0 / HURT: 29

DG2, Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
Totals:
Instrs: 152819969 -> 152819500 (-0.00%)
Cycles: 15014628652 -> 15014627187 (-0.00%); split: -0.00%, +0.00%

Totals from 469 (0.07% of 662497) affected shaders:
Instrs: 7644 -> 7175 (-6.14%)
Cycles: 31787 -> 30322 (-4.61%); split: -4.90%, +0.29%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
6603948a7a nir/algebraic: Lower some bfi with two constant sources
All Haswell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19907054 -> 19906882 (<.01%)
instructions in affected programs: 8103 -> 7931 (-2.12%)
helped: 52 / HURT: 0

total cycles in shared programs: 855779334 -> 855781791 (<.01%)
cycles in affected programs: 724201 -> 726658 (0.34%)
helped: 38 / HURT: 7

total sends in shared programs: 1039308 -> 1039302 (<.01%)
sends in affected programs: 162 -> 156 (-3.70%)
helped: 2 / HURT: 0

No shader-db changes on any older Intel platforms.

All Intel platforms had similar restuls. (Ice Lake shown)
Totals:
Instrs: 153117340 -> 152825222 (-0.19%); split: -0.19%, +0.00%
Cycles: 15011904351 -> 15014072944 (+0.01%); split: -0.04%, +0.05%
Send messages: 7711509 -> 7711421 (-0.00%)
Spill count: 100745 -> 99907 (-0.83%); split: -0.85%, +0.02%
Fill count: 203684 -> 202459 (-0.60%); split: -0.62%, +0.02%
Scratch Memory Size: 4403200 -> 4376576 (-0.60%)

Totals from 18603 (2.81% of 662496) affected shaders:
Instrs: 5258303 -> 4966185 (-5.56%); split: -5.56%, +0.00%
Cycles: 447391388 -> 449559981 (+0.48%); split: -1.29%, +1.77%
Send messages: 559231 -> 559143 (-0.02%)
Spill count: 5009 -> 4171 (-16.73%); split: -17.17%, +0.44%
Fill count: 8769 -> 7544 (-13.97%); split: -14.33%, +0.36%
Scratch Memory Size: 194560 -> 167936 (-13.68%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Ian Romanick
83bd87c558 nir: Add optimization pass to reassociate some bfi instructions
The needs of this pass are ever so slightly more than what
nir_opt_algebraic can do. :( Specifically, it needs to be able to look
at the relationship of constant values used in an expression tree.

v2: Add nir_mov_alu to handle swizzles on the original sources.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19968>
2023-06-14 18:49:53 +00:00
Lionel Landwerlin
4ee1a8bb9c nir: add a load_global_constant uniform intel variant
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
5ae8a78d8c intel/fs: make use of load_ubo_uniform_block_intel
The principle is the same as the load_ssbo_uniform_block_intel.
Whenever we see a uniform offset, load the data only once in GRFs to
reduce register pressure.

Iris shader-db run on DG2 :

total instructions in shared programs: 23001325 -> 23094969 (0.41%)
instructions in affected programs: 1775989 -> 1869633 (5.27%)
helped: 764
HURT: 2097
helped stats (abs) min: 1 max: 102 x̄: 6.96 x̃: 2
helped stats (rel) min: 0.03% max: 16.91% x̄: 1.36% x̃: 0.63%
HURT stats (abs)   min: 1 max: 2461 x̄: 47.19 x̃: 7
HURT stats (rel)   min: <.01% max: 199.34% x̄: 5.91% x̃: 2.60%
95% mean confidence interval for instructions value: 25.43 40.03
95% mean confidence interval for instructions %-change: 3.60% 4.33%
Instructions are HURT.

total loops in shared programs: 5847 -> 5847 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 839329852 -> 845491482 (0.73%)
cycles in affected programs: 130229434 -> 136391064 (4.73%)
helped: 1098
HURT: 2228
helped stats (abs) min: 1 max: 130102 x̄: 1340.64 x̃: 22
helped stats (rel) min: <.01% max: 64.25% x̄: 4.03% x̃: 0.71%
HURT stats (abs)   min: 1 max: 185309 x̄: 3426.24 x̃: 87
HURT stats (rel)   min: <.01% max: 92.85% x̄: 8.12% x̃: 3.82%
95% mean confidence interval for cycles value: 1342.16 2362.97
95% mean confidence interval for cycles %-change: 3.70% 4.52%
Cycles are HURT.

total spills in shared programs: 10768 -> 11856 (10.10%)
spills in affected programs: 9717 -> 10805 (11.20%)
helped: 25
HURT: 28

total fills in shared programs: 13720 -> 16258 (18.50%)
fills in affected programs: 12016 -> 14554 (21.12%)
helped: 25
HURT: 28

total sends in shared programs: 1034790 -> 1031266 (-0.34%)
sends in affected programs: 33416 -> 29892 (-10.55%)
helped: 1005
HURT: 0
helped stats (abs) min: 1 max: 22 x̄: 3.51 x̃: 3
helped stats (rel) min: 1.69% max: 60.00% x̄: 15.20% x̃: 14.08%
95% mean confidence interval for sends value: -3.72 -3.29
95% mean confidence interval for sends %-change: -15.82% -14.57%
Sends are helped.

LOST:   26
GAINED: 183

shader-db on a number of VK/DX titles on DG2 :

 PERCENTAGE DELTAS  Shaders   Instrs    Cycles
 age_of_wonders_III 1928      +0.02%    -0.19%

 PERCENTAGE DELTAS       Shaders   Instrs    Cycles  Subgroup size Send messages Spill count Fill count Max live registers Max dispatch width
 assassins_creed_odyssey 2119      +1.12%    -0.42%      -0.03%        -0.29%       -9.10%     -4.26%         -0.64%             +0.65%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Spill count Fill count Max live registers
 aztec_ruins_high  269       -0.05%    -0.45%     -0.29%     -7.27%         -0.33%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Max live registers Max dispatch width
 dark_souls_3_dxvk_g2 1420      +0.09%    +0.24%        +0.21%             +0.12%

(stats look bad, but it's just one shader affected)
 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Spill count Fill count Scratch Memory Size Max live registers
 fallout_4_dxvk_g2 1638      +0.67%    +8.32%    +16.02%     +7.17%         +100.00%            +0.48%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Send messages Spill count Fill count Max live registers Max dispatch width
 red_dead_redemption2 5969      +0.16%    -0.04%      -0.04%       +0.01%     +0.05%         -0.20%             +0.04%

 PERCENTAGE DELTAS          Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 rise_of_the_tomb_raider_g2 12129     +2.19%    +1.36%      -1.23%          -0.36%             +2.04%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers
 shooter-game      693       +0.07%    -0.89%      -0.09%          -0.09%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 talos_g2          1140      +0.37%    +3.80%      -0.86%          -0.67%             +0.19%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Max live registers Max dispatch width
 total_war_warhammer2 477       +0.25%    +0.66%        -0.17%             +0.10%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Send messages Max live registers Max dispatch width
 witcher_3_dxvk_g2 1074      +0.75%   -10.45%      -0.15%          -0.16%             -0.16%

 PERCENTAGE DELTAS      Shaders   Instrs    Cycles  Send messages Max live registers
 wolfenstein_youngblood 1111      +0.52%    +0.66%      -0.59%          -0.03%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Lionel Landwerlin
4a23a5a904 nir: add a new ubo uniform loading intrinsic for intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>
2023-06-14 12:04:05 +00:00
Alyssa Rosenzweig
12eb23530b nir: Remove non-scoped barriers
Nothing uses them anymore.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:11 +00:00
Alyssa Rosenzweig
df51464cac nir: Remove handling for non-scoped barriers
Nothing generates them so this is all dead.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:11 +00:00
Alyssa Rosenzweig
c7232be537 nir/tests: Use scoped barriers internally
Test what drivers actually use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
1d4a59448c treewide: Remove use_scoped_barrier
It is now set by all relevant drivers and not checked anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Alyssa Rosenzweig
7173cbccbf nir: Assume use_scoped_barrier
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00