Commit graph

3588 commits

Author SHA1 Message Date
Qiang Yu
9401990e6f nir/linker: set varying from uniform as flat
Flat varying can save some rasterization compute cost
and register needed by shader.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15341>
2022-03-22 01:33:23 +00:00
Qiang Yu
2617e6c028 nir/linker: disable varying from uniform lowering by default
This fixes performance regression for Specviewperf/Energy
on AMD GPU. Other GPUs passing varying by memory may choose
to re-enable it as need.

Fixes: 2604625043 ("nir/linker: support uniform when optimizing varying")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15341>
2022-03-22 01:33:23 +00:00
Karol Herbst
43c3f4386b nir: fix nir_sweep for printf
I hit a memory corruption trying to implement printf for Rusticl

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15474>
2022-03-21 13:23:04 +00:00
Jason Ekstrand
e9ff6f4f06 nir/print: Add support for generic pointers
The way they're handled is that deref->modes is treated as a bitfield of
possible modes.  Variables are required to have a specific mode and
derefs with deref_type_var are as well.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Mike Blumenkrantz
cdcfcb7916 nir/lower_is_helper_invocation: create load_helper_invocation instr with bitsize=1
the specification stipulates that this is a bool value, so don't load it as an int
or else nir_validate explodes

Fixes: f17b41ab4f ("nir: add lowering pass for helperInvocationEXT()")

Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15402>
2022-03-21 03:20:33 +00:00
Connor Abbott
acba08b58f ir3: Implement and document ldc.k
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
ccc64b7e00 ir3: Plumb through store_uniform_ir3 intrinsic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
3244e659e0 ir3: Implement basic shader preamble intrinsics
These will be used to implement the ir3-specific shader preamble
lowering in NIR. shps is conceptually similar to getone (although it
technically can't be duplicated) and shpe is similar to other barriers,
since it has to happen after any stores to the constant file in the
preamble. Add NIR intrinsics and plumbs them through ir3.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
3b96ad70ee nir: Add a preamble optimization pass
This pass tries to move computations that are uniform for the entire
draw to the preamble. There's also an API for backends to insert their
own instructions into the preamble, for porting existing UBO pushing
passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
31221ee556 nir: Add a "deep" instruction clone
For the shader preamble, we need to add support for cloning one
instruction at a time into the preamble, but we also need to rewrite
sources.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
d1b017d479 nir: Add preamble functions
These are functions that run before the entrypoint at least once per
draw and write their results via store_preamble, and then are loaded in
the rest of the shader via load_preamble.

We will add users in the following commits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Iago Toral Quiroga
fed51585c4 nir/schedule: allow drivers to decide about instruction latency
On V3D reading UBOs from uniform addresses uses a more efficient
mechanism with lower latency. On other platforms there may be
simular scenarios.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga
e7a4e97076 nir/schedule: use larger delay for non-filtered memory reads
This has been pending for a long time. It is not very consistent to
add a significant delay for textures and not do it for UBOs, etc
The reason we have not been doing this so far is the accumulated effect
on register pressure for V3D as shown by shader-db results below, but
from the point of view of a generic scheduler it makes sense to do this.

Later patches will address V3D specific issues with register pressure
derived from this by letting the driver control its instruction delay
settings.

total instructions in shared programs: 12662138 -> 13126587 (3.67%)
instructions in affected programs: 1813091 -> 2277540 (25.62%)
helped: 2410
HURT: 10499

total threads in shared programs: 415858 -> 407208 (-2.08%)
threads in affected programs: 17348 -> 8698 (-49.86%)
helped: 8
HURT: 4333

total uniforms in shared programs: 3711483 -> 3812698 (2.73%)
uniforms in affected programs: 128012 -> 229227 (79.07%)
helped: 3474
HURT: 2143

total max-temps in shared programs: 2138763 -> 2318430 (8.40%)
max-temps in affected programs: 318780 -> 498447 (56.36%)
helped: 588
HURT: 11997

total spills in shared programs: 3860 -> 49086 (1171.66%)
spills in affected programs: 709 -> 45935 (6378.84%)
helped: 23
HURT: 1595

total fills in shared programs: 5573 -> 55810 (901.44%)
fills in affected programs: 1067 -> 51304 (4708.25%)
helped: 23
HURT: 1595

LOST:   3
GAINED: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga
3bd041e2fb nir/schedule: handle nir_intrinsic_group_memory_barrier
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga
46e330c07e nir/schedule: fix handling of generic memory barrier
We can get a generic nir_intrinsic_memory_barrier to represent a
barrier involving multiple semantics (instead of getting individual
specific barriers for each semantic). This means that we need to
consider these as potentially affecting shared memory access as well.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Mike Blumenkrantz
0d80aed363 nir/gather_info: check copy_deref instrs for writing outputs
this is a valid way to write an output even though it usually gets rewritten
to some other instruction later on

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Timur Kristóf
4b99b528f5 nir: Introduce workgroup_index and ability to lower workgroup_id to it.
The workgroup_index is intended for situations when a 3 dimensional
workgroup_id is not available on the HW, but a 1 dimensional index is.
In this case, we can use lower the 3D ID to use this.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
2022-03-08 17:36:31 +00:00
Timur Kristóf
6a4c01f3ef nir: Extract lower_id_to_index into a separate function.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
2022-03-08 17:36:31 +00:00
Timur Kristóf
64acec0ef9 nir: Fix lowering terminology of compute system values: "from"->"to".
This is to match other NIR terminology.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
2022-03-08 17:36:31 +00:00
Timur Kristóf
5b9bf3434f nir: Fix handling of NV_mesh_shader PRIMITIVE_INDICES output.
PRIMITIVE_INDICES is a flat array in NV_mesh_shader,
not a proper arrayed output, as opposed to D3D-style
mesh shaders where it's addressed by the primitive index.

Prevent assigning several slots to primitive indices,
to avoid issues.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15160>
2022-03-08 13:44:10 +00:00
Georg Lehmann
6731460194 nir: Fix source type for fragment_fetch_amd.
Like txf_ms, these take integers not floats.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15242>
2022-03-07 12:21:12 +00:00
Samuel Pitoiset
6532307555 nir: introduce nir_pack_{sint,uint}_2x16 instructions
These instructions have AMD hardware equivalent and they will be used
to lower fragment shader outputs in NIR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
2022-03-04 08:06:56 +00:00
Daniel Schürmann
ca4595e01a nir/opt_shrink_vectors: update docstring
in order to reflect the various recent improvements.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Daniel Schürmann
405829cd85 nir/opt_shrink_vectors: remove duplicate components from vecN
vecN instructions which are only used by other ALU
will now get duplicate channels removed.

i915g:
total instructions in shared programs: 396309 -> 396294 (<.01%)
instructions in affected programs: 186 -> 171 (-8.06%)

r300:
total instructions in shared programs: 1165059 -> 1164354 (-0.06%)
instructions in affected programs: 35884 -> 35179 (-1.96%)
total temps in shared programs: 165497 -> 165326 (-0.10%)
temps in affected programs: 2990 -> 2819 (-5.72%)

softpipe:
total instructions in shared programs: 2860028 -> 2859084 (-0.03%)
instructions in affected programs: 55539 -> 54595 (-1.70%)
total temps in shared programs: 516939 -> 516546 (-0.08%)
temps in affected programs: 6623 -> 6230 (-5.93%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Daniel Schürmann
e5963478c2 nir/opt_shrink_vectors: shrink load_const properly
This patch enables removal of arbitrary channels in
load_const instructions, if they are either unused or
duplicates of other channels and only used by ALU.

Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3)
VGPRs: 21832 -> 21544 (-1.32%)
CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01%
Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00%
Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15%
InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01%
VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09%
SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17%
Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02%
PreSGPRs: 20439 -> 20437 (-0.01%)
PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32%

i915g:
total instructions in shared programs: 396471 -> 396309 (-0.04%)
instructions in affected programs: 6408 -> 6246 (-2.53%)
total const in shared programs: 56458 -> 56422 (-0.06%)
const in affected programs: 407 -> 371 (-8.85%)
LOST:   shaders/closed/steam/trine-2/fp-3.shader_test FS

r300:
total instructions in shared programs: 1164421 -> 1165059 (0.05%)
instructions in affected programs: 143981 -> 144619 (0.44%)
total temps in shared programs: 165488 -> 165497 (<.01%)
temps in affected programs: 318 -> 327 (2.83%)
total consts in shared programs: 922140 -> 921952 (-0.02%)
consts in affected programs: 12438 -> 12250 (-1.51%)

softpipe:
total instructions in shared programs: 2859978 -> 2860028 (<.01%)
instructions in affected programs: 183355 -> 183405 (0.03%)
total temps in shared programs: 517071 -> 516939 (-0.03%)
temps in affected programs: 1416 -> 1284 (-9.32%)
total imm in shared programs: 103601 -> 102767 (-0.81%)
imm in affected programs: 3928 -> 3094 (-21.23%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Ilia Mirkin
8ed07c0da9 nir: remove bogus logic to allow cube + offset to work
This was done for an a4xx hack which is now removed. No API allows cube
texturing to have offsets.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Ian Romanick
06eb9fb125 nir/algebraic: Optimize some cases of (sXX(a, b) != 0.0)
I noticed the SGE case while looking at the output of
shaders/closed/steam/trine-2/fp-3.shader_test on i915g.  These are
especially bad on i915 that needs two instructions to implement SNE.

An alternative would be to duplicate the sne(sXX(a, b), 0.0) rules in an
algebraic pass that occurs after bool_to_float.  Doing the work earlier
seems preferable.

i915
total instructions in shared programs: 788274 -> 788223 (<.01%)
instructions in affected programs: 666 -> 615 (-7.66%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 12 x̄: 10.20 x̃: 9
helped stats (rel) min: 5.00% max: 11.11% x̄: 8.12% x̃: 8.16%
95% mean confidence interval for instructions value: -12.24 -8.16
95% mean confidence interval for instructions %-change: -10.81% -5.43%
Instructions are helped.

LOST:   0
GAINED: 2

The two gained shaders are assembly fragment programs in Euro Truck
Simulator 2.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
2022-03-03 00:07:58 +00:00
Emma Anholt
d506d910e4 nir: Switch to using nir_vec_scalars() for things that used nir_channel().
This should reduce follow-on optimization work to copy-propagate and
dead-code away the movs generated in construction of vectors.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Emma Anholt
16c064dfaf nir: Add a helper for setting up a nir_ssa_scalar struct.
Trivial, but will help users avoid some struct constructions that can be
awkward in C.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Emma Anholt
d95f9d189a nir: Introduce a nir_vec_scalars() helper using nir_ssa_scalar.
Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs
from other vectors to put the new vector together, which then just have to
get copy-propagated into the ALU srcs and DCEed away the temporary movs.
If they instead take nir_ssa_scalar, we can avoid that extra work.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Marek Olšák
1dcd1eac6a nir: pass nir_shader into nir_recompute_io_bases instead of func_impl
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
606811bded nir: add nir_print_xfb_info
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
ad68a1ee5a nir: add nir_gather_xfb_info_from_intrinsics for lowered IO
Drivers will use this.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
d4c051b047 nir: add nir_lower_io_passes() with new transform feedback
moved from radeonsi without the vectorization, which won't be needed for
now. We will lower IO in st/mesa instead of radeonsi to get the transform
feedback info into store instructions.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
3528dcdfa1 nir: add nir_io_semantics::no_varying, no_sysval_output, and helpers
This is for drivers that have separate store instructions for varyings,
system value outputs (such as clip distances), and transform feedback.
The flags tell the driver not to store the output to those locations.

This will be used by radeonsi initially, and then maybe by a new linker.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
548b2d47b2 nir: scalarize transform feedback info in nir_lower_io_to_scalar
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
4636fa7f38 nir: add transform feedback info into nir_intrinsic_store_output
This will allow compaction of transform feedback varyings because they
are no longer tied to varying slots with this information.
It will also make transform feedback info available to all NIR passes
after IO is lowered. It's meant to replace pipe_stream_output_info.

Other intrinsics are not used with transform feedback.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
2c6e41bfe1 nir: fix nir_io_semantics::gs_streams in nir_lower_io_to_scalar
gs_streams is relative to the component. Also clear the high bits.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák
73ef225fc2 nir: validate write_mask for all intrinsics that have it
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Mike Blumenkrantz
b28cff9f4a nir/lower_psiz_mov: stop clobbering existing exports
for this pass to work with xfb, the original value in the shader must be
preserved when xfb is active, and the driver must export only the newly
created output

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
2022-02-28 15:42:19 +00:00
Mike Blumenkrantz
3267417c22 nir/lower_psiz: create the store instruction more accurately
creating this at the start of the shader means it will get optimized out
when the pass is used to overwrite existing psiz values, and creating it
at the end means it will get optimized out in geometry shaders, so instead
just walk the instructions and create another store right after the existing one

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
2022-02-28 15:42:19 +00:00
Emma Anholt
b1f349dff4 nir: Allow the _replicates opcodes to have num_components != 4.
This required relaxing a core NIR assertion which I don't think is doing
any important validation.

The shader-db effects here are small, but they're important for avoiding a
regression when we start doing per-component DCE in opt_shrink_vectors
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468)

softpipe shader-db:
total instructions in shared programs: 2859777 -> 2859454 (-0.01%)
instructions in affected programs: 18881 -> 18558 (-1.71%)
total temps in shared programs: 293994 -> 293914 (-0.03%)
temps in affected programs: 418 -> 338 (-19.14%)

i915g:
total instructions in shared programs: 407562 -> 407544 (<.01%)
instructions in affected programs: 570 -> 552 (-3.16%)

r300:
total instructions in shared programs: 1414450 -> 1414459 (<.01%)
instructions in affected programs: 44494 -> 44503 (0.02%)
total vinst in shared programs: 473782 -> 473727 (-0.01%)
vinst in affected programs: 1102 -> 1047 (-4.99%)
total sinst in shared programs: 231224 -> 231216 (<.01%)
sinst in affected programs: 432 -> 424 (-1.85%)
total temps in shared programs: 197605 -> 197607 (<.01%)
temps in affected programs: 103 -> 105 (1.94%)

crocus hsw:
total instructions in shared programs: 8158185 -> 8158134 (<.01%)
instructions in affected programs: 10927 -> 10876 (-0.47%)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>
2022-02-25 12:31:48 -08:00
Timur Kristóf
f629fbd778 nir: Add new variable mode for task/mesh payload.
Task shader outputs work differently than other shaders, so they
need special consideration. Essentially, they have two kinds of
outputs:

1. Number of mesh shader workgroups to launch.
Will be still represented by a shader output.

2. Optional payload of up to (at least) 16K bytes.
These payload variables behave similarly to shared memory, but
the spec doesn't actually define them as shared memory (also, they
may be implemented differently by each backend), so we need to add
a new NIR variable mode for them.

These payload variables can't be represented by shader outputs
because the 16K bytes don't fit the 32x vec4 model that NIR uses
for its output variables.

This patch adds a new NIR variable mode: nir_var_mem_task_payload
and corresponding explicit I/O intrinsics, as well as support for
this new mode in nir_lower_io.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
2022-02-25 06:52:07 +00:00
Iago Toral Quiroga
f1d20ec67c nir/nir_opt_move: handle non-SSA defs
We just skip register defs and avoid moving register reads across them.
This allows us to run this pass in non-SSA form.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga
fe2249eac5 nir: add a nir_instr_def_is_register helper
This returns true if the instruction has a dest that is not an SSA value.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga
0a04468704 nir/nir_opt_move: allow to move uniform loads
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Rhys Perry
a9ac270c5f nir/validate: don't add instrs not present in shader to shader_gc_list
This makes the set smaller and GC list validation faster.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13547>
2022-02-21 11:57:22 +00:00
Rhys Perry
925c5f817d nir/validate: don't validate the GC list by default
This seems really slow.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13547>
2022-02-21 11:57:18 +00:00
Connor Abbott
6761550357 nir/serialize: Don't access blob->data directly
It won't work if the blob is fixed-size and we overrun the size, which
will be the case with the Vulkan pipeline cache.

This gets a bit tricky for the repeated-header optimization, because we
can't read the header from the blob. Instead we have to store the header
itself.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>
2022-02-19 01:25:46 +00:00
Alyssa Rosenzweig
2d6233d04f nir: Check all sizes in nir_alu_instr_is_comparison
nir_alu_instr_is_comparison needs to consider all comparison opcodes regardless
of size. Otherwise, they will be missed by nir_opt_move/sink.

Without this change, lowering booleans to integers regresses register
pressure (and spills/fills) significantly in certain shaders on Panfrost,
like android/com.miHoYo.GenshinImpact/1420.shader_test.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15073>
2022-02-18 19:22:01 +00:00