Commit graph

6407 commits

Author SHA1 Message Date
Timur Kristóf
872d21820f nir: Exclude non-generic patch variables from get_variable_io_mask.
These are I/O variables which are not going to be removed anyway.
However, get_variable_io_mask handles their location incorrectly.

Found using the GCC undefined behavior sanitizer.
Fixes the following error:

runtime error:
shift exponent 4294967258 is too large
for 64-bit type 'long unsigned int'

Closes: #5319
Fixes: cf5f8f55c3
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>
2021-09-20 18:08:16 +00:00
Ian Romanick
d7ba52cce9 nir/edgeflags: Add a flag to indicate the edge flag input is needed
Most modern hardware needs the edge flag added as a hidden vertex input
and needs code added to the vertex shader to copy the input to an
output.  Intel hardware is a little different.  Gfx4 and Gfx5 hardware
works in the previously described mannter.  Gfx6+ hardware needs the
edge flag as a specific vertex shader input, and that input is magically
processed by fixed-function hardware without need for extra shader code.

This flag signals only that the vertex shader input is needed.  It would
be nice if we could decouple adding the vertex shader input from
generating the copy-to-output code, but that has proven to be
challenging.  Not having that code causes other passes to want to
eliminate that shader input.

v2: Convert conditional to assertion.  This pass is only called for
vertex shaders.  Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
2021-09-17 16:36:08 -07:00
Rhys Perry
a1af902531 nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants
This allows for more MAD/FMA instructions to be created.

fossil-db (Sienna Cichlid):
Totals from 50134 (33.46% of 149839) affected shaders:
VGPRs: 2436536 -> 2436000 (-0.02%); split: -0.05%, +0.03%
SpillSGPRs: 13136 -> 13135 (-0.01%); split: -0.02%, +0.02%
CodeSize: 206621424 -> 206278292 (-0.17%); split: -0.23%, +0.07%
MaxWaves: 1116804 -> 1117448 (+0.06%); split: +0.07%, -0.01%
Instrs: 38977460 -> 38862886 (-0.29%); split: -0.33%, +0.04%
Latency: 832425389 -> 827432260 (-0.60%); split: -0.63%, +0.03%
InvThroughput: 184193457 -> 183563350 (-0.34%); split: -0.37%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7458>
2021-09-17 17:28:26 +00:00
Jason Ekstrand
6c7d23e6ca nir: Stop sweeping indirects
They're no longer ralloc'd.

Fixes: 879a569884 "nir: Switch from ralloc to malloc for NIR instructions."
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>
2021-09-16 11:28:36 +00:00
Jason Ekstrand
d1eae6f36b nir: Properly clean up nir_src/dest indirects
Now that they're no longer ralloc'd, we have to be much more careful
about indirects.  We have to make sure every time a source or
destination is overwritten, its indirect (if any) is freed.  We also
have to choose a memory ownership convention for the rewrite functions.
Assuming that they will be called with the source from some other
instruction, we choose to always make a copy of the indirect (if any).
It's the responsibility of the caller to ensure its copy of the indirect
is freed.

Unfortunately, all this extra logic is going to make
nir_instr_rewrite/move_src/dest more expensive because they now have
all the logic of nir_src/dest_copy instead of a simple struct
assignment.  Fortunately, the vast majority of rewrite calls are done by
nir_ssa_def_rewrite_uses which is an SSA-only fast-path.

Fixes: 879a569884 "nir: Switch from ralloc to malloc for NIR instructions."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>
2021-09-16 11:28:36 +00:00
Emma Anholt
aed4c0b5a9 nir: Drop the unused instr arg for src/dest copy functions.
Now that we don't use ralloc, we don't need this arg to get at the right
ralloc ctx.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Emma Anholt
879a569884 nir: Switch from ralloc to malloc for NIR instructions.
By replacing the 48-byte ralloc header with our exec_node gc_node (16
bytes), runtime of shader-db on my system across this series drops
-4.21738% +/- 1.47757% (n=5).

Inspired by discussion on #5034.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Emma Anholt
feee5e6974 nir/tests: Fix transmuting an SSA dest to be non-SSA
With the de-ralloc changes, having the register dest not have its .reg
properly initialized caused crashes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Emma Anholt
1edff520e2 nir/lower_phis_to_scalar: Use nir_instr_free() to free instrs.
Preparation for de-rallocing instrs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Emma Anholt
d1a2870f78 nir: Add all allocated instructions to a GC list.
Right now we're using ralloc to GC our NIR instructions, but ralloc has
significant overhead for its recursive nature so it would be nice to use a
simpler mechanism for GCing instructions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Emma Anholt
22788d68eb nir: Consistently pass the instr to nir_src_copy().
The arg says it's supposed to be the instr, not the shader.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:05 +00:00
Emma Anholt
5e37cfb7fe nir: Consistently pass the shader to the shader arg of instr creation.
We were using the ralloc parent in some places, which should work out to
be the shader I think, but to de-ralloc the instrs we should just pass the
existing shader pointer in.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:05 +00:00
Emma Anholt
7a4bbe60c1 nir/from_ssa: Use nir_instr_free() to free instrs instead of ralloc.
This code was being tricky with passing a mem_ctx instead of the shader,
then freeing the mem_ctx when the pass was done and all the parallel
copies had been removed from the shader.  Use the right type for instr
creation and do a bit of manual list management to prepare the way for
non-ralloc NIR instrs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:05 +00:00
Emma Anholt
b99efb8af0 nir: Pull the instr list free function out to a helper.
With the de-rallocing, we're going to have some more places that free a
list of instrs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:05 +00:00
Emma Anholt
36d9bdca0b nir: Add a nir_instr_free() to replace ralloc_free(instr).
This will gain another step shortly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:05 +00:00
Ian Romanick
7956a701d8 nir/lower_gs_intrinsics: Make nir_lower_gs_intrinsics be idempotent
Calling this lower pass twice in a row would cause spurious
set_vertex_and_primitive_count(0, undef) intrinsics after the proper
set_vertex_and_primitive_count intrinsic.  This pretty much turns any
geometry shader into garbage.

Fix this by treating nir_intrinsic_emit_vertex_with_counter and
nir_intrinsic_end_primitive_with_counter just like the non-_with_counter
versions.  If no blocks would need set_vertex_and_primitive_count
intrinsics added, exit the pass before doing any work.  This prevents
the need for DCE to do extra clean up later.

Since this pass is potentially called multiple times via multiple
invocations of a finalize_nir callback, it is (hypothetically?) possible
that control flow could be changed to add new blocks that need this
intrinsic.  The check implemented in this commit should be robust
against that possibility.

v2: Add a_block_needs_set_vertex_and_primitive_count.  Suggested by
Timur.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12802>
2021-09-14 09:13:07 -07:00
Ian Romanick
edf357b233 nir/lower_gs_intrinsics: Return progress if append_set_vertex_and_primitive_count makes progress
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 542d40d698 ("nir: Add new GS intrinsics that maintain a count of emitted vertices.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12802>
2021-09-14 09:12:47 -07:00
Mike Blumenkrantz
361a7d2451 compiler/spirv: add a fail if tex instr coord components aren't dimensional enough
this is really hard to pin down later on, so catch it here instead

gotta have those dimensions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12825>
2021-09-14 13:54:24 +00:00
Kenneth Graunke
d3b72d49cb glsl: Assert that lower_blend_equation_advanced is only called for FS
It only makes sense to call this pass for fragment shaders, and the
first thing the pass does is read a FS-specific field out of a union,
so it isn't safe to call it for other shader stages.

We could make it early return, but instead we just assert, so that
drivers know to only call it when appropriate.

(A previous version of this patch, which early returned instead of
asserting, was Reviewed-by: Emma Anholt <emma@anholt.net> as well.)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12839>
2021-09-14 03:55:05 +00:00
Emma Anholt
91dc863921 mesa: Move the advanced blend bitmask to shader_info.
For drivers that don't lower advanced blend to FBFETCH, we need the
bitmask to be in the NIR shader so that it gets carried over to TGSI
successfully.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12813>
2021-09-13 18:36:58 +00:00
Bas Nieuwenhuizen
b05cd10b8e nir: Avoid visiting instructions multiple times in nir_instr_free_and_dce.
Sadly need to poke a bit in the src internals to avoid using yet another
heap allocated datastructure.

Fixes: 5251548572 ("nir: Add a nir_instr_remove that recursively removes dead code.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5323
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12726>
2021-09-09 21:35:03 +00:00
Rhys Perry
c1f724b2b9 nir: fix serialization of loop/if control
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: e76ae39ae2 ("nir: add support for user defined select control")
Fixes: b56451f82c ("nir: add support for user defined loop control")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12778>
2021-09-09 10:32:30 +00:00
Qiang Yu
7054c1b7fd nir/linker: pack varyings with different interpolation qualifier
Driver like radeonsi load varying in a scalar manner, so prefer to pack
varying with different interpolation qualifier into same slot to save
space.

But driver like panfrost/bifrost can load varying in vector manner,
so prefer to pack varying with same interpolation qualifier.

Driver can add interpolation qualifiers which are able to be
packed into same varying slot to pack_varying_options nir option.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12537>
2021-09-09 06:00:58 +00:00
Qiang Yu
5a24aed1ac nir/lower_io_to_vector: check centroid & sample when merge variable
These qualifiers should be respected for different varying load code
generation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12537>
2021-09-09 06:00:58 +00:00
Marcin Ślusarz
30b2cc423c glsl: break out early if compound assignment's operand errored out
Fixes compiler crashes on:

struct Foo
{
  float does_exist_member;
};

in vec2 tex;
out vec4 color;

void
main(void)
{
  Foo foo;

  foo.does_not_exist_member %= 3; /* or any of: <<=, >>=, &=, |=, ^= */
  color = vec4(tex.xy, tex.xy);
}

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
CC: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12717>
2021-09-08 06:59:34 +00:00
Marcin Ślusarz
26302ccdc1 glsl: propagate errors from *=, /=, +=, -= operators
Fixes compiler crash on:

void main()
{
    gl_FragColor = a += 1;
}

(a is not declared anywhere)

Found with AFL++.

Fixes: d1fa69ed61 ("glsl: do not attempt assignment if operand type not parsed correctly")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12717>
2021-09-08 06:59:34 +00:00
Timothy Arceri
52893327fb glsl: fix variable scope for do-while loops
Without this the following code was successfully compiling.

   void main()
   {
      do
         int var_1 = 0;
      while (false);

      var_1++;
   }

Fixes: a87ac255cf ("Initial commit.  lol")

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12465>
2021-09-08 03:56:59 +00:00
Timothy Arceri
174c057926 glsl: handle scope correctly when inlining loop expression
We need to clone the previously evaluated loop expression when
inlining otherwise we will have conflicts with shadow variables
defined in deeper scopes.

Fixes: 5c02e2e2de ("glsl: Generate IR for switch statements")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5255

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12465>
2021-09-08 03:56:59 +00:00
Timothy Arceri
8bbbbb02cd glsl: fix variable scope for loop-expression
We need to evaluate the loop-expression of a for loop before
we evaluate the loop body otherwise we will find the wrong
variable for the following loop.

   int var_0 = 0;
   for(; var_0 < 10; var_0++) {
      const float var_0 = 1.0;
      gl_FragColor = vec4(0.0, var_0, 0.0, 0.0);
   }

Fixes: a87ac255cf ("Initial commit.  lol")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5262

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12465>
2021-09-08 03:56:59 +00:00
Rob Clark
b8b475ad4e nir/lower_amul: Fix usage of nir_foreach_src()
nir_foreach_src() bails after cb returns false for any src.  Which isn't
the behavior we were looking for.  Move progress flag to state struct
instead, so we don't skip visiting some sources.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12732>
2021-09-06 15:58:05 +00:00
Rob Clark
5800fde1bb nir/lower_amul: Handle load/store_global
These need more than 24b.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12732>
2021-09-06 15:58:05 +00:00
Enrico Galli
9461fe5cf1 nir: Add CAN_REORDER to load_ubo_dxil
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12707>
2021-09-03 16:21:03 +00:00
Rhys Perry
137974fabb spirv: use sdot_2x16 and udot_2x16 opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
41ecef7855 nir: add sdot_2x16 and udot_2x16 opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
ae00f5af61 nir: separate lower_add_sat
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Filip Gawin
2de348cdb0 glsl: use bool literals instead of integers
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12674>
2021-09-02 21:19:22 +00:00
Timur Kristóf
33630090a2 nir: Add comment to explain the sad_u8x4 opcode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12649>
2021-09-01 08:42:03 +00:00
Emma Anholt
33182c555f nir/nir_lower_uniforms_to_ubo: Set the explicit stride of the UBO 0 uniform.
Normal UBOs have explicit strides on them, make our lowered one behave the
same.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175>
2021-08-31 20:12:16 +00:00
Emma Anholt
01759d3fb2 nir: Set .driver_location for GLSL UBO/SSBOs when we lower to block indices.
Without this, there's no way to match the UBO nir_variable declarations to
the load_ubo intrinsics referencing their data.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175>
2021-08-31 20:12:16 +00:00
Timur Kristóf
548b383310 nir: Fix local_invocation_index upper bound for non-compute-like stages.
The lowered LS and NGG stages use local_invocation_index and they
can benefit from the unsigned upper bound because they can emit a
less expensive integer multiplication instruction.
This was working in the past, but accidentally borked by a refactor.

Fossil DB changes on Sienna Cichlid:

Totals from 956 (0.74% of 128647) affected shaders:
CodeSize: 2354172 -> 2344712 (-0.40%)
Instrs: 434359 -> 434327 (-0.01%)
Latency: 1883949 -> 1876814 (-0.38%)
InvThroughput: 762638 -> 757405 (-0.69%)

Fossil DB changes on Sienna Cichlid (with NGGC enabled):

Totals from 57873 (44.99% of 128647) affected shaders:
CodeSize: 155844192 -> 155607064 (-0.15%)
Instrs: 29799184 -> 29799152 (-0.00%)
Latency: 130959764 -> 130814224 (-0.11%); split: -0.11%, +0.00%
InvThroughput: 21100300 -> 20928635 (-0.81%); split: -0.81%, +0.00%

Fixes: 8af6766062
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Timur Kristóf
a25fd1787a nir: Add unsigned upper bound for extract opcodes.
This helps with some cases of extract, such as:
- Emitting more optimal integer multiplications
- Better address calculation
- Possibly others

Fossil DB results on Sienna Cichlid:

Totals from 4064 (3.16% of 128647) affected shaders:
VGPRs: 262040 -> 262032 (-0.00%)
CodeSize: 28856648 -> 28811892 (-0.16%); split: -0.18%, +0.02%
Instrs: 5370279 -> 5367827 (-0.05%); split: -0.08%, +0.04%
Latency: 74230112 -> 74016671 (-0.29%); split: -0.29%, +0.01%
InvThroughput: 12082532 -> 12036365 (-0.38%); split: -0.39%, +0.01%
VClause: 108506 -> 108721 (+0.20%); split: -0.03%, +0.22%
SClause: 217731 -> 216602 (-0.52%); split: -0.67%, +0.15%
Copies: 265689 -> 270811 (+1.93%); split: -0.26%, +2.19%
PreSGPRs: 201982 -> 204907 (+1.45%); split: -0.01%, +1.46%
PreVGPRs: 236099 -> 236079 (-0.01%)

Fossil DB results on Sienna Cichlid with NGGC enabled:

Totals from 60375 (46.93% of 128647) affected shaders:
VGPRs: 2212576 -> 2212568 (-0.00%)
CodeSize: 180870420 -> 179684816 (-0.66%); split: -0.66%, +0.00%
Instrs: 34386715 -> 34213682 (-0.50%); split: -0.51%, +0.01%
Latency: 199676290 -> 198987998 (-0.34%); split: -0.35%, +0.00%
InvThroughput: 32288299 -> 31736433 (-1.71%); split: -1.71%, +0.00%
VClause: 621521 -> 621743 (+0.04%); split: -0.00%, +0.04%
SClause: 900447 -> 899392 (-0.12%); split: -0.16%, +0.04%
Copies: 3439529 -> 3445305 (+0.17%); split: -0.02%, +0.19%
PreSGPRs: 2216297 -> 2219220 (+0.13%); split: -0.00%, +0.13%
PreVGPRs: 1842887 -> 1842867 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Caio Marcelo de Oliveira Filho
b34f9740ca spirv: Implement non-Multiview parts of SPV_NV_mesh_shader
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
10a03e30cf nir: Allow Task/Mesh to lower compute system values
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
4f52681a2d nir: Don't lower Task/Mesh I/O to temporaries
These won't work since a workgroup can span more than one thread, and
the temporaries are not shared memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
27697d5eb8 nir/divergence_analysis: Handle Task/Mesh shaders
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
bf5f6add01 nir/lower_io: Identify Mesh output as arrayed
Mesh shader outputs are either:

- non-array builtins
- array builtins that are either per-primitive or per-vertex
- user-defined outputs that must be either per-primitive or per-vertex

So we can identify any array output as "arrayed" for the purposes of
I/O lowering.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
9631d24c3f compiler: Add Task/Mesh to shader_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
813d41829d compiler: Add new non-Multiview Task/Mesh builtins
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
cd394017c8 nir: Add per-primitive I/O intrinsics
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
f95daad3a2 nir: Add a way to identify per-primitive variables
Per-primitive is similar to per-vertex attributes, but applies to all
fragments of the primitive without any interpolation involved.

Because they are regular input and outputs, keep track in shader_info
of which I/O is per-primitive so we can distinguish them after deref
lowering.  These fields can be used combined with the regular
`inputs_read`, `outputs_written` and `outputs_read`.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00