Commit graph

2290 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
6be766336a nir/tests: Use nir_scoped_memory_barrier() helper
Most of the vars tests already had a local helper, so just drop it in
favor of the one in nir_builder.  Remaining two tests changed to use
the helper.

The load_store_vectorizer tests were using the specific memory
barriers, but since scoped barriers are also handled, prefer that.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>
2020-02-24 19:12:11 +00:00
Caio Marcelo de Oliveira Filho
6ff898a653 nir: Add the alias NIR_MEMORY_ACQ_REL
This will help upcoming C++ code that will have to combine those two
semantics.  In C++ it is not possible to do this without a cast or
adding an operator| to the enum.  Since having the short form will
also be convient to C, we picked the former solution.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>
2020-02-24 19:12:11 +00:00
Caio Marcelo de Oliveira Filho
424737da3e nir/builder: Add nir_scoped_memory_barrier()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>
2020-02-24 19:12:11 +00:00
Eric Anholt
3e16434acd nir: Move intel's intrinsic_image_coordinate_components() to core nir.
This is a query that both Intel and freedreno need to do.  We can simplify
it a lot with the new glsl_get_sampler_dim_coordinate_components()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
2020-02-24 18:25:02 +00:00
Hyunjun Ko
9e8466a866 nir: Add optimization for doing removing f16/f32 conversions
This eliminates conversions between f16 and f32 where possible. We can
always remove an upcast followed by a down cast, that is:

  f2f16 ( f2f32 (a) )  ->  a
  f2fmp ( f2f32 (a) )  ->  a

In the other direction, f2f16 loses precision and can't be undone by a
f2f32.  However, by definition it's always safe to elminate f2fmp:

  f2f32 ( f2fmp (a) )  ->  a

v2. [Neil Roberts (nroberts@igalia.com)]

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
2020-02-24 17:24:13 +00:00
Neil Roberts
125f867d3d nir/opcodes: Add nir_op_f2fmp
This opcode is the same as the f2f16 opcode except that it comes with
a promise that it is safe to optimise it out if the result is
immediately converted back to a 32-bit float again. Normally this
would be a lossy conversion and so it would be visible to the
application, but if the conversion is generated as part of the mediump
lowering process then this removal doesn’t matter. The opcode is
eventually replaced with a regular f2f16 in the late optimisations so
the backends don’t need to handle it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
2020-02-24 17:24:13 +00:00
Jason Ekstrand
1598370aca nir/builder: Return an integer from nir_get_texture_size
It's convenient in a bunch of cases for nir_get_texture_size to return a
float but it's very unexpected.  This fixes a bug in the R600 NIR code.

Fixes: f718ac6268 "r600/sfn: Add a basic nir shader backend"
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3897>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3897>
2020-02-21 18:48:03 +00:00
Jason Ekstrand
265e234e23 nir: Fix the nir_builder include path for nir_builtin_builder
Because it's in double-quotes, it will search the current folder before
any search paths.  Since nir_builder.h and nir_builtin_builder.h are in
the same folder, this guarantees a correct include.  However,
nir/nir_builder.h does not unless the includer's path is set up just
right.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3897>
2020-02-21 18:48:03 +00:00
Karol Herbst
87365e263e nir/lower_ssbo: handle atomics
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753>
2020-02-21 13:06:22 +00:00
Alyssa Rosenzweig
7ab4e4dd96 nir: Add SSBO->global lowering pass
To facilitate lowering SSBOs to globals, we need a load_ssbo_address
intrinsic. This intrinsic takes an SSBO index and loads the address in
global memory of the SSBO (likely implemented via a uniform in the
driver). In the future, we'll support bounds checking, but at the moment
this is not supported (this pass should only be used for trusted
contexts at the moment, i.e. contexts without robustness extensions).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753>
2020-02-21 13:06:22 +00:00
Ian Romanick
58bdc1c748 nir/search: Use larger type to hold linearized index
"index" is an offset into a linearized 3-dimensional array.  Starting
with fbd5359a0a, the 3-dimensional array can have 43 elements in each
dimension.  43**3 = 79507, and that will overflow the uint16_t.

See also the discussion in MR !3765.

Fixes: fbd5359a0a ("nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select")
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
2020-02-19 19:07:34 +00:00
Kristian H. Kristensen
f1dc4c9554 Mark a few static inline helpers with ASSERTED
Quiet warnings in release builds where these look unused.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3866>
2020-02-19 18:34:33 +00:00
Rhys Perry
aca2458d1b nir: fix nir_const_value_as_uint bit size in load/store vectorizer tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3690>
2020-02-13 10:53:37 +00:00
Erik Faye-Lund
0c1ba69a27 Revert "nir: Add a couple trivial abs optimizations"
These were already added in 9fdaeb7776 ("nir: add min/max optimisation"),
and there's no point in doing them twice.

This reverts commit e4d346c86d.

Fixes: e4d346c86d ("nir: Add a couple trivial abs optimizations")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3786>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3786>
2020-02-13 09:18:27 +00:00
Arcady Goldmints-Orlov
e9f83185a2 Rename nir_lower_constant_initializers to nir_lower_variable_initalizers
This is naming is more clear as nir_variables can be initializes not
just with a nir_constant but with a pointer to another nir_variable.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
2020-02-12 15:41:49 +00:00
Arcady Goldmints-Orlov
7acc81056f compiler/nir: Add support for variable initialization from a pointer
Add a pointer_initializer field to nir_variable analogous to
constant_initializer, which can be used to initialize the nir_variable
to a pointer to another nir_variable. Just like the
constant_initializer, the pointer_initializer gets eliminated in the
nir_lower_constant_initializers pass.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>
2020-02-12 15:41:49 +00:00
Samuel Pitoiset
8e77280774 nir: do not use De Morgan's Law rules for flt and fge
In presence of NaNs, "!(flt(a, b) && flt(c, d))" is NOT EQUAL
to "fge(a, b) || fge(c, d)". These optimizations are unsafe for
apps that rely on NaN behaviour.

pipeline-db (GFX9/LLVM):
Totals from affected shaders:
SGPRS: 3176 -> 3136 (-1.26 %)
VGPRS: 2188 -> 2144 (-2.01 %)
Spilled SGPRs: 227 -> 169 (-25.55 %)
Code Size: 150572 -> 151800 (0.82 %) bytes
Max Waves: 307 -> 310 (0.98 %)

pipeline-db (GFX9/ACO):
Totals from affected shaders:
SGPRS: 18744 -> 18744 (0.00 %)
VGPRS: 15576 -> 15580 (0.03 %)
Spilled SGPRs: 164 -> 164 (0.00 %)
Code Size: 1573012 -> 1576492 (0.22 %) bytes
Max Waves: 1534 -> 1532 (-0.13 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2127
Fixes: d1ed4ffe0b ("nir: Use De Morgan's Law on logic compounded comparisons")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3696>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3696>
2020-02-11 08:36:00 +01:00
Ian Romanick
1d97d186fb nir: Mark fmin and fmax as commutative and associative
Per the resolution of Khronos GLSL issue 80
(https://github.com/KhronosGroup/GLSL/issues/80).  Spec updates have not
landed yet, but I'll get to it soon. :)

The extra hurt shaders on Gen8+ are a handful of shaders that see things like

    bcsel(fmin(b - a, a - c) >= 0, x, y)

converted to

   bcsel(a >= b && c >= a, x, y)

The former can be generated as a CSEL instruction.  If either b - a or a
- c is used elsewhere in the shader, this saves an instruction.

All Haswell+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14550188 -> 14550048 (<.01%)
instructions in affected programs: 12168 -> 12028 (-1.15%)
helped: 30
HURT: 3
helped stats (abs) min: 1 max: 17 x̄: 4.77 x̃: 2
helped stats (rel) min: 0.05% max: 3.85% x̄: 1.77% x̃: 1.80%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.50% max: 0.50% x̄: 0.50% x̃: 0.50%
95% mean confidence interval for instructions value: -6.15 -2.33
95% mean confidence interval for instructions %-change: -2.00% -1.12%
Instructions are helped.

total cycles in shared programs: 203770286 -> 203771464 (<.01%)
cycles in affected programs: 688466 -> 689644 (0.17%)
helped: 172
HURT: 220
helped stats (abs) min: 1 max: 286 x̄: 12.15 x̃: 6
helped stats (rel) min: 0.03% max: 5.97% x̄: 0.70% x̃: 0.35%
HURT stats (abs)   min: 1 max: 578 x̄: 14.85 x̃: 6
HURT stats (rel)   min: 0.03% max: 32.36% x̄: 1.21% x̃: 0.52%
95% mean confidence interval for cycles value: -0.74 6.75
95% mean confidence interval for cycles %-change: 0.15% 0.59%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 4525 -> 4523 (-0.04%)
fills in affected programs: 48 -> 46 (-4.17%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11858995 -> 11858898 (<.01%)
instructions in affected programs: 10822 -> 10725 (-0.90%)
helped: 25
HURT: 13
helped stats (abs) min: 1 max: 17 x̄: 5.32 x̃: 2
helped stats (rel) min: 0.40% max: 5.00% x̄: 2.16% x̃: 1.85%
HURT stats (abs)   min: 1 max: 15 x̄: 2.77 x̃: 2
HURT stats (rel)   min: 0.47% max: 2.90% x̄: 1.83% x̃: 2.15%
95% mean confidence interval for instructions value: -4.66 -0.45
95% mean confidence interval for instructions %-change: -1.54% -0.05%
Instructions are helped.

total cycles in shared programs: 177947023 -> 177946880 (<.01%)
cycles in affected programs: 822075 -> 821932 (-0.02%)
helped: 157
HURT: 175
helped stats (abs) min: 1 max: 164 x̄: 13.17 x̃: 4
helped stats (rel) min: 0.03% max: 6.72% x̄: 0.64% x̃: 0.17%
HURT stats (abs)   min: 1 max: 308 x̄: 11.00 x̃: 4
HURT stats (rel)   min: 0.03% max: 9.76% x̄: 0.70% x̃: 0.18%
95% mean confidence interval for cycles value: -3.86 3.00
95% mean confidence interval for cycles %-change: -0.09% 0.22%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4185 -> 4188 (0.07%)
spills in affected programs: 146 -> 149 (2.05%)
helped: 0
HURT: 1

total fills in shared programs: 5248 -> 5249 (0.02%)
fills in affected programs: 347 -> 348 (0.29%)
helped: 0
HURT: 1

Sandy Bridge
total instructions in shared programs: 10680224 -> 10680144 (<.01%)
instructions in affected programs: 4702 -> 4622 (-1.70%)
helped: 15
HURT: 3
helped stats (abs) min: 1 max: 17 x̄: 5.53 x̃: 5
helped stats (rel) min: 0.39% max: 4.76% x̄: 2.17% x̃: 1.67%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.52% max: 0.52% x̄: 0.52% x̃: 0.52%
95% mean confidence interval for instructions value: -7.24 -1.65
95% mean confidence interval for instructions %-change: -2.55% -0.89%
Instructions are helped.

total cycles in shared programs: 152988780 -> 152985691 (<.01%)
cycles in affected programs: 1072850 -> 1069761 (-0.29%)
helped: 168
HURT: 145
helped stats (abs) min: 1 max: 592 x̄: 33.90 x̃: 12
helped stats (rel) min: 0.02% max: 10.73% x̄: 0.90% x̃: 0.31%
HURT stats (abs)   min: 1 max: 259 x̄: 17.98 x̃: 6
HURT stats (rel)   min: 0.02% max: 8.17% x̄: 0.77% x̃: 0.19%
95% mean confidence interval for cycles value: -17.95 -1.79
95% mean confidence interval for cycles %-change: -0.34% 0.08%
Inconclusive result (%-change mean confidence interval includes 0).

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8107033 -> 8107025 (<.01%)
instructions in affected programs: 696 -> 688 (-1.15%)
helped: 5
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.60 x̃: 2
helped stats (rel) min: 0.34% max: 7.14% x̄: 3.47% x̃: 4.65%
95% mean confidence interval for instructions value: -2.28 -0.92
95% mean confidence interval for instructions %-change: -7.22% 0.28%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 188348526 -> 188348404 (<.01%)
cycles in affected programs: 33618 -> 33496 (-0.36%)
helped: 23
HURT: 0
helped stats (abs) min: 2 max: 12 x̄: 5.30 x̃: 6
helped stats (rel) min: 0.05% max: 1.83% x̄: 0.47% x̃: 0.51%
95% mean confidence interval for cycles value: -6.70 -3.91
95% mean confidence interval for cycles %-change: -0.64% -0.30%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1359>
2020-02-10 18:37:36 -08:00
Gert Wollny
37125b7cc2 r600/sfn: Add lowering UBO access to r600 specific codes
r600 reads vec4 from the UBO, but the offsets in nir are evaluated to the component.
If the offsets are not literal then all non-vec4 reads must resolve the component
after reading a vec4 component (TODO: figure out whether there is a consistent way
to deduct the component that is actually read).

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3225>
2020-02-10 19:09:08 +00:00
Eric Anholt
8d07d66180 glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.
This means you can directly use format utils on it without having to have
your own GL enum to number-of-components switch statement (or whatever) in
your vulkan backend.

Thanks to imirkin for fixing up the nouveau driver (and a couple of core
details).

This fixes the computed qualifiers for EXT_shader_image_load_store's
non-integer sizeNxM qualifiers, which we don't have tests for.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
2020-02-05 10:31:14 -08:00
Boris Brezillon
f5619f5073 pan/midgard: Turn Z/S stores into zs_output_pan intrinsics
Midgard can't write depth and stencil separately. It has to happen in
a single store operation containing both. Let's add a panfrost specific
intrinsic and turn all depth/stencil stores into a packed depth+stencil
one.

Note that this intrinsic is not yet handled in emit_intrinsic(), but
we'll address that later.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3697>
2020-02-05 15:41:55 +00:00
Michel Dänzer
65610ec774 gitlab-ci: Add ppc64el and s390x cross-build jobs
Using LLVM 8 for ppc64el and 7 for s390x (which hits some coroutine
related issues with LLVM 8).

There are some test failures we need to ignore for now. Also, the
timeout needs to be bumped from the default 30s for some tests, because
they can take longer under emulation.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3643>
2020-02-05 10:52:31 +00:00
Kristian H. Kristensen
6dd57f0e38 nir: Remove always-true assert
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
2020-02-04 06:03:52 +00:00
Kristian H. Kristensen
2be81a3bfa nir: Make unroll pragma work on clang
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
2020-02-04 06:03:52 +00:00
Kristian H. Kristensen
de856c6170 nir: Delete unused is_var_constant() helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
2020-02-04 06:03:52 +00:00
Caio Marcelo de Oliveira Filho
8548fe19f0 nir: Make nir_deref_path_init skip trivial casts
In a NIR generated using SPIR-V initializers to variables, copy
propagation can end up transforming

    vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
    vec1 32 ssa_35 = mov ssa_33
    vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4)  /* ptr_stride=0 */

into

    vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
    vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4)  /* ptr_stride=0 */

Before the optimization, the "head" of a path of deref that uses ssa_7
will be the cast.  After, it will be the variable in ssa_33.  Since
the types are the same, this is a trivial cast that would be picked up
by nir_opt_deref.

If we need to compare such deref-chain after optimization with another
deref-chain for the same variable, the compare function will get
confused by the cast in the middle.

One alternative would be to add nir_opt_deref to places that compare
derefs, but that might not scale well, so skip the trivial casts when
generating the paths instead.

Motivated by the discussion in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
2020-01-29 18:25:36 +00:00
Rhys Perry
1f72857739 nir/algebraic: add some half packing optimizations
pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 29200 -> 29200 (0.00 %)
VGPRS: 17372 -> 17372 (0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1406576 -> 1389256 (-1.23 %) bytes
LDS: 83 -> 83 (0.00 %) blocks
Max Waves: 3976 -> 3976 (0.00 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 21320 -> 21320 (0.00 %)
VGPRS: 17056 -> 17036 (-0.12 %)
Spilled SGPRs: 22 -> 22 (0.00 %)
Spilled VGPRs: 503 -> 487 (-3.18 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 396 -> 396 (0.00 %) dwords per thread
Code Size: 1441244 -> 1423292 (-1.25 %) bytes
LDS: 463 -> 463 (0.00 %) blocks
Max Waves: 3609 -> 3611 (0.06 %)

v2: add pattern for ishr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
2020-01-29 14:30:33 +00:00
Rhys Perry
5476d18183 nir/algebraic: add patterns for a >> #b << #b
Fixes compilation of a Battlefront 2 shader with ACO by removing VGPR
spilling. The reassociation makes it worse on LLVM though.

pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 10704 -> 10688 (-0.15 %)
VGPRS: 18736 -> 18528 (-1.11 %)
Spilled SGPRs: 70 -> 70 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 909696 -> 885796 (-2.63 %) bytes
LDS: 225 -> 225 (0.00 %) blocks
Max Waves: 1115 -> 1129 (1.26 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 8472 -> 8424 (-0.57 %)
VGPRS: 14284 -> 14368 (0.59 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 442 -> 503 (13.80 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 268 -> 396 (47.76 %) dwords per thread
Code Size: 862568 -> 853028 (-1.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 971 -> 964 (-0.72 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
2020-01-29 14:30:33 +00:00
Samuel Pitoiset
cf6cae832c nir: lower interp_deref_at_vertex to load_input_vertex
This introduces a new NIR intrinsic for loading inputs at a specific
vertex index.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
d29f10a7ca nir: add nir_intrinsic_interp_deref_at_vertex
From the SPV_AMD_shader_explicit_vertex_parameter extension:
   "Returns the value of the input <interpolant> without any
    interpolation, i.e. the raw output value of previous shader
    stage."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
687f170311 nir: lower SYSTEM_VALUE_BARYCENTRIC_* to nir_load_barycentric()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
9021b45b35 nir: add nir_intrinsic_load_barycentric_model
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
746e9e5d66 compiler: add a new explicit interpolation mode
This introduces one more interpolation mode INTERP_MODE_EXPLICIT,
which is needed for AMD_shader_explicit_vertex_parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Vasily Khoruzhick
3c754900b5 nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops
Otherwise we'll have to lower it later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>
2020-01-23 21:16:22 +00:00
Anthony Pesch
1496cc92f6 util/hash_table: added hash functions for integer types
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html

Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Samuel Pitoiset
84b08971fb nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch
These instructions are allowed to fetch from multisampled
subpass input attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
603e6ba972 nir: add two new texture ops for multisample fragment color/mask fetches
This introduces:
   - nir_texop_fragment_mask_fetch (fetch a fragment mask from a
     compressed multisampled color surface)
   - nir_texop_fragment_fetch (fetch a color fragment for a
     particular sample at corresponding fragment mask index).

These two texture operations are necessary for implementing
SPV_AMD_shader_fragment_mask.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Matt Turner
4413537c80 util: Remove tmp argument from BITSET_FOREACH_SET macro
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
2020-01-23 01:52:43 +00:00
Ian Romanick
b065d8fb8c nir/algebraic: Optimize some 64-bit integer comparisons involving zero
I noticed that we can do better for these kinds of comparisons while
working on the lowering for iadd_sat@64 and isub_sat@64.  This
eliminated 11 instruction from the fs-addSaturate-int64.shader_test.

My hope is that this will improve the run-time of int64 tests on Ice
Lake.  I have no data to support or refute this.

Unsurprisingly, no changes on shader-db.

v2: Condition the min and max patterns with nir_lower_minmax64.
Suggested by Caio.  Very long discussion in the MR. :)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
d3d970166c nir/algebraic: Add lowering for 64-bit iadd_sat and isub_sat
v2: Rearranged and expand the comment about the optimizations applied to
the lowering.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
dcadbd2dd2 nir/algebraic: Add lowering for 64-bit uadd_sat
Fixes piglit fs-addsaturate-uint64 and vs-addsaturate-uint64 on Ice
Lake.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
1bdfc6d7cb nir/algebraic: Add lowering for 64-bit usub_sat
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

v3: Add a new lower_usub_sat64 flag that only applies to the 64-bit
version of the nir_op_usub_sat instruction.

v4: Also enable the lowering when nir_lower_iadd64 is set.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
a483771045 nir/algebraic: Add lowering for 64-bit hadd and rhadd
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

v3: Add a new lower_hadd64 flag that only applies to the 64-bit versions
of the instructions.

v4: Also enable the lowering when nir_lower_iadd64 is set.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
ea435560ee nir/algebraic: Add lowering for uabs_usub and uabs_isub
v2: Remove some rebase failures noticed by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
21f0d020fe nir: Add new instructions for INTEL_shader_integer_functions2
uctz isn't added because it will implemented in the GLSL path and the
SPIR-V path using other pre-existing instructions.

v2: Avoid signed integer overflow for uabs_isub(0, INT_MIN).  Noticed by
Caio.

v3: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Eric Anholt
d0975bfc4a nir: Drop the ssbo_offset to atomic lowering.
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers

The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
10dc4ac4c5 mesa: Make atomic lowering put atomics above SSBOs.
Gallium arbitrarily (it seems) put atomics below SSBOs, resulting in a
bunch of extra index management, and surprising shader code when you would
see your SSBOs up at index 16.  It makes a lot more sense to see atomics
converted to SSBOs appear as magic high numbers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
d55573aac6 nir: Fix printing of ~0 .locations.
I kept wondering what "429" meant in variable declarations, when it was
just a truncated ~0 snprintf.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
2020-01-16 23:29:10 +00:00
Jason Ekstrand
721666e52a anv,nir: Lower quad_broadcast with dynamic index in NIR
This is required for the subgroupBroadcastDynamicId feature that was
added in Vulkan 1.2.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-15 08:34:57 -06:00
Elie Tournier
22c5c54a4f nir/algebraic: sqrt(x)*sqrt(x) -> fabs(x)
total instructions in shared programs: 12840840 -> 12839341 (-0.01%)
instructions in affected programs: 122581 -> 121082 (-1.22%)
helped: 559
HURT: 0

total cycles in shared programs: 302505756 -> 302490031 (<.01%)
cycles in affected programs: 2022900 -> 2007175 (-0.78%)
helped: 1090
HURT: 130

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
2020-01-15 00:30:52 +00:00