Commit graph

2290 commits

Author SHA1 Message Date
Jason Ekstrand
b8d45d9307 nir: Add tests for nir_extract_bits 2019-11-11 17:17:02 +00:00
Jason Ekstrand
d0bbf98c96 nir/builder: Add a nir_extract_bits helper
This new helper is better than nir_bitcast_vector because it's able to
take a (mostly) arbitrary range from the source vector.  The only
requirement is that first_bit has to be aligned to the smaller of the
two bit sizes.  It wouldn't be hard to lift that requirement but it's
reasonable for now.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Alyssa Rosenzweig
03f73c7fc6 nir: Add load_output_u8_as_fp16_pan intrinsic
This is a single opcode, at least on newer Midgard chips. It's easier to
have this represented in NIR rather than trying to optimize out the
conversions, so let's add the intrinsic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Kristian H. Kristensen
e28fbbd861 freedreno/ir3: Implement TCS synchronization intrinsics
We add two new IR3 specific nir intrinsics that map to the new condend
and endpatch instructions.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
41984c8422 freedreno/ir3: Add ir3 intrinsics for tessellation
These provide the iovas for system memory buffers used for
tessellation as well as a new HW specific system value.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:50 -08:00
Kristian H. Kristensen
fe450ef4cf freedreno/ir3: Add load and store intrinsics for global io
These intrinsics take a ivec2 for the 64 bit base address and a
integer offset.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:44 -08:00
Kai Wasserbäch
acfea09dbd nir: fix unused function warning in src/compiler/nir/nir.c
This commit fixes the following warning:
../src/compiler/nir/nir.c:1827:1: warning: ‘dest_is_ssa’ defined but not used [-Wunused-function]
 1827 | dest_is_ssa(nir_dest *dest, void *_state)
      | ^~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-07 11:32:55 +11:00
Kai Wasserbäch
8aa4d0bff6 nir: fix unused variable warning in nir_lower_vars_to_explicit_types
This commit fixes the following warning:
../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_vars_to_explicit_types’:
../src/compiler/nir/nir_lower_io.c:1435:22: warning: unused variable ‘supported’ [-Wunused-variable]
 1435 |    nir_variable_mode supported = nir_var_mem_shared | nir_var_shader_temp | nir_var_function_temp;
      |                      ^~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-07 11:32:55 +11:00
Samuel Pitoiset
c0f76528ae nir: fix packing of nir_variable
The maximum number of descriptor sets is indeed 32 but without
the sign bit.

The maximum number of bindings for RADV is way larger, keep it
as 32-bit.

Fixes: 96e6ef80d9 ("nir: pack the rest of nir_variable::data")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2019-11-06 08:51:53 +01:00
Marek Olšák
8145492f4a nir/serialize: pack nir_variable flags
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
3aa72a394a nir/serialize: store 32-bit object IDs instead of 64-bit
That means we have only 30 bits for object IDs, because 2 bits are
sometimes used for something else.

This decrease the uncompressed shader size for the biggest Borderlands 2
shader from 33.6 KB to 23.2 KB. (31% decrease)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
d5768fcd45 nir/serialize: don't expand 16-bit variable state slots to 32 bits
the swizzle also needs only 16 bits

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
96e6ef80d9 nir: pack the rest of nir_variable::data
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:32:34 -05:00
Marek Olšák
4319cc8c0f nir: pack nir_variable::data::xfb_*
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 18:17:34 -05:00
Marek Olšák
08dc541b66 nir: pack nir_variable::data::stream
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 18:17:34 -05:00
Ian Romanick
9be4a422a0 nir/algebraic: Mark other comparison exact when removing a == a
This prevents some additional optimizations that would change the
original result.  This includes things like (b < a && b < c) => b <
min(a, c) and !(a < b) => b >= a.  Both of these optimizations were
specifically observed in the piglit tests added in piglit!160.

This was discovered while investigating
https://gitlab.freedesktop.org/mesa/mesa/issues/1958.  However, the
problem in that issue was Chrome or Angle is replacing calls to isnan()
with some stuff that we (correctly) optimize to false.  If they had left
the calls to isnan() alone, everything would have just worked.

No shader-db changes on any Intel platform.

I also tried marking the comparison generated by the isnan() function
precise.  The precise marker "infects" every computation involved in
calculating the parameter to the isnan() function, and this severely
hurt all of the (few) shaders in shader-db that use isnan().

I also considered adding a new ir_unop_isnan opcode that would implement
the functionality.  During GLSL IR-to-NIR translation, the resulting
comparison operation would be marked exact (and the samething would need
to happen in SPIR-V translation).

This approach taken by this patch seemed easier, but we may want to do
the ir_unop_isnan thing anyway.

Fixes: d55835b8bd ("nir/algebraic: Add optimizations for "a == a && a CMP b"")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 14:05:49 -08:00
Ian Romanick
ea19f2fb68 nir/algebraic: Add the ability to mark a replacement as exact
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 14:05:49 -08:00
Marek Olšák
af94600484 compiler: make variable::data::binding unsigned
Nothing seems to set a negative value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 16:49:46 -05:00
Dylan Baker
717606f9f3 nir: correct use of identity check in python
Python has the identity operator `is`, and the equality operator `==`.
Using `is` with strings sometimes works in CPython due to optimizations
(they have some kind of cache), but it may not always work.

Fixes: 96c4b135e3
       ("nir/algebraic: Don't put quotes around floating point literals")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-11-04 16:06:39 +00:00
Tapani Pälli
b380d47998 nir: fix couple of compile warnings
Fixes "warning: braces around scalar initializer" warnings.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 00:21:44 +00:00
Timothy Arceri
7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(eg. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Dylan Baker
09ee11f5da nir: Fix invalid code for MSVC
Fixes: ee2050b111
       ("nir: Use BITSET for tracking varyings in lower_io_arrays")
2019-10-25 22:47:32 +00:00
Kenneth Graunke
f306d07932 nir: Use VARYING_SLOT_TESS_MAX to size indirect bitmasks
MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size
that's too small, so BITSET_SET writes words out of bounds, corrupting
the stack and causing all kinds of chaos.  VARYING_SLOT_TESS_MAX is
the right value to use here, as it's the largest location.

Closes: 2002
Fixes: ee2050b111 ("nir: Use BITSET for tracking varyings in lower_io_arrays")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-25 13:29:09 -07:00
Kristian H. Kristensen
ee2050b111 nir: Use BITSET for tracking varyings in lower_io_arrays
MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more that 64
bits (per component) to track which vars have indirects. This pass was
trying to track patch varyings (which start at bit 63) in a separate
64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed
out of bounds.

Do away with the ad-hoc bit mask tracking and just use a BITSET.

Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 15:32:20 -07:00
Caio Marcelo de Oliveira Filho
901071044e nir/tests: Add copy propagation tests with scoped_memory_barrier
Three groups of tests, effectively defining what cases the
optimization is allowed or prevented

- Redudant loads       (a load  generated the value)
- Propagate SSA values (a store generated the value)
- Propagate a var      (a copy  generated the value)

Change the shader type of the tests to be COMPUTE so
nir_var_mem_shared can also be used.  Doesn't affect the semantic of
the copy propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
73572abc2a nir: Add scoped_memory_barrier intrinsic
Add a NIR instrinsic that represent a memory barrier in SPIR-V /
Vulkan Memory Model, with extra attributes that describe the barrier:

- Ordering: whether is an Acquire or Release;
- "Cache control": availability ("ensure this gets written in the memory")
  and visibility ("ensure my cache is up to date when I'm reading");
- Variable modes: which memory types this barrier applies to;
- Scope: how far this barrier applies.

Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory
Semantics" are split into two different attributes so we can use
variable modes for the former.

NIR passes that took barriers in consideration were also changed

- nir_opt_copy_prop_vars: clean up the values for the mode of an
  ACQUIRE barrier.  Copy propagation effect is to "pull up a load" (by
  not performing it), which is what ACQUIRE restricts.

- nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the
  pending writes for the modes of an RELEASE barrier.  Dead writes
  effect is to "push down a store", which is what RELEASE restricts.

- nir_opt_access: treat the ACQUIRE and RELEASE as a full barrier for
  the modes.  This is conservative, but since this is a GL-specific
  pass, doesn't make a difference for now.

v2: Fix the scoped barrier handling in copy propagation.  (Jason)
    Add scoped barrier handling to nir_opt_access and
    nir_opt_combine_writes.  (Rhys)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:55 -07:00
Timothy Arceri
922801b77d nir: improve nir_variable packing
Before:

/* size: 136, cachelines: 3, members: 10 */

After:

/* size: 128, cachelines: 2, members: 10 */

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-10-24 13:24:40 +11:00
Timothy Arceri
c412ff426b nir: fix nir_variable_data packing
Before:

/* size: 60, cachelines: 1, members: 29 */

After:

/* size: 56, cachelines: 1, members: 29 */

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-10-24 13:22:59 +11:00
Marek Olšák
28199aeee5 st/mesa: assign driver locations for VS inputs for NIR before caching
fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs
after caching

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Erik Faye-Lund
acf1bf47cc Revert "nir: drop support for using load_alpha_ref_float"
This reverts commit 5af272b474.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:52 +02:00
Erik Faye-Lund
beb6639a9d Revert "nir: drop unused alpha_ref_float"
This reverts commit e8095f2af0.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:38 +02:00
Marek Olšák
a0b711d8e9 nir: allow nir_lower_uniforms_to_ubo to be run repeatedly
for st/mesa

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-22 14:41:23 -04:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Rob Clark
5e08f070f0 nir: add nir_lower_amul pass
Lower amul to either imul or imul24, depending on whether 24b is enough
bits to calculate an offset within the thing being dereferenced.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-18 15:08:54 -07:00
Rob Clark
1bdde31392 nir: add address calc related opt rules
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
6320e37d4b nir: add amul instruction
Used for address/offset calculation (ie. array derefs), where we can
potentially use less than 32b for the multiply of array idx by element
size.  For backends that support `imul24`, this gives a lowering pass
an easy way to find multiplies that potentially can be converted to
`imul24`.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
0568761f8e nir: Add a new ALU nir_op_imul24
Some hardware can do 24b multiply in a single instruction, but not 32b.
However in most cases 24b is sufficient for address/offset calculation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev
32e5fbf47c nir: Add a new ALU nir_op_imad24_ir3
ir3 compiler has a signed integer multiply-add instruction (MAD_S24)
that is used for different offset calculations in the backend.
Since we intend to move some of these calculations to NIR, we need
a new ALU op that can directly represent it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
ad8167c1e0 nir/search: fix the PoT helpers
Otherwise, if the base type is (for example) uint32, we would
incorrectly think that PoT optimizations could not apply.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstsrand <jason@jleksrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev
f1d4fadf1b nir: Add new texop nir_texop_tex_prefetch
This is like nir_texop_tex, but signals that the sampling coordinates
are immutable during the shader stage, in a way that allows the HW
that supports pre-dispatching sampling operations to pre-fetch
the result prior to scheduling the shader stage.

This is introduced to support the feature in Freedreno. Adreno HW
from a4xx supports it.

A NIR pass introduced later in this series will detect sampling
operations that are eligible for pre-dispatch, and replace
nir_texop_tex by this new op, to tell the backend to enable
pre-fetch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Ian Romanick
050e4e28bf nir/search: Fix possible NULL dereference in is_fsign
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
2019-10-17 15:07:01 -07:00
Kristian H. Kristensen
8e16fb1528 freedreno/ir3: Implement lowering passes for VS and GS
This introduces two new lowering passes. One to lower VS to explicit
outputs using STLW and one to lower GS to load input using LDLW and
implement the GS specific functionality.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
0324706764 freedreno/ir3: Add intrinsics that map to LDLW/STLW
These intrinsics will let us do all the offset calculations in nir,
which is nicer to work with and lets nir_opt_algebraic eat it all up.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Erik Faye-Lund
e8095f2af0 nir: drop unused alpha_ref_float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
5af272b474 nir: drop support for using load_alpha_ref_float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
71c0dcf266 nir: support feeding state to nir_lower_clip_[vg]s
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
eb3047c094 nir: support lowering clipdist to arrays
This allows us to make sure clipdist is emitted as a scalar array rather
than two vec4s. This matches SPIR-V semantics, and will be useful for
Zink.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
011d692a52 nir: support derefs in two-sided lighting lowering
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
878c94288a nir: add lowering-pass for point-size mov
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
6d7e02e37d nir: allow passing alpha-ref state to lowering-code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00