Commit graph

659 commits

Author SHA1 Message Date
Eric Anholt
5a0d3e1129 nir: Print the components referenced for split or packed shader in/outs.
Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative,
much better to see:

decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0)

v2: Handle arrays and structs better (by Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-20 16:26:46 -07:00
Eric Anholt
d9ce4ac990 nir: Add a safety check that we don't remove dead I/O vars after lowering.
The pass only looks at var load/store intrinsics, not input load/store
intrinsics, so assert that we don't see the other type.

v2: Adjust comment indentation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-20 16:26:07 -07:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Samuel Iglesias Gonsálvez
e382890e25 nir: set default lod to texture opcodes that needed it but don't provide it
v2:
- Use helper to add a new source to the texture instruction.

v3:
- Use nir_tex_instr_src_index() to simplify the patch (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-20 08:29:09 +02:00
Jason Ekstrand
41c75b5354 nir: Add a helper for adding texture instruction sources
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-17 07:36:00 -07:00
Timothy Arceri
f1eb5e6399 nir: add component level support to remove_unused_io_vars()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
6af5e0bec9 nir: add variant of lower_io_to_scalar to be called earlier
This is intended to be called before nir_lower_io() so that we
can do some linking optimisations with the results. It can also
be used with drivers that don't use nir_lower_io() at all such
as RADV.

v2: pass mode mask rather than first and last stage integer.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Jason Ekstrand
3442c9fc3e nir: Get rid of the variable on vote intrinsics
This looks like a copy+paste error.  They don't actually write into that
variable as would be implied by putting the return there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Jason Ekstrand
a0947921eb nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0.  Switching it to bit >= 0 solves this problem.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Kenneth Graunke
a576c148cd nir: Make nir_shader_gather_info() track texelFetch texture accesses.
For TGSI-based drivers, st_glsl_to_tgsi records this information.
For NIR-based drivers, nir_shader_gather_info() will do so.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-12 17:22:42 -07:00
Dave Airlie
2d36efdb7f nir: bump loop unroll limit to 96.
With the ssao demo from Vulkan demos:
radv/rx480: 440->440fps
anv/haswell: 24->34 fps

The demo does a 0->32 loop across a ubo with 32 members.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 10:11:36 +10:00
Eric Anholt
c34295b1a3 nir: Move vc4's alpha test lowering to core NIR.
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).

This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.

v2: Add to meson.build as well.

Acked-by: Rob Clark <robdclark@gmail.com>
2017-10-10 11:42:04 -07:00
Dylan Baker
001b65a899 meson: add nir_linking_helpers.c to libnir
This was missed in a rebase, and doesn't affect radv or anv, only i965.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
7a5a986ddd meson: convert gtest to an internal dependency
In truth gtest is an external dependency that upstream expects you to
"vendor" into your own tree. As such, it makes sense to treat it more
like a dependency than an internal library, and collect it's
requirements together in a dependency object.

v2: - include with -isystem instead of setting compiler args (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:02:08 -07:00
Dylan Baker
d1992255bb meson: Add build Intel "anv" vulkan driver
This allows building and installing the Intel "anv" Vulkan driver using
meson and ninja, the driver has been tested against the CTS and has
seems to pass the same series of tests (they both segfault when the CTS
tries to run wayland wsi tests).

There are still a mess of TODO, XXX, and FIXME comments in here. Those
are mostly for meson bugs I'm trying to fix, or for additional things to
implement for other drivers/features.

I have configured all intermediate libraries and optional tools to not
build by default, meaning they will only be built if they're pulled in
as a dependency of a target that will actually be installed) this allows
us to avoid massive if chains, while ensuring that only the bits that
need to be built are.

v2: - enable anv, x11, and wayland by default
    - add configure option to disable valgrind
v3: - fix typo in meson_options (Nicholas)
v4: - Remove dead code (Eric)
    - Remove change to generator that was from v0 (Eric)
    - replace if chain with loop (Eric)
    - Fix typos (Eric)
    - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric)
v5: - rebase on util string buffer implementation

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v4)
2017-09-27 09:12:19 -07:00
Timothy Arceri
45ef10c06a nir: add some helpers for doing linking
The initial helpers add support for removing unused varyings between
stages.

V2:
- Moved the io mask helper function into this file rather than
  nir.h so it's not used elsewhere considering it doesn't handle
  all corner cases.
- Use bitmask rather than hash table to handle tcs outputs (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-26 22:37:02 +10:00
Timothy Arceri
4244bea859 nir: add always_active_io to nir variable
Will be used in nir link pass to decided if we can remove a varying
or not.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Dave Airlie
42d50c779b nir: put compact into bitfields in nir_variable_data
This being declared bool means it won't get merged with the previous
bitfields, this seems like an oversight rather than deliberate.

Noticed when running pahole.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:04 +10:00
Matt Turner
50e4099edf nir: Remove series of unnecessary conversions
Clang warns:

warning: absolute value function 'fabsf' given an argument of type
'const float64_t' (aka 'const double') but has parameter of type 'float'
which may cause truncation of value [-Wabsolute-value]

            float64_t dst = bit_size == 64 ? fabs(src0) : fabsf(src0);

The type of the ternary expression will be the common type of fabs() and
fabsf(): double. So fabsf(src0) will be implicitly converted to double.
We may as well just convert src0 to double before a call to fabs() and
remove the needless complexity, à la

            float64_t dst = fabs(src0);

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Jason Ekstrand
63e79a8a77 nir: Fix system_value_from_intrinsic for subgroups
A couple of the cases were backwards

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-28 08:57:52 -07:00
Jason Ekstrand
79d8d6b022 nir: Fix some whatespace
Somehow tabs got in there...

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-28 08:57:31 -07:00
Connor Abbott
de91461575 nir: fix algebraic optimizations
The optimizations are only valid for 32-bit integers. They were
mistakenly firing for 64-bit integers as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-01 12:20:49 -07:00
Nicolai Hähnle
e902ac3268 nir: add nir_lower_uniforms_to_ubo pass
This is a further lowering of default-block uniform loads that transforms
load_uniform intrinsics into load_ubo intrinsics. This simplifies the rest
of the backend.

v2: transform from load_uniform instead of straight from variables

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
bce6f99875 nir: add nir_lower_samplers_as_deref pass
This pass is a replacement for the nir_lower_samplers pass, which has the
advantage of keeping sampler references as derefs. This allows a unified
treatment of texture instructions and image intrinsics in the backend.
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
f1da97ef7a nir: add load_frag_coord system value intrinsic
Some drivers prefer to treat gl_FragCoord as a system value rather than
a fragment shader input, see Const.GLSLFragCoordIsSysVal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Nicolai Hähnle
5011923e09 nir: fix nir_lower_wpos_ytransform when gl_FragCoord is a system value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Nicolai Hähnle
b27c2d402e nir: add nir_instr_rewrite_deref
Allows modifying a texture instruction's texture and sampler derefs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Matt Turner
aff108f2fd nir: Optimize find_lsb/imsb/umsb error checks
Two of the ARB_shader_ballot piglit tests hit the find_lsb case,
removing some of the noise allowed me to better debug the test when it
was failing.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-20 16:56:50 -07:00
Matt Turner
1038d385a9 nir: Reduce destination size of ballot intrinsic when possible
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
3e7b8f6cd4 nir: Add pass to scalarize read_invocation/read_first_invocation
i965 will want these to be scalar operations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
636fe4d1c6 nir: Add intrinsics from ARB_shader_ballot
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
742cc6118a nir: Support lowering vote intrinsics
... trivially (as allowed by the spec!) by reusing the existing
nir_opt_intrinsics code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
d4c9d6a3b2 nir: Add pass to optimize intrinsics
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
ba2fbbf1c0 nir: Add intrinsics from ARB_shader_group_vote
These are intrinsics rather than opcodes, because they operate across
channels.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Kenneth Graunke
0320bb2c6c nir: Use nir_src_copy instead of direct assignments.
If the source is an indirect register, there is ralloc'd data.  Copying
with a direct assignment will copy the pointer, but the data will still
belong to the old instruction's memory context.  Since we're lowering
and throwing away instructions, that could free the data by mistake.

Instead, use nir_src_copy, which properly handles this.

This is admittedly not a common case, so I think the bug is real,
but unlikely to be hit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-18 23:44:50 -07:00
Timothy Arceri
3f0fb23b03 nir: fix nir_opt_copy_prop_vars() for arrays of arrays
Previously we only incremented the guide for a single
dimension/wildcard.

V2: rework logic to avoid code duplication

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-07-19 11:06:23 +10:00
Jason Ekstrand
ecf91898e0 nir/vars_to_ssa: Handle missing struct members in foreach_deref_node
This can happen if, for instance, you have an array of structs and there
are both direct and wildcard references to the same struct and some
members only have direct or only have indirect.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-19 11:06:23 +10:00
Connor Abbott
4df93a54f1 nir/lower_io_to_temporaries: don't set compact on shadow vars
The compact flag doesn't make sense on local variables, since the
packing on them is up to the driver. This fixes nir_validate assertions
in some cases, particularly when lower_io_to_temporaries is used on
per-vertex inputs/outputs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-13 14:45:25 -07:00
Connor Abbott
99ff7a9f1f nir: don't segfault when printing variables with no name
While normally we give variables whose name field is NULL a temporary
name when called from nir_print_shader(), when we were calling from
nir_print_instr() we never bothered, meaning that we just segfaulted
when trying to print out instructions with such a variable. Since
nir_print_instr() is meant to be called while debugging, we don't need
to bother too much about giving a consistent name, but we don't want to
crash in the middle of debugging.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-13 14:40:23 -07:00
Ilia Mirkin
f3958f1644 nir: copy front interpolation when creating fake back color input
Fixes a bunch of gl_BackColor interpolation tests that had explicit
interpolation specified on the fragment shader gl_Color.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-07-08 21:27:44 -04:00
Nicolai Hähnle
34df9525f6 nir: add NIR_PRINT environment variable
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-07-05 12:27:07 +02:00
Johnson Lin
8ff4be44b7 nir: Add a lowering pass for UYVY textures
Similar with support for YUYV but with byte order difference in sampler

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Juan A. Suarez Romero
4195a9450b nir: sge operation is defined for floating-point types
According to GLSL.std.450 spec, the operand for step() function must be
a floating-point. It does not restrict the value to 32-bit floats.

Reviewed by: Elie Tournier <elie.tournier@collabora.com>
2017-06-27 12:01:11 +02:00
Grazvydas Ignotas
29b9f35704 nir: make various getters take const pointers
This will allow to constify other things.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-10 16:48:45 +03:00
Thomas Helland
cfb696dc82 nir: Delete nir_array.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
bc3a2be6c9 nir: Remove unused include
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Eric Engestrom
63a8a88ac4 tree-wide: remove trailing backslash
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-07 01:18:09 +01:00
Rob Clark
6f65a1a211 nir/lower-atomics-to-ssbo: remove atomic_uint arrays too
Maybe there is a better way to do this.  But by the time we get to
assigning uniform locs, we want the atomic_uint's to all be gone,
otherwise we assert in st_glsl_attrib_type_size().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:34 -04:00
Rob Clark
5f6c034f82 nir/lower-atomics-to-ssbo: fix num_components
Fixes some piglits like arb_shader_atomic_counters-active-counters

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:34 -04:00