Commit graph

3954 commits

Author SHA1 Message Date
Eric Engestrom
dffeaa55dd util: use standard name for snprintf()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-19 22:39:38 +01:00
Jason Ekstrand
6301f80b84 nir: Only rematerialize comparisons with all SSA sources
Otherwise, you may end up moving a register read and that could result
in an incorrect shader.  This commit fixes a rendering issue in Elite:
Dangerous.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152
Fixes: 3ee2e84c60 "nir: Rematerialize compare instructions"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-19 19:45:36 +00:00
Daniel Schürmann
e352b4d650 spirv: Fix order of barriers in SpvOpControlBarrier
Semantically, the memory barrier has to come first to wait
for the completion of pending memory requests.
Afterwards, the workgroups can be synchronized.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-19 10:37:37 -07:00
Caio Marcelo de Oliveira Filho
4061a3f6c9 nir: use a switch when printing intrinsic indices
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-07-19 10:04:52 -07:00
Rhys Perry
e8644122ed nir/algebraic: mark a few comparison simplifications as precise
No vkpipeline-db changes found.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
Rhys Perry
79801b9d7d nir/algebraic: optimize contradictory iand operands
Some of these were found in a few GTAV, Rise of the Tomb Raider and
Shadow of the Tomb Raider shaders.

Results from vkpipeline-db run with ACO:
Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 220 -> 220 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13492 -> 11560 (-14.32 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 69 -> 69 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: use False instead of 0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
Timothy Arceri
30038dd5ec nir/lower_clip: add support for geometry shaders
This will be used to enable compat profile support for geometry
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
4b08bb4770 nir/lower_clip: add lower_clip_outputs() helper
This will be reused in the following patch to add support for clip
vertex lowering in geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
a59926b3ca nir/lower_clip: add create_clipdist_vars() helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
e38b930876 nir/lower_clip: add a find_clipvertex_and_position_outputs() helper
This will allow code sharing in a following patch that adds support
for lowering in geometry shaders. It also allows us to exit early
if there is no lowering to do which allows a small code tidy up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Caio Marcelo de Oliveira Filho
b6d4753568 nir/large_constants: De-duplicate constants
If a function has a constant and is called more than once, after
inlining we may end up with different variables representing the same
constant.  This commit looks into the data and de-duplicates them.

The first pass now will collect the constant data in a per variable
buffer, then de-duplication happens (by sorting then linear walk), and
the second pass will use the data in var->data.location.

One side-effect of the current implementation is that constants will
be reordered.  If this turns out to be a problem, it can be fixed.

An alternative strategy considered was to perform this on a
per-function basis and then merge the results; the problem is that we
would have to fix up the offsets during the merge.  Given the data we
have, the current patch is good enough.
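
The "sorting then linear walk" step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: struct const_blob, blob_cmp() and dedup_blobs() are hypothetical names, and the offset field merely plays the role that var->data.location plays in the description. Like the approach described above, the sketch reorders the entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-variable constant data. */
struct const_blob {
   const void *data;   /* constant bytes collected for one variable */
   size_t size;        /* number of bytes in data (assumed non-zero) */
   size_t offset;      /* offset assigned in the shared constant buffer */
};

/* Orders blobs by size first, then by content, so equal blobs end up adjacent. */
static int
blob_cmp(const void *pa, const void *pb)
{
   const struct const_blob *a = pa;
   const struct const_blob *b = pb;

   if (a->size != b->size)
      return a->size < b->size ? -1 : 1;
   return memcmp(a->data, b->data, a->size);
}

/* Sorts the blobs, then walks them once: a blob equal to its predecessor
 * reuses the predecessor's offset, any other blob gets fresh space.
 * Returns the total number of bytes needed after de-duplication.
 */
static size_t
dedup_blobs(struct const_blob *blobs, size_t count)
{
   size_t total = 0;

   assert(blobs != NULL || count == 0);
   if (count == 0)
      return 0;

   qsort(blobs, count, sizeof(*blobs), blob_cmp);

   for (size_t i = 0; i < count; i++) {
      if (i > 0 && blob_cmp(&blobs[i - 1], &blobs[i]) == 0) {
         blobs[i].offset = blobs[i - 1].offset;
      } else {
         blobs[i].offset = total;
         total += blobs[i].size;
      }
   }
   return total;
}
```

Sorting makes duplicates adjacent, so a single comparison with the previous entry is enough to detect them. The sketch ignores alignment of the assigned offsets, which a real implementation would have to respect.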

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
Caio Marcelo de Oliveira Filho
d9b67ad079 nir/large_constants: Use ralloc for var_infos
This will be used later on to allocate constant data for each
variable (and then deduplicate).  Also drop initializing found_read,
as it is already implicitly false in the literal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
Eric Anholt
251c64a53d nir: Allow internal changes to the instr in nir_shader_lower_instructions().
v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in
NIR, but doesn't generate a new txf_ms instruction as a replacement.  It's
pretty easy to allow that in nir_shader_lower_instructions, and it may be
common in lowering passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 11:28:56 -07:00
Andreas Baierl
f5804f1768 nir: Add gl_PointCoord system value
gl_PointCoord handling needs some special bits set in lima/ppir code
generation. Treating gl_PointCoord as a system value makes it easier
to distinguish from a regular varying.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Andreas Baierl
24af57407c glsl: Optionally declare gl_PointCoord as a system value
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Connor Abbott
4423552ff0 nir/lower_viewport: Check variable mode first
The location is unused for shader_temp and function_temp variables, and
due to the way nir_lower_io_to_temporaries demotes shader_out
variables to shader_temp variables, it happened to equal
VARYING_SLOT_POS for the gl_Position temporary, which made this pass
fail with the offline compiler due to this coming before vars_to_ssa.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-18 14:21:41 +02:00
Iago Toral Quiroga
50016d7718 nir: add a V3D-specific intrinsic for per-sample color writes
For per-sample color writes we need the output intrinsic to pack the
sample index, which is not provided with regular store_output intrinsics
unless we figured out a way to encode it into the base or the offset.

v2:
 - Drop the writemask (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 08:59:35 +02:00
Caio Marcelo de Oliveira Filho
891a232214 nir/large_constants: Use dominance information to find more constants
Relax the restriction that all the writes need to be in the first
block: now accept variables that have all the writes in the same
block, and all the reads are dominated by that block.

This lets the pass identify large constants that are local to a helper
function.  The writes will be at the place that the function is
inlined, possibly not in the first block (but still all in the same
block).
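
The relaxed condition described above (all writes in one block, every read dominated by that block) can be illustrated with a small sketch. It assumes a hypothetical CFG representation in which idom[b] is the immediate dominator of block b and the entry block is its own immediate dominator; dominates() and is_constant_candidate() are illustrative names, not NIR API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if block a dominates block b, by walking b's chain of
 * immediate dominators.  idom[entry] == entry marks the entry block.
 */
static bool
dominates(const int *idom, int a, int b)
{
   assert(idom != NULL);
   for (;;) {
      if (b == a)
         return true;
      if (idom[b] == b)
         return false;   /* reached the entry block without meeting a */
      b = idom[b];
   }
}

/* A variable qualifies if all of its writes are in the same block and
 * every read is in a block dominated by that block.
 */
static bool
is_constant_candidate(const int *idom,
                      const int *write_blocks, size_t num_writes,
                      const int *read_blocks, size_t num_reads)
{
   if (num_writes == 0)
      return false;

   for (size_t i = 1; i < num_writes; i++) {
      if (write_blocks[i] != write_blocks[0])
         return false;
   }

   for (size_t i = 0; i < num_reads; i++) {
      if (!dominates(idom, write_blocks[0], read_blocks[i]))
         return false;
   }
   return true;
}
```

The sketch works at block granularity only. A read in the same block as the writes would also need to come after them, which is not checked here.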

Results for vkpipeline-db in SKL:

total instructions in shared programs: 3624891 -> 3623145 (-0.05%)
instructions in affected programs: 79416 -> 77670 (-2.20%)
helped: 16
HURT: 0

total cycles in shared programs: 1458149667 -> 1458147273 (<.01%)
cycles in affected programs: 30154164 -> 30151770 (<.01%)
helped: 14
HURT: 2

total loops in shared programs: 2437 -> 2437 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 8813 -> 8745 (-0.77%)
spills in affected programs: 2894 -> 2826 (-2.35%)
helped: 8
HURT: 0

total fills in shared programs: 23470 -> 23392 (-0.33%)
fills in affected programs: 12248 -> 12170 (-0.64%)
helped: 6
HURT: 2

LOST:   0
GAINED: 0

Results for shader-db in SKL with Iris:

total instructions in shared programs: 15379442 -> 15379392 (<.01%)
instructions in affected programs: 837 -> 787 (-5.97%)
helped: 2
HURT: 2
helped stats (abs) min: 27 max: 27 x̄: 27.00 x̃: 27
helped stats (rel) min: 10.47% max: 10.67% x̄: 10.57% x̃: 10.57%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 1.23% max: 1.23% x̄: 1.23% x̃: 1.23%
95% mean confidence interval for instructions value: -39.14 14.14
95% mean confidence interval for instructions %-change: -15.51% 6.17%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4880 -> 4880 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 370677237 -> 370676567 (<.01%)
cycles in affected programs: 17852 -> 17182 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 338 max: 356 x̄: 347.00 x̃: 347
helped stats (rel) min: 13.98% max: 14.64% x̄: 14.31% x̃: 14.31%
HURT stats (abs)   min: 24 max: 24 x̄: 24.00 x̃: 24
HURT stats (rel)   min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%

total spills in shared programs: 11772 -> 11772 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 24948 -> 24948 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-17 12:50:32 -07:00
Jason Ekstrand
812b341578 nir/algebraic: Optimize comparisons and up-casts
These seem like obvious enough optimizations in the world of multiple
integer bit sizes.  The only known thing which hits these at the moment
is some Vulkan CTS tests for 16-bit SSBO values which like to up-cast
and check for equality.  However, it's something that's bound to come up
as we start seeing more integers in shaders.

The optimizations of comparisons of cast values with constants are
something which we would ideally do with range analysis.  However,
lacking that, we can do it in opt_algebraic as long as one side is a
constant.
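
The arithmetic behind these simplifications can be checked in plain C. The sketch below is not the set of patterns added to opt_algebraic; it only illustrates, for unsigned 16-bit values widened to 32 bits, why an equality test on two up-cast values can be done at the narrow size, and why an up-cast value compared against a constant can be decided from the constant's range. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original form: widen both 16-bit operands, then compare at 32 bits. */
static bool
eq_after_upcast(uint16_t a, uint16_t b)
{
   return (uint32_t)a == (uint32_t)b;
}

/* Simplified form: compare the 16-bit values directly. */
static bool
eq_narrow(uint16_t a, uint16_t b)
{
   return a == b;
}

/* Original form: widen a, then compare against a 32-bit constant. */
static bool
eq_upcast_const(uint16_t a, uint32_t c)
{
   return (uint32_t)a == c;
}

/* Simplified form: a constant outside the 16-bit range can never match,
 * otherwise the comparison can be done at 16 bits.
 */
static bool
eq_const_narrow(uint16_t a, uint32_t c)
{
   if (c > UINT16_MAX)
      return false;
   return a == (uint16_t)c;
}

/* Spot-checks that both pairs of forms agree on a sample of inputs. */
static void
check_upcast_identities(void)
{
   for (uint32_t a = 0; a <= UINT16_MAX; a += 257) {
      for (uint32_t b = 0; b <= UINT16_MAX; b += 4099)
         assert(eq_after_upcast((uint16_t)a, (uint16_t)b) ==
                eq_narrow((uint16_t)a, (uint16_t)b));

      assert(eq_upcast_const((uint16_t)a, a) ==
             eq_const_narrow((uint16_t)a, a));
      assert(eq_upcast_const((uint16_t)a, 0x10000u) ==
             eq_const_narrow((uint16_t)a, 0x10000u));
   }
}
```

The constant case is where range analysis would normally be needed: the range of an up-cast 16-bit value is known from its type alone, which is why a pattern-based pass can handle it when one side is a constant.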

In dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13, this commit, along
with the previous commit, reduces the number of instructions emitted on
Skylake from 55328 to 44546, a reduction of 20%.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-17 18:44:35 +00:00
Jason Ekstrand
e8505e982a nir/algebraic: Optimize comparing unpacked values
We could, in theory, add the same optimization for 64-bit unpack
operations but that's likely to fight with 64-bit integer lowering on
platforms which require it so it will require more infrastructure before
that will be a good idea.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-17 18:44:35 +00:00
Jason Ekstrand
9fed031e4e nir/algebraic: Print out the list of transforms in the C file
This helps greatly when debugging algebraic transform generators because
you can now actually see the output and verify that your transforms are
getting generated.

Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-17 18:44:35 +00:00
Eric Anholt
28a808a11b nir: Fix nir_lower_alu_to_scalar's instr filtering.
It was checking if the dest or src[0] SSA values were vectors, rather than
whether the ALU op was using the source as a vector resulting in a
nir_fdot4 making it through to vc4 and v3d:

vec1 32 ssa_6 = fdot4 ssa_4.xxxx, ssa_5

Fixes: c1cffa4249 ("nir/alu_to_scalar: Use the new NIR lowering framework")
v2: Use Jason's recommendation to look at input_sizes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-17 10:30:43 -07:00
Caio Marcelo de Oliveira Filho
e2939dc5a1 spirv: Bail when we see CounterBuffer decoration
This decoration can be ignored, so we can just skip the next steps.
Otherwise we'd have to also handle it in apply_var_decoration.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-16 20:31:12 -07:00
Jason Ekstrand
6fb685fe4b nir/regs_to_ssa: Handle regs in phi sources properly
Sources of phi instructions act as if they occur at the very end of the
predecessor block not the block in which the phi lives.  In order to
handle them correctly, we have to skip phi sources on the normal
instruction walk and handle them as a separate walk over the successor
phis.  While registers in phi instructions are a bit of an oddity, they can
happen when we temporarily go out-of-SSA for control-flow manipulations.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-16 23:28:03 +00:00
Jason Ekstrand
6394680f6b spirv: Add a warning for ArrayStride on arrays of blocks
It's disallowed according to the SPIR-V spec or at least I think that's
what the spec says.  It's in a section explicitly about explicit layout
of things in the StorageBuffer, Uniform, and PushConstant storage
classes so it's not 100% clear that it applies with other storage
classes.  However, it seems like it should apply in general and
violating it can trigger (fairly harmless) asserts in NIR.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-16 17:02:08 -05:00
Jason Ekstrand
548da20b22 nir/lower_doubles: Handle fdiv and fsub directly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
d7d35a9522 nir/lower_doubles: Use the new NIR lowering framework
One advantage of this is that we no longer need to run in a loop because
the new framework handles lowering instructions added by lowering.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
197a08dc69 nir/lower_doubles: Use "alu" for the nir_alu_instr
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
d65902c179 nir/lower_int64: Use the core NIR lowering framework
One advantage of this is that we no longer need to run in a loop because
the new framework handles lowering instructions added by lowering.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
c1cffa4249 nir/alu_to_scalar: Use the new NIR lowering framework
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
eb768b0a09 nir/alu_to_scalar: Use "alu" as the name for the nir_alu_instr
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
998d84fca5 nir/lower_system_values: Support lowering more intrinsics
Instead of only lowering system values from variables, lower most to intrinsics
and let the lowering framework immediately lower the intrinsic.  This
will result in a bit more instruction churn but it means that NIR code
builders can just use intrinsics instead of everything having to go
through variables.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
ae8caaadee nir/lower_system_values: Drop the context-aware builder functions
Instead of having context-aware builder functions, just provide lowering
for the system value intrinsics and let nir_shader_lower_instructions
handle the recursion for us.  This makes everything a bit simpler and
means that the lowering can also be used if something comes in as a
system value intrinsic rather than a load_deref.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
58ffd7fbf6 nir/lower_system_values: Use the new generic NIR lowering helpers
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
ce3af830cb nir/lower_subgroups: Use the new generic NIR lowering helpers
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
758fdce9fe nir: Add some generic helpers for writing lowering passes
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
c74b98486a nir: Add a helper for fetching the SSA def from an instruction
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Caio Marcelo de Oliveira Filho
1210e8caaf spirv: Ignore ArrayStride for storage classes that should not use it
The stride was already overridden when using
lower_workgroup_access_to_offsets, so elaborate a bit the commentary
there.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-15 16:18:57 -07:00
Caio Marcelo de Oliveira Filho
026cfa1099 spirv: Fix stride calculation when lowering Workgroup to offsets
Use alignment to calculate the stride associated with the pointer
types.  That stride is used when the pointers are cast to arrays.

Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b;
} will have an element size of 12 bytes, but the stride needs
to be 16 bytes to respect the 8 byte alignment.
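
The arithmetic in the example above amounts to rounding the element size up to the next multiple of the alignment. A minimal sketch, assuming power-of-two alignments; align_up() and array_stride() are hypothetical helpers, not the functions used in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Rounds size up to the next multiple of alignment.
 * alignment is assumed to be a non-zero power of two.
 */
static uint32_t
align_up(uint32_t size, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Stride between consecutive array elements: the element size padded
 * so that each element starts at a suitably aligned offset.
 */
static uint32_t
array_stride(uint32_t elem_size, uint32_t elem_align)
{
   return align_up(elem_size, elem_align);
}
```

With the values from the example, array_stride(12, 8) gives 16, whereas using the size alone would place the second element at offset 12 and break the 8-byte alignment.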

Fixes: 050eb6389a "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-15 16:18:46 -07:00
Jason Ekstrand
0ba508d7a3 nir,intel: Add support for lowering 64-bit nir_opt_extract_*
We need this when doing full software 64-bit emulation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-15 16:08:37 -05:00
Jason Ekstrand
7a19e05e8c nir/opt_if: Clean up single-src phis in opt_if_loop_terminator
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071
Fixes: 2a74296f24 "nir: add opt_if_loop_terminator()"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-15 19:58:51 +00:00
Alejandro Piñeiro
bb3bbdfbbd glsl/shader_cache: handle SPIR-V shaders
Right now we don't have cache support for SPIR-V shaders (from
ARB_gl_spirv). They are currently skipped properly because they fall
on the ff shader code path (no key, no name), but it would be better
to update the current comments and add some guards.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov
637b168470 nir/linker: Initialize UniformDataDefaults when using SPIR-V
Allocate UniformDataDefaults and fill in the data defaults when
linking a SPIR-V program. Among other things, this allows program
serialization to work.

It allows the following piglit test (when run on SPIR-V mode) to pass:
  spec/arb_get_program_binary/execution/uniform-after-restore.shader_test

v2: use memcpy to initialize UniformDataDefaults

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov
761b0fe95f glsl/serialize: Update write_program_resource_data() to handle NULL input and output variable names
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov
c3122d2431 glsl/serialize: Handle NULL uniform name in write_uniforms()
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Antia Puentes
cafc1a40d4 nir/types: Add glsl_type_is_unsized_array helper
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Antia Puentes
bfc5e46746 nir/linker: Fill TOP_LEVEL_ARRAY_SIZE and STRIDE
From the ARB_program_interface_query specification:

    "For the property TOP_LEVEL_ARRAY_SIZE, a single integer
    identifying the number of active array elements of the top-level
    shader storage block member containing to the active variable is
    written to <params>.  If the top-level block member is not
    declared as an array, the value one is written to <params>.  If
    the top-level block member is an array with no declared size, the
    value zero is written to <params>."

    "For the property TOP_LEVEL_ARRAY_STRIDE, a single integer
    identifying the stride between array elements of the top-level
    shader storage block member containing the active variable is
    written to <params>.  For top-level block members declared as
    arrays, the value written is the difference, in basic machine
    units, between the offsets of the active variable for consecutive
    elements in the top-level array.  For top-level block members not
    declared as an array, zero is written to <params>."

v2: move top_level_array_size and stride into nir_link_uniforms_state
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Antia Puentes
ae2ea5ec1f nir/linker: Compute the offset for non-trivial uniform types.
ARB_gl_spirv points out that the offset must be explicit; however, this is
only true for 'root' types. For complex types, like struct members or
arrays of arrays, it needs to be computed.

We are not using the offset stored in the gl_buffer_variables during
the uniform blocks linking because currently we do not have a way to
relate a gl_buffer_variable with its corresponding gl_uniform_storage.
The GLSL path uses the name for that, but we can not rely on that
because names are optional in SPIR-V.

Notice that uniforms not backed by a buffer object will have an offset
equal to -1, like in the GLSL path.

v2: add offset and var_is_in_block as per-variable state in
    nir_link_uniforms_state (Arcady)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00
Antia Puentes
e15c663d8e nir/linker: Add atomic counters to the program resource list
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-12 23:42:41 +02:00
Antia Puentes
e1464a1cf8 nir/linker: Add XFB resources to the program resource list
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-12 23:42:41 +02:00