Commit graph

113092 commits

Author SHA1 Message Date
Gert Wollny
50b66622f1 r600/sfn: Count only literals that are not inline to split instruction groups
An instruction group can only support 4 distinct literals, but inline
constants count into this number, so skip them when counting.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
9c7ce4d76e r600/sfn: Fix using the result of a fetch instruction in next fetch
The result of a fetch instruction can't be used as source in the same CF
block, so force a new CF block when the result would be used in the same
vertex fetch block.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
67495ff9aa r600/sfn: Fix handling of GS inputs
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
58d6cda5f5 r600/sfn: Handle b2b1 like it was a mov
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
de7ea88ff8 r600/sfn: Fix null pointer deref in live range evalation
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
5d10e3ec60 r600/nir: Pin interpolation results to channel
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
5e036fef1f r600/sfn: Implementing instructions blocks
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
b51ced7306 r600/sfn: Fix setting alignments when lowering UBOs
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
bc9cf6adff r600/sfn: Reduce array limit for scratch usage
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Gert Wollny
6fdc75d1c6 r600: Dump a few more variables when requested
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4609>
2020-04-21 15:10:43 +00:00
Abhishek Kumar
f06e4ab319 anv/android: fix assert in anv_import_ahw_memory
Commit fixes assert that triggers when running
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer#bind_export_import_bind

on a debug build of Mesa.

Fixes: c79a528d ("anv/android: support import/export of AHardwareBuffer objects")
Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4655>
2020-04-21 11:50:15 +00:00
Danylo Piliaiev
829013d0ca st/mesa: Re-assign vs in locations after updating nir info for ffvp/ARB_vp
After call to nir_shader_gather_info - inputs_read may have changed so
st_nir_assign_vs_in_locations should be called for shader to remain in
sync with vbo state.

Fixes piglit tests:
  gl-1.0-fpexceptions
  gl-1.1-color-material-unused-normal-array
  arb_vertex_program-unused-attributes
regression on several gallium drivers.

Fixes: d684fb37bf
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4645>
2020-04-21 11:16:41 +00:00
Connor Abbott
ae169f38ce tu: Fix the advertised maxFragmentInputComponents
This appears to be limited by VPC_CNTL_0::NUMNONPOSVAR, which is an
8-bit bitfield with no possibility for expansion. Also, in practice
we'll be limited by the vertex shader output maximum, which includes
gl_Position, of 128, so that users won't be able to use more than 124
components anyways. Lower it to match the GL blob.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4641>
2020-04-21 10:04:13 +00:00
Connor Abbott
45ec9c0f3d freedreno/a6xx: Expand various varying-count bitfields
The extra bit needs to be used when using the maximum of 128 varying
components. I confirmed that PC_PRIMITIVE_CNTL_1 and SP_PRIMITIVE_CNTL
are expanded using a trace of the Vulkan blob with the maximum number of
varyings, and changed the others by analogy.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4641>
2020-04-21 10:04:13 +00:00
Pierre-Eric Pelloux-Prayer
56f174d14e st/omx: fix gcc warnings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4584>
2020-04-21 09:16:28 +02:00
Pierre-Eric Pelloux-Prayer
07071cac7b gallium/utils: silence strncpy warning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4584>
2020-04-21 09:16:26 +02:00
Pierre-Eric Pelloux-Prayer
dbfeec62c3 mesa: fix crash in find_value
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4584>
2020-04-21 09:16:18 +02:00
Jason Ekstrand
7c43b8ce1b nir: Delete the fnoise opcodes
As of the previous commit, they are never used.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
2020-04-21 06:16:13 +00:00
Jason Ekstrand
4386c06770 glsl: Hard-code noise to zero in builtin_functions.cpp
Version 4.4 of the GLSL spec changed the definition of noise*() to
always return zero and earlier versions of the spec allowed zero as a
valid implementation.

All drivers, as far as I can tell, unconditionally call lower_noise()
today which turns ir_unop_noise into zero.  We've got a 10-year-old
comment in there saying "In the future, ir_unop_noise may be replaced by
a call to a function that implements noise."  Well, it's the future now
and we've not yet gotten around to that.  In the mean time, the GLSL
spec has made doing so illegal.

To make things worse, we then pretend to handle the opcode in
glsl_to_nir, ir_to_mesa, and st_glsl_to_tgsi even though it should never
get there given the lowering.  The lowering in st_glsl_to_tgsi defines
noise*() to be 0.5 which is an illegal implementation of the noise
functions according to pre-4.4 specs.  We also have opcodes for this in
NIR which are never used because, again, we always call lower_noise().

Let's just kill the whole opcode and make builtin_builder.cpp build a
bunch of functions that just return zero.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
2020-04-21 06:16:13 +00:00
Timothy Arceri
95f555a93a st/glsl_to_nir: make use of nir linker for linking uniforms
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
2020-04-21 01:57:34 +00:00
Timothy Arceri
0f79e0f7c6 glsl: fix gl_nir_set_uniform_initializers() for bindless textures
We need to skip opaque variables inside blocks, this is handled
elsewhere and will cause a crash here.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
2020-04-21 01:57:34 +00:00
Timothy Arceri
9546440227 glsl: add bindless support to nir uniform linker
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
2020-04-21 01:57:34 +00:00
Jason Ekstrand
a4b36cd3dd intel/fs: Coalesce when the src live range is contained in the dst
Consider the following case:

    // g119-123 are written somewhere above
    mul.sat(16)   g67<1>F    g6.4<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g69<1>F    g6.5<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g71<1>F    g6.6<0,1,0>F   g125<8,8,1>F
    mov(16)       g119<1>F   g67<8,8,1>F
    mov(16)       g121<1>F   g69<8,8,1>F
    mov(16)       g123<1>F   g71<8,8,1>F

We should be able to coalesce it into

    mul.sat(16)   g119<1>F   g6.4<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g121<1>F   g6.5<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g123<1>F   g6.6<0,1,0>F   g125<8,8,1>F

What's stopping us is an overly conservative check for writes to the two
registers being coalesced.  The check walks over the intersection of
their live ranges and checks for no writes to either one.  However,
because the register which starts the live range (the mul.sat in this
case) is inside that intersection, we flag it as a write in the
intersection and don't coalesce.  However, this case is safe because the
destination register of the copy is never read after the source is
written.

Shader-db changes on ICL:

    total instructions in shared programs: 16043613 -> 16042610 (<.01%)
    instructions in affected programs: 43036 -> 42033 (-2.33%)
    helped: 226
    HURT: 0
    helped stats (abs) min: 1 max: 30 x̄: 4.44 x̃: 4
    helped stats (rel) min: 0.09% max: 26.67% x̄: 4.89% x̃: 3.43%
    95% mean confidence interval for instructions value: -4.86 -4.02
    95% mean confidence interval for instructions %-change: -5.57% -4.22%
    Instructions are helped.

    total cycles in shared programs: 334766372 -> 334710124 (-0.02%)
    cycles in affected programs: 617548 -> 561300 (-9.11%)
    helped: 214
    HURT: 2
    helped stats (abs) min: 15 max: 1512 x̄: 263.21 x̃: 212
    helped stats (rel) min: 0.30% max: 75.36% x̄: 25.30% x̃: 21.58%
    HURT stats (abs)   min: 40 max: 40 x̄: 40.00 x̃: 40
    HURT stats (rel)   min: 0.15% max: 0.15% x̄: 0.15% x̃: 0.15%
    95% mean confidence interval for cycles value: -277.91 -242.90
    95% mean confidence interval for cycles %-change: -27.58% -22.55%
    Cycles are helped.

No spill/fill changes or gained/lost

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4627>
2020-04-21 01:00:24 +00:00
Jason Ekstrand
14b8d979db intel/fs: Rename block to scan_block in can_coalesce_vars
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4627>
2020-04-21 01:00:24 +00:00
Jonathan Marek
064d39e620 radv: use common nir_convert_ycbcr
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Jonathan Marek
7870d71459 anv: use common nir_convert_ycbcr
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Jonathan Marek
71820c6b02 nir: convert_ycbcr: preserve alpha channel
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Jonathan Marek
f8558fb1ce nir: add common convert_ycbcr for vulkan csc
Copied from anv, replaced state with passing model/range directly.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Dave Airlie
c2d8a4bf17 nir/linking: fix issue with two compact variables in a row. (v2)
If we have a clip dist float[1] compact followed by a tess factor
float[2] we don't want to overlap them, but the partial check
only happens for non-compact vars.

This fixes some issues seen with my sw vulkan layer with
dEQP-VK.clipping.user_defined.clip_distance*

v2: v1 failed with clip/cull mixtures, since in that
case the cull has a location_frac to follow after the clip
so only reset if we get a location_frac of 0 in a subsequent
clip var

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4635>
2020-04-20 21:08:54 +00:00
Rafael Antognolli
4abf0837cd anv: Add support for new MMAP_OFFSET ioctl.
v2: Update getparam check (Ken).

[jordan.l.justen@intel.com: use 0 offset for MMAP_OFFSET]

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
0d387da083 anv: Add anv_device parameter to anv_gem_munmap.
Also update all of its callers.

On the next commit, the device will be used by anv_gem_munmap to choose
whether we need to call the valgrind code or not, depending on which
type of mmap we are using.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
d1c1ead7cd iris/bufmgr: Add support for MMAP_OFFSET ioctl.
Use the new DRM_IOCTL_I915_GEM_MMAP_OFFSET ioctl when available.

[jordan.l.justen@intel.com: iris port]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Update getparam check (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
ae6f06c509 i965/bufmgr: Add support for MMAP_OFFSET ioctl.
Use the new DRM_IOCTL_I915_GEM_MMAP_OFFSET ioctl when available.

v2: update getparam check (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
5bc3f52dd8 iris/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.
We want to add a new ioctl for mmap'ing buffers, so let's avoid
duplicating that code on both functions by extracting it from them
first.

[jordan.l.justen@intel.com: iris port]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Rename helper function names (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
a42d715784 i965/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.
We want to add a new ioctl for mmap'ing buffers, so let's avoid
duplicating that code on both functions by extracting it from them
first.

v2: Update helper function names (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:05 -07:00
Caio Marcelo de Oliveira Filho
a1f6ae4744 spirv: Fix propagation of OpVariable access flags
After the decorations of a variable are evaluated, propagate the
access flag to the associated vtn_pointer.  This was done when
creating the pointer but at that point there was no access flags for
the variable.

Inline the pointer creation to make this point clearer, in isolation
the helper made the impression that the value was being propagated.

Issue found by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4620>
2020-04-20 16:46:06 +00:00
Caio Marcelo de Oliveira Filho
c76f2292b5 intel/fs,vec4: Properly account SENDs in IVB memory fence
Change brw_memory_fence to return the number of messages emitted, and
use that to update the send_count statistic in code generation.

This will fix the book-keeping for IVB since the memory fences will
result in two SEND messages.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4646>
2020-04-20 09:29:09 -07:00
Daniel Schürmann
c3c1f4d6bc aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101 aco: fix 64bit fsub
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Erik Faye-Lund
ed29b24e23 gtest: Update to 1.10.0
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4576>
2020-04-20 11:57:11 +00:00
Samuel Pitoiset
59427b6d1d nir/opt_algebraic: lower 64-bit fmin3/fmax3/fmed3
This unconditionally lowers 64-bit fmin3/fmax3/fmed3 because
AMD hardware doesn't have native instructions, and no drivers
except RADV uses these instructions.

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.f64.*
with ACO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
2020-04-20 06:59:47 +00:00
Samuel Pitoiset
eed0ace466 nir/lower_int64: lower imin3/imax3/umin3/umax3/imed3/umed3
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.i64.*
with ACO because this backend compiler expects most of the 64-bit
operations to be lowered.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
2020-04-20 06:59:47 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0 radeonsi: skip vs output optimizations for some outputs
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
2020-04-20 08:45:16 +02:00
Timothy Arceri
839818332c nir/gcm: dont move movs unless we can replace them later with their src
This helps us avoid moving the movs outside if branches when there
src can't be scalarized.

For example it avoids:

   vec4 32 ssa_7 = tex ssa_6 (coord), 0 (texture), 0 (sampler),
   if ... {
      r0 = imov ssa_7.z
      r1 = imov ssa_7.y
      r2 = imov ssa_7.x
      r3 = imov ssa_7.w
      ...
   } else {
      ...
      if ... {
         r0 = imov ssa_7.x
         r1 = imov ssa_7.w
         ...
      else {
         r0 = imov ssa_7.z
         r1 = imov ssa_7.y
         ...
      }
      r2 = imov ssa_7.x
      r3 = imov ssa_7.w
   }
   ...
   vec4 32 ssa_36 = vec4 r0, r1, r2, r3

Becoming something like:

   vec4 32 ssa_7 = tex ssa_6 (coord), 0 (texture), 0 (sampler),
   r0 = imov ssa_7.z
   r1 = imov ssa_7.y
   r2 = imov ssa_7.x
   r3 = imov ssa_7.w

   if ... {
      ...
   } else {
      if ... {
         r0 = imov r2
         r1 = imov r3
         ...
      else {
         ...
      }
      ...
   }

While this is has a smaller instruction count it requires more work
for the same result. With more complex examples we can also end up
shuffling the registers around in a way that requires more registers
to use as temps so that we don't overwrite our original values along
the way.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Timothy Arceri
e4e5beee8a nir/gcm: be more conservative about moving instructions from loops
Here we only pull instructions further up control flow if they are
constant or texture instructions. See the code comment for more
information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Timothy Arceri
bf4a6c99d2 nir/gcm: allow derivative dependent intrinisics to be moved earlier
We can't move them later as we could move them into non-uniform
control flow, but moving them earlier should be fine.

This helps avoid a bunch of spilling in unigine shaders due to
moving the tex instructions sources earlier (outside if branches)
but not the instruction itself.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Jason Ekstrand
50a6dd0d65 nir/gcm: Prefer the instruction's original block
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Jason Ekstrand
d4cf2df01a nir/gcm: Delete dead instructions
Classically, global code motion is also a dead code pass.  However, in
the initial implementation, the decision was made to place every
instruction and let conventional DCE clean up the dead ones.  Because
any uses of a dead instruction are unreachable, we have no late block
and the dead instructions are always scheduled early.  The problem is
that, because we place the dead instruction early, it  pushes the
placement of any dependencies of the dead instruction earlier than they
may need to be placed.  In order prevent dead instructions from
affecting the placement of live ones, we need to delete them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Jason Ekstrand
dca3f351e5 nir/gcm: Add a real concept of "progress"
Now that the GCM pass is more conservative and only moves instructions
to different blocks when it's advantageous to do so, we can have a
proper notion of what it means to make progress.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00
Jason Ekstrand
5b1615fdb7 nir/gcm: Move block choosing into a helper function
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
2020-04-20 03:46:29 +00:00