Commit graph

1487 commits

Author SHA1 Message Date
Jason Ekstrand
ff2f44d865 intel/fs: Implement nir_intrinsic_load_global_constant
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6379>
2020-09-01 20:50:04 +00:00
Jason Ekstrand
cccb497d3c intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+
The immediate case is pretty uncommon to see but it can happen, in
theory.  BROADCAST is typically used to uniformize values and those are
usually 32-bit.  However, it does come up in some subgroup ops.

Fixes: 49c21802cb "intel/compiler: Split has_64bit_types into float/int"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6211>
2020-09-01 13:25:20 -05:00
Karol Herbst
70cbddc4a7 nir: use enum operator helper for nir_variable_mode and nir_metadata
those are used quite a bit

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6520>
2020-09-01 17:45:08 +00:00
Jason Ekstrand
4d18e71fea nir: Rename num_shared to shared_size
This one is always a size in bytes.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6524>
2020-09-01 17:30:51 +00:00
Jason Ekstrand
e8b3bc1d55 intel/nir: Lower things with > 4 components in lower_mem_access_bit_sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>
2020-08-31 17:04:40 +00:00
Jason Ekstrand
55ae704513 intel/fs: Add support for vec8 and vec16 ops
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>
2020-08-31 17:04:40 +00:00
Jason Ekstrand
b7db9ee320 intel/nir: Clean up lower_alpha_to_coverage a bit
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
b6fdb1405e intel/nir: Rewrite the guts of lower_alpha_to_coverage
I have no idea how this pass ever worked.  I guess it worked ok on the
one or two piglit tests but the whole thing seemed very fragile.  It
makes a number of undocumented and unasserted assumptions and they
aren't always valid.  This rewrite makes a number of changes:

 1. It now properly handles the case where the gl_SampleMask write comes
    before the gl_FragColor or gl_FragData[0] write.

 2. It should early-exit faster because it now looks at bits in
    shader_info::outputs_written instead of looking for variables.

 3. Instead of the fragile variable lookup where we try to look the
    variable up by both location and driver_location and match, we just
    use the driver_location calculations used by brw_fs_nir.

 4. It asserts that the index parameter to store_output is a constant
    instead of silently failing if it isn't.

 5. We now actually assert the implicit assumption that the two writes
    are in the same block.  We go even further and assert that they are
    in the last block in the shader.

 6. In the case where 3 or fewer components of the output are written,
    we explicitly choose to leave the sample mask alone.

Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Closes: #3166
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
72dc06e07e intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage
I'm honestly not sure how passing a builder by-value ever worked.  I
guess the struct is mostly copyable.  In any case, that's the wrong way
to use it and it's causing issues.

Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
c84e2784eb intel/nir: Allow splitting a single load into up to 32 loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
2020-08-21 22:49:54 +00:00
Jason Ekstrand
febe762246 intel/fs: Fix an assert in load_scratch
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
2020-08-21 22:49:54 +00:00
Jesse Natalie
d3faac7a15 nir: Add options to nir_lower_compute_system_values to control compute ID base lowering
If no options are provided, existing intrinsics are used.
If the lowering pass indicates there should be offsets used for global
invocation ID or work group ID, then those instructions are lowered to
include the offset.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
2020-08-21 22:07:05 +00:00
Jesse Natalie
2e1df6a17f nir: Move compute system value lowering to a separate pass
The actual variable -> intrinsic lowering stays where it is, but
ops which convert one intrinsic to be implemented in terms of
another have moved.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
2020-08-21 22:07:05 +00:00
Karol Herbst
e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Jason Ekstrand
1ccd681109 nir: Add an LOD parameter to image_*_size
The OpenCL image_width/height/depth functions have variants which can
take an LOD parameter.  More importantly, LLVM-SPIRV-Translator always
generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero.
Given that over half the hardware out there has an LOD field for image
size queries (based on a rudimentary scan through their NIR -> whatever
code), we may as well just add the source to the NIR intrinsic.  If this
is ever a problem for anyone, the lowering is pretty trivial.

I've also added asserts to everyone's drivers that should alert them if
they ever see an LOD other than zero.  This will never happen with GL or
Vulkan so there's no need for panic.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>
2020-08-20 20:48:10 +00:00
Louis-Francis Ratté-Boulianne
7dcb1d272f st/mesa: Replace UsesStreams by ActiveStreamMask for GS
Some drivers need to know which streams are used by a geometry
shader. Adding a mask of active streams makes the use of
UsesStreams superfluous as it's the equivalent of:

    ActiveStreamMask != (1 << 0)

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5984>
2020-08-18 11:17:26 +00:00
Jason Ekstrand
003b04e266 intel/compiler: Allow MESA_SHADER_KERNEL
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Caio Marcelo de Oliveira Filho
e2b6ccbdad intel/compiler: Use C99 array initializers for prog_data/key sizes
This is way better than a pile of STATIC_ASSERTs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Jason Ekstrand
8e1de8e5ac intel/cs_intrinsics: Handle 64-bit intrinsics
It's safe to do the math in 32 bits because they're all local workgroup
calculations.  We just need to do a conversion at the end.  For a couple
of intrinsics, we just turn them into 32-bit intrinsics and add a u2u64.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Mark Janes
cf52b40fb0 intel/fs: work around gen12 lower-precision source modifier limitation
GEN:BUG:1604601757 prevents source modifiers for multiplication of
lower precision integers.

lower_mul_dword_inst() splits 32x32 multiplication into 32x16, and
needs to eliminate source modifiers in this case.

Closes: #3329
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2020-08-10 13:30:45 -07:00
Mark Janes
ee06e47c5b intel/fs: Assert if lower_source_modifiers converts 32x16 to 32x32 multiplication
Lowering source modifiers will convert a 16bit source to a 32bit
value.  In the case of integer multiplication, this will reverse
previous lowering performed by lower_mul_dword_inst.

Assert to prevent an illegal DxD operation (and GPU hang).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2020-08-10 13:29:56 -07:00
Eric Anholt
023e6669cc i965: Enable vector shrinking in the vec4 backend.
This manages to make some extra vec operations that would turn into movs
go away.

brw shader-db:
total instructions in shared programs: 3895037 -> 3893221 (-0.05%)
total cycles in shared programs: 113832759 -> 113792154 (-0.04%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6050>
2020-08-03 21:26:45 +00:00
Matt Turner
c883c482be intel/compiler: Relax SENDS regioning assertions
The next commit fixes a mistake in the assembler and ends up running
afoul of this assertion.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5956>
2020-07-31 12:59:24 -07:00
David Stevens
6c11a7994d i965/i915: Add colorspace support to YUV sampling
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6122>
2020-07-31 07:27:03 +00:00
Boris Brezillon
025988f818 intel: Set int64_options to ~0 when lowering 64b ops
That's more future proof than setting each bit manually. Looks like we
already miss nir_lower_ufind_msb64 because of that.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
2020-07-30 16:54:24 +00:00
Boris Brezillon
bfee35b45c nir: Stop passing an options arg to nir_lower_int64()
This information is exposed through shader->options->lower_int64_options.
Removing the extra arg forces drivers to initialize this field correctly.

This also allows us to check the int64 lowering options from each int64
lowering helper and decide if we should lower the instructions we
introduce.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
2020-07-30 16:54:24 +00:00
Marcin Ślusarz
cb19fe24d3 intel/vec4: fix out of bounds read
NIR_MAX_VEC_COMPONENTS was bumped from 4 to 16 in a8ec4082
(2019.03.09, merged 2019.12.21)

float[4] array was added in acd7796a
(2019.06.11, merged 2019.07.11)

Found by Coverity.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3014

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Fixes: a8ec4082a4 ("nir+vtn: vec8+vec16 support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6067>
2020-07-30 10:41:00 +00:00
Jason Ekstrand
5c5555a862 nir: Add a find_variable_with_[driver_]location helper
We've hand-rolled this loop 10 places and those are just the ones I
found easily.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:58 +00:00
Jason Ekstrand
2956d53400 nir: Add nir_foreach_shader_in/out_variable helpers
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:57 +00:00
Francisco Jerez
4d73988f6f intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence.
This avoids some performance regressions on Gen12 platforms caused by
SIMD32 fragment shaders reported in titles like Dota2, TF2, Xonotic,
and GFXBench5 Car Chase and Aztec Ruins.

The most obvious pattern in the regressing shaders I identified among
these workloads is that they all had non-uniform discard statements,
which are handled rather optimistically by the current IR analysis
pass: No penalty is currently applied to the SIMD32 variant of the
shader in the form of differing branching weights like we do for other
control flow instructions in order to account for the greater
likelihood of divergence of a SIMD32 shader.

Simply changing that by giving the same treatment to discard
statements as we give to other branching instructions seemed to hurt
more than it helped on platforms earlier than Gen12, since it reversed
most of the improvement obtained from SIMD32 fragment shaders in
Manhattan for no measurable benefit in other workloads (Manhattan has
a handful of shaders with statically non-uniform discard statements
which actually perform better in SIMD32 mode due to their approximate
dynamic uniformity).  For that reason this change is applied to Gen12+
platforms only.

I've been running a number of tests trying to understand the
difference in behavior between Gen12 and earlier platforms, and most
of the evidence I've gathered seems to point at EU fusion being the
culprit: Unlike previous generations, on Gen12 EUs are arranged in
pairs which execute instructions in lockstep, giving an effective warp
size of 64 threads in SIMD32 mode, which seems to increase the
likelihood for control flow divergence in some of the affected shaders
significantly.

Fixes: 188a3659ae "intel/ir: Import shader performance analysis pass."
Reported-by: Caleb Callaway <caleb.callaway@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5910>
2020-07-23 01:40:06 +00:00
Jason Ekstrand
675d7b19a9 intel/fs: Use the correct logical op for global float atomics
Fixes: e644ed468f "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
2020-07-21 05:01:34 +00:00
Yevhenii Kolesnikov
36abb0c691 intel/compiler: don't propagate cmp to add if add is saturated
From the Kaby Lake PRM Vol. 7 "Assigning Conditional Flags":

   * Note that the [post condition signal] bits generated at
     the output of a compute are before the .sat.

Paragraph about post_zero does not mention saturation, but
testing it on actual GPUs shows that conditional modifiers
are applied after saturation.

   * post_zero bit: This bit reflects whether the final
     result is zero after all the clamping, normalizing,
     or format conversion logic.

For signed types we don't care about saturation: it won't
change the result of conditional modifier.

For floating and unsigned types there two special cases,
when we can remove inst even if scan_inst is saturated: G
and LE. Since conditional modifiers are just comparations
against zero, saturating positive values to the upper
limit never changes the result of comparation.

For negative values:
(sat(x) >  0) == (x >  0) --- false
(sat(x) <= 0) == (x <= 0) --- true

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2610

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4167>
2020-07-11 00:25:48 +00:00
Jordan Justen
8dfa072ed8 intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used
If INTEL_DEBUG=no16 is used, then simd16 will not be attempted. This,
in turn prevents simd32 from running, because we attempt to skip
simd32 when simd16 fails to compile.

This change more accurately recognizes when we attempted simd16, but
simd16 failed.

One easy way to cause an issue is to set both no8 and no16. Before
this change, we would be left with no FS program, even though simd32
could still be generated in some cases.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>
2020-07-09 15:44:57 -07:00
Jordan Justen
1a4a2f563b intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16
If no16 was specified, and the shader can't run in simd8 due to the
local_size, then we need to generate a simd32 program.

If both no8 and no16 are specified, then we need to generate a simd32
program.

Rework:
 * Drop update of `if` that would have changed `do32` to try simd32
   even if simd16 spilled registers. (Caio)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>
2020-07-09 15:44:34 -07:00
Timothy Arceri
1a8f918050 intel/compiler: add and fix up fallthrough comments for gcc warnings
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5714>
2020-07-02 12:11:30 +10:00
Matt Turner
8da810a7fb intel/compiler: Don't emit no-op cr0 changes
If mask is 0, we're asking for no changes to cr0.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>
2020-07-02 01:24:06 +00:00
Matt Turner
fe14dc98bf intel/compiler: Add assert that set bits are within mask
We generate bitfields of bits that we want to retain (mask) and bits
that we want to set (brw_mode) in the cr0 register, so the bits we want
to set should be in the set of bits we want to retain.

Also, remove the initialization of mask from
fs_visitor::emit_shader_float_controls_execution_mode since
brw_rnd_mode_from_nir initializes the mask parameter unconditionally.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>
2020-07-02 01:24:06 +00:00
Jason Ekstrand
561aaeeb48 intel/eu: Add the RNDU opcode
We don't want to use it on gen5 and earlier because only RNDD can be
done with a single instruction and we can implement RNDU(x) as -RNDD(-x)
so it's better to just do that when we have the instruction.  On gen6
and above, we may as well just use the right instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:54 +00:00
Jason Ekstrand
e0ab48e3ea intel/eu: Set the right subnr for ALIGN16 destinations
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:54 +00:00
Jason Ekstrand
8a0d772dca intel/eu: Add a brw_urb_dest_msg_type helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:54 +00:00
Kenneth Graunke
2c762955d4 intel/eu: Add a brw_urb_desc helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:53 +00:00
Jason Ekstrand
ecda98fbb2 intel/compiler: Expose brw_texture_offset to C
Some day we probably want to move it out of brw_shader if we're going to
share it with IBC but that can be another day.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:53 +00:00
Jason Ekstrand
479797e130 intel/fs: Move more prog_data setup into populate_wm_prog_data
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:53 +00:00
Jason Ekstrand
fc519cad57 intel/fs: Break wm_prog_data setup into a helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:53 +00:00
Jason Ekstrand
2687ec5ee6 intel/fs: Expose a couple of NIR lowering helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:53 +00:00
Arcady Goldmints-Orlov
04f77595f0 intel/compiler: Always apply sample mask on Vulkan.
With OpenGL, shader writes to the sample mask are ignored when not
rendering to a multisample render target. However, on Vulkan, writes to
the sample mask have still have their effect in that case.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3016

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5156>
2020-06-19 20:24:11 -05:00
Sagar Ghuge
a0ef4971d0 intel/compiler: Remove unnecessary optimization for MUL
2 source instruction only support immediate for src1 operand, so no
point in adding optimization condition for src0 oprand.

v2:
- Update commit message and don't remove ADD optimization (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5341>
2020-06-16 17:11:32 -07:00
Sagar Ghuge
d4f3f9390f intel/compiler: Optimize integer add with 0 into mov
Kaby Lake
total instructions in shared programs: 326560 -> 323616 (-0.90%)
instructions in affected programs: 178062 -> 175118 (-1.65%)
helped: 129
HURT: 0
helped stats (abs) min: 1 max: 118 x̄: 22.82 x̃: 8
helped stats (rel) min: 0.35% max: 6.56% x̄: 2.57% x̃: 2.47%
95% mean confidence interval for instructions value: -27.71 -17.93
95% mean confidence interval for instructions %-change: -2.81% -2.32%
Instructions are helped.

total cycles in shared programs: 43741127 -> 45397851 (3.79%)
cycles in affected programs: 40880261 -> 42536985 (4.05%)
helped: 94
HURT: 34
helped stats (abs) min: 5 max: 6160 x̄: 598.91 x̃: 45
helped stats (rel) min: 0.20% max: 34.86% x̄: 2.52% x̃: 1.09%
HURT stats (abs)   min: 1 max: 76198 x̄: 50383.00 x̃: 69677
HURT stats (rel)   min: 0.07% max: 48.41% x̄: 15.65% x̃: 6.49%
95% mean confidence interval for cycles value: 8023.10 17863.21
95% mean confidence interval for cycles %-change: <.01% 4.60%
Cycles are HURT.

total spills in shared programs: 1086 -> 978 (-9.94%)
spills in affected programs: 897 -> 789 (-12.04%)
helped: 24
HURT: 0

total fills in shared programs: 1686 -> 1584 (-6.05%)
fills in affected programs: 1371 -> 1269 (-7.44%)
helped: 24
HURT: 0

v2:
- Use brw_reg_type_is_integer (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5341>
2020-06-16 16:54:27 -07:00
Matt Turner
66111bc95a intel/compiler: Drop opt_sampler_eot()
Gen9 and Cherryview have the ability to mark texture instructions with
the End-of-thread bit under some conditions, which allows the texture
result to be written to the render target directly, rather than
returning to the EU.

In order to handle overlapping primitives correctly, we have to use the
'sendc' instruction which stalls until other threads potentially writing
to the same locations in the render target are retired. Unfortunately,
this stall happens before the texture is sampled (rather than in
parallel with stall), so for some literal edge cases (like the diagonal
edge between two triangles forming a rectangle) there can be a
performance penalty. As a result, it's probably not a good idea to use
this optimization in general.

I had planned to leave it enabled only for BLORP, where we use rectangle
primitives and are typically clearing/blitting an entire render target
without any overlapping primitives, but I noticed that the optimization
wasn't applied in some normal cases anyway. For example, in the piglit
test tests/shaders/glsl-fs-texture2d-bias.shader_test it is applied to
one BLORP-blit shader but not another due to some kind of mishandling of
register types (the destination register type of the texture operation
is UD while the color source of the render target write is F).

Additionally the instruction scheduler assumed that the combined texture
and render target write operation took 0 cycles, leading to cycle
estimates that are wildly inaccurate. Since the optimization was not
implemented for SIMD32 and our decision whether to use the SIMD32
program is made by comparing the estimated performance with that of the
SIMD16 shader, we wrongly threw out a bunch of SIMD32 programs that are
likely profitable.

   total cycles in shared programs: 472807891 -> 473784245 (0.21%)
   cycles in affected programs: 108277 -> 1084631 (901.72%)
   helped: 0
   HURT: 1290

   total sends in shared programs: 998955 -> 1000245 (0.13%)
   sends in affected programs: 1400 -> 2690 (92.14%)
   helped: 0
   HURT: 1290

   LOST:   0
   GAINED: 33

This patch shows no performance changes in Intel's Mesa performance CI.

Given the problems, the lack of evidence that the pass improves
performance, and the fact that the hardware feature was removed from
subsequent GPU generations, I think that the pass is not valuable and
should be removed.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5412>
2020-06-12 19:01:26 +00:00
Jason Ekstrand
92cfbb7d0c intel/nir: Call nir_metadata_preserve on !progress
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00