Commit graph

1663 commits

Author SHA1 Message Date
Connor Abbott
f3e2c65041 nir: Return correct size in nir_assign_io_var_locations()
It was double-counting cases where multiple variables were assigned to
the same slot, and not handling the case where the last variable is a
compact variable.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Connor Abbott
dd81d8808d nir: Handle compact variables when assigning i/o locations
These are used in Vulkan for clip/cull distances, instead of the GLSL
lowering when the clip/cull arrays are shared.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Connor Abbott
fd5ed6b9d6 nir: Move st_nir_assign_var_locations() to common code
It isn't really doing anything Gallium-specific, and it's needed for
handling component packing, overlapping, etc.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-08 14:15:06 +02:00
Connor Abbott
27f0c3c15e radv: Make FragCoord a sysval
load_fragcoord is already handled in common code for radeonsi, so we
don't need to do anything to handle it. However, there were some passes
creating NIR with the varying, so we switch them over to the sysval. In
the case of nir_lower_input_attachments which is used by both radv and
anv, we add handling for both until intel switches to using a sysval.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Daniel Schürmann
c31f470066 anv,nir: Move lower_input_attachments pass from ANV to NIR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:02:50 +02:00
Rob Clark
5787a2dfe3 nir: add pass to lower load_interpolated_input
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-02 16:15:25 +00:00
Sagar Ghuge
80117117bd nir: Add optimization to use ROR/ROL instructions
v2: 1) Add more optimization rules for ROL/ROR (Matt Turner)
    2) Add lowering rules for ROL/ROR (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Sagar Ghuge
81d342e2a1 nir: Add urol and uror opcodes
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Alejandro Piñeiro
12355c7e91 nir: add is_in_ubo/ssbo/block helpers
Equivalent to the already existing ir_variable is_in_buffer_block and
is_in_shader_storage_block, adding the uniform buffer object one. I'm
using the short forms (ssbo, ubo) to avoid having method names too
long.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-30 16:58:26 -05:00
Ian Romanick
02c6cd8481 nir/serach: Increase maximum commutative expressions from 4 to 8
No shader-db change on any Intel platform.  No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-28 18:56:19 -07:00
Ian Romanick
1a43cf9a40 nir/algebraic: Don't mark expression with duplicate sources as commutative
There is no reason to mark the fmul in the expression

    ('fmul', ('fadd', a, b), ('fadd', a, b))

as commutative.  If a source of an instruction doesn't match one of the
('fadd', a, b) patterns, it won't match the other either.

This change is enough to make this pattern work:

    ('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)),
                          ('fadd', 1.0, ('fneg', a))),
                 ('fmul', ('flrp', a, 1.0, a), b))

This pattern has 5 commutative expressions (versus a limit of 4), but
the first fmul does not need to be commutative.

No shader-db change on any Intel platform.  No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).

There are more subpatterns that could be marked as non-commutative, but
detecting these is more challenging.  For example, this fadd:

    ('fadd', ('fmul', a, b), ('fmul', a, c))

The first fadd:

    ('fmul', ('fadd', a, b), ('fadd', a, b))

And this fadd:

    ('flt', ('fadd', a, b), 0.0)

This last case may be easier to detect.  If all sources are variables
and they are the only instances of those variables, then the pattern can
be marked as non-commutative.  It's probably not worth the effort now,
but if we end up with some patterns that bump up on the limit again, it
may be worth revisiting.

v2: Update the comment about the explicit "len(self.sources)" check to
be more clear about why it is necessary.  Requested by Connor.  Many
Python fixes style / idom fixes suggested by Dylan.  Add missing (!!!)
opcode check in Expression::__eq__ method.  This bug is the reason the
expected number of commutative expressions in the bitfield_reverse
pattern changed from 61 to 45 in the first version of this patch.

v3: Use all() in Expression::__eq__ method.  Suggested by Connor.
Revert away from using __eq__ overloads.  The "equality" implementation
of Constant and Variable needed for commutativity pruning is weaker than
the one needed for propagating and validating bit sizes.  Using actual
equality caused the pruning to fail for my ('fmul', ('fadd', 1, a),
('fadd', 1, a)) case.  I changed the name to "equivalent" rather than
the previous "same_as" to further differentiate it from __eq__.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-06-28 18:56:19 -07:00
Ian Romanick
cae1af4339 nir/search: Log Boolean constants instead of asserting
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-28 18:56:19 -07:00
Ian Romanick
8d6b35fffd nir/algebraic: Fail build when too many commutative expressions are used
Search patterns that are expected to have too many (e.g., the giant
bitfield_reverse pattern) can be added to a white list.

This would have saved me a few hours debugging. :(

v2: Implement the expected-failure annotation as a property of the
search-replace pattern instead of as a property of the whole list of
patterns.  Suggested by Connor.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-06-28 18:56:19 -07:00
Ian Romanick
57704b8d22 nir/algebraic: Fix whitespace error
Trivial

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-06-28 18:56:19 -07:00
Eric Anholt
8fd8964302 nir: Fix lowering of bitfield_insert to shifts.
The bfi/bfm behavior change replaced the bfi/bfm usage in
lower_bitfield_insert_to_shifts with actual shifts like the name says,
but it failed to handle the offset=0, bits==32 case in the new
lowering.

v2: Use 31 < bits instead of bits == 32, to get the 31 < (iand bits,
    31) -> false optimization.

Fixes regressions in dEQP-GLES31.*bitfield_insert* on freedreno.

Fixes: 165b7f3a44 ("nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-06-28 16:38:23 -07:00
Caio Marcelo de Oliveira Filho
085c0f1f13 nir/algebraic: Add helpers and a rule involving wrapping
The helpers are needed so we can use the syntax `instr(cond)` in the
algebraic rules.  Add simple rule for dropping a pair of mul-div of
the same value when wrapping is guaranteed to not happen.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-26 14:13:02 -07:00
Caio Marcelo de Oliveira Filho
ae37237713 nir: Add a no wrapping bits to nir_alu_instr
They indicate the operation does not cause overflow or underflow.
This is motivated by SPIR-V decorations NoSignedWrap and
NoUnsignedWrap.

Change the storage of `exact` to be a single bit, so they pack
together.

v2: Handle no_wrap in nir_instr_set.  (Karol)

v3: Use two separate flags, since the NIR SSA values and certain
    instructions are typeless, so just no_wrap would be insufficient
    to know which one was referred to.  (Connor)

v4: Don't use nir_instr_set to propagate the flags, unlike `exact`,
    consider the instructions different if the flags have different
    values.  Fix hashing/comparing.  (Jason)

Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-26 14:13:02 -07:00
Jonathan Marek
a70ff70158 nir: remove fnot/fxor/fand/for opcodes
There doesn't seem to be any reason to keep these opcodes around:
* fnot/fxor are not used at all.
* fand/for are only used in lower_alu_to_scalar, but easily replaced

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-06-26 15:26:10 -04:00
Jonathan Marek
0b5a483baa nir: opt_vectorize: combine different constant sources
We can vectorize instructions with different constant sources by creating
a new load_const and using that.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-26 14:56:28 -04:00
Timothy Arceri
f5f31612d3 nir: add tess support to nir_lower_clamp_color_outputs()
This will be used to add compat profile support for higher GL
versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-26 00:36:48 +00:00
Daniel Schürmann
a8b0b6e52b nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select.
bitfield_select is defined as:
bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)
matching the behavior of AMD's BFI instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
1403c3a7bf nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract
This lets us use the optimization pattern
(('ult', 31, ('iand', b, 31)), False) to remove the
bcsel instruction for code originating in D3D shaders.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
4eeb49ea71 nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources
The [iu]bfe and bfm instructions are defined to only use the five
least significant bits.
This optimizes a common pattern from D3D -> SPIR-V translation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
165b7f3a44 nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.

Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
a74f256c58 nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends.
These optimizations are based on the fact that
'and(a,b) <= umin(a,b)'.
For AMD, this series moves the optimization from LLVM to NIR,
so currently no vkpipeline-db changes here.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-06-24 18:42:20 +02:00
Boris Brezillon
56434450f6 nir/lower_tex: Add an assert() in nir_lower_txs_lod()
We don't expect the output of a TXS instruction to be wider than a
vec3. Add an assert() to make sure this never happens.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 09:15:53 -07:00
Jason Ekstrand
81e51b412e nir: Make nir_constant a vector rather than a matrix
Most places in NIR, we treat matrices like arrays.  The one annoying
exception to this has been nir_constant where a matrix is a first-class
thing.  This commit changes that so a matrix nir_constant is the same as
an array nir_constant.  This makes matrix nir_constants a tiny bit more
expensive but shrinks all others by 96B.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-06-19 21:05:54 +00:00
Connor Abbott
77be5b2f88 nir: Use reorderable access flag
No changes with radeonsi shader-db.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:28 +02:00
Connor Abbott
a1c737927c nir: Add a helper to determine if an intrinsic can be reordered
This is simple now, but we're going to be adding a few more conditions
to this later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:28 +02:00
Connor Abbott
c813c5776d nir: Add reorderable memory access enum
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:28 +02:00
Connor Abbott
75063fbac5 nir/copy_prop_vars: Ignore volatile accesses
The spec explicitly says that volatile writes can't be removed and
volatile reads do not guarantee that the same value will still be around
after the read, as if there were a barrier after each read/write. Just
ignore them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:28 +02:00
Connor Abbott
364996d70d glsl/nir: Propagate access qualifiers
We were completely ignoring these before, except for putting them on
variables. While we're here, don't set access qualifiers when converting
to bindless since glsl_to_nir will already have set a more accurate
qualifier that includes any qualifiers on struct members that are
dereferenced.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:27 +02:00
Connor Abbott
6f20643b47 nir: Allow qualifiers on copy_deref and image instructions
In the next commit, we'll properly handle access qualifiers on struct
members by propagating them to load/store instructions, but these
instructions had no way to specify the qualifier.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:27 +02:00
Connor Abbott
47e7c6961a nir: add a vectorization pass
This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a similar
model as CSE, where we do a depth-first approach and keep around a hash
set of instructions to be combined, but there are a few major
differences:

1. For now, we only support entirely per-component ALU operations.
2. Since it's not always guaranteed that we'll be able to combine
equivalent instructions, we keep a stack of equivalent instructions
around, trying to combine new instructions with instructions on the
stack.

The pass isn't comprehensive by far; it can't handle operations where
some of the sources are per-component and others aren't, and it can't
handle phi nodes. But it should handle the more common cases, and it
should be reasonably efficient.

[Alyssa: Rebase on latest master, updating with respect to typeless
moves]

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-18 06:43:30 -07:00
Boris Brezillon
296c5fd25d nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions
The V3D driver has an open-coded solution for this, and we need the
same thing for Panfrost, so let's add a generic way to lower TXS(LOD)
into max(TXS(0) >> LOD, 1).

Changes in v2:
* Use == 0 instead of !
* Rework the minification logic as suggested by Jason
* Assign cursor pos at the beginning of the function
* Patch the LOD just after retrieving the old value

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
0e489fd360 nir/lower_tex: Update ->sampler_dim value before calling get_texture_size()
get_texture_size() will create a txs instruction with ->sampler_dim set
to the original tex->sampler_dim. The condition to call lower_rect()
only checks the value of ->sampler_dim and whether lower_rect is
requested or not. This leads to an infinite loop when calling
nir_lower_tex() with the same options until it returns false.

In order to avoid that, let's move the tex->sampler_dim patching before
get_texture_size() is called. This way the txs instruction will have
->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try
to lower it on the subsequent passes.

Changes in v2:
* Add Jason R-b
* Add a comment explaining why we patch ->sampler_dim at the beginning
  of the lower_rect() func

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
352b1d9c31 nir/lower_tex: Actually report when projector lowering happened
The code considers that projector lowering was done even if it's not
really the case. Change the project_src() prototype to return a bool
encoding whether projector lowering happened or not and update the
progress var accordingly in nir_lower_tex_block().

---
Changes in v2:
* Add Jason R-b
* Drop the part suggesting that nir_lower_rect() could be called in
  a do-while(progress) loop.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Iago Toral Quiroga
2a2501247b nir: detect more dynamically uniform expressions
Shader-db results for v3d:

total instructions in shared programs: 9132728 -> 9119238 (-0.15%)
instructions in affected programs: 596886 -> 583396 (-2.26%)
helped: 1118
HURT: 224

total threads in shared programs: 234298 -> 234308 (<.01%)
threads in affected programs: 10 -> 20 (100.00%)
helped: 5
HURT: 0

total uniforms in shared programs: 3022949 -> 3022622 (-0.01%)
uniforms in affected programs: 29163 -> 28836 (-1.12%)
helped: 108
HURT: 37

total max-temps in shared programs: 1328030 -> 1327762 (-0.02%)
max-temps in affected programs: 10097 -> 9829 (-2.65%)
helped: 263
HURT: 15

total spills in shared programs: 3793 -> 3777 (-0.42%)
spills in affected programs: 432 -> 416 (-3.70%)
helped: 16
HURT: 0

total fills in shared programs: 4380 -> 4266 (-2.60%)
fills in affected programs: 828 -> 714 (-13.77%)
helped: 16
HURT: 0

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-14 08:00:52 +02:00
Connor Abbott
37b92b0ae6 nir: Don't manually index intrinsic index enum
This fixes a rebase fail in ea51275e07, and prevents it from happening
again. There's no reason to do this manually.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-13 17:10:41 +02:00
Daniel Schürmann
ea51275e07 nir: add intrinsics for AMD_shader_ballot
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-13 12:44:23 +00:00
Eduardo Lima Mitev
fb2169040a nir/opt_algebraic: Fix rules for imadsh_mix16
The rules added in patch 3addd7c are inverted:

It should be:

(al * bh) << 16 + c

instead of:

(ah * bl) << 16 + c

Fixes a number of regressions under
dEQP-GLES31.functional.draw_indirect.compute_interop.large.*
on Freedreno.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-10 22:27:46 +02:00
Eric Engestrom
440fe0eb43 nir: fix s/&&/||/ typo
Fixes: cd73b6174b "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-07 16:06:25 +01:00
Eduardo Lima Mitev
3addd7c8d9 nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16
For umul_low (al * bl), zero is returned if the low 16-bits word of either
source is zero.

for imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl'
is zero.

A couple of nir_search_helpers are added:

is_upper_half_zero() returns true if the highest word of all components of
an integer NIR alu src are zero.

is_lower_half_zero() returns true if the lowest word of all components of
an integer nir alu src are zero.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev
c27b3758fa nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes
'umul_low' is the low 32-bits of unsigned integer multiply. It maps
directly to ir3's MULL_U.

'imadsh_mix16' is multiply add with shift and mix, an ir3 specific
instruction that maps directly to ir3's IMADSH_M16.

Both are necessary for the lowering of integer multiplication on
Freedreno, which will be introduced later in this series.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Jason Ekstrand
d96878a66a nir/propagate_invariant: Don't add NULL vars to the hash table
Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 00:27:53 +00:00
Kenneth Graunke
c7d1b52a2c nir: Combine lower_fmod16/32 back into a single lower_fmod.
We originally had a single lower_fmod option.  In commit 2ab2d2e5, Sam
split 32 and 64-bit lowering into separate flags, with the rationale
that some drivers might want different options there.  This left 16-bit
unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.

Now that lower_fmod64 is gone (in favor of nir_lower_doubles and
nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a
single lower_fmod flag again.  I'm not aware of any hardware which
need lowering for one bitsize and not the other.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
edd45af9ba nir: Drop lower_fmod64 option.
nir_lower_doubles offers a wide variety of fp64 lowering, including
lowering fmod@64.  The version there also better handles imprecisions
due to lowered frcp@64.  Let's consolidate on one version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Jason Ekstrand
fe2fc30cb5 nir: Don't replace the nir_shader when NIR_TEST_SERIALIZE=1
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-06-05 20:07:28 +00:00
Jason Ekstrand
9eba6d9a88 nir: Don't replace the nir_shader when NIR_TEST_CLONE=1
Instead, we add a new helper which stomps one nir_shader and replaces it
with another.  The new helper effectively just changes which pointer
gets used for the base nir_shader.  It should be 99% as good at testing
cloning but without requiring that everything handle having the shader
swapped out from under it constantly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-06-05 20:07:28 +00:00
Alyssa Rosenzweig
d2d3cc66cf nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a)
This pattern was noticed in glmark's jellyfish scene.

v2: Add inexact qualifier due to NaN behaviour.

Minimal shader-db changes (slightly helped).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-06-04 19:57:19 +00:00