gl_MaxVaryingFloats was not removed from core until 4.20 and is still
available in compat shaders. Found while writing some new CTS to test
the correct declarations of this constant.
Fixes: 0ebf4257a385i ("glsl: define some GLES3 constants in GLSL 4.1")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9514>
There are only a couple patterns that use is_finite, so the changes
aren't huge. Mostly shaders from Batman Arkham City and a few shaders
from Shadow of the Tomb Raider were affected.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tiger Lake
Instructions in all programs: 160902591 -> 160902489 (-0.0%)
SENDs in all programs: 6812270 -> 6812270 (+0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7429003266 -> 7428992369 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304539 (+0.0%)
Ice Lake
Instructions in all programs: 145301634 -> 145301460 (-0.0%)
SENDs in all programs: 6863890 -> 6863890 (+0.0%)
Loops in all programs: 38219 -> 38219 (+0.0%)
Cycles in all programs: 8798589772 -> 8798575869 (-0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334250 -> 334250 (+0.0%)
Skylake
Instructions in all programs: 135892010 -> 135891836 (-0.0%)
SENDs in all programs: 6802916 -> 6802916 (+0.0%)
Loops in all programs: 38216 -> 38216 (+0.0%)
Cycles in all programs: 8442597324 -> 8442583202 (-0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301116 -> 301116 (+0.0%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
Recall that when either value is NaN, fmax will pick the other value.
This means the result range of the fmax will either be the "ideal"
result range (calculated above) or the range of the non-NaN value.
Previously, something like fmax({gt_zero}, {lt_zero, is_a_number}) would
return a range of gt_zero. However, if the "gt_zero" parameter is NaN,
the actual result will be the "lt_zero" parameter.
This analysis depends on the is_a_number analysis also added in this MR.
Assuming this doesn't cause any unforeseen problems, I believe we should
wait a bit, then nominate a subset of the series for the stable
branches.
This fixes the piglit tests
tests/spec/glsl-1.30/execution/range_analysis_fmax_of_nan.shader_test
tests/spec/glsl-1.30/execution/range_analysis_fmin_of_nan.shader_test
from https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463.
Even with the added fsat fixes, range_analysis_fsat_of_nan.shader_test
still fails. There are some other issues there that will be addressed
in later commits (in another MR).
v2: Add fsat fixes. Suggested by Rhys.
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Shader-db results:
All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21049290 -> 21049314 (<.01%)
instructions in affected programs: 3175 -> 3199 (0.76%)
helped: 0
HURT: 17
HURT stats (abs) min: 1 max: 3 x̄: 1.41 x̃: 1
HURT stats (rel) min: 0.20% max: 1.89% x̄: 0.97% x̃: 0.92%
95% mean confidence interval for instructions value: 1.09 1.73
95% mean confidence interval for instructions %-change: 0.75% 1.19%
Instructions are HURT.
total cycles in shared programs: 855136176 -> 855136406 (<.01%)
cycles in affected programs: 37579 -> 37809 (0.61%)
helped: 0
HURT: 17
HURT stats (abs) min: 12 max: 20 x̄: 13.53 x̃: 14
HURT stats (rel) min: 0.17% max: 1.13% x̄: 0.79% x̃: 0.91%
95% mean confidence interval for cycles value: 12.53 14.53
95% mean confidence interval for cycles %-change: 0.63% 0.94%
Cycles are HURT.
Fossil-db results:
Tiger Lake
Instructions in all programs: 160901033 -> 160902591 (+0.0%)
SENDs in all programs: 6812270 -> 6812270 (+0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7430016795 -> 7429003266 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304539 (+0.0%)
Ice Lake
Instructions in all programs: 145299102 -> 145301634 (+0.0%)
SENDs in all programs: 6863890 -> 6863890 (+0.0%)
Loops in all programs: 38219 -> 38219 (+0.0%)
Cycles in all programs: 8798390846 -> 8798589772 (+0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334250 -> 334250 (+0.0%)
Skylake
Instructions in all programs: 135889478 -> 135892010 (+0.0%)
SENDs in all programs: 6802916 -> 6802916 (+0.0%)
Loops in all programs: 38216 -> 38216 (+0.0%)
Cycles in all programs: 8442624166 -> 8442597324 (-0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301116 -> 301116 (+0.0%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
This commit is necessary to support "nir/range_analysis: Fix analysis of
fmin and fmax with NaN".
No shader-db or fossil-db changes on any Intel platform.
v2: Pack and unpack is_a_number.
v3: Don't set is_a_number of integer constants. The bit pattern might
be NaN.
v4: Update handling of b2i32. intBitsToFloat(int(true)) is
1.401298464324817e-45. Return a value consistent with that.
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
The obvious changes to nir_search_helpers.h are in a separate commit to
limit the scope of this change. These additions are really only needed
to support the next commit "nir/range_analysis: Add "is a number" range
analysis tracking". This reduction in scope is intended to increase the
suitability for stable branches.
No shader-db or fossil-db changes on any Intel platform.
v2: Pack and unpack is_finite.
v3: Split nir_search_helpers.h changes into a separate commit.
v4: Remove assertion intended for the next commit. Update is_finite
comment for fsign. Both noticed by Rhys. Fix is_finite handling for
load_const vectors. If any element is not finite, set the flag to
false. This is the same way is_integral is already handled.
v5: Update handling of b2i32. intBitsToFloat(int(true)) is
1.401298464324817e-45. Return a value consistent with that.
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
This will greatly simplify a later commit. The assert(r.is_integral) in
the eq_zero case is dropped because I don't think it's useful anymore.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
This reverts commit 6c8cc9be12.
A spec bug was resolved confirming the original behaviour. Also it
seems the game Foundation no longer depends on the incorrect
behaviour.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9486>
In case a new gl_PointCoord shader input is created, increment shader
input count and set valid driver_location to the new input variable,
otherwise the input gets aliased to input 0 and shows up in NIR_PRINT
output as whatever shader input 0 is instead of gl_PointCoord. Also
set the input as used, otherwise it might get removed.
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9214>
When tyding up this section in 1e5b09f42f ("spirv: Tidy some repeated
if checks by using a switch statement.") the break got lost. It is
not a real problem because the next case just break, but better to
have it explicitly here instead of a FALLTHROUGH.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9440>
This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after()
with an SSA def, and rewrites all the users as needed.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This is currently an alias for nir_ssa_def_rewrite_uses but we move all
the instances which used it to write a non-SSA source to the newly named
helper.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This pass needs to run on the last shader in a pipeline writing
gl_Position. In GLES2, that's always the vertex shader, but in ES3.2, it
can be a geometry or tessellation shader. The shared code works the same
in this case, just make the assert more generous.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9444>
The search helps must *never* modify the instruction passed in, so let
the compiler enforce this.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9378>
Some games declare the wrong format, so we might want to disable this
optimization in that case.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: e4d75c22 ("nir/opt_shrink_vectors: shrink image stores using the format")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>
This appeared in softpipe's image operations, since NIR always uses
4-component values for the coords, while the GLSL IR only has 2 components
for a 2D image (for example).
arb_shader_image_load_store-shader-mem-barrier (which times out in CI and
spends its time inside of tgsi_exec) was spending 4/51 of its instructions
on moving these undefs around.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>
The original shared load op can't be reordered, so it might be better to
also not allow this for the lowered variant.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>
An earlier commit tried to make this shader compatible with GLSL 3.30,
but it requires, GL_ARB_gpu_shader_int64, which requires GLSL 4.00 and
GL 4.0 according to the extension spec. So we were failing to enable
the required extension, breaking compilation of this shader.
The original intention of that patch was to get this working on zink,
which at the time only supported GL 3.3. But now it supports later
OpenGL versions, so we don't need to do this any longer. Rather than
revert the patch and raise the version all the way back to 430, just
bump it to the require 400 at Ian Romanick's suggestion.
Fixes: 4d47b22bf0 ("glsl/float64: make this compatible with glsl 330")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3991
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9351>
The cache has been detangled from glsl and used outside it (with Vulkan drivers)
for years now.
This also cleans up the dependancies in the build file. The test doesn't
depend on the glsl lib but rather the util lib.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9327>
There are less copy instructions than sources, so instead of visiting each
source and rewriting it if it's uses a copy instruction, visit each copy
instruction and rewrite it's users.
Besides improving compile time, this also has a side effect of fixing a
rare situation where copy-propagation does not happen:
loop {
a = phi ..., b
c = vec ...
b = mov c.y
}
It might have been the case that a phi source could not be rewritten until
the copy was visited later.
Compile-time (nir_copy_prop):
Difference at 95.0% confidence
-2613.13 +/- 15.2094
-27.4333% +/- 0.150247%
(Student's t, pooled s = 17.963)
Comple-time (overall):
Difference at 95.0% confidence
-2627.89 +/- 201.557
-2.05404% +/- 0.156221%
(Student's t, pooled s = 238.048)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>
These were hurting performance of other passes.
Compile-time (overall):
Difference at 95.0% confidence
-5496.3 +/- 219.752
-4.11912% +/- 0.160285%
(Student's t, pooled s = 259.538)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>
At -O1 with GCC 10.2.1, _nir_visit_dest_indirect (declared ALWAYS_INLINE)
will fail to inline if it's caller (nir_foreach_dest) is not inlined,
because _nir_visit_dest_indirect is passed as a function pointer. This
results in a compilation error.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Fixes: 336bcbacd0 ("nir: inline nir_foreach_{src,dest}")
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4353
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9301>
In release builds, there should be no change, but in debug builds the
assert will help us catch undefined behavior resulting from using
util_cpu_caps before it is initialized.
With fix for u_half_test for MSVC from Jesse Natalie squashed in.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>
face
The opcode evaluates tha unnormalized coordinates, the length of the
major axis, and the cube face.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>
E.g. r600 a cube texture lookup uses a specific cube instruction
to evaluate the sample coordinates and the face ID, so that the cube
texture lookup can be lowered to a array texture lookup, thereby sharing
the code with the 2D array texture lopkup.
However, for TXD the given gradients still need to be three-component
vectors, so add a flag that the NIR validation knows that we deal with
cube texture that was lowered to an array and can validate accordingly.
v2: Handle new flag in serialization (Marek)
v3: Rebase so that the change does not require the patch to deduct the
number of offset and grad components from sampler type
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>
nir_tex_instr_src_index returns an int.
Fix defect reported by Coverity Scan.
Macro compares unsigned to 0 (NO_EFFECT)
unsigned_compare: This greater-than-or-equal-to-zero comparison of an unsigned value is always true. coord >= 0U.
Fixes: b154a4154b ("nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9181>
It's unnecessary to iterate twice for instructions outside loops.
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-630.64 +/- 6.18761
-27.0751% +/- 0.223134%
(Student's t, pooled s = 7.30785)
Compile-time (entire run):
Difference at 95.0% confidence
-749.54 +/- 48.8272
-1.82644% +/- 0.117838%
(Student's t, pooled s = 57.6672)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-319.51 +/- 5.67632
-12.0627% +/- 0.208076%
(Student's t, pooled s = 6.70399)
Compile-time (overall):
Difference at 95.0% confidence
-385.025 +/- 42.1124
-0.929489% +/- 0.10139%
(Student's t, pooled s = 49.7367)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Instead of a keeping a worklist of live instructions, use a bitset of live
ssa defs and iterate over instructions in reverse.
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-931.911 +/- 4.41383
-26.0263% +/- 0.105781%
(Student's t, pooled s = 5.21293)
Compile-time (overall):
Difference at 95.0% confidence
-882.245 +/- 28.3492
-2.08541% +/- 0.0665121%
(Student's t, pooled s = 33.4818)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
GPUs without native txs support (and without an emulation in sw)
can use this new lowering. Also it saves us from doing int/float
conversions.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>
Some nir lowerings might need to know if txs is supported by
the backend.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>
If a query is made of a vector ssa_def (possibly from an intermediate
result), return all_bits. If a constant source is a vector, swizzle
the correct component.
Unit tests were added for the constant vector cases. I don't see a
great way to make unit tests for the other cases.
v2: Add a FINIHSME comment about u16vec2 hardware.
Fixes: 96303a59ea ("nir: Add some range analysis for used bits")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9123>