Ian Romanick
ed138f2861
nir/algebraic: Partially revert 3f782cdd25
...
I'm not sure what the logic was, but there is no opportunity for
anything to flush to zero here. 'a' is a Boolean value, and b2f
produces 1.0 or 0.0.
This was originally part of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3765/ .
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8910 >
2021-02-07 18:31:01 -08:00
Ian Romanick
5923742356
nir/algebraic: add patterns for a >> #b << #b and a << #b >> #b
...
Commit 5476d18183 ("nir/algebraic: add patterns for a >> #b << #b")
added the ushr version, but it missed the ishr. A bunch of compute
shaders with stores to shared storage generate the ishr pattern.
Enabling this optimization also enables the iadd/iand reassociation
(right after this hunk), and that enables merging of stores to shared
storage. A couple shaders have spills and fills hurt on some
platforms. These all occur in shaders that also have SENDs helped.
On Gen9 and Gen11, the helped SENDs more than makes up for the extra
spills and fills.
On Gen7 and Gen8, it's not as clear. All of the shaders affected are
compute shaders in DiRT Rally 2 or Bioshock Inifinite. The most
affected Bioshock shader on Broadwell looks like:
Before: CS SIMD8 shader: 1335 inst, 0 loops, 22411 cycles, 42:36 spills:fills, 159 sends, scheduled with mode lifo, Promoted 2 constants, compacted 21360 to 16528 bytes.
After: CS SIMD8 shader: 1175 inst, 0 loops, 25916 cycles, 96:135 spills:fills, 72 sends, scheduled with mode lifo, Promoted 2 constants, compacted 18800 to 13648 bytes.
The results on Haswell and Ivy Bridge are similar. Given that there
are only 2 promoted constants, MR !7698 won't have any effect.
There were no statistically significant changes on Gen9+ in Bioshock in
our performance CI. Gen8 isn't in that CI, and DiRT Showdown 2 is also
not included in that CI. It is possible that these shaders aren't used
in the settings or demos used in the CI.
The other pattern, which switches the order of the shifts, only helps a
couple shaders. If I wasn't already adding another pattern, I
definitely wouldn't bother with that one.
v2: s/ishr/ushr/ in the replacement for the ushr pattern. Noticed by
Rhys.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tiger Lake
total instructions in shared programs: 21052760 -> 21049269 (-0.02%)
instructions in affected programs: 59497 -> 56006 (-5.87%)
helped: 46
HURT: 0
helped stats (abs) min: 2 max: 552 x̄: 75.89 x̃: 53
helped stats (rel) min: 0.28% max: 43.43% x̄: 5.87% x̃: 4.10%
95% mean confidence interval for instructions value: -108.96 -42.82
95% mean confidence interval for instructions %-change: -8.38% -3.35%
Instructions are helped.
total cycles in shared programs: 855229761 -> 855148518 (<.01%)
cycles in affected programs: 8491373 -> 8410130 (-0.96%)
helped: 33
HURT: 15
helped stats (abs) min: 42 max: 26940 x̄: 6200.70 x̃: 4329
helped stats (rel) min: 0.09% max: 38.78% x̄: 7.97% x̃: 4.29%
HURT stats (abs) min: 2 max: 18132 x̄: 8225.33 x̃: 7288
HURT stats (rel) min: <.01% max: 13.37% x̄: 5.72% x̃: 4.53%
95% mean confidence interval for cycles value: -4331.52 946.40
95% mean confidence interval for cycles %-change: -6.78% -0.61%
Inconclusive result (value mean confidence interval includes 0).
total sends in shared programs: 989947 -> 989694 (-0.03%)
sends in affected programs: 523 -> 270 (-48.37%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37
helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53%
95% mean confidence interval for sends value: -93.95 -7.25
95% mean confidence interval for sends %-change: -58.48% -28.50%
Sends are helped.
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20033498 -> 20030552 (-0.01%)
instructions in affected programs: 59220 -> 56274 (-4.97%)
helped: 48
HURT: 0
helped stats (abs) min: 1 max: 465 x̄: 61.38 x̃: 39
helped stats (rel) min: 0.03% max: 42.27% x̄: 5.19% x̃: 3.90%
95% mean confidence interval for instructions value: -89.57 -33.18
95% mean confidence interval for instructions %-change: -7.49% -2.89%
Instructions are helped.
total cycles in shared programs: 979993675 -> 979840773 (-0.02%)
cycles in affected programs: 6738454 -> 6585552 (-2.27%)
helped: 46
HURT: 0
helped stats (abs) min: 42 max: 6265 x̄: 3323.96 x̃: 3579
helped stats (rel) min: 0.09% max: 37.38% x̄: 4.34% x̃: 2.39%
95% mean confidence interval for cycles value: -3664.70 -2983.21
95% mean confidence interval for cycles %-change: -6.63% -2.06%
Cycles are helped.
total spills in shared programs: 10659 -> 10661 (0.02%)
spills in affected programs: 36 -> 38 (5.56%)
helped: 1
HURT: 1
total fills in shared programs: 11551 -> 11551 (0.00%)
fills in affected programs: 70 -> 70 (0.00%)
helped: 1
HURT: 1
total sends in shared programs: 1032117 -> 1031785 (-0.03%)
sends in affected programs: 711 -> 379 (-46.69%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74
helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31%
95% mean confidence interval for sends value: -101.79 -31.01
95% mean confidence interval for sends %-change: -58.42% -30.55%
Sends are helped.
Broadwell
total instructions in shared programs: 17865005 -> 17862757 (-0.01%)
instructions in affected programs: 66438 -> 64190 (-3.38%)
helped: 49
HURT: 0
helped stats (abs) min: 1 max: 266 x̄: 45.88 x̃: 39
helped stats (rel) min: 0.03% max: 11.99% x̄: 3.73% x̃: 3.92%
95% mean confidence interval for instructions value: -59.15 -32.61
95% mean confidence interval for instructions %-change: -4.35% -3.12%
Instructions are helped.
total cycles in shared programs: 1031298803 -> 1031219023 (<.01%)
cycles in affected programs: 7253602 -> 7173822 (-1.10%)
helped: 45
HURT: 2
helped stats (abs) min: 18 max: 7828 x̄: 1928.33 x̃: 1918
helped stats (rel) min: <.01% max: 10.51% x̄: 1.58% x̃: 1.31%
HURT stats (abs) min: 3490 max: 3505 x̄: 3497.50 x̃: 3497
HURT stats (rel) min: 15.56% max: 15.64% x̄: 15.60% x̃: 15.60%
95% mean confidence interval for cycles value: -2174.88 -1220.01
95% mean confidence interval for cycles %-change: -2.00% 0.30%
Inconclusive result (%-change mean confidence interval includes 0).
total spills in shared programs: 20799 -> 20924 (0.60%)
spills in affected programs: 843 -> 968 (14.83%)
helped: 0
HURT: 4
total fills in shared programs: 27110 -> 27334 (0.83%)
fills in affected programs: 1824 -> 2048 (12.28%)
helped: 1
HURT: 4
total sends in shared programs: 1017935 -> 1017603 (-0.03%)
sends in affected programs: 711 -> 379 (-46.69%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74
helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31%
95% mean confidence interval for sends value: -101.79 -31.01
95% mean confidence interval for sends %-change: -58.42% -30.55%
Sends are helped.
Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16397496 -> 16395411 (-0.01%)
instructions in affected programs: 59384 -> 57299 (-3.51%)
helped: 49
HURT: 0
helped stats (abs) min: 1 max: 208 x̄: 42.55 x̃: 39
helped stats (rel) min: 0.03% max: 8.18% x̄: 3.74% x̃: 3.91%
95% mean confidence interval for instructions value: -53.59 -31.51
95% mean confidence interval for instructions %-change: -4.24% -3.23%
Instructions are helped.
total cycles in shared programs: 1035483504 -> 1035397592 (<.01%)
cycles in affected programs: 9379739 -> 9293827 (-0.92%)
helped: 45
HURT: 4
helped stats (abs) min: 10 max: 5600 x̄: 2164.51 x̃: 2350
helped stats (rel) min: <.01% max: 11.61% x̄: 1.93% x̃: 1.56%
HURT stats (abs) min: 2 max: 5756 x̄: 2872.75 x̃: 2866
HURT stats (rel) min: <.01% max: 24.65% x̄: 12.29% x̃: 12.26%
95% mean confidence interval for cycles value: -2293.06 -1213.56
95% mean confidence interval for cycles %-change: -2.42% 0.88%
Inconclusive result (%-change mean confidence interval includes 0).
total spills in shared programs: 17672 -> 17803 (0.74%)
spills in affected programs: 364 -> 495 (35.99%)
helped: 2
HURT: 2
total fills in shared programs: 20752 -> 20937 (0.89%)
fills in affected programs: 656 -> 841 (28.20%)
helped: 2
HURT: 2
total sends in shared programs: 1044703 -> 1044450 (-0.02%)
sends in affected programs: 523 -> 270 (-48.37%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37
helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53%
95% mean confidence interval for sends value: -93.95 -7.25
95% mean confidence interval for sends %-change: -58.48% -28.50%
Sends are helped.
No changes on Gen6 or earlier GPUs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852 >
2021-02-08 00:25:22 +00:00
Ian Romanick
6b0443a900
nir/algebraic: Fix a >> #b << #b for sizes other than 32-bit
...
The base mask previously used was 0xffffffff. This is not correct (but
should still work) for 16-bit and 8-bit values, but it means the high
32-bits of 64-bit values will get chopped off.
Instead of just restricting the pattern to 32-bits (as was done before
00b28a50b2 ), this extends the optimization in two ways:
1. Make it correct for other bit sizes.
2. Make it work for arbitrary shift counts.
This has the added benefit of reducing the number of patterns actually
added (7 previously, 4 now).
The "Reassociate for improved CSE" part is just reverted to its
pre-00b28a50b2c behavior. I doubt that pattern is likely to have much
impact outside 32-bits.
This change fixes the piglit tests
tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and
tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test.
All of the shaders helped in shader-db are vertex shaders on platforms
with vector-oriented vertex processing. The shaders contain ((x >> 16)
<< 16). These platforms set lower_extract_word, so the optimization
that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60
shaders involved, I didn't bother trying to add extract_XYZ versions of
these patterns to try to get those cases.
Fixes: 00b28a50b2 ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Haswell and earlier Intel GPUs had simlar results. (Haswell shown)
total instructions in shared programs: 16397554 -> 16397496 (<.01%)
instructions in affected programs: 7961 -> 7903 (-0.73%)
helped: 58
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.13% -0.85%
Instructions are helped.
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%)
cycles in affected programs: 75922 -> 75656 (-0.35%)
helped: 44
HURT: 2
helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2
helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06%
95% mean confidence interval for cycles value: -7.28 -4.29
95% mean confidence interval for cycles %-change: -1.03% -0.63%
Cycles are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852 >
2021-02-08 00:25:22 +00:00
Alyssa Rosenzweig
083843de1e
nir/lower_io: Fix grammar errors
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8846 >
2021-02-04 11:45:26 +00:00
Caio Marcelo de Oliveira Filho
a2414ada87
nir: Add nir_zero_initialize_shared_memory
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8708 >
2021-02-02 17:06:56 +00:00
Jason Ekstrand
774fae34f0
nir: Drop the lower_mem_constant_vars declaration
...
The function was removed in c730ace12b .
Fixes: c730ace12b "nir,clover: Drop nir_lower_mem_constant_vars"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8834 >
2021-02-02 16:34:22 +00:00
Jason Ekstrand
f064b7a42c
nir: Add some ssa-only fast-paths for nir_src rewrite
...
Basically every pass in NIR uses nir_ssa_def_rewrite_uses which calls
nir_instr_rewrite_src which is fairly complex because it handles all
sorts of non-SSA cases. Since we already know a priori that every
source written by nir_ssa_def_rewrite_uses is SSA, we can check new_src
once at the top of the function and cut out all that complexity.
While we're at it, we expose a new SSA-only nir_ssa_def_rewrite_uses_ssa
helper which takes an SSA def which avoids the one SSA check. It's also
more convenient 90% of the time.
Compile time as tested by Rhys Perry <pendingchaos02@gmail.com>
Difference at 95.0% confidence
-797.166 +/- 418.649
-0.566174% +/- 0.296441%
(Student's t, pooled s = 325.459)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8790 >
2021-02-02 15:35:55 +00:00
Yevhenii Kolesnikov
a678ec9b8c
nir/from_ssa: don't check for interference within the same set
...
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8246 >
2021-02-01 14:28:35 -06:00
Yevhenii Kolesnikov
fd05620e43
nir/from_ssa: consider defs in sibling blocks
...
If def a and def b are in sibling blocks, the one with higher
parent_instr's index does not necessarily come after the other.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3712
Fixes: 943ddb9458 "nir: Add a better out-of-SSA pass"
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8246 >
2021-02-01 14:27:56 -06:00
Jason Ekstrand
c7fc44f9eb
nir/from_ssa: Respect and populate divergence information
...
Reviewed-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726 >
2021-02-01 08:11:48 +00:00
Arcady Goldmints-Orlov
8fb6cbdcb6
nir: store the results of divergence analysis on loops
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726 >
2021-02-01 08:11:48 +00:00
Arcady Goldmints-Orlov
019449dad7
nir: handle v3d intrinsics in divergence analysis
...
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726 >
2021-02-01 08:11:48 +00:00
Arcady Goldmints-Orlov
349e4f3c65
nir: add more intrinsics to divergence analysis
...
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726 >
2021-02-01 08:11:48 +00:00
Caio Marcelo de Oliveira Filho
5de6c5973a
spirv: Implement SPV_KHR_workgroup_memory_explicit_layout
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8699 >
2021-01-27 22:20:53 +00:00
Caio Marcelo de Oliveira Filho
a9d230077f
nir: Two shared memory *blocks* may alias each other
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8699 >
2021-01-27 22:20:53 +00:00
Rhys Perry
30f40364f6
nir,spirv: allow non-uniform OpArrayLength
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7969 >
2021-01-27 13:00:33 +00:00
Caio Marcelo de Oliveira Filho
9f3d5e99ea
compiler: Use util/bitset.h for system_values_read
...
It is currently a bitset on top of a uint64_t but there are already
more than 64 values. Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585 >
2021-01-26 20:20:47 +00:00
Caio Marcelo de Oliveira Filho
ecd0ae09f9
nir/linking: Remove system_value handling from helper
...
All uses are passing variables of either nir_var_shader_in or
nir_var_shader_out modes. Note that currently there are more than 64
system values, so the uint64_t wouldn't be enough anyway.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585 >
2021-01-26 20:20:46 +00:00
Samuel Pitoiset
4c3ad4d065
nir/algebraic: mark more optimization with fsat(NaN) as inexact
...
These optimizations are duplicated from the main optimization table
to the late one... And I missed some in the original fix.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3368
Fixes: bc123c396a ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716 >
2021-01-26 17:06:23 +00:00
Gert Wollny
4099cdc97f
compiler/nir: Add support for lowering stores with nir_lower_instruction
...
The function is very convenient for lowering any type of instruction
that can be easily filtered, but so far instructions that didn't return
a value were siletly ignored.
Fix this by
- not requiring a return value in the instruction
- add a new special return value from the lowering implementation
function to indicated that an instruction that doesn't have a
return value must be removed, and
- don't try to collect and replace uses in this case.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8177 >
2021-01-26 15:27:17 +00:00
Rhys Perry
b729cd58d7
nir/algebraic: eliminate exact a*0.0 if float execution mode allow it
...
fossil-db (GFX10):
Totals from 611 (0.44% of 139391) affected shaders:
SGPRs: 40528 -> 40288 (-0.59%)
VGPRs: 16136 -> 16152 (+0.10%); split: -0.15%, +0.25%
CodeSize: 970192 -> 951036 (-1.97%)
MaxWaves: 10561 -> 10557 (-0.04%); split: +0.08%, -0.11%
Instrs: 174874 -> 172879 (-1.14%); split: -1.18%, +0.04%
fossil-db (GFX10.3):
Totals from 611 (0.44% of 139391) affected shaders:
SGPRs: 40680 -> 40488 (-0.47%)
VGPRs: 18368 -> 18276 (-0.50%); split: -0.57%, +0.07%
CodeSize: 1050712 -> 1033624 (-1.63%); split: -1.64%, +0.02%
MaxWaves: 8658 -> 8674 (+0.18%)
Instrs: 205364 -> 201220 (-2.02%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523 >
2021-01-26 11:36:13 +00:00
Rhys Perry
614ab26afd
nir/algebraic: optimize out exact a+0.0 if it's used only as a float
...
fossil-db (GFX10):
Totals from 133 (0.10% of 139391) affected shaders:
SGPRs: 7864 -> 7856 (-0.10%); split: -0.20%, +0.10%
VGPRs: 4884 -> 4836 (-0.98%)
CodeSize: 288932 -> 287084 (-0.64%)
MaxWaves: 1973 -> 1979 (+0.30%)
Instrs: 53899 -> 53550 (-0.65%)
fossil-db (GFX10.3):
Totals from 133 (0.10% of 139391) affected shaders:
SGPRs: 7832 -> 7835 (+0.04%); split: -0.06%, +0.10%
VGPRs: 5144 -> 5088 (-1.09%)
CodeSize: 318912 -> 316696 (-0.69%)
MaxWaves: 1735 -> 1746 (+0.63%)
Instrs: 65367 -> 64853 (-0.79%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523 >
2021-01-26 11:36:13 +00:00
Rhys Perry
2849f0b5aa
nir/algebraic: optimize out exact a*1.0 if it's used only as a float
...
fossil-db (GFX10):
Totals from 10180 (7.30% of 139391) affected shaders:
SGPRs: 549392 -> 549448 (+0.01%); split: -0.00%, +0.01%
VGPRs: 243228 -> 243008 (-0.09%); split: -0.11%, +0.02%
CodeSize: 12939080 -> 12603996 (-2.59%); split: -2.59%, +0.00%
MaxWaves: 186948 -> 186976 (+0.01%)
Instrs: 2497266 -> 2414648 (-3.31%)
fossil-db (GFX10.3):
Totals from 10180 (7.30% of 139391) affected shaders:
SGPRs: 549672 -> 549280 (-0.07%); split: -0.23%, +0.16%
VGPRs: 289296 -> 283672 (-1.94%); split: -2.83%, +0.88%
CodeSize: 13920180 -> 13255560 (-4.77%); split: -4.77%, +0.00%
MaxWaves: 151789 -> 153165 (+0.91%)
Instrs: 2756978 -> 2671517 (-3.10%); split: -3.10%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523 >
2021-01-26 11:36:13 +00:00
Caio Marcelo de Oliveira Filho
cb7352ae95
nir: Add a data pointer to the callback in nir_remove_dead_variables
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8706 >
2021-01-26 05:58:34 +00:00
Rhys Perry
f8072c133d
nir/opt_uniform_atomics: fix elect detection
...
fossil-db (GFX10.3):
Totals from 30 (0.02% of 139391) affected shaders:
SGPRs: 1736 -> 1712 (-1.38%)
CodeSize: 262116 -> 254728 (-2.82%)
Instrs: 50341 -> 48857 (-2.95%)
Cycles: 486384 -> 477556 (-1.82%)
VMEM: 4821 -> 4589 (-4.81%)
Copies: 5013 -> 4890 (-2.45%)
Branches: 2108 -> 1983 (-5.93%)
PreSGPRs: 1444 -> 1418 (-1.80%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8654 >
2021-01-25 21:04:52 +00:00
Rhys Perry
eb70c52abe
nir/opt_uniform_atomics: recognize more complicated invocation comparisons
...
For example, gl_LocalInvocationID.x + gl_LocalInvocationID.y * 8.
fossil-db (GFX10.3):
Totals from 8 (0.01% of 139391) affected shaders:
CodeSize: 15224 -> 14800 (-2.79%)
Instrs: 2880 -> 2798 (-2.85%)
Cycles: 44556 -> 44204 (-0.79%)
VMEM: 407 -> 473 (+16.22%); split: +17.69%, -1.47%
Copies: 491 -> 483 (-1.63%)
Branches: 200 -> 192 (-4.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8654 >
2021-01-25 21:04:52 +00:00
Erik Faye-Lund
bc0222d471
compiler/nir: add texcoord replace lowering pass
...
This lowering pass allows us to replace point-sprites to gl_PointCoord,
which better match what modern hardware does.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6473 >
2021-01-25 17:32:33 +00:00
Connor Abbott
6ca1ab3bb4
nir/lower_tex: Assume that nir_tex_instr::dest_type is sized
...
This reverts the code back to the form it was before, but with an
explicitly sized float32 instead of float, now that all producers are
switched over.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989 >
2021-01-25 11:21:59 +01:00
Connor Abbott
708c47e663
nir: Validate nir_tex_instr::dest_type bitsize
...
In theory, we could also verify this against the sampler type for
sampler derefs, but there are a number of complications there:
- SPIR-V 1.4 lets you override the signedness of integer samplers
per-instruction. So the base type may not match.
- mediump/RelaxedPrecision samplers may get lowered to f16 in the
instruction or may not. So the bitsize may not match.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989 >
2021-01-25 11:21:53 +01:00
Connor Abbott
b2da598ff9
nir: Use sized types for nir_tex_instr::dest_type
...
Revieweeviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989 >
2021-01-25 11:21:48 +01:00
Connor Abbott
3d803893da
nir/lower_bool: Rewrite dest_type for boolean destinations
...
This happens with nir_texop_samples_identical, and we need to keep
things consistent and (soon) keep the validator happy when expanding
booleans once we switch that to having a dest_type of bool1.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989 >
2021-01-25 11:21:42 +01:00
Connor Abbott
acd6616eab
nir/lower_tex: Handle sized tex destination types
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989 >
2021-01-25 11:21:26 +01:00
Jason Ekstrand
178820212b
nir/lower_int64: Lower 64-bit vote_ieq
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jason Ekstrand
731adf1e17
nir/lower_int64: Add lowering for 64-bit iadd shuffle/reduce
...
Lowering iadd is a bit trickier because we have to deal with potential
overflow but it's still not bad to do in NIR.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jason Ekstrand
bf7a114246
nir/lower_int64: Add lowering for some 64-bit subgroup ops
...
These are all pretty trivial because we can just split the op into one
subgroup op per half of the value. There's some question as to whether
these belong in lower_int64 or lower_subgroups but, on Intel, they key
decider of whether or not we need the lowering is based on whether or
not we have hardware int64 support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jason Ekstrand
da331f814f
nir/lower_int64: Fix lowering of f2[ui]64 for 16-bit float
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jason Ekstrand
70b4524de5
nir/lower_int64: Add a level of wrapper functions
...
We're about to start lowering a few intrinsics so we need support more
than just ALU.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Rhys Perry
a6d92eaf4f
nir/sink,nir/move: sink/move reorderable load_ssbo
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490 >
2021-01-21 18:07:03 +00:00
Rhys Perry
e200ce0996
nir/lower_io: fix array_length lowering if buffer is smaller than offset
...
Matches SPIR-V -> NIR implementation of OpArrayLength.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163 >
2021-01-21 11:53:12 +00:00
Jesse Natalie
13b21156e4
nir: Work around MSVC x86 internal compiler error
...
Fixes: 1fd8b466 ("nir,spirv: add sparse image loads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4108
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8581 >
2021-01-20 20:42:48 +00:00
Mike Blumenkrantz
652e51e1f3
nir/lower_uniforms_to_ubo: set explicit_binding on uniform_0
...
this variable is always bound to buffer index 0, so the binding info
here is actually useful
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7935 >
2021-01-14 17:29:09 +00:00
Mike Blumenkrantz
491e7decad
util/set: add the found param to search_or_add
...
this brings parity with the internal api
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8450 >
2021-01-14 13:51:35 +00:00
Rhys Perry
dfe429eb41
nir/loop_unroll: unroll more aggressively if it can improve load scheduling
...
Significantly improves performance of a Control compute shader. Also seems
to increase FPS at the very start of the game by ~5% (RX 580, 1080p,
medium settings, no MSAA).
fossil-db (Sienna):
Totals from 81 (0.06% of 139391) affected shaders:
SGPRs: 3848 -> 4362 (+13.36%); split: -0.99%, +14.35%
VGPRs: 4132 -> 4648 (+12.49%)
CodeSize: 275532 -> 659188 (+139.24%)
MaxWaves: 986 -> 906 (-8.11%)
Instrs: 54422 -> 126865 (+133.11%)
Cycles: 1057240 -> 750464 (-29.02%); split: -42.61%, +13.60%
VMEM: 26507 -> 61829 (+133.26%); split: +135.56%, -2.30%
SMEM: 4748 -> 5895 (+24.16%); split: +31.47%, -7.31%
VClause: 1933 -> 6802 (+251.89%); split: -0.72%, +252.61%
SClause: 1179 -> 1810 (+53.52%); split: -3.14%, +56.66%
Branches: 1174 -> 1157 (-1.45%); split: -23.94%, +22.49%
PreVGPRs: 3219 -> 3387 (+5.22%); split: -0.96%, +6.18%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6538 >
2021-01-13 18:54:18 +00:00
Daniel Schürmann
08fbd5d454
nir/divergence_analysis: mark load_push_constant as uniform
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8439 >
2021-01-12 14:46:13 +00:00
Daniel Schürmann
bd8e84eb8d
nir: replace .lower_sub with .has_fsub and .has_isub
...
This allows a more fine-grained control about whether
a backend supports one of these instructions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597 >
2021-01-11 19:13:51 +00:00
Daniel Schürmann
b3ce55b445
nir,vc4: Lower fneg to fmul(x, -1.0)
...
This patch also replaces lower_negate with lower_ineg / lower_fneg.
The fneg semantics have been clarified as of Version 1.5, Revision 1
of the SPIR-V specification, which means that the previous lowering
to fsub is not a viable solution anymore, and is replaced with
lowering to fmul(x, -1.0).
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597 >
2021-01-11 19:13:51 +00:00
Erico Nunes
faaba0d6af
nir/lower_vec_to_movs: don't vectorize unsupports ops
...
If the instruction being coalesced would be vectorized but the target
doesn't support vectorizing that op, skip coalescing.
Reuse the callbacks from alu_to_scalar to describe which ops should not
be vectorized.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506 >
2021-01-11 13:13:30 +00:00
Rhys Perry
b634d7f3e2
nir/opt_vectorize: fix srcs_equal() with two different non-const
...
To match hash_alu_src(), this should return false if both are different
non-const ssa defs.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391 >
2021-01-09 11:14:05 +00:00
Rhys Perry
bdf316ae7b
nir/opt_vectorize: fix typo in instr_can_rewrite()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391 >
2021-01-09 11:14:05 +00:00
Eric Anholt
670944ba04
nir/lower_locals_to_regs: Use the imul_imm helper instead of forcing it.
...
Cleaned up a bit of addressing math in the shader I just had to debug.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8373 >
2021-01-08 21:04:31 +00:00