Creates and returns a nir_builder from a cursor. The nir_function_impl
is retrieved using said cursor. This should be fine as long as it is not
used on extracted control flow.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23883>
Manually constructing the chip-id will stop working with future devices.
And now that we get the generation from the device table, we can't be
sloppy about using a bogus dev_id.
Fixes: 00900b76e0 ("freedreno: Decouple GPU gen from gpu_id/chip_id")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23953>
We need to keep the container job as a manual one, while the others are
always disabled.
Fixes: c9de0d2977 ("ci/microsoft: rename manual rules according to rest introduced rules")
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23968>
I would rather this be after several of these complicated lowering passes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23926>
This is to allow setting required subgroup size and
full subgroups on more than just the compute stage.
Use an enum (not the actual subgroup size integer)
so that we can have some bits reserved there for
future use.
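A minimal sketch of the enum idea, for illustration only. All names below
(required_subgroup_size, stage_key, required_size_to_lanes) are hypothetical
and are not taken from the driver; the point is only that a small enum fits in
a narrow bitfield and leaves a spare encoding, which the raw integer would not.

```c
#include <assert.h>

/* Hypothetical names, not the driver's actual definitions. */
enum required_subgroup_size {
   REQUIRED_SUBGROUP_SIZE_NONE = 0, /* no size requested */
   REQUIRED_SUBGROUP_SIZE_32 = 1,
   REQUIRED_SUBGROUP_SIZE_64 = 2,
   /* encoding 3 is intentionally left free for future use */
};

/* Per-stage key: 2 bits for the enum, 1 bit for "full subgroups". */
struct stage_key {
   unsigned subgroup_required_size : 2; /* enum required_subgroup_size */
   unsigned subgroup_require_full : 1;
};

/* Translate the enum back to a lane count; 0 means "no requirement". */
static unsigned
required_size_to_lanes(unsigned required_size)
{
   assert(required_size <= 3);

   switch (required_size) {
   case REQUIRED_SUBGROUP_SIZE_32:
      return 32;
   case REQUIRED_SUBGROUP_SIZE_64:
      return 64;
   default:
      return 0;
   }
}
```

Storing the integer 64 directly would need 7 bits per stage; the enum needs 2.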
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23925>
This cleans up a lot and helps to generate much better code. There
are only benefits on GPUs without inline immediate support.
shader-db results on GC2000:
total instructions in shared programs: 237168 -> 235101 (-0.87%)
instructions in affected programs: 17297 -> 15230 (-11.95%)
helped: 758
HURT: 0
helped stats (abs) min: 1 max: 24 x̄: 2.73 x̃: 2
helped stats (rel) min: 7.14% max: 29.41% x̄: 14.47% x̃: 14.29%
95% mean confidence interval for instructions value: -2.94 -2.51
95% mean confidence interval for instructions %-change: -14.84% -14.09%
Instructions are helped.
total temps in shared programs: 85553 -> 84969 (-0.68%)
temps in affected programs: 2879 -> 2295 (-20.28%)
helped: 584
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 5.00% max: 25.00% x̄: 21.48% x̃: 20.00%
95% mean confidence interval for temps value: -1.00 -1.00
95% mean confidence interval for temps %-change: -21.76% -21.21%
Temps are helped.
total immediates in shared programs: 154800 -> 154800 (0.00%)
immediates in affected programs: 0 -> 0
helped: 0
HURT: 0
total loops in shared programs: 0 -> 0
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
LOST: 0
GAINED: 0
No changes on GC3000 and GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23947>
This is to inform you of some planned downtime in the LAVA lab as follows:
* Start: 2023-07-03 07:00 UTC
* End: 2023-07-03 11:00 UTC
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23759>
When we enable the farm again, we don't want to run all the manual
jobs again, since some of them may take more than 1 hour.
We just have to wait until the nightly run.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23846>
The function iterator should be able to be modified in this foreach loop,
and the later patches need this.
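For reference, a generic sketch of what makes a foreach loop safe against
modification of the current element. This is not the NIR macro; struct node,
foreach_node_safe and remove_odd are hypothetical names on a plain singly
linked list. The macro caches the successor before the body runs, so the body
may unlink the current node.

```c
#include <assert.h>
#include <stddef.h>

struct node {
   int value;
   struct node *next;
};

/* Hypothetical macro: the next pointer is read before the body executes,
 * so the body is free to unlink or clobber the current node. */
#define foreach_node_safe(it, head)                                      \
   for (struct node *it = (head), *it##_next = it ? it->next : NULL;     \
        it != NULL;                                                      \
        it = it##_next, it##_next = it ? it->next : NULL)

/* Unlink every node with an odd value while iterating; returns new head. */
static struct node *
remove_odd(struct node *head)
{
   struct node **link = &head;

   foreach_node_safe(n, head) {
      if (n->value & 1) {
         *link = n->next;
         n->next = NULL; /* would terminate a non-safe loop early */
      } else {
         link = &n->next;
      }
   }

   assert(*link == NULL);
   return head;
}
```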
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23960>
This frees up the shorter names for the intrinsic-based versions that will
replace them.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23956>
This frees up the shorter names for the new register-based intrinsics.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23956>
It really isn't that hard. This drops the roundmode optimization but otherwise
should be at parity to what there was before, and it's massively more competent
at it anyway.
total instructions in shared programs: 1514477 -> 1508444 (-0.40%)
instructions in affected programs: 645848 -> 639815 (-0.93%)
helped: 2712
HURT: 187
Instructions are helped.
total bundles in shared programs: 645069 -> 642999 (-0.32%)
bundles in affected programs: 136233 -> 134163 (-1.52%)
helped: 1242
HURT: 319
Bundles are helped.
total quadwords in shared programs: 1130469 -> 1125969 (-0.40%)
quadwords in affected programs: 379780 -> 375280 (-1.18%)
helped: 1878
HURT: 376
Quadwords are helped.
total registers in shared programs: 90577 -> 90633 (0.06%)
registers in affected programs: 5627 -> 5683 (1.00%)
helped: 309
HURT: 294
Inconclusive result (value mean confidence interval includes 0).
total threads in shared programs: 55594 -> 55607 (0.02%)
threads in affected programs: 118 -> 131 (11.02%)
helped: 43
HURT: 33
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 1399 -> 1371 (-2.00%)
spills in affected programs: 345 -> 317 (-8.12%)
helped: 10
HURT: 4
total fills in shared programs: 5273 -> 5133 (-2.66%)
fills in affected programs: 1035 -> 895 (-13.53%)
helped: 12
HURT: 4
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
Some instructions are not able to swizzle their sources, so we conservatively
refused to propagate moves into them to avoid needing a swizzle on the source.
This is too conservative: we only need to do this if the move swizzles. If there
is only an identity swizzle on the move, we can propagate it without issue. This
will mitigate some instruction count regression from the later modifier
propagation, which will leave lots of moves that need to be propagated.
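The relaxed check can be sketched roughly as below. This is an illustration
under assumptions, not the driver code: NUM_COMPONENTS, swizzle_is_identity and
can_propagate_move are hypothetical names, and the real IR tracks more state
than a flat swizzle array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_COMPONENTS 4 /* hypothetical vector width for the sketch */

/* A swizzle is the identity if component c reads component c. */
static bool
swizzle_is_identity(const unsigned swizzle[NUM_COMPONENTS])
{
   assert(swizzle != NULL);

   for (unsigned c = 0; c < NUM_COMPONENTS; ++c) {
      if (swizzle[c] != c)
         return false;
   }

   return true;
}

/* Old rule: refuse whenever the consumer cannot swizzle.
 * Relaxed rule: refuse only if the move itself actually swizzles. */
static bool
can_propagate_move(bool consumer_can_swizzle,
                   const unsigned move_swizzle[NUM_COMPONENTS])
{
   return consumer_can_swizzle || swizzle_is_identity(move_swizzle);
}
```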
total instructions in shared programs: 1514834 -> 1514477 (-0.02%)
instructions in affected programs: 132297 -> 131940 (-0.27%)
helped: 349
HURT: 3
Instructions are helped.
total bundles in shared programs: 645093 -> 645069 (<.01%)
bundles in affected programs: 9650 -> 9626 (-0.25%)
helped: 42
HURT: 23
Bundles are helped.
total quadwords in shared programs: 1130751 -> 1130469 (-0.02%)
quadwords in affected programs: 78790 -> 78508 (-0.36%)
helped: 269
HURT: 21
Quadwords are helped.
total registers in shared programs: 90563 -> 90577 (0.02%)
registers in affected programs: 163 -> 177 (8.59%)
helped: 4
HURT: 16
Registers are HURT.
total spills in shared programs: 1400 -> 1399 (-0.07%)
spills in affected programs: 2 -> 1 (-50.00%)
helped: 1
HURT: 0
total fills in shared programs: 5276 -> 5273 (-0.06%)
fills in affected programs: 151 -> 148 (-1.99%)
helped: 1
HURT: 3
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we need to insert a mov in order to schedule a branch, we do not schedule
anything writing to the source of that mov in the same bundle to avoid a
data race between the read and the write. That's too conservative, though: it is
legitimate to write in the first part of the ALU word (VMUL/SADD stages) and
then read from the second part (VADD/SMUL/VLUT stages). Reset the
predicate.exclude when going from scheduling the latter stages to the former, to
allow a sequence of code like:
FCMP.vector 0.xyzw, ...
branch 0.x
to be scheduled as
vmul.FCMP.vector 0.xyzw
smul r31.w, 0.x
branch 0.x
rather than getting split up into two bundles.
This mitigates a cycle count regression from the copyprop change.
total instructions in shared programs: 1514856 -> 1514834 (<.01%)
instructions in affected programs: 3087 -> 3065 (-0.71%)
helped: 5
HURT: 1
Inconclusive result (value mean confidence interval includes 0).
total bundles in shared programs: 645327 -> 645093 (-0.04%)
bundles in affected programs: 40498 -> 40264 (-0.58%)
helped: 230
HURT: 68
Bundles are helped.
total quadwords in shared programs: 1130554 -> 1130751 (0.02%)
quadwords in affected programs: 75323 -> 75520 (0.26%)
helped: 49
HURT: 231
Quadwords are HURT.
total registers in shared programs: 90559 -> 90563 (<.01%)
registers in affected programs: 119 -> 123 (3.36%)
helped: 5
HURT: 8
Inconclusive result (value mean confidence interval includes 0).
total threads in shared programs: 55590 -> 55594 (<.01%)
threads in affected programs: 4 -> 8 (100.00%)
helped: 4
HURT: 0
Threads are helped.
total spills in shared programs: 1402 -> 1400 (-0.14%)
spills in affected programs: 289 -> 287 (-0.69%)
helped: 1
HURT: 1
total fills in shared programs: 5285 -> 5276 (-0.17%)
fills in affected programs: 448 -> 439 (-2.01%)
helped: 2
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we have multiple reads of the same SSA def in the same block, we don't need
to emit multiple copies for it, we can just reuse a copy (OR'ing in the mask,
knowing the source is already fully written since it's SSA). This will prevent
some regressions in moves from the copyprop patch.
There is a bit of a tradeoff here between increased pressure and reduced
instruction count but I'm not too worried. The effect on pressure seems all over
the place -- register use decreases overall, threads increase (great!) but a few
shaders that were *already spilling*, spill a bit worse. I'm not terribly
worried there.
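A rough sketch of the copy reuse, under assumptions: block_copies, get_copy and
MAX_DEFS are hypothetical names, and the real pass works on IR instructions
rather than a table. Each block keeps one copy per SSA def; a second read of
the same def returns the existing copy and ORs its components into the mask.

```c
#include <assert.h>

#define MAX_DEFS 64 /* hypothetical bound for the sketch */

struct copy {
   unsigned src;  /* SSA def being copied */
   unsigned mask; /* components the copy must write */
};

/* Per-block state; reset at the start of every block. */
struct block_copies {
   struct copy copies[MAX_DEFS];
   unsigned count;
   int copy_for_def[MAX_DEFS]; /* index into copies, or -1 if none yet */
};

static void
block_copies_init(struct block_copies *b)
{
   b->count = 0;
   for (unsigned i = 0; i < MAX_DEFS; ++i)
      b->copy_for_def[i] = -1;
}

/* Return the copy feeding this read, creating it only on first use. */
static unsigned
get_copy(struct block_copies *b, unsigned def, unsigned read_mask)
{
   assert(def < MAX_DEFS);

   int idx = b->copy_for_def[def];
   if (idx < 0) {
      assert(b->count < MAX_DEFS);
      idx = (int)b->count++;
      b->copies[idx].src = def;
      b->copies[idx].mask = 0;
      b->copy_for_def[def] = idx;
   }

   /* Safe because the source is SSA and therefore already fully written. */
   b->copies[idx].mask |= read_mask;
   return (unsigned)idx;
}
```

The reused copy stays live from its first use to its last, which is where the
extra register pressure comes from.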
total instructions in shared programs: 1518289 -> 1514856 (-0.23%)
instructions in affected programs: 292854 -> 289421 (-1.17%)
helped: 1557
HURT: 232
Instructions are helped.
total bundles in shared programs: 646903 -> 645327 (-0.24%)
bundles in affected programs: 91872 -> 90296 (-1.72%)
helped: 910
HURT: 256
Bundles are helped.
total quadwords in shared programs: 1133728 -> 1130554 (-0.28%)
quadwords in affected programs: 187170 -> 183996 (-1.70%)
helped: 1399
HURT: 44
Quadwords are helped.
total registers in shared programs: 90640 -> 90559 (-0.09%)
registers in affected programs: 2676 -> 2595 (-3.03%)
helped: 202
HURT: 124
Inconclusive result (%-change mean confidence interval includes 0).
total threads in shared programs: 55561 -> 55590 (0.05%)
threads in affected programs: 50 -> 79 (58.00%)
helped: 23
HURT: 6
Threads are helped.
total spills in shared programs: 1386 -> 1402 (1.15%)
spills in affected programs: 231 -> 247 (6.93%)
helped: 2
HURT: 13
total fills in shared programs: 5159 -> 5285 (2.44%)
fills in affected programs: 1282 -> 1408 (9.83%)
helped: 11
HURT: 16
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
1. Always calculate when asked. This is the sort of optimization that just
introduces bugs. Like one I hit when shuffling register indices around with
the register access changes.
2. Ask before using in RA.
3. Account for precoloured blend inputs.
Small shader-db hit, didn't investigate too much.
total instructions in shared programs: 1518017 -> 1518168 (<.01%)
instructions in affected programs: 2895 -> 3046 (5.22%)
helped: 0
HURT: 24
Instructions are HURT.
total bundles in shared programs: 646756 -> 646782 (<.01%)
bundles in affected programs: 1119 -> 1145 (2.32%)
helped: 1
HURT: 19
Bundles are HURT.
total quadwords in shared programs: 1133694 -> 1133728 (<.01%)
quadwords in affected programs: 1736 -> 1770 (1.96%)
helped: 0
HURT: 20
Quadwords are HURT.
total registers in shared programs: 90596 -> 90612 (0.02%)
registers in affected programs: 108 -> 124 (14.81%)
helped: 0
HURT: 16
Registers are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
mir_prev_op will point to the last instruction of the block in that case because
the block instruction list is circular. That would cause an invalid
write-after-read relationship between the move we insert with the constants and
the CSEL reading them, which DCE "helpfully" optimizes out, leaving a read from
an undefined def. That ends up getting RA'd to an invalid register.
All in all, pretty bad.
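The wrap-around can be reproduced with a toy ring, assuming the case in
question is an instruction with no predecessor in its block. The types and
helpers here (struct ins, link_ring, prev_unchecked, prev_checked) are
hypothetical and much simpler than the real instruction list.

```c
#include <assert.h>
#include <stddef.h>

struct ins {
   int id;
   struct ins *prev, *next;
};

/* Link n instructions into a circular doubly linked list. */
static void
link_ring(struct ins *v, size_t n)
{
   assert(n > 0);

   for (size_t i = 0; i < n; ++i) {
      v[i].next = &v[(i + 1) % n];
      v[i].prev = &v[(i + n - 1) % n];
   }
}

/* Unchecked lookup: for the first instruction this wraps to the last one. */
static struct ins *
prev_unchecked(struct ins *i)
{
   return i->prev;
}

/* Checked lookup: report "no predecessor" instead of wrapping. */
static struct ins *
prev_checked(struct ins *first, struct ins *i)
{
   return i == first ? NULL : i->prev;
}
```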
Identified due to a new assert fail after the proper temp_count fix.
Affects dEQP-GLES31.functional.separate_shader.random.12.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we start with unscheduled IR:
0 = comparison
csel 1, 2, 0
the old code will schedule this as
r31.w = comparison
csel 1, 2, 0
leaving 0 as a dangling source, which can confuse the rest of the compiler.
Instead rewrite this to
r31.w = comparison
csel 1, 2, r31.w
Note the swizzle is already taken care of (i.e. turned to .x for scalar
conditions) by the time we get to scheduling so we can force to .w.
This keeps register allocation from doing stupid things.
total instructions in shared programs: 1518138 -> 1518017 (<.01%)
instructions in affected programs: 37714 -> 37593 (-0.32%)
helped: 48
HURT: 42
Instructions are helped.
total bundles in shared programs: 646877 -> 646756 (-0.02%)
bundles in affected programs: 17024 -> 16903 (-0.71%)
helped: 48
HURT: 42
Bundles are helped.
total registers in shared programs: 90624 -> 90596 (-0.03%)
registers in affected programs: 361 -> 333 (-7.76%)
helped: 31
HURT: 5
Registers are helped.
total threads in shared programs: 55561 -> 55566 (<.01%)
threads in affected programs: 5 -> 10 (100.00%)
helped: 4
HURT: 0
Threads are helped.
total spills in shared programs: 1386 -> 1383 (-0.22%)
spills in affected programs: 19 -> 16 (-15.79%)
helped: 3
HURT: 0
total fills in shared programs: 5159 -> 5077 (-1.59%)
fills in affected programs: 1305 -> 1223 (-6.28%)
helped: 20
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>