Colour outputs in TGSI are untyped so we have to ignore the register format on
the store_output. Luckily, Valhall gives us the auto32 escape hatch to deal with
it, and Bifrost doesn't care anyway. This will avoid regressions from native int
output without corrupting the compiler for GLSL and SPIR-V shaders that are
well-typed.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
Clause scheduler edition of db2bdc1dc3 ("pan/bi: Require ATEST coverage mask
input in R60"). ATEST wants to read r60, which can't work if its input isn't
even in a register.
When per-sample shading isn't in use, prevents regressions in:
KHR-GLES31.core.sample_variables.mask.*
These tests previously passed because per-sample shading was forced. It's
not clear whether the bug addressed in this patch is possible to hit "in the
wild", i.e. without the optimizations in this series that allow us to use
per-pixel shading in more cases.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
Fixes flaking in
dEQP-GLES31.functional.image_load_store.cube.qualifiers.volatile_r32i due to
image reads being moved past a BARRIER.
To make this more robust/optimal, we probably need scheduling information
(coherent/volatile/etc) added to instructions like ACO does. That's left for a
future extension, for now I just want the test to stop flaking.
Fixes: 569e5dc745 ("pan/bi: Schedule for pressure pre-RA")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
Unnecessary rename that breaks forward compatibility... but Apple says
this is just NULL. Do the simpler thing. Note that the argument is a
mach_port_t, which is a natural_t == uint32_t in userspace... even
though it's a pointer in the kernel. Although Apple's docs claim that
kIOMasterPortDefault is NULL, it's really just 0.
../src/asahi/lib/agx_device.c:290:35: warning: 'kIOMasterPortDefault' is deprecated: first deprecated in macOS 12.0 [-Wdeprecated-declarations]
IOServiceGetMatchingService(kIOMasterPortDefault, matching);
^~~~~~~~~~~~~~~~~~~~
kIOMainPortDefault
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18121>
Shader-db stats with RV530:
total instructions in shared programs: 166499 -> 164362 (-1.28%)
instructions in affected programs: 80056 -> 77919 (-2.67%)
total temps in shared programs: 21658 -> 21565 (-0.43%)
temps in affected programs: 1780 -> 1687 (-5.22%)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
Shader-db stats with RV530:
total instructions in shared programs: 169509 -> 166013 (-2.06%)
instructions in affected programs: 99126 -> 95630 (-3.53%)
total presub in shared programs: 10975 -> 10758 (-1.98%)
presub in affected programs: 744 -> 527 (-29.17%)
total temps in shared programs: 21722 -> 21649 (-0.34%)
temps in affected programs: 1350 -> 1277 (-5.41%)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
Skip the merge if one of the instructions writes just w channel
and we are compiling a fragment shader. We can pair-schedule it together
later anyway and it will also give the scheduler a bit more flexibility.
Shader-db stats with RV530:
total instructions in shared programs: 169522 -> 169509 (<.01%)
instructions in affected programs: 14170 -> 14157 (-0.09%)
total temps in shared programs: 21712 -> 21722 (0.05%)
temps in affected programs: 324 -> 334 (3.09%)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
To allow a simple extension with more merging combinations in the
later commits.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
Constant folding first, than copy propagate simple movs, after that
we run the merge movs pass and finally peephole. The important part
is to do the copy propagate for the whole program before running merge
movs.
We no longer check the return from merge_movs so convert it to void.
Shader-db changes with RV530:
total instructions in shared programs: 155361 -> 154787 (-0.37%)
instructions in affected programs: 67920 -> 67346 (-0.85%)
total temps in shared programs: 20836 -> 20773 (-0.30%)
temps in affected programs: 711 -> 648 (-8.86%)
total presub in shared programs: 8226 -> 8202 (-0.29%)
presub in affected programs: 223 -> 199 (-10.76%)
total temps in shared programs: 20836 -> 20773 (-0.30%)
temps in affected programs: 711 -> 648 (-8.86%)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
The main problem here is we can have a negate bit set for an unused
channel, so we can't just OR together the negates when channel merging.
Right now the bug is hidden because how we run the pass order, but
that will change in a later commit. Add some helpers for merging of the
negates, they will be also used more in a later commits. As a bonus
construct the new source separatelly and only rewrite the original
instructions after checking that the final swizzle is valid.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
This will prevent a regression in the number of inlined constants in
a later commit. Constructs like 4.000000 (0x48).w110 works just fine.
There is a small behavioral change. We would previously allow positive
and negative same-value contants to be produced, e.g.,
4.000000 (0x48).w-w__ and this would be later split into some extra
movs in the dataflow swizzle pass. We now explicitly check that the
final swizzle is valid while inlining. So there is a minor decrease
in inlined constants and in the total instructions.
total lits in shared programs: 4328 -> 4194 (-3.10%)
lits in affected programs: 554 -> 420 (-24.19%)
total instructions in shared programs: 155488 -> 155361 (-0.08%)
instructions in affected programs: 5707 -> 5580 (-2.23%)
Additonally, a fix for pair translation is needed since the constant
inlining can now produce swizzles like this: 4.000000 (0x48).w-0-0-_
so we have to teach pair translation to also ignore the sign for zero
swizzle.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17560>
When SCC is clobbered between s_cmp and its operand's writer,
the current optimization that eliminates s_cmp won't kick in.
However, when s_cmp is the only user of its operand temporary,
it is possible to "pull down" the instruction that wrote the operand.
Fossil DB stats on Navi 21:
Totals from 63302 (46.92% of 134906) affected shaders:
CodeSize: 176689272 -> 176418332 (-0.15%)
Instrs: 33552237 -> 33484502 (-0.20%)
Latency: 205847485 -> 205816205 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 34321285 -> 34319908 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
This was simply left out by accident when I wrote this.
Fossil DB stats on Navi 21:
Totals from 70165 (52.01% of 134906) affected shaders:
CodeSize: 246375656 -> 245814396 (-0.23%)
Instrs: 46519773 -> 46379458 (-0.30%)
Latency: 385159303 -> 385089261 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 66490172 -> 66487867 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
Also delete them when they are already dead in process_instruction().
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
if the final hash was the same as the last-used hash for this program,
it should be safe to reuse the last pipeline object in the presence of
dynamic vertex input since the number of permutations is low
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18135>
non-generated tcs has unique mechanics in that it doesn't have a base shader key,
so split that out to avoid unnecessary complexity
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18135>
these values don't need to be checked at all if dynamic vertex is enabled,
which wouldn't previously have been possible without adding even more
data to check to the pipeline state
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18135>