Commit graph

4281 commits

Author SHA1 Message Date
Eric Engestrom
c66622de3a meson: replace manual compiler flags with meson arguments
These would only have worked in GCC and Clang, which so far wasn't an
issue, but let's clean it up anyway.

Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18190>
2022-08-24 22:13:19 +00:00
Alyssa Rosenzweig
5777f99fc5 pan/mdg: Use correct idiv lowering
Rip off the bandaid. We can't tolerate straight-up wrong results. We have an
efficient umul_high implementation so it's not so bad.

total instructions in shared programs: 1537404 -> 1537204 (-0.01%)
instructions in affected programs: 143299 -> 143099 (-0.14%)
helped: 89
HURT: 283
helped stats (abs) min: 1.0 max: 41.0 x̄: 5.87 x̃: 6
helped stats (rel) min: 0.39% max: 6.67% x̄: 1.41% x̃: 1.44%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 1.14 x̃: 1
HURT stats (rel)   min: 0.24% max: 5.71% x̄: 0.35% x̃: 0.27%
95% mean confidence interval for instructions value: -0.96 -0.12
95% mean confidence interval for instructions %-change: -0.17% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

total bundles in shared programs: 647521 -> 648154 (0.10%)
bundles in affected programs: 45833 -> 46466 (1.38%)
helped: 92
HURT: 228
helped stats (abs) min: 1.0 max: 13.0 x̄: 3.10 x̃: 3
helped stats (rel) min: 0.69% max: 7.14% x̄: 2.11% x̃: 1.99%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 4.03 x̃: 5
HURT stats (rel)   min: 0.59% max: 7.22% x̄: 2.93% x̃: 3.40%
95% mean confidence interval for bundles value: 1.58 2.38
95% mean confidence interval for bundles %-change: 1.21% 1.76%
Bundles are HURT.

total quadwords in shared programs: 1135141 -> 1138268 (0.28%)
quadwords in affected programs: 101064 -> 104191 (3.09%)
helped: 30
HURT: 342
helped stats (abs) min: 1.0 max: 30.0 x̄: 4.97 x̃: 3
helped stats (rel) min: 0.24% max: 5.99% x̄: 1.72% x̃: 1.06%
HURT stats (abs)   min: 1.0 max: 16.0 x̄: 9.58 x̃: 10
HURT stats (rel)   min: 0.73% max: 17.14% x̄: 3.64% x̃: 3.80%
95% mean confidence interval for quadwords value: 7.84 8.97
95% mean confidence interval for quadwords %-change: 2.99% 3.43%
Quadwords are HURT.

total registers in shared programs: 91938 -> 92265 (0.36%)
registers in affected programs: 2639 -> 2966 (12.39%)
helped: 0
HURT: 280
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.17 x̃: 1
HURT stats (rel)   min: 9.09% max: 50.00% x̄: 12.75% x̃: 11.11%
95% mean confidence interval for registers value: 1.12 1.22
95% mean confidence interval for registers %-change: 12.05% 13.45%
Registers are HURT.

total threads in shared programs: 55280 -> 55268 (-0.02%)
threads in affected programs: 24 -> 12 (-50.00%)
helped: 0
HURT: 11
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.09 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.29 -0.89
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig
5bc830cbf2 pan/mdg: Reexpress umul_high packing
There are a bunch of subtle details of how 32-bit sources are
zero-extended to 64-bit, how their swizzles work, how 64-bit
destinations are shrunk to 32-bit, and how those two interact. This
fixes the interactions... mostly.

Fixes umul_high, all such tests should be passing now. Unblocks idiv
lowering that depends on umul_high.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig
7b78e05ba8 pan/mdg: Replicate swizzles for scalar sources
This works around issue packing 32-bit scalar swizzles zero-extended to
64-bit, seen with the umul_high implementation. I tried for a while
figuring out the root cause (even rewrote a big chunk of disassembler)
but am still a bit lost. Nevertheless this is a safe workaround with no
performance impact (and avoids relying on NIR undefined behaviour to
implement GPU undefined behaviour), so let's do this for now to fix
umul_high.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig
fcae7cfd27 panfrost: Assert that blend shaders are nontrivial
Even if the driver doesn't *use* trivial blend shaders, building and compiling
blend shaders is expensive. We shouldn't be building blend shaders that should
never be used.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
1d5aad9db4 panfrost: Include mask in replace blend shader name
Helpful to disambiguate blend shaders with different colour masks used for the
same format/replace operation.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
378b7e37f4 panfrost: Simplify blitter blend shader creation
We don't need blending in the blitter. That means blend shaders are only needed
on Midgard. Simplify accordingly.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
d849d9779a panfrost: Avoid blend shader when not blending
On Midgard, we need a "blend" shader even if blending is disabled, if the format
isn't blendable. This is inefficient. Bifrost solves this by decoupling the
format conversion from the blending, allowing opaque (unblended) output to any
format without a blend shader or fragment key.

Unfortunately, our blend code is from the Midgard era -- I wrote an early
version of nir_lower_blend when I was still in high school! -- so we've been
using blend shaders for opaque output even on Bifrost. Whoops!

In SuperTuxKart, reduces blend shader calls by 30%, translating to a 15%
reduction in i-cache misses.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
e59c74ec56 panfrost: Promote blend shader outputs 8->16-bit
..on Bifrost and later, where the conversion hardware makes this reasonable.
This saves us from inserting a pile of conversions in the compiler to lower away
the 8-bit input/output. This also generates substantially better code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
08746d7b52 panfrost: Don't saturate in Bifrost blend shaders
It's unnecessary since the hardware already does the conversion for us.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
b1c9c924c7 panfrost: Set blit output variable types correctly
The type of the output variable will propagate through the store_output
intrinsic's src_type field to the BLEND instruction's register format field. On
Valhall, the register format for a BLEND comes from the instruction -- the
register format specified in the conversion descriptor (used on Bifrost) is
ignored. That means it has to match.

Previously, we always used a blend shader for integer rendering. Since blend
shaders ignore the register format of the BLEND instruction, that masked this
issue. That also means we don't need to backport this.

Will prevent a regression from the following commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
d680560970 panfrost: Handle untyped_color_outputs on Bifrost
For untyped_color_outputs, we need to ignore the type of the colour output in
the shader and instead use the type from the format. We have all the information
to do this at blend descriptor pack time, but not at shader compile time. This
means we need a (somewhat expensive) fixup in this edge case to ingest
NIR-to-TGSI. This will prevent a regression from the rest of the series.

Although the register_format field is also present on Valhall blend descriptors,
it is ignored so we don't need the fixup there.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
cf55e05f8f pan/bi: Handle info.fs.untyped_color_outputs on Valhall
Colour outputs in TGSI are untyped so we have to ignore the register format on
the store_output. Luckily, Valhall gives us the auto32 escape hatch to deal with
it, and Bifrost doesn't care anyway. This will avoid regressions from native int
output without corrupting the compiler for GLSL and SPIR-V shaders that are
well-typed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
394e1f5862 pan/bi: Don't allow ATEST to take a temporary
Clause scheduler edition of db2bdc1dc3 ("pan/bi: Require ATEST coverage mask
input in R60"). ATEST wants to read r60, which can't work if its input isn't
even in a register.

When per-sample shading isn't in use, prevents regressions in:

   KHR-GLES31.core.sample_variables.mask.*

These tests previously passed because per-sample shading was forced. It's
not clear whether the bug addressed in this patch is possible to hit "in the
wild", i.e. without the optimizations in this series that allow us to use
per-pixel shading in more cases.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
e12a9ce8d6 pan/bi: Don't reorder image loads across stores
Fixes flaking in
dEQP-GLES31.functional.image_load_store.cube.qualifiers.volatile_r32i due to
image reads being moved past a BARRIER.

To make this more robust/optimal, we probably need scheduling information
(coherent/volatile/etc) added to instructions like ACO does. That's left for a
future extension, for now I just want the test to stop flaking.

Fixes: 569e5dc745 ("pan/bi: Schedule for pressure pre-RA")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig
2f3cc22bc4 pan/bi: Use nir_opt_idiv_const
Mitigates some of the hurt from idiv lowering.

total instructions in shared programs: 2734512 -> 2734269 (<.01%)
instructions in affected programs: 10419 -> 10176 (-2.33%)
helped: 11
HURT: 4
helped stats (abs) min: 9.0 max: 49.0 x̄: 22.45 x̃: 19
helped stats (rel) min: 1.84% max: 7.50% x̄: 3.65% x̃: 3.30%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.14% max: 0.14% x̄: 0.14% x̃: 0.14%
95% mean confidence interval for instructions value: -25.34 -7.06
95% mean confidence interval for instructions %-change: -3.91% -1.37%
Instructions are helped.

total cycles in shared programs: 140629.05 -> 140628.61 (<.01%)
cycles in affected programs: 25.12 -> 24.69 (-1.74%)
helped: 3
HURT: 0
helped stats (abs) min: 0.0625 max: 0.3125 x̄: 0.15 x̃: 0
helped stats (rel) min: 0.82% max: 3.17% x̄: 1.60% x̃: 0.82%

total cvt in shared programs: 14826.25 -> 14819.52 (-0.05%)
cvt in affected programs: 189.64 -> 182.91 (-3.55%)
helped: 42
HURT: 0
helped stats (abs) min: 0.046875 max: 1.015625 x̄: 0.16 x̃: 0
helped stats (rel) min: 0.74% max: 11.76% x̄: 3.73% x̃: 2.82%
95% mean confidence interval for cvt value: -0.23 -0.09
95% mean confidence interval for cvt %-change: -4.65% -2.82%
Cvt are helped.

total sfu in shared programs: 8601.81 -> 8613.56 (0.14%)
sfu in affected programs: 85.62 -> 97.38 (13.72%)
helped: 0
HURT: 41
HURT stats (abs)   min: 0.0625 max: 1.25 x̄: 0.29 x̃: 0
HURT stats (rel)   min: 3.45% max: 33.33% x̄: 15.48% x̃: 16.67%
95% mean confidence interval for sfu value: 0.21 0.36
95% mean confidence interval for sfu %-change: 13.28% 17.69%
Sfu are HURT.

total quadwords in shared programs: 1479736 -> 1479616 (<.01%)
quadwords in affected programs: 3392 -> 3272 (-3.54%)
helped: 8
HURT: 0
helped stats (abs) min: 8.0 max: 24.0 x̄: 15.00 x̃: 16
helped stats (rel) min: 1.54% max: 4.62% x̄: 3.57% x̃: 3.71%
95% mean confidence interval for quadwords value: -20.58 -9.42
95% mean confidence interval for quadwords %-change: -4.39% -2.75%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>
2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig
3eb57544b6 pan/bi: Don't use the broken idiv lowering
Rip off the band-aid. We can't tolerate straight-up wrong results, after all.
Addresses the Bifrost/Valhall portion of #6555.

Fixes test_integer_ops uint_math / subcase.

total instructions in shared programs: 2674840 -> 2734512 (2.23%)
instructions in affected programs: 189964 -> 249636 (31.41%)
helped: 0
HURT: 383
HURT stats (abs)   min: 8.0 max: 184.0 x̄: 155.80 x̃: 173
HURT stats (rel)   min: 1.85% max: 126.09% x̄: 32.38% x̃: 34.46%
95% mean confidence interval for instructions value: 150.98 160.63
95% mean confidence interval for instructions %-change: 31.27% 33.48%
Instructions are HURT.

total cycles in shared programs: 140627.36 -> 140629.05 (<.01%)
cycles in affected programs: 24.81 -> 26.50 (6.80%)
helped: 0
HURT: 4
HURT stats (abs)   min: 0.1875 max: 0.9375 x̄: 0.42 x̃: 0
HURT stats (rel)   min: 2.52% max: 37.50% x̄: 13.26% x̃: 6.52%
95% mean confidence interval for cycles value: -0.14 0.99
95% mean confidence interval for cycles %-change: -13.14% 39.67%
Inconclusive result (value mean confidence interval includes 0).

total fma in shared programs: 22578.03 -> 22549.94 (-0.12%)
fma in affected programs: 1056.33 -> 1028.23 (-2.66%)
helped: 383
HURT: 0
helped stats (abs) min: 0.015625 max: 0.375 x̄: 0.07 x̃: 0
helped stats (rel) min: 0.55% max: 50.00% x̄: 3.07% x̃: 2.34%
95% mean confidence interval for fma value: -0.08 -0.07
95% mean confidence interval for fma %-change: -3.39% -2.75%
Fma are helped.

total cvt in shared programs: 14128.91 -> 14826.25 (4.94%)
cvt in affected programs: 1636.23 -> 2333.58 (42.62%)
helped: 0
HURT: 383
HURT stats (abs)   min: 0.0625 max: 2.109375 x̄: 1.82 x̃: 2
HURT stats (rel)   min: 2.52% max: 162.50% x̄: 43.50% x̃: 46.40%
95% mean confidence interval for cvt value: 1.76 1.88
95% mean confidence interval for cvt %-change: 42.07% 44.93%
Cvt are HURT.

total sfu in shared programs: 7549.31 -> 8601.81 (13.94%)
sfu in affected programs: 758.62 -> 1811.12 (138.74%)
helped: 0
HURT: 383
HURT stats (abs)   min: 0.375 max: 5.0 x̄: 2.75 x̃: 3
HURT stats (rel)   min: 23.08% max: 266.67% x̄: 136.66% x̃: 150.00%
95% mean confidence interval for sfu value: 2.67 2.83
95% mean confidence interval for sfu %-change: 133.02% 140.29%
Sfu are HURT.

total quadwords in shared programs: 1449928 -> 1479736 (2.06%)
quadwords in affected programs: 96544 -> 126352 (30.88%)
helped: 0
HURT: 382
HURT stats (abs)   min: 8.0 max: 96.0 x̄: 78.03 x̃: 88
HURT stats (rel)   min: 1.82% max: 100.00% x̄: 31.71% x̃: 34.38%
95% mean confidence interval for quadwords value: 75.63 80.43
95% mean confidence interval for quadwords %-change: 30.67% 32.75%
Quadwords are HURT.

total threads in shared programs: 53556 -> 53479 (-0.14%)
threads in affected programs: 154 -> 77 (-50.00%)
helped: 0
HURT: 77
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Bifrost is hit harder, unfortunately:

total instructions in shared programs: 2414877 -> 2468058 (2.20%)
instructions in affected programs: 184585 -> 237766 (28.81%)
helped: 0
HURT: 383
HURT stats (abs)   min: 12.0 max: 160.0 x̄: 138.85 x̃: 155
HURT stats (rel)   min: 1.52% max: 111.94% x̄: 29.43% x̃: 31.44%
95% mean confidence interval for instructions value: 134.47 143.24
95% mean confidence interval for instructions %-change: 28.42% 30.45%
Instructions are HURT.

total tuples in shared programs: 1927478 -> 1964218 (1.91%)
tuples in affected programs: 133176 -> 169916 (27.59%)
helped: 0
HURT: 383
HURT stats (abs)   min: 5.0 max: 113.0 x̄: 95.93 x̃: 107
HURT stats (rel)   min: 1.02% max: 87.04% x̄: 28.44% x̃: 30.57%
95% mean confidence interval for tuples value: 92.80 99.05
95% mean confidence interval for tuples %-change: 27.47% 29.41%
Tuples are HURT.

total clauses in shared programs: 354853 -> 359513 (1.31%)
clauses in affected programs: 22918 -> 27578 (20.33%)
helped: 0
HURT: 381
HURT stats (abs)   min: 1.0 max: 15.0 x̄: 12.23 x̃: 14
HURT stats (rel)   min: 1.14% max: 60.00% x̄: 20.81% x̃: 22.58%
95% mean confidence interval for clauses value: 11.84 12.62
95% mean confidence interval for clauses %-change: 20.13% 21.49%
Clauses are HURT.

total cycles in shared programs: 166542.56 -> 167639.31 (0.66%)
cycles in affected programs: 5012.37 -> 6109.13 (21.88%)
helped: 0
HURT: 329
HURT stats (abs)   min: 0.20833199999999863 max: 4.666665999999999 x̄: 3.33 x̃: 3
HURT stats (rel)   min: 1.05% max: 51.06% x̄: 22.28% x̃: 22.78%
95% mean confidence interval for cycles value: 3.22 3.45
95% mean confidence interval for cycles %-change: 21.45% 23.10%
Cycles are HURT.

total arith in shared programs: 73643 -> 75173.17 (2.08%)
arith in affected programs: 5344.04 -> 6874.21 (28.63%)
helped: 0
HURT: 383
HURT stats (abs)   min: 0.20833199999999863 max: 4.666667 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 1.05% max: 97.92% x̄: 29.47% x̃: 31.64%
95% mean confidence interval for arith value: 3.87 4.13
95% mean confidence interval for arith %-change: 28.45% 30.49%
Arith are HURT.

total quadwords in shared programs: 1673974 -> 1701720 (1.66%)
quadwords in affected programs: 111686 -> 139432 (24.84%)
helped: 0
HURT: 383
HURT stats (abs)   min: 5.0 max: 84.0 x̄: 72.44 x̃: 81
HURT stats (rel)   min: 1.11% max: 78.72% x̄: 25.59% x̃: 27.56%
95% mean confidence interval for quadwords value: 70.16 74.73
95% mean confidence interval for quadwords %-change: 24.74% 26.43%
Quadwords are HURT.

total threads in shared programs: 53655 -> 53590 (-0.12%)
threads in affected programs: 130 -> 65 (-50.00%)
helped: 0
HURT: 65
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>
2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig
35a7490ce2 pan/bi: Optimize pattern from nir_lower_idiv
This takes advantage of the .i1 modifier on the comparison to get b2i32 "for
free" in typical circumstances, saving an instruction. Will help with an instr
count regression from lower_idiv.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>
2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig
1ef20f1f35 pan/bi: Optimize bitwise arithmetic of booleans
This is easier to schedule on Bifrost. In theory it's also better on Valhall,
but in practice the CVT unit is too overloaded on Valhall for this to help
at the moment. We can revisit these rules for Valhall in the future where the
Valhall optimizer is more mature and/or Valhall grows a scheduler to balance the
execution units.

total instructions in shared programs: 2415350 -> 2414877 (-0.02%)
instructions in affected programs: 120948 -> 120475 (-0.39%)
helped: 192
HURT: 49
helped stats (abs) min: 1.0 max: 5.0 x̄: 2.89 x̃: 4
helped stats (rel) min: 0.25% max: 4.35% x̄: 0.66% x̃: 0.52%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.67 x̃: 1
HURT stats (rel)   min: 0.11% max: 7.14% x̄: 1.73% x̃: 0.77%
95% mean confidence interval for instructions value: -2.24 -1.68
95% mean confidence interval for instructions %-change: -0.37% 0.02%
Inconclusive result (%-change mean confidence interval includes 0).

total tuples in shared programs: 1928474 -> 1927478 (-0.05%)
tuples in affected programs: 146482 -> 145486 (-0.68%)
helped: 514
HURT: 73
helped stats (abs) min: 1.0 max: 8.0 x̄: 2.11 x̃: 1
helped stats (rel) min: 0.18% max: 9.52% x̄: 1.35% x̃: 0.76%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.23 x̃: 1
HURT stats (rel)   min: 0.15% max: 7.14% x̄: 1.07% x̃: 0.76%
95% mean confidence interval for tuples value: -1.85 -1.55
95% mean confidence interval for tuples %-change: -1.19% -0.91%
Tuples are helped.

total clauses in shared programs: 354985 -> 354853 (-0.04%)
clauses in affected programs: 8562 -> 8430 (-1.54%)
helped: 124
HURT: 22
helped stats (abs) min: 1.0 max: 8.0 x̄: 1.24 x̃: 1
helped stats (rel) min: 0.83% max: 7.14% x̄: 2.47% x̃: 1.72%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.25% max: 20.00% x̄: 5.08% x̃: 4.35%
95% mean confidence interval for clauses value: -1.11 -0.70
95% mean confidence interval for clauses %-change: -1.92% -0.75%
Clauses are helped.

total cycles in shared programs: 166575.48 -> 166542.56 (-0.02%)
cycles in affected programs: 4556.58 -> 4523.67 (-0.72%)
helped: 395
HURT: 65
helped stats (abs) min: 0.041665999999999315 max: 0.33333199999999863 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.19% max: 11.11% x̄: 1.42% x̃: 0.81%
HURT stats (abs)   min: 0.041665999999999315 max: 0.08333400000000069 x̄: 0.05 x̃: 0
HURT stats (rel)   min: 0.15% max: 8.33% x̄: 1.21% x̃: 0.83%
95% mean confidence interval for cycles value: -0.08 -0.06
95% mean confidence interval for cycles %-change: -1.22% -0.87%
Cycles are helped.

total arith in shared programs: 73687.88 -> 73643 (-0.06%)
arith in affected programs: 6339 -> 6294.13 (-0.71%)
helped: 570
HURT: 72
helped stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.08 x̃: 0
helped stats (rel) min: 0.19% max: 12.50% x̄: 1.41% x̃: 0.77%
HURT stats (abs)   min: 0.041665999999999315 max: 0.08333400000000069 x̄: 0.05 x̃: 0
HURT stats (rel)   min: 0.15% max: 8.33% x̄: 1.13% x̃: 0.75%
95% mean confidence interval for arith value: -0.08 -0.06
95% mean confidence interval for arith %-change: -1.27% -0.98%
Arith are helped.

total quadwords in shared programs: 1674486 -> 1673974 (-0.03%)
quadwords in affected programs: 117696 -> 117184 (-0.44%)
helped: 424
HURT: 127
helped stats (abs) min: 1.0 max: 6.0 x̄: 1.64 x̃: 1
helped stats (rel) min: 0.19% max: 4.88% x̄: 1.00% x̃: 0.82%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.46 x̃: 1
HURT stats (rel)   min: 0.15% max: 6.25% x̄: 1.31% x̃: 0.88%
95% mean confidence interval for quadwords value: -1.07 -0.79
95% mean confidence interval for quadwords %-change: -0.58% -0.36%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>
2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig
718748fe61 pan/bi: Test int8/16 -> float32 opts
These are easy, since round modes don't matter.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>
2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig
c88b8cbee3 pan/bi: Fuse [US][8|16]_TO_F32 ops
This combines nicely with the previous isel change. Now GLSL like

   float(int_x >> 24)

will generate a single machine instruction

   S8_TO_F32 int_x.b3

Noticed when debugging

   KHR-GLES31.core.shader_bitfield_operation.unpackSnorm4x8.0

...but naturally no real workloads care. Helped shaders are from Android games
that appear to have run through a translator, naturally.

total instructions in shared programs: 2674831 -> 2674783 (<.01%)
instructions in affected programs: 11493 -> 11445 (-0.42%)
helped: 31
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.55 x̃: 1
helped stats (rel) min: 0.16% max: 2.90% x̄: 0.51% x̃: 0.41%
95% mean confidence interval for instructions value: -1.87 -1.22
95% mean confidence interval for instructions %-change: -0.69% -0.33%
Instructions are helped.

total cvt in shared programs: 14128.84 -> 14128.09 (<.01%)
cvt in affected programs: 78.17 -> 77.42 (-0.96%)
helped: 31
HURT: 0
helped stats (abs) min: 0.015625 max: 0.046875 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.36% max: 4.26% x̄: 1.28% x̃: 1.20%
95% mean confidence interval for cvt value: -0.03 -0.02
95% mean confidence interval for cvt %-change: -1.62% -0.94%
Cvt are helped.

total quadwords in shared programs: 1449920 -> 1449840 (<.01%)
quadwords in affected programs: 2184 -> 2104 (-3.66%)
helped: 10
HURT: 0
helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
helped stats (rel) min: 2.44% max: 5.88% x̄: 4.11% x̃: 4.76%
95% mean confidence interval for quadwords value: -8.00 -8.00
95% mean confidence interval for quadwords %-change: -5.11% -3.12%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>
2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig
eab1d36643 pan/bi: Implement some extracts and inserts
Rather than lowering in NIR. Importantly for Valhall, this allows
nir_opt_algebraic to optimize various bitwise ops into extracts and inserts,
taking pressure off the low-throughout SFU pipe and moving it onto the
high-throughput CVT pipe. This will mitigate a cycle count regression from
switching to the precise idiv lowering.

This also generates more integer widening conversions which we can fold into
32-bit instructions later, to allow optimizing GLSL like "(a & 0xFFFF) + b"

Valhall:

total instructions in shared programs: 2674836 -> 2674840 (<.01%)
instructions in affected programs: 6473 -> 6477 (0.06%)
helped: 14
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.16% max: 1.37% x̄: 0.41% x̃: 0.49%
HURT stats (abs)   min: 3.0 max: 3.0 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 1.19% max: 1.62% x̄: 1.35% x̃: 1.24%
95% mean confidence interval for instructions value: -0.68 1.08
95% mean confidence interval for instructions %-change: -0.30% 0.53%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 140627.42 -> 140627.36 (<.01%)
cycles in affected programs: 2.31 -> 2.25 (-2.70%)
helped: 1
HURT: 0

total cvt in shared programs: 14127.25 -> 14128.91 (0.01%)
cvt in affected programs: 153.50 -> 155.16 (1.08%)
helped: 0
HURT: 41
HURT stats (abs)   min: 0.015625 max: 0.09375 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.27% max: 4.44% x̄: 1.61% x̃: 1.22%
95% mean confidence interval for cvt value: 0.03 0.05
95% mean confidence interval for cvt %-change: 1.29% 1.93%
Cvt are HURT.

total sfu in shared programs: 7555.69 -> 7549.31 (-0.08%)
sfu in affected programs: 107.31 -> 100.94 (-5.94%)
helped: 48
HURT: 0
helped stats (abs) min: 0.0625 max: 0.375 x̄: 0.13 x̃: 0
helped stats (rel) min: 1.34% max: 50.00% x̄: 13.57% x̃: 7.14%
95% mean confidence interval for sfu value: -0.15 -0.11
95% mean confidence interval for sfu %-change: -17.07% -10.06%
Sfu are helped.

total quadwords in shared programs: 1449912 -> 1449928 (<.01%)
quadwords in affected programs: 256 -> 272 (6.25%)
helped: 0
HURT: 2

Bifrost:

total instructions in shared programs: 2415370 -> 2415380 (<.01%)
instructions in affected programs: 1642 -> 1652 (0.61%)
helped: 2
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.40% max: 0.40% x̄: 0.40% x̃: 0.40%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.95% max: 1.27% x̄: 1.07% x̃: 1.00%
95% mean confidence interval for instructions value: 0.09 2.41
95% mean confidence interval for instructions %-change: 0.13% 1.29%
Instructions are HURT.

total tuples in shared programs: 1928495 -> 1928476 (<.01%)
tuples in affected programs: 3329 -> 3310 (-0.57%)
helped: 9
HURT: 2
helped stats (abs) min: 1.0 max: 6.0 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.25% max: 2.33% x̄: 1.00% x̃: 0.75%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.48% max: 0.48% x̄: 0.48% x̃: 0.48%
95% mean confidence interval for tuples value: -3.46 0.00
95% mean confidence interval for tuples %-change: -1.35% -0.10%
Inconclusive result (value mean confidence interval includes 0).

total clauses in shared programs: 354978 -> 354983 (<.01%)
clauses in affected programs: 398 -> 403 (1.26%)
helped: 3
HURT: 8
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.33% max: 3.85% x̄: 2.83% x̃: 2.33%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 2.27% max: 3.70% x̄: 2.88% x̃: 2.78%
95% mean confidence interval for clauses value: -0.17 1.08
95% mean confidence interval for clauses %-change: -0.51% 3.16%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 166575.69 -> 166575.65 (<.01%)
cycles in affected programs: 6.88 -> 6.83 (-0.61%)
helped: 1
HURT: 0

total arith in shared programs: 73688.79 -> 73688 (<.01%)
arith in affected programs: 127.29 -> 126.50 (-0.62%)
helped: 9
HURT: 2
helped stats (abs) min: 0.04166700000000034 max: 0.25 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.26% max: 2.45% x̄: 1.07% x̃: 0.80%
HURT stats (abs)   min: 0.08333299999999966 max: 0.08333299999999966 x̄: 0.08 x̃: 0
HURT stats (rel)   min: 0.55% max: 0.55% x̄: 0.55% x̃: 0.55%
95% mean confidence interval for arith value: -0.14 0.00
95% mean confidence interval for arith %-change: -1.44% -0.11%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 1674514 -> 1674480 (<.01%)
quadwords in affected programs: 9086 -> 9052 (-0.37%)
helped: 23
HURT: 2
helped stats (abs) min: 1.0 max: 6.0 x̄: 1.65 x̃: 1
helped stats (rel) min: 0.15% max: 2.79% x̄: 0.63% x̃: 0.33%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.53% max: 0.53% x̄: 0.53% x̃: 0.53%
95% mean confidence interval for quadwords value: -2.08 -0.64
95% mean confidence interval for quadwords %-change: -0.86% -0.21%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>
2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig
93f69e0452 panfrost: Don't segfault on unknown models
If we don't recognize the model, dev->model will be NULL. In that case, we can't
dereference dev->model to get the tilebuffer size. If we do, we'll segfault,
instead of gracefully refusing to probe and loading the swrast instead.

Fixes: 96d65b47c7 ("panfrost: Use implementation-specific tile size")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18115>
2022-08-19 14:43:43 +00:00
Alyssa Rosenzweig
d7e6174c2b pan/mdg: Remove disassembler stats
They're now unused and they were never especially useful.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:56 +00:00
Alyssa Rosenzweig
76e8f8b40e pan/decode: Clean up _bifrost_ decode routines
It's noisy since Bifrost was introduced, unnecessary since we converted to
per-arch GenXML, and wrong since Valhall was added.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:56 +00:00
Alyssa Rosenzweig
5c00efa695 pan/decode: Centrally declare pandecode entrypoints
Deduplicate in preparation for CSF.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
aba69fc9c8 pan/decode: Defeature disassembler stats
Architecturally, these only work for Midgard, and even on Midgard didn't turn
out to be too useful. While we're removing pandecode cruft, let's remove the
stats that just add noise to Bifrost and Valhall (and largely just noise to
Midgard too).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
6dfd0998f2 pan/decode: Unify SFBD/MFBD decoding
It's the same core logic. Unify and let GenXML do its thing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
e88b4949de pan/decode: Reorder MFBD decoding
Eliminate some #ifdef by grouping v5 and v6 state separately.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
504022454c pan/decode: Simplify pandecode_fbd
Remove unsued width/height properties, and use cleaner C syntax to build the
return value.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
9621df9637 pan/decode: Stop passing suffixes around
Unused.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
42319c6b6d pan/decode: Stop passing job index around
There are a lot of problems with passing job_index around:

* Almost entirely unused
* Not particularly helpful even when used
* Mostly ignored for Valhall already
* Doesn't extend to CSF

It only really exists due to the early days of pandecode generating valid C code
as the trace format. With GenXML instead, that's not applicable.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
3298ac4b12 pan/decode: Remove pandecode_msg
It hasn't had a consistent semantic meaning since we've switched decoding over
to GenXML.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig
c4c3f246fe pan/decode: Don't pass around memory handles
The hardware doesn't care what BO a given buffer resides in, only what GPU
address it's at. It's simpler to fetch from a GPU address, rather than the pair
of a GPU address and a backing allocation. This cleans up a lot of cruft in
pandecode.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>
2022-08-17 17:25:55 +00:00
Yonggang Luo
1b38ca7844 panfrost: Do no use designated initializer for union
../src/panfrost/lib/tests/test-earlyzs.cpp: In function 'void test(pan_earlyzs, pan_earlyzs, uint32_t)':
../src/panfrost/lib/tests/test-earlyzs.cpp:59:4: error: 'pan_shader_info::<unnamed union>' has no non-static data member named 'can_discard'
   59 |    };
      |    ^

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18024>
2022-08-12 18:06:36 +00:00
Konstantin Kharlamov
91362340f3 meson: remove source_root() call in nir compiler path
source_root function is deprecated in Meson version 0.56.0, so let's use
instead a current_source_dir() function, available in all Meson
versions. This also allows to deduplicate some code by declaring
commonly used string at the top meson.build file.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17974>
2022-08-12 13:11:03 +00:00
Emma Anholt
cbbd9f3402 ci: Upgrade deqp-runner to 0.15.0.
This includes the new timeout fixes so that tests that throw lots of debug
don't delay the timeout triggering, and the fraction vs shuffling behavior
change so that "--fraction 2" doesn't just skip every other test as it
appears in the caselist (every vertex shader variant, for example).

The fraction vs shuffling change does mean we see some different fails on
some drivers now.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17876>
2022-08-07 02:11:46 +00:00
Alyssa Rosenzweig
07e9543270 pan/decode: Fix overrun decoding planes
We need to calculate the # of descriptors like we do on Midgard.

Fixes: ae9316f812 ("pan/decode: Decode Valhall surface descriptor")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17842>
2022-08-02 21:11:06 +00:00
Alyssa Rosenzweig
ac5c1039a2 pan/bi: Rename CLPER_V6.i32 to CLPER_OLD.i32
To reflect that it is the CLPER of choice on Mali-G31 which is a v7 target.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17754>
2022-08-01 18:42:57 +00:00
Alyssa Rosenzweig
d8bd80afeb pan/bi: Assert that we use the correct CLPER
Add an assert at pack time that would have caught the bug fixed in 7535362204
("pan/bi: Fix clper_xor on Mali-G31").

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17754>
2022-08-01 18:42:57 +00:00
Eli Schwartz
5780ea90c4 meson: add various generated header dependencies as order-only deps
https://mesonbuild.com/FAQ.html#how-do-i-tell-meson-that-my-sources-use-generated-headers

A few locations had underspecified deps on the header files, and this
caused builds to fail given sufficient parallelism.

Fixes #6531

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16659>
2022-07-31 18:10:15 +00:00
Eric Engestrom
2c67457e5e util/list: rename LIST_ENTRY() to list_entry()
This follows the Linux kernel convention, and avoids collision with
macOS header macro.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6751
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6840
Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17772>
2022-07-28 10:10:44 +00:00
Icecream95
a8dbf61b46 panfrost: Add a debug option for checking overflows on pool uploads
PAN_MESA_DEBUG=overflow will place objects as close as possible to a
protected region at the end of the buffer, so that overflows segfault.

Caught the bugs in all four of the preceding commits.

v2: memset the BO to 0xbb to catch code expecting zeroed allocations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17447>
2022-07-23 00:56:10 +00:00
Icecream95
379ae6d823 panfrost: Emit the correct number of attributes
create_vertex_elements_state is sometimes called with a too large
num_elements argument, for example with util_blitter, which causes a
buffer overflow.

There is no documentation to forbid this practice, so don't rely on
so->num_elements being correct and instead use the vertex shader
attribute count, which matches the value used to allocate the
descriptors.

Use attributes_read_count rather than attribute_count because the
latter also includes images and PAN_VERTEX_ID/PAN_INSTANCE_ID.

Fixes: 76de3e691c ("panfrost: Merge attribute packing routines")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17447>
2022-07-23 00:56:10 +00:00
Jason Ekstrand
b510ee0d22 Use vk_foreach_struct_const where needed
We're about to make it so that the compiler warns/errors if you use the
wrong iterator macro.  Fix up a bunch of places where someone used the
wrong one before we break anything.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17630>
2022-07-19 19:55:17 +00:00
Alyssa Rosenzweig
3a0a8688d3 panfrost: Use early-ZS helpers
Remove the previous compile-time early-ZS implementation and replace it with the
decoupled early-ZS implementation. This uses more efficient settings in some
cases (depth/stencil tests always passes or do not write), and fixes the
settings used in another case (alpha-to-coverage enabled with an otherwise
early-ZS shader.)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Closes: #6206
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig
fe875c0144 panfrost: Unit test early-ZS helpers
The new early-ZS helpers are pure functions, leaf nodes of the call graph, and
implemented with a different algorithm from the "oracle" table of correct values
for various combinations of states. Further, incorrect settings often still pass
CTS while causing game bugs or inefficiencies. That combination makes the
helpers an excellent candidate for unit tests. Add some.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig
e96292bc07 panfrost: Add decoupled early-ZS helpers
Bifrost (and Valhall) separate early-ZS configuration into two fields: when does
the depth/stencil buffer update happen? and when are pixels killed by the
depth/stencil tests? The driver separately configures these to occur early
(before the shader executes) or late (after the ATEST instruction executes at
the end of the shader). Early tests are generally more efficient, but various
combinations of API state and fragment shader properties can require late
updates and/or late kills for correctness. Determining how to configure these
fields is nontrivial.

Our current implementation (on Bifrost) configures these fields at fragment
shader compile time and bakes the settings into the RSD. This is both wrong
(using early testing when late testing is required) and suboptimal (using late
testing when early testing would suffice). We need to defer this configuration
until draw time, when we know rasterizer and Z/S state.

Reclassifying at draw time (as we currently do on Valhall) would be expensive,
especially with the extra terms added in here. To cope, decouple the shader
classification from the draw-time configuration. Since there are only a few bits
of draw state involved, this implementation just calculates all possible states.
Then the draw time classification is just indexing into a lookup table.

The actual algorithm used to classify is written with correctness and clarity in
mind. Unlike the current classification algorithm (which tries to match what the
DDK does, poorly), this algorithm embeds its proofs of correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig
29c33f75d3 pan/va: Stall after ATEST
In theory this wait is required for correct behaviour of discarded threads with
ATEST. Mesa usually waits before the instruction after ATEST, so this wait will
get optimized out by va_merge_flow, but as our scheduler gets more sophisticated
this could become an issue.

Let's stay on the safe side and insert the recommended wait.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig
db2bdc1dc3 pan/bi: Require ATEST coverage mask input in R60
In theory, ATEST can take any combination of registers for inputs.
Experimentally, however, ATEST requires the coverage mask in R60. This avoids
regressing the following dEQP tests, which write their coverage mask with
pixel-frequency-shading but without writing to the depth/stencil buffer.

dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_pixel.*

This issue is known to affect both Mali-G52 (v7) and Mali-G57 (v9). I am unsure
if this is a silicon bug or just an obscure implementation detail.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
2022-07-13 21:05:35 +00:00