fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 09:18:10 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	feb9020039	panfrost: Enable Mali-G57 Everything required for conformant OpenGL ES 3.1 support on Valhall (v9) is now upstream -- all that's left is to enable implementations! Add the GPU ID for the Mali-G57 implemented in the MediaTek MT8192 system-on-chip. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16890>	2022-06-06 19:30:15 +00:00
Alyssa Rosenzweig	28801cfba0	pan/va: Unit test constant lowering pass Like other optimizations, breaking this pass may not affect functional correctness. It's also dead simple to unit test the pass, so we have no excuse not to. Add unit tests for the functionality we currently support, since we just extended it and want to make sure everything still works. This includes tests for use of modifiers to get more small constants. There are lots of subtle gotchas there, so let's add lots of unit tests to make sure we got it right. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>	2022-06-06 18:10:24 +00:00
Alyssa Rosenzweig	9cfafbb09b	pan/va: Try widening small constants Many small integers are available as small constants, but the table of small constants is tightly packed. Zero and sign extensions are usually required to access small integers. When packing constants, try zero/sign extension for unsigned/signed integer instructions respectively. total instructions in shared programs: 2716912 -> 2707795 (-0.34%) instructions in affected programs: 1045609 -> 1036492 (-0.87%) helped: 4460 HURT: 125 helped stats (abs) min: 1.0 max: 58.0 x̄: 2.14 x̃: 1 helped stats (rel) min: 0.14% max: 23.85% x̄: 1.35% x̃: 0.88% HURT stats (abs) min: 1.0 max: 68.0 x̄: 3.41 x̃: 1 HURT stats (rel) min: 0.34% max: 3.88% x̄: 0.93% x̃: 0.70% 95% mean confidence interval for instructions value: -2.09 -1.89 95% mean confidence interval for instructions %-change: -1.33% -1.25% Instructions are helped. total cycles in shared programs: 141984.06 -> 141932.42 (-0.04%) cycles in affected programs: 552.08 -> 500.44 (-9.35%) helped: 18 HURT: 0 helped stats (abs) min: 0.015625 max: 11.0 x̄: 2.87 x̃: 0 helped stats (rel) min: 0.50% max: 19.64% x̄: 5.36% x̃: 1.53% 95% mean confidence interval for cycles value: -5.17 -0.56 95% mean confidence interval for cycles %-change: -9.28% -1.44% Cycles are helped. total cvt in shared programs: 13805.05 -> 13663.39 (-1.03%) cvt in affected programs: 6127.45 -> 5985.80 (-2.31%) helped: 4460 HURT: 125 helped stats (abs) min: 0.015625 max: 0.90625 x̄: 0.03 x̃: 0 helped stats (rel) min: 0.35% max: 50.00% x̄: 5.19% x̃: 4.00% HURT stats (abs) min: 0.015625 max: 1.0625 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.77% max: 9.30% x̄: 3.40% x̃: 2.78% 95% mean confidence interval for cvt value: -0.03 -0.03 95% mean confidence interval for cvt %-change: -5.10% -4.81% Cvt are helped. 
total ls in shared programs: 129545 -> 129494 (-0.04%) ls in affected programs: 495 -> 444 (-10.30%) helped: 6 HURT: 0 helped stats (abs) min: 2.0 max: 11.0 x̄: 8.50 x̃: 11 helped stats (rel) min: 1.49% max: 19.64% x̄: 13.95% x̃: 19.64% 95% mean confidence interval for ls value: -12.68 -4.32 95% mean confidence interval for ls %-change: -23.23% -4.67% Ls are helped. total quadwords in shared programs: 1476416 -> 1469824 (-0.45%) quadwords in affected programs: 121208 -> 114616 (-5.44%) helped: 820 HURT: 16 helped stats (abs) min: 8.0 max: 32.0 x̄: 8.28 x̃: 8 helped stats (rel) min: 1.39% max: 50.00% x̄: 11.00% x̃: 10.00% HURT stats (abs) min: 8.0 max: 32.0 x̄: 12.50 x̃: 8 HURT stats (rel) min: 1.38% max: 10.00% x̄: 6.19% x̃: 7.14% 95% mean confidence interval for quadwords value: -8.14 -7.63 95% mean confidence interval for quadwords %-change: -11.20% -10.15% Quadwords are helped. total threads in shared programs: 53633 -> 53663 (0.06%) threads in affected programs: 39 -> 69 (76.92%) helped: 33 HURT: 3 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.64 1.02 95% mean confidence interval for threads %-change: 73.27% 101.73% Threads are helped. total spills in shared programs: 154 -> 103 (-33.12%) spills in affected programs: 75 -> 24 (-68.00%) helped: 6 HURT: 0 total fills in shared programs: 656 -> 656 (0.00%) fills in affected programs: 148 -> 148 (0.00%) helped: 2 HURT: 4 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>	2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig	72146051d5	pan/va: Try negating small constants when lowering If a constant is used with a floating point instruction with a floating-point negate modifier, we can use the modifier to negate constants in the table for free. Each floating point in the table is positive, so this is required for negative small constants. total instructions in shared programs: 2728438 -> 2716912 (-0.42%) instructions in affected programs: 1418220 -> 1406694 (-0.81%) helped: 6053 HURT: 94 helped stats (abs) min: 1.0 max: 43.0 x̄: 1.94 x̃: 1 helped stats (rel) min: 0.06% max: 18.18% x̄: 1.34% x̃: 0.84% HURT stats (abs) min: 1.0 max: 5.0 x̄: 2.34 x̃: 2 HURT stats (rel) min: 0.09% max: 21.43% x̄: 1.87% x̃: 0.91% 95% mean confidence interval for instructions value: -1.93 -1.82 95% mean confidence interval for instructions %-change: -1.34% -1.25% Instructions are helped. total cycles in shared programs: 142103 -> 141984.06 (-0.08%) cycles in affected programs: 766.70 -> 647.77 (-15.51%) helped: 97 HURT: 0 helped stats (abs) min: 0.015625 max: 40.0 x̄: 1.23 x̃: 0 helped stats (rel) min: 0.27% max: 41.24% x̄: 3.63% x̃: 2.08% 95% mean confidence interval for cycles value: -2.41 -0.04 95% mean confidence interval for cycles %-change: -4.68% -2.57% Cycles are helped. total cvt in shared programs: 13983.34 -> 13805.05 (-1.28%) cvt in affected programs: 7952.45 -> 7774.16 (-2.24%) helped: 6049 HURT: 98 helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.03 x̃: 0 helped stats (rel) min: 0.25% max: 100.00% x̄: 4.74% x̃: 2.52% HURT stats (abs) min: 0.015625 max: 0.078125 x̄: 0.04 x̃: 0 HURT stats (rel) min: 0.17% max: 100.00% x̄: 5.48% x̃: 2.54% 95% mean confidence interval for cvt value: -0.03 -0.03 95% mean confidence interval for cvt %-change: -4.83% -4.32% Cvt are helped. 
total ls in shared programs: 129660 -> 129545 (-0.09%) ls in affected programs: 601 -> 486 (-19.13%) helped: 7 HURT: 0 helped stats (abs) min: 3.0 max: 40.0 x̄: 16.43 x̃: 8 helped stats (rel) min: 2.88% max: 41.24% x̄: 17.41% x̃: 12.50% 95% mean confidence interval for ls value: -31.42 -1.44 95% mean confidence interval for ls %-change: -29.25% -5.58% Ls are helped. total quadwords in shared programs: 1482728 -> 1476416 (-0.43%) quadwords in affected programs: 131200 -> 124888 (-4.81%) helped: 798 HURT: 15 helped stats (abs) min: 8.0 max: 24.0 x̄: 8.06 x̃: 8 helped stats (rel) min: 0.34% max: 50.00% x̄: 10.15% x̃: 6.67% HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 HURT stats (rel) min: 1.49% max: 100.00% x̄: 11.25% x̃: 2.78% 95% mean confidence interval for quadwords value: -7.92 -7.60 95% mean confidence interval for quadwords %-change: -10.52% -8.99% Quadwords are helped. total threads in shared programs: 53585 -> 53633 (0.09%) threads in affected programs: 51 -> 99 (94.12%) helped: 49 HURT: 1 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.88 1.04 95% mean confidence interval for threads %-change: 90.97% 103.03% Threads are helped. total spills in shared programs: 125 -> 154 (23.20%) spills in affected programs: 75 -> 104 (38.67%) helped: 3 HURT: 4 total fills in shared programs: 800 -> 656 (-18.00%) fills in affected programs: 476 -> 332 (-30.25%) helped: 7 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>	2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig	cecfa0c44a	pan/va: Record which instructions are signed We need to distinguish signed integer instructions from unsigned integer instructions, to distinguish sign-extension and zero-extension of sources. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>	2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig	e57dfed419	pan/bi: Implement b2i with MUX The result_type modifier propagation looks for MUX instructions, so using this canonical b2i implementation allows the sequence b2i(cmp) to be fused. It's also faster on its own: on Valhall, MUX may be implemented as CSEL on the CVT unit, while AND may only be implemented on the SFU unit. So in case this doesn't get fused, we expect 4x better throughput for b2i with this implementation. Similarly, on Bifrost, MUX may be scheduled to either unit (as CSEL on FMA or MUX on ADD), whereas AND may only be scheduled to FMA. Results on Mali-G52: total instructions in shared programs: 2419171 -> 2414814 (-0.18%) instructions in affected programs: 272203 -> 267846 (-1.60%) helped: 767 HURT: 0 helped stats (abs) min: 1.0 max: 138.0 x̄: 5.68 x̃: 2 helped stats (rel) min: 0.12% max: 15.57% x̄: 2.09% x̃: 0.68% 95% mean confidence interval for instructions value: -6.68 -4.68 95% mean confidence interval for instructions %-change: -2.37% -1.82% Instructions are helped. total tuples in shared programs: 1932822 -> 1929234 (-0.19%) tuples in affected programs: 76485 -> 72897 (-4.69%) helped: 380 HURT: 3 helped stats (abs) min: 1.0 max: 138.0 x̄: 9.46 x̃: 1 helped stats (rel) min: 0.14% max: 15.96% x̄: 3.81% x̃: 0.92% HURT stats (abs) min: 1.0 max: 6.0 x̄: 2.67 x̃: 1 HURT stats (rel) min: 0.38% max: 8.57% x̄: 3.80% x̃: 2.44% 95% mean confidence interval for tuples value: -11.30 -7.44 95% mean confidence interval for tuples %-change: -4.27% -3.22% Tuples are helped. total clauses in shared programs: 356094 -> 355992 (-0.03%) clauses in affected programs: 3264 -> 3162 (-3.12%) helped: 80 HURT: 0 helped stats (abs) min: 1.0 max: 9.0 x̄: 1.27 x̃: 1 helped stats (rel) min: 0.81% max: 50.00% x̄: 4.83% x̃: 3.39% 95% mean confidence interval for clauses value: -1.49 -1.06 95% mean confidence interval for clauses %-change: -6.23% -3.43% Clauses are helped. 
total cycles in shared programs: 167337.10 -> 167329.19 (<.01%) cycles in affected programs: 510.08 -> 502.17 (-1.55%) helped: 80 HURT: 2 helped stats (abs) min: 0.041665999999999315 max: 0.7916659999999993 x̄: 0.10 x̃: 0 helped stats (rel) min: 0.51% max: 13.64% x̄: 2.12% x̃: 1.34% HURT stats (abs) min: 0.041665999999999315 max: 0.0416669999999999 x̄: 0.04 x̃: 0 HURT stats (rel) min: 0.39% max: 2.78% x̄: 1.58% x̃: 1.58% 95% mean confidence interval for cycles value: -0.12 -0.07 95% mean confidence interval for cycles %-change: -2.59% -1.48% Cycles are helped. total arith in shared programs: 73819.54 -> 73669.25 (-0.20%) arith in affected programs: 2840.54 -> 2690.25 (-5.29%) helped: 383 HURT: 3 helped stats (abs) min: 0.041665999999999315 max: 5.75 x̄: 0.39 x̃: 0 helped stats (rel) min: 0.33% max: 18.81% x̄: 4.39% x̃: 0.98% HURT stats (abs) min: 0.041665999999999315 max: 0.25 x̄: 0.11 x̃: 0 HURT stats (rel) min: 0.39% max: 8.96% x̄: 4.04% x̃: 2.78% 95% mean confidence interval for arith value: -0.47 -0.31 95% mean confidence interval for arith %-change: -4.93% -3.71% Arith are helped. total quadwords in shared programs: 1679798 -> 1676259 (-0.21%) quadwords in affected programs: 72826 -> 69287 (-4.86%) helped: 381 HURT: 15 helped stats (abs) min: 1.0 max: 142.0 x̄: 9.35 x̃: 1 helped stats (rel) min: 0.25% max: 18.87% x̄: 4.33% x̃: 1.13% HURT stats (abs) min: 1.0 max: 6.0 x̄: 1.47 x̃: 1 HURT stats (rel) min: 0.30% max: 6.25% x̄: 0.77% x̃: 0.35% 95% mean confidence interval for quadwords value: -10.76 -7.11 95% mean confidence interval for quadwords %-change: -4.71% -3.56% Quadwords are helped. 
Results on Mali-G57: total instructions in shared programs: 2704193 -> 2699317 (-0.18%) instructions in affected programs: 293366 -> 288490 (-1.66%) helped: 758 HURT: 5 helped stats (abs) min: 1.0 max: 151.0 x̄: 6.45 x̃: 2 helped stats (rel) min: 0.11% max: 22.22% x̄: 2.05% x̃: 0.64% HURT stats (abs) min: 1.0 max: 7.0 x̄: 2.20 x̃: 1 HURT stats (rel) min: 0.22% max: 1.69% x̄: 0.87% x̃: 1.08% 95% mean confidence interval for instructions value: -7.42 -5.36 95% mean confidence interval for instructions %-change: -2.27% -1.79% Instructions are helped. total cycles in shared programs: 141711.73 -> 141711.84 (<.01%) cycles in affected programs: 214.36 -> 214.47 (0.05%) helped: 4 HURT: 42 helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.20 x̃: 0 helped stats (rel) min: 1.85% max: 12.78% x̄: 9.12% x̃: 10.93% HURT stats (abs) min: 0.015625 max: 0.09375 x̄: 0.02 x̃: 0 HURT stats (rel) min: 0.17% max: 17.65% x̄: 0.84% x̃: 0.34% 95% mean confidence interval for cycles value: -0.02 0.03 95% mean confidence interval for cycles %-change: -1.23% 1.17% Inconclusive result (value mean confidence interval includes 0). total cvt in shared programs: 14479.14 -> 14474.19 (-0.03%) cvt in affected programs: 2877.05 -> 2872.09 (-0.17%) helped: 508 HURT: 209 helped stats (abs) min: 0.015625 max: 0.453125 x̄: 0.02 x̃: 0 helped stats (rel) min: 0.25% max: 16.67% x̄: 1.23% x̃: 0.37% HURT stats (abs) min: 0.015625 max: 0.296875 x̄: 0.03 x̃: 0 HURT stats (rel) min: 0.15% max: 18.18% x̄: 1.70% x̃: 0.34% 95% mean confidence interval for cvt value: -0.01 -0.00 95% mean confidence interval for cvt %-change: -0.57% -0.18% Cvt are helped. 
total sfu in shared programs: 7875.69 -> 7590.75 (-3.62%) sfu in affected programs: 1567.38 -> 1282.44 (-18.18%) helped: 906 HURT: 0 helped stats (abs) min: 0.0625 max: 8.625 x̄: 0.31 x̃: 0 helped stats (rel) min: 2.38% max: 100.00% x̄: 16.80% x̃: 5.63% 95% mean confidence interval for sfu value: -0.37 -0.26 95% mean confidence interval for sfu %-change: -18.43% -15.17% Sfu are helped. total quadwords in shared programs: 1468152 -> 1465800 (-0.16%) quadwords in affected programs: 37104 -> 34752 (-6.34%) helped: 161 HURT: 2 helped stats (abs) min: 8.0 max: 80.0 x̄: 14.71 x̃: 8 helped stats (rel) min: 1.67% max: 20.00% x̄: 8.05% x̃: 7.69% HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 HURT stats (rel) min: 3.57% max: 3.85% x̄: 3.71% x̃: 3.71% 95% mean confidence interval for quadwords value: -16.29 -12.57 95% mean confidence interval for quadwords %-change: -8.58% -7.22% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>	2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig	8f3b62f87e	pan/va: Add MUX lowering tests Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>	2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig	677a66b3eb	pan/va: Lower MUX to CSEL where possible CSEL executes on the conversion unit (CVT), while MUX executes on the special function unit (SFU). Throughput on CVT is 4x higher than SFU, so this is (almost) always an optimization. The "real" MUX is still used for unusual cases, like 8-bit and bitselect. Note that it's easier for us to use MUX everywhere for the IR. This is an easy fixup to get better codegen on Valhall without touching the core Bifrost code. shader-db is a bit of a toss up: register pressure and instruction count are hurt in some cases due to restrictions on FAU access. In particular, a shader that muxes between two uniforms needs an extra move due to extra constant (zero). However, in terms of throughput this is still a win: 2 CVT instructions (MOV + CSEL) have 2x throughput to 1 SFU instruction (MUX). The MOV has opportunities for CSE, but that can hurt pressure in turn. Overall, cycles are helped substantially. total instructions in shared programs: 2728438 -> 2731597 (0.12%) instructions in affected programs: 414391 -> 417550 (0.76%) helped: 87 HURT: 1063 helped stats (abs) min: 1.0 max: 6.0 x̄: 5.17 x̃: 6 helped stats (rel) min: 0.19% max: 15.79% x̄: 4.12% x̃: 4.11% HURT stats (abs) min: 1.0 max: 56.0 x̄: 3.40 x̃: 2 HURT stats (rel) min: 0.11% max: 23.43% x̄: 1.15% x̃: 0.63% 95% mean confidence interval for instructions value: 2.47 3.03 95% mean confidence interval for instructions %-change: 0.61% 0.90% Instructions are HURT. 
total cycles in shared programs: 142103 -> 142015.75 (-0.06%) cycles in affected programs: 1263.45 -> 1176.20 (-6.91%) helped: 281 HURT: 176 helped stats (abs) min: 0.015625 max: 2.234375 x̄: 0.50 x̃: 0 helped stats (rel) min: 0.71% max: 54.17% x̄: 16.93% x̃: 15.31% HURT stats (abs) min: 0.015625 max: 30.0 x̄: 0.30 x̃: 0 HURT stats (rel) min: 0.84% max: 120.00% x̄: 7.16% x̃: 5.00% 95% mean confidence interval for cycles value: -0.33 -0.05 95% mean confidence interval for cycles %-change: -9.08% -6.22% Cycles are helped. total cvt in shared programs: 13983.34 -> 14891.70 (6.50%) cvt in affected programs: 7498.36 -> 8406.72 (12.11%) helped: 71 HURT: 4711 helped stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0 helped stats (rel) min: 5.41% max: 40.00% x̄: 10.23% x̃: 9.30% HURT stats (abs) min: 0.015625 max: 2.640625 x̄: 0.19 x̃: 0 HURT stats (rel) min: 0.18% max: 141.18% x̄: 16.21% x̃: 9.52% 95% mean confidence interval for cvt value: 0.18 0.20 95% mean confidence interval for cvt %-change: 15.21% 16.42% Cvt are HURT. total sfu in shared programs: 11320.44 -> 7882.56 (-30.37%) sfu in affected programs: 7618.50 -> 4180.62 (-45.13%) helped: 4782 HURT: 0 helped stats (abs) min: 0.0625 max: 10.5625 x̄: 0.72 x̃: 0 helped stats (rel) min: 1.34% max: 100.00% x̄: 41.91% x̃: 37.50% 95% mean confidence interval for sfu value: -0.75 -0.68 95% mean confidence interval for sfu %-change: -42.68% -41.14% Sfu are helped. 
total ls in shared programs: 129660 -> 129690 (0.02%) ls in affected programs: 25 -> 55 (120.00%) helped: 0 HURT: 1 total quadwords in shared programs: 1482728 -> 1484128 (0.09%) quadwords in affected programs: 58624 -> 60024 (2.39%) helped: 24 HURT: 195 helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 helped stats (rel) min: 3.70% max: 20.00% x̄: 10.34% x̃: 10.00% HURT stats (abs) min: 8.0 max: 24.0 x̄: 8.16 x̃: 8 HURT stats (rel) min: 1.41% max: 50.00% x̄: 4.84% x̃: 2.56% 95% mean confidence interval for quadwords value: 5.70 7.09 95% mean confidence interval for quadwords %-change: 2.22% 4.14% Quadwords are HURT. total spills in shared programs: 125 -> 127 (1.60%) spills in affected programs: 0 -> 2 helped: 0 HURT: 1 total fills in shared programs: 800 -> 828 (3.50%) fills in affected programs: 0 -> 28 helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>	2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig	3741606b25	pan/va: Implement more lanes Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>	2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig	1768afa5b9	pan/bi: Extract MUX to CSEL optimization It's portable, and useful to both Bifrost and Valhall, in the clause scheduler and in instruction selection respectively. Move it from the Bifrost clause scheduler to common code so we can share the benefits. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>	2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig	e1fb182d90	pan/va: Do not insert NOPs into empty shaders It's unnecessary and breaks the empty shader optimizations. Noticed while inspecting a trace from dEQP-GLES3.functional.color_clear.masked_scissored_rgb, which does not produce any varyings other than gl_Position in its vertex shader and hence should omit the varying shader. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16868>	2022-06-06 14:28:59 +00:00
Alyssa Rosenzweig	3b3cd59fb8	panfrost: Launch transform feedback shaders We now have infrastructure in place to generate variants of vertex shaders specialized for transform feedback. All that's left is launching these compute-like kernels before the IDVS job, implementing both the transform feedback and the regular rasterization pipeline. This implements transform feedback on Valhall, passing the relevant GLES3.1 tests. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	ed5a5a9d6d	panfrost: Wire up transform feedback sysvals Wire the Gallium interface for transform feedback up to the system values that will be fed into our lowering code. This is based on our existing transform feedback implementation for Midgard. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	4e341e70d8	pan/bi: Handle transform feedback intrinsics Translate the intrinsics we introduced to lower away transform feedback into Panfrost system values which the GL driver can handle. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	ae3fa6cc1d	pan/bi: Add transform feedback lowering pass Add a simple NIR-based implementation of transform feedback, appropriate for OpenGL ES 3.1 class hardware (compute but no geometry or tessellation shaders). Stores to varyings that will be captured are replaced by stores to transform feedback buffers and some addressing math. This allows implementing the semantic of transform feedback in a compute-like stage. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	ed4bd8738d	panfrost/ci: Mark draw_buffers_indexed.* as flakes These keep flaking. Icecream95 observes the issue relates to AFBC in the discussion of the flake in issue 6604. Until the root cause can be identified and fixed, mark the tests as known flakes for CI. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16855>	2022-06-03 21:05:22 +00:00
Alyssa Rosenzweig	7535362204	pan/bi: Fix clper_xor on Mali-G31 Mali-G31 has the old CLPER instruction, not the new one, which means we don't get to specify a custom lane op. But the clper_xor helper incorrectly checked the arch, not the implementation quirk. Fixes: `c00e7b729f` ("pan/bi: Optimize abs(derivative)") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reported-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16846>	2022-06-02 20:32:43 -04:00
Alyssa Rosenzweig	ad5c84999b	pan/bi: Rework Valhall register alignment Because we lower SPLIT and COLLECT before RA, we need to consider offsets when determining the dimensions of vectors, in order to align properly. Lowering COLLECT post-RA would avoid this special case. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	0770e7a90c	pan/bi: Align 64-bit register sources Similar idea to aligning staging register sources. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	8553dd97ad	pan/bi: Allow vec6 for collects Hit for some Valhall texturing instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	1bfff407b9	pan/bi: Use nodearrays for linear constraints Speeds up compiling shaders/skia/781.shader_test in shader-db by 8x (Icecream95). ...At least it did before I extended to support register allocation of vec8. On Valhall, texture instructions require up to 8 consecutive registers. To handle this, provide for vec8 register allocation. Liveness was already (accidentally?) vec8. The increased memory requirement is acceptable given that the interference matrix is now stored sparsely (Alyssa). Icecream95 reports the vec8 changes hurt RA performance by about 1% on average. I consider this acceptable for now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	c70daa74f0	pan/bi: Add nodearray datastructure This is an array which can either be sparse or dense, and was designed to be used to track liveness and interference information. Either a sparse array with sorted indices or dense array is used. Other data structures were tried, such as red-black trees or hash tables, but they were slower. When used for storing constraints, the indices do not have to be sorted as duplicating elements is okay, but the speedup from that was not enough to justify the extra complexity. v2: Add a comment about how to potentially speed it up. But it seems fast enough even without this change. v3: Use a custom struct rather than relying on util_dynarray. v4: Split out functions only used for liveness analysis, rather than the simpler data structure needed for the register interference matrix. If we need to optimize liveness, that can follow on after. Also make it for vec8 (Alyssa). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	c24b78cceb	pan/bi: Reverse linear constraint bits This will make it simpler to implement parallel RA where multiple possible registers for a node are tested at once. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	bc4d42023d	pan/bi: Respect swizzles in nir_op_pack_64_2x32_split Triggered a BIR validation error, which made debugging a breeze. That validation pass (dimensionality checks) gets a lot of use, it seems :-) Fixes: dEQP-VK.ssbo.layout.2_level_array.std430.row_major_mat4x2_comp_access_store_cols Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16724>	2022-06-01 20:08:42 +00:00
Alyssa Rosenzweig	7831508740	panvk: Use vk_image_subresource_*_count for clears This handles VK_REMAINING_* for us, instead of underflowing and clearing no levels/layers. Fixes dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.* Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16724>	2022-06-01 20:08:42 +00:00
Alyssa Rosenzweig	82d3eb7f18	panfrost: Handle texturing from AFBC on Valhall We need to pack special AFBC-specific plane descriptors instead of the generic plane descriptor. Nothing too fancy here, though. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	9afa8cc555	panfrost: Support rendering to AFBC on Valhall Add the required handling when packing render target and depth buffer descriptors on Valhall. This is mostly equivalent to Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	c2207d27c2	panfrost: Add pan_afbc_compression_mode on Valhall Map a canonical format (a hardware-independent pipe_format) to a compression mode (Valhall-specific hardware enum defined in GenXML). To be used for packing plane descriptors and render target descriptors when AFBC is in use on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	87dcdbdad6	panfrost: Pass arch instead of dev into afbc_format For callers that have a device object, it's easy to pass dev->arch instead of dev. But this requires callers to have a reference to the device, which is tricky for callers that only have the arch via PAN_ARCH. Pass dev->arch instead of dev to accommodate them. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	2cc2f217d4	panfrost: Fix XML for AFBC header on v9 Misnamed field due to copy/paste fail from Bifrost. Fixes: `c011ea6c26` ("panfrost: Shuffle render target AFBC for Valhall") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	e596a0423b	pan/mdg: Print outmods when printing IR In particular, this lets us distinguish mul_high from regular mul. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	a099834b97	pan/mdg: Distinguish SSA vs reg when printing IR This makes it easy to match the printed IR with the indices in the NIR. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	520204ae18	pan/mdg: Only print 1 source for moves This makes the printed IR easier to read at a glance. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	0ee24c46e0	pan/mdg: Only print 2 sources for ALU ...and assert the other sources are null. The one place this might fail in the future is for real FMA, but we don't support that for GL. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	9c9db27e3c	pan/mdg: Only print masked components of swizzle This matches the IR printer with the disassembler, making the output of the IR printer much easier to parse at a glance. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	c9093554d0	pan/mdg: Use "<<" instead of "lsl" Easier to read and consistent with C code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	8c11f4809b	pan/mdg: Remove uppercase write masks These do not convey any additional information, and fail to account for shrinking. In particular, a 64-bit writemask with .keephi would fail to disassemble and instead trip the assertion, since that would be the ZW components. Just delete the broken code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	9e4b457958	pan/mdg: Scalarize with 64-bit sources Otherwise, we can get vec3 with u2u32 with 64-bit sources which we need lowered. Since our current approach is "scalarize all 64-bit ops", we need to check for conversions too. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:05 -04:00
Alyssa Rosenzweig	5067a26f44	pan/bi: Use flow control lowering on Valhall Logically at the same part of the compile pipeline as clause scheduling on Bifrost. Lots of similarities, too. Now that we generate flow control only as a late pass, various hacks in the compiler are no longer necessary and are dropped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	a394c32cd2	pan/va: Unit test flow control merging Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	4b06e7f5b6	pan/va: Unit test flow control insertion Test that we correctly track the scoreboard, helper invocations, reconvergence, and ends and insert NOPs to effect this expected flow control. As the pass inserts NOPs but does not otherwise modify the shader, this is easy to test with well-defined behaviour of the pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	0fa9204049	pan/va: Respect assigned slots Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	492f4055dd	pan/va: Assign slots roundrobin This should reduce false dependencies with asynchronous instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	aa7393f81a	pan/va: Add flow control merging pass Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	03d8439c0a	pan/va: Terminate helper threads On Bifrost, to terminate helper threads we set the td bit on the clause. On Valhall, we need to use the .discard flow control. Extend the flow control NOP insertion to insert NOP.discard where necessary to terminate helper threads. This should reduce wasted work in fragment shaders. This requires fairly involved data flow analysis, but the handling here should be optimal. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	41b39d6d5d	pan/va: Do scoreboard analysis Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	7e3b9cf754	pan/va: Add pass to insert flow control To set flow control modifiers correctly and efficiently, we need a pass that runs after register allocation and scheduling, but before packing. Add such a pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	82b1897900	pan/bi: Print flow control on instructions This helps debug the flow control lowering passes on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	c0180f6bd3	pan/bi: Export helper termination analysis The current helper termination analysis code is hardwired for clauses, so it won't work for Valhall. However, the bulk of it is dataflow analysis which is portable between Bifrost and Valhall. Export the interesting bits so we can reuse them on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	7bb635316b	pan/bi: Export bi_block_add_successor For use in unit tests that need to create blocks. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00

4152 commits