fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 06:18:10 +02:00

Author	SHA1	Message	Date
Eric Engestrom	c66622de3a	meson: replace manual compiler flags with meson arguments These would only have worked in GCC and Clang, which so far wasn't an issue, but let's clean it up anyway. Cc: mesa-stable Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18190>	2022-08-24 22:13:19 +00:00
Alyssa Rosenzweig	5777f99fc5	pan/mdg: Use correct idiv lowering Rip off the bandaid. We can't tolerate straight-up wrong results. We have an efficient umul_high implementation so it's not so bad. total instructions in shared programs: 1537404 -> 1537204 (-0.01%) instructions in affected programs: 143299 -> 143099 (-0.14%) helped: 89 HURT: 283 helped stats (abs) min: 1.0 max: 41.0 x̄: 5.87 x̃: 6 helped stats (rel) min: 0.39% max: 6.67% x̄: 1.41% x̃: 1.44% HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.14 x̃: 1 HURT stats (rel) min: 0.24% max: 5.71% x̄: 0.35% x̃: 0.27% 95% mean confidence interval for instructions value: -0.96 -0.12 95% mean confidence interval for instructions %-change: -0.17% 0.03% Inconclusive result (%-change mean confidence interval includes 0). total bundles in shared programs: 647521 -> 648154 (0.10%) bundles in affected programs: 45833 -> 46466 (1.38%) helped: 92 HURT: 228 helped stats (abs) min: 1.0 max: 13.0 x̄: 3.10 x̃: 3 helped stats (rel) min: 0.69% max: 7.14% x̄: 2.11% x̃: 1.99% HURT stats (abs) min: 1.0 max: 7.0 x̄: 4.03 x̃: 5 HURT stats (rel) min: 0.59% max: 7.22% x̄: 2.93% x̃: 3.40% 95% mean confidence interval for bundles value: 1.58 2.38 95% mean confidence interval for bundles %-change: 1.21% 1.76% Bundles are HURT. total quadwords in shared programs: 1135141 -> 1138268 (0.28%) quadwords in affected programs: 101064 -> 104191 (3.09%) helped: 30 HURT: 342 helped stats (abs) min: 1.0 max: 30.0 x̄: 4.97 x̃: 3 helped stats (rel) min: 0.24% max: 5.99% x̄: 1.72% x̃: 1.06% HURT stats (abs) min: 1.0 max: 16.0 x̄: 9.58 x̃: 10 HURT stats (rel) min: 0.73% max: 17.14% x̄: 3.64% x̃: 3.80% 95% mean confidence interval for quadwords value: 7.84 8.97 95% mean confidence interval for quadwords %-change: 2.99% 3.43% Quadwords are HURT. 
total registers in shared programs: 91938 -> 92265 (0.36%) registers in affected programs: 2639 -> 2966 (12.39%) helped: 0 HURT: 280 HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.17 x̃: 1 HURT stats (rel) min: 9.09% max: 50.00% x̄: 12.75% x̃: 11.11% 95% mean confidence interval for registers value: 1.12 1.22 95% mean confidence interval for registers %-change: 12.05% 13.45% Registers are HURT. total threads in shared programs: 55280 -> 55268 (-0.02%) threads in affected programs: 24 -> 12 (-50.00%) helped: 0 HURT: 11 HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.09 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -1.29 -0.89 95% mean confidence interval for threads %-change: -50.00% -50.00% Threads are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>	2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig	5bc830cbf2	pan/mdg: Reexpress umul_high packing There are a bunch of subtle details of how 32-bit sources are zero-extended to 64-bit, how their swizzles work, how 64-bit destinations are shrunk to 32-bit, and how those two interact. This fixes the interactions... mostly. Fixes umul_high, all such tests should be passing now. Unblocks idiv lowering that depends on umul_high. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>	2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig	7b78e05ba8	pan/mdg: Replicate swizzles for scalar sources This works around an issue packing 32-bit scalar swizzles zero-extended to 64-bit, seen with the umul_high implementation. I tried for a while figuring out the root cause (even rewrote a big chunk of the disassembler) but am still a bit lost. Nevertheless this is a safe workaround with no performance impact (and avoids relying on NIR undefined behaviour to implement GPU undefined behaviour), so let's do this for now to fix umul_high. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>	2022-08-24 19:54:23 +00:00
Alyssa Rosenzweig	fcae7cfd27	panfrost: Assert that blend shaders are nontrivial Even if the driver doesn't use trivial blend shaders, building and compiling blend shaders is expensive. We shouldn't be building blend shaders that should never be used. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	1d5aad9db4	panfrost: Include mask in replace blend shader name Helpful to disambiguate blend shaders with different colour masks used for the same format/replace operation. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	378b7e37f4	panfrost: Simplify blitter blend shader creation We don't need blending in the blitter. That means blend shaders are only needed on Midgard. Simplify accordingly. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	d849d9779a	panfrost: Avoid blend shader when not blending On Midgard, we need a "blend" shader even if blending is disabled, if the format isn't blendable. This is inefficient. Bifrost solves this by decoupling the format conversion from the blending, allowing opaque (unblended) output to any format without a blend shader or fragment key. Unfortunately, our blend code is from the Midgard era -- I wrote an early version of nir_lower_blend when I was still in high school! -- so we've been using blend shaders for opaque output even on Bifrost. Whoops! In SuperTuxKart, reduces blend shader calls by 30%, translating to a 15% reduction in i-cache misses. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	e59c74ec56	panfrost: Promote blend shader outputs 8->16-bit ..on Bifrost and later, where the conversion hardware makes this reasonable. This saves us from inserting a pile of conversions in the compiler to lower away the 8-bit input/output. This also generates substantially better code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	08746d7b52	panfrost: Don't saturate in Bifrost blend shaders It's unnecessary since the hardware already does the conversion for us. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	b1c9c924c7	panfrost: Set blit output variable types correctly The type of the output variable will propagate through the store_output intrinsic's src_type field to the BLEND instruction's register format field. On Valhall, the register format for a BLEND comes from the instruction -- the register format specified in the conversion descriptor (used on Bifrost) is ignored. That means it has to match. Previously, we always used a blend shader for integer rendering. Since blend shaders ignore the register format of the BLEND instruction, that masked this issue. That also means we don't need to backport this. Will prevent a regression from the following commit. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	d680560970	panfrost: Handle untyped_color_outputs on Bifrost For untyped_color_outputs, we need to ignore the type of the colour output in the shader and instead use the type from the format. We have all the information to do this at blend descriptor pack time, but not at shader compile time. This means we need a (somewhat expensive) fixup in this edge case to ingest NIR-to-TGSI. This will prevent a regression from the rest of the series. Although the register_format field is also present on Valhall blend descriptors, it is ignored so we don't need the fixup there. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	cf55e05f8f	pan/bi: Handle info.fs.untyped_color_outputs on Valhall Colour outputs in TGSI are untyped so we have to ignore the register format on the store_output. Luckily, Valhall gives us the auto32 escape hatch to deal with it, and Bifrost doesn't care anyway. This will avoid regressions from native int output without corrupting the compiler for GLSL and SPIR-V shaders that are well-typed. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	394e1f5862	pan/bi: Don't allow ATEST to take a temporary Clause scheduler edition of `db2bdc1dc3` ("pan/bi: Require ATEST coverage mask input in R60"). ATEST wants to read r60, which can't work if its input isn't even in a register. When per-sample shading isn't in use, prevents regressions in: KHR-GLES31.core.sample_variables.mask.* These tests previously passed because per-sample shading was forced. It's not clear whether the bug addressed in this patch is possible to hit "in the wild", i.e. without the optimizations in this series that allow us to use per-pixel shading in more cases. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	e12a9ce8d6	pan/bi: Don't reorder image loads across stores Fixes flaking in dEQP-GLES31.functional.image_load_store.cube.qualifiers.volatile_r32i due to image reads being moved past a BARRIER. To make this more robust/optimal, we probably need scheduling information (coherent/volatile/etc) added to instructions like ACO does. That's left for a future extension, for now I just want the test to stop flaking. Fixes: `569e5dc745` ("pan/bi: Schedule for pressure pre-RA") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>	2022-08-21 19:37:10 +00:00
Alyssa Rosenzweig	2f3cc22bc4	pan/bi: Use nir_opt_idiv_const Mitigates some of the hurt from idiv lowering. total instructions in shared programs: 2734512 -> 2734269 (<.01%) instructions in affected programs: 10419 -> 10176 (-2.33%) helped: 11 HURT: 4 helped stats (abs) min: 9.0 max: 49.0 x̄: 22.45 x̃: 19 helped stats (rel) min: 1.84% max: 7.50% x̄: 3.65% x̃: 3.30% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.14% max: 0.14% x̄: 0.14% x̃: 0.14% 95% mean confidence interval for instructions value: -25.34 -7.06 95% mean confidence interval for instructions %-change: -3.91% -1.37% Instructions are helped. total cycles in shared programs: 140629.05 -> 140628.61 (<.01%) cycles in affected programs: 25.12 -> 24.69 (-1.74%) helped: 3 HURT: 0 helped stats (abs) min: 0.0625 max: 0.3125 x̄: 0.15 x̃: 0 helped stats (rel) min: 0.82% max: 3.17% x̄: 1.60% x̃: 0.82% total cvt in shared programs: 14826.25 -> 14819.52 (-0.05%) cvt in affected programs: 189.64 -> 182.91 (-3.55%) helped: 42 HURT: 0 helped stats (abs) min: 0.046875 max: 1.015625 x̄: 0.16 x̃: 0 helped stats (rel) min: 0.74% max: 11.76% x̄: 3.73% x̃: 2.82% 95% mean confidence interval for cvt value: -0.23 -0.09 95% mean confidence interval for cvt %-change: -4.65% -2.82% Cvt are helped. total sfu in shared programs: 8601.81 -> 8613.56 (0.14%) sfu in affected programs: 85.62 -> 97.38 (13.72%) helped: 0 HURT: 41 HURT stats (abs) min: 0.0625 max: 1.25 x̄: 0.29 x̃: 0 HURT stats (rel) min: 3.45% max: 33.33% x̄: 15.48% x̃: 16.67% 95% mean confidence interval for sfu value: 0.21 0.36 95% mean confidence interval for sfu %-change: 13.28% 17.69% Sfu are HURT. 
total quadwords in shared programs: 1479736 -> 1479616 (<.01%) quadwords in affected programs: 3392 -> 3272 (-3.54%) helped: 8 HURT: 0 helped stats (abs) min: 8.0 max: 24.0 x̄: 15.00 x̃: 16 helped stats (rel) min: 1.54% max: 4.62% x̄: 3.57% x̃: 3.71% 95% mean confidence interval for quadwords value: -20.58 -9.42 95% mean confidence interval for quadwords %-change: -4.39% -2.75% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>	2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig	3eb57544b6	pan/bi: Don't use the broken idiv lowering Rip off the band-aid. We can't tolerate straight-up wrong results, after all. Addresses the Bifrost/Valhall portion of #6555. Fixes test_integer_ops uint_math / subcase. total instructions in shared programs: 2674840 -> 2734512 (2.23%) instructions in affected programs: 189964 -> 249636 (31.41%) helped: 0 HURT: 383 HURT stats (abs) min: 8.0 max: 184.0 x̄: 155.80 x̃: 173 HURT stats (rel) min: 1.85% max: 126.09% x̄: 32.38% x̃: 34.46% 95% mean confidence interval for instructions value: 150.98 160.63 95% mean confidence interval for instructions %-change: 31.27% 33.48% Instructions are HURT. total cycles in shared programs: 140627.36 -> 140629.05 (<.01%) cycles in affected programs: 24.81 -> 26.50 (6.80%) helped: 0 HURT: 4 HURT stats (abs) min: 0.1875 max: 0.9375 x̄: 0.42 x̃: 0 HURT stats (rel) min: 2.52% max: 37.50% x̄: 13.26% x̃: 6.52% 95% mean confidence interval for cycles value: -0.14 0.99 95% mean confidence interval for cycles %-change: -13.14% 39.67% Inconclusive result (value mean confidence interval includes 0). total fma in shared programs: 22578.03 -> 22549.94 (-0.12%) fma in affected programs: 1056.33 -> 1028.23 (-2.66%) helped: 383 HURT: 0 helped stats (abs) min: 0.015625 max: 0.375 x̄: 0.07 x̃: 0 helped stats (rel) min: 0.55% max: 50.00% x̄: 3.07% x̃: 2.34% 95% mean confidence interval for fma value: -0.08 -0.07 95% mean confidence interval for fma %-change: -3.39% -2.75% Fma are helped. total cvt in shared programs: 14128.91 -> 14826.25 (4.94%) cvt in affected programs: 1636.23 -> 2333.58 (42.62%) helped: 0 HURT: 383 HURT stats (abs) min: 0.0625 max: 2.109375 x̄: 1.82 x̃: 2 HURT stats (rel) min: 2.52% max: 162.50% x̄: 43.50% x̃: 46.40% 95% mean confidence interval for cvt value: 1.76 1.88 95% mean confidence interval for cvt %-change: 42.07% 44.93% Cvt are HURT. 
total sfu in shared programs: 7549.31 -> 8601.81 (13.94%) sfu in affected programs: 758.62 -> 1811.12 (138.74%) helped: 0 HURT: 383 HURT stats (abs) min: 0.375 max: 5.0 x̄: 2.75 x̃: 3 HURT stats (rel) min: 23.08% max: 266.67% x̄: 136.66% x̃: 150.00% 95% mean confidence interval for sfu value: 2.67 2.83 95% mean confidence interval for sfu %-change: 133.02% 140.29% Sfu are HURT. total quadwords in shared programs: 1449928 -> 1479736 (2.06%) quadwords in affected programs: 96544 -> 126352 (30.88%) helped: 0 HURT: 382 HURT stats (abs) min: 8.0 max: 96.0 x̄: 78.03 x̃: 88 HURT stats (rel) min: 1.82% max: 100.00% x̄: 31.71% x̃: 34.38% 95% mean confidence interval for quadwords value: 75.63 80.43 95% mean confidence interval for quadwords %-change: 30.67% 32.75% Quadwords are HURT. total threads in shared programs: 53556 -> 53479 (-0.14%) threads in affected programs: 154 -> 77 (-50.00%) helped: 0 HURT: 77 HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -1.00 -1.00 95% mean confidence interval for threads %-change: -50.00% -50.00% Threads are HURT. Bifrost is hit harder, unfortunately: total instructions in shared programs: 2414877 -> 2468058 (2.20%) instructions in affected programs: 184585 -> 237766 (28.81%) helped: 0 HURT: 383 HURT stats (abs) min: 12.0 max: 160.0 x̄: 138.85 x̃: 155 HURT stats (rel) min: 1.52% max: 111.94% x̄: 29.43% x̃: 31.44% 95% mean confidence interval for instructions value: 134.47 143.24 95% mean confidence interval for instructions %-change: 28.42% 30.45% Instructions are HURT. 
total tuples in shared programs: 1927478 -> 1964218 (1.91%) tuples in affected programs: 133176 -> 169916 (27.59%) helped: 0 HURT: 383 HURT stats (abs) min: 5.0 max: 113.0 x̄: 95.93 x̃: 107 HURT stats (rel) min: 1.02% max: 87.04% x̄: 28.44% x̃: 30.57% 95% mean confidence interval for tuples value: 92.80 99.05 95% mean confidence interval for tuples %-change: 27.47% 29.41% Tuples are HURT. total clauses in shared programs: 354853 -> 359513 (1.31%) clauses in affected programs: 22918 -> 27578 (20.33%) helped: 0 HURT: 381 HURT stats (abs) min: 1.0 max: 15.0 x̄: 12.23 x̃: 14 HURT stats (rel) min: 1.14% max: 60.00% x̄: 20.81% x̃: 22.58% 95% mean confidence interval for clauses value: 11.84 12.62 95% mean confidence interval for clauses %-change: 20.13% 21.49% Clauses are HURT. total cycles in shared programs: 166542.56 -> 167639.31 (0.66%) cycles in affected programs: 5012.37 -> 6109.13 (21.88%) helped: 0 HURT: 329 HURT stats (abs) min: 0.20833199999999863 max: 4.666665999999999 x̄: 3.33 x̃: 3 HURT stats (rel) min: 1.05% max: 51.06% x̄: 22.28% x̃: 22.78% 95% mean confidence interval for cycles value: 3.22 3.45 95% mean confidence interval for cycles %-change: 21.45% 23.10% Cycles are HURT. total arith in shared programs: 73643 -> 75173.17 (2.08%) arith in affected programs: 5344.04 -> 6874.21 (28.63%) helped: 0 HURT: 383 HURT stats (abs) min: 0.20833199999999863 max: 4.666667 x̄: 4.00 x̃: 4 HURT stats (rel) min: 1.05% max: 97.92% x̄: 29.47% x̃: 31.64% 95% mean confidence interval for arith value: 3.87 4.13 95% mean confidence interval for arith %-change: 28.45% 30.49% Arith are HURT. 
total quadwords in shared programs: 1673974 -> 1701720 (1.66%) quadwords in affected programs: 111686 -> 139432 (24.84%) helped: 0 HURT: 383 HURT stats (abs) min: 5.0 max: 84.0 x̄: 72.44 x̃: 81 HURT stats (rel) min: 1.11% max: 78.72% x̄: 25.59% x̃: 27.56% 95% mean confidence interval for quadwords value: 70.16 74.73 95% mean confidence interval for quadwords %-change: 24.74% 26.43% Quadwords are HURT. total threads in shared programs: 53655 -> 53590 (-0.12%) threads in affected programs: 130 -> 65 (-50.00%) helped: 0 HURT: 65 HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -1.00 -1.00 95% mean confidence interval for threads %-change: -50.00% -50.00% Threads are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>	2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig	35a7490ce2	pan/bi: Optimize pattern from nir_lower_idiv This takes advantage of the .i1 modifier on the comparison to get b2i32 "for free" in typical circumstances, saving an instruction. Will help with an instr count regression from lower_idiv. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>	2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig	1ef20f1f35	pan/bi: Optimize bitwise arithmetic of booleans This is easier to schedule on Bifrost. In theory it's also better on Valhall, but in practice the CVT unit is too overloaded on Valhall for this to help at the moment. We can revisit these rules for Valhall in the future where the Valhall optimizer is more mature and/or Valhall grows a scheduler to balance the execution units. total instructions in shared programs: 2415350 -> 2414877 (-0.02%) instructions in affected programs: 120948 -> 120475 (-0.39%) helped: 192 HURT: 49 helped stats (abs) min: 1.0 max: 5.0 x̄: 2.89 x̃: 4 helped stats (rel) min: 0.25% max: 4.35% x̄: 0.66% x̃: 0.52% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.11% max: 7.14% x̄: 1.73% x̃: 0.77% 95% mean confidence interval for instructions value: -2.24 -1.68 95% mean confidence interval for instructions %-change: -0.37% 0.02% Inconclusive result (%-change mean confidence interval includes 0). total tuples in shared programs: 1928474 -> 1927478 (-0.05%) tuples in affected programs: 146482 -> 145486 (-0.68%) helped: 514 HURT: 73 helped stats (abs) min: 1.0 max: 8.0 x̄: 2.11 x̃: 1 helped stats (rel) min: 0.18% max: 9.52% x̄: 1.35% x̃: 0.76% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.23 x̃: 1 HURT stats (rel) min: 0.15% max: 7.14% x̄: 1.07% x̃: 0.76% 95% mean confidence interval for tuples value: -1.85 -1.55 95% mean confidence interval for tuples %-change: -1.19% -0.91% Tuples are helped. total clauses in shared programs: 354985 -> 354853 (-0.04%) clauses in affected programs: 8562 -> 8430 (-1.54%) helped: 124 HURT: 22 helped stats (abs) min: 1.0 max: 8.0 x̄: 1.24 x̃: 1 helped stats (rel) min: 0.83% max: 7.14% x̄: 2.47% x̃: 1.72% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.25% max: 20.00% x̄: 5.08% x̃: 4.35% 95% mean confidence interval for clauses value: -1.11 -0.70 95% mean confidence interval for clauses %-change: -1.92% -0.75% Clauses are helped. 
total cycles in shared programs: 166575.48 -> 166542.56 (-0.02%) cycles in affected programs: 4556.58 -> 4523.67 (-0.72%) helped: 395 HURT: 65 helped stats (abs) min: 0.041665999999999315 max: 0.33333199999999863 x̄: 0.09 x̃: 0 helped stats (rel) min: 0.19% max: 11.11% x̄: 1.42% x̃: 0.81% HURT stats (abs) min: 0.041665999999999315 max: 0.08333400000000069 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.15% max: 8.33% x̄: 1.21% x̃: 0.83% 95% mean confidence interval for cycles value: -0.08 -0.06 95% mean confidence interval for cycles %-change: -1.22% -0.87% Cycles are helped. total arith in shared programs: 73687.88 -> 73643 (-0.06%) arith in affected programs: 6339 -> 6294.13 (-0.71%) helped: 570 HURT: 72 helped stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.08 x̃: 0 helped stats (rel) min: 0.19% max: 12.50% x̄: 1.41% x̃: 0.77% HURT stats (abs) min: 0.041665999999999315 max: 0.08333400000000069 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.15% max: 8.33% x̄: 1.13% x̃: 0.75% 95% mean confidence interval for arith value: -0.08 -0.06 95% mean confidence interval for arith %-change: -1.27% -0.98% Arith are helped. total quadwords in shared programs: 1674486 -> 1673974 (-0.03%) quadwords in affected programs: 117696 -> 117184 (-0.44%) helped: 424 HURT: 127 helped stats (abs) min: 1.0 max: 6.0 x̄: 1.64 x̃: 1 helped stats (rel) min: 0.19% max: 4.88% x̄: 1.00% x̃: 0.82% HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.46 x̃: 1 HURT stats (rel) min: 0.15% max: 6.25% x̄: 1.31% x̃: 0.88% 95% mean confidence interval for quadwords value: -1.07 -0.79 95% mean confidence interval for quadwords %-change: -0.58% -0.36% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17266>	2022-08-19 20:48:37 +00:00
Alyssa Rosenzweig	718748fe61	pan/bi: Test int8/16 -> float32 opts These are easy, since round modes don't matter. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>	2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig	c88b8cbee3	pan/bi: Fuse [US][8\|16]_TO_F32 ops This combines nicely with the previous isel change. Now GLSL like float(int_x >> 24) will generate a single machine instruction S8_TO_F32 int_x.b3 Noticed when debugging KHR-GLES31.core.shader_bitfield_operation.unpackSnorm4x8.0 ...but naturally no real workloads care. Helped shaders are from Android games that appear to have run through a translator, naturally. total instructions in shared programs: 2674831 -> 2674783 (<.01%) instructions in affected programs: 11493 -> 11445 (-0.42%) helped: 31 HURT: 0 helped stats (abs) min: 1.0 max: 3.0 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.16% max: 2.90% x̄: 0.51% x̃: 0.41% 95% mean confidence interval for instructions value: -1.87 -1.22 95% mean confidence interval for instructions %-change: -0.69% -0.33% Instructions are helped. total cvt in shared programs: 14128.84 -> 14128.09 (<.01%) cvt in affected programs: 78.17 -> 77.42 (-0.96%) helped: 31 HURT: 0 helped stats (abs) min: 0.015625 max: 0.046875 x̄: 0.02 x̃: 0 helped stats (rel) min: 0.36% max: 4.26% x̄: 1.28% x̃: 1.20% 95% mean confidence interval for cvt value: -0.03 -0.02 95% mean confidence interval for cvt %-change: -1.62% -0.94% Cvt are helped. total quadwords in shared programs: 1449920 -> 1449840 (<.01%) quadwords in affected programs: 2184 -> 2104 (-3.66%) helped: 10 HURT: 0 helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 helped stats (rel) min: 2.44% max: 5.88% x̄: 4.11% x̃: 4.76% 95% mean confidence interval for quadwords value: -8.00 -8.00 95% mean confidence interval for quadwords %-change: -5.11% -3.12% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>	2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig	eab1d36643	pan/bi: Implement some extracts and inserts Rather than lowering in NIR. Importantly for Valhall, this allows nir_opt_algebraic to optimize various bitwise ops into extracts and inserts, taking pressure off the low-throughput SFU pipe and moving it onto the high-throughput CVT pipe. This will mitigate a cycle count regression from switching to the precise idiv lowering. This also generates more integer widening conversions which we can fold into 32-bit instructions later, to allow optimizing GLSL like "(a & 0xFFFF) + b" Valhall: total instructions in shared programs: 2674836 -> 2674840 (<.01%) instructions in affected programs: 6473 -> 6477 (0.06%) helped: 14 HURT: 6 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.16% max: 1.37% x̄: 0.41% x̃: 0.49% HURT stats (abs) min: 3.0 max: 3.0 x̄: 3.00 x̃: 3 HURT stats (rel) min: 1.19% max: 1.62% x̄: 1.35% x̃: 1.24% 95% mean confidence interval for instructions value: -0.68 1.08 95% mean confidence interval for instructions %-change: -0.30% 0.53% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 140627.42 -> 140627.36 (<.01%) cycles in affected programs: 2.31 -> 2.25 (-2.70%) helped: 1 HURT: 0 total cvt in shared programs: 14127.25 -> 14128.91 (0.01%) cvt in affected programs: 153.50 -> 155.16 (1.08%) helped: 0 HURT: 41 HURT stats (abs) min: 0.015625 max: 0.09375 x̄: 0.04 x̃: 0 HURT stats (rel) min: 0.27% max: 4.44% x̄: 1.61% x̃: 1.22% 95% mean confidence interval for cvt value: 0.03 0.05 95% mean confidence interval for cvt %-change: 1.29% 1.93% Cvt are HURT. 
total sfu in shared programs: 7555.69 -> 7549.31 (-0.08%) sfu in affected programs: 107.31 -> 100.94 (-5.94%) helped: 48 HURT: 0 helped stats (abs) min: 0.0625 max: 0.375 x̄: 0.13 x̃: 0 helped stats (rel) min: 1.34% max: 50.00% x̄: 13.57% x̃: 7.14% 95% mean confidence interval for sfu value: -0.15 -0.11 95% mean confidence interval for sfu %-change: -17.07% -10.06% Sfu are helped. total quadwords in shared programs: 1449912 -> 1449928 (<.01%) quadwords in affected programs: 256 -> 272 (6.25%) helped: 0 HURT: 2 Bifrost: total instructions in shared programs: 2415370 -> 2415380 (<.01%) instructions in affected programs: 1642 -> 1652 (0.61%) helped: 2 HURT: 6 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.40% max: 0.40% x̄: 0.40% x̃: 0.40% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.95% max: 1.27% x̄: 1.07% x̃: 1.00% 95% mean confidence interval for instructions value: 0.09 2.41 95% mean confidence interval for instructions %-change: 0.13% 1.29% Instructions are HURT. total tuples in shared programs: 1928495 -> 1928476 (<.01%) tuples in affected programs: 3329 -> 3310 (-0.57%) helped: 9 HURT: 2 helped stats (abs) min: 1.0 max: 6.0 x̄: 2.56 x̃: 2 helped stats (rel) min: 0.25% max: 2.33% x̄: 1.00% x̃: 0.75% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.48% max: 0.48% x̄: 0.48% x̃: 0.48% 95% mean confidence interval for tuples value: -3.46 0.00 95% mean confidence interval for tuples %-change: -1.35% -0.10% Inconclusive result (value mean confidence interval includes 0). 
total clauses in shared programs: 354978 -> 354983 (<.01%) clauses in affected programs: 398 -> 403 (1.26%) helped: 3 HURT: 8 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.33% max: 3.85% x̄: 2.83% x̃: 2.33% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 2.27% max: 3.70% x̄: 2.88% x̃: 2.78% 95% mean confidence interval for clauses value: -0.17 1.08 95% mean confidence interval for clauses %-change: -0.51% 3.16% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 166575.69 -> 166575.65 (<.01%) cycles in affected programs: 6.88 -> 6.83 (-0.61%) helped: 1 HURT: 0 total arith in shared programs: 73688.79 -> 73688 (<.01%) arith in affected programs: 127.29 -> 126.50 (-0.62%) helped: 9 HURT: 2 helped stats (abs) min: 0.04166700000000034 max: 0.25 x̄: 0.11 x̃: 0 helped stats (rel) min: 0.26% max: 2.45% x̄: 1.07% x̃: 0.80% HURT stats (abs) min: 0.08333299999999966 max: 0.08333299999999966 x̄: 0.08 x̃: 0 HURT stats (rel) min: 0.55% max: 0.55% x̄: 0.55% x̃: 0.55% 95% mean confidence interval for arith value: -0.14 0.00 95% mean confidence interval for arith %-change: -1.44% -0.11% Inconclusive result (value mean confidence interval includes 0). total quadwords in shared programs: 1674514 -> 1674480 (<.01%) quadwords in affected programs: 9086 -> 9052 (-0.37%) helped: 23 HURT: 2 helped stats (abs) min: 1.0 max: 6.0 x̄: 1.65 x̃: 1 helped stats (rel) min: 0.15% max: 2.79% x̄: 0.63% x̃: 0.33% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.53% max: 0.53% x̄: 0.53% x̃: 0.53% 95% mean confidence interval for quadwords value: -2.08 -0.64 95% mean confidence interval for quadwords %-change: -0.86% -0.21% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17857>	2022-08-19 17:25:58 +00:00
Alyssa Rosenzweig	93f69e0452	panfrost: Don't segfault on unknown models If we don't recognize the model, dev->model will be NULL. In that case, we can't dereference dev->model to get the tilebuffer size. If we do, we'll segfault, instead of gracefully refusing to probe and loading the swrast instead. Fixes: `96d65b47c7` ("panfrost: Use implementation-specific tile size") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18115>	2022-08-19 14:43:43 +00:00
Alyssa Rosenzweig	d7e6174c2b	pan/mdg: Remove disassembler stats They're now unused and they were never especially useful. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:56 +00:00
Alyssa Rosenzweig	76e8f8b40e	pan/decode: Clean up _bifrost_ decode routines It's noisy since Bifrost was introduced, unnecessary since we converted to per-arch GenXML, and wrong since Valhall was added. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:56 +00:00
Alyssa Rosenzweig	5c00efa695	pan/decode: Centrally declare pandecode entrypoints Deduplicate in preparation for CSF. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	aba69fc9c8	pan/decode: Defeature disassembler stats Architecturally, these only work for Midgard, and even on Midgard didn't turn out to be too useful. While we're removing pandecode cruft, let's remove the stats that just add noise to Bifrost and Valhall (and largely just noise to Midgard too). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	6dfd0998f2	pan/decode: Unify SFBD/MFBD decoding It's the same core logic. Unify and let GenXML do its thing. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	e88b4949de	pan/decode: Reorder MFBD decoding Eliminate some #ifdef by grouping v5 and v6 state separately. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	504022454c	pan/decode: Simplify pandecode_fbd Remove unused width/height properties, and use cleaner C syntax to build the return value. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	9621df9637	pan/decode: Stop passing suffixes around Unused. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	42319c6b6d	pan/decode: Stop passing job index around There are a lot of problems with passing job_index around: * Almost entirely unused * Not particularly helpful even when used * Mostly ignored for Valhall already * Doesn't extend to CSF It only really exists due to the early days of pandecode generating valid C code as the trace format. With GenXML instead, that's not applicable. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	3298ac4b12	pan/decode: Remove pandecode_msg It hasn't had a consistent semantic meaning since we've switched decoding over to GenXML. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Alyssa Rosenzweig	c4c3f246fe	pan/decode: Don't pass around memory handles The hardware doesn't care what BO a given buffer resides in, only what GPU address it's at. It's simpler to fetch from a GPU address, rather than the pair of a GPU address and a backing allocation. This cleans up a lot of cruft in pandecode. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18094>	2022-08-17 17:25:55 +00:00
Yonggang Luo	1b38ca7844	panfrost: Do not use designated initializer for union ../src/panfrost/lib/tests/test-earlyzs.cpp: In function 'void test(pan_earlyzs, pan_earlyzs, uint32_t)': ../src/panfrost/lib/tests/test-earlyzs.cpp:59:4: error: 'pan_shader_info::<unnamed union>' has no non-static data member named 'can_discard' 59 | }; | ^ Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18024>	2022-08-12 18:06:36 +00:00
Konstantin Kharlamov	91362340f3	meson: remove source_root() call in nir compiler path source_root function is deprecated in Meson version 0.56.0, so let's use instead a current_source_dir() function, available in all Meson versions. This also allows to deduplicate some code by declaring commonly used string at the top meson.build file. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17974>	2022-08-12 13:11:03 +00:00
Emma Anholt	cbbd9f3402	ci: Upgrade deqp-runner to 0.15.0. This includes the new timeout fixes so that tests that throw lots of debug don't delay the timeout triggering, and the fraction vs shuffling behavior change so that "--fraction 2" doesn't just skip every other test as it appears in the caselist (every vertex shader variant, for example). The fraction vs shuffling change does mean we see some different fails on some drivers now. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17876>	2022-08-07 02:11:46 +00:00
Alyssa Rosenzweig	07e9543270	pan/decode: Fix overrun decoding planes We need to calculate the # of descriptors like we do on Midgard. Fixes: `ae9316f812` ("pan/decode: Decode Valhall surface descriptor") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17842>	2022-08-02 21:11:06 +00:00
Alyssa Rosenzweig	ac5c1039a2	pan/bi: Rename CLPER_V6.i32 to CLPER_OLD.i32 To reflect that it is the CLPER of choice on Mali-G31 which is a v7 target. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17754>	2022-08-01 18:42:57 +00:00
Alyssa Rosenzweig	d8bd80afeb	pan/bi: Assert that we use the correct CLPER Add an assert at pack time that would have caught the bug fixed in `7535362204` ("pan/bi: Fix clper_xor on Mali-G31"). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17754>	2022-08-01 18:42:57 +00:00
Eli Schwartz	5780ea90c4	meson: add various generated header dependencies as order-only deps https://mesonbuild.com/FAQ.html#how-do-i-tell-meson-that-my-sources-use-generated-headers A few locations had underspecified deps on the header files, and this caused builds to fail given sufficient parallelism. Fixes #6531 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16659>	2022-07-31 18:10:15 +00:00
Eric Engestrom	2c67457e5e	util/list: rename LIST_ENTRY() to list_entry() This follows the Linux kernel convention, and avoids collision with macOS header macro. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6751 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6840 Cc: mesa-stable Signed-off-by: Eric Engestrom <eric@igalia.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17772>	2022-07-28 10:10:44 +00:00
Icecream95	a8dbf61b46	panfrost: Add a debug option for checking overflows on pool uploads PAN_MESA_DEBUG=overflow will place objects as close as possible to a protected region at the end of the buffer, so that overflows segfault. Caught the bugs in all four of the preceding commits. v2: memset the BO to 0xbb to catch code expecting zeroed allocations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17447>	2022-07-23 00:56:10 +00:00
Icecream95	379ae6d823	panfrost: Emit the correct number of attributes create_vertex_elements_state is sometimes called with a too large num_elements argument, for example with util_blitter, which causes a buffer overflow. There is no documentation to forbid this practice, so don't rely on so->num_elements being correct and instead use the vertex shader attribute count, which matches the value used to allocate the descriptors. Use attributes_read_count rather than attribute_count because the latter also includes images and PAN_VERTEX_ID/PAN_INSTANCE_ID. Fixes: `76de3e691c` ("panfrost: Merge attribute packing routines") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17447>	2022-07-23 00:56:10 +00:00
Jason Ekstrand	b510ee0d22	Use vk_foreach_struct_const where needed We're about to make it so that the compiler warns/errors if you use the wrong iterator macro. Fix up a bunch of places where someone used the wrong one before we break anything. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17630>	2022-07-19 19:55:17 +00:00
Alyssa Rosenzweig	3a0a8688d3	panfrost: Use early-ZS helpers Remove the previous compile-time early-ZS implementation and replace it with the decoupled early-ZS implementation. This uses more efficient settings in some cases (depth/stencil tests always passes or do not write), and fixes the settings used in another case (alpha-to-coverage enabled with an otherwise early-ZS shader.) Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Closes: #6206 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>	2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig	fe875c0144	panfrost: Unit test early-ZS helpers The new early-ZS helpers are pure functions, leaf nodes of the call graph, and implemented with a different algorithm from the "oracle" table of correct values for various combinations of states. Further, incorrect settings often still pass CTS while causing game bugs or inefficiencies. That combination makes the helpers an excellent candidate for unit tests. Add some. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>	2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig	e96292bc07	panfrost: Add decoupled early-ZS helpers Bifrost (and Valhall) separate early-ZS configuration into two fields: when does the depth/stencil buffer update happen? and when are pixels killed by the depth/stencil tests? The driver separately configures these to occur early (before the shader executes) or late (after the ATEST instruction executes at the end of the shader). Early tests are generally more efficient, but various combinations of API state and fragment shader properties can require late updates and/or late kills for correctness. Determining how to configure these fields is nontrivial. Our current implementation (on Bifrost) configures these fields at fragment shader compile time and bakes the settings into the RSD. This is both wrong (using early testing when late testing is required) and suboptimal (using late testing when early testing would suffice). We need to defer this configuration until draw time, when we know rasterizer and Z/S state. Reclassifying at draw time (as we currently do on Valhall) would be expensive, especially with the extra terms added in here. To cope, decouple the shader classification from the draw-time configuration. Since there are only a few bits of draw state involved, this implementation just calculates all possible states. Then the draw time classification is just indexing into a lookup table. The actual algorithm used to classify is written with correctness and clarity in mind. Unlike the current classification algorithm (which tries to match what the DDK does, poorly), this algorithm embeds its proofs of correctness. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>	2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig	29c33f75d3	pan/va: Stall after ATEST In theory this wait is required for correct behaviour of discarded threads with ATEST. Mesa usually waits before the instruction after ATEST, so this wait will get optimized out by va_merge_flow, but as our scheduler gets more sophisticated this could become an issue. Let's stay on the safe side and insert the recommended wait. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>	2022-07-13 21:05:35 +00:00
Alyssa Rosenzweig	db2bdc1dc3	pan/bi: Require ATEST coverage mask input in R60 In theory, ATEST can take any combination of registers for inputs. Experimentally, however, ATEST requires the coverage mask in R60. This avoids regressing the following dEQP tests, which write their coverage mask with pixel-frequency-shading but without writing to the depth/stencil buffer. dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_pixel.* This issue is known to affect both Mali-G52 (v7) and Mali-G57 (v9). I am unsure if this is a silicon bug or just an obscure implementation detail. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>	2022-07-13 21:05:35 +00:00


4281 commits