fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-14 14:28:08 +02:00

Author	SHA1	Message	Date
Ian Romanick	7aad19ccd2	brw/lower: Lower invalid source conversion to better code There are two fragment shaders from RDR2 that are hurt for spills and fills on Lunar Lake. Totals from 2 (0.00% of 551413) affected shaders: Spill count: 1252 -> 1317 (+5.19%) Fill count: 2518 -> 2642 (+4.92%) Those shaders... have a lot of room for improvement. There are some patterns in those shaders that we handle very, very poorly. Improving those patterns would likely improve the spills and fills in these shaders quite dramatically. Given how much other platforms are helped, I don't think this should block this commit. No shader-db or fossil-db changes on any pre-Gfx12.5 Intel platforms. v2: Add some comments and an additional assertion. Suggested by Ken. shader-db: Lunar Lake total instructions in shared programs: 18094517 -> 18094511 (<.01%) instructions in affected programs: 809 -> 803 (-0.74%) helped: 6 / HURT: 0 total cycles in shared programs: 921532158 -> 921532168 (<.01%) cycles in affected programs: 2266 -> 2276 (0.44%) helped: 0 / HURT: 3 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19820845 -> 19820839 (<.01%) instructions in affected programs: 803 -> 797 (-0.75%) helped: 6 / HURT: 0 total cycles in shared programs: 906372999 -> 906372949 (<.01%) cycles in affected programs: 3216 -> 3166 (-1.55%) helped: 6 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 141887377 -> 141884465 (-0.00%); split: -0.00%, +0.00% Cycle count: 21990301498 -> 21990267232 (-0.00%); split: -0.00%, +0.00% Spill count: 69732 -> 69797 (+0.09%) Fill count: 128521 -> 128645 (+0.10%) Totals from 349 (0.06% of 551413) affected shaders: Instrs: 506117 -> 503205 (-0.58%); split: -0.79%, +0.21% Cycle count: 32362996 -> 32328730 (-0.11%); split: -0.52%, +0.41% Spill count: 1951 -> 2016 (+3.33%) Fill count: 4899 -> 5023 (+2.53%) Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152773732 -> 152761383 (-0.01%); split: -0.01%, +0.00% Cycle count: 17187529968 -> 17187450663 (-0.00%); split: -0.00%, +0.00% Spill count: 79279 -> 79003 (-0.35%) Fill count: 148803 -> 147942 (-0.58%) Scratch Memory Size: 3949568 -> 3946496 (-0.08%) Max live registers: 31879325 -> 31879230 (-0.00%) Totals from 366 (0.06% of 633185) affected shaders: Instrs: 557377 -> 545028 (-2.22%); split: -2.22%, +0.01% Cycle count: 26171205 -> 26091900 (-0.30%); split: -0.54%, +0.24% Spill count: 3238 -> 2962 (-8.52%) Fill count: 10018 -> 9157 (-8.59%) Scratch Memory Size: 257024 -> 253952 (-1.20%) Max live registers: 28187 -> 28092 (-0.34%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	2a57568ebd	brw/build: Add scalar_group() helper Some uses of the old pattern still exist. The use in brw_fs_nir.cpp is deleted by commits !29884. The use in brw_lower_logical_sends.cpp seems different, so I decided to keep it. The next commit wants to use this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	5dfea87623	brw/opt: Always do both kinds of copy propagation before lower_load_payload shader-db: All Intel platforms except Skylake had similar results. (Lunar Lake shown) total instructions in shared programs: 18092932 -> 18092713 (<.01%) instructions in affected programs: 139290 -> 139071 (-0.16%) helped: 103 HURT: 18 helped stats (abs) min: 1 max: 8 x̄: 2.43 x̃: 2 helped stats (rel) min: 0.02% max: 9.09% x̄: 0.73% x̃: 0.29% HURT stats (abs) min: 1 max: 5 x̄: 1.72 x̃: 1 HURT stats (rel) min: 0.02% max: 0.55% x̄: 0.10% x̃: 0.08% 95% mean confidence interval for instructions value: -2.17 -1.45 95% mean confidence interval for instructions %-change: -0.83% -0.38% Instructions are helped. total cycles in shared programs: 922792268 -> 921495900 (-0.14%) cycles in affected programs: 400296984 -> 399000616 (-0.32%) helped: 765 HURT: 635 helped stats (abs) min: 2 max: 77018 x̄: 6739.33 x̃: 60 helped stats (rel) min: <.01% max: 35.59% x̄: 1.98% x̃: 0.32% HURT stats (abs) min: 2 max: 88658 x̄: 6077.51 x̃: 152 HURT stats (rel) min: <.01% max: 51.33% x̄: 2.75% x̃: 0.63% 95% mean confidence interval for cycles value: -1620.41 -231.54 95% mean confidence interval for cycles %-change: -0.10% 0.44% Inconclusive result (%-change mean confidence interval includes 0). LOST: 4 GAINED: 3 Skylake total instructions in shared programs: 18658324 -> 18579715 (-0.42%) instructions in affected programs: 2089957 -> 2011348 (-3.76%) helped: 9842 HURT: 23 helped stats (abs) min: 1 max: 24 x̄: 7.99 x̃: 8 helped stats (rel) min: 0.05% max: 40.00% x̄: 5.37% x̃: 4.52% HURT stats (abs) min: 1 max: 5 x̄: 1.57 x̃: 1 HURT stats (rel) min: 0.02% max: 1.28% x̄: 0.36% x̃: 0.24% 95% mean confidence interval for instructions value: -7.98 -7.95 95% mean confidence interval for instructions %-change: -5.43% -5.29% Instructions are helped. 
total cycles in shared programs: 860031654 -> 860237548 (0.02%) cycles in affected programs: 449175235 -> 449381129 (0.05%) helped: 7895 HURT: 4416 helped stats (abs) min: 1 max: 14129 x̄: 113.70 x̃: 22 helped stats (rel) min: <.01% max: 40.95% x̄: 1.31% x̃: 0.56% HURT stats (abs) min: 1 max: 33397 x̄: 249.89 x̃: 34 HURT stats (rel) min: <.01% max: 67.47% x̄: 2.65% x̃: 0.65% 95% mean confidence interval for cycles value: 1.46 31.98 95% mean confidence interval for cycles %-change: 0.02% 0.19% Cycles are HURT. LOST: 557 GAINED: 900 fossil-db: Lunar Lake Totals: Instrs: 141933621 -> 141884681 (-0.03%); split: -0.03%, +0.00% Cycle count: 21990657282 -> 21990200212 (-0.00%); split: -0.14%, +0.14% Spill count: 69754 -> 69732 (-0.03%); split: -0.05%, +0.02% Fill count: 128559 -> 128521 (-0.03%); split: -0.05%, +0.02% Scratch Memory Size: 5934080 -> 5925888 (-0.14%) Max live registers: 48021653 -> 48051253 (+0.06%); split: -0.00%, +0.06% Totals from 13510 (2.45% of 551410) affected shaders: Instrs: 19497180 -> 19448240 (-0.25%); split: -0.25%, +0.00% Cycle count: 2455370202 -> 2454913132 (-0.02%); split: -1.25%, +1.23% Spill count: 10975 -> 10953 (-0.20%); split: -0.32%, +0.12% Fill count: 21709 -> 21671 (-0.18%); split: -0.28%, +0.10% Scratch Memory Size: 674816 -> 666624 (-1.21%) Max live registers: 2502653 -> 2532253 (+1.18%); split: -0.01%, +1.19% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152763523 -> 152772716 (+0.01%); split: -0.00%, +0.01% Cycle count: 17188701887 -> 17187510768 (-0.01%); split: -0.10%, +0.09% Spill count: 79280 -> 79279 (-0.00%); split: -0.00%, +0.00% Fill count: 148809 -> 148803 (-0.00%) Max live registers: 31879240 -> 31879093 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5559984 -> 5559712 (-0.00%); split: +0.00%, -0.01% Totals from 20524 (3.24% of 633183) affected shaders: Instrs: 20366964 -> 20376157 (+0.05%); split: -0.01%, +0.05% Cycle count: 2406162382 -> 2404971263 (-0.05%); split: -0.68%, +0.63% Spill count: 19935 -> 19934 (-0.01%); split: -0.02%, +0.01% Fill count: 34487 -> 34481 (-0.02%) Max live registers: 1745598 -> 1745451 (-0.01%); split: -0.01%, +0.01% Max dispatch width: 117992 -> 117720 (-0.23%); split: +0.03%, -0.26% Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 150694108 -> 150683859 (-0.01%); split: -0.01%, +0.00% Cycle count: 15526754059 -> 15529031079 (+0.01%); split: -0.10%, +0.12% Max live registers: 31791599 -> 31791441 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5569488 -> 5569296 (-0.00%); split: +0.00%, -0.01% Totals from 15000 (2.37% of 632406) affected shaders: Instrs: 10965577 -> 10955328 (-0.09%); split: -0.11%, +0.02% Cycle count: 2025347115 -> 2027624135 (+0.11%); split: -0.80%, +0.91% Max live registers: 983373 -> 983215 (-0.02%); split: -0.02%, +0.00% Max dispatch width: 83064 -> 82872 (-0.23%); split: +0.12%, -0.35% Skylake Totals: Instrs: 140588784 -> 140413758 (-0.12%); split: -0.13%, +0.00% Cycle count: 14724286265 -> 14723402393 (-0.01%); split: -0.04%, +0.04% Fill count: 100130 -> 100129 (-0.00%) Max live registers: 31418029 -> 31417146 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5513400 -> 5535192 (+0.40%); split: +0.89%, -0.49% Totals from 39733 (6.35% of 625986) affected shaders: Instrs: 17240737 -> 17065711 (-1.02%); split: -1.02%, +0.01% Cycle count: 1994668203 -> 1993784331 (-0.04%); 
split: -0.31%, +0.27% Fill count: 44481 -> 44480 (-0.00%) Max live registers: 2766781 -> 2765898 (-0.03%); split: -0.03%, +0.00% Max dispatch width: 210600 -> 232392 (+10.35%); split: +23.23%, -12.89% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	be26012f1d	brw/opt: Always do copy prop, DCE, and register coalesce after lower_regioning shader-db: Lunar Lake total instructions in shared programs: 18100289 -> 18083853 (-0.09%) instructions in affected programs: 790048 -> 773612 (-2.08%) helped: 3058 / HURT: 1 total cycles in shared programs: 921691992 -> 921293816 (-0.04%) cycles in affected programs: 37210762 -> 36812586 (-1.07%) helped: 2329 / HURT: 624 LOST: 27 GAINED: 26 Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown) total instructions in shared programs: 19825635 -> 19821391 (-0.02%) instructions in affected programs: 138675 -> 134431 (-3.06%) helped: 877 / HURT: 0 total cycles in shared programs: 907900598 -> 907885713 (<.01%) cycles in affected programs: 7127161 -> 7112276 (-0.21%) helped: 318 / HURT: 242 total spills in shared programs: 5790 -> 5758 (-0.55%) spills in affected programs: 660 -> 628 (-4.85%) helped: 8 / HURT: 0 total fills in shared programs: 6744 -> 6712 (-0.47%) fills in affected programs: 708 -> 676 (-4.52%) helped: 8 / HURT: 0 LOST: 10 GAINED: 0 Skylake total instructions in shared programs: 18722197 -> 18637637 (-0.45%) instructions in affected programs: 2757553 -> 2672993 (-3.07%) helped: 12290 / HURT: 1 total cycles in shared programs: 859716039 -> 859432560 (-0.03%) cycles in affected programs: 113731837 -> 113448358 (-0.25%) helped: 9555 / HURT: 2422 LOST: 265 GAINED: 714 fossil-db: Lunar Lake, Meteor Lake, and DG2 had similar results. 
(Lunar Lake shown) Totals: Instrs: 142000618 -> 141928331 (-0.05%); split: -0.05%, +0.00% Subgroup size: 10995136 -> 10995072 (-0.00%) Cycle count: 21994723230 -> 21990481140 (-0.02%); split: -0.08%, +0.06% Spill count: 69911 -> 69754 (-0.22%); split: -0.23%, +0.00% Fill count: 128723 -> 128559 (-0.13%); split: -0.15%, +0.02% Scratch Memory Size: 5936128 -> 5934080 (-0.03%) Max live registers: 48006880 -> 48020936 (+0.03%); split: -0.01%, +0.04% Totals from 17450 (3.16% of 551410) affected shaders: Instrs: 14984149 -> 14911862 (-0.48%); split: -0.48%, +0.00% Subgroup size: 365744 -> 365680 (-0.02%) Cycle count: 2585095128 -> 2580853038 (-0.16%); split: -0.71%, +0.54% Spill count: 20893 -> 20736 (-0.75%); split: -0.76%, +0.00% Fill count: 44181 -> 44017 (-0.37%); split: -0.44%, +0.07% Scratch Memory Size: 995328 -> 993280 (-0.21%) Max live registers: 2378069 -> 2392125 (+0.59%); split: -0.20%, +0.79% Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown) Totals: Instrs: 150719758 -> 150676269 (-0.03%); split: -0.04%, +0.01% Subgroup size: 7764560 -> 7764632 (+0.00%) Cycle count: 15526689814 -> 15525687740 (-0.01%); split: -0.03%, +0.02% Spill count: 60120 -> 59472 (-1.08%); split: -1.17%, +0.10% Fill count: 105973 -> 104675 (-1.22%); split: -1.40%, +0.17% Scratch Memory Size: 2396160 -> 2381824 (-0.60%); split: -0.73%, +0.13% Max live registers: 31782879 -> 31788857 (+0.02%); split: -0.01%, +0.03% Max dispatch width: 5569200 -> 5569344 (+0.00%); split: +0.00%, -0.00% Totals from 10089 (1.60% of 632405) affected shaders: Instrs: 6389866 -> 6346377 (-0.68%); split: -0.87%, +0.19% Subgroup size: 102912 -> 102984 (+0.07%) Cycle count: 681310278 -> 680308204 (-0.15%); split: -0.65%, +0.51% Spill count: 19571 -> 18923 (-3.31%); split: -3.61%, +0.30% Fill count: 38229 -> 36931 (-3.40%); split: -3.88%, +0.48% Scratch Memory Size: 808960 -> 794624 (-1.77%); split: -2.15%, +0.38% Max live registers: 677473 -> 683451 (+0.88%); split: -0.45%, +1.33% Max 
dispatch width: 88672 -> 88816 (+0.16%); split: +0.27%, -0.11% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	b2d7a823be	brw/lower: Don't emit spurious moves to or from NULL register Previously an instruction like cmp.l.f0.0(16) null:F, v359:F, 0f would get lowered to undef(16) v13703:UD cmp.l.f0.0(16) v13703:F, v359:F, 0f mov(16) null:UD, v13703:UD After copy propagation and dead-code elimination are run again, the original CMP gets turned back into its original form! Some cases can also emit MOVs from the original NULL register. It should be possible to not do any lowering here, but there are some interactions with source lowering passes for things like cmp.l.f0.0(16) null:HF, g89.1<16,16,1>:HF, 0hf What inspired this was... diff'ing step-by-step dumps from INTEL_DEBUG=optimizer had a lot of useless changes due to these MOVs and undefs. It was very annoying. This low-effort change gets the majority of the possible benefit. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	9aba731d03	brw/cse: Don't eliminate instructions that write flags With other changes in my tree, I observed this code from dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second cmp.z removed. undef(8) %69:UD cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F mov(1) v58+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0 cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d ... undef(8) %72:UD cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F mov(1) v63+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0 This was also fixed by running dead-code elimination before CSE. That seems more like avoiding the problem than fixing it, though. I believe this affects shader-db results because leaving the second CMP in the shader can give more opportunities for cmod propagation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `234c45c929` ("intel/brw: Write a new global CSE pass that works on defs") shader-db: All Intel platforms had similar results. (Lunar Lake shown) total cycles in shared programs: 922097690 -> 922260862 (0.02%) cycles in affected programs: 3178926 -> 3342098 (5.13%) helped: 130 HURT: 88 helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16 helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18% HURT stats (abs) min: 4 max: 11992 x̄: 2292.55 x̃: 47 HURT stats (rel) min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61% 95% mean confidence interval for cycles value: 320.36 1176.63 95% mean confidence interval for cycles %-change: 1.59% 5.73% Cycles are HURT. LOST: 2 GAINED: 1 fossil-db: Lunar Lake, Meteor Lake, Tiger Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00% Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00% Max live registers: 48013385 -> 48013343 (-0.00%) Totals from 507 (0.09% of 551441) affected shaders: Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01% Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86% Max live registers: 94413 -> 94371 (-0.04%) DG2 Totals: Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00% Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00% Fill count: 150673 -> 150631 (-0.03%) Max live registers: 31871520 -> 31871476 (-0.00%) Totals from 506 (0.08% of 633197) affected shaders: Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01% Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74% Fill count: 2779 -> 2737 (-1.51%) Max live registers: 51383 -> 51339 (-0.09%) Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00% Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00% Fill count: 106610 -> 106614 (+0.00%) Max live registers: 32195006 -> 32194966 (-0.00%) Totals from 411 (0.06% of 637268) affected shaders: Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01% Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02% Fill count: 2865 -> 2869 (+0.14%) Max live registers: 42883 -> 42843 (-0.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	80a5d158ae	brw/copy: Don't copy propagate through smaller entry dest size Copy propagation would incorrectly occur in this code mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 to create mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, u0<0>:UD NoMask group0 This has different behavior. I think I just made a mistake when I changed this condition in `e3f502e007`. It seems like this condition could be relaxed to cover cases like (note the change of destination stride) mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 I'm not sure it's worth it. No shader-db or fossil-db changes on any Intel platform. Even the code for the test case mentioned in the original commit did not change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e3f502e007` ("intel/fs: Allow copy propagation between MOVs of mixed sizes") Closes: #12116 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Samuel Pitoiset	ced2404cb4	vulkan/runtime: return same cmdbuf level from the command pool freelist This fixes a performance issue on RADV because secondaries are allocated in GTT instead of VRAM for primaries. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12119 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32010>	2024-11-08 17:20:43 +00:00
Ian Romanick	c1c09e3c4a	brw/emit: Add correct 3-source instruction assertions for each platform Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled on this while trying some stuff with !31852. v2: Don't be lazy. Add proper assertions for all the things on all the platforms. Based on a suggestion by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7bed11fbde` ("intel/brw: Allow immediates in the BFE instruction on Gfx12+") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31858>	2024-11-08 16:48:57 +00:00
Gurchetan Singh	aebc6c974f	gfxstream: use vulkan_lite_runtime This results in faster compiles. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:09 -08:00
Gurchetan Singh	dd5244e6ac	gfxstream: nuke android::base::SubAllocator Use u_mm, one less dependency on libaemu v0.1.2 Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:05 -08:00
Gurchetan Singh	6a9eb986c2	gfxstream: move isHostVisible function It's separable from the rest of CoherentMemory class. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:00 -08:00
Gurchetan Singh	5d299a0bd4	util: add c++ guards to u_mm.h Needed for gfxstream. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:09:49 -08:00
Hans-Kristian Arntzen	5f70858ece	vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO. This is required, otherwise we regress latency in cases where applications are using FIFO without explicit KHR_present_wait. This is an unacceptable regression. The fix is to normalize the behavior to X11 WSI. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Fixes: `d052b0201e` ("vulkan/wsi/wayland: Use fifo protocol for FIFO") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32029>	2024-11-08 14:28:08 +00:00
Samuel Pitoiset	437bd63265	radv,aco: dump m0 and exec from the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:15 +00:00
Samuel Pitoiset	d1d41be43f	aco: declare phys regs for tba_hi/tma_hi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:15 +00:00
Samuel Pitoiset	13bab450a2	aco: fix storing SQ_WAVE_STATUS in the trap handler shader SQ_WAVE_STATUS can change inside the trap because of SCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	494050d2ea	aco: add a helper to dump SGPR to memory for the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	8c6f2fef1b	aco: use scalar buffer stores for dumping SGPRS from the trap on GFX8 This avoids using any VGPRs on GFX8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	17f6b4e51e	aco: save/restore SCC in the trap handler shader SCC is only updated on GFX9+ but let's do it by default because the trap handler shader is likely going to be more and more complex over time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	7b4386facd	aco: cleanup using fixed registers in the trap handler shader It's easier to read and potentially less error prone. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Pierre-Eric Pelloux-Prayer	9c3ac69568	ac/perfcounter: fix buffer overflow If block->b->selectors is larger than 999, "+ 4" is not enough. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	8467f57e30	radeonsi/tests: update expected results Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	cce45dc0bf	ac: switch AMD_FORCE_FAMILY handling to using ac_fake_hw_db ac_fake_hw_db can be the single place where radeon_info content is emulated when overriding the GPU type. For some fields we need to avoid overriding them with the value coming from the ioctls to get the correct behavior. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	c097c37455	ac: add 'polaris12' gpu to ac_fake_hw_db Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	1c31cec31e	ac: rename ac_surface_test_common -> ac_fake_hw_db The next commit will reuse the radeon_info when AMD_FORCE_FAMILY is used. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	2ff67083e5	radeonsi: refuse to import texture with family_overriden being set If the gfx version is overridden by the exporter process, the surface layout might not be compatible with the importer process (which uses the real gfx version). So fail early, except if the layout is LINEAR (because it should work on all generations) or a modifier is used (which should be rejected elsewhere if not supported). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	9d0aba1f97	ac/surface: add flags to surface metadata Instead of increasing the version number to describe which fields are set, use the lower 16 bits for the metadata format version, and the other bits as flags. This way the version number defines the layout, and the flags tell which values are set. The format version is bumped to 3 (= can have flags), and 2 flags are defined: * AC_SURF_METADATA_FLAG_EXTRA_MD_BIT: replaces what was version number = 2. This means the metadata contains extra information for tools. * AC_SURF_METADATA_FLAG_FAMILY_OVERRIDEN_BIT: if set, it means the surface was allocated from a context that used an overridden gfx family. This allows the importer process to fail the import early, as the surface is likely to be invalid. It also adds an extra dw at the end, to store the fake family. This is a breaking change for existing code that interpreted "version > 1" as 2, but only in one case: AC_SURF_METADATA_FLAG_FAMILY_OVERRIDEN_BIT being set, but not AC_SURF_METADATA_FLAG_EXTRA_MD_BIT, which produces a version number of 0x20001 but there's no extra data. I think this is ok, since both gfx family overriding and extra_md are debugging tools. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer	acc32cadf5	radv: set info->family_overridden when RADV_FORCE_FAMILY is used Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>	2024-11-08 13:31:02 +00:00
Karol Herbst	3154920c36	gallium: drop PIPE_SHADER_IR_NIR_SERIALIZED It's not used anymore Acked-by: David Heidelberg <david@ixit.cz> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27783>	2024-11-08 12:49:23 +00:00
Karol Herbst	80c4ffb61a	clover: drop support for nir drivers People had enough time to migrate to rusticl, also nobody would support this anyway anymore. Acked-by: David Heidelberg <david@ixit.cz> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27783>	2024-11-08 12:49:23 +00:00
Karol Herbst	277925471e	nvc0: return NULL instead of asserting in nvc0_resource_from_user_memory Fixes: `212f1ab40e` ("nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY") Acked-by: David Heidelberg <david@ixit.cz> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27783>	2024-11-08 12:49:23 +00:00
Corentin Noël	89d709a43e	virgl: Propagate the GL_MAX_stage_SHADER_STORAGE_BLOCKS for each stage Some hardware has a higher number in the compute stage than in other stages, so let's simply propagate everything when possible. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12003 Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31666>	2024-11-08 12:26:06 +00:00
Collabora's Gfx CI Team	85d25cc5c8	Uprev Piglit to eebe1b555f51dbb702f696d08ad5ae8153bcdcdd `c2b3133392...eebe1b555f` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32020>	2024-11-08 11:21:05 +00:00
David Rosca	79b12001fd	radeonsi/vcn: Stop clearing decode internal buffers FW will clear them if needed. Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31677>	2024-11-08 09:48:54 +00:00
David Rosca	1f00dfd1a7	radeonsi: Support PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE Starting with .59 amdgpu now clears VRAM on allocation, so we don't need to clear video buffers which are always allocated in VRAM. Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31677>	2024-11-08 09:48:54 +00:00
David Rosca	b4b74617ae	frontends/vdpau: Support skip clear on surface creation Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31677>	2024-11-08 09:48:54 +00:00
David Rosca	5df9097c95	frontends/va: Support skip clear on surface creation Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31677>	2024-11-08 09:48:54 +00:00
David Rosca	76df53f59b	gallium: Add PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE Used to skip calling clear_render_target when creating surface. Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31677>	2024-11-08 09:48:54 +00:00
Karol Herbst	47a1565c3d	nv/codegen: Do not use a zero immediate for tex instructions They aren't always legal for tex instructions, specifically for TXQ when an actual source is needed. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11999 Fixes: `85a31fa1fc` ("nv50/ir/nir: fix txq emission on MS textures") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32043>	2024-11-08 09:18:54 +00:00
David Rosca	2c3dd2a37d	frontends/va: Add minus_1 to AV1 render_width/height Rename to match the spec and to match the actual value. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31977>	2024-11-08 08:39:49 +00:00
David Rosca	7f2624e6ae	radeonsi/vcn: Fix coding AV1 render size This is only a header metadata hint, so it should be passed directly from packed headers to output. Also fix the value, as render_width from the frontend is actually render_width_minus_1 (and the same for height). Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31977>	2024-11-08 08:39:49 +00:00
Eric Engestrom	4ad8a5443b	ci/build: add workaround for incorrect maybe-uninitialized error Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31890>	2024-11-08 07:09:15 +00:00
Eric Engestrom	f09ae95c10	ci/build: drop "verify after bump to F39" as that did not help Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31890>	2024-11-08 07:09:15 +00:00
Eric Engestrom	45e1ffeceb	ci: upgrade the fedora image from 38 to 41 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31890>	2024-11-08 07:09:15 +00:00
Lionel Landwerlin	3ecf2a0518	anv: fix extent computation in image->image host copies Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0317c44872` ("anv: add VK_EXT_host_image_copy support") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32027>	2024-11-07 22:44:41 +00:00
Eric Engestrom	625ad5bc52	freedreno/ci: add more flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>	2024-11-07 21:49:29 +01:00
Eric Engestrom	a1b309a177	broadcom/ci: add more flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>	2024-11-07 21:49:29 +01:00
Eric Engestrom	e83613d906	radv/ci: add more flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>	2024-11-07 21:49:29 +01:00
Eric Engestrom	9229bcaf13	radeonsi/ci: add more flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>	2024-11-07 21:49:29 +01:00
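
The version/flags packing described in commit 9d0aba1f97 (ac/surface: add flags to surface metadata) can be sketched as follows. This is only an illustration, not the Mesa code: the Python names are hypothetical, and the flag bit positions are assumptions inferred from the commit message's 0x20001 example (family-overridden flag set, low 16 bits equal to 1).

```python
# Illustrative sketch only; not taken from Mesa's ac_surface code.
# Assumption: the low 16 bits carry the format version, the upper bits carry flags.
VERSION_BITS = 16
VERSION_MASK = (1 << VERSION_BITS) - 1

# Assumed flag positions (hypothetical), chosen to match the 0x20001 example.
FLAG_EXTRA_MD = 1 << 0
FLAG_FAMILY_OVERRIDDEN = 1 << 1


def pack_version(format_version, flags):
    """Combine a format version and a flag set into one 32-bit word."""
    if not 0 <= format_version <= VERSION_MASK:
        raise ValueError("format version does not fit in 16 bits")
    return (flags << VERSION_BITS) | format_version


def unpack_version(word):
    """Split a packed word back into (format_version, flags)."""
    return word & VERSION_MASK, word >> VERSION_BITS


def legacy_has_extra_md(word):
    """Old-style reader: treats any 'version > 1' as 'extra metadata present'."""
    return word > 1


def has_extra_md(word):
    """Flag-aware reader: checks the extra-metadata flag explicitly."""
    _, flags = unpack_version(word)
    return bool(flags & FLAG_EXTRA_MD)


word = pack_version(1, FLAG_FAMILY_OVERRIDDEN)
print(hex(word))                  # 0x20001
print(legacy_has_extra_md(word))  # True: the old check misreads the word
print(has_extra_md(word))         # False: no extra metadata is present
```

Under these assumptions, the one case the commit message calls out as breaking is visible directly: a reader that only tests "version > 1" sees 0x20001 and expects extra data that is not there.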

197436 commits