Ian Romanick
db2b1e4d76
brw/nir: Treat load_btd_{global,local}_arg_addr_intel and load_btd_shader_type_intel as convergent
...
No shader-db changes on any Intel platform. No fossil-db changes on
Tiger Lake, Ice Lake, or Skylake.
fossil-db:
Lunar Lake
Totals:
Instrs: 141808714 -> 141808513 (-0.00%); split: -0.00%, +0.00%
Cycle count: 22177889310 -> 22181410192 (+0.02%); split: -0.00%, +0.02%
Spill count: 69892 -> 69890 (-0.00%); split: -0.01%, +0.01%
Fill count: 128313 -> 128331 (+0.01%)
Max live registers: 48052083 -> 48052742 (+0.00%); split: -0.00%, +0.00%
Totals from 549 (0.10% of 551446) affected shaders:
Instrs: 911251 -> 911050 (-0.02%); split: -0.10%, +0.07%
Cycle count: 1244153266 -> 1247674148 (+0.28%); split: -0.04%, +0.32%
Spill count: 15849 -> 15847 (-0.01%); split: -0.04%, +0.03%
Fill count: 35087 -> 35105 (+0.05%)
Max live registers: 68047 -> 68706 (+0.97%); split: -0.25%, +1.22%
Meteor Lake
Totals:
Instrs: 152744298 -> 152741241 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17410258529 -> 17405949054 (-0.02%); split: -0.04%, +0.01%
Spill count: 78528 -> 78598 (+0.09%); split: -0.01%, +0.09%
Fill count: 147893 -> 147978 (+0.06%); split: -0.00%, +0.06%
Scratch Memory Size: 3962880 -> 3969024 (+0.16%)
Max live registers: 31887206 -> 31887413 (+0.00%); split: -0.00%, +0.00%
Totals from 552 (0.09% of 633315) affected shaders:
Instrs: 907279 -> 904222 (-0.34%); split: -0.48%, +0.15%
Cycle count: 1152358569 -> 1148049094 (-0.37%); split: -0.56%, +0.19%
Spill count: 15290 -> 15360 (+0.46%); split: -0.03%, +0.48%
Fill count: 35313 -> 35398 (+0.24%); split: -0.02%, +0.26%
Scratch Memory Size: 1313792 -> 1319936 (+0.47%)
Max live registers: 34218 -> 34425 (+0.60%); split: -0.47%, +1.08%
DG2
Totals:
Instrs: 152766492 -> 152763061 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17406058608 -> 17406396943 (+0.00%); split: -0.02%, +0.02%
Spill count: 78626 -> 78624 (-0.00%); split: -0.01%, +0.01%
Fill count: 147956 -> 148007 (+0.03%); split: -0.01%, +0.04%
Scratch Memory Size: 3962880 -> 3969024 (+0.16%)
Max live registers: 31887158 -> 31887365 (+0.00%); split: -0.00%, +0.00%
Totals from 552 (0.09% of 633315) affected shaders:
Instrs: 908513 -> 905082 (-0.38%); split: -0.47%, +0.09%
Cycle count: 1148162185 -> 1148500520 (+0.03%); split: -0.23%, +0.26%
Spill count: 15364 -> 15362 (-0.01%); split: -0.07%, +0.06%
Fill count: 35343 -> 35394 (+0.14%); split: -0.03%, +0.17%
Scratch Memory Size: 1313792 -> 1319936 (+0.47%)
Max live registers: 34218 -> 34425 (+0.60%); split: -0.47%, +1.08%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
f3593df877
brw/nir: Treat load_reloc_const_intel as convergent
...
shader-db:
Lunar Lake, Meteor Lake, DG2, and Tiger Lake had similar results. (Lunar Lake shown)
Lunar Lake
total instructions in shared programs: 18096549 -> 18096537 (<.01%)
instructions in affected programs: 26128 -> 26116 (-0.05%)
helped: 7 / HURT: 2
total cycles in shared programs: 922073090 -> 922093922 (<.01%)
cycles in affected programs: 10574198 -> 10595030 (0.20%)
helped: 19 / HURT: 76
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20503943 -> 20504053 (<.01%)
instructions in affected programs: 23378 -> 23488 (0.47%)
helped: 6 / HURT: 5
total cycles in shared programs: 875477036 -> 875480112 (<.01%)
cycles in affected programs: 13840528 -> 13843604 (0.02%)
helped: 22 / HURT: 55
total spills in shared programs: 4546 -> 4552 (0.13%)
spills in affected programs: 8 -> 14 (75.00%)
helped: 0 / HURT: 1
total fills in shared programs: 5280 -> 5298 (0.34%)
fills in affected programs: 24 -> 42 (75.00%)
helped: 0 / HURT: 1
One compute shader in Tomb Raider was hurt for spills and fills.
fossil-db:
Lunar Lake
Totals:
Instrs: 141808815 -> 141808714 (-0.00%); split: -0.00%, +0.00%
Cycle count: 22185066952 -> 22177889310 (-0.03%); split: -0.05%, +0.02%
Spill count: 69859 -> 69892 (+0.05%); split: -0.03%, +0.07%
Fill count: 128344 -> 128313 (-0.02%); split: -0.04%, +0.01%
Scratch Memory Size: 5833728 -> 5829632 (-0.07%)
Totals from 13384 (2.43% of 551446) affected shaders:
Instrs: 13852162 -> 13852061 (-0.00%); split: -0.00%, +0.00%
Cycle count: 7691993336 -> 7684815694 (-0.09%); split: -0.15%, +0.06%
Spill count: 53266 -> 53299 (+0.06%); split: -0.03%, +0.10%
Fill count: 96492 -> 96461 (-0.03%); split: -0.05%, +0.02%
Scratch Memory Size: 3827712 -> 3823616 (-0.11%)
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 152744735 -> 152744298 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17400199290 -> 17410258529 (+0.06%); split: -0.01%, +0.07%
Max live registers: 31887208 -> 31887206 (-0.00%)
Totals from 12435 (1.96% of 633315) affected shaders:
Instrs: 13445310 -> 13444873 (-0.00%); split: -0.00%, +0.00%
Cycle count: 6941685096 -> 6951744335 (+0.14%); split: -0.03%, +0.18%
Max live registers: 1071302 -> 1071300 (-0.00%)
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 150644063 -> 150643944 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15618718733 -> 15622092285 (+0.02%); split: -0.01%, +0.03%
Spill count: 58816 -> 58790 (-0.04%)
Fill count: 101054 -> 101065 (+0.01%)
Max live registers: 31792771 -> 31792766 (-0.00%); split: -0.00%, +0.00%
Totals from 13383 (2.12% of 632544) affected shaders:
Instrs: 12016285 -> 12016166 (-0.00%); split: -0.00%, +0.00%
Cycle count: 5239956851 -> 5243330403 (+0.06%); split: -0.02%, +0.08%
Spill count: 28977 -> 28951 (-0.09%)
Fill count: 47568 -> 47579 (+0.02%)
Max live registers: 1001554 -> 1001549 (-0.00%); split: -0.00%, +0.00%
Skylake
Totals:
Instrs: 140943195 -> 140943154 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14818940190 -> 14816706154 (-0.02%); split: -0.02%, +0.00%
Max live registers: 31663173 -> 31663168 (-0.00%); split: -0.00%, +0.00%
Totals from 12625 (2.01% of 629351) affected shaders:
Instrs: 11598223 -> 11598182 (-0.00%); split: -0.00%, +0.00%
Cycle count: 4519027823 -> 4516793787 (-0.05%); split: -0.05%, +0.00%
Max live registers: 970275 -> 970270 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
fb9b363376
brw/nir: Treat load_inline_data_intel as convergent
...
No shader-db changes on any Intel platform.
fossil-db:
Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 141808595 -> 141808815 (+0.00%); split: -0.00%, +0.00%
Cycle count: 22181300418 -> 22185066952 (+0.02%); split: -0.01%, +0.03%
Max live registers: 48052077 -> 48052083 (+0.00%)
Totals from 720 (0.13% of 551446) affected shaders:
Instrs: 116778 -> 116998 (+0.19%); split: -0.01%, +0.20%
Cycle count: 1197931082 -> 1201697616 (+0.31%); split: -0.21%, +0.53%
Max live registers: 56552 -> 56558 (+0.01%)
No fossil-db changes on any other Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
3e63920ca5
brw/nir: Treat some load_ubo as convergent
...
v2: Fix for Xe2.
No changes in shader-db or fossil-db on Lunar Lake, Meteor Lake, or DG2.
shader-db:
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
total instructions in shared programs: 19626547 -> 19634353 (0.04%)
instructions in affected programs: 1591181 -> 1598987 (0.49%)
helped: 925 / HURT: 3595
total cycles in shared programs: 865236718 -> 866682659 (0.17%)
cycles in affected programs: 151284264 -> 152730205 (0.96%)
helped: 3430 / HURT: 5510
total sends in shared programs: 1032237 -> 1032233 (<.01%)
sends in affected programs: 20 -> 16 (-20.00%)
helped: 4 / HURT: 0
LOST: 48
GAINED: 141
fossil-db:
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 150662952 -> 150641175 (-0.01%); split: -0.03%, +0.02%
Subgroup size: 7768880 -> 7768888 (+0.00%)
Send messages: 7502265 -> 7502044 (-0.00%)
Cycle count: 15621785298 -> 15618640525 (-0.02%); split: -0.06%, +0.04%
Spill count: 58818 -> 58816 (-0.00%)
Fill count: 101063 -> 101054 (-0.01%)
Max live registers: 31795403 -> 31792179 (-0.01%); split: -0.01%, +0.00%
Max dispatch width: 5572160 -> 5571488 (-0.01%); split: +0.00%, -0.01%
Totals from 10278 (1.62% of 632539) affected shaders:
Instrs: 5276493 -> 5254716 (-0.41%); split: -0.89%, +0.48%
Subgroup size: 156432 -> 156440 (+0.01%)
Send messages: 279259 -> 279038 (-0.08%)
Cycle count: 6483576378 -> 6480431605 (-0.05%); split: -0.16%, +0.11%
Spill count: 27133 -> 27131 (-0.01%)
Fill count: 49384 -> 49375 (-0.02%)
Max live registers: 675781 -> 672557 (-0.48%); split: -0.49%, +0.01%
Max dispatch width: 97256 -> 96584 (-0.69%); split: +0.08%, -0.77%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
c48570d2b2
brw/nir: Treat some ALU results as convergent
...
v2: Fix for Xe2.
v3: Fix handling of 64-bit CMP results.
v4: Scalarize 16-bit comparison temporary destination when used as a
source (as was already done for 64-bit). Suggested by Ken.
shader-db:
Lunar Lake
total instructions in shared programs: 18096500 -> 18096549 (<.01%)
instructions in affected programs: 15919 -> 15968 (0.31%)
helped: 8 / HURT: 21
total cycles in shared programs: 921841300 -> 922073090 (0.03%)
cycles in affected programs: 115946336 -> 116178126 (0.20%)
helped: 386 / HURT: 135
Meteor Lake and DG2 (Meteor Lake shown)
total instructions in shared programs: 19836053 -> 19836016 (<.01%)
instructions in affected programs: 19547 -> 19510 (-0.19%)
helped: 21 / HURT: 18
total cycles in shared programs: 906713777 -> 906588541 (-0.01%)
cycles in affected programs: 96914584 -> 96789348 (-0.13%)
helped: 335 / HURT: 134
total fills in shared programs: 6712 -> 6710 (-0.03%)
fills in affected programs: 52 -> 50 (-3.85%)
helped: 1 / HURT: 0
LOST: 1
GAINED: 1
Tiger Lake
total instructions in shared programs: 19641284 -> 19641278 (<.01%)
instructions in affected programs: 12358 -> 12352 (-0.05%)
helped: 10 / HURT: 19
total cycles in shared programs: 865413131 -> 865460513 (<.01%)
cycles in affected programs: 74641489 -> 74688871 (0.06%)
helped: 388 / HURT: 100
total spills in shared programs: 3899 -> 3898 (-0.03%)
spills in affected programs: 17 -> 16 (-5.88%)
helped: 1 / HURT: 0
total fills in shared programs: 3249 -> 3245 (-0.12%)
fills in affected programs: 51 -> 47 (-7.84%)
helped: 1 / HURT: 0
LOST: 1
GAINED: 1
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20495826 -> 20496111 (<.01%)
instructions in affected programs: 53220 -> 53505 (0.54%)
helped: 28 / HURT: 16
total cycles in shared programs: 875173550 -> 875243910 (<.01%)
cycles in affected programs: 51700652 -> 51771012 (0.14%)
helped: 400 / HURT: 39
total spills in shared programs: 4546 -> 4546 (0.00%)
spills in affected programs: 288 -> 288 (0.00%)
helped: 1 / HURT: 2
total fills in shared programs: 5224 -> 5280 (1.07%)
fills in affected programs: 795 -> 851 (7.04%)
helped: 0 / HURT: 4
LOST: 1
GAINED: 1
fossil-db:
Lunar Lake
Totals:
Instrs: 141811551 -> 141807640 (-0.00%); split: -0.00%, +0.00%
Cycle count: 22183128332 -> 22181285594 (-0.01%); split: -0.06%, +0.05%
Spill count: 69890 -> 69859 (-0.04%); split: -0.09%, +0.04%
Fill count: 128877 -> 128344 (-0.41%); split: -0.42%, +0.00%
Max live registers: 48053415 -> 48051613 (-0.00%); split: -0.00%, +0.00%
Totals from 6817 (1.24% of 551443) affected shaders:
Instrs: 4300169 -> 4296258 (-0.09%); split: -0.14%, +0.05%
Cycle count: 17263755610 -> 17261912872 (-0.01%); split: -0.08%, +0.07%
Spill count: 41822 -> 41791 (-0.07%); split: -0.15%, +0.07%
Fill count: 75523 -> 74990 (-0.71%); split: -0.71%, +0.01%
Max live registers: 733647 -> 731845 (-0.25%); split: -0.29%, +0.04%
Meteor Lake and all older Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 152735305 -> 152735801 (+0.00%); split: -0.00%, +0.00%
Subgroup size: 7733536 -> 7733616 (+0.00%)
Cycle count: 17398725539 -> 17400873100 (+0.01%); split: -0.00%, +0.02%
Max live registers: 31887018 -> 31885742 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5561696 -> 5561712 (+0.00%)
Totals from 5672 (0.90% of 633314) affected shaders:
Instrs: 2817606 -> 2818102 (+0.02%); split: -0.05%, +0.07%
Subgroup size: 81128 -> 81208 (+0.10%)
Cycle count: 10021470543 -> 10023618104 (+0.02%); split: -0.01%, +0.03%
Max live registers: 306520 -> 305244 (-0.42%); split: -0.43%, +0.01%
Max dispatch width: 74136 -> 74152 (+0.02%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
7eab2cb67e
brw/nir: Treat load_workgroup_id as convergent
...
v2: Fix for Xe2.
shader-db:
Lunar Lake Meteor Lake, DG2, and Tiger Lake had similar results. (Lunar Lake shown)
total instructions in shared programs: 18096526 -> 18096500 (<.01%)
instructions in affected programs: 6759 -> 6733 (-0.38%)
helped: 9 / HURT: 3
total cycles in shared programs: 921727804 -> 921841300 (0.01%)
cycles in affected programs: 110049730 -> 110163226 (0.10%)
helped: 90 / HURT: 372
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20496591 -> 20496402 (<.01%)
instructions in affected programs: 48757 -> 48568 (-0.39%)
helped: 25 / HURT: 8
total cycles in shared programs: 875253948 -> 875237902 (<.01%)
cycles in affected programs: 56760140 -> 56744094 (-0.03%)
helped: 363 / HURT: 34
total spills in shared programs: 4555 -> 4546 (-0.20%)
spills in affected programs: 174 -> 165 (-5.17%)
helped: 2 / HURT: 0
total fills in shared programs: 5243 -> 5224 (-0.36%)
fills in affected programs: 382 -> 363 (-4.97%)
helped: 2 / HURT: 0
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 141811577 -> 141811551 (-0.00%); split: -0.00%, +0.00%
Cycle count: 22173792370 -> 22183128332 (+0.04%); split: -0.00%, +0.04%
Max live registers: 48053498 -> 48053415 (-0.00%)
Totals from 3911 (0.71% of 551443) affected shaders:
Instrs: 2164804 -> 2164778 (-0.00%); split: -0.00%, +0.00%
Cycle count: 2404062476 -> 2413398438 (+0.39%); split: -0.02%, +0.41%
Max live registers: 413583 -> 413500 (-0.02%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
6fab1b77c2
brw/nir: Treat some load_uniform as convergent
...
No shader-db changes on any Intel platform.
v2: Fix for Xe2.
v3: Rework the way that we determine that an intrinsic can actually be
convergent. This will now depend on whether or not the important
sources have previously be determined to be convergent. Fixes
intermitent failures in some test cases (including
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.push_constant_float_16_to_32.scalar_frag).
v4: s/the it/it/ in a comment. Noticed by Ken.
fossil-db:
No fossil-db changes on Lunar Lake.
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 152743449 -> 152743161 (-0.00%)
Cycle count: 17399179660 -> 17399193488 (+0.00%)
Totals from 144 (0.02% of 633314) affected shaders:
Instrs: 5936 -> 5648 (-4.85%)
Cycle count: 51616 -> 65444 (+26.79%)
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 150646195 -> 150645907 (-0.00%)
Cycle count: 15618427818 -> 15618428942 (+0.00%)
Totals from 144 (0.02% of 632567) affected shaders:
Instrs: 6218 -> 5930 (-4.63%)
Cycle count: 39968 -> 41092 (+2.81%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
Ian Romanick
341e5117ec
brw/nir: Treat load_const as convergent
...
opt_combine_constants goes to great effort to pack 8 constants into a
single register, this can't have much effect.
There is a lot of fossil-db variation among platforms, but the results
are generally positive.
v2: Fix for Xe2.
shader-db:
Lunar Lake
total instructions in shared programs: 18095100 -> 18092845 (-0.01%)
instructions in affected programs: 158931 -> 156676 (-1.42%)
helped: 423 / HURT: 0
total cycles in shared programs: 921523326 -> 921522784 (<.01%)
cycles in affected programs: 7522774 -> 7522232 (<.01%)
helped: 225 / HURT: 228
LOST: 1
GAINED: 7
Meteor Lake and all older Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19820211 -> 19820303 (<.01%)
instructions in affected programs: 53087 -> 53179 (0.17%)
helped: 135 / HURT: 1
total cycles in shared programs: 906380523 -> 906383031 (<.01%)
cycles in affected programs: 1402315 -> 1404823 (0.18%)
helped: 156 / HURT: 100
LOST: 1
GAINED: 16
fossil-db:
Lunar Lake
Totals:
Instrs: 141876801 -> 141783010 (-0.07%); split: -0.07%, +0.00%
Subgroup size: 10994624 -> 10994704 (+0.00%)
Cycle count: 22173441950 -> 22172949188 (-0.00%); split: -0.01%, +0.01%
Spill count: 69850 -> 69890 (+0.06%); split: -0.00%, +0.06%
Fill count: 129285 -> 128877 (-0.32%)
Max live registers: 48047900 -> 48043650 (-0.01%); split: -0.01%, +0.00%
Totals from 29837 (5.41% of 551396) affected shaders:
Instrs: 7842512 -> 7748721 (-1.20%); split: -1.23%, +0.03%
Subgroup size: 940320 -> 940400 (+0.01%)
Cycle count: 3444846368 -> 3444353606 (-0.01%); split: -0.09%, +0.08%
Spill count: 23358 -> 23398 (+0.17%); split: -0.01%, +0.18%
Fill count: 52296 -> 51888 (-0.78%)
Max live registers: 3183481 -> 3179231 (-0.13%); split: -0.16%, +0.03%
Meteor Lake
Totals:
Instrs: 152709353 -> 152666543 (-0.03%); split: -0.03%, +0.00%
Cycle count: 17397176906 -> 17397668904 (+0.00%); split: -0.00%, +0.01%
Fill count: 147896 -> 147893 (-0.00%)
Max live registers: 31862891 -> 31861888 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5559664 -> 5561776 (+0.04%); split: +0.08%, -0.04%
Totals from 20913 (3.30% of 633046) affected shaders:
Instrs: 6676676 -> 6633866 (-0.64%); split: -0.64%, +0.00%
Cycle count: 1498330125 -> 1498822123 (+0.03%); split: -0.06%, +0.09%
Fill count: 41010 -> 41007 (-0.01%)
Max live registers: 1799295 -> 1798292 (-0.06%); split: -0.06%, +0.00%
Max dispatch width: 12880 -> 14992 (+16.40%); split: +33.29%, -16.89%
DG2 and Tiger Lake had similar results. (DG2 shown)
Totals:
Instrs: 152730878 -> 152688139 (-0.03%); split: -0.03%, +0.00%
Cycle count: 17394835605 -> 17394179808 (-0.00%); split: -0.01%, +0.00%
Max live registers: 31862843 -> 31861840 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5559664 -> 5561776 (+0.04%); split: +0.08%, -0.04%
Totals from 20912 (3.30% of 633046) affected shaders:
Instrs: 6563021 -> 6520282 (-0.65%); split: -0.65%, +0.00%
Cycle count: 1201999616 -> 1201343819 (-0.05%); split: -0.08%, +0.03%
Max live registers: 1798392 -> 1797389 (-0.06%); split: -0.06%, +0.00%
Max dispatch width: 12872 -> 14984 (+16.41%); split: +33.31%, -16.90%
Ice Lake
Totals:
Instrs: 151914872 -> 151868108 (-0.03%)
Cycle count: 15262958696 -> 15262665082 (-0.00%); split: -0.00%, +0.00%
Max live registers: 32194225 -> 32193192 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5650880 -> 5650608 (-0.00%); split: +0.02%, -0.03%
Totals from 22192 (3.48% of 637223) affected shaders:
Instrs: 6419739 -> 6372975 (-0.73%)
Cycle count: 184733818 -> 184440204 (-0.16%); split: -0.36%, +0.20%
Max live registers: 1989950 -> 1988917 (-0.05%); split: -0.05%, +0.00%
Max dispatch width: 5744 -> 5472 (-4.74%); split: +23.40%, -28.13%
Skylake
Totals:
Instrs: 141027379 -> 140811741 (-0.15%)
Cycle count: 14817704293 -> 14817418611 (-0.00%); split: -0.01%, +0.01%
Max live registers: 31628796 -> 31627791 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5535176 -> 5539880 (+0.08%); split: +0.14%, -0.06%
Totals from 22218 (3.53% of 628840) affected shaders:
Instrs: 5944856 -> 5729218 (-3.63%)
Cycle count: 182845101 -> 182559419 (-0.16%); split: -0.60%, +0.44%
Max live registers: 1974576 -> 1973571 (-0.05%); split: -0.07%, +0.02%
Max dispatch width: 16912 -> 21616 (+27.81%); split: +46.93%, -19.11%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
d0f1a94e3d
brw/build: Prepare BROADCAST for scalar values
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
5ea9ed4798
brw/nir: Prepare try_rebuild_source for scalar values
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
59f66b4150
brw/emit: Allow scalar sources to HF math instructions on Xe2
...
v2: Add a comment explaining the context of the workaround. Suggested by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
4457073c32
brw/lower: Properly handle UNIFORM globals address in lower_trace_ray_logical_send
...
v2: Don't shadow previous declaration of globals_addr. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
007c92b2ac
brw/lower: Adjust source stride on DF is_scalar sources to MAD on Gfx9
...
This commit used to be "brw/emit: Allow scalar sources to 64-bit
3-source instructions". These instructions were fixed up in
brw_eu_emit. There seems to be some conflict with the <0,1,0> stride an
post-RA scheduling. The only difference between the passing code
generated by this commit and the failing code generated by the older
commit is some post-RA scheduling.
v2: Change the stride of a MAD even if the instruction isn't
lowered. MAD instructions that are already SIMD8 have to follow the same
rules. 🤦
v3: Pull the lowering out to its own pass. Update the comment in
brw_fs_validate. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
d5d7ae22ae
brw/nir: Fix up handling of sources that might be convergent vectors
...
Sources that are scalars (almost all source) and convergent generally
want <0,1,0> source stride. Sources that are vectors (e.g., texture
coordinates, SSBO write data, etc.) and convergent want no extra strides
applied. In nearly all cases LOAD_PAYLOAD lowering will do the right
thing.
v2: Use VEC in emit_pixel_interpolater_send. Suggested by Ken.
v3: With the elimination of offset_to_component(), offset() may not
convert an is_scalar source to have a zero stride. Explicitly do this
in get_nir_src and prepare_alu_destination_and_sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
9e6bd5bf97
brw/lower: Allow uniform and scalar sources to many kinds of SEND
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
1bff4f93ca
brw: Basic infrastructure to store convergent values as scalars
...
In SIMD16 and SIMD32, storing convergent values in full 16- or
32-channel registers is wasteful. It wastes register space, and in most
cases on SIMD32, it wastes instructions. Our register allocator is not
clever enough to handle scalar allocations. It's fundamental unit of
allocation is SIMD8. Start treating convergent values as SIMD8.
Add a tracking bit in brw_reg to specify that a register represents a
convergent, scalar value. This has two implications:
1. All channels of the SIMD8 register must contain the same value. In
general, this means that writes to the register must be
force_writemask_all and exec_size = 8;
2. Reads of this register can (and should) use <0,1,0> stride. SIMD8
instructions that have restrictions on source stride can us <8,8,1>.
Values that are vectors (e.g., results of load_uniform or texture
operations) will be stored as multiple SIMD8 hardware registers.
v2: brw_fs_opt_copy_propagation_defs fix from Ken. Fix for Xe2.
v3: Eliminte offset_to_scalar(). Remove mention of vec4 backend in
brw_reg.h. Both suggested by Caio. The offset_to_scalar() change
necessitates some trickery in the fs_builder offset() function, but I
think this is an improvement overall. There is also some rework in
find_value_for_offset to account for the possibility that is_scalar
sources in LOAD_PAYLOAD might be <8;8,1> or <0;1,0>.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Ian Romanick
ef3dc401da
brw: Add devinfo parameter to fs_inst::regs_read
...
This isn't used now, but future commits will add uses. Doing this as a
separate commit removes a lot of "just typing" churn from commits that
have real changes to review.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:58 -08:00
Marek Olšák
af899c3752
radeonsi,radv: fix incorrect min_esverts for NGG subgroup calculation
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
c52025a1ce
radeonsi: disable luminance alpha formats on gfx6
...
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
9b7ea720c9
radeonsi: use nir->info instead of sel->info.base
...
sel->info is out of date after shader variant optimizations. We need to
stop using it.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
04a0800068
radeonsi: call si_init_shader_args in si_get_nir_shader
...
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
409a6fd69c
radeonsi: make si_init_shader_args static
...
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
7ddb28f447
radeonsi: remove some uses of enum pipe_shader_type
...
it's identical to gl_shader_stage
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
110b308841
radeonsi: make nir->info and si_shader_info::base identical
...
so that we can use nir->info instead of the latter.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
6a1bdf2f78
radeonsi/gfx12: tune streamout performance
...
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
10b951d752
radeonsi/gfx12: fix DrawTransformFeedback(stream != 0)
...
We only set buf_filled_size for the first target, but draws from non-zero
streams use buf_filled_size from other targets, so share the same
buf_filled_size buffer among all streamout targets because it contains
all 4 offsets.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
8440184dfd
radeonsi: make NGG streamout output primitive type known at compile time
...
This compiles an optimized shader variant for NGG streamout where the output
primitive is known at compile time. This allows putting stores for all
vertices into the same VMEM clause.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
5003465c42
radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs
...
Compile a monolithic optimized shader to do that, and clean up the comments.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
58132d6fc8
radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass
...
This uses si_shader_info to store whether gl_FragDepth can be removed,
and it uses the kill_z epilog flag to do the removal without recompilation.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
b56f47611a
radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3
...
It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1
to indicate that alpha for alpha-to-coverage should be read from mrtz.a.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
08abddd235
radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together
...
alpha-to-coverage must be applied before alpha-to-one. The only way to do
that is to export alpha for alpha-to-coverage via mrtz, and export 1 via
mrt0.a.
ACO and monolithic shader support is already in place thanks to RADV,
so we only need to change the LLVM PS epilog and the shader key.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
de996ac481
radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled
...
This adds kill_z and kill_stencil flags to the shader PS epilog key, which
removes those outputs if depth or stencil are disabled.
It must be implemented in:
* ACO PS epilog
* LLVM PS epilog
* ac_nir_lower_ps for monolithic shaders
Some of the samplemask code wasn't completely correct, but probably harmless.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
0beeb16e41
radeonsi: fix a gfx10.3 regression due to a gfx12 change
...
This fixes:
Assertion `!"BITSET_TEST_RANGE: bit range crosses word boundary"' failed.
Fixes: e3cef02c24 - radeonsi/gfx12: set DB_RENDER_OVERRIDE based on stencil state
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
4ee1b2ee24
radeonsi/ci: update failures and flakes
...
If deqp-runner detects a flake, it's not reported without -v.
Here I gathered all the flakes.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Marek Olšák
31358df708
radeonsi/ci: don't copy skips.csv to the results directory
...
It's not needed anymore. This fixes the script for llvmpipe.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:20 +00:00
Pierre-Eric Pelloux-Prayer
c0ef2aa7f8
DEPENDENCY: ac/llvm: fix sparse code handling
...
The existing code produced a incorrectly sized result from visit_tex.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713 >
2024-12-24 12:02:19 +00:00
Marek Olšák
3a7737ffb5
virgl/ci: disable virgl-traces because it doesn't upload results
...
Not being able to review results makes it impossible to update the hashes.
Suggested by Daniel Stone.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Marek Olšák
73d675451b
ci: update fail lists and trace checksums
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Marek Olšák
4932b63f36
v3d: enable uniform expression propagation from outputs to the next shader
...
This will take effect after nir_opt_varyings is enabled by another MR, and
will fix existing shader compiler crashes thanks to better optimizations.
For example, one GLSL program that failed to compile and had 226 VS
instructions and 356 FS instructions in NIR will be reduced to 2 or 3
instructions per shader.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Jesse Natalie
01e9449be2
microsoft/compiler: Update clip/cull split pass to handle clip/cull getting merged
...
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Jesse Natalie
8dd44c7e72
microsoft/compiler: Skip POS for io compaction
...
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Marek Olšák
dae57e184a
glsl,st/mesa: always lower IO for GLSL, unlower IO for drivers
...
This enables nir_opt_varyings for all gallium drivers.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Marek Olšák
1dc85a34f3
st/mesa: add a pass that unlowers IO intrinsics to variables
...
We are going to switch all gallium drivers to nir_opt_varyings and then
use this to get IO variables in the end.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942 >
2024-12-24 05:54:07 -05:00
Qiang Yu
dff14d102d
aco: fix voffset missing when buffer store base >=4096
...
Regression on test:
dEQP-GLES31.functional.geometry_shading.basic.output_256
voffset is missing if buffer store base >=4096, we need to
re-calculate offen after resolve_excess_vmem_const_offset().
Fixes: cdaf269924 ("aco: inline store_vmem_mubuf/emit_single_mubuf_store")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32767 >
2024-12-24 01:42:45 +00:00
Rohan Garg
5bddf6ceb0
iris: assert that we're not exporting a TILE64 surface
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771 >
2024-12-23 19:33:36 +00:00
Rohan Garg
308c2b9828
anv: refactor choose_isl_tiling_flags to pass fewer arguments
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771 >
2024-12-23 19:33:36 +00:00
Rohan Garg
f96b2c002d
isl: disable aux when creating uncompressed TileY/Tile64 surfaces from compressed ones
...
Fixes: 8e96b51 ('intel/isl: Assert alignments of surface addresses')
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771 >
2024-12-23 19:33:36 +00:00
Rohan Garg
abd137d079
iris: use CALLOC_STRUCT instead of calloc for readability
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771 >
2024-12-23 19:33:36 +00:00
Thomas H.P. Andersen
e38150f2fa
drirc/nvk: force_vk_vendor=-1 for Artifact Classic
...
Without this the game crashes during the loading screen.
The game uses vkUpdateDescriptorSetWithTemplate and, in certain cases,
passes VkDescriptorBufferInfo structures where the offset + range
exceeds the size of the buffer. This triggers an assertion when
vk_buffer_range() is called, causing the game to crash.
When the nvidia vendor id is used the range is consistently set to 65536.
Without it the range varies and is much smaller - never exceeding 1000.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12349
Cc: stable
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32764 >
2024-12-23 16:12:35 +00:00
Mary Guillemard
711b3351ef
asahi: Remove unneeded dependencies for asahi_clc
...
There is no requirement on LLVM or SPIR-V tools since the introduction
of mesa-clc.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32719 >
2024-12-23 15:09:41 +00:00