Samuel Pitoiset
230affd52b
radv/meta: use BDA for query resolves
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475 >
2025-02-11 15:12:33 +00:00
Rhys Perry
ecd122ddb8
radv/rt: correctly preserve metadata in move_rt_instructions
...
This should invalidate nir_metadata_live_defs.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33354 >
2025-02-10 15:01:37 +00:00
Patrick Nicolas
9ef01a0f98
radv/video: Add low latency encoding
...
When VkVideoEncodeUsageInfoKHR has a tuningMode of
VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR or
VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR, request low latency
mode for the encoder.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11958
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32862 >
2025-02-09 21:57:33 +00:00
Georg Lehmann
fd77cc7c32
ac/nir/lower_ps: move exports after packing alu
...
If ACO's wqm section ends just before the first export, this mixing alu and
exports means the alu in question can't be reordered as much by the ILP
scheduler.
Foz-DB Navi31:
Totals from 8959 (11.31% of 79188) affected shaders:
Instrs: 5977212 -> 5978494 (+0.02%); split: -0.02%, +0.04%
CodeSize: 32982732 -> 32987876 (+0.02%); split: -0.01%, +0.03%
Latency: 35218073 -> 35216277 (-0.01%); split: -0.02%, +0.02%
InvThroughput: 5149751 -> 5149696 (-0.00%); split: -0.00%, +0.00%
SClause: 220552 -> 220551 (-0.00%); split: -0.01%, +0.01%
PreVGPRs: 313203 -> 313069 (-0.04%); split: -0.06%, +0.01%
Foz-DB Navi21:
Totals from 8895 (11.21% of 79377) affected shaders:
MaxWaves: 219280 -> 219272 (-0.00%); split: +0.00%, -0.01%
Instrs: 5393330 -> 5393366 (+0.00%); split: -0.00%, +0.00%
CodeSize: 29921900 -> 29922024 (+0.00%); split: -0.00%, +0.00%
VGPRs: 406664 -> 406688 (+0.01%); split: -0.00%, +0.01%
Latency: 35653975 -> 35652220 (-0.00%); split: -0.02%, +0.02%
InvThroughput: 7992134 -> 7992032 (-0.00%); split: -0.00%, +0.00%
SClause: 223784 -> 223786 (+0.00%)
Copies: 370984 -> 370983 (-0.00%)
PreVGPRs: 314323 -> 314330 (+0.00%); split: -0.01%, +0.01%
VALU: 3800023 -> 3800022 (-0.00%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33417 >
2025-02-08 17:31:18 +00:00
Martin Roukala (né Peres)
8aa22e834a
radv/ci: document more Tahiti VKCTS flakes
...
Now that we have a more powerful host, we started getting new flakes.
Let's document them!
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446 >
2025-02-08 13:22:13 +02:00
Rhys Perry
c3d27906d8
radv: vectorize lowered shader IO
...
fossil-db (navi31):
Totals from 2329 (2.93% of 79377) affected shaders:
MaxWaves: 72152 -> 72102 (-0.07%)
Instrs: 1048791 -> 1041920 (-0.66%); split: -0.72%, +0.07%
CodeSize: 5331832 -> 5285572 (-0.87%); split: -0.90%, +0.03%
VGPRs: 113844 -> 113820 (-0.02%); split: -0.14%, +0.12%
Latency: 4349524 -> 4346374 (-0.07%); split: -0.35%, +0.28%
InvThroughput: 609449 -> 609235 (-0.04%); split: -0.27%, +0.24%
VClause: 22613 -> 22451 (-0.72%); split: -1.03%, +0.31%
SClause: 21197 -> 21177 (-0.09%); split: -0.45%, +0.35%
Copies: 81900 -> 82446 (+0.67%); split: -1.51%, +2.18%
PreSGPRs: 94697 -> 93596 (-1.16%); split: -1.23%, +0.07%
PreVGPRs: 69962 -> 70080 (+0.17%); split: -0.01%, +0.18%
VALU: 625247 -> 625390 (+0.02%); split: -0.23%, +0.25%
SALU: 101692 -> 101555 (-0.13%); split: -0.24%, +0.11%
VMEM: 46459 -> 44845 (-3.47%)
fossil-db (navi21):
Totals from 17522 (22.07% of 79377) affected shaders:
MaxWaves: 425698 -> 425460 (-0.06%); split: +0.00%, -0.06%
Instrs: 11444215 -> 11428321 (-0.14%); split: -0.14%, +0.00%
CodeSize: 59227492 -> 59019376 (-0.35%); split: -0.35%, +0.00%
VGPRs: 780920 -> 781208 (+0.04%); split: -0.00%, +0.04%
Latency: 44965072 -> 44926529 (-0.09%); split: -0.12%, +0.03%
InvThroughput: 9718148 -> 9728793 (+0.11%); split: -0.01%, +0.12%
VClause: 225732 -> 225605 (-0.06%); split: -0.10%, +0.04%
SClause: 217196 -> 217160 (-0.02%); split: -0.03%, +0.01%
Copies: 1050351 -> 1065263 (+1.42%); split: -0.03%, +1.45%
PreSGPRs: 747538 -> 747223 (-0.04%); split: -0.05%, +0.01%
PreVGPRs: 626702 -> 626748 (+0.01%); split: -0.00%, +0.01%
VALU: 6629403 -> 6643822 (+0.22%); split: -0.01%, +0.23%
SALU: 1898492 -> 1898452 (-0.00%); split: -0.00%, +0.00%
VMEM: 529942 -> 528361 (-0.30%)
fossil-db (vega10):
Totals from 1791 (2.84% of 62962) affected shaders:
MaxWaves: 12270 -> 12253 (-0.14%); split: +0.01%, -0.15%
Instrs: 602026 -> 597473 (-0.76%); split: -0.83%, +0.08%
CodeSize: 3109872 -> 3071664 (-1.23%); split: -1.26%, +0.03%
SGPRs: 137826 -> 137938 (+0.08%); split: -0.10%, +0.19%
VGPRs: 70364 -> 70520 (+0.22%); split: -0.03%, +0.26%
Latency: 4757850 -> 4781905 (+0.51%); split: -0.35%, +0.86%
InvThroughput: 2296941 -> 2310685 (+0.60%); split: -0.14%, +0.74%
VClause: 14161 -> 14050 (-0.78%); split: -1.23%, +0.44%
SClause: 14058 -> 14077 (+0.14%); split: -0.57%, +0.70%
Copies: 40954 -> 42191 (+3.02%); split: -1.69%, +4.71%
PreSGPRs: 64314 -> 63214 (-1.71%); split: -1.81%, +0.10%
PreVGPRs: 53558 -> 53894 (+0.63%); split: -0.01%, +0.64%
VALU: 449920 -> 450830 (+0.20%); split: -0.19%, +0.39%
SALU: 32973 -> 32839 (-0.41%); split: -0.76%, +0.35%
VMEM: 28796 -> 25151 (-12.66%)
fossil-db (polaris10):
Totals from 1769 (2.86% of 61794) affected shaders:
MaxWaves: 12024 -> 12021 (-0.02%)
Instrs: 474761 -> 470760 (-0.84%); split: -0.94%, +0.10%
CodeSize: 2447964 -> 2420712 (-1.11%); split: -1.15%, +0.04%
SGPRs: 129664 -> 129728 (+0.05%); split: -0.14%, +0.19%
VGPRs: 65216 -> 65560 (+0.53%); split: -0.05%, +0.58%
Latency: 4304734 -> 4318319 (+0.32%); split: -0.41%, +0.72%
InvThroughput: 2114950 -> 2122580 (+0.36%); split: -0.18%, +0.54%
VClause: 10933 -> 10808 (-1.14%); split: -1.42%, +0.27%
SClause: 11430 -> 11446 (+0.14%); split: -0.70%, +0.84%
Copies: 32290 -> 31891 (-1.24%); split: -2.80%, +1.56%
PreSGPRs: 58184 -> 57096 (-1.87%); split: -1.98%, +0.11%
PreVGPRs: 48757 -> 48874 (+0.24%); split: -0.02%, +0.26%
VALU: 359097 -> 358582 (-0.14%); split: -0.25%, +0.11%
SALU: 26279 -> 25934 (-1.31%); split: -1.75%, +0.43%
VMEM: 18825 -> 17247 (-8.38%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
953faac23e
radv: vectorize descriptor loads
...
fossil-db (navi31):
Totals from 49237 (62.03% of 79377) affected shaders:
MaxWaves: 1497901 -> 1497851 (-0.00%); split: +0.00%, -0.00%
Instrs: 25766029 -> 25595468 (-0.66%); split: -0.68%, +0.02%
CodeSize: 133811412 -> 132616356 (-0.89%); split: -0.90%, +0.01%
VGPRs: 2318068 -> 2318200 (+0.01%); split: -0.00%, +0.01%
SpillSGPRs: 4512 -> 4507 (-0.11%); split: -0.64%, +0.53%
Latency: 164086813 -> 163869930 (-0.13%); split: -0.22%, +0.09%
InvThroughput: 24811220 -> 24802709 (-0.03%); split: -0.05%, +0.02%
VClause: 553717 -> 557194 (+0.63%); split: -0.30%, +0.93%
SClause: 723038 -> 710431 (-1.74%); split: -2.69%, +0.95%
Copies: 1709226 -> 1711030 (+0.11%); split: -0.48%, +0.59%
Branches: 465169 -> 465164 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1775360 -> 1961282 (+10.47%); split: -0.01%, +10.48%
VALU: 15418039 -> 15417896 (-0.00%); split: -0.02%, +0.02%
SALU: 2424519 -> 2416263 (-0.34%); split: -0.61%, +0.26%
SMEM: 1245273 -> 1121006 (-9.98%)
VOPD: 3882 -> 3885 (+0.08%); split: +0.18%, -0.10%
fossil-db (navi21):
Totals from 48539 (61.15% of 79377) affected shaders:
MaxWaves: 1262958 -> 1262912 (-0.00%); split: +0.00%, -0.01%
Instrs: 30334013 -> 30154279 (-0.59%); split: -0.60%, +0.01%
CodeSize: 161298192 -> 160027616 (-0.79%); split: -0.80%, +0.01%
VGPRs: 1979248 -> 1979192 (-0.00%); split: -0.01%, +0.01%
SpillSGPRs: 3751 -> 3776 (+0.67%); split: -0.75%, +1.41%
Latency: 185665578 -> 185429672 (-0.13%); split: -0.23%, +0.10%
InvThroughput: 41413438 -> 41406558 (-0.02%); split: -0.03%, +0.02%
VClause: 624116 -> 626703 (+0.41%); split: -0.30%, +0.71%
SClause: 775094 -> 764569 (-1.36%); split: -2.73%, +1.38%
Copies: 2437041 -> 2441758 (+0.19%); split: -0.23%, +0.42%
Branches: 770540 -> 770552 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1919117 -> 2021093 (+5.31%); split: -0.01%, +5.32%
VALU: 18926346 -> 18926269 (-0.00%); split: -0.01%, +0.01%
SALU: 4316722 -> 4310066 (-0.15%); split: -0.33%, +0.17%
SMEM: 1350230 -> 1216865 (-9.88%)
fossil-db (vega10):
Totals from 41793 (66.38% of 62962) affected shaders:
MaxWaves: 306797 -> 306685 (-0.04%); split: +0.02%, -0.06%
Instrs: 16251398 -> 16140153 (-0.68%); split: -0.71%, +0.02%
CodeSize: 83407848 -> 82543596 (-1.04%); split: -1.05%, +0.01%
SGPRs: 2787936 -> 2854864 (+2.40%); split: -0.73%, +3.13%
VGPRs: 1585644 -> 1586156 (+0.03%); split: -0.01%, +0.05%
SpillSGPRs: 3856 -> 3843 (-0.34%); split: -1.50%, +1.17%
SpillVGPRs: 560 -> 562 (+0.36%)
Latency: 167478607 -> 166829429 (-0.39%); split: -0.50%, +0.12%
InvThroughput: 76378642 -> 76353650 (-0.03%); split: -0.06%, +0.03%
VClause: 361639 -> 362694 (+0.29%); split: -0.31%, +0.60%
SClause: 546919 -> 535879 (-2.02%); split: -2.98%, +0.96%
Copies: 1388817 -> 1396020 (+0.52%); split: -0.37%, +0.89%
Branches: 227697 -> 227705 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1384316 -> 1532654 (+10.72%); split: -0.01%, +10.73%
VALU: 11896315 -> 11896547 (+0.00%); split: -0.01%, +0.01%
SALU: 1371452 -> 1370143 (-0.10%); split: -0.83%, +0.73%
VMEM: 628506 -> 628510 (+0.00%)
SMEM: 984495 -> 882129 (-10.40%)
fossil-db (polaris10):
Totals from 41057 (66.44% of 61794) affected shaders:
MaxWaves: 270307 -> 270311 (+0.00%); split: +0.02%, -0.01%
Instrs: 16082187 -> 15972163 (-0.68%); split: -0.71%, +0.02%
CodeSize: 82199592 -> 81341176 (-1.04%); split: -1.05%, +0.01%
SGPRs: 2894960 -> 2970720 (+2.62%); split: -0.67%, +3.29%
VGPRs: 1620132 -> 1620352 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 3885 -> 3868 (-0.44%); split: -1.47%, +1.03%
SpillVGPRs: 617 -> 619 (+0.32%)
Latency: 166722696 -> 166066137 (-0.39%); split: -0.52%, +0.13%
InvThroughput: 76887856 -> 76862349 (-0.03%); split: -0.08%, +0.04%
VClause: 353499 -> 354709 (+0.34%); split: -0.28%, +0.62%
SClause: 544073 -> 533053 (-2.03%); split: -2.97%, +0.95%
Copies: 1398025 -> 1405848 (+0.56%); split: -0.30%, +0.86%
Branches: 224038 -> 224041 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1362781 -> 1509495 (+10.77%); split: -0.01%, +10.77%
VALU: 11771997 -> 11772271 (+0.00%); split: -0.01%, +0.01%
SALU: 1416410 -> 1415708 (-0.05%); split: -0.72%, +0.68%
VMEM: 616867 -> 616871 (+0.00%)
SMEM: 970539 -> 869729 (-10.39%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
5fe0012670
radv: DCE before nir_opt_shrink_vectors
...
fossil-db (navi31):
Totals from 941 (1.19% of 79377) affected shaders:
MaxWaves: 28422 -> 28502 (+0.28%)
Instrs: 645374 -> 642031 (-0.52%); split: -0.55%, +0.03%
CodeSize: 3264460 -> 3244488 (-0.61%); split: -0.63%, +0.02%
VGPRs: 48392 -> 48044 (-0.72%)
SpillVGPRs: 7 -> 0 (-inf%)
Scratch: 1792 -> 0 (-inf%)
Latency: 2596896 -> 2536952 (-2.31%); split: -2.33%, +0.02%
InvThroughput: 528726 -> 500139 (-5.41%); split: -5.42%, +0.01%
VClause: 14566 -> 14539 (-0.19%)
Copies: 53022 -> 51296 (-3.26%); split: -3.37%, +0.12%
PreSGPRs: 29369 -> 29367 (-0.01%)
PreVGPRs: 29710 -> 29694 (-0.05%)
VALU: 366134 -> 364245 (-0.52%); split: -0.53%, +0.02%
SALU: 74017 -> 73891 (-0.17%)
VMEM: 25240 -> 25208 (-0.13%)
fossil-db (navi21):
Totals from 941 (1.19% of 79377) affected shaders:
MaxWaves: 22018 -> 22058 (+0.18%); split: +0.28%, -0.10%
Instrs: 521053 -> 518898 (-0.41%); split: -0.43%, +0.02%
CodeSize: 2750628 -> 2734996 (-0.57%); split: -0.58%, +0.01%
VGPRs: 41152 -> 41024 (-0.31%); split: -0.41%, +0.10%
SpillVGPRs: 5 -> 0 (-inf%)
Scratch: 2048 -> 0 (-inf%)
Latency: 2655941 -> 2607022 (-1.84%); split: -1.86%, +0.02%
InvThroughput: 711733 -> 690032 (-3.05%); split: -3.07%, +0.02%
VClause: 16388 -> 16363 (-0.15%)
Copies: 35152 -> 33485 (-4.74%); split: -4.98%, +0.24%
PreSGPRs: 28486 -> 28484 (-0.01%)
PreVGPRs: 30317 -> 30301 (-0.05%)
VALU: 348423 -> 346614 (-0.52%); split: -0.54%, +0.02%
SALU: 44020 -> 43869 (-0.34%)
VMEM: 25216 -> 25195 (-0.08%)
fossil-db (vega10):
Totals from 416 (0.66% of 62962) affected shaders:
MaxWaves: 2687 -> 2696 (+0.33%); split: +0.37%, -0.04%
Instrs: 245634 -> 243501 (-0.87%); split: -0.88%, +0.01%
CodeSize: 1312836 -> 1297248 (-1.19%); split: -1.19%, +0.01%
VGPRs: 17684 -> 17692 (+0.05%); split: -0.43%, +0.48%
SpillVGPRs: 5 -> 0 (-inf%)
Scratch: 2048 -> 0 (-inf%)
Latency: 1928393 -> 1881346 (-2.44%); split: -2.44%, +0.00%
InvThroughput: 1163915 -> 1117096 (-4.02%); split: -4.03%, +0.00%
VClause: 7070 -> 7053 (-0.24%)
Copies: 22577 -> 20834 (-7.72%); split: -7.78%, +0.06%
Branches: 4328 -> 4320 (-0.18%)
PreSGPRs: 13993 -> 13991 (-0.01%)
PreVGPRs: 13452 -> 13436 (-0.12%)
VALU: 165253 -> 163366 (-1.14%); split: -1.15%, +0.01%
SALU: 26258 -> 26111 (-0.56%)
VMEM: 11736 -> 11715 (-0.18%)
fossil-db (polaris10):
Totals from 355 (0.57% of 61794) affected shaders:
Instrs: 108639 -> 108682 (+0.04%); split: -0.03%, +0.07%
CodeSize: 583804 -> 583936 (+0.02%); split: -0.03%, +0.06%
SGPRs: 17712 -> 17728 (+0.09%)
Latency: 735332 -> 734777 (-0.08%); split: -0.08%, +0.01%
InvThroughput: 443975 -> 444045 (+0.02%); split: -0.03%, +0.04%
VClause: 2552 -> 2558 (+0.24%)
SClause: 2394 -> 2393 (-0.04%)
Copies: 11433 -> 11464 (+0.27%); split: -0.15%, +0.42%
PreVGPRs: 7365 -> 7364 (-0.01%)
VALU: 69385 -> 69416 (+0.04%); split: -0.02%, +0.07%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
d04e1ea02d
radv: move nir_opt_shrink_vectors later
...
This seems to be helpful with shaders which use NGG culling.
fossil-db (navi21):
Totals from 3529 (4.45% of 79377) affected shaders:
MaxWaves: 81490 -> 82066 (+0.71%)
Instrs: 2868872 -> 2863476 (-0.19%); split: -0.22%, +0.04%
CodeSize: 14949540 -> 14927580 (-0.15%); split: -0.18%, +0.03%
VGPRs: 165440 -> 164144 (-0.78%)
SpillSGPRs: 578 -> 405 (-29.93%)
Latency: 15388119 -> 15151882 (-1.54%); split: -1.74%, +0.20%
InvThroughput: 2935873 -> 2929736 (-0.21%); split: -0.25%, +0.04%
VClause: 70192 -> 68904 (-1.83%); split: -2.17%, +0.33%
SClause: 67678 -> 67679 (+0.00%); split: -0.10%, +0.10%
Copies: 265824 -> 261458 (-1.64%); split: -1.96%, +0.32%
Branches: 75084 -> 75088 (+0.01%); split: -0.02%, +0.02%
PreSGPRs: 165962 -> 165716 (-0.15%)
PreVGPRs: 135122 -> 134724 (-0.29%)
VALU: 1681747 -> 1677134 (-0.27%); split: -0.32%, +0.05%
SALU: 436003 -> 435915 (-0.02%); split: -0.03%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
f034aa9cd3
radv: don't use bit_sizes_int to skip nir_lower_bit_size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
19394f44df
ac/nir: set memory_modes for lowered TES input loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
1fca72ddc8
ac/nir/ngg: update bit_sizes_int
...
This is used for RADV's bit size lowering.
fossil-db (navi21):
Totals from 4520 (5.69% of 79377) affected shaders:
(no stat changes)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
539f9b4ba6
nir,aco,radv: add align_mul/offset to buffer_amd intrinsics
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
David Rosca
62b0f84981
ac/vcn_dec: Fix AV1 film grain on VCN5
...
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33376 >
2025-02-07 13:13:45 +00:00
Samuel Pitoiset
76dcac9d47
radv: advertise VK_KHR_cooperative_matrix on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378 >
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
b05a112d92
radv/nir: add cooperative matrix lowering for GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378 >
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
ad611adeb7
radv/nir: add a struct for parameters to cooperative matrix lowering
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378 >
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
dbb7e3cf88
radv: do not keep track of the streamout binding buffer
...
More like BDA style. For future work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404 >
2025-02-07 10:53:37 +01:00
Samuel Pitoiset
03cacc1406
radv: rework passing draw info via radv_draw_info
...
More like BDA style. For future work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404 >
2025-02-07 10:53:37 +01:00
Samuel Pitoiset
6f34be88d9
radv: rework passing dispatch info via radv_dispatch_info
...
More like BDA style. For future work.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404 >
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
b5740d5819
radv: use radv_indirect_dispatch() more
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404 >
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
ef7e28e7a8
radv: remove redundant drawCount == 0 for indirect mesh/task draws
...
This is already handled in radv_before_taskmesh_draw().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404 >
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
8625decbcc
radv: fix fetching draw vertex data from counter buffers with transform feedback
...
counterOffset was just ignored and nobody noticed (missing VKCTS
coverage).
VGT_STRMOUT_DRAW_OPAQUE_OFFSET will do the computation in hw for us.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33407 >
2025-02-07 07:59:39 +00:00
Hans-Kristian Arntzen
1fcb494054
radv: Repurpose radv_legacy_sparse_binding drirc
...
Rename the drirc and call it radv_disable_dedicated_sparse_queue instead,
since normal queues support sparse now anyway.
Keep the workaround for existing known games, since they might not
expect a separate SPARSE queue to pop up.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33166 >
2025-02-06 14:07:20 +00:00
Hans-Kristian Arntzen
f58630f07c
radv: Always allow sparse on normal GFX/COMPUTE/DMA queues.
...
Forcing a dedicated sparse queue is problematic in real-world scenarios.
In the current implicit sync world for sparse updates, we can rely on
submission order.
For use cases where an application can take advantage of the separate
sparse queue to do "async" updates, the existing implementation works
well, but problems arise when trying to implement D3D-style submission
ordering. E.g., when a game does sparse on a graphics or compute queue,
we need to guarantee that previous submissions, sparse update and future
submissions are properly ordered.
The Vulkan way of implementing this is to:
- Signal graphics queue to timeline N (i.e. last submission made)
- Wait on timeline N on the sparse queue
- Do sparse updates
- Signal timeline N + 1 on sparse queue
- Wait for timeline N + 1 on graphics queue (can be deferred until next
graphics submit)
This causes an unavoidable bubble in GPU execution, since the
existing sparse queue ends up doing:
- Wait pending signal. The implication here is that all previous GPU
work must have been submitted.
- Do VM operations on CPU timeline
- Wait for semaphores to signal (this is required for signal ordering)
- ... GPU is meanwhile stalling in a bubble due to GPU -> CPU -> GPU roundtrip.
- Signal semaphore on CPU (unblocks GPU work)
Letting the GPU go idle here is not great, and we can be screwed over by bad thread scheduling.
Another knock-on effect is that the graphics queue is now forced into
using a thread for submissions. This is because when the graphics queue
wants to wait for timeline N + 1, the sparse queue may not have
signalled the timeline yet on CPU, so effectively, we have created a
wait-before-signal situation internally in RADV. Throwing another thread
under the bus is not great either.
Just letting the queue in question support sparse binding solves all
these issues and I don't see a path forward where the D3D use case can
be solved in a separate queue world.
It is also friendlier to the ecosystem at large. RADV is the only driver
I know of that insists on separate sparse queues and multiple games
assume that graphics queue can support sparse.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33166 >
2025-02-06 14:07:20 +00:00
Valentine Burley
89994ec65a
amd/ci: Fix fraction for radv-stoney-angle-full
...
The radv-stoney-angle-full was unintentionally inheriting the fraction
from the pre-merge job.
Also use the correct manual rules definition while we're here, and use
consistent naming for the restricted rules.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Eric Engestrom <None>
Reviewed-by: Antonio Ospite <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33377 >
2025-02-06 11:58:33 +00:00
Samuel Pitoiset
9b827556f5
radv: fix adding the BO to cmdbuf list when starting conditional rendering
...
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33403 >
2025-02-06 07:13:29 +00:00
Martin Roukala (né Peres)
562bc5697f
radv/ci: update expectations
...
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33401 >
2025-02-06 03:31:02 +00:00
Martin Roukala (né Peres)
b432f03c8a
radv/ci: bump tahiti's cpu cores
...
You may thank @Venemo for his generous donation to our CI :)
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33401 >
2025-02-06 03:31:02 +00:00
Mike Blumenkrantz
30b616244c
radv: print stringname for VkExternalMemoryHandleTypeFlagBits error
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323 >
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
20013a1774
radv: stop blocking non-2D import/export ops
...
these work fine
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323 >
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
ca8a740e3b
radv: fix error reporting for VkExternalMemoryTypeFlagBitsKHR
...
wrong type name is confusing
cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323 >
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
602f19bad8
ac/surface: always allow LINEAR modifier for color formats
...
this is always supported
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323 >
2025-02-06 01:48:25 +00:00
Samuel Pitoiset
4fc856af98
radv: fix caching on-demand meta shaders
...
This switches to disk_cache instead of our own mechanism which only
stored meta shaders when the logical was destroyed.
Meta shaders are still stored separately from the application shaders
because they are common to all applications on a given GPU/Mesa version.
The default cache is 32MiB which should be large enough.
This fixes massive stuttering in FF7 Rebirth but all apps are
technically affected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33370 >
2025-02-05 16:30:27 +00:00
Georg Lehmann
ff225dee67
radv: inline radv_nir_lower_poly_line_smooth
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33340 >
2025-02-05 11:23:35 +00:00
Georg Lehmann
b588b56078
radv: remove radv_should_lower_poly_line_smooth
...
I think this was broken as there might be a store_output with
less than 4 components to a location that shouldn't be smoothed
anyway (i.e. not the first one).
nir_lower_poly_line_smooth now handles the case where the first location
doesn't have 4 components.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33340 >
2025-02-05 11:23:35 +00:00
Daniel Schürmann
1a8a643bbd
aco/isel: track control flow divergence in loops more accurately
...
We introduce two new variables, cf_context::in_divergent_cf and
cf_context::parent_loop.has_divergent_break, in order to determine
whether there is any other invocations on a different CF path.
Totals from 1305 (1.64% of 79395) affected shaders: (Navi31)
Instrs: 659211 -> 657815 (-0.21%); split: -0.22%, +0.01%
CodeSize: 3483228 -> 3477960 (-0.15%); split: -0.16%, +0.01%
VGPRs: 68820 -> 48048 (-30.18%)
Latency: 14197750 -> 14170767 (-0.19%); split: -0.26%, +0.07%
InvThroughput: 1619103 -> 1619826 (+0.04%); split: -0.02%, +0.07%
VClause: 12384 -> 12350 (-0.27%)
SClause: 26693 -> 26844 (+0.57%); split: -0.01%, +0.57%
Copies: 44994 -> 43535 (-3.24%); split: -3.26%, +0.02%
PreSGPRs: 49007 -> 48907 (-0.20%)
PreVGPRs: 32171 -> 32121 (-0.16%)
VALU: 349984 -> 349857 (-0.04%); split: -0.04%, +0.00%
SALU: 84252 -> 83988 (-0.31%); split: -0.32%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
583c3586fe
aco/isel: remove loop nest information from exec_info
...
Since we never enter loops with an empty exec mask, and the
control flow is structured, we don't need to consider the
loop nest depth.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
a77258346c
aco/isel: fix assumptions about potential empty exec mask in nested control flow
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
44216e035f
aco/isel: add and use exec_info::empty() helper
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
8e8398832c
aco/isel: use cf_context in loop_context to restore cf information
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
8b9c9fb904
aco/isel: use cf_context in if_context to restore cf information
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
c2bfc05d71
aco/isel: rename cf_context::has_divergent_branch
...
Make it more consistent with cf_context::has_branch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
0c5a91b9f2
aco/isel: move cf_info into separate struct cf_context
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Daniel Schürmann
61fa007e48
aco/isel: fix empty exec tracking for uniform branches
...
Totals from 5 (0.01% of 79395) affected shaders: (Navi31)
Instrs: 54730 -> 54715 (-0.03%)
CodeSize: 276928 -> 276852 (-0.03%)
Latency: 215212 -> 214874 (-0.16%)
InvThroughput: 40154 -> 40150 (-0.01%)
Copies: 6824 -> 6821 (-0.04%); split: -0.06%, +0.01%
Branches: 1625 -> 1615 (-0.62%)
SALU: 5682 -> 5678 (-0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206 >
2025-02-05 10:54:21 +00:00
Qiang Yu
da023a5a19
ac/surface: fix radv import dmabuf from radeonsi
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33391 >
2025-02-05 09:26:22 +00:00
Samuel Pitoiset
f095aaf819
radv/meta: stop using string keys also for DGC and query objects
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33379 >
2025-02-05 08:25:00 +00:00
Dave Airlie
44b88c1034
radv/video: add h264 b frame encoding support.
...
This is supported on VCN 3 and newer.
Acked-by: David Rosca <nowrep@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31104 >
2025-02-05 04:11:38 +00:00
Dave Airlie
717c85d08a
radv/video: calculate colloc buffer size for h264 B frames.
...
This adds the overheads for the colloc buffer needed when
B frames are enabled.
Acked-by: David Rosca <nowrep@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31104 >
2025-02-05 04:11:38 +00:00
Dave Airlie
19b27c77bd
radv/video: move encoder to using a buffer instead of an image
...
For the encoder DPB just allocate a buffer of storage, this should
align memory usage more with what radeonsi does.
Acked-by: David Rosca <nowrep@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31104 >
2025-02-05 04:11:38 +00:00