16882 commits

Author SHA1 Message Date
Samuel Pitoiset
8df1ffaa78 radv: use radv_buffer::addr more
And remove radv_buffer::offset.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
d92153e998 radv: compute radv_buffer::addr at bind time
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
e7e43f1437 radv: rename radv_buffer::bo_va to addr
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
f70af40c5d radv: pass addr to radv_copy_buffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
228903aeaf radv/rmv: pass addr to log_resource_bind_locked()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
1d58343b43 radv/video: pass addr to send_cmd()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:34 +00:00
Samuel Pitoiset
4987926e61 radv: remove unused device memory init/finish helpers
Also zero-allocate the vulkan object.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:33 +00:00
Samuel Pitoiset
06ac711b06 radv/meta: simplify creating buffers for R32G32B32 operations
Not necessary to allocate things.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:33 +00:00
Samuel Pitoiset
1130478e5d radv/meta: compute the destination addr earlier for query resolves
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:33 +00:00
Samuel Pitoiset
230affd52b radv/meta: use BDA for query resolves
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33475>
2025-02-11 15:12:33 +00:00
Rhys Perry
ecd122ddb8 radv/rt: correctly preserve metadata in move_rt_instructions
This should invalidate nir_metadata_live_defs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33354>
2025-02-10 15:01:37 +00:00
Patrick Nicolas
9ef01a0f98 radv/video: Add low latency encoding
When VkVideoEncodeUsageInfoKHR has a tuningMode of
VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR or
VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR, request low latency
mode for the encoder.
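
As a minimal sketch of the selection described above (not taken from the RADV source): the enum member names follow VkVideoEncodeTuningModeKHR, while `TuningMode` and `wants_low_latency` are hypothetical names and the numbering is arbitrary.

```python
from enum import Enum, auto


class TuningMode(Enum):
    # Names follow VkVideoEncodeTuningModeKHR; the numbering here is arbitrary.
    DEFAULT = auto()
    HIGH_QUALITY = auto()
    LOW_LATENCY = auto()
    ULTRA_LOW_LATENCY = auto()
    LOSSLESS = auto()


def wants_low_latency(mode: TuningMode) -> bool:
    """Return True for the two tuning modes named in the commit message."""
    return mode in (TuningMode.LOW_LATENCY, TuningMode.ULTRA_LOW_LATENCY)
```

Every other tuning mode leaves the encoder in its regular mode in this sketch.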

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11958
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32862>
2025-02-09 21:57:33 +00:00
Georg Lehmann
fd77cc7c32 ac/nir/lower_ps: move exports after packing alu
If ACO's WQM section ends just before the first export, mixing ALU and
exports means the ALU in question can't be reordered as much by the ILP
scheduler.

Foz-DB Navi31:
Totals from 8959 (11.31% of 79188) affected shaders:
Instrs: 5977212 -> 5978494 (+0.02%); split: -0.02%, +0.04%
CodeSize: 32982732 -> 32987876 (+0.02%); split: -0.01%, +0.03%
Latency: 35218073 -> 35216277 (-0.01%); split: -0.02%, +0.02%
InvThroughput: 5149751 -> 5149696 (-0.00%); split: -0.00%, +0.00%
SClause: 220552 -> 220551 (-0.00%); split: -0.01%, +0.01%
PreVGPRs: 313203 -> 313069 (-0.04%); split: -0.06%, +0.01%
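
The percentage in each stat line appears to be the relative change between the two totals. A small sketch that recomputes it from a line of this shape follows; the regular expression and the name `percent_change` are assumptions made for illustration and are not part of the tool that produced these reports.

```python
import re

# Matches the leading "Name: before -> after" part of a stat line.
STAT_RE = re.compile(r"^(\w+): (\d+) -> (\d+)")


def percent_change(line: str) -> float:
    """Recompute the relative change, in percent, from a stat line."""
    match = STAT_RE.match(line)
    if match is None:
        raise ValueError(f"not a stat line: {line!r}")
    before, after = int(match.group(2)), int(match.group(3))
    if before == 0:
        raise ZeroDivisionError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


line = "Instrs: 5977212 -> 5978494 (+0.02%); split: -0.02%, +0.04%"
print(f"{percent_change(line):+.2f}%")  # prints +0.02%
```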

Foz-DB Navi21:
Totals from 8895 (11.21% of 79377) affected shaders:
MaxWaves: 219280 -> 219272 (-0.00%); split: +0.00%, -0.01%
Instrs: 5393330 -> 5393366 (+0.00%); split: -0.00%, +0.00%
CodeSize: 29921900 -> 29922024 (+0.00%); split: -0.00%, +0.00%
VGPRs: 406664 -> 406688 (+0.01%); split: -0.00%, +0.01%
Latency: 35653975 -> 35652220 (-0.00%); split: -0.02%, +0.02%
InvThroughput: 7992134 -> 7992032 (-0.00%); split: -0.00%, +0.00%
SClause: 223784 -> 223786 (+0.00%)
Copies: 370984 -> 370983 (-0.00%)
PreVGPRs: 314323 -> 314330 (+0.00%); split: -0.01%, +0.01%
VALU: 3800023 -> 3800022 (-0.00%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33417>
2025-02-08 17:31:18 +00:00
Martin Roukala (né Peres)
8aa22e834a radv/ci: document more Tahiti VKCTS flakes
Now that we have a more powerful host, we started getting new flakes.
Let's document them!

Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446>
2025-02-08 13:22:13 +02:00
Rhys Perry
c3d27906d8 radv: vectorize lowered shader IO
fossil-db (navi31):
Totals from 2329 (2.93% of 79377) affected shaders:
MaxWaves: 72152 -> 72102 (-0.07%)
Instrs: 1048791 -> 1041920 (-0.66%); split: -0.72%, +0.07%
CodeSize: 5331832 -> 5285572 (-0.87%); split: -0.90%, +0.03%
VGPRs: 113844 -> 113820 (-0.02%); split: -0.14%, +0.12%
Latency: 4349524 -> 4346374 (-0.07%); split: -0.35%, +0.28%
InvThroughput: 609449 -> 609235 (-0.04%); split: -0.27%, +0.24%
VClause: 22613 -> 22451 (-0.72%); split: -1.03%, +0.31%
SClause: 21197 -> 21177 (-0.09%); split: -0.45%, +0.35%
Copies: 81900 -> 82446 (+0.67%); split: -1.51%, +2.18%
PreSGPRs: 94697 -> 93596 (-1.16%); split: -1.23%, +0.07%
PreVGPRs: 69962 -> 70080 (+0.17%); split: -0.01%, +0.18%
VALU: 625247 -> 625390 (+0.02%); split: -0.23%, +0.25%
SALU: 101692 -> 101555 (-0.13%); split: -0.24%, +0.11%
VMEM: 46459 -> 44845 (-3.47%)

fossil-db (navi21):
Totals from 17522 (22.07% of 79377) affected shaders:
MaxWaves: 425698 -> 425460 (-0.06%); split: +0.00%, -0.06%
Instrs: 11444215 -> 11428321 (-0.14%); split: -0.14%, +0.00%
CodeSize: 59227492 -> 59019376 (-0.35%); split: -0.35%, +0.00%
VGPRs: 780920 -> 781208 (+0.04%); split: -0.00%, +0.04%
Latency: 44965072 -> 44926529 (-0.09%); split: -0.12%, +0.03%
InvThroughput: 9718148 -> 9728793 (+0.11%); split: -0.01%, +0.12%
VClause: 225732 -> 225605 (-0.06%); split: -0.10%, +0.04%
SClause: 217196 -> 217160 (-0.02%); split: -0.03%, +0.01%
Copies: 1050351 -> 1065263 (+1.42%); split: -0.03%, +1.45%
PreSGPRs: 747538 -> 747223 (-0.04%); split: -0.05%, +0.01%
PreVGPRs: 626702 -> 626748 (+0.01%); split: -0.00%, +0.01%
VALU: 6629403 -> 6643822 (+0.22%); split: -0.01%, +0.23%
SALU: 1898492 -> 1898452 (-0.00%); split: -0.00%, +0.00%
VMEM: 529942 -> 528361 (-0.30%)

fossil-db (vega10):
Totals from 1791 (2.84% of 62962) affected shaders:
MaxWaves: 12270 -> 12253 (-0.14%); split: +0.01%, -0.15%
Instrs: 602026 -> 597473 (-0.76%); split: -0.83%, +0.08%
CodeSize: 3109872 -> 3071664 (-1.23%); split: -1.26%, +0.03%
SGPRs: 137826 -> 137938 (+0.08%); split: -0.10%, +0.19%
VGPRs: 70364 -> 70520 (+0.22%); split: -0.03%, +0.26%
Latency: 4757850 -> 4781905 (+0.51%); split: -0.35%, +0.86%
InvThroughput: 2296941 -> 2310685 (+0.60%); split: -0.14%, +0.74%
VClause: 14161 -> 14050 (-0.78%); split: -1.23%, +0.44%
SClause: 14058 -> 14077 (+0.14%); split: -0.57%, +0.70%
Copies: 40954 -> 42191 (+3.02%); split: -1.69%, +4.71%
PreSGPRs: 64314 -> 63214 (-1.71%); split: -1.81%, +0.10%
PreVGPRs: 53558 -> 53894 (+0.63%); split: -0.01%, +0.64%
VALU: 449920 -> 450830 (+0.20%); split: -0.19%, +0.39%
SALU: 32973 -> 32839 (-0.41%); split: -0.76%, +0.35%
VMEM: 28796 -> 25151 (-12.66%)

fossil-db (polaris10):
Totals from 1769 (2.86% of 61794) affected shaders:
MaxWaves: 12024 -> 12021 (-0.02%)
Instrs: 474761 -> 470760 (-0.84%); split: -0.94%, +0.10%
CodeSize: 2447964 -> 2420712 (-1.11%); split: -1.15%, +0.04%
SGPRs: 129664 -> 129728 (+0.05%); split: -0.14%, +0.19%
VGPRs: 65216 -> 65560 (+0.53%); split: -0.05%, +0.58%
Latency: 4304734 -> 4318319 (+0.32%); split: -0.41%, +0.72%
InvThroughput: 2114950 -> 2122580 (+0.36%); split: -0.18%, +0.54%
VClause: 10933 -> 10808 (-1.14%); split: -1.42%, +0.27%
SClause: 11430 -> 11446 (+0.14%); split: -0.70%, +0.84%
Copies: 32290 -> 31891 (-1.24%); split: -2.80%, +1.56%
PreSGPRs: 58184 -> 57096 (-1.87%); split: -1.98%, +0.11%
PreVGPRs: 48757 -> 48874 (+0.24%); split: -0.02%, +0.26%
VALU: 359097 -> 358582 (-0.14%); split: -0.25%, +0.11%
SALU: 26279 -> 25934 (-1.31%); split: -1.75%, +0.43%
VMEM: 18825 -> 17247 (-8.38%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
953faac23e radv: vectorize descriptor loads
fossil-db (navi31):
Totals from 49237 (62.03% of 79377) affected shaders:
MaxWaves: 1497901 -> 1497851 (-0.00%); split: +0.00%, -0.00%
Instrs: 25766029 -> 25595468 (-0.66%); split: -0.68%, +0.02%
CodeSize: 133811412 -> 132616356 (-0.89%); split: -0.90%, +0.01%
VGPRs: 2318068 -> 2318200 (+0.01%); split: -0.00%, +0.01%
SpillSGPRs: 4512 -> 4507 (-0.11%); split: -0.64%, +0.53%
Latency: 164086813 -> 163869930 (-0.13%); split: -0.22%, +0.09%
InvThroughput: 24811220 -> 24802709 (-0.03%); split: -0.05%, +0.02%
VClause: 553717 -> 557194 (+0.63%); split: -0.30%, +0.93%
SClause: 723038 -> 710431 (-1.74%); split: -2.69%, +0.95%
Copies: 1709226 -> 1711030 (+0.11%); split: -0.48%, +0.59%
Branches: 465169 -> 465164 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1775360 -> 1961282 (+10.47%); split: -0.01%, +10.48%
VALU: 15418039 -> 15417896 (-0.00%); split: -0.02%, +0.02%
SALU: 2424519 -> 2416263 (-0.34%); split: -0.61%, +0.26%
SMEM: 1245273 -> 1121006 (-9.98%)
VOPD: 3882 -> 3885 (+0.08%); split: +0.18%, -0.10%

fossil-db (navi21):
Totals from 48539 (61.15% of 79377) affected shaders:
MaxWaves: 1262958 -> 1262912 (-0.00%); split: +0.00%, -0.01%
Instrs: 30334013 -> 30154279 (-0.59%); split: -0.60%, +0.01%
CodeSize: 161298192 -> 160027616 (-0.79%); split: -0.80%, +0.01%
VGPRs: 1979248 -> 1979192 (-0.00%); split: -0.01%, +0.01%
SpillSGPRs: 3751 -> 3776 (+0.67%); split: -0.75%, +1.41%
Latency: 185665578 -> 185429672 (-0.13%); split: -0.23%, +0.10%
InvThroughput: 41413438 -> 41406558 (-0.02%); split: -0.03%, +0.02%
VClause: 624116 -> 626703 (+0.41%); split: -0.30%, +0.71%
SClause: 775094 -> 764569 (-1.36%); split: -2.73%, +1.38%
Copies: 2437041 -> 2441758 (+0.19%); split: -0.23%, +0.42%
Branches: 770540 -> 770552 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1919117 -> 2021093 (+5.31%); split: -0.01%, +5.32%
VALU: 18926346 -> 18926269 (-0.00%); split: -0.01%, +0.01%
SALU: 4316722 -> 4310066 (-0.15%); split: -0.33%, +0.17%
SMEM: 1350230 -> 1216865 (-9.88%)

fossil-db (vega10):
Totals from 41793 (66.38% of 62962) affected shaders:
MaxWaves: 306797 -> 306685 (-0.04%); split: +0.02%, -0.06%
Instrs: 16251398 -> 16140153 (-0.68%); split: -0.71%, +0.02%
CodeSize: 83407848 -> 82543596 (-1.04%); split: -1.05%, +0.01%
SGPRs: 2787936 -> 2854864 (+2.40%); split: -0.73%, +3.13%
VGPRs: 1585644 -> 1586156 (+0.03%); split: -0.01%, +0.05%
SpillSGPRs: 3856 -> 3843 (-0.34%); split: -1.50%, +1.17%
SpillVGPRs: 560 -> 562 (+0.36%)
Latency: 167478607 -> 166829429 (-0.39%); split: -0.50%, +0.12%
InvThroughput: 76378642 -> 76353650 (-0.03%); split: -0.06%, +0.03%
VClause: 361639 -> 362694 (+0.29%); split: -0.31%, +0.60%
SClause: 546919 -> 535879 (-2.02%); split: -2.98%, +0.96%
Copies: 1388817 -> 1396020 (+0.52%); split: -0.37%, +0.89%
Branches: 227697 -> 227705 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1384316 -> 1532654 (+10.72%); split: -0.01%, +10.73%
VALU: 11896315 -> 11896547 (+0.00%); split: -0.01%, +0.01%
SALU: 1371452 -> 1370143 (-0.10%); split: -0.83%, +0.73%
VMEM: 628506 -> 628510 (+0.00%)
SMEM: 984495 -> 882129 (-10.40%)

fossil-db (polaris10):
Totals from 41057 (66.44% of 61794) affected shaders:
MaxWaves: 270307 -> 270311 (+0.00%); split: +0.02%, -0.01%
Instrs: 16082187 -> 15972163 (-0.68%); split: -0.71%, +0.02%
CodeSize: 82199592 -> 81341176 (-1.04%); split: -1.05%, +0.01%
SGPRs: 2894960 -> 2970720 (+2.62%); split: -0.67%, +3.29%
VGPRs: 1620132 -> 1620352 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 3885 -> 3868 (-0.44%); split: -1.47%, +1.03%
SpillVGPRs: 617 -> 619 (+0.32%)
Latency: 166722696 -> 166066137 (-0.39%); split: -0.52%, +0.13%
InvThroughput: 76887856 -> 76862349 (-0.03%); split: -0.08%, +0.04%
VClause: 353499 -> 354709 (+0.34%); split: -0.28%, +0.62%
SClause: 544073 -> 533053 (-2.03%); split: -2.97%, +0.95%
Copies: 1398025 -> 1405848 (+0.56%); split: -0.30%, +0.86%
Branches: 224038 -> 224041 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1362781 -> 1509495 (+10.77%); split: -0.01%, +10.77%
VALU: 11771997 -> 11772271 (+0.00%); split: -0.01%, +0.01%
SALU: 1416410 -> 1415708 (-0.05%); split: -0.72%, +0.68%
VMEM: 616867 -> 616871 (+0.00%)
SMEM: 970539 -> 869729 (-10.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
5fe0012670 radv: DCE before nir_opt_shrink_vectors
fossil-db (navi31):
Totals from 941 (1.19% of 79377) affected shaders:
MaxWaves: 28422 -> 28502 (+0.28%)
Instrs: 645374 -> 642031 (-0.52%); split: -0.55%, +0.03%
CodeSize: 3264460 -> 3244488 (-0.61%); split: -0.63%, +0.02%
VGPRs: 48392 -> 48044 (-0.72%)
SpillVGPRs: 7 -> 0 (-inf%)
Scratch: 1792 -> 0 (-inf%)
Latency: 2596896 -> 2536952 (-2.31%); split: -2.33%, +0.02%
InvThroughput: 528726 -> 500139 (-5.41%); split: -5.42%, +0.01%
VClause: 14566 -> 14539 (-0.19%)
Copies: 53022 -> 51296 (-3.26%); split: -3.37%, +0.12%
PreSGPRs: 29369 -> 29367 (-0.01%)
PreVGPRs: 29710 -> 29694 (-0.05%)
VALU: 366134 -> 364245 (-0.52%); split: -0.53%, +0.02%
SALU: 74017 -> 73891 (-0.17%)
VMEM: 25240 -> 25208 (-0.13%)

fossil-db (navi21):
Totals from 941 (1.19% of 79377) affected shaders:
MaxWaves: 22018 -> 22058 (+0.18%); split: +0.28%, -0.10%
Instrs: 521053 -> 518898 (-0.41%); split: -0.43%, +0.02%
CodeSize: 2750628 -> 2734996 (-0.57%); split: -0.58%, +0.01%
VGPRs: 41152 -> 41024 (-0.31%); split: -0.41%, +0.10%
SpillVGPRs: 5 -> 0 (-inf%)
Scratch: 2048 -> 0 (-inf%)
Latency: 2655941 -> 2607022 (-1.84%); split: -1.86%, +0.02%
InvThroughput: 711733 -> 690032 (-3.05%); split: -3.07%, +0.02%
VClause: 16388 -> 16363 (-0.15%)
Copies: 35152 -> 33485 (-4.74%); split: -4.98%, +0.24%
PreSGPRs: 28486 -> 28484 (-0.01%)
PreVGPRs: 30317 -> 30301 (-0.05%)
VALU: 348423 -> 346614 (-0.52%); split: -0.54%, +0.02%
SALU: 44020 -> 43869 (-0.34%)
VMEM: 25216 -> 25195 (-0.08%)

fossil-db (vega10):
Totals from 416 (0.66% of 62962) affected shaders:
MaxWaves: 2687 -> 2696 (+0.33%); split: +0.37%, -0.04%
Instrs: 245634 -> 243501 (-0.87%); split: -0.88%, +0.01%
CodeSize: 1312836 -> 1297248 (-1.19%); split: -1.19%, +0.01%
VGPRs: 17684 -> 17692 (+0.05%); split: -0.43%, +0.48%
SpillVGPRs: 5 -> 0 (-inf%)
Scratch: 2048 -> 0 (-inf%)
Latency: 1928393 -> 1881346 (-2.44%); split: -2.44%, +0.00%
InvThroughput: 1163915 -> 1117096 (-4.02%); split: -4.03%, +0.00%
VClause: 7070 -> 7053 (-0.24%)
Copies: 22577 -> 20834 (-7.72%); split: -7.78%, +0.06%
Branches: 4328 -> 4320 (-0.18%)
PreSGPRs: 13993 -> 13991 (-0.01%)
PreVGPRs: 13452 -> 13436 (-0.12%)
VALU: 165253 -> 163366 (-1.14%); split: -1.15%, +0.01%
SALU: 26258 -> 26111 (-0.56%)
VMEM: 11736 -> 11715 (-0.18%)

fossil-db (polaris10):
Totals from 355 (0.57% of 61794) affected shaders:
Instrs: 108639 -> 108682 (+0.04%); split: -0.03%, +0.07%
CodeSize: 583804 -> 583936 (+0.02%); split: -0.03%, +0.06%
SGPRs: 17712 -> 17728 (+0.09%)
Latency: 735332 -> 734777 (-0.08%); split: -0.08%, +0.01%
InvThroughput: 443975 -> 444045 (+0.02%); split: -0.03%, +0.04%
VClause: 2552 -> 2558 (+0.24%)
SClause: 2394 -> 2393 (-0.04%)
Copies: 11433 -> 11464 (+0.27%); split: -0.15%, +0.42%
PreVGPRs: 7365 -> 7364 (-0.01%)
VALU: 69385 -> 69416 (+0.04%); split: -0.02%, +0.07%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
d04e1ea02d radv: move nir_opt_shrink_vectors later
This seems to be helpful with shaders which use NGG culling.

fossil-db (navi21):
Totals from 3529 (4.45% of 79377) affected shaders:
MaxWaves: 81490 -> 82066 (+0.71%)
Instrs: 2868872 -> 2863476 (-0.19%); split: -0.22%, +0.04%
CodeSize: 14949540 -> 14927580 (-0.15%); split: -0.18%, +0.03%
VGPRs: 165440 -> 164144 (-0.78%)
SpillSGPRs: 578 -> 405 (-29.93%)
Latency: 15388119 -> 15151882 (-1.54%); split: -1.74%, +0.20%
InvThroughput: 2935873 -> 2929736 (-0.21%); split: -0.25%, +0.04%
VClause: 70192 -> 68904 (-1.83%); split: -2.17%, +0.33%
SClause: 67678 -> 67679 (+0.00%); split: -0.10%, +0.10%
Copies: 265824 -> 261458 (-1.64%); split: -1.96%, +0.32%
Branches: 75084 -> 75088 (+0.01%); split: -0.02%, +0.02%
PreSGPRs: 165962 -> 165716 (-0.15%)
PreVGPRs: 135122 -> 134724 (-0.29%)
VALU: 1681747 -> 1677134 (-0.27%); split: -0.32%, +0.05%
SALU: 436003 -> 435915 (-0.02%); split: -0.03%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
f034aa9cd3 radv: don't use bit_sizes_int to skip nir_lower_bit_size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
19394f44df ac/nir: set memory_modes for lowered TES input loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
1fca72ddc8 ac/nir/ngg: update bit_sizes_int
This is used for RADV's bit size lowering.

fossil-db (navi21):
Totals from 4520 (5.69% of 79377) affected shaders:
(no stat changes)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
Rhys Perry
539f9b4ba6 nir,aco,radv: add align_mul/offset to buffer_amd intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242>
2025-02-07 13:52:57 +00:00
David Rosca
62b0f84981 ac/vcn_dec: Fix AV1 film grain on VCN5
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33376>
2025-02-07 13:13:45 +00:00
Samuel Pitoiset
76dcac9d47 radv: advertise VK_KHR_cooperative_matrix on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378>
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
b05a112d92 radv/nir: add cooperative matrix lowering for GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378>
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
ad611adeb7 radv/nir: add a struct for parameters to cooperative matrix lowering
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33378>
2025-02-07 12:06:10 +00:00
Samuel Pitoiset
dbb7e3cf88 radv: do not keep track of the streamout binding buffer
More like BDA style. For future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404>
2025-02-07 10:53:37 +01:00
Samuel Pitoiset
03cacc1406 radv: rework passing draw info via radv_draw_info
More like BDA style. For future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404>
2025-02-07 10:53:37 +01:00
Samuel Pitoiset
6f34be88d9 radv: rework passing dispatch info via radv_dispatch_info
More like BDA style. For future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404>
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
b5740d5819 radv: use radv_indirect_dispatch() more
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404>
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
ef7e28e7a8 radv: remove redundant drawCount == 0 for indirect mesh/task draws
This is already handled in radv_before_taskmesh_draw().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33404>
2025-02-07 09:30:22 +01:00
Samuel Pitoiset
8625decbcc radv: fix fetching draw vertex data from counter buffers with transform feedback
counterOffset was just ignored and nobody noticed (missing VKCTS
coverage).

VGT_STRMOUT_DRAW_OPAQUE_OFFSET will do the computation in hw for us.
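
For reference, the computation in question, as the Vulkan specification describes it for vkCmdDrawIndirectByteCountEXT, can be sketched as below. This only illustrates the arithmetic; the function name is hypothetical and this is not how the driver or the hardware implements it.

```python
def opaque_draw_vertex_count(counter_value: int, counter_offset: int, vertex_stride: int) -> int:
    """Vertex count derived from a transform feedback byte counter.

    counter_value is the byte count read from the counter buffer,
    counter_offset is subtracted from it, and the result is divided
    by the vertex stride.
    """
    if vertex_stride <= 0:
        raise ValueError("vertex_stride must be positive")
    if counter_offset > counter_value:
        raise ValueError("counter_offset exceeds the counter value")
    return (counter_value - counter_offset) // vertex_stride
```

Ignoring counterOffset is equivalent to always passing `counter_offset=0`, which overcounts vertices whenever the application supplies a non-zero offset.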

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33407>
2025-02-07 07:59:39 +00:00
Hans-Kristian Arntzen
1fcb494054 radv: Repurpose radv_legacy_sparse_binding drirc
Rename the drirc and call it radv_disable_dedicated_sparse_queue instead,
since normal queues support sparse now anyway.
Keep the workaround for existing known games, since they might not
expect a separate SPARSE queue to pop up.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33166>
2025-02-06 14:07:20 +00:00
Hans-Kristian Arntzen
f58630f07c radv: Always allow sparse on normal GFX/COMPUTE/DMA queues.
Forcing a dedicated sparse queue is problematic in real-world scenarios.

In the current implicit sync world for sparse updates, we can rely on
submission order.

For use cases where an application can take advantage of the separate
sparse queue to do "async" updates, the existing implementation works
well, but problems arise when trying to implement D3D-style submission
ordering. E.g., when a game does sparse on a graphics or compute queue,
we need to guarantee that previous submissions, sparse update and future
submissions are properly ordered.
The Vulkan way of implementing this is to:

- Signal graphics queue to timeline N (i.e. last submission made)
- Wait on timeline N on the sparse queue
- Do sparse updates
- Signal timeline N + 1 on sparse queue
- Wait for timeline N + 1 on graphics queue (can be deferred until next
  graphics submit)
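
The steps above can be sketched with a toy, single-threaded model of a timeline semaphore. This is not Vulkan or RADV code; `Timeline` and `sparse_update_via_sparse_queue` are hypothetical names, and a wait that would block is simply reported as an error.

```python
class Timeline:
    """Toy timeline semaphore: a counter that only moves forward."""

    def __init__(self) -> None:
        self.value = 0

    def signal(self, target: int) -> None:
        if target <= self.value:
            raise ValueError("timeline values must strictly increase")
        self.value = target

    def wait(self, target: int) -> None:
        # Single-threaded toy: a wait that would block is reported as an error.
        if self.value < target:
            raise RuntimeError(f"would block waiting for {target}")


def sparse_update_via_sparse_queue(timeline: Timeline) -> list:
    """Replay the five steps and return them as (queue, action, value) tuples."""
    log = []
    n = timeline.value + 1

    timeline.signal(n)                       # graphics: signal N
    log.append(("graphics", "signal", n))
    timeline.wait(n)                         # sparse: wait for N
    log.append(("sparse", "wait", n))
    log.append(("sparse", "bind", None))     # sparse: do the sparse updates
    timeline.signal(n + 1)                   # sparse: signal N + 1
    log.append(("sparse", "signal", n + 1))
    timeline.wait(n + 1)                     # graphics: wait for N + 1
    log.append(("graphics", "wait", n + 1))
    return log
```

Each sparse update costs two timeline values and a round trip between the two queues in this model, which is the ordering overhead the rest of the message discusses.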

This causes an unavoidable bubble in GPU execution, since the
existing sparse queue ends up doing:

- Wait pending signal. The implication here is that all previous GPU
  work must have been submitted.
- Do VM operations on CPU timeline
- Wait for semaphores to signal (this is required for signal ordering)
- ... GPU is meanwhile stalling in a bubble due to GPU -> CPU -> GPU roundtrip.
- Signal semaphore on CPU (unblocks GPU work)

Letting the GPU go idle here is not great, and we can be screwed over by bad thread scheduling.

Another knock-on effect is that the graphics queue is now forced into
using a thread for submissions. This is because when the graphics queue
wants to wait for timeline N + 1, the sparse queue may not have
signalled the timeline yet on CPU, so effectively, we have created a
wait-before-signal situation internally in RADV. Throwing another thread
under the bus is not great either.

Just letting the queue in question support sparse binding solves all
these issues and I don't see a path forward where the D3D use case can
be solved in a separate queue world.

It is also friendlier to the ecosystem at large. RADV is the only driver
I know of that insists on separate sparse queues and multiple games
assume that graphics queue can support sparse.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33166>
2025-02-06 14:07:20 +00:00
Valentine Burley
89994ec65a amd/ci: Fix fraction for radv-stoney-angle-full
The radv-stoney-angle-full was unintentionally inheriting the fraction
from the pre-merge job.
Also use the correct manual rules definition while we're here, and use
consistent naming for the restricted rules.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Eric Engestrom <None>
Reviewed-by: Antonio Ospite <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33377>
2025-02-06 11:58:33 +00:00
Samuel Pitoiset
9b827556f5 radv: fix adding the BO to cmdbuf list when starting conditional rendering
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33403>
2025-02-06 07:13:29 +00:00
Martin Roukala (né Peres)
562bc5697f radv/ci: update expectations
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33401>
2025-02-06 03:31:02 +00:00
Martin Roukala (né Peres)
b432f03c8a radv/ci: bump tahiti's cpu cores
You may thank @Venemo for his generous donation to our CI :)

Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33401>
2025-02-06 03:31:02 +00:00
Mike Blumenkrantz
30b616244c radv: print stringname for VkExternalMemoryHandleTypeFlagBits error
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323>
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
20013a1774 radv: stop blocking non-2D import/export ops
these work fine

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323>
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
ca8a740e3b radv: fix error reporting for VkExternalMemoryTypeFlagBitsKHR
wrong type name is confusing

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323>
2025-02-06 01:48:25 +00:00
Mike Blumenkrantz
602f19bad8 ac/surface: always allow LINEAR modifier for color formats
this is always supported

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323>
2025-02-06 01:48:25 +00:00
Samuel Pitoiset
4fc856af98 radv: fix caching on-demand meta shaders
This switches to disk_cache instead of our own mechanism which only
stored meta shaders when the logical device was destroyed.

Meta shaders are still stored separately from the application shaders
because they are common to all applications on a given GPU/Mesa version.
The default cache is 32MiB which should be large enough.

This fixes massive stuttering in FF7 Rebirth but all apps are
technically affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33370>
2025-02-05 16:30:27 +00:00
Georg Lehmann
ff225dee67 radv: inline radv_nir_lower_poly_line_smooth
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33340>
2025-02-05 11:23:35 +00:00
Georg Lehmann
b588b56078 radv: remove radv_should_lower_poly_line_smooth
I think this was broken as there might be a store_output with
less than 4 components to a location that shouldn't be smoothed
anyway (i.e. not the first one).

nir_lower_poly_line_smooth now handles the case where the first location
doesn't have 4 components.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33340>
2025-02-05 11:23:35 +00:00
Daniel Schürmann
1a8a643bbd aco/isel: track control flow divergence in loops more accurately
We introduce two new variables, cf_context::in_divergent_cf and
cf_context::parent_loop.has_divergent_break, in order to determine
whether there are any other invocations on a different CF path.

Totals from 1305 (1.64% of 79395) affected shaders: (Navi31)

Instrs: 659211 -> 657815 (-0.21%); split: -0.22%, +0.01%
CodeSize: 3483228 -> 3477960 (-0.15%); split: -0.16%, +0.01%
VGPRs: 68820 -> 48048 (-30.18%)
Latency: 14197750 -> 14170767 (-0.19%); split: -0.26%, +0.07%
InvThroughput: 1619103 -> 1619826 (+0.04%); split: -0.02%, +0.07%
VClause: 12384 -> 12350 (-0.27%)
SClause: 26693 -> 26844 (+0.57%); split: -0.01%, +0.57%
Copies: 44994 -> 43535 (-3.24%); split: -3.26%, +0.02%
PreSGPRs: 49007 -> 48907 (-0.20%)
PreVGPRs: 32171 -> 32121 (-0.16%)
VALU: 349984 -> 349857 (-0.04%); split: -0.04%, +0.00%
SALU: 84252 -> 83988 (-0.31%); split: -0.32%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206>
2025-02-05 10:54:21 +00:00
Daniel Schürmann
583c3586fe aco/isel: remove loop nest information from exec_info
Since we never enter loops with an empty exec mask, and the
control flow is structured, we don't need to consider the
loop nest depth.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206>
2025-02-05 10:54:21 +00:00
Daniel Schürmann
a77258346c aco/isel: fix assumptions about potential empty exec mask in nested control flow
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206>
2025-02-05 10:54:21 +00:00
Daniel Schürmann
44216e035f aco/isel: add and use exec_info::empty() helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206>
2025-02-05 10:54:21 +00:00
Daniel Schürmann
8e8398832c aco/isel: use cf_context in loop_context to restore cf information
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33206>
2025-02-05 10:54:21 +00:00