Rhys Perry
2135c88d9c
aco: fix signedness of DS_instruction::offset0/1
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Daniel Schürmann
d703a0e808
aco: remove register hints entirely
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
2fe005a3fe
aco: remove occurences of VCC hint
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
b10c4d7dee
aco: make program->needs_vcc independent of VCC hints
...
Totals from 5 (0.00% of 135048) affected shaders: (GFX9)
SGPRs: 208 -> 160 (-23.08%)
CodeSize: 2700 -> 2692 (-0.30%)
Instrs: 533 -> 531 (-0.38%)
Latency: 41688 -> 41680 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
415a3820fc
aco/ra: omit VCC affinity on VOPC_SDWA for GFX9+
...
VOPC_SDWA can also use arbitrary SGPR pairs on GFX9+.
Totals from 5607 (4.16% of 134913) affected shaders: (GFX10.3)
CodeSize: 42470760 -> 42452988 (-0.04%)
Instrs: 7943174 -> 7942883 (-0.00%)
Latency: 102887029 -> 102886305 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 20454456 -> 20454338 (-0.00%); split: -0.00%, +0.00%
Copies: 376818 -> 376865 (+0.01%); split: -0.00%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
6ebc61d71b
aco/ra: create VCC-affinities during RA
...
instead of using register hints.
Totals from 88367 (65.50% of 134913) affected shaders: (GFX10.3)
CodeSize: 322492184 -> 322252912 (-0.07%); split: -0.08%, +0.01%
Instrs: 60615809 -> 60541260 (-0.12%); split: -0.12%, +0.00%
Latency: 557067980 -> 557009210 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 109676757 -> 109674804 (-0.00%); split: -0.00%, +0.00%
SClause: 1939703 -> 1939924 (+0.01%); split: -0.01%, +0.02%
Copies: 4557567 -> 4487530 (-1.54%); split: -1.54%, +0.00%
Branches: 1941123 -> 1937453 (-0.19%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
44fb9ba84a
aco/ra: only use VCC if program->needs_vcc == true
...
A future commit will make VCC register assignment independent
from register hints. Up to GFX9, VCC can alternatively be used
as regular SGPR, so prevent overlap.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Rhys Perry
7478b00c7c
aco: remove old global access intrinsics
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
9d1bab3615
aco: increase global_load_params.max_const_offset_plus_one
...
The callback now supports this. This shouldn't have any effect yet except
on GFX6 with 12 byte loads.
fossil-db (Pitcairn):
Totals from 246 (0.18% of 135668) affected shaders:
VGPRs: 14684 -> 14768 (+0.57%); split: -0.44%, +1.01%
CodeSize: 1765792 -> 1738040 (-1.57%)
Instrs: 344605 -> 340055 (-1.32%)
Latency: 4892904 -> 4861942 (-0.63%)
InvThroughput: 2479599 -> 2446070 (-1.35%)
VClause: 8782 -> 8735 (-0.54%)
SClause: 9854 -> 9853 (-0.01%)
Copies: 47327 -> 45401 (-4.07%); split: -4.08%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
3e9517c757
aco: implement _amd global access intrinsics
...
fossil-db (Sienna Cichlid):
Totals from 7 (0.01% of 134621) affected shaders:
VGPRs: 760 -> 776 (+2.11%)
CodeSize: 222000 -> 222044 (+0.02%); split: -0.01%, +0.03%
Instrs: 40959 -> 40987 (+0.07%); split: -0.01%, +0.08%
Latency: 874811 -> 886609 (+1.35%); split: -0.00%, +1.35%
InvThroughput: 437405 -> 443303 (+1.35%); split: -0.00%, +1.35%
VClause: 1242 -> 1240 (-0.16%)
SClause: 1050 -> 1049 (-0.10%); split: -0.19%, +0.10%
Copies: 4953 -> 4973 (+0.40%); split: -0.04%, +0.44%
Branches: 1947 -> 1957 (+0.51%); split: -0.05%, +0.56%
PreVGPRs: 741 -> 747 (+0.81%)
fossil-db changes seem to be noise.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
391bf3ea30
aco: don't expand smem/mubuf global loads
...
For example, dwordx3->dwordx4 or ubyte3->dwordx2.
Global loads don't have the bounds checking that buffer loads have that
makes this safe.
The alignment checks are added to global_load_callback() in case
byte_align_loads=false, align=1 and bytes_needed=3. Without them, the
callback will create a dword load.
fossil-db (Sienna Cichlid):
Totals from 267 (0.20% of 134621) affected shaders:
CodeSize: 1603352 -> 1606568 (+0.20%)
Instrs: 294946 -> 295482 (+0.18%); split: -0.00%, +0.18%
Latency: 2997003 -> 2997052 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 526645 -> 526659 (+0.00%)
SClause: 9179 -> 9185 (+0.07%); split: -0.02%, +0.09%
Copies: 25363 -> 25375 (+0.05%); split: -0.08%, +0.13%
Branches: 8298 -> 8299 (+0.01%)
fossil-db (Polaris10):
Totals from 267 (0.20% of 135668) affected shaders:
CodeSize: 1636672 -> 1638756 (+0.13%); split: -0.00%, +0.13%
Instrs: 308484 -> 308733 (+0.08%); split: -0.01%, +0.09%
Latency: 3446045 -> 3446904 (+0.02%); split: -0.00%, +0.03%
InvThroughput: 1206722 -> 1206828 (+0.01%); split: -0.00%, +0.01%
SClause: 9308 -> 9311 (+0.03%); split: -0.08%, +0.11%
Copies: 36933 -> 36921 (-0.03%); split: -0.08%, +0.05%
fossil-db (Pitcairn):
Totals from 275 (0.20% of 135668) affected shaders:
SGPRs: 17616 -> 17520 (-0.54%); split: -0.64%, +0.09%
VGPRs: 15428 -> 15540 (+0.73%); split: -0.23%, +0.96%
CodeSize: 1885792 -> 1929120 (+2.30%); split: -0.00%, +2.30%
MaxWaves: 1284 -> 1285 (+0.08%)
Instrs: 368963 -> 376095 (+1.93%); split: -0.00%, +1.94%
Latency: 5122922 -> 5168398 (+0.89%); split: -0.01%, +0.90%
InvThroughput: 2562866 -> 2604279 (+1.62%)
VClause: 9268 -> 9296 (+0.30%); split: -0.13%, +0.43%
SClause: 10702 -> 10705 (+0.03%); split: -0.05%, +0.07%
Copies: 48620 -> 50629 (+4.13%); split: -0.08%, +4.21%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
6baad09711
aco: use saddr for global access with sgpr address
...
fossil-db (Sienna Cichlid):
Totals from 38 (0.03% of 134621) affected shaders:
CodeSize: 237196 -> 237060 (-0.06%); split: -0.09%, +0.03%
Instrs: 43895 -> 43894 (-0.00%); split: -0.02%, +0.01%
Latency: 914633 -> 916263 (+0.18%); split: -0.01%, +0.19%
InvThroughput: 468215 -> 468971 (+0.16%); split: -0.02%, +0.18%
SClause: 1239 -> 1242 (+0.24%)
PreSGPRs: 997 -> 1003 (+0.60%)
PreVGPRs: 936 -> 923 (-1.39%); split: -1.50%, +0.11%
Regression seems to be RA noise, creating a waitcnt.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
d957730b9b
aco: use vcc for 64-bit vgpr addition
...
fossil-db (Sienna Cichlid):
Totals from 229 (0.17% of 134621) affected shaders:
CodeSize: 1520192 -> 1517644 (-0.17%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Timur Kristóf
b874a2ed61
aco: Fix VOP2 instruction format in visit_tex.
...
There was a v_or_b32 that accidentally used SOP2.
It should use VOP2.
Issue found by looking at a gfxreconstruct trace posted by a user
in this bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5838
Cc: mesa-stable
Fixes: 93c8ebfa78 "aco: Initial commit of independent AMD compiler"
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15923 >
2022-04-13 13:06:49 +00:00
Rhys Perry
773c7cbcbc
radv,aco: implement 64-bit inline push constants
...
fossil-db (Sienna Cichlid):
Totals from 21 (0.02% of 134621) affected shaders:
CodeSize: 1932 -> 1560 (-19.25%)
Instrs: 357 -> 303 (-15.13%)
Latency: 6576 -> 5883 (-10.54%)
InvThroughput: 26304 -> 23532 (-10.54%)
SClause: 42 -> 24 (-42.86%)
Copies: 90 -> 105 (+16.67%); split: -10.00%, +26.67%
PreSGPRs: 144 -> 201 (+39.58%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145 >
2022-04-12 11:44:30 +00:00
Rhys Perry
7f6262bb85
radv: allow holes in inline push constants
...
Use a dword mask instead of a range to track which push constants to
inline.
fossil-db (Sienna Cichlid):
Totals from 5724 (4.25% of 134621) affected shaders:
CodeSize: 20894044 -> 20815748 (-0.37%); split: -0.39%, +0.02%
Instrs: 4002568 -> 3988385 (-0.35%); split: -0.38%, +0.02%
Latency: 29285060 -> 29224414 (-0.21%); split: -0.22%, +0.01%
InvThroughput: 5529700 -> 5526893 (-0.05%); split: -0.05%, +0.00%
VClause: 78093 -> 78240 (+0.19%); split: -0.23%, +0.41%
SClause: 135495 -> 131027 (-3.30%); split: -3.30%, +0.00%
Copies: 330856 -> 324552 (-1.91%); split: -2.37%, +0.46%
PreSGPRs: 226031 -> 224778 (-0.55%); split: -0.61%, +0.05%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145 >
2022-04-12 11:44:30 +00:00
Erik Faye-Lund
28dbabec8e
aco: do not use designated initializers
...
Designated initializers are a C++20 feature, but we don't use C++20. In
fact, enabling C++20 for ACO triggers new compiler errors due to some
equality semantics details.
So let's instead stop using designated initializers here.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15706 >
2022-04-05 16:58:56 +00:00
Rhys Perry
5b4e41e4db
aco: don't use v_mad_mix on GFX9 if 16-bit denormals must be preserved
...
This probably effectively disables the v_mad_mix optimization on GFX9.
fossil-db (Vega):
Totals from 11545 (7.15% of 161366) affected shaders:
MaxWaves: 43025 -> 42780 (-0.57%); split: +0.06%, -0.63%
Instrs: 18571635 -> 18734201 (+0.88%); split: -0.00%, +0.88%
CodeSize: 96483568 -> 96611012 (+0.13%); split: -0.11%, +0.24%
SGPRs: 1079056 -> 1077616 (-0.13%); split: -0.14%, +0.01%
VGPRs: 819248 -> 821868 (+0.32%); split: -0.04%, +0.36%
SpillSGPRs: 13313 -> 12464 (-6.38%)
Latency: 293804093 -> 295046122 (+0.42%); split: -0.09%, +0.51%
InvThroughput: 110002239 -> 110994978 (+0.90%); split: -0.03%, +0.93%
VClause: 342458 -> 342596 (+0.04%); split: -0.12%, +0.16%
SClause: 648566 -> 648046 (-0.08%); split: -0.12%, +0.04%
Copies: 1728225 -> 1726679 (-0.09%); split: -0.66%, +0.57%
Branches: 552973 -> 552963 (-0.00%); split: -0.02%, +0.02%
PreSGPRs: 862360 -> 856820 (-0.64%); split: -0.69%, +0.05%
PreVGPRs: 773689 -> 776818 (+0.40%); split: -0.02%, +0.42%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15718 >
2022-04-04 19:27:12 +00:00
Daniel Schürmann
6d383159d4
aco/optimizer: check recursively if we can eliminate s_and exec
...
Totals from 2860 (2.12% of 134913) affected shaders: (GFX10.3)
CodeSize: 5990728 -> 5979164 (-0.19%); split: -0.20%, +0.01%
Instrs: 1094562 -> 1091653 (-0.27%); split: -0.28%, +0.01%
Latency: 8689841 -> 8684523 (-0.06%); split: -0.07%, +0.00%
InvThroughput: 1840533 -> 1840527 (-0.00%); split: -0.00%, +0.00%
SClause: 51437 -> 51439 (+0.00%)
Copies: 82461 -> 82472 (+0.01%)
PreSGPRs: 83136 -> 83172 (+0.04%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15675 >
2022-04-01 15:35:26 +02:00
Daniel Schürmann
db8c401f71
aco/ra: fix stride check on subdword parallelcopies for create_vector
...
On GFX6/7, info.rc is in full dwords.
Fixes: 9476986e6f ('aco/ra: special-case get_reg_for_create_vector_copy()')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15393 >
2022-03-31 08:55:07 +00:00
Samuel Pitoiset
d32656bc65
radv: lower has_multiview_view_index in NIR
...
This lowering is done in a new NIR pass where the layer is written
before emit_vertex_with_counter for geometry shaders and after the
position for other vertex stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15456 >
2022-03-31 07:34:21 +00:00
Georg Lehmann
141ca78634
radv, aco: Packed iadd_sat/uadd_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421 >
2022-03-28 20:02:52 +00:00
Georg Lehmann
50f585254c
aco: Implement scalar iadd_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421 >
2022-03-28 20:02:52 +00:00
Georg Lehmann
989f2b1785
aco: Implement 64bit uadd_sat.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421 >
2022-03-28 20:02:52 +00:00
Georg Lehmann
01fad6aa72
aco: Remove 0 data components from image stores.
...
Image stores always write a full texel.
The hardware writes zero for components not included in dmask.
Totals from 387 (0.29% of 134913) affected shaders: (GFX10.3)
VGPRs: 17216 -> 17136 (-0.46%)
CodeSize: 1987652 -> 1981504 (-0.31%)
MaxWaves: 9054 -> 9058 (+0.04%)
Instrs: 361883 -> 361115 (-0.21%); split: -0.21%, +0.00%
Latency: 5383187 -> 5381227 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 1373830 -> 1372097 (-0.13%)
VClause: 9031 -> 9038 (+0.08%); split: -0.04%, +0.12%
Copies: 25923 -> 25153 (-2.97%); split: -2.98%, +0.01%
PreVGPRs: 14643 -> 14555 (-0.60%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14986 >
2022-03-28 19:43:18 +00:00
Samuel Pitoiset
4cfb5332d6
radv: lower adjusting gl_FragCoord.z for VRS in NIR
...
fossils-db (Sienna Cichlid):
Totals from 4432 (3.29% of 134913) affected shaders:
VGPRs: 231232 -> 231880 (+0.28%)
CodeSize: 24738224 -> 24718008 (-0.08%); split: -0.08%, +0.00%
MaxWaves: 93120 -> 93000 (-0.13%)
Instrs: 4540970 -> 4541062 (+0.00%); split: -0.01%, +0.01%
Latency: 49658353 -> 49641444 (-0.03%); split: -0.05%, +0.01%
InvThroughput: 9604328 -> 9603041 (-0.01%); split: -0.02%, +0.01%
VClause: 66497 -> 66498 (+0.00%)
SClause: 209530 -> 209532 (+0.00%); split: -0.01%, +0.01%
Copies: 276135 -> 276249 (+0.04%); split: -0.14%, +0.18%
PreSGPRs: 189409 -> 189415 (+0.00%)
PreVGPRs: 207368 -> 207458 (+0.04%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15450 >
2022-03-28 16:17:37 +02:00
Samuel Pitoiset
a42b6a4d39
radv: lower load_sample_mask_in in NIR
...
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15450 >
2022-03-28 16:17:34 +02:00
Samuel Pitoiset
8c51874af4
radv,aco: lower color exports in NIR
...
fossils-db (Sienna Cichlid):
Totals from 27108 (20.09% of 134913) affected shaders:
VGPRs: 1260608 -> 1261424 (+0.06%); split: -0.00%, +0.07%
CodeSize: 112795868 -> 112785892 (-0.01%); split: -0.05%, +0.04%
MaxWaves: 628608 -> 628448 (-0.03%); split: +0.00%, -0.03%
Instrs: 20750003 -> 20749314 (-0.00%); split: -0.01%, +0.00%
Latency: 288088081 -> 288015865 (-0.03%); split: -0.06%, +0.04%
InvThroughput: 53944847 -> 53961693 (+0.03%); split: -0.01%, +0.04%
VClause: 396463 -> 396467 (+0.00%); split: -0.02%, +0.02%
SClause: 842088 -> 842150 (+0.01%); split: -0.03%, +0.04%
Copies: 1244982 -> 1259026 (+1.13%); split: -0.01%, +1.14%
PreSGPRs: 1251949 -> 1251909 (-0.00%)
PreVGPRs: 1099647 -> 1100879 (+0.11%); split: -0.03%, +0.14%
fossils-db (Polaris10):
Totals from 23928 (17.60% of 135960) affected shaders:
SGPRs: 1751792 -> 1751024 (-0.04%); split: -0.05%, +0.01%
VGPRs: 1098964 -> 1098556 (-0.04%); split: -0.13%, +0.09%
CodeSize: 99893472 -> 99837940 (-0.06%); split: -0.06%, +0.00%
MaxWaves: 138322 -> 138306 (-0.01%); split: +0.03%, -0.04%
Instrs: 19213995 -> 19211980 (-0.01%); split: -0.02%, +0.01%
Latency: 273026926 -> 273109402 (+0.03%); split: -0.01%, +0.04%
InvThroughput: 111160907 -> 111195187 (+0.03%); split: -0.04%, +0.07%
VClause: 343058 -> 343097 (+0.01%); split: -0.02%, +0.03%
SClause: 802756 -> 802884 (+0.02%); split: -0.04%, +0.06%
Copies: 1729387 -> 1739208 (+0.57%); split: -0.04%, +0.61%
PreSGPRs: 1090264 -> 1090303 (+0.00%); split: -0.00%, +0.01%
PreVGPRs: 959490 -> 960600 (+0.12%); split: -0.04%, +0.15%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15263 >
2022-03-28 11:47:35 +00:00
Rhys Perry
1ead285d92
aco: fix RA validation of 16-bit fma_mix operands
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15562 >
2022-03-28 11:05:25 +01:00
Daniel Schürmann
007cb02db9
aco: use branch definition as scratch register for SSA lowering
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15505 >
2022-03-28 07:36:46 +00:00
Daniel Schürmann
2d1e6b756e
aco: remove 'high' parameter from can_use_opsel()
...
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551 >
2022-03-25 22:02:50 +00:00
Daniel Schürmann
b98a9dcc36
aco/optimizer: fix call to can_use_opsel() in apply_insert()
...
The definition index is -1.
Fixes: 54292e99c7 ('aco: optimize 32-bit extracts and inserts using SDWA ')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551 >
2022-03-25 22:02:50 +00:00
Boris Brezillon
de039537da
aco: Fix an MSVC warning
...
'warning C4804: '<<': unsafe use of type 'bool' in operation'
Fixes: 9934c86761 ("aco: use v_fma_mix to combine mul/add/fma input conversions")
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15518 >
2022-03-24 09:11:13 +00:00
Rhys Perry
543a5487cf
radv,aco: lower image descriptor loads in NIR
...
fossil-db (Sienna Cichlid):
Totals from 2926 (1.80% of 162293) affected shaders:
Instrs: 2315110 -> 2306644 (-0.37%); split: -0.37%, +0.00%
CodeSize: 12581592 -> 12546588 (-0.28%); split: -0.28%, +0.00%
VGPRs: 130216 -> 130208 (-0.01%)
SpillSGPRs: 477 -> 474 (-0.63%); split: -5.03%, +4.40%
Latency: 29686188 -> 29678804 (-0.02%); split: -0.05%, +0.02%
InvThroughput: 6926545 -> 6926286 (-0.00%); split: -0.02%, +0.02%
SClause: 73761 -> 72996 (-1.04%); split: -1.16%, +0.12%
Copies: 144068 -> 137279 (-4.71%); split: -4.78%, +0.07%
Branches: 47466 -> 47483 (+0.04%); split: -0.01%, +0.04%
PreSGPRs: 118042 -> 117377 (-0.56%); split: -1.34%, +0.77%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
15640e58d9
radv,aco: lower texture descriptor loads in NIR
...
fossil-db (Sienna Cichlid):
Totals from 39445 (24.30% of 162293) affected shaders:
MaxWaves: 875988 -> 875972 (-0.00%)
Instrs: 35372561 -> 35234909 (-0.39%); split: -0.41%, +0.03%
CodeSize: 190237480 -> 189379240 (-0.45%); split: -0.47%, +0.02%
VGPRs: 1889856 -> 1889928 (+0.00%); split: -0.00%, +0.01%
SpillSGPRs: 10764 -> 10857 (+0.86%); split: -2.04%, +2.91%
SpillVGPRs: 1891 -> 1907 (+0.85%); split: -0.32%, +1.16%
Scratch: 260096 -> 261120 (+0.39%)
Latency: 477701150 -> 477578466 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 87819847 -> 87830346 (+0.01%); split: -0.03%, +0.04%
VClause: 673353 -> 673829 (+0.07%); split: -0.04%, +0.11%
SClause: 1385396 -> 1366478 (-1.37%); split: -1.65%, +0.29%
Copies: 2327965 -> 2229134 (-4.25%); split: -4.58%, +0.34%
Branches: 906707 -> 906434 (-0.03%); split: -0.13%, +0.10%
PreSGPRs: 1874153 -> 1862698 (-0.61%); split: -1.34%, +0.73%
PreVGPRs: 1691382 -> 1691383 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
52f850238a
radv,aco: lower buffer descriptor loads in NIR
...
fossil-db (Sienna Cichlid):
Totals from 75420 (46.47% of 162293) affected shaders:
MaxWaves: 1878200 -> 1879228 (+0.05%); split: +0.06%, -0.00%
Instrs: 54021103 -> 54141370 (+0.22%); split: -0.04%, +0.26%
CodeSize: 287813520 -> 288293352 (+0.17%); split: -0.04%, +0.21%
VGPRs: 3267576 -> 3266296 (-0.04%); split: -0.04%, +0.00%
SpillSGPRs: 10445 -> 10904 (+4.39%); split: -0.31%, +4.70%
SpillVGPRs: 1818 -> 1811 (-0.39%); split: -1.05%, +0.66%
Scratch: 955392 -> 954368 (-0.11%)
Latency: 563477854 -> 562131282 (-0.24%); split: -0.31%, +0.08%
InvThroughput: 111860104 -> 111553968 (-0.27%); split: -0.30%, +0.02%
VClause: 958432 -> 961415 (+0.31%); split: -0.34%, +0.65%
SClause: 1917415 -> 1926952 (+0.50%); split: -0.69%, +1.19%
Copies: 3812945 -> 3916758 (+2.72%); split: -0.27%, +2.99%
Branches: 1611235 -> 1612022 (+0.05%); split: -0.04%, +0.08%
PreSGPRs: 3095505 -> 3126580 (+1.00%); split: -0.06%, +1.07%
PreVGPRs: 2773011 -> 2773013 (+0.00%)
Most regressions seem to be because ACO's convert_pointer_to_64_bit()
can't be CSE'd with radv_nir_apply_pipeline_layout()'s
convert_pointer_to_64_bit(). This should be improved by later commits.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
22050fcf9d
radv,aco: lower vulkan_resource_index in NIR
...
fossil-db (Sienna Cichlid):
Totals from 31338 (19.31% of 162293) affected shaders:
MaxWaves: 758634 -> 758616 (-0.00%)
Instrs: 26398289 -> 26378282 (-0.08%); split: -0.09%, +0.01%
CodeSize: 141048208 -> 140971060 (-0.05%); split: -0.07%, +0.01%
VGPRs: 1373656 -> 1373736 (+0.01%)
SpillSGPRs: 9944 -> 9924 (-0.20%); split: -0.24%, +0.04%
SpillVGPRs: 1892 -> 1898 (+0.32%); split: -0.95%, +1.27%
Latency: 308570144 -> 308528462 (-0.01%); split: -0.03%, +0.02%
InvThroughput: 57698072 -> 57684901 (-0.02%); split: -0.07%, +0.04%
VClause: 440357 -> 440602 (+0.06%); split: -0.02%, +0.08%
SClause: 974724 -> 973315 (-0.14%); split: -0.18%, +0.04%
Copies: 1944925 -> 1945103 (+0.01%); split: -0.11%, +0.12%
Branches: 799444 -> 799461 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1619860 -> 1619233 (-0.04%); split: -0.05%, +0.02%
PreVGPRs: 1252813 -> 1252863 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
e6672f6fd1
radv: move radv_declare_shader_args() out of shader_variant_compile()
...
Declaring them earlier will allow us to access them in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
53996cf979
aco: implement load_{scalar,vector}_arg_amd and load_smem_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Rhys Perry
177b54ebe9
aco/tests: add v_fma_mix tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
1092f37805
aco: use v_fma_mix to combine mul/add/fma output conversions
...
fossil-db (Sienna Cichlid):
Totals from 42 (0.03% of 134913) affected shaders:
CodeSize: 596904 -> 596332 (-0.10%); split: -0.10%, +0.00%
Instrs: 110194 -> 109902 (-0.26%)
Latency: 1205239 -> 1204915 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 189697 -> 189375 (-0.17%)
VClause: 1365 -> 1366 (+0.07%)
Copies: 5429 -> 5414 (-0.28%); split: -0.33%, +0.06%
Branches: 4034 -> 4026 (-0.20%)
fossil-db (Navi):
Totals from 42 (0.03% of 134913) affected shaders:
CodeSize: 596044 -> 595488 (-0.09%); split: -0.10%, +0.00%
Instrs: 110845 -> 110540 (-0.28%)
Latency: 1206131 -> 1205747 (-0.03%)
InvThroughput: 190178 -> 189809 (-0.19%)
VClause: 1372 -> 1370 (-0.15%); split: -0.29%, +0.15%
Copies: 5671 -> 5641 (-0.53%); split: -0.56%, +0.04%
Branches: 4033 -> 4025 (-0.20%)
fossil-db (Vega):
Totals from 42 (0.03% of 135048) affected shaders:
CodeSize: 605824 -> 605352 (-0.08%); split: -0.08%, +0.00%
Instrs: 115975 -> 115706 (-0.23%)
Latency: 1399845 -> 1398912 (-0.07%)
InvThroughput: 489901 -> 489442 (-0.09%)
VClause: 1314 -> 1311 (-0.23%); split: -0.38%, +0.15%
Copies: 9673 -> 9666 (-0.07%); split: -0.12%, +0.05%
Branches: 4025 -> 4024 (-0.02%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
21304b772c
aco: apply clamp to v_fma_mix
...
fossil-db (Sienna Cichlid):
Totals from 2536 (1.88% of 134913) affected shaders:
CodeSize: 17314568 -> 17282960 (-0.18%)
Instrs: 3191438 -> 3187487 (-0.12%)
Latency: 59465090 -> 59407885 (-0.10%)
InvThroughput: 10271466 -> 10260512 (-0.11%)
fossil-db (Navi):
Totals from 2512 (1.86% of 134913) affected shaders:
CodeSize: 17194700 -> 17173396 (-0.12%)
Instrs: 3215093 -> 3212430 (-0.08%)
Latency: 60174315 -> 60142593 (-0.05%)
InvThroughput: 9491103 -> 9483979 (-0.08%)
fossil-db (Vega):
Totals from 2512 (1.86% of 135048) affected shaders:
CodeSize: 17186776 -> 17165472 (-0.12%)
Instrs: 3311166 -> 3308503 (-0.08%)
Latency: 65737409 -> 65716096 (-0.03%)
InvThroughput: 21735857 -> 21719792 (-0.07%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
35196b6d89
aco: combine add/mul as v_fma_mix into fma
...
fossil-db (Sienna Cichlid):
Totals from 7345 (5.44% of 134913) affected shaders:
CodeSize: 73840060 -> 73768936 (-0.10%); split: -0.10%, +0.00%
Instrs: 13701603 -> 13684183 (-0.13%); split: -0.13%, +0.00%
Latency: 185389373 -> 185306538 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 33785020 -> 33757593 (-0.08%); split: -0.08%, +0.00%
VClause: 237337 -> 237338 (+0.00%)
SClause: 485728 -> 485720 (-0.00%)
Copies: 935900 -> 935279 (-0.07%); split: -0.07%, +0.00%
Branches: 480721 -> 480722 (+0.00%)
fossil-db (Navi):
Totals from 10649 (7.89% of 134913) affected shaders:
VGPRs: 756624 -> 756516 (-0.01%); split: -0.02%, +0.01%
CodeSize: 92156580 -> 91707900 (-0.49%); split: -0.49%, +0.00%
MaxWaves: 159402 -> 159476 (+0.05%); split: +0.07%, -0.02%
Instrs: 17155827 -> 17070449 (-0.50%); split: -0.50%, +0.00%
Latency: 246296456 -> 245487120 (-0.33%); split: -0.33%, +0.00%
InvThroughput: 41438159 -> 41117424 (-0.77%); split: -0.77%, +0.00%
VClause: 323790 -> 323867 (+0.02%); split: -0.00%, +0.03%
SClause: 612077 -> 612034 (-0.01%); split: -0.01%, +0.00%
Copies: 1103012 -> 1102775 (-0.02%); split: -0.03%, +0.01%
Branches: 555893 -> 555896 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 824372 -> 824378 (+0.00%)
PreVGPRs: 740390 -> 740363 (-0.00%); split: -0.01%, +0.01%
fossil-db (Vega):
Totals from 10950 (8.11% of 135048) affected shaders:
SGPRs: 1034528 -> 1034560 (+0.00%)
VGPRs: 794092 -> 794104 (+0.00%); split: -0.01%, +0.01%
CodeSize: 94409768 -> 93955568 (-0.48%); split: -0.48%, +0.00%
MaxWaves: 38950 -> 38939 (-0.03%); split: +0.00%, -0.03%
Instrs: 18162637 -> 18070934 (-0.50%); split: -0.51%, +0.00%
Latency: 291718455 -> 290772451 (-0.32%); split: -0.32%, +0.00%
InvThroughput: 109114674 -> 108489767 (-0.57%); split: -0.57%, +0.00%
VClause: 334498 -> 334579 (+0.02%); split: -0.01%, +0.03%
SClause: 628871 -> 628825 (-0.01%); split: -0.01%, +0.00%
Copies: 1674477 -> 1674850 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 834800 -> 834802 (+0.00%)
PreVGPRs: 750460 -> 750415 (-0.01%); split: -0.01%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
9934c86761
aco: use v_fma_mix to combine mul/add/fma input conversions
...
fossil-db (Sienna Cichlid):
Totals from 11558 (8.57% of 134913) affected shaders:
VGPRs: 829392 -> 825200 (-0.51%); split: -0.52%, +0.02%
SpillSGPRs: 7845 -> 8399 (+7.06%)
CodeSize: 101822704 -> 101677172 (-0.14%); split: -0.25%, +0.11%
MaxWaves: 172216 -> 173182 (+0.56%); split: +0.59%, -0.03%
Instrs: 19061343 -> 18883450 (-0.93%); split: -0.93%, +0.00%
Latency: 256011590 -> 255177378 (-0.33%); split: -0.39%, +0.06%
InvThroughput: 46104438 -> 45604059 (-1.09%); split: -1.12%, +0.04%
VClause: 352211 -> 351948 (-0.07%); split: -0.21%, +0.13%
SClause: 676506 -> 676961 (+0.07%); split: -0.04%, +0.11%
Copies: 1246571 -> 1237745 (-0.71%); split: -0.97%, +0.26%
Branches: 626229 -> 626241 (+0.00%); split: -0.02%, +0.03%
PreSGPRs: 882176 -> 888853 (+0.76%); split: -0.00%, +0.76%
PreVGPRs: 796705 -> 792304 (-0.55%); split: -0.56%, +0.00%
fossil-db (Navi):
Totals from 11558 (8.57% of 134913) affected shaders:
VGPRs: 803900 -> 798660 (-0.65%); split: -0.73%, +0.08%
SpillSGPRs: 7894 -> 8492 (+7.58%); split: -0.10%, +7.68%
CodeSize: 96892596 -> 97134716 (+0.25%); split: -0.05%, +0.29%
MaxWaves: 181454 -> 183014 (+0.86%); split: +0.94%, -0.08%
Instrs: 18186813 -> 18093994 (-0.51%); split: -0.56%, +0.05%
Latency: 253385909 -> 253325528 (-0.02%); split: -0.15%, +0.12%
InvThroughput: 43315355 -> 42805541 (-1.18%); split: -1.33%, +0.15%
VClause: 338755 -> 338535 (-0.06%); split: -0.16%, +0.10%
SClause: 656561 -> 656829 (+0.04%); split: -0.07%, +0.11%
Copies: 1162235 -> 1153558 (-0.75%); split: -1.07%, +0.32%
Branches: 588536 -> 588542 (+0.00%); split: -0.03%, +0.03%
PreSGPRs: 854849 -> 861640 (+0.79%); split: -0.00%, +0.80%
PreVGPRs: 783401 -> 779031 (-0.56%); split: -0.56%, +0.00%
fossil-db (Vega):
Totals from 11516 (8.53% of 135048) affected shaders:
SGPRs: 1072128 -> 1076288 (+0.39%); split: -0.01%, +0.40%
VGPRs: 821312 -> 818124 (-0.39%); split: -0.43%, +0.04%
SpillSGPRs: 11952 -> 12677 (+6.07%)
CodeSize: 96378496 -> 96707596 (+0.34%); split: -0.04%, +0.38%
MaxWaves: 42614 -> 42883 (+0.63%); split: +0.68%, -0.04%
Instrs: 18672844 -> 18600274 (-0.39%); split: -0.44%, +0.05%
Latency: 296658786 -> 296338296 (-0.11%); split: -0.21%, +0.10%
InvThroughput: 111665547 -> 111283559 (-0.34%); split: -0.40%, +0.06%
VClause: 343001 -> 342826 (-0.05%); split: -0.14%, +0.09%
SClause: 646684 -> 646657 (-0.00%); split: -0.05%, +0.04%
Copies: 1715316 -> 1712895 (-0.14%); split: -0.53%, +0.39%
PreSGPRs: 850737 -> 856543 (+0.68%); split: -0.04%, +0.72%
PreVGPRs: 775293 -> 772215 (-0.40%); split: -0.41%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
eeef1bbe65
aco: refactor selection of mad/fma
...
In the future, whether we need to use fma will depend on which
multiplication is chosen.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
e12bee3cb7
aco: improve support for v_fma_mix
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
79c8740c6e
aco: fix fp16 opcode definitions
...
The v_fma_mix optimizations assume v_cvt_f16_f32 and v_mul_f16 use a v2b
definition.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Rhys Perry
c4cf92cad7
radv,aco,ac/llvm: fix indirect dispatches on the compute queue on GFX7-10
...
Since neither PKT3_LOAD_SH_REG_INDEX nor PKT3_COPY_DATA work with compute
queues on GFX7-10, we have to load the dispatch size from memory in the
shader.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15064 >
2022-03-14 19:54:36 +00:00
Rhys Perry
973967c49d
aco: split and recombine unaligned sgpr inputs
...
An example is the num_work_groups argument. Fixes invalid assembly with
func.compute.num-workgroups.basic.q0
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15064 >
2022-03-14 19:54:36 +00:00
Daniel Schürmann
70aea6b41a
aco/ra: refactor collect_vars() to return a sorted vector
...
The vector of IDs is sorted with decreasing sizes,
and by increasing assigned registers.
This decouples register assingment from ssa IDs.
Totals from 12694 (9.41% of 134913) affected shaders: (GFX10.3)
VGPRs: 757864 -> 757848 (-0.00%); split: -0.00%, +0.00%
CodeSize: 72350540 -> 72348688 (-0.00%); split: -0.02%, +0.02%
MaxWaves: 237018 -> 237020 (+0.00%); split: +0.00%, -0.00%
Instrs: 13545494 -> 13544699 (-0.01%); split: -0.03%, +0.02%
Latency: 148539203 -> 148533292 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 30319086 -> 30320382 (+0.00%); split: -0.01%, +0.01%
VClause: 326875 -> 327028 (+0.05%); split: -0.05%, +0.09%
SClause: 479833 -> 479837 (+0.00%); split: -0.00%, +0.00%
Copies: 862152 -> 860914 (-0.14%); split: -0.43%, +0.28%
Branches: 317775 -> 317777 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11526 >
2022-03-14 08:32:10 +00:00