Rhys Perry
ea92aea9f2
aco: turn v_mov_b32 into addition to create VOPD instructions
...
fossil-db (navi31, wave32):
Totals from 15655 (19.76% of 79242) affected shaders:
Instrs: 10699119 -> 10688239 (-0.10%); split: -0.11%, +0.00%
CodeSize: 61290308 -> 61288596 (-0.00%); split: -0.01%, +0.00%
Latency: 89159743 -> 89150355 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16966295 -> 16955427 (-0.06%); split: -0.07%, +0.00%
VALU: 5484626 -> 5473993 (-0.19%); split: -0.20%, +0.00%
VOPD: 1446725 -> 1457358 (+0.73%); split: +0.74%, -0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
65dfb27f8f
aco: swap operands to create VOPD instructions
...
fossil-db (navi31, wave32):
Totals from 61565 (77.69% of 79242) affected shaders:
Instrs: 34874995 -> 34439344 (-1.25%); split: -1.27%, +0.02%
CodeSize: 200611536 -> 200564028 (-0.02%); split: -0.12%, +0.10%
Latency: 242520024 -> 242120510 (-0.16%); split: -0.28%, +0.11%
InvThroughput: 50236383 -> 49588742 (-1.29%); split: -1.31%, +0.02%
VClause: 713308 -> 712902 (-0.06%); split: -0.07%, +0.01%
SClause: 1184865 -> 1184620 (-0.02%); split: -0.03%, +0.01%
VALU: 18235068 -> 17803847 (-2.36%); split: -2.37%, +0.00%
VOPD: 3930904 -> 4362125 (+10.97%); split: +10.99%, -0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Rhys Perry
96d8b7c59c
aco: refactor create_vopd_instruction
...
Prepare for operand swapping.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485 >
2024-02-08 12:15:23 +00:00
Marek Olšák
72948d9ff9
radeonsi,aco: remove the VS prolog
...
The upside is that this removes 600 lines of code. The downside is
that if instance divisors are used, we will compile the VS on demand.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27120 >
2024-02-07 09:50:53 +00:00
Rhys Perry
4f6aac1589
aco/tests: fix to_hw_instr.swap_linear_vgpr
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27436 >
2024-02-06 15:40:58 +00:00
Rhys Perry
174e37afb9
aco: fix >8 byte linear vgpr copies
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27436 >
2024-02-06 15:40:58 +00:00
Rhys Perry
48461c0d9e
aco: enable VOPD scheduler
...
fossil-db (navi31, wave32):
Totals from 75237 (94.95% of 79242) affected shaders:
Instrs: 40383337 -> 36522907 (-9.56%); split: -9.56%, +0.00%
CodeSize: 209282312 -> 209883000 (+0.29%); split: -0.18%, +0.46%
Latency: 256898944 -> 257834204 (+0.36%); split: -0.23%, +0.60%
InvThroughput: 59105835 -> 52725948 (-10.79%); split: -10.90%, +0.11%
VClause: 759487 -> 758151 (-0.18%); split: -0.18%, +0.01%
SClause: 1264895 -> 1263230 (-0.13%); split: -0.14%, +0.01%
VALU: 22886416 -> 18837085 (-17.69%)
VOPD: 0 -> 4049331 (+inf%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:15 +00:00
Rhys Perry
75a76ec3fd
aco: implement VOPD scheduler
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:15 +00:00
Rhys Perry
1fb79b4aa2
aco: refactor schedule_ilp main loop
...
Besides switching to "ctx.active_mask" as the condition, this is basically
the same.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:14 +00:00
Rhys Perry
c66d42b9ed
aco: add VOPD statistic
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:14 +00:00
Rhys Perry
6547e17e60
aco: add VOPD format
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:14 +00:00
Daniel Schürmann
4fa27845e5
aco/insert_exec_mask: Reduce latency when switching to WQM.
...
Change pattern:
s_mov_b64 s[0:1], exec s_mov_b64 s[0:1], exec
s_wqm_b64 exec, s[0:1] -> s_wqm_b64 exec, exec
Totals from 16667 (21.03% of 79242) affected shaders: (GFX11)
Instrs: 11317502 -> 11307484 (-0.09%); split: -0.09%, +0.00%
CodeSize: 60194272 -> 60155088 (-0.07%); split: -0.07%, +0.00%
Latency: 94345873 -> 94338374 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 13568872 -> 13568683 (-0.00%); split: -0.00%, +0.00%
Copies: 808334 -> 808332 (-0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112 >
2024-02-02 18:55:15 +00:00
Daniel Schürmann
e89977ff71
aco: always terminate quads if they have been demoted entirely
...
Previously, quads got only terminated in top-level control flow.
This patch makes the behavior consistent.
Totals from 7811 (9.86% of 79242) affected shaders: (GFX11)
Instrs: 7859667 -> 7850757 (-0.11%); split: -0.18%, +0.07%
CodeSize: 41642280 -> 41611836 (-0.07%); split: -0.13%, +0.06%
Latency: 73692815 -> 73707588 (+0.02%); split: -0.02%, +0.04%
InvThroughput: 10672160 -> 10672323 (+0.00%); split: -0.01%, +0.01%
VClause: 137478 -> 137469 (-0.01%); split: -0.02%, +0.02%
SClause: 314905 -> 314924 (+0.01%); split: -0.19%, +0.20%
Copies: 587014 -> 576039 (-1.87%); split: -2.10%, +0.23%
Branches: 213101 -> 213123 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 313588 -> 313355 (-0.07%); split: -0.09%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112 >
2024-02-02 18:55:15 +00:00
Daniel Schürmann
a42b83e3fb
aco/insert_exec_mask: tiny refactor
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112 >
2024-02-02 18:55:15 +00:00
Daniel Schürmann
c309d20172
aco/insert_exec_mask: Fix unconditional demote at top-level control flow.
...
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27362 >
2024-01-31 13:50:46 +00:00
Rhys Perry
6dc182b6b2
aco: fix labelling of s_not with constant
...
Fixes RADV compilation of a Cyberpunk 2077 RT pipeline with
PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: dfaa3c0af6 ("aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27194 >
2024-01-24 17:25:15 +00:00
Georg Lehmann
4c74077b62
aco: implement rotate
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:40 +00:00
Georg Lehmann
b90ec971d7
aco/gfx11: resolve VcmpxPermlaneHazard for v_permlane64
...
The GFX11 ISA docs description of this hazard says it's about v_permlane in
general, not just v_permlane(x)16.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:40 +00:00
Georg Lehmann
19876386e2
aco/gfx11: use v_nop to resolve VcmpxPermlaneHazard
...
The GFX11 ISA doc explicitly recommends using v_nop in
7.2.8. PERMLANE Specific Rules.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:40 +00:00
Georg Lehmann
a626f765b5
aco: support v_permlane64_b32
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:40 +00:00
Georg Lehmann
c67d4a75ba
aco: validate v_permlane opsel correctly
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:40 +00:00
Georg Lehmann
bc57f14c2d
aco: fix printing dpp8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118 >
2024-01-24 16:38:39 +00:00
Georg Lehmann
0a03cf5b3c
aco: remove boolean shuffle isel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27116 >
2024-01-19 20:13:34 +00:00
Georg Lehmann
6b031daf16
aco: implement as_uniform and ballot_relaxed
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27116 >
2024-01-19 20:13:34 +00:00
Georg Lehmann
74fc2e287f
aco: stop scheduling at p_logical_end
...
No Foz-DB changes, but this fixes some issues when the spiller inserts
scratch loads after p_logical_end for p_return.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27119 >
2024-01-19 17:04:28 +00:00
Daniel Schürmann
e3098bb232
aco: give spiller more room to assign spilled SGPRs to VGPRs
...
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.
Use twice as many spill slots as upper bound.
Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011 >
2024-01-19 14:15:27 +00:00
Samuel Pitoiset
7a0b343495
aco: silent checking if clrxdisasm is available
...
Otherwise, this is reported a ton of times and the CI output is
unusable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27136 >
2024-01-19 07:33:56 +00:00
Georg Lehmann
e36235e6d5
aco: reassign split vector to SOPC
...
Foz-DB Navi21:
Totals from 2669 (3.42% of 78112) affected shaders:
Instrs: 3570360 -> 3562026 (-0.23%)
CodeSize: 19049784 -> 19017092 (-0.17%)
Latency: 25343555 -> 25337767 (-0.02%); split: -0.03%, +0.00%
InvThroughput: 6191344 -> 6191079 (-0.00%); split: -0.01%, +0.00%
VClause: 90803 -> 90802 (-0.00%)
SClause: 114858 -> 114842 (-0.01%); split: -0.03%, +0.01%
Copies: 269287 -> 260999 (-3.08%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27046 >
2024-01-13 11:03:09 +00:00
Daniel Schürmann
09413ff745
aco/insert_exec_mask: only create loop phis for exec mask if necessary
...
Totals from 195 (0.25% of 79242) affected shaders: (GFX11)
Instrs: 476457 -> 476031 (-0.09%); split: -0.23%, +0.14%
CodeSize: 2453964 -> 2452108 (-0.08%); split: -0.23%, +0.16%
SpillSGPRs: 944 -> 913 (-3.28%); split: -3.39%, +0.11%
SpillVGPRs: 838 -> 835 (-0.36%); split: -0.95%, +0.60%
Latency: 10811026 -> 10810125 (-0.01%); split: -0.08%, +0.07%
InvThroughput: 2276677 -> 2276698 (+0.00%); split: -0.12%, +0.12%
VClause: 9223 -> 9233 (+0.11%); split: -0.10%, +0.21%
SClause: 9025 -> 9005 (-0.22%); split: -0.38%, +0.16%
Copies: 67419 -> 67382 (-0.05%); split: -0.97%, +0.92%
PreSGPRs: 10830 -> 10668 (-1.50%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937 >
2024-01-12 09:05:15 +00:00
Daniel Schürmann
e83d8e1366
aco/insert_exec_mask: replace phi for loop restore mask with explicit copies
...
Totals from 1785 (2.25% of 79242) affected shaders: (GFX11)
Instrs: 6787574 -> 6787041 (-0.01%); split: -0.01%, +0.00%
CodeSize: 34906500 -> 34904704 (-0.01%); split: -0.01%, +0.01%
SpillSGPRs: 5848 -> 5816 (-0.55%)
Latency: 88616877 -> 88617209 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 16644948 -> 16644717 (-0.00%); split: -0.00%, +0.00%
VClause: 141122 -> 141121 (-0.00%)
SClause: 178929 -> 178906 (-0.01%); split: -0.03%, +0.02%
Copies: 569444 -> 569081 (-0.06%); split: -0.09%, +0.03%
Branches: 186980 -> 186961 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 133648 -> 133369 (-0.21%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937 >
2024-01-12 09:05:15 +00:00
Daniel Schürmann
d375d297cf
aco/insert_exec_mask: unify exec restore code after divergent control flow
...
No fossil-db changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937 >
2024-01-12 09:05:15 +00:00
Georg Lehmann
fddd866b27
aco: apply fneg/fabs to VOP3P
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919 >
2024-01-08 13:26:19 +00:00
Georg Lehmann
72ac6a5251
aco: clean up fneg/fabs combining
...
This technically fixes some bugs with fneg(fneg(a)) and fabs(fneg(a)), but
those shouldn't be present in the input NIR.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919 >
2024-01-08 13:26:19 +00:00
Georg Lehmann
a90d154f62
aco: fix applying input modifiers to DPP8
...
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919 >
2024-01-08 13:26:19 +00:00
Georg Lehmann
1d61770dd5
aco: apply packed fneg commutatively
...
If only one component is negated, isel does not ensure that the constant
operand is in src1 because then the negate was a fmul, not a fneg.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919 >
2024-01-08 13:26:19 +00:00
Daniel Schürmann
dce695b24f
aco: refactor and speed-up dead code analysis
...
Assuming that no loop header phis are dead code,
we can perform the dead code analysis in a single iteration.
Totals from 25 (0.03% of 79330) affected shaders: (GFX11)
MaxWaves: 664 -> 662 (-0.30%)
Instrs: 487618 -> 488822 (+0.25%)
CodeSize: 2451548 -> 2459756 (+0.33%)
VGPRs: 1296 -> 1332 (+2.78%)
Latency: 2337256 -> 2338098 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 560682 -> 576158 (+2.76%)
VClause: 15782 -> 15790 (+0.05%)
Copies: 37905 -> 38731 (+2.18%)
PreVGPRs: 1124 -> 1156 (+2.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26901 >
2024-01-08 09:43:53 +00:00
Daniel Schürmann
023e78b4d7
aco: add new post-RA scheduler for ILP
...
Totals from 77247 (97.37% of 79330) affected shaders: (GFX11)
Instrs: 44371374 -> 43215723 (-2.60%); split: -2.64%, +0.03%
CodeSize: 227819532 -> 223188224 (-2.03%); split: -2.06%, +0.03%
Latency: 301016823 -> 290147626 (-3.61%); split: -3.70%, +0.09%
InvThroughput: 48551749 -> 47646212 (-1.87%); split: -1.88%, +0.01%
VClause: 870581 -> 834655 (-4.13%); split: -4.13%, +0.00%
SClause: 1487061 -> 1340851 (-9.83%); split: -9.83%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676 >
2024-01-06 11:30:42 +00:00
Daniel Schürmann
72a5c659d4
aco: form clauses for LDS instructions
...
No fossil-db changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676 >
2024-01-06 11:30:42 +00:00
Daniel Schürmann
8f16745821
aco: fix should_form_clause() for memory instructions without operands
...
In particular, this applies to s_memtime and s_memrealtime.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676 >
2024-01-06 11:30:41 +00:00
Rhys Perry
ae54cbeb3f
nir: remove sad_u8x4
...
All uses of this can be replaced with msad_4x8.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907 >
2024-01-05 18:55:22 +00:00
Rhys Perry
1410735a62
aco: implement msad_4x8
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907 >
2024-01-05 18:55:22 +00:00
Rhys Perry
3009dcd102
aco: correctly set min/max_subgroup_size for wave32-as-wave64
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26894 >
2024-01-05 17:35:48 +00:00
Friedrich Vock
1e3541728b
radv,aco: Convert 1D ray launches to 2D
...
Because we use unaligned dispatches, 1D launches only use 8 threads per
wave. Converting to 2D and fixing up launch IDs in the prolog
significantly increases occupancy.
Gives ~30% uplift in Ghostwire Tokyo.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26105 >
2024-01-05 17:08:05 +00:00
Georg Lehmann
71edf4de5e
aco/gfx12: implement broadcast dmask shrink behavior
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26897 >
2024-01-05 12:03:54 +00:00
Georg Lehmann
4a6ee2c483
aco: shrink buffer stores with undef/zero components
...
Buffer stores store 0 like image stores for unspecified components.
Foz-DB Navi21:
Totals from 91 (0.11% of 79330) affected shaders:
Instrs: 63327 -> 63121 (-0.33%)
CodeSize: 315312 -> 314440 (-0.28%); split: -0.28%, +0.00%
VGPRs: 3144 -> 3120 (-0.76%)
Latency: 441424 -> 441300 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 65501 -> 65130 (-0.57%)
Copies: 6197 -> 5999 (-3.20%)
PreVGPRs: 2197 -> 2182 (-0.68%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26897 >
2024-01-05 12:03:54 +00:00
Rhys Perry
cad2c0915d
aco/tests: use more raw strings
...
Python 3.12 started giving a SyntaxWarning for unrecognized escapes such
as "\w". This might become a SyntaxError in a future python version.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26850 >
2024-01-03 13:33:52 +00:00
Georg Lehmann
9ecfd7919b
aco: optimize 32bit fsign by using fmulz with Inf
...
2 instruction fsign with the power of cursed DX9 floating point rules.
Foz-DB Navi31:
Totals from 3803 (4.86% of 78196) affected shaders:
Instrs: 8436366 -> 8412549 (-0.28%); split: -0.29%, +0.00%
CodeSize: 43174284 -> 43114676 (-0.14%); split: -0.14%, +0.01%
SpillSGPRs: 3241 -> 3247 (+0.19%)
Latency: 66333841 -> 66287361 (-0.07%); split: -0.08%, +0.01%
InvThroughput: 10331902 -> 10316916 (-0.15%); split: -0.15%, +0.01%
VClause: 165455 -> 165472 (+0.01%); split: -0.01%, +0.02%
SClause: 242352 -> 242335 (-0.01%); split: -0.02%, +0.01%
Copies: 604086 -> 605781 (+0.28%); split: -0.04%, +0.32%
Branches: 214017 -> 214013 (-0.00%)
PreSGPRs: 209413 -> 209726 (+0.15%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26765 >
2024-01-02 13:07:30 +01:00
Eric Engestrom
7e8db6aedf
meson: always define {,DRAW_}LLVM_AVAILABLE one way or the other
...
With the usual benefits of `#if` instead of `#ifdef` (mostly the fact
that typos can be build failures instead of silently being interpreted
as if 0).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3863 >
2023-12-24 10:01:39 +00:00
Daniel Schürmann
42e9ba1c70
aco: remove VCCZ and EXECZ register handling
...
We don't use these registers and since RDNA3 removed the explicit usage,
it is unlikely that we will properly support them in the future.
Removing the registers from the ACO IR prevents accidentally using them
without proper support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26664 >
2023-12-14 20:08:28 +00:00
Daniel Schürmann
dd7b6898e6
radv: fix number of physical SGPRs on GFX10+
...
This change has no effect.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26521 >
2023-12-11 10:39:51 +00:00