Commit graph

2856 commits

Author SHA1 Message Date
Daniel Schürmann
4fa27845e5 aco/insert_exec_mask: Reduce latency when switching to WQM.
Change pattern:
s_mov_b64 s[0:1], exec         s_mov_b64 s[0:1], exec
s_wqm_b64 exec, s[0:1]   ->    s_wqm_b64 exec, exec

Totals from 16667 (21.03% of 79242) affected shaders: (GFX11)

Instrs: 11317502 -> 11307484 (-0.09%); split: -0.09%, +0.00%
CodeSize: 60194272 -> 60155088 (-0.07%); split: -0.07%, +0.00%
Latency: 94345873 -> 94338374 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 13568872 -> 13568683 (-0.00%); split: -0.00%, +0.00%
Copies: 808334 -> 808332 (-0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112>
2024-02-02 18:55:15 +00:00
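Note on the pattern in the commit above: s_wqm_b64 produces a whole-quad-mode mask, in which every group of four lanes is fully enabled if any lane of that group is enabled. The following is a minimal Python sketch of that mask computation only. The function name `wqm64` is hypothetical, and the sketch does not model SCC or instruction timing.

```python
def wqm64(mask: int) -> int:
    """Model the whole-quad-mode mask of a 64-lane wave (illustration only)."""
    result = 0
    for quad in range(16):                 # 64 lanes / 4 lanes per quad
        quad_bits = 0xF << (quad * 4)      # the four lanes of this quad
        if mask & quad_bits:               # is any lane of the quad active?
            result |= quad_bits            # then enable the whole quad
    return result


exec_mask = 0x21                           # lanes 0 and 5 active
print(hex(wqm64(exec_mask)))               # 0xff: quads 0 and 1 fully enabled
```

Both forms in the pattern feed the same value into s_wqm_b64; the right-hand form reads exec directly instead of the copy that s_mov_b64 has just written.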
Daniel Schürmann
e89977ff71 aco: always terminate quads if they have been demoted entirely
Previously, quads were only terminated in top-level control flow.
This patch makes the behavior consistent.

Totals from 7811 (9.86% of 79242) affected shaders: (GFX11)

Instrs: 7859667 -> 7850757 (-0.11%); split: -0.18%, +0.07%
CodeSize: 41642280 -> 41611836 (-0.07%); split: -0.13%, +0.06%
Latency: 73692815 -> 73707588 (+0.02%); split: -0.02%, +0.04%
InvThroughput: 10672160 -> 10672323 (+0.00%); split: -0.01%, +0.01%
VClause: 137478 -> 137469 (-0.01%); split: -0.02%, +0.02%
SClause: 314905 -> 314924 (+0.01%); split: -0.19%, +0.20%
Copies: 587014 -> 576039 (-1.87%); split: -2.10%, +0.23%
Branches: 213101 -> 213123 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 313588 -> 313355 (-0.07%); split: -0.09%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112>
2024-02-02 18:55:15 +00:00
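Note on reading the statistics lines in these commits: each line has the shape "Name: before -> after (change%)", where the percentage is the relative change from before to after. A small Python sketch that parses one such line and recomputes the percentage is shown here. The names `STAT_RE` and `parse_stat` are hypothetical helpers for illustration and are not part of any Mesa tooling.

```python
import re

# Matches e.g. "Copies: 587014 -> 576039 (-1.87%); split: -2.10%, +0.23%"
STAT_RE = re.compile(r"^(\w+): (\d+) -> (\d+) \(([+-]\d+\.\d+)%\)")


def parse_stat(line: str):
    """Return (name, before, after, reported_pct, recomputed_pct)."""
    match = STAT_RE.match(line)
    if match is None:
        raise ValueError(f"not a statistics line: {line!r}")
    name = match.group(1)
    before = int(match.group(2))
    after = int(match.group(3))
    reported = float(match.group(4))
    # Relative change in percent, rounded to two decimals like the log.
    recomputed = round((after - before) / before * 100.0, 2)
    return name, before, after, reported, recomputed


print(parse_stat("Copies: 587014 -> 576039 (-1.87%); split: -2.10%, +0.23%"))
```

The trailing "split" values are ignored by this sketch.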
Daniel Schürmann
a42b83e3fb aco/insert_exec_mask: tiny refactor
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27112>
2024-02-02 18:55:15 +00:00
Daniel Schürmann
c309d20172 aco/insert_exec_mask: Fix unconditional demote at top-level control flow.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27362>
2024-01-31 13:50:46 +00:00
Rhys Perry
6dc182b6b2 aco: fix labelling of s_not with constant
Fixes RADV compilation of a Cyberpunk 2077 RT pipeline with
PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: dfaa3c0af6 ("aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27194>
2024-01-24 17:25:15 +00:00
Georg Lehmann
4c74077b62 aco: implement rotate
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
b90ec971d7 aco/gfx11: resolve VcmpxPermlaneHazard for v_permlane64
The GFX11 ISA docs description of this hazard says it's about v_permlane in
general, not just v_permlane(x)16.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
19876386e2 aco/gfx11: use v_nop to resolve VcmpxPermlaneHazard
The GFX11 ISA doc explicitly recommends using v_nop in
7.2.8. PERMLANE Specific Rules.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
a626f765b5 aco: support v_permlane64_b32
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
c67d4a75ba aco: validate v_permlane opsel correctly
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
bc57f14c2d aco: fix printing dpp8
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:39 +00:00
Georg Lehmann
0a03cf5b3c aco: remove boolean shuffle isel
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27116>
2024-01-19 20:13:34 +00:00
Georg Lehmann
6b031daf16 aco: implement as_uniform and ballot_relaxed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27116>
2024-01-19 20:13:34 +00:00
Georg Lehmann
74fc2e287f aco: stop scheduling at p_logical_end
No Foz-DB changes, but this fixes some issues when the spiller inserts
scratch loads after p_logical_end for p_return.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27119>
2024-01-19 17:04:28 +00:00
Daniel Schürmann
e3098bb232 aco: give spiller more room to assign spilled SGPRs to VGPRs
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.

Use twice as many spill slots as upper bound.

Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011>
2024-01-19 14:15:27 +00:00
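Note on the graph-theory argument in the commit above: the sketch below illustrates the general fact only and is not the ACO spiller. On a chordal graph, greedy coloring in a suitable order needs no more colors than the largest clique, whereas a non-chordal graph such as a 5-cycle needs more. The helper names `from_edges` and `greedy_coloring` and both example graphs are assumptions made for illustration.

```python
def from_edges(edges):
    """Build an undirected adjacency map from a list of vertex pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def greedy_coloring(adjacency, order):
    """Give each vertex, in order, the lowest color unused by its neighbors."""
    colors = {}
    for vertex in order:
        taken = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[vertex] = color
    return colors


# Chordal: triangle a-b-c plus d attached to c. Largest clique has 3 vertices.
chordal = from_edges([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
chordal_colors = greedy_coloring(chordal, ["a", "b", "c", "d"])

# Not chordal: a 5-cycle. Largest clique has 2 vertices, yet 3 colors are needed.
cycle = from_edges([(i, (i + 1) % 5) for i in range(5)])
cycle_colors = greedy_coloring(cycle, [0, 1, 2, 3, 4])

print(max(chordal_colors.values()) + 1)  # 3 colors, equal to the clique size
print(max(cycle_colors.values()) + 1)    # 3 colors, more than the clique size of 2
```

In spill-slot terms, colors correspond to slots, which is why a bound derived from the largest clique can be too small once the interference graph stops being chordal.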
Samuel Pitoiset
7a0b343495 aco: silent checking if clrxdisasm is available
Otherwise, this is reported a ton of times and the CI output is
unusable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27136>
2024-01-19 07:33:56 +00:00
Georg Lehmann
e36235e6d5 aco: reassign split vector to SOPC
Foz-DB Navi21:
Totals from 2669 (3.42% of 78112) affected shaders:
Instrs: 3570360 -> 3562026 (-0.23%)
CodeSize: 19049784 -> 19017092 (-0.17%)
Latency: 25343555 -> 25337767 (-0.02%); split: -0.03%, +0.00%
InvThroughput: 6191344 -> 6191079 (-0.00%); split: -0.01%, +0.00%
VClause: 90803 -> 90802 (-0.00%)
SClause: 114858 -> 114842 (-0.01%); split: -0.03%, +0.01%
Copies: 269287 -> 260999 (-3.08%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27046>
2024-01-13 11:03:09 +00:00
Daniel Schürmann
09413ff745 aco/insert_exec_mask: only create loop phis for exec mask if necessary
Totals from 195 (0.25% of 79242) affected shaders: (GFX11)

Instrs: 476457 -> 476031 (-0.09%); split: -0.23%, +0.14%
CodeSize: 2453964 -> 2452108 (-0.08%); split: -0.23%, +0.16%
SpillSGPRs: 944 -> 913 (-3.28%); split: -3.39%, +0.11%
SpillVGPRs: 838 -> 835 (-0.36%); split: -0.95%, +0.60%
Latency: 10811026 -> 10810125 (-0.01%); split: -0.08%, +0.07%
InvThroughput: 2276677 -> 2276698 (+0.00%); split: -0.12%, +0.12%
VClause: 9223 -> 9233 (+0.11%); split: -0.10%, +0.21%
SClause: 9025 -> 9005 (-0.22%); split: -0.38%, +0.16%
Copies: 67419 -> 67382 (-0.05%); split: -0.97%, +0.92%
PreSGPRs: 10830 -> 10668 (-1.50%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937>
2024-01-12 09:05:15 +00:00
Daniel Schürmann
e83d8e1366 aco/insert_exec_mask: replace phi for loop restore mask with explicit copies
Totals from 1785 (2.25% of 79242) affected shaders: (GFX11)

Instrs: 6787574 -> 6787041 (-0.01%); split: -0.01%, +0.00%
CodeSize: 34906500 -> 34904704 (-0.01%); split: -0.01%, +0.01%
SpillSGPRs: 5848 -> 5816 (-0.55%)
Latency: 88616877 -> 88617209 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 16644948 -> 16644717 (-0.00%); split: -0.00%, +0.00%
VClause: 141122 -> 141121 (-0.00%)
SClause: 178929 -> 178906 (-0.01%); split: -0.03%, +0.02%
Copies: 569444 -> 569081 (-0.06%); split: -0.09%, +0.03%
Branches: 186980 -> 186961 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 133648 -> 133369 (-0.21%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937>
2024-01-12 09:05:15 +00:00
Daniel Schürmann
d375d297cf aco/insert_exec_mask: unify exec restore code after divergent control flow
No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26937>
2024-01-12 09:05:15 +00:00
Georg Lehmann
fddd866b27 aco: apply fneg/fabs to VOP3P
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
72ac6a5251 aco: clean up fneg/fabs combining
This technically fixes some bugs with fneg(fneg(a)) and fabs(fneg(a)), but
those shouldn't be present in the input NIR.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
a90d154f62 aco: fix applying input modifiers to DPP8
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Georg Lehmann
1d61770dd5 aco: apply packed fneg commutatively
If only one component is negated, isel does not ensure that the constant
operand is in src1 because then the negate was a fmul, not a fneg.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26919>
2024-01-08 13:26:19 +00:00
Daniel Schürmann
dce695b24f aco: refactor and speed-up dead code analysis
Assuming that no loop header phis are dead code,
we can perform the dead code analysis in a single iteration.

Totals from 25 (0.03% of 79330) affected shaders: (GFX11)

MaxWaves: 664 -> 662 (-0.30%)
Instrs: 487618 -> 488822 (+0.25%)
CodeSize: 2451548 -> 2459756 (+0.33%)
VGPRs: 1296 -> 1332 (+2.78%)
Latency: 2337256 -> 2338098 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 560682 -> 576158 (+2.76%)
VClause: 15782 -> 15790 (+0.05%)
Copies: 37905 -> 38731 (+2.18%)
PreVGPRs: 1124 -> 1156 (+2.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26901>
2024-01-08 09:43:53 +00:00
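Note on the single-iteration claim in the commit above: one way to picture it is a single backward pass over the program in which loop header phis are treated as always live, so the values they read over the back edge are known to be used before the pass starts. The Python sketch below works under that assumption. The `Instr` record, the `find_dead` function, and the toy program are hypothetical and do not reflect ACO's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    defs: tuple = ()
    uses: tuple = ()
    side_effects: bool = False
    loop_header_phi: bool = False


def find_dead(program):
    """Return names of dead instructions using one backward pass."""
    used = set()
    # Assumption: loop header phis are never dead, so their operands
    # (including back-edge values defined later in program order) are used.
    for ins in program:
        if ins.loop_header_phi:
            used.update(ins.uses)
    dead = []
    for ins in reversed(program):
        live = (ins.side_effects or ins.loop_header_phi
                or any(d in used for d in ins.defs))
        if live:
            used.update(ins.uses)      # operands of live instructions are live
        else:
            dead.append(ins.name)
    return dead


program = [
    Instr("init", defs=("i0",)),
    Instr("phi", defs=("i",), uses=("i0", "i_next"), loop_header_phi=True),
    Instr("add", defs=("i_next",), uses=("i",)),
    Instr("mul", defs=("t",), uses=("i",)),          # result never used
    Instr("store", uses=("i",), side_effects=True),
]
print(find_dead(program))  # ['mul']
```

Without the assumption, a phi that turned out to be dead could make its back-edge operand dead as well, which would require repeating the pass until nothing changes.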
Daniel Schürmann
023e78b4d7 aco: add new post-RA scheduler for ILP
Totals from 77247 (97.37% of 79330) affected shaders: (GFX11)

Instrs: 44371374 -> 43215723 (-2.60%); split: -2.64%, +0.03%
CodeSize: 227819532 -> 223188224 (-2.03%); split: -2.06%, +0.03%
Latency: 301016823 -> 290147626 (-3.61%); split: -3.70%, +0.09%
InvThroughput: 48551749 -> 47646212 (-1.87%); split: -1.88%, +0.01%
VClause: 870581 -> 834655 (-4.13%); split: -4.13%, +0.00%
SClause: 1487061 -> 1340851 (-9.83%); split: -9.83%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:42 +00:00
Daniel Schürmann
72a5c659d4 aco: form clauses for LDS instructions
No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:42 +00:00
Daniel Schürmann
8f16745821 aco: fix should_form_clause() for memory instructions without operands
In particular, this applies to s_memtime and s_memrealtime.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676>
2024-01-06 11:30:41 +00:00
Rhys Perry
ae54cbeb3f nir: remove sad_u8x4
All uses of this can be replaced with msad_4x8.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Rhys Perry
1410735a62 aco: implement msad_4x8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>
2024-01-05 18:55:22 +00:00
Rhys Perry
3009dcd102 aco: correctly set min/max_subgroup_size for wave32-as-wave64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26894>
2024-01-05 17:35:48 +00:00
Friedrich Vock
1e3541728b radv,aco: Convert 1D ray launches to 2D
Because we use unaligned dispatches, 1D launches only use 8 threads per
wave. Converting to 2D and fixing up launch IDs in the prolog
significantly increases occupancy.

Gives ~30% uplift in Ghostwire Tokyo.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26105>
2024-01-05 17:08:05 +00:00
Georg Lehmann
71edf4de5e aco/gfx12: implement broadcast dmask shrink behavior
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26897>
2024-01-05 12:03:54 +00:00
Georg Lehmann
4a6ee2c483 aco: shrink buffer stores with undef/zero components
Buffer stores store 0 like image stores for unspecified components.

Foz-DB Navi21:
Totals from 91 (0.11% of 79330) affected shaders:
Instrs: 63327 -> 63121 (-0.33%)
CodeSize: 315312 -> 314440 (-0.28%); split: -0.28%, +0.00%
VGPRs: 3144 -> 3120 (-0.76%)
Latency: 441424 -> 441300 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 65501 -> 65130 (-0.57%)
Copies: 6197 -> 5999 (-3.20%)
PreVGPRs: 2197 -> 2182 (-0.68%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26897>
2024-01-05 12:03:54 +00:00
Rhys Perry
cad2c0915d aco/tests: use more raw strings
Python 3.12 started giving a SyntaxWarning for unrecognized escapes such
as "\w". This might become a SyntaxError in a future Python version.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26850>
2024-01-03 13:33:52 +00:00
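Note on the raw-string change in the commit above: in a normal Python string literal, a backslash followed by a character such as "w" is an unrecognized escape, which is what Python 3.12 warns about. A raw string, or a doubled backslash, spells the same regular expression without relying on that behavior. A minimal sketch follows; the variable names and the sample input are illustrative only.

```python
import re

pattern_raw = r"\w+"       # raw string: the backslash is kept literally
pattern_escaped = "\\w+"   # normal string: the backslash is escaped explicitly

# Both spellings produce the same three-character pattern.
assert pattern_raw == pattern_escaped

print(re.findall(pattern_raw, "s_mov_b64 exec"))  # ['s_mov_b64', 'exec']
```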
Georg Lehmann
9ecfd7919b aco: optimize 32bit fsign by using fmulz with Inf
2 instruction fsign with the power of cursed DX9 floating point rules.

Foz-DB Navi31:
Totals from 3803 (4.86% of 78196) affected shaders:
Instrs: 8436366 -> 8412549 (-0.28%); split: -0.29%, +0.00%
CodeSize: 43174284 -> 43114676 (-0.14%); split: -0.14%, +0.01%
SpillSGPRs: 3241 -> 3247 (+0.19%)
Latency: 66333841 -> 66287361 (-0.07%); split: -0.08%, +0.01%
InvThroughput: 10331902 -> 10316916 (-0.15%); split: -0.15%, +0.01%
VClause: 165455 -> 165472 (+0.01%); split: -0.01%, +0.02%
SClause: 242352 -> 242335 (-0.01%); split: -0.02%, +0.01%
Copies: 604086 -> 605781 (+0.28%); split: -0.04%, +0.32%
Branches: 214017 -> 214013 (-0.00%)
PreSGPRs: 209413 -> 209726 (+0.15%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26765>
2024-01-02 13:07:30 +01:00
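Note on the fsign trick in the commit above: under the DX9-style multiply rule, zero times anything is zero, so multiplying by infinity yields +Inf, -Inf, or 0 depending on the sign of the input. The Python sketch below models that rule and then clamps the result to [-1, 1]. The clamp as the second step is an assumption made for illustration (the commit message only says two instructions are used), the names `fmulz` and `fsign` are hypothetical, and NaN inputs are not modeled.

```python
import math


def fmulz(a: float, b: float) -> float:
    """Multiply where zero times anything, even infinity, gives zero."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b


def fsign(x: float) -> float:
    """Sign of x as -1.0, 0.0 or 1.0 (NaN inputs are not handled here)."""
    scaled = fmulz(x, math.inf)           # +inf, -inf, or 0.0
    return max(-1.0, min(1.0, scaled))    # assumed second step: clamp


print(fsign(2.5), fsign(-0.001), fsign(0.0))  # 1.0 -1.0 0.0
```

With an ordinary IEEE multiply, 0.0 * Inf would be NaN, which is why the legacy rule is what makes the short sequence possible.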
Eric Engestrom
7e8db6aedf meson: always define {,DRAW_}LLVM_AVAILABLE one way or the other
With the usual benefits of `#if` instead of `#ifdef` (mostly the fact
that typos can be build failures instead of silently being interpreted
as if 0).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3863>
2023-12-24 10:01:39 +00:00
Daniel Schürmann
42e9ba1c70 aco: remove VCCZ and EXECZ register handling
We don't use these registers and since RDNA3 removed the explicit usage,
it is unlikely that we will properly support them in the future.
Removing the registers from the ACO IR prevents accidentally using them
without proper support.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26664>
2023-12-14 20:08:28 +00:00
Daniel Schürmann
dd7b6898e6 radv: fix number of physical SGPRs on GFX10+
This change has no effect.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26521>
2023-12-11 10:39:51 +00:00
Daniel Schürmann
5ebba87772 aco: rename max_wave64_per_simd -> max_waves_per_simd
and update usage. Changes are because the scheduler targets
a different number of waves.

Totals from 195 (0.25% of 79330) affected shaders: (GFX11)

MaxWaves: 3120 -> 3108 (-0.38%)
Instrs: 71202 -> 71070 (-0.19%); split: -0.27%, +0.09%
CodeSize: 383272 -> 382828 (-0.12%); split: -0.21%, +0.10%
VGPRs: 7392 -> 7752 (+4.87%)
Latency: 2280141 -> 2262487 (-0.77%); split: -0.79%, +0.02%
InvThroughput: 4759022 -> 5725442 (+20.31%); split: -0.01%, +20.32%
VClause: 1737 -> 1741 (+0.23%); split: -3.11%, +3.34%
SClause: 2385 -> 2376 (-0.38%); split: -0.80%, +0.42%
Copies: 5257 -> 5274 (+0.32%); split: -0.25%, +0.57%
Branches: 1213 -> 1212 (-0.08%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26521>
2023-12-11 10:39:50 +00:00
Samuel Pitoiset
8b87c985b0 radv: prepare the PS epilog key for exporting MRTZ on RDNA3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
2023-12-06 11:49:31 +00:00
Samuel Pitoiset
81eeb157f8 aco: export depth/stencil/samplemask in create_fs_jump_to_epilog()
This currently has no effect because the store_output instructions
are removed earlier (in ac_nir_lower_ps). Though, this will be needed
for exporting MRTZ from PS epilogs for alpha to coverage on RDNA3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
2023-12-06 11:49:31 +00:00
Qiang Yu
7656251294 aco: fix set_wqm segfault when ps prolog
ps prolog does not have nir shader.

Fixes: 3b10547e67 ("aco: enable helper lanes if shader->info.fs.require_full_quads")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26512>
2023-12-06 05:34:30 +00:00
Rhys Perry
e110eac171 aco: insert p_end_wqm before p_jump_to_epilog
Otherwise, we can transition to exact before p_jump_to_epilog, then
transition to WQM again and then back to exact:
   p_jump_to_epilog // transitions to exact
   p_logical_end    // transitions to wqm
   p_end_wqm        // transitions to exact

We rely on ssa elimination to clean most of this up.

fossil-db (navi21):
Totals from 1 (0.00% of 79330) affected shaders:
Instrs: 111 -> 110 (-0.90%)
CodeSize: 572 -> 568 (-0.70%)
Copies: 16 -> 15 (-6.25%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25440>
2023-12-05 21:02:04 +00:00
Rhys Perry
7a37a39fe0 aco: simplify v_mul_* labelling slightly
This was from before VALU_instruction existed.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26445>
2023-12-05 16:56:58 +00:00
Rhys Perry
468ee8b80c aco: implement 16-bit fsat on GFX8
GFX8 doesn't have v_med3_f16.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26445>
2023-12-05 16:56:58 +00:00
Rhys Perry
de51a21e26 aco: implement 16-bit derivatives
These are used by radeonsi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26445>
2023-12-05 16:56:58 +00:00
Rhys Perry
997a0884a5 aco: implement 16-bit fsign on GFX8
GFX8 doesn't have v_med3_i16.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26445>
2023-12-05 16:56:58 +00:00
Rhys Perry
b7725b072b aco: flush denormals for 16-bit fmin/fmax on GFX8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26445>
2023-12-05 16:56:57 +00:00
Georg Lehmann
4b9618ceec aco: add test for post-ra DPP clobbered in linear cfg
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26373>
2023-11-28 12:48:56 +00:00