Commit graph

3142 commits

Author SHA1 Message Date
Rhys Perry
71afacff39 aco/insert_exec_mask: ensure top mask is not a temporary at loop exits
This is problematic when the successor of the loop exit is an invert
block. It assumes that the top mask is Operand(bld.lm) and doesn't change
it when entering the else branch.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11348
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29767>
2024-06-20 12:47:05 +00:00
Rhys Perry
bdc229231d aco: remove push constants
These are lowered in NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675>
2024-06-20 12:09:29 +00:00
Rhys Perry
9fe3af1e2a aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
Forgot about this one.

fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)
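The percentages in this report can be reproduced from the raw counters. The sketch below is only a check of the arithmetic; the helper names `pct_change` and `pct_share` are made up for illustration and are not part of fossil-db.

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from `before` to `after`, formatted like the report."""
    return f"{(after - before) / before * 100:+.2f}%"


def pct_share(part: int, total: int) -> str:
    """Share of `part` in `total`, as a percentage with two decimals."""
    return f"{part / total * 100:.2f}%"


affected, total = 3920, 133461
instrs = (6632088, 6636008)
code_size = (34165376, 34181056)

print(pct_share(affected, total))   # share of affected shaders
print(pct_change(*instrs))          # Instrs delta
print(pct_change(*code_size))       # CodeSize delta

# Absolute deltas: instructions added, bytes added.
instr_delta = instrs[1] - instrs[0]
size_delta = code_size[1] - code_size[0]
print(instr_delta, size_delta)
```

The absolute deltas work out to 3920 instructions and 15680 bytes, which would be consistent with one extra 4-byte instruction per affected shader. That reading is an inference from the numbers, not something the commit message states.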

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770>
2024-06-18 20:17:38 +00:00
Georg Lehmann
c3c398d56d aco: make local functions static in files without anonymous namespace
I don't think adding an anonymous namespace in these files is worth it
given the number of global functions.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Georg Lehmann
046414e061 aco: add more anonymous namespaces
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Daniel Schürmann
9b1a748b5e nir: remove nir_intrinsic_discard
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.

This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurrences either with
nir_intrinsic_terminate{_if} or, if the NIR option 'discard_is_demote'
is set, with nir_intrinsic_demote{_if}.
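The replacement rule described above can be sketched as a small mapping. This is an illustration only: `lower_discard` is a hypothetical helper operating on plain strings, not a NIR function, and the real change rewrites intrinsics in the compiler frontends.

```python
def lower_discard(intrinsic: str, discard_is_demote: bool) -> str:
    """Map a legacy discard intrinsic name to its replacement.

    Hypothetical helper: names are given without the `nir_intrinsic_` prefix.
    Anything that is not a discard variant is returned unchanged.
    """
    if intrinsic not in ("discard", "discard_if"):
        return intrinsic
    suffix = "_if" if intrinsic.endswith("_if") else ""
    base = "demote" if discard_is_demote else "terminate"
    return base + suffix


for name in ("discard", "discard_if", "demote"):
    for option in (False, True):
        print(name, option, "->", lower_discard(name, option))
```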

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
2024-06-17 19:37:16 +00:00
Daniel Schürmann
d5821bdf7d radv: emit discard as demote by default
Also removes radv_lower_discard_to_demote debug option.

Totals from 1506 (1.90% of 79439) affected shaders: (GFX11)
MaxWaves: 46432 -> 46448 (+0.03%)
Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67%
CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51%
VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12%
Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59%
InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02%
VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53%
SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97%
Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95%
Branches: 8406 -> 8378 (-0.33%)
PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93%
PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14%
VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02%
SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14%
VMEM: 15052 -> 15051 (-0.01%)
SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
2024-06-17 19:37:15 +00:00
Daniel Schürmann
b7982152ff aco: use aco::monotonic_allocator for IDSet
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
97fd5d3f33 aco: make aco::monotonic_buffer_resource declaration visible for aco::IDSet
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
95967c2ca0 aco/reindex_ssa: replace live_var parameter with boolean
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
a497d105e3 aco: move live var information into struct Program
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
2322ab427e aco/scheduler: remove unused register_demand parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
677c9d9e93 aco/assembler: fix GFX67 MTBUF opcode encoding
Fixes: 56ac6f26e0 ('aco/assembler: slightly refactor MTBUF assembly for more readability')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29708>
2024-06-13 09:18:05 +00:00
Daniel Schürmann
56ac6f26e0 aco/assembler: slightly refactor MTBUF assembly for more readability
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Daniel Schürmann
14f4906e53 aco/assembler: fix MTBUF opcode encoding on GFX11
We have accidentally set the tfe bit for some opcodes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Rhys Perry
7a4f121c5d aco: remove some missing label resets
In the case of:
   c = xor(a, b)
   d = not(c)
   xor(d, e)
it will be optimized to:
   d = xnor(a, b)
   xor(d, e)
because "d" would still have a label with "instr=not(c)", it would then be
further optimized to:
   d = xnor(a, b)
   xnor(c, e)
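A toy model of the three programs above shows why the last one is a problem. The sketch assumes 32-bit values and a made-up tuple encoding `(dest, op, sources)`; it is not ACO's IR. The algebra of the second rewrite is fine, but "c" is no longer defined once the first rewrite has folded it away, so the stale label produces a use of a missing value.

```python
MASK = 0xFFFFFFFF  # assumption: model 32-bit values

OPS = {
    "xor": lambda x, y: (x ^ y) & MASK,
    "not": lambda x: ~x & MASK,
    "xnor": lambda x, y: ~(x ^ y) & MASK,
}


def run(program, inputs):
    """Evaluate a list of (dest, op, sources); return the last result."""
    env = dict(inputs)
    result = None
    for dest, op, srcs in program:
        # Raises KeyError if a source was never defined.
        result = OPS[op](*(env[s] for s in srcs))
        env[dest] = result
    return result


inputs = {"a": 0x0F0F0F0F, "b": 0x00FF00FF, "e": 0x12345678}

original = [("c", "xor", ("a", "b")), ("d", "not", ("c",)), ("r", "xor", ("d", "e"))]
combined = [("d", "xnor", ("a", "b")), ("r", "xor", ("d", "e"))]
stale = [("d", "xnor", ("a", "b")), ("r", "xnor", ("c", "e"))]

# The first rewrite preserves the result.
same_result = run(original, inputs) == run(combined, inputs)

# The second rewrite reads "c", which no instruction defines any more.
try:
    run(stale, inputs)
    stale_runs = True
except KeyError:
    stale_runs = False

print(same_result, stale_runs)
```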

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11309
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29650>
2024-06-11 09:30:16 +00:00
Friedrich Vock
15f2c9c553 aco: Limit rt stages to 128 vgprs
Totals from 35472 (7.40% of 479373) affected shaders:

MaxWaves: 206239 -> 283776 (+37.60%)
Instrs: 193922210 -> 202721106 (+4.54%)
CodeSize: 1056819972 -> 1110833680 (+5.11%); split: -0.00%, +5.11%
VGPRs: 6026704 -> 4540416 (-24.66%)
SpillSGPRs: 23742 -> 25754 (+8.47%)
SpillVGPRs: 118897 -> 2295118 (+1830.34%)
Scratch: 7201792 -> 152752128 (+2021.03%)
Latency: 2713432565 -> 3194796286 (+17.74%); split: -0.20%, +17.94%
InvThroughput: 1052131232 -> 935049835 (-11.13%); split: -16.59%, +5.46%
VClause: 6972784 -> 8716721 (+25.01%); split: -0.02%, +25.03%
SClause: 4879313 -> 4852452 (-0.55%); split: -0.88%, +0.33%
Copies: 32782141 -> 35223995 (+7.45%)
Branches: 11075847 -> 11094087 (+0.16%); split: -0.00%, +0.17%
VALU: 118525960 -> 120929058 (+2.03%)
SALU: 33924572 -> 33973293 (+0.14%); split: -0.03%, +0.17%
VMEM: 12419116 -> 17104582 (+37.73%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Friedrich Vock
ec8512ce85 aco/spill: Don't spill phis with all-undef operands
Fixes some crashes when limiting RT stages to 128 VGPRs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Rhys Perry
5297896856 aco: use ac_get_hw_cache_flags()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:43 +00:00
Rhys Perry
00eccf524f aco: use GFX12 scope/temporal-hint
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1 aco: use ac_hw_cache_flags
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
cdaf269924 aco: inline store_vmem_mubuf/emit_single_mubuf_store
Both of these are only used once.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
185fa04baa aco/gfx6: set glc for buffer_store_byte/short
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Daniel Schürmann
c452a4d1cc aco/ra: use round robin register allocation
Totals from 74681 (94.06% of 79395) affected shaders: (GFX11)

MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10%
Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05%
CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05%
VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22%
Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23%
InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21%
VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27%
SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33%
Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76%
Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02%
VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19%
SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14%
VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
197943ae27 aco/ra: change heuristic to first fit
Totals from 73175 (92.17% of 79395) affected shaders: (GFX11)

MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01%
Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15%
CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12%
VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24%
Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30%
InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25%
VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33%
SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26%
Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20%
Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01%
VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12%
SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11%
VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
d76fc005b6 aco/ra: re-use registers from killed operands
Totals from 77283 (97.34% of 79395) affected shaders: (GFX11)

MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02%
Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11%
CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11%
VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65%
Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52%
InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34%
VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53%
SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14%
Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64%
Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03%
VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19%
SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07%
VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
b054cfe704 aco/ra: move can_write_m0() check into get_reg_specified()
This way, affinities are also covered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
8e817cf52b aco/ra: refactor get_reg_simple() with increased stride.
This should avoid some redundant calls.

Totals from 153 (0.19% of 79395) affected shaders: (GFX11)

Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05%
CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05%
VGPRs: 10068 -> 10348 (+2.78%)
Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11%
InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02%
VClause: 3868 -> 3870 (+0.05%)
Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34%
Branches: 6479 -> 6480 (+0.02%)
VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
1b0edf3f33 aco/ra: Fix array access when finding register for subdword variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
5326e033ff aco/ra: fix handling of killed operands in compact_relocate_vars()
Found by inspection.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:14 +00:00
Rhys Perry
4cfb7a0c17 aco: remove support for sub-dword push constants
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>
2024-06-06 17:52:05 +00:00
Georg Lehmann
3fb1a64918 aco: move s_add_u32 -> s_addk_i32 optimization fully to ra
Having this in one place is better.
When I wrote the old version, I wasn't aware that checking the kill flag on
definitions is the same as checking zero uses.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Georg Lehmann
60f3f0fdbb aco/ra: use a switch to check vop2acc instruction support
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Georg Lehmann
fdc2fb6835 aco: move literal unswizzle opt to RA
Much simpler.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Georg Lehmann
c63c750380 aco/gfx11+: fix inline constants for v_pk_fmac_f16
On newer hardware, the hi operation reads the lo half of the inline constant.
On older hardware, it reads the hi half (zero).
I tested this on Navi31 for gfx11 and Raphael for gfx10.
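The difference can be illustrated with a 32-bit inline constant whose low half holds an f16 value and whose high half is zero. This is a sketch of the operand selection only: `hi_lane_operand` and the constant encoding are assumptions for illustration, not code from the assembler or optimizer.

```python
import struct


def f16_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE half-precision float."""
    return struct.unpack("<e", bits.to_bytes(2, "little"))[0]


# Assumed encoding: f16 1.0 (0x3C00) in the low half, zero in the high half.
INLINE_ONE = 0x00003C00


def hi_lane_operand(const32: int, newer_hw: bool) -> float:
    """Value the hi operation sees for an inline constant (hypothetical model)."""
    if newer_hw:
        bits = const32 & 0xFFFF          # hi operation reads the lo half
    else:
        bits = (const32 >> 16) & 0xFFFF  # hi operation reads the hi half
    return f16_from_bits(bits)


print(hi_lane_operand(INLINE_ONE, newer_hw=True))
print(hi_lane_operand(INLINE_ONE, newer_hw=False))
```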

Foz-DB Navi31:
Totals from 4 (0.01% of 79395) affected shaders:
CodeSize: 36832 -> 36448 (-1.04%)
Latency: 20362 -> 20334 (-0.14%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Georg Lehmann
39380d475a aco: add affinities for possible sopk optimizations
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Georg Lehmann
fac475bc25 aco: rework how affinities for acc operands are determined
Improve accuracy by adding a helper that's also used by
the optimization function.

Foz-DB Navi31:
Totals from 50 (0.06% of 79206) affected shaders:
CodeSize: 126148 -> 126128 (-0.02%); split: -0.05%, +0.04%
Latency: 334049 -> 334060 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 59203 -> 59205 (+0.00%)
Copies: 2011 -> 1998 (-0.65%); split: -0.75%, +0.10%
VALU: 14221 -> 14208 (-0.09%); split: -0.11%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>
2024-06-06 16:28:23 +00:00
Rhys Perry
8e475bba61 aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry
1ad05d4ca8 aco: implement nir_atomic_op_ordered_add_gfx12_amd
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry
0dee5fdd3c aco: don't combine vgpr into writelane src0
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry
2a4424425a aco/gfx12: fix s_wait_event immediate
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry
c651eed1d8 aco/gfx12: implement load_subgroup_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Georg Lehmann
818ff03865 aco: optimize branching sequence with p_create_vector exec producer
This happens with inverse_ballot and wave64.

Foz-DB Navi21:
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 2689 -> 2683 (-0.22%)
CodeSize: 14988 -> 14972 (-0.11%)
Latency: 20207 -> 20204 (-0.01%)
Copies: 144 -> 141 (-2.08%)
Branches: 76 -> 73 (-3.95%)
SALU: 241 -> 238 (-1.24%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502>
2024-06-04 15:40:57 +00:00
Georg Lehmann
dcab408a6c nir: remove unpack_half_flush_to_zero
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>
2024-05-31 09:46:35 +00:00
Mike Blumenkrantz
2aaa6ebba1 build/amd: add amd-use-llvm build option
this allows amd drivers to disable llvm support while still allowing
llvmpipe/lavapipe to be built

by disabling llvm support in amd drivers, the load time for these drivers
decreases by 5-10ms

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969>
2024-05-30 19:05:00 +00:00
Samuel Pitoiset
ce6557cc04 aco: adjust loading local invocation ID for GS on GFX12
It uses gs_vtx_offset[0] instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29417>
2024-05-30 11:05:04 +00:00
Rhys Perry
ac47ee1be7 meson: remove --depfile for aco_tests
This isn't needed right now and probably doesn't work. glsl_scraper.py
writes to the same depfile several times.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29348>
2024-05-30 09:44:52 +00:00
Rhys Perry
1829d74ad3 aco: fix fddx/y with uniform inf/nan input
inf or nan subtracted from itself is not zero.

I don't think Vulkan requires this, but this better matches NIR's constant
folding and the divergent implementation.
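The underlying floating-point fact is easy to check. The sketch below uses plain Python floats and a hypothetical `derivative_by_subtraction` helper; it only shows that `x - x` is NaN for infinite or NaN inputs, so a uniform-input shortcut that always returns zero would disagree with a subtraction-based result.

```python
import math


def derivative_by_subtraction(x: float) -> float:
    """Model a derivative of a uniform value as neighbour minus self."""
    return x - x


finite = derivative_by_subtraction(1.5)
from_inf = derivative_by_subtraction(math.inf)
from_nan = derivative_by_subtraction(math.nan)

print(finite, from_inf, from_nan)
```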

fossil-db (navi31):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 537 -> 588 (+9.50%)
CodeSize: 3132 -> 3380 (+7.92%)
Latency: 2806 -> 2819 (+0.46%)
InvThroughput: 286 -> 316 (+10.49%)
Copies: 24 -> 39 (+62.50%)
VALU: 262 -> 289 (+10.31%)
SALU: 33 -> 51 (+54.55%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29418>
2024-05-29 15:18:52 +00:00
Georg Lehmann
b04d99d093 aco/optimizer: use p_create_vector to create mask when a copy can't be used
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Georg Lehmann
2b56a97374 aco/lower_to_hw: optimize split 64bit constant copies
Foz-DB Navi21:
Totals from 3209 (4.04% of 79395) affected shaders:
Instrs: 6502065 -> 6496612 (-0.08%)
CodeSize: 35578300 -> 35556596 (-0.06%)
Latency: 66092924 -> 66092668 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 16968953 -> 16968900 (-0.00%); split: -0.00%, +0.00%
SClause: 198651 -> 198647 (-0.00%)
Copies: 597323 -> 591872 (-0.91%)
SALU: 930918 -> 925467 (-0.59%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00