Rhys Perry
71afacff39
aco/insert_exec_mask: ensure top mask is not a temporary at loop exits
...
This is problematic when the successor of the loop exit is an invert
block. It assumes that the top mask is Operand(bld.lm) and doesn't change
it when entering the else branch.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11348
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29767 >
2024-06-20 12:47:05 +00:00
Rhys Perry
bdc229231d
aco: remove push constants
...
These are lowered in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:29 +00:00
Rhys Perry
9fe3af1e2a
aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
...
Forgot about this one.
fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770 >
2024-06-18 20:17:38 +00:00
Georg Lehmann
c3c398d56d
aco: make local functions static in files without anonymous namespace
...
I don't think adding an anonymous namespace in these files is worth it
given the amount of global functions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740 >
2024-06-18 17:53:07 +00:00
Georg Lehmann
046414e061
aco: add more anonymous namespaces
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740 >
2024-06-18 17:53:07 +00:00
Daniel Schürmann
9b1a748b5e
nir: remove nir_intrinsic_discard
...
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.
This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurences either with
nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the
case that the NIR option 'discard_is_demote' is being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:16 +00:00
Daniel Schürmann
d5821bdf7d
radv: emit discard as demote by default
...
Also removes radv_lower_discard_to_demote debug option.
Totals from 1506 (1.90% of 79439) affected shaders: (GFX11)
MaxWaves: 46432 -> 46448 (+0.03%)
Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67%
CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51%
VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12%
Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59%
InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02%
VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53%
SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97%
Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95%
Branches: 8406 -> 8378 (-0.33%)
PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93%
PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14%
VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02%
SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14%
VMEM: 15052 -> 15051 (-0.01%)
SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:15 +00:00
Daniel Schürmann
b7982152ff
aco: use aco::monotonic_allocator for IDSet
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
97fd5d3f33
aco: make aco::monotonic_buffer_resource declaration visible for aco::IDSet
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
95967c2ca0
aco/reindex_ssa: replace live_var parameter with boolean
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
a497d105e3
aco: move live var information into struct Program
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
2322ab427e
aco/scheduler: remove unused register_demand parameter
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
677c9d9e93
aco/assembler: fix GFX67 MTBUF opcode encoding
...
Fixes: 56ac6f26e0 ('aco/assembler: slightly refactor MTBUF assembly for more readability')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29708 >
2024-06-13 09:18:05 +00:00
Daniel Schürmann
56ac6f26e0
aco/assembler: slightly refactor MTBUF assembly for more readability
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692 >
2024-06-12 11:41:58 +00:00
Daniel Schürmann
14f4906e53
aco/assembler: fix MTBUF opcode encoding on GFX11
...
We have accidentally set the tfe bit for some opcodes.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692 >
2024-06-12 11:41:58 +00:00
Rhys Perry
7a4f121c5d
aco: remove some missing label resets
...
In the case of:
c = xor(a, b)
d = not(c)
xor(d, e)
it will be optimized to:
d = xnor(a, b)
xor(d, e)
because "d" would still had a label with "instr=not(c)", it would then be
further optimized to:
d = xnor(a, b)
xnor(c, e)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11309
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29650 >
2024-06-11 09:30:16 +00:00
Friedrich Vock
15f2c9c553
aco: Limit rt stages to 128 vgprs
...
Totals from 35472 (7.40% of 479373) affected shaders:
MaxWaves: 206239 -> 283776 (+37.60%)
Instrs: 193922210 -> 202721106 (+4.54%)
CodeSize: 1056819972 -> 1110833680 (+5.11%); split: -0.00%, +5.11%
VGPRs: 6026704 -> 4540416 (-24.66%)
SpillSGPRs: 23742 -> 25754 (+8.47%)
SpillVGPRs: 118897 -> 2295118 (+1830.34%)
Scratch: 7201792 -> 152752128 (+2021.03%)
Latency: 2713432565 -> 3194796286 (+17.74%); split: -0.20%, +17.94%
InvThroughput: 1052131232 -> 935049835 (-11.13%); split: -16.59%, +5.46%
VClause: 6972784 -> 8716721 (+25.01%); split: -0.02%, +25.03%
SClause: 4879313 -> 4852452 (-0.55%); split: -0.88%, +0.33%
Copies: 32782141 -> 35223995 (+7.45%)
Branches: 11075847 -> 11094087 (+0.16%); split: -0.00%, +0.17%
VALU: 118525960 -> 120929058 (+2.03%)
SALU: 33924572 -> 33973293 (+0.14%); split: -0.03%, +0.17%
VMEM: 12419116 -> 17104582 (+37.73%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593 >
2024-06-10 19:39:52 +00:00
Friedrich Vock
ec8512ce85
aco/spill: Don't spill phis with all-undef operands
...
Fixes some crashes when limiting RT stages to 128 VGPRs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593 >
2024-06-10 19:39:52 +00:00
Rhys Perry
5297896856
aco: use ac_get_hw_cache_flags()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:43 +00:00
Rhys Perry
00eccf524f
aco: use GFX12 scope/temporal-hint
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1
aco: use ac_hw_cache_flags
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
cdaf269924
aco: inline store_vmem_mubuf/emit_single_mubuf_store
...
Both of these are only used once.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
185fa04baa
aco/gfx6: set glc for buffer_store_byte/short
...
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Daniel Schürmann
c452a4d1cc
aco/ra: use round robin register allocation
...
Totals from 74681 (94.06% of 79395) affected shaders: (GFX11)
MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10%
Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05%
CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05%
VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22%
Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23%
InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21%
VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27%
SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33%
Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76%
Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02%
VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19%
SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14%
VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
197943ae27
aco/ra: change heuristic to first fit
...
Totals from 73175 (92.17% of 79395) affected shaders: (GFX11)
MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01%
Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15%
CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12%
VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24%
Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30%
InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25%
VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33%
SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26%
Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20%
Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01%
VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12%
SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11%
VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
d76fc005b6
aco/ra: re-use registers from killed operands
...
Totals from 77283 (97.34% of 79395) affected shaders: (GFX11)
MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02%
Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11%
CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11%
VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65%
Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52%
InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34%
VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53%
SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14%
Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64%
Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03%
VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19%
SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07%
VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
b054cfe704
aco/ra: move can_write_m0() check into get_reg_specified()
...
This way, affinities are also covered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
8e817cf52b
aco/ra: refactor get_reg_simple() with increased stride.
...
This should avoid some redundant calls.
Totals from 153 (0.19% of 79395) affected shaders: (GFX11)
Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05%
CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05%
VGPRs: 10068 -> 10348 (+2.78%)
Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11%
InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02%
VClause: 3868 -> 3870 (+0.05%)
Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34%
Branches: 6479 -> 6480 (+0.02%)
VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
1b0edf3f33
aco/ra: Fix array access when finding register for subdword variables
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:15 +00:00
Daniel Schürmann
5326e033ff
aco/ra: fix handling of killed operands in compact_relocate_vars()
...
Found by inspection.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235 >
2024-06-06 21:02:14 +00:00
Rhys Perry
4cfb7a0c17
aco: remove support for sub-dword push constants
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480 >
2024-06-06 17:52:05 +00:00
Georg Lehmann
3fb1a64918
aco: move s_add_u32 -> s_addk_i32 optimization fully to ra
...
Having this in one place is better.
When I wrote the old I wasn't aware that checking the kill flag on definitions
is the same as checking zero uses.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
60f3f0fdbb
aco/ra: use a switch to check vop2acc instruction support
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
fdc2fb6835
aco: move literal unswizzle opt to RA
...
Much simpler.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
c63c750380
aco/gfx11+: fix inline constants for v_pk_fmac_f16
...
On newer hardware, the hi operation reads the lo half of the inline constant.
On older hardware, it reads the hi half (zero).
I tested this on Navi31 for gfx11 and Raphael for gfx10.
Foz-DB Navi31:
Totals from 4 (0.01% of 79395) affected shaders:
CodeSize: 36832 -> 36448 (-1.04%)
Latency: 20362 -> 20334 (-0.14%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
39380d475a
aco: add affinities for possible sopk optimizations
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Georg Lehmann
fac475bc25
aco: rework how affinities for acc operands are determined
...
Improve accuracy by adding a helper that's also used by
the optimization function.
Foz-DB Navi31:
Totals from 50 (0.06% of 79206) affected shaders:
CodeSize: 126148 -> 126128 (-0.02%); split: -0.05%, +0.04%
Latency: 334049 -> 334060 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 59203 -> 59205 (+0.00%)
Copies: 2011 -> 1998 (-0.65%); split: -0.75%, +0.10%
VALU: 14221 -> 14208 (-0.09%); split: -0.11%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512 >
2024-06-06 16:28:23 +00:00
Rhys Perry
8e475bba61
aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
1ad05d4ca8
aco: implement nir_atomic_op_ordered_add_gfx12_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
0dee5fdd3c
aco: don't combine vgpr into writelane src0
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
2a4424425a
aco/gfx12: fix s_wait_event immediate
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
c651eed1d8
aco/gfx12: implement load_subgroup_id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Georg Lehmann
818ff03865
aco: optimize branching sequence with p_create_vector exec producer
...
This happens with inverse_ballot and wave64.
Foz-DB Navi21:
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 2689 -> 2683 (-0.22%)
CodeSize: 14988 -> 14972 (-0.11%)
Latency: 20207 -> 20204 (-0.01%)
Copies: 144 -> 141 (-2.08%)
Branches: 76 -> 73 (-3.95%)
SALU: 241 -> 238 (-1.24%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502 >
2024-06-04 15:40:57 +00:00
Georg Lehmann
dcab408a6c
nir: remove unpack_half_flush_to_zero
...
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433 >
2024-05-31 09:46:35 +00:00
Mike Blumenkrantz
2aaa6ebba1
build/amd: add amd-use-llvm build option
...
this allows amd drivers to disable llvm support while still allowing
llvmpipe/lavapipe to be built
by disabling llvm support in amd drivers, the load times for these drivers
decreases by 5-10ms
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969 >
2024-05-30 19:05:00 +00:00
Samuel Pitoiset
ce6557cc04
aco: adjust loading local invocation ID for GS on GFX12
...
It uses gs_vtx_offset[0] instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29417 >
2024-05-30 11:05:04 +00:00
Rhys Perry
ac47ee1be7
meson: remove --depfile for aco_tests
...
This isn't needed right now and probably doesn't work. glsl_scraper.py
writes to the same depfile several times.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29348 >
2024-05-30 09:44:52 +00:00
Rhys Perry
1829d74ad3
aco: fix fddx/y with uniform inf/nan input
...
inf or nan subtracted by itself is not zero.
I don't think Vulkan requires this, but this better matches NIR's constant
folding and the divergent implementation.
fossil-db (navi31):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 537 -> 588 (+9.50%)
CodeSize: 3132 -> 3380 (+7.92%)
Latency: 2806 -> 2819 (+0.46%)
InvThroughput: 286 -> 316 (+10.49%)
Copies: 24 -> 39 (+62.50%)
VALU: 262 -> 289 (+10.31%)
SALU: 33 -> 51 (+54.55%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29418 >
2024-05-29 15:18:52 +00:00
Georg Lehmann
b04d99d093
aco/optimizer: use p_create_vector to create mask when a copy can't be used
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00
Georg Lehmann
2b56a97374
aco/lower_to_hw: optimize split 64bit constant copies
...
Foz-DB Navi21:
Totals from 3209 (4.04% of 79395) affected shaders:
Instrs: 6502065 -> 6496612 (-0.08%)
CodeSize: 35578300 -> 35556596 (-0.06%)
Latency: 66092924 -> 66092668 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 16968953 -> 16968900 (-0.00%); split: -0.00%, +0.00%
SClause: 198651 -> 198647 (-0.00%)
Copies: 597323 -> 591872 (-0.91%)
SALU: 930918 -> 925467 (-0.59%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422 >
2024-05-29 11:59:22 +00:00