Alyssa Rosenzweig
daa97bb41a
amd: switch to derivative intrinsics
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565 >
2024-08-08 15:26:07 +00:00
Georg Lehmann
f36fccabf5
aco: optimize 64bit find_lsb/find_msb
...
No Foz-DB changes, but this should be better, especially for gfx6-7 where
uadd_sat is 2 instructions.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30549 >
2024-08-08 11:15:55 +00:00
Georg Lehmann
bee487df48
aco/gfx11.5+: use vinterp for fddx/fddy
...
Since GFX11.5 VINTERP can be dual issued, DPP cannot.
Foz-DB GFX11.5:
Totals from 8401 (10.58% of 79395) affected shaders:
MaxWaves: 247880 -> 247848 (-0.01%)
Instrs: 6802675 -> 6815061 (+0.18%); split: -0.08%, +0.26%
CodeSize: 36539444 -> 36500948 (-0.11%); split: -0.22%, +0.11%
VGPRs: 444324 -> 445932 (+0.36%); split: -0.01%, +0.37%
SpillSGPRs: 1350 -> 1346 (-0.30%)
Latency: 63628380 -> 63523687 (-0.16%); split: -0.20%, +0.04%
InvThroughput: 10566750 -> 10486009 (-0.76%); split: -0.77%, +0.01%
VClause: 100171 -> 100248 (+0.08%); split: -0.08%, +0.16%
SClause: 175467 -> 176208 (+0.42%); split: -0.05%, +0.47%
Copies: 356817 -> 356935 (+0.03%); split: -0.17%, +0.20%
PreVGPRs: 283403 -> 283898 (+0.17%); split: -0.02%, +0.20%
VALU: 4217969 -> 4229831 (+0.28%); split: -0.03%, +0.31%
SALU: 479367 -> 479428 (+0.01%); split: -0.00%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30350 >
2024-07-30 15:25:19 +00:00
Georg Lehmann
8c6e299141
aco: reorder dpp for ddx/ddy
...
Having the mov last allows us to fuse it with the use instruction.
Foz-DB Navi31:
Totals from 9400 (11.84% of 79395) affected shaders:
MaxWaves: 273998 -> 274030 (+0.01%)
Instrs: 8303778 -> 8282997 (-0.25%); split: -0.29%, +0.04%
CodeSize: 44428088 -> 44464860 (+0.08%); split: -0.09%, +0.18%
VGPRs: 506616 -> 504492 (-0.42%)
SpillSGPRs: 1389 -> 1393 (+0.29%)
Latency: 76923466 -> 76983332 (+0.08%); split: -0.06%, +0.14%
InvThroughput: 12386888 -> 12391262 (+0.04%); split: -0.04%, +0.07%
VClause: 125136 -> 125059 (-0.06%); split: -0.13%, +0.07%
SClause: 227361 -> 226615 (-0.33%); split: -0.43%, +0.10%
Copies: 440787 -> 440749 (-0.01%); split: -0.17%, +0.16%
PreVGPRs: 339783 -> 333343 (-1.90%); split: -1.92%, +0.02%
VALU: 5088362 -> 5069737 (-0.37%); split: -0.37%, +0.01%
SALU: 606596 -> 606609 (+0.00%); split: -0.01%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30350 >
2024-07-30 15:25:19 +00:00
Rhys Perry
a8a15dc5b5
aco: add struct and helpers for exec potentially empty
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30320 >
2024-07-26 11:08:27 +00:00
Rhys Perry
39264a90c3
aco: consider exec empty after divergent continue then divergent break
...
For:
loop {
if (divergent)
continue
if (divergent)
break
//exec is potentially empty here
loop {
if (divergent)
break
}
}
If a subset of invocations take the continue and then the rest take the
break, then exec will be empty after the break.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30320 >
2024-07-26 11:08:27 +00:00
Marek Olšák
b2d32ae246
nir: add nir_intrinsic_load_per_primitive_input, split from io_semantics flag
...
Instead of having 1 bit in nir_io_semantics indicating a per-primitive
FS input, add a dedicated intrinsic for it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895 >
2024-07-23 16:13:16 +00:00
Rhys Perry
cccfbe6141
aco: move s_setprio to before NGG exec initialization
...
fossil-db (gfx1150):
Totals from 32 (0.04% of 79395) affected shaders:
Instrs: 17397 -> 17365 (-0.18%)
CodeSize: 83700 -> 83580 (-0.14%)
Latency: 59006 -> 58974 (-0.05%)
fossil-db (navi21):
Totals from 4 (0.01% of 79395) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30241 >
2024-07-23 13:14:52 +00:00
Rhys Perry
0919ce1ac4
aco/gfx11.5: workaround export priority issue
...
https://github.com/llvm/llvm-project/pull/99273
fossil-db (gfx1150):
Totals from 73996 (93.20% of 79395) affected shaders:
Instrs: 36015357 -> 36807177 (+2.20%)
CodeSize: 189072544 -> 192238748 (+1.67%)
Latency: 245845181 -> 246790550 (+0.38%); split: -0.00%, +0.38%
InvThroughput: 45068018 -> 45116177 (+0.11%); split: -0.00%, +0.11%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30241 >
2024-07-23 13:14:51 +00:00
Georg Lehmann
efb9258814
aco: handle clustered uniform reductions correctly
...
Alternatively we could just trust divergence analysis to do the right thing.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30235 >
2024-07-19 07:24:34 +00:00
Georg Lehmann
5e8bb93ea3
aco: micro optimize VALU fquantize2f16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
5b4fcfd638
aco/gfx11.5: select SALU fquantize2f16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
2549bc2f9e
aco/gfx11.5: select SALU fneg/fabs
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
284b9965e8
aco/gfx11.5+: allow sgpr dst for trans ops and use pseudo scalar ops on gfx12
...
Also optimize the denorm scaling path by only emitting the expensive trans op once
and allowing fma for the final muliplication.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
314053a3e3
aco/gfx11.5: select SALU fsign
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
b1b5a0c6ad
aco/gfx11.5: select SALU fsat
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
ee0e183700
aco/gfx11.5: select SOPC float instructions
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:15 +00:00
Georg Lehmann
4bd229ac50
aco/gfx11.5: select SOP2 float instructions
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:14 +00:00
Georg Lehmann
a90d4d340c
aco/gfx11.5: select SALU float conversions
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:14 +00:00
Georg Lehmann
4399c7bac3
aco: add aco_opcode::p_s_cvt_f16_f32_rtne
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:14 +00:00
Georg Lehmann
1efb7754fc
aco/gfx11.5: select s_(ceil|floor|trunc|rndne)
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:14 +00:00
Georg Lehmann
33a719b3e2
aco/gfx11.5: select s_cvt_[ui]32_f32
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245 >
2024-07-18 08:36:14 +00:00
Georg Lehmann
2d3f536174
aco,nir: add dpp16_shift_amd intrinsic
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24650 >
2024-07-17 15:04:38 +00:00
Rhys Perry
a6eb5c9caa
aco: use alignment information in visit_load_constant()
...
The intrinsic has this now.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912 >
2024-07-01 17:34:22 +00:00
Rhys Perry
98cb50297b
aco: use s_pack_ll_b32_b16 for pack_32_2x16_split
...
fossil-db (navi21):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 45963 -> 45941 (-0.05%)
CodeSize: 241908 -> 241768 (-0.06%)
Latency: 176508 -> 176501 (-0.00%)
SALU: 6123 -> 6101 (-0.36%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912 >
2024-07-01 17:34:21 +00:00
Georg Lehmann
7fc8ad2ddd
aco/ir: remove unused vopc helpers
...
And rename get_swapped and get_inverse to show that they should only be used for VOPC.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467 >
2024-06-27 08:12:30 +00:00
Georg Lehmann
c5ba17cd25
aco: implement ford, funord, fneo, fequ, fltu, fgeu
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467 >
2024-06-27 08:12:30 +00:00
Rhys Perry
e3ffc244f5
aco: skip continue_or_break LCSSA phis when not needed
...
Fixes:
//exec is empty here
loop {
%1:s[16-17] = ...
if () {
break
}
%2:s[16-17] = ...
continue_or_break
}
%3 = phi %1, undef
//because of the undef, %2 can use s[16-17] and overwrite the address
load(%3:s[16-17])
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bbe4652430 ("aco: create lcssa phis for continue_or_break loops when necessary")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11333
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29838 >
2024-06-26 09:10:54 +00:00
Rhys Perry
bdc229231d
aco: remove push constants
...
These are lowered in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675 >
2024-06-20 12:09:29 +00:00
Daniel Schürmann
9b1a748b5e
nir: remove nir_intrinsic_discard
...
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.
This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurences either with
nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the
case that the NIR option 'discard_is_demote' is being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:16 +00:00
Daniel Schürmann
d5821bdf7d
radv: emit discard as demote by default
...
Also removes radv_lower_discard_to_demote debug option.
Totals from 1506 (1.90% of 79439) affected shaders: (GFX11)
MaxWaves: 46432 -> 46448 (+0.03%)
Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67%
CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51%
VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12%
Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59%
InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02%
VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53%
SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97%
Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95%
Branches: 8406 -> 8378 (-0.33%)
PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93%
PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14%
VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02%
SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14%
VMEM: 15052 -> 15051 (-0.01%)
SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617 >
2024-06-17 19:37:15 +00:00
Rhys Perry
5297896856
aco: use ac_get_hw_cache_flags()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:43 +00:00
Rhys Perry
00eccf524f
aco: use GFX12 scope/temporal-hint
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1
aco: use ac_hw_cache_flags
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
cdaf269924
aco: inline store_vmem_mubuf/emit_single_mubuf_store
...
Both of these are only used once.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
185fa04baa
aco/gfx6: set glc for buffer_store_byte/short
...
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243 >
2024-06-07 13:22:42 +00:00
Rhys Perry
4cfb7a0c17
aco: remove support for sub-dword push constants
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480 >
2024-06-06 17:52:05 +00:00
Rhys Perry
8e475bba61
aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
1ad05d4ca8
aco: implement nir_atomic_op_ordered_add_gfx12_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
2a4424425a
aco/gfx12: fix s_wait_event immediate
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
c651eed1d8
aco/gfx12: implement load_subgroup_id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Georg Lehmann
dcab408a6c
nir: remove unpack_half_flush_to_zero
...
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433 >
2024-05-31 09:46:35 +00:00
Samuel Pitoiset
ce6557cc04
aco: adjust loading local invocation ID for GS on GFX12
...
It uses gs_vtx_offset[0] instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29417 >
2024-05-30 11:05:04 +00:00
Rhys Perry
1829d74ad3
aco: fix fddx/y with uniform inf/nan input
...
inf or nan subtracted by itself is not zero.
I don't think Vulkan requires this, but this better matches NIR's constant
folding and the divergent implementation.
fossil-db (navi31):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 537 -> 588 (+9.50%)
CodeSize: 3132 -> 3380 (+7.92%)
Latency: 2806 -> 2819 (+0.46%)
InvThroughput: 286 -> 316 (+10.49%)
Copies: 24 -> 39 (+62.50%)
VALU: 262 -> 289 (+10.31%)
SALU: 33 -> 51 (+54.55%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29418 >
2024-05-29 15:18:52 +00:00
Konstantin Seurer
a93f95c69c
radv/rt: Remove load_rt_dynamic_callable_stack_base_amd
...
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Konstantin Seurer
432f3eb9ca
radv/rt: Track ray_launch_size reads
...
Totals from 33 (8.71% of 379) affected shaders:
Instrs: 1434025 -> 1433988 (-0.00%); split: -0.01%, +0.00%
CodeSize: 7578824 -> 7578472 (-0.00%); split: -0.01%, +0.00%
Latency: 9241632 -> 9241639 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3407014 -> 3407049 (+0.00%); split: -0.00%, +0.00%
VClause: 40399 -> 40391 (-0.02%)
SClause: 37755 -> 37760 (+0.01%); split: -0.04%, +0.05%
Copies: 169588 -> 169567 (-0.01%); split: -0.04%, +0.02%
PreSGPRs: 4323 -> 4319 (-0.09%)
VALU: 940500 -> 940484 (-0.00%); split: -0.00%, +0.00%
SALU: 220508 -> 220509 (+0.00%); split: -0.03%, +0.03%
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Konstantin Seurer
7ba8fccad3
radv/rt: Track ray_launch_id reads
...
We can expect the z-component to be unused most of the times. Avoid
preserving it in those cases.
Totals from 94 (24.80% of 379) affected shaders:
MaxWaves: 916 -> 935 (+2.07%)
Instrs: 3316697 -> 3318357 (+0.05%); split: -0.06%, +0.11%
CodeSize: 17618704 -> 17616680 (-0.01%); split: -0.09%, +0.08%
VGPRs: 11632 -> 11520 (-0.96%)
SpillSGPRs: 1139 -> 1205 (+5.79%); split: -0.35%, +6.15%
Latency: 22595907 -> 22598225 (+0.01%); split: -0.15%, +0.16%
InvThroughput: 7036479 -> 6923740 (-1.60%); split: -1.74%, +0.14%
VClause: 104325 -> 104361 (+0.03%); split: -0.16%, +0.19%
SClause: 83920 -> 83925 (+0.01%); split: -0.08%, +0.08%
Copies: 328140 -> 330687 (+0.78%); split: -0.27%, +1.05%
Branches: 134521 -> 134541 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 8753 -> 8806 (+0.61%)
PreVGPRs: 10984 -> 10937 (-0.43%)
VALU: 2149880 -> 2151318 (+0.07%); split: -0.08%, +0.15%
SALU: 499107 -> 499128 (+0.00%); split: -0.08%, +0.09%
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619 >
2024-05-28 12:23:45 +00:00
Rhys Perry
12b4bdc134
aco/gfx12: decrease max_nsa_vgprs for VSAMPLE
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
ef74407577
aco/gfx12: use ttmp9/ttmp7 for workgroup id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
fae2a85d57
aco/gfx12: implement subgroup shader clock
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00