v_add can be dual issued on gfx11, v_med3 cannot.
Don't use v_add directly to still optimize omod(fsat(x)).
Foz-DB GFX1100:
Totals from 32702 (24.24% of 134913) affected shaders:
Latency: 475008203 -> 474928037 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 59226198 -> 59140787 (-0.14%); split: -0.14%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
This is to avoid missing array flag when <=GFX8 and 3D image
in which case is treated as 2D array image.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23094>
Unlike \S, \w only matches characters which are valid in identifiers. This
seems to be much faster, especially for longer identifier names.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636>
... and expect it to behave like ACCESS_NON_TEMPORAL.
Handling ACCESS_NON_TEMPORAL is sufficient.
This was copied from ac_nir_to_llvm.c.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22770>
We haven't measured any noteworthy perf improvement
from these, and they are difficult to port to NIR,
so remove them before the NIR based VS input lowering
in order to make it easier to bisect and analyze stats.
Fossil DB stats on Rembrandt (GFX10.3):
Totals from 21750 (16.12% of 134913) affected shaders:
VGPRs: 868512 -> 868664 (+0.02%); split: -0.00%, +0.02%
CodeSize: 64406804 -> 64397572 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 567904 -> 567888 (-0.00%); split: +0.00%, -0.00%
Instrs: 12327212 -> 12324851 (-0.02%); split: -0.10%, +0.08%
Latency: 61367324 -> 61371204 (+0.01%); split: -0.04%, +0.05%
InvThroughput: 9687734 -> 9686000 (-0.02%); split: -0.03%, +0.01%
VClause: 248207 -> 303449 (+22.26%); split: -0.02%, +22.28%
SClause: 314942 -> 315564 (+0.20%); split: -0.09%, +0.29%
Copies: 921581 -> 921820 (+0.03%); split: -0.16%, +0.19%
Branches: 341964 -> 341967 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16805>
The meson.build_root() method has been deprecated, so let's switch to
meson.project_build_root(), which usually means the same thing. The case
where it doesn't do the same thing is if Mesa is a subproject to some
other project, but in that case I believe we want the build root of Mesa,
not of the parent project anyway.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907>
It looks like the LLVM assembler promotes true 16-bit instructions to VOP3
in this case.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251>
We can skip the v_or_b32 or use an instruction smaller than
v_alignbyte_b32.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19933>
fossil-db (gfx1100):
Totals from 52753 (39.07% of 135032) affected shaders:
CodeSize: 153603860 -> 153163384 (-0.29%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19933>
Foz-DB Navi21:
Totals from 2403 (1.78% of 134913) affected shaders:
CodeSize: 25329156 -> 25311244 (-0.07%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19413>
The LLVM backend does this when lowering ordered_xfb_counter_add_amd. I
guess there is some missing dependency checking or something.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345>
This doesn't make sense, opsel preserves the not selected half of the register,
p_insert zeros it.
No Foz-DB changes.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 54292e99c7 ("aco: optimize 32-bit extracts and inserts using SDWA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19253>
I misread the ISA doc and got the order wrong.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: dae1629778 ("aco: disable sdwa on gfx11")
Fixes: e68e6c75ca ("aco: use v_perm_b32 to copy 0xff00/0x00ff/0xff/0x00")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19223>
fossil-db (gfx1100):
Totals from 57858 (42.85% of 135032) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18273>