12 commits

Author SHA1 Message Date
Marek Olšák
d9a77f9ca3 ac/llvm: add better code for fsign
There are 2 improvements:
- better code for 16, 32, and 64 bits
- vector support for 16 and 32 bits
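
For reference, the result fsign has to produce can be sketched as a scalar C++ function. This is not the LLVM IR this commit emits. `fsign_ref` is a hypothetical name, and mapping NaN to 0 is an assumption of this sketch rather than something the commit message states.

```cpp
// Hypothetical scalar reference for fsign: -1 for negative inputs,
// +1 for positive inputs, 0 otherwise.
// Assumption: NaN maps to 0 here; the commit message does not say.
template <typename T>
T fsign_ref(T x)
{
   // Each comparison yields 0 or 1; both are false for zero and for NaN.
   return static_cast<T>((x > T(0)) - (x < T(0)));
}
```

For example, `fsign_ref(-2.5f)` evaluates to `-1.0f` and `fsign_ref(0.0f)` to `0.0f`.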

Totals:
SGPRS: 2639738 -> 2625882 (-0.52 %)
VGPRS: 1534120 -> 1533916 (-0.01 %)
Spilled SGPRs: 3541 -> 3557 (0.45 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 292 -> 292 (0.00 %) dwords per thread
Code Size: 55640332 -> 55384892 (-0.46 %) bytes
Max Waves: 964785 -> 964857 (0.01 %)

Totals from affected shaders:
SGPRS: 377352 -> 363496 (-3.67 %)
VGPRS: 209800 -> 209596 (-0.10 %)
Spilled SGPRs: 1979 -> 1995 (0.81 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 12549300 -> 12293860 (-2.04 %) bytes
Max Waves: 105762 -> 105834 (0.07 %)
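
The percentages in these reports are consistent with a plain relative change against the "before" value. A minimal sketch, assuming that formula (the helper name `percent_change` and the handling of a zero baseline are assumptions, not taken from the reporting tool):

```cpp
// Relative change in percent, matching the "before -> after (x %)" lines.
// Assumption: a zero baseline is reported as 0 %, as in "0 -> 0 (0.00 %)".
double percent_change(double before, double after)
{
   if (before == 0.0)
      return 0.0;
   return (after - before) / before * 100.0;
}
```

With the SGPRS totals above, `percent_change(2639738, 2625882)` is about -0.52, and for the affected shaders `percent_change(377352, 363496)` is about -3.67.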

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>
2020-09-06 14:36:21 +00:00
Marek Olšák
f85294207f Revert "ac: generate FMA for inexact instructions for radeonsi"
This reverts commit 4b9370cb0f.

Fixes: 4b9370cb0f
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3429

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>
2020-09-06 14:36:20 +00:00
James Park
24b80f8bb9 amd/llvm: Reorder LLVM headers
LLVM uses __declspec(restrict), which breaks because Mesa defines restrict
as __restrict. Move the LLVM headers up to dodge the macro.
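
The ordering idea can be sketched in isolation. In this hedged example `<cstddef>` stands in for the LLVM headers and `sum_ints` is a hypothetical function. The point is only that headers included before the macro definition never see `restrict` rewritten.

```cpp
// Stand-in for the LLVM headers: include them first, while `restrict`
// is still an ordinary token to the preprocessor.
#include <cstddef>

// Only now define the macro, as the commit message says Mesa does.
// Any later __declspec(restrict) would be rewritten by this define.
#define restrict __restrict

// Code after this point can use the C99-style spelling.
int sum_ints(const int *restrict values, std::size_t count)
{
   int total = 0;
   for (std::size_t i = 0; i < count; ++i)
      total += values[i];
   return total;
}
```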

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6180>
2020-08-05 17:15:18 +00:00
Bas Nieuwenhuizen
40e00c800c amd/llvm: Mark pointer function arguments as 32-byte aligned.
Otherwise LLVM does not see the pointers as allowing speculative
loads.

The pipeline-db results are pretty wild, but mostly what is to be
expected from allowing more code movement in LLVM:

Totals from affected shaders:
SGPRS: 157728 -> 168336 (6.73 %)
VGPRS: 158628 -> 158664 (0.02 %)
Spilled SGPRs: 10845 -> 24753 (128.24 %)
Spilled VGPRs: 13 -> 13 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 8 -> 8 (0.00 %) dwords per thread
Code Size: 17189180 -> 17313712 (0.72 %) bytes
LDS: 204 -> 204 (0.00 %) blocks
Max Waves: 5700 -> 5687 (-0.23 %)
Wait states: 0 -> 0 (0.00 %)

This gives some boosts for shaders where we can move a descriptor load
outside a loop.
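
The kind of code movement meant here can be illustrated in plain C++. This is only a sketch of hoisting a load out of a loop, not the LLVM transformation itself. The `Descriptor` layout and both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 32-byte descriptor; the layout is an assumption for illustration.
struct alignas(32) Descriptor {
   std::uint32_t words[8];
};

// The descriptor load sits inside the loop, so it only executes when
// the loop body actually runs.
std::uint32_t scale_sum_in_loop(const Descriptor *desc, const std::uint32_t *data,
                                std::size_t n)
{
   std::uint32_t acc = 0;
   for (std::size_t i = 0; i < n; ++i)
      acc += data[i] * desc->words[0];
   return acc;
}

// The load is hoisted: it executes once, even when n == 0. That is a
// speculative load, which is only valid if `desc` is known to be readable.
std::uint32_t scale_sum_hoisted(const Descriptor *desc, const std::uint32_t *data,
                                std::size_t n)
{
   const std::uint32_t scale = desc->words[0];
   std::uint32_t acc = 0;
   for (std::size_t i = 0; i < n; ++i)
      acc += data[i] * scale;
   return acc;
}
```

Both functions return the same value for a valid `desc`. The second form is what a compiler may only produce on its own when it can prove the early load is safe.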

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3159>
2020-07-08 23:47:06 +00:00
Marek Olšák
b97cc41aa2 Revert "ac: reassociate FP expressions for inexact instructions for radeonsi"
This reverts commit cf2f3c2753.

It breaks shadows in Unigine Superposition.
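
As general background on why reassociation can change results (this is not a reproduction of the Superposition issue), floating-point addition is not associative. A minimal sketch with hypothetical helper names:

```cpp
// Sum grouped as (a + b) + c.
float sum_left(float a, float b, float c)
{
   volatile float t = a + b; // volatile keeps the intermediate rounded to float
   return t + c;
}

// Sum grouped as a + (b + c), which reassociation would be allowed to produce.
float sum_right(float a, float b, float c)
{
   volatile float t = b + c; // c can be absorbed here if b is much larger
   return a + t;
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, `sum_left` returns `1.0f` while `sum_right` returns `0.0f`, because `1.0f` is lost when added to `-1e20f`.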

Fixes: cf2f3c2753

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4837>
2020-05-04 11:51:37 -04:00
Marek Olšák
cf2f3c2753 ac: reassociate FP expressions for inexact instructions for radeonsi
Totals:
SGPRS: 2591784 -> 2590696 (-0.04 %)
VGPRS: 1666888 -> 1666736 (-0.01 %)
Spilled SGPRs: 4131 -> 4107 (-0.58 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52715468 -> 52693584 (-0.04 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479897 -> 479892 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
4b9370cb0f ac: generate FMA for inexact instructions for radeonsi
NIR mostly does this already.

Totals:
SGPRS: 2588520 -> 2591784 (0.13 %)
VGPRS: 1666984 -> 1666888 (-0.01 %)
Spilled SGPRs: 4074 -> 4131 (1.40 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52726872 -> 52715468 (-0.02 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479872 -> 479897 (0.01 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
f2c2a28073 ac: update and document fast math flags used by radeonsi
This should have no effect, because we never use FP division, but
it's safer for the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Samuel Pitoiset
519d9b30de radv: remove useless RADV_DEBUG=unsafemath debug option
This option is useless and shouldn't be used at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-15 09:07:34 +01:00
Samuel Pitoiset
ee9811a0bb ac: fix build with recent LLVM
Build is broken since "Move CodeGenFileType enum to Support/CodeGen.h".

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-14 14:41:55 +00:00
Samuel Pitoiset
7dfb15fff1 ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO
Because some instructions will be optimized by the backend compiler,
the driver has to manually flush to zero to keep the result exact.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:51 +02:00
Timur Kristóf
3a08110d43 amd: Move all amd/common code that depends on LLVM to amd/llvm.
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-08 00:44:08 +00:00
Renamed from src/amd/common/ac_llvm_helper.cpp