Commit graph

31 commits

Author SHA1 Message Date
Marek Olšák
04e4fe594b radeonsi: use ctx->types instead of bld->types etc.
even vec_type is f32.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 16:55:19 +02:00
Marek Olšák
7a5e6dcba5 radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 16:55:16 +02:00
Marek Olšák
29adaa19ac radeonsi: remove most uses of lp_build_const*
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
474468fbf9 radeonsi/gfx9: disable features that don't work
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Samuel Pitoiset
7751ed39e4 radeonsi: disable sinking common instructions down to the end block
Initially this was a workaround for a bug introduced in LLVM 4.0
in the SimplifyCFG pass that caused image instrinsics to disappear
(because they were badly sunk). Finally, this is a win because it
decreases SGPR spilling and increases the number of waves a bit.

Although, shader-db results are good I think we might want to
remove it in the future once the issue is fixed. For now, enable
it for LLVM >= 4.0.

This also fixes a rendering issue with the speedometer in Dirt Rally.

More information can be found here https://reviews.llvm.org/D26348.

Thanks to Dave Airlie for the patch.

v2: - add a FIXME comment
    - use if (HAVE_LLVM >= 0x0400) instead

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-15 14:24:40 +01:00
Marek Olšák
7e1faa79d3 radeonsi: drop support for LLVM 3.6 & 3.7
They are too old.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
7f1446a8a1 ac: normalize build helper names
s/emit/build/

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Timothy Arceri
d90bf4ef3e radeon: remove unused radeon_elf_util.{c,h}
We now use the shared code in AMD common instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
dc4c551a34 radeon/ac: switch from radeon_elf_read() to ac_elf_read()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
69a687189e radeon/ac: switch from radeon_shader_binary to ac_shader_binary
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Marek Olšák
52581606c2 radeonsi: set no-signed-zeros-fp-math
Recommended by Matt Arsenault.

46757 shaders in 28742 tests
Totals:
SGPRS: 2068851 -> 2066907 (-0.09 %)
VGPRS: 1604056 -> 1602676 (-0.09 %)
Spilled SGPRs: 1402 -> 1382 (-1.43 %)
Spilled VGPRs: 113 -> 113 (0.00 %)
Private memory VGPRs: 1332 -> 1332 (0.00 %)
Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread
Code Size: 58815520 -> 58716788 (-0.17 %) bytes
LDS: 1162 -> 1162 (0.00 %) blocks
Max Waves: 354616 -> 354905 (0.08 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 786452 -> 784508 (-0.25 %)
VGPRS: 530000 -> 528620 (-0.26 %)
Spilled SGPRs: 958 -> 938 (-2.09 %)
Spilled VGPRs: 85 -> 85 (0.00 %)
Private memory VGPRs: 636 -> 636 (0.00 %)
Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread
Code Size: 26349936 -> 26251204 (-0.37 %) bytes
LDS: 304 -> 304 (0.00 %) blocks
Max Waves: 108962 -> 109251 (0.27 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
fd3e73f54e gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2)
v2: define lp_float_mode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
660b55e6d9 radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák
dbd38f2a92 radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Tom Stellard
226a2c6d6e radeonsi: Fix build on LLVM < 3.9 v2
This was broken by: e0cc0a614c

v2:
  - Use preprocessor macro

Tested-by: Mark Janes <mark.a.janes@intel.com>
2017-02-01 02:10:00 +00:00
Tom Stellard
e0cc0a614c radeonsi: Set datalayout on the llvm module
This prevents LLVM from using sext instructions for local memory offsets
and allows the backend to fold immediate offsets into the instruction.

This also prevents some incorrect code generation for ptrtoint and
inttoptr instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 20:39:30 +00:00
Marek Olšák
59c5da40ed radeonsi: preload PS inputs only if KILL is used
so that most shaders can get lower VGPR usage thanks to lazy input loading.
I think this is a more accurate constraint that prevents the black transitions
in Witcher 2.

Affected shaders (7758):
Max Waves: 57437 -> 58231 (1.38 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Samuel Pitoiset
e1ea70d9f3 radeonsi: replace si_shader_context::soa by bld_base
We no longer need to use lp_build_tgsi_soa_context.

No regressions founds with full piglit run.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-13 10:41:08 +01:00
Samuel Pitoiset
ecf04b84e5 radeonsi: replace ctx->soa.outputs by ctx->outputs
The plan is to replace si_shader_context::soa with its parent
structure (ie. bld_base).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-13 10:41:06 +01:00
Samuel Pitoiset
f04088a7ba radeonsi: move si_shader_context::soa::addr to si_shader_context
The plan is to replace si_shader_context::soa with its parent
structure (ie. bld_base).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-13 10:41:02 +01:00
Samuel Pitoiset
6f0d955b6d radeonsi: allocate the array of immediates dynamically
Currently, we can store up to 256 immediates in a static array,
but this is not always enough. Instead, allocate a dynamic array
like what we currently do for temps.

This fixes a segfault with
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23

No regressions found with full piglit run.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-13 10:40:57 +01:00
Nicolai Hähnle
a0ce09b4b2 amd/common: unify cube map coordinate handling between radeonsi and radv
Code is taken from a combination of radv (for the more basic functions,
to avoid gallivm dependencies) and radeonsi (for the new and improved
derivative calculations).

v2: add 0.5 offset to tex coords only after derivative calculation

v3:
- really only touch the first three coordinates
- rebase on the removal of the 1.5 --> 0.5 offset change

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-13 00:39:10 +01:00
Marek Olšák
cac74a9bcc radeonsi: fix the Witcher 2 black transitions
v2: do it properly

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98238

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-09 12:01:30 +01:00
Marek Olšák
5b85a6b3f7 radeonsi: set si_shader_context::input_decls for ranged decls correctly
This has no effect because no code uses those members with ranged decls.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-09 12:01:30 +01:00
Marek Olšák
358079da2d radeonsi: set unsafe fpmath on FP instructions when allowed by R600_DEBUG
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-15 19:17:56 +01:00
Marek Olšák
171e349782 radeonsi: fold some shader context initialization to si_llvm_context_init
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-15 19:17:56 +01:00
Nicolai Hähnle
0b9bba7f6c radeonsi: pass the function name to si_llvm_create_func
We will use multiple functions in one module, so they should have
different names.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:06:54 +01:00
Nicolai Hähnle
23dfb688ba radeonsi: add always-inline pass to si_llvm_finalize_module
Change the pass manager as well, since this is a module-level pass. No
noticeable run-time difference on shader-db.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:06:42 +01:00
Marek Olšák
21af69e753 radeonsi: rename prefixes from radeon to si
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-18 18:41:08 +02:00
Marek Olšák
6e475fefa1 radeonsi: merge radeon_llvm_context and si_shader_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-18 18:41:06 +02:00
Marek Olšák
5ab25bb4ba radeonsi: import all TGSI->LLVM code from gallium/radeon
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-18 18:41:04 +02:00
Renamed from src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c (Browse further)