Initially this was a workaround for a bug introduced in LLVM 4.0
in the SimplifyCFG pass that caused image instrinsics to disappear
(because they were badly sunk). Finally, this is a win because it
decreases SGPR spilling and increases the number of waves a bit.
Although, shader-db results are good I think we might want to
remove it in the future once the issue is fixed. For now, enable
it for LLVM >= 4.0.
This also fixes a rendering issue with the speedometer in Dirt Rally.
More information can be found here https://reviews.llvm.org/D26348.
Thanks to Dave Airlie for the patch.
v2: - add a FIXME comment
- use if (HAVE_LLVM >= 0x0400) instead
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This prevents LLVM from using sext instructions for local memory offsets
and allows the backend to fold immediate offsets into the instruction.
This also prevents some incorrect code generation for ptrtoint and
inttoptr instructions.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
so that most shaders can get lower VGPR usage thanks to lazy input loading.
I think this is a more accurate constraint that prevents the black transitions
in Witcher 2.
Affected shaders (7758):
Max Waves: 57437 -> 58231 (1.38 %)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
We no longer need to use lp_build_tgsi_soa_context.
No regressions founds with full piglit run.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The plan is to replace si_shader_context::soa with its parent
structure (ie. bld_base).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The plan is to replace si_shader_context::soa with its parent
structure (ie. bld_base).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Currently, we can store up to 256 immediates in a static array,
but this is not always enough. Instead, allocate a dynamic array
like what we currently do for temps.
This fixes a segfault with
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
No regressions found with full piglit run.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Code is taken from a combination of radv (for the more basic functions,
to avoid gallivm dependencies) and radeonsi (for the new and improved
derivative calculations).
v2: add 0.5 offset to tex coords only after derivative calculation
v3:
- really only touch the first three coordinates
- rebase on the removal of the 1.5 --> 0.5 offset change
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This has no effect because no code uses those members with ranged decls.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Change the pass manager as well, since this is a module-level pass. No
noticeable run-time difference on shader-db.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>