Commit graph

30034 commits

Author SHA1 Message Date
Tim Rowley
c1aa444a3e swr: [rasterizer jitter] Pass LLVM-IR size into jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:13 -06:00
Tim Rowley
e0a829d320 swr: [rasterizer core] Frontend SIMD16 WIP
Removed temporary scafolding in PA, widended the PA_STATE interface
for SIMD16, and implemented PA_STATE_CUT and PA_TESS for SIMD16.

PA_STATE_CUT and PA_TESS now work in SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:06 -06:00
Tim Rowley
79174e52b5 swr: [rasterizer jitter] Disable unsafe FP optimizations in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:00 -06:00
Tim Rowley
db599e316a swr: [rasterizer core] Frontend SIMD16 WIP
Widen simdvertex to SIMD16/simd16vertex in frontend for passing VS
attributes from VS to PA.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:52 -06:00
Tim Rowley
09c54cfd2d swr: [rasterizer jitter] Add DEBUGTRAP jit builder function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:47 -06:00
Tim Rowley
b01f26e005 swr: [rasterizer jitter] Multisample blend jit fix
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:41 -06:00
Tim Rowley
8780706c62 swr: [rasterizer jitter] Change SimdVector representation to array
Make all SimdVectors in LLVM represented as simdscalar[4] rather
than a struct.

Fixes issues with promotion of values from i32 to i64 to match
register width.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:33 -06:00
Tim Rowley
d159b0bf34 swr: [rasterizer jitter] Fix issues with stream-out on llvm>=3.8
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:27 -06:00
Tim Rowley
8423ad437b swr: [rasterizer jitter] Adjust jitter header includes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:20 -06:00
Tim Rowley
feecd7dcf5 swr: [rasterizer core] Frontend SIMD16 WIP
SIMD16 Primitive Assembly (PA) only supports TriList and RectList.

CUT_AWARE_PA, TESS, GS, and SO disabled in the SIMD16 front end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:10 -06:00
Bartosz Tomczyk
94262e5f5d r600/sb: Fix memory leak
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-08 17:36:05 +01:00
Li Qiang
83fb63d31d gallium/tgsi: fix oob access in parse instruction
When parsing texture instruction, it doesn't stop if the
'cur' is ',', the loop variable 'i' will also be increased
and be used to index the 'inst.TexOffsets' array. This can lead
an oob access issue. This patch avoid this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Li Qiang <liq3ea@gmail.com>
2017-02-07 14:00:04 +10:00
Bruce Cherniak
11d6f836d0 swr: [rasterizer core] Removed unused clip code.
Removed unused Clip() and FRUSTUM_CLIP_MASK define.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-02-06 16:30:50 -06:00
Bruce Cherniak
bf29495dcd swr: [rasterizer core] Remove dead code Clipper::ClipScalar()
Clipper::ClipScalar() is dead code and should be removed.  It is causing
an error with gcc-7 because it references a now defunct member.

v2: includes bugzilla reference, same code change

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
CC: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-02-06 16:27:53 -06:00
Eric Anholt
72e6d1f00a gallium: Remove vc4 simulator hack from loader infrastructure.
Now that there's MESA_LOADER_DRIVER_OVERRIDE for choosing the driver name
we load, we don't need this any more.

v2: Get the junk out of pipe_loader_drm.c, too.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
2017-02-06 12:44:06 -08:00
Eric Anholt
61bb1a9795 targets: Use a macro to reduce cut and paste in driver setup.
All the replicated prototypes/function bodies obfuscated the interesting
logic of the file: the mapping from driver enable macros to entrypoints we
expose, and the way that the swrast entrypoints are special compared to
the DRM entrypoints.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-06 12:44:06 -08:00
Dave Airlie
13a28ff236 radeon/ac: move common llvm build functions to a separate file.
Suggested by Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-07 05:46:35 +10:00
Samuel Pitoiset
af303abcdb winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()
cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-03 12:07:14 +01:00
Edward O'Callaghan
3879425917 ilo: EOL unmaintained older gallium intel driver
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:46 +11:00
Edward O'Callaghan
d77fa310ed ilo: EOL drop unmaintained gallium drv from buildsys
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:36 +11:00
Edward O'Callaghan
01b625ef1a ilo: EOL unplumb unmaintained gallium drv from winsys
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:32 +11:00
Dave Airlie
c9a2fc3679 radeonsi/ac: move most of emit_ddxy to shared code.
We can reuse this in radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
c5f0a56aeb radeonsi/ac: move get thread id to shared code.
radv will use this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
b3c28942c7 radeonsi/ac: move tbuffer store and buffer load to shared code.
These are all reuseable by radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
a9773311f6 radeonsi/ac: move a bunch of load/store related things to common code.
These are all shareable with radv, so start migrating them to the
common code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Marek Olšák
dfe111368d Revert "radeonsi: decrease the number of texture slots to 24"
This reverts commit bdd860e307.

Requested by a game developer.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-03 00:39:48 +01:00
Nicolai Hähnle
a020cb3a72 gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capability
Make the cap consistent with PIPE_CAP_INT64.

Aside from the hypothetical case of using draw for vertex shaders (and
actually caring about doubles...), every implementation supports doubles
either nowhere or everywhere.

Also, st/mesa didn't even check the cap correctly in all supported
shader stages.

While at it, add a missing LLVM version check for 64-bit integers in
radeonsi. This is conservative: judging by the log, LLVM 3.8 might be
sufficient, but there are probably bugs that have been fixed since then.

v2: fix clover (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-02 16:53:42 +01:00
Mauro Rossi
9c45bb731c android: fix llvm, elf dependencies for M, N releases
These changes set the correct llvm version and elf include path
which differ for Marshmallow and Nougat

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-02-01 23:01:35 +00:00
Dave Airlie
dc68b920df radeonsi/ac: move frag interp emission code to shared llvm code.
This code should be used in radv, so move it to a shared location
in advance of doing that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:24:53 +10:00
Boyuan Zhang
f90ccf48bc st/va: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
d596bd29ec st/vdpau: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
c29191eea8 radeon/uvd: add h264 constrained baseline support
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
22841ec84a vl: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Michel Dänzer
31136eae3a winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49
The kernel driver reports correct values now.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-01 16:38:14 +09:00
Tom Stellard
226a2c6d6e radeonsi: Fix build on LLVM < 3.9 v2
This was broken by: e0cc0a614c

v2:
  - Use preprocessor macro

Tested-by: Mark Janes <mark.a.janes@intel.com>
2017-02-01 02:10:00 +00:00
Rob Herring
01e18b21d1 vc4: Enable Neon on arm android builds
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:21 -08:00
Rob Herring
83107acb7b vc4: fix arm64 build with Neon
The addition of Neon assembly breaks on arm64 builds because the assembly
syntax is different. For now, restrict Neon to ARMv7 builds.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:19 -08:00
Rob Herring
6d92f32852 vc4: Make Neon inline assembly clang compatible
clang throws an error on "%r2" and similar. I couldn't find any
documentation on what "%r?" is supposed to mean and I've never seen any
use like that as far as I remember. The parameter is supposed to be
cpu_stride and just %2/%3 should be sufficient.

There's no need for trailing ";" either, so remove those, too.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:09 -08:00
Tom Stellard
e0cc0a614c radeonsi: Set datalayout on the llvm module
This prevents LLVM from using sext instructions for local memory offsets
and allows the backend to fold immediate offsets into the instruction.

This also prevents some incorrect code generation for ptrtoint and
inttoptr instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 20:39:30 +00:00
Wladimir J. van der Laan
56314f5baf etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers
This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.

- It looks like SE_CLIP registers were not set at all.
  I'm amazed that rendering worked without them. Emit them to
  avoid issues on gc3000.

- Define constants
  ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
  ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
  ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
  ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)

  These demarcate the margin (fixp16) between the computed sizes and the
  value sent to the chip. I have set these to the numbers used by the
  Vivante driver for gc2000. I am not sure whether any old hardware was
  relying on the old numbers, or whether those were just a guess. But if
  so, these need to be moved to the _specs structure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 19:29:23 +01:00
Wladimir J. van der Laan
fe3bb8cdb5 etnaviv: Generate new sin/cos instructions on GC3000
Shaders using sin/cos instructions were not working on GC3000.

The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):

- Need their input scaled by 1/pi instead of 2/pi.

- Output an x and y component, which need to be multiplied to
  get the result.

- tex_amode needs to be set to 1.

Add a new bit to the compiler specs and generate these instructions
as necessary.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 19:29:16 +01:00
Wladimir J. van der Laan
658568941d etnaviv: Cannot render to rb-swapped formats
Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzing in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.

(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 09:28:28 +01:00
Christian Gmeiner
82fe240a99 etnaviv: Avoid infinite loop in find_frame()
Use of unsigned loop control variable with '>= 0' would lead
to infinite loop.

Reported by clang:

etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
   for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                   ~~ ^  ~

v2: Simply use the same datatype as c->frame_sp is using.

CC: <mesa-stable@lists.freedesktop.org>
Reported-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2017-01-31 09:19:25 +01:00
Samuel Pitoiset
af7fef12f7 r600: fix a compilation warning in r600_screen_create()
Should be r600_common_screen instead of r600_screen.

Fixes: 80157a2c20 ("gallium/radeon: clean up r600_query_init_backend_mask")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 18:13:18 +01:00
Marek Olšák
f8bc628b2c gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter
to simplify things in draw_vbo a little

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
75c425e511 winsys/radeon: clamp vram_vis_size to 256MB
the value from the kernel is wrong

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
eba9e9dd1d radeonsi: handle count_from_stream_output in a few IA_MULTI_VGT_PARAM cases
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
a0740d59aa radeonsi: don't invoke DCC decompression in update_all_texture_descriptors
This fixes a bug uncovered by the 17-part patch series, specifically:
  "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"

If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
f8dd2f5bac radeonsi: fold info->indirect conditionals into the last one in draw_vbo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00
Marek Olšák
408f9a1584 radeonsi: atomize the scratch buffer state
The update frequency is very low.

Difference: Only account for the size when allocating a new one and when
            starting a new IB, and check for NULL. (v3)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00