Commit graph

30994 commits

Author SHA1 Message Date
Marek Olšák
8ac4923a67 radeonsi: prevent race conditions when doing scratch patching
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
9dfc030b48 radeonsi: separate scratch state patching code into its own function
Picked from a different branch. When we stop using the scratch patching,
this function will not be called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
1b01014cbf radeonsi/gfx9: also apply scratch relocations to the 1st shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
e107c5a426 radeonsi/gfx9: set correct LLVM calling conventions for merged shaders
for scratch support

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
a47289f8fc radeonsi: remove unused parameters from si_shader_apply_scratch_relocs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
2d662c0cba radeonsi: inline si_llvm_shader_type into si_llvm_create_func
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
4c0e68dfe5 radeonsi: don't use util_memcpy_cpu_to_le32 for shader uploads
at least I think this is correct.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
7660c9ee4e radeonsi: make si_compile_llvm static
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
f8f8242e8b radeonsi: fold surrounding code into si_llvm_finalize_module
and rename to si_llvm_optimize_module.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
5dad0c3477 radeonsi: don't call eliminate_const_vs_outputs in shaders without VS exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
12beef0374 radeonsi: drop support for LLVM 3.8
LLVM 3.8:
- had broken indirect resource indexing
- didn't have scratch coalescing
- was the last user of problematic v16i8
- only supported OpenGL 4.1

This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
4d32b4ac99 radeonsi: stop using v16i8
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
283a1d1e27 radeonsi/gfx9: make some PA & DB registers match the closed Vulkan driver
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Rob Clark
7b55a05159 freedreno/a5xx: compute shader support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
10c17f23b7 freedreno: core compute state support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
2ce449fa7d freedreno/ir3: compute shader support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
39c5a46a7a freedreno/a5xx: SSBO support
To simplify things for now, since all the gfx shader stages share a
single SSBO state block, only advertise SSBO support for fragment shader
(and compute when we have that).  We could possibly use a fixed-
partitioning of the SSBO index space to support SSBOs on other stages
without having to resort to shader variants.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
edde00f5f1 freedreno/ir3: SSBO/atomic support
TODO cwabbott pointed out a write-after-read hazzard, which effects both
this and arrays.  A write needs to depend on *all* reads since the last
write, not just the last read.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
4d841fbaae freedreno: core SSBO support
The generation-independent support for binding shader buffer objects.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
fd6ed7b562 freedreno/ir3: resync instr-a3xx.h/disasm-a3xx.c
Sync to the same files from freedreno.git to correct decoding of ldgb/
stgb instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Emil Velikov
4752ae876a targets/libgl-xlib: remove unneeded GLX_SHARED_GLAPI define
There's no users in-tree that use it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:56 +01:00
Emil Velikov
4ea5f4b74c gallium/dri: remove unneeded HAVE_SHARED_GLAPI guard
Always true, since the dri modules required shared glapi.

With earlier commit (da410e6afa "configure: explicitly require shared
glapi for enable-dri") we even made that explicit during the configure
stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:40 +01:00
Emil Velikov
79a26b663a gallium/dri: always link against shared glapi
In the early days of Xorg and Mesa we had multiple providers of the
GLAPI. All of those were the ones responsible for dlopening the DRI
module. Hence it was perfectly fine, and actually expected, for the DRI
modules to have unresolved symbols.

Since then we've moved the API to a separate shared library and no other
libraries provide the symbols.

Here comes the picky part:
It's possible that one uses old Xorg (where libglx.so provides the
GLAPI) and new Mesa (with DRI modules linking against libglapi.so).

That should still work, since the the libglx.so symbols will take
precedence over the libglapi.so ones.

I've verified this while running 1.14 series Xorg alongside this (and
next) patch.

It may seem a bit fragile, but that's of reasonably OK since all of the
affected Xorg versions have been EOL for years.

The final one being the 1.14 series, which saw its final bug fix release
1.14.7 in June 2014.

To ensure that the binaries do not have unresolved symbols add
-no-undefined and $(LD_NO_UNDEFINED), just like we do everywhere else
throughout mesa.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98428
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:29 +01:00
Dave Airlie
7088b655e8 radeonsi: constify a bunch of the perfcounter structs.
This moves the structs from the data segment to the rodata segment,
which seems like the more correct place for them.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-04 11:52:47 +10:00
Marek Olšák
f466683cb0 radeonsi/gfx9: fix gl_ViewportIndex
v2: remove unnecessary LLVMBuildAnd calls

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Marek Olšák
ec34632859 radeonsi/gfx9: set VGT_REUSE_OFF = 0
same as Vulkan

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Christian Gmeiner
a8007ed687 etnaviv: add L8A8_UNORM texture format
No piglit regressions.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-05-03 22:43:10 +02:00
Marek Olšák
7647e90b15 ac: rename ac_eliminate_const_vs_outputs -> ac_optimize_vs_outputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 20:55:00 +02:00
Eric Anholt
ece06defe7 vc4: Use runtime CPU detection for whether NEON is available.
This will allow Raspbian's ARMv6 builds to take advantage of the new NEON
code, and could prevent problems if vc4 ends up getting used on a v7 CPU
without NEON.

v2: Drop dead NEON_SUFFIX (noted by Erik Faye-Lund)
2017-05-02 13:35:23 -07:00
Eric Anholt
a373f77662 vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.
Android.mk was setting the flag across the entire driver, so we didn't
have non-NEON versions getting built.  This was going to be a problem with
the next commit, when I start auto-detecting NEON support and use the
non-NEON version when appropriate.

Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-02 13:35:23 -07:00
Eric Anholt
463b7d0332 gallium: Enable ARM NEON CPU detection.
I wrote this code with reference to pixman, though I've only decided to
cover Linux (what I'm testing) and Android (seems obvious enough).  Linux
has getauxval() as a cleaner interface to the /proc entry, but it's more
glibc-specific and I didn't want to add detection for that.

This will be used to enable NEON at runtime on ARMv6 builds of vc4.

v2: Actually initialize the temp vars in the Android path (noticed by
    daniels)
v3: Actually pull in the cpufeatures library (change by robher).
    Use O_CLOEXEC.  Break out of the loop when we find our feature.
v4: Drop VFP code, which was confused about what it was detecting and not
    actually used yet.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-05-02 13:35:23 -07:00
Philipp Zabel
b539335e50 renderonly: use drmIoctl
To restart interrupted system calls, use drmIoctl.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:22:53 +02:00
Philipp Zabel
cd8ee259c8 renderonly: drop resources on destroy
The renderonly_scanout holds a reference on its prime pipe resource,
which should be released when it is destroyed. If it was created by
renderonly_create_kms_dumb_buffer_for_resource, the dumb BO also has
to be destroyed.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:19:23 +02:00
Philipp Zabel
ab51cd2f26 renderonly: close transfer prime_fd
prime_fd is only used to transfer the scanout buffer to the GPU inside
renderonly_create_kms_dumb_buffer_for_resource. It should be closed
immediately to avoid leaking the DMA-BUF file handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:19:19 +02:00
Eric Anholt
d884d1a654 vc4: Only build the NEON code on arm32.
NEON is sufficiently different on arm64 that we can't just reuse this
code.  Disable it on arm64 for now.

v2: Use PIPE_ARCH_ARM instead, as __ARM_ARCH may be 8 for a 32-bit build
    for a v8 CPU.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>
2017-05-01 13:27:39 -07:00
Samuel Pitoiset
dec5b27b1b gm107/ir: add a missing assertion in emitISCADD()
For consistency, similar to the other emitters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-01 11:56:49 +02:00
Ilia Mirkin
6af14778a3 gallium/targets: fix bool setting on BE architectures
val_bool and val_int are in a union. val_bool gets the first byte, which
happens to work on LE when setting via the int, but breaks on BE. By
setting the value properly, we are able to use DRI3 on BE architectures.
Tested by running glxgears with a NV34 in a G5 PPC.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: squash the vmwgfx hunk]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-29 14:32:20 +01:00
Brian Paul
52d69c2e8d st/wgl: whitespace, formatting fixes in stw_pixelformat.c
Trivial.
2017-04-28 22:01:34 -06:00
Charmaine Lee
ba8e2ea19a st/wgl: allow WGL_BIND_TO_TEXTURE_RGB_ARB for RGBA visuals
We do not need to restrict WGL_BIND_TO_TEXTURE_RGB_ARB to
RGB visuals only. It can be supported with RGBA visuals as well.

This fixes the early exit of cinebench-r15-test trace.

Tested with cinebench-r15, piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-28 22:01:24 -06:00
Brian Paul
d06045dfdd st/wgl: use ARRAY_SIZE() macro in wglChoosePixelFormatARB()
Trivial.
2017-04-28 21:37:07 -06:00
Brian Paul
394f8dacbc st/wgl: whitespace/formatting fixes in stw_ext_pixelformat.c
Trivial.
2017-04-28 21:37:06 -06:00
Neha Bhende
197907c926 svga: implement sRGB rendering for imported surfaces
If texture is imported and templ format is sRGB, use compatible sRGB format
to the imported texture format while creating surface view.

tested with MTT piglit, glretrace, viewperf and conform

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Neha Bhende
1b415a5b28 svga: add function svga_linear_to_srgb()
This function will return compatible svga srgb format for corresponding
linear format

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Thomas Hellstrom
ca59fd1706 svga: Add a more elaborate format compatibility determination v2
dri3 is a bit sloppy about its format compatibility requirements, so add
a possibility to import xrgb surfaces as argb textures and vice versa.

At the same time, make the svga_texture_from_handle() function a bit more
readable and fix the error path where we leaked a winsys surface.

v2: Addressed review comments by Brian.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Tim Rowley
18d5c452d0 swr/rast: add memory api to SwrGetInterface()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:09 -05:00
Tim Rowley
a46539af11 swr/rast: use gather instruction for odd format fetch
Small fetch performance optimization - use gather instruction
for odd format fetch instead of slow emulated code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:02 -05:00
Tim Rowley
eff909de7d swr/rast: enable SIMD16 8x2 tile backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:56 -05:00
Tim Rowley
5fde2ae533 swr/rast: add SwrInit() to init backend/memory tables
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:50 -05:00
Tim Rowley
e8d58049f6 swr/rast: increment depth/stencil tile pointer in SIMD16 BE
Misplaced #endif preventing depth and stencil hot tile pointers
from incrementing in SIMD16 8x2 configuration of BackendPixelRate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:42 -05:00
Tim Rowley
d4c1486737 swr/rast: add SwrGetInterface() function to return api
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:34 -05:00