Commit graph

27966 commits

Author SHA1 Message Date
Francisco Jerez
86100e13ab clover/llvm: Trivial assorted cleanups for invocation.cpp.
Drop a few include and using directives which are no longer necessary.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:34 -07:00
Francisco Jerez
520cc26859 clover/llvm: Split native codegen into separate file.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:34 -07:00
Francisco Jerez
8195637363 clover/llvm: Split bitcode codegen into separate file.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
71ac9820d6 clover/llvm: Split shared codegen support code into separate file.
This is the common part of the code used to generate a clover::module
from LLVM bitcode, shared between the native and LLVM paths.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
26fa9bfd0d clover/llvm: Define function for bitcode print-out.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
f0721020ad clover/llvm: Split native codegen and assembly print-out into separate functions.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
1d042adc0a clover/llvm: Clean up bitcode codegen.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
952d1e6fd6 clover/llvm: Use metadata introspection utils for kernel enumeration.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
d37d5842c1 clover/llvm: Use metadata introspection utils for kernel argument set-up.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:33 -07:00
Francisco Jerez
3ed31bbf05 clover/llvm: Add simplified utility functions for metadata introspection.
v2: Fix for latest LLVM from SVN.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net> (v1)
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:34:30 -07:00
Francisco Jerez
7da2c1ff0f clover/llvm: Clean up codestyle of get_kernel_args().
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:59 -07:00
Francisco Jerez
0601fe7438 clover/llvm: Fold compile_native() call into build_module_native().
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:56 -07:00
Francisco Jerez
f98422eafd clover/llvm: Factor out duplicated construction of clover::module.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:53 -07:00
Francisco Jerez
3ce6ab068c clover/llvm: Clean up compile_native().
This switches compile_native() to the C++ API (which the rest of this
file makes use of anyway so there is little benefit from using the C
API), which should get rid of some boilerplate and fix a leak
of the TargetMachine object in the error path.

v2: Additional fixes for LLVM 3.6.
v3: Update for the latest LLVM SVN changes.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:50 -07:00
Francisco Jerez
7bcefa5903 clover/llvm: Clean up ELF parsing.
This function was doing three separate things:
 - Initializing and releasing the ELF parsing state (the latter can be
   better done using RAII).
 - Searching for the symbol table in the ELF file.
 - Extraction of kernel symbol offsets from the symbol table.

Split each one into a separate function for clarity and clean up the
result slightly.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:48 -07:00
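
The RAII point made in commit 7bcefa5903 above (releasing the ELF parsing state automatically instead of by hand) can be sketched roughly as follows. This is only an illustration under stated assumptions: `parser_state`, `parser_begin()`, `parser_end()` and `parse()` are hypothetical stand-ins for a C-style handle API, not libelf or clover names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C-style parsing handle; not libelf's real API.
struct parser_state {
   int *live_count;   // points at a counter of currently open handles
};

parser_state *parser_begin(int *live_count) {
   ++*live_count;
   return new parser_state{live_count};
}

void parser_end(parser_state *s) {
   --*s->live_count;
   delete s;
}

// Deleter so that std::unique_ptr releases the handle with parser_end().
struct parser_deleter {
   void operator()(parser_state *s) const { parser_end(s); }
};

using scoped_parser = std::unique_ptr<parser_state, parser_deleter>;

// Returns false on the simulated error path. The handle is released on
// every return, without an explicit clean-up call at each exit.
bool parse(int *live_count, bool fail) {
   scoped_parser p(parser_begin(live_count));
   assert(p != nullptr);
   if (fail)
      return false;
   return true;
}
```

Because the release lives in the deleter, early returns in the error path cannot leak the handle.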
Francisco Jerez
574477e599 clover/llvm: Move a bunch of utility functions into separate file.
Some of these will be useful from a different compilation unit in the
same subtree so put them in a publicly accessible header file.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:43 -07:00
Francisco Jerez
92247cef3f clover/llvm: Tidy debug handling.
The most significant change is that debugging flags are now a scoped enum
and all debugging helpers live in the debug namespace.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:40 -07:00
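
As a rough illustration of the pattern commit 92247cef3f describes (a scoped enum for debugging flags, with helpers in a `debug` namespace), a minimal sketch might look like this. The enumerator names and the `has_flag()` signature are assumptions made for the example and may not match the actual clover code.

```cpp
#include <cassert>
#include <cstdint>

namespace debug {
   // Scoped enum: enumerators do not leak into the enclosing namespace and
   // do not convert implicitly to integers. Names here are hypothetical.
   enum class flag : std::uint32_t {
      none   = 0,
      clc    = 1u << 0,
      llvm   = 1u << 1,
      native = 1u << 2
   };

   // Combine two flag sets.
   constexpr flag operator|(flag a, flag b) {
      return static_cast<flag>(static_cast<std::uint32_t>(a) |
                               static_cast<std::uint32_t>(b));
   }

   // True if every bit of `f` is set in `enabled`.
   constexpr bool has_flag(flag enabled, flag f) {
      return (static_cast<std::uint32_t>(enabled) &
              static_cast<std::uint32_t>(f)) == static_cast<std::uint32_t>(f);
   }
}
```

Callers then have to spell out `debug::flag::llvm`, which keeps flag names from colliding with other identifiers in the file.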
Francisco Jerez
4614397ac2 clover/llvm: Use helper function to abort compilation with error message.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:37 -07:00
Francisco Jerez
423eecb76a clover/llvm: Simplify diagnostic_handler().
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:29 -07:00
Francisco Jerez
5884dfbc2a clover/llvm: Trivial codestyle clean-up for optimize().
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:22:21 -07:00
Francisco Jerez
bdc27f13d5 clover/llvm: Clean up compilation into LLVM IR.
Some assorted and mostly trivial clean-ups for the source to bitcode
compilation path.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:21:50 -07:00
Francisco Jerez
714b167f57 clover/llvm: Factor out LLVM context init.
So it can be shared between the compilation and linking codepaths.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:21:30 -07:00
Francisco Jerez
fa94055d53 clover/llvm: Declare compiler instance at top level and pass down as argument.
This allows simplifying the interface of compile_llvm() because it no
longer needs to read out and return the optimization level and address
space map from the compiler instance.  Instead declare the compiler
instance at the top level so that both properties are available
directly.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:21:13 -07:00
Francisco Jerez
a27d4ec3b9 clover/llvm: Refactor compiler instance initialization.
This will be shared between the compiler and linker codepaths.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:21:08 -07:00
Francisco Jerez
c2a167ad73 clover/llvm: Factor out compiler option tokenization.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:20:47 -07:00
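
Commit c2a167ad73 gives no detail on how the tokenization works, so the following is only a generic sketch of splitting a compiler option string on whitespace. The function name `tokenize()` and the splitting rule are assumptions; in particular, this sketch does not handle quoted arguments.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split an option string into whitespace-separated tokens.
// Assumption: no quoting or escaping needs to be honoured.
std::vector<std::string> tokenize(const std::string &opts) {
   std::vector<std::string> tokens;
   std::istringstream ss(opts);
   std::string tok;

   // operator>> skips leading whitespace and stops at the next whitespace.
   while (ss >> tok)
      tokens.push_back(tok);

   return tokens;
}
```

Keeping this in one helper means the compile and link paths can share it instead of each parsing the option string themselves.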
Francisco Jerez
c513cfa747 clover/llvm: Factor out target string parsing.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:20:41 -07:00
Francisco Jerez
251054220e clover/llvm: Collect #ifdef mess into a separate file.
This gets rid of most ifdefs in the invocation.cpp code -- only a
couple of them are left, which will be removed differently in the
following commits.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:20:12 -07:00
Francisco Jerez
11afde89b8 clover/llvm: Drop dead code.
This ifdef'ed out code was meant to handle compilation into TGSI, but
it doesn't seem likely that it will ever be useful even if the TGSI
back-end is resurrected because the TGSI bitcode can just be plumbed
through in ELF format and dealt with as a regular "native" back-end.

Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:20:05 -07:00
Francisco Jerez
600ac51448 clover/llvm: Drop support for LLVM < 3.6.
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-07-11 20:19:49 -07:00
Ben Skeggs
0d911a720d nvc0: initial support for GP100 GPUs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-07-12 10:56:35 +10:00
Samuel Pitoiset
9bc083284f nvc0: use a define for the driver constant buffer size
This might avoid mistakes if the size is bumped in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-11 22:30:41 +02:00
Samuel Pitoiset
31a615677b nvc0: fix the driver cb size when draw parameters are used
The size of the driver constant buffer for each stage should be 2048
and not 512 because it has been increased recently for buffers/images.
While we are at it, do the same change for indirect draws.

This fixes all ARB_shader_draw_parameters tests on GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-11 22:11:27 +02:00
Samuel Pitoiset
19d0450b27 nvc0/ir: fix images indirect access on Fermi
This fixes the following piglits:

arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index
arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-11 21:01:21 +02:00
Marek Olšák
d7b6f90684 gallivm: set LLVMNoUnwindAttribute on all intrinsics
RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement:

 Application            Files    SGPRs     VGPRs   SpillSGPR SpillVGPR Code Size    LDS    Max Waves   Waits
 unigine_valley           278    0.00 %   -0.29 %    0.00 %    0.00 %    0.01 %    0.00 %    0.17 %    0.00 %

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-07-11 19:06:05 +02:00
Nicolai Hähnle
374aa2bb27 gallium/u_queue: assert that users must wait on fences before destroying them
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:04:44 +02:00
Nicolai Hähnle
a0a616720a gallium/u_queue: guard fence->signalled checks with fence->mutex
I have seen a hang during application shutdown that could be explained by the
following race condition which this patch fixes:

1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true.
2. Main thread calls util_queue_job_wait, which returns immediately.
3. Main thread deletes the job and fence structures, leaving garbage behind.
4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because
   it is accessing garbage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:03:59 +02:00
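
The locking discipline behind the fix in commit a0a616720a can be sketched with standard C++ primitives. This is not the gallium code: the `fence` struct and the `fence_signal()`, `fence_wait()` and `fence_is_signalled()` helpers are hypothetical, and `std::mutex`/`std::condition_variable` stand in for the pipe_* wrappers. The point illustrated is that the flag is written and read only while holding the mutex, so a waiter cannot observe `signalled == true` and return until the signalling thread has finished its broadcast and released the lock.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical fence type; the names only loosely mirror the commit text.
struct fence {
   std::mutex mutex;
   std::condition_variable cond;
   bool signalled = false;
};

void fence_signal(fence &f) {
   std::lock_guard<std::mutex> lock(f.mutex);
   f.signalled = true;
   f.cond.notify_all();
   // The mutex is released only here, after the broadcast has been issued.
}

void fence_wait(fence &f) {
   std::unique_lock<std::mutex> lock(f.mutex);
   // The predicate is evaluated with the mutex held.
   f.cond.wait(lock, [&f] { return f.signalled; });
}

bool fence_is_signalled(fence &f) {
   std::lock_guard<std::mutex> lock(f.mutex);
   return f.signalled;
}
```

With this ordering, step 2 of the race above cannot complete before step 4, so the waiter never frees the fence while the worker is still touching it.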
Nicolai Hähnle
b479c47a9c radeonsi: fix bad assertion in si_emit_sample_mask
The blitter sets mask == 1, which is fine since it doesn't use smoothing.
Fixes a regression introduced in commit 5bcfbf91.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-09 19:46:54 +02:00
Christian König
64ac4aef27 radeon/uvd: simplify sending context buffer message
Just send it whenever it is allocated.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-07-08 21:03:32 +02:00
Christian König
6b474e06a2 radeon/uvd: fix context buffer destruction in the error path
Destroying an unallocated buffer is harmless.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-07-08 21:03:32 +02:00
Christian König
36df04dac4 radeon/uvd: move polaris fw check into radeon_video.c v2
It's actually not very clever to claim to support H.264
and then fail to create a decoder.

v2: prefix FW macro with UVD_.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-07-08 21:03:31 +02:00
Christian König
5290bf43c8 radeon/video: fix coding style in radeon_video.c v2
v2: fix other tabs as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-07-08 21:03:31 +02:00
Brian Paul
74163475b0 svga: simplify/fix 1D/2D array resource copies
Fixes one of the piglit arb_copy_image-targets tests for 1D arrays.
Previously, we were applying the 1D array z/face adjustment twice.

Also simplify the copy_region_vgpu10() function.  It never has to copy
multiple array layers/slices.  The Mesa code for glCopyImageSubData does
the loop over slices/faces.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-07-08 12:53:21 -06:00
Brian Paul
fb26317604 svga: remove unused variable
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-07-08 12:53:21 -06:00
Brian Paul
689293ad52 svga: add dumping for more device commands
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-07-08 12:53:21 -06:00
Brian Paul
599c333d07 svga: silence a couple unused variable warnings
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-07-08 12:53:20 -06:00
Charmaine Lee
c3c7ff014b svga: rebind using render target surfaces in hw draw state
Currently, when we rebind framebuffer resources at the beginning of
the command buffer, we use the color buffer surfaces saved in the context
hw clear state. But those surfaces could differ from the actual
emitted render target surfaces if any of the color buffer surfaces
is also used as a shader resource; in that case, we create
a backed surface for the collided render target surface. So to rebind
the framebuffer resources correctly, use the render target surfaces saved
in the context hw draw state.

Tested with Heaven, Lightsmark2008, MTT piglit, glretrace, conform.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-07-08 12:53:20 -06:00
Charmaine Lee
da98cee067 svga: invalidate gb surface before it is reused
With this patch, a guest-backed surface will be invalidated
using the SVGA_3D_CMD_INVALIDATE_GB_SURFACE command before
the surface is reused. This fixes the updating dirty image error
from the device when a surface is reused.

v2: Instead of invalidating the surface when it is reused,
    send the invalidate command before the surface is put into
    the recycle pool.

v3: (1) surface invalidation is a no-op in the Linux winsys, since
        surface invalidation is not needed for the DMA path.
    (2) Instead of invalidating the surface content in
        svga_screen_surface_destroy() when a surface is to be destroyed,
        it is done in svga_screen_cache_flush() when the surface is
        no longer referenced in a command buffer and is ready to
        be moved to the unused list. At this point, the surface will
        be moved to the invalidate list. When the surface invalidation
        is submitted, the surface will be moved to the unused list.

Tested with piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-07-08 12:53:20 -06:00
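
The list handling described in the v3 notes of commit da98cee067 can be pictured with a small model. Everything below is a hypothetical sketch: the `surface` and `surface_cache` types and the two functions are invented for illustration and do not reflect the svga driver's data structures. It only models the order of moves: unreferenced surfaces go to an invalidate list on flush, and reach the unused list once the invalidation has been submitted.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical surface record: an id plus whether a command buffer still
// references it.
struct surface {
   int id;
   bool referenced;
};

// Hypothetical cache with the three lists mentioned in the commit message.
struct surface_cache {
   std::vector<surface> in_use;
   std::vector<surface> invalidate;
   std::vector<surface> unused;
};

// On flush, surfaces that are no longer referenced move to the invalidate
// list rather than straight to the unused list.
void cache_flush(surface_cache &c) {
   auto first_free = std::stable_partition(
      c.in_use.begin(), c.in_use.end(),
      [](const surface &s) { return s.referenced; });

   c.invalidate.insert(c.invalidate.end(), first_free, c.in_use.end());
   c.in_use.erase(first_free, c.in_use.end());
}

// Once the invalidate commands have been submitted, the surfaces become
// available for reuse.
void invalidation_submitted(surface_cache &c) {
   c.unused.insert(c.unused.end(), c.invalidate.begin(), c.invalidate.end());
   c.invalidate.clear();
}
```

The intermediate list is what guarantees that a surface is never handed out for reuse before its invalidation has gone to the device.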
Brian Paul
ca531aeeb1 svga: fix use of provoking vertex control
If the SVGA3D_DEVCAP_DX_PROVOKING_VERTEX query returns false, never
define rasterizer state objects with provokingVertexLast set.  Despite
what the device reports, it may interpret the provokingVertexLast flag
anyway.  This fixes an issue when using capability clamping.

Tested with piglit provoking-vertex and glsl-fs-flat-color tests.

VMware bug 1550143.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-07-08 12:53:20 -06:00
Nayan Deshmukh
af18a04755 vl: add half pixel to v_tex before adding offsets
Since the pixel center lies at 0.5, add half_pixel to vtex
before adding offsets to it.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-08 20:51:12 +02:00
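
The arithmetic behind commit af18a04755 can be illustrated in one dimension. This is a sketch under assumptions, not the vl shader code: it assumes normalized coordinates, takes `half_pixel` to be 0.5 / width, and uses a hypothetical `tap_coord()` helper. The half-pixel shift moves the coordinate to the texel center first, and only then is the per-tap offset applied.

```cpp
#include <cassert>

// Normalized coordinate of the tap `offset` texels away from texel `x` in a
// row of `width` texels. Hypothetical helper for illustration only.
float tap_coord(int x, int offset, int width) {
   const float half_pixel = 0.5f / width;               // assumed definition
   const float v_tex = static_cast<float>(x) / width;   // texel's left edge

   // Move to the texel center first, then apply the offset.
   return (v_tex + half_pixel) + static_cast<float>(offset) / width;
}
```

For a 4-texel row, texel 0 has its center at 0.125 and its right-hand neighbour at 0.375.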
Samuel Pitoiset
a0bf1768c7 nvc0/ir: remove unused resource info loading helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-08 19:12:23 +02:00