Commit graph

62988 commits

Author SHA1 Message Date
Rob Clark
670418740f freedreno/a3xx: fix write to bogus register
The loops for updating the multiple packed fields in SP_VS_OUT[] and
SP_VS_VPC_DST[] will zero out one register beyond the last that on
required.  Which is normally not a problem (and is kinda convenient
when looking at cmdstream dumps) unless we have maximum (16) varyings.

Fix loop termination condition so that this does not happen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:26:35 -04:00
Rob Clark
c37889b5ac freedreno/a3xx: account for special inputs/outputs
We need to size input/output tables big enough for special inputs/
outputs (gl_Position, gl_FrontFacing, etc) which, while they don't
count towards the hw limit of 16 attributes or 16 varyings, we do
still need to track them all the same.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:26:35 -04:00
Rob Clark
5dcf59e142 freedreno/a3xx: fix MAX_INPUTS shader cap
Hardware only supports 16.  Which fd3_shader_variant properly reflected,
but the pipe cap did not, leading to array overflow (and shaders that
could not possibly work).

Also a bunch of asserts to make problems like this easier to see.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:25:53 -04:00
Rob Clark
e1896948da freedreno/a3xx: add debug flag to expose glsl130
We are starting to add integer support to the compiler, which does not
get exercised with glsl feature level 120 and without advertising
integer support.  But doing so breaks too many things right now.  So
for now use a debug flag to conditionally expose the functionality
while it is in development.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:20:29 -04:00
Ryan Houdek
ac2a8e3c9d freedreno/a3xx/compiler: add KILL_IF
The KILL_IF opcode could potentially be merged in to the regular KILL
opcode function.  It was a pain to do so, so I've left is separated
for cleanliness.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:19:43 -04:00
Ryan Houdek
a889049400 freedreno/a3xx/compiler: start adding integer support
Adds a large sum of TGSI opcodes to the a3xx compiler.

For integer opcodes we have 28 opcodes added.
Adds 4 floating point compare opcodes

If GLSL 1.30 is enabled, this allows the GLSL 1.30 piglits to have a
completion amount of 432/641.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 21:19:21 -04:00
Roland Scheidegger
8620730f8a draw: better llvm names for shaders for debugging.
All shaders had the same name.
We could probably use some identifier per shader too, but for now only use
the variant number.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-15 02:35:35 +02:00
Roland Scheidegger
65ad90bd1b llvmpipe: improve setup shader names (for debugging)
The setup shaders were composed of both a fs shader number and a variant
number. But since they aren't tied to a particular fragment shader, the
former was a fixed zero while the latter was also always zero because
it was never assigned. So, similar to what the fs code does, use a ever
increasing number to give it a more catchy name (unlike fragment shaders
though where this number is for each explicitly created shader, we just use
it for the implicitly created variants).
And while here, fix whitespace a bit.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-15 02:35:29 +02:00
Roland Scheidegger
1d28650b55 llvmpipe: kill off llvmpipe_variant_count
Unused except it was increased for both fs and setup shader variants created.
Probably some leftover from ages ago.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-15 02:35:26 +02:00
Roland Scheidegger
3e817e7e56 mesa/st: fix number of ubos being declared in a shader
Previously the code used the total number of ubos being declared in the
linked program (so the ubos of all shaders combined), use the number
from the particular shader instead.
This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks
seen in llvmpipe since 8a9f5ecdb1 as it now emits
code for each declared buffer, not just the ones actually used.

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-05-15 02:35:25 +02:00
Ben Skeggs
9c64cb80d2 nvc0: enable support for maxwell boards
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:54 +10:00
Ben Skeggs
d548d47edf nvc0: add maxwell (sm50) compiler backend
The big missing part here is proper sched data calculations, but
hopefully the chosen placeholder will be sufficient for now.

Passes piglit as well as GK107 does.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:49 +10:00
Ben Skeggs
7b9475fa65 nvc0: maxwell isa has no per-instruction join modifier
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:46 +10:00
Ben Skeggs
07d3972b49 nvc0: replace immd 0 with $rLASTGPR for emit/restart opcodes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:42 +10:00
Ben Skeggs
3723ff5223 nvc0: move nvc0 lowering pass class definitions into header
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:39 +10:00
Ben Skeggs
bede1bdb48 nvc0: bump sched data member to 32-bits
SM50 backend requires 21 bits per instruction, not 8.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:34 +10:00
Ben Skeggs
c42d7556d3 nvc0: use vertex arrays for eng3d blit
Maxwell doesn't have immediate-mode.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:29 +10:00
Ben Skeggs
edb1020ea5 nvc0: restrict "constant vbo" logic to fermi/kepler classes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:25 +10:00
Ben Skeggs
322460fdbc nvc0: replace some vb->stride checks with constant_vbo instead
Maxwell no longer has the methods to set constant attributes, and we'll
want to be treating stride 0 vtxbufs the same as for stride > 0.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:21 +10:00
Ben Skeggs
9306c3470f nvc0: add maxwell class
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:16 +10:00
Ben Skeggs
0079a375a5 nvc0: allow for easier modification of compiler library routines
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:54:12 +10:00
Ben Skeggs
737477dac3 nvc0: properly distribute macros in source form
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-15 09:53:56 +10:00
Emil Velikov
e48054d036 docs: Add a note about llvm-shared-libs and libxatracker
Both changes landed in 10.2, and for people not following the
development cycle these will come as a surprise. Note that the
pipe_* interface is not stable.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 23:44:08 +01:00
Brad King
6aac2637a6 automake: Honor GL_LIB for gallium libgl-xlib
Use "@GL_LIB@" in src/gallium/targets/libgl-xlib/Makefile.am to produce
the library name specified by the configure --with-gl-lib-name option.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-14 23:44:08 +01:00
Emil Velikov
f57d092199 configure: correctly set LD_NO_UNDEFINED
Commit 11623be934 was meant to have this hunk, which
I accidently dropped during git rebase.

Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
2014-05-14 23:44:08 +01:00
Roland Scheidegger
8a9f5ecdb1 gallivm: only fetch pointers to constant buffers once
In 1d35f77228 support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themeselves just
once (at constant buffer declaration time).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-14 16:23:33 +02:00
Roland Scheidegger
18c6454ad1 gallivm: fix output stream flushing in error case for disassembly.
When there's an error, also need to flush the stream, otherwise an assertion
is hit (meaning you don't actually see the error neither).
2014-05-14 16:23:33 +02:00
Michel Dänzer
c5828b0599 radeonsi: Fix anisotropic filtering state setup
Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-05-14 22:53:30 +09:00
Ilia Mirkin
12d97fb7c1 tgsi: support parsing texture offsets from text tgsi shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 09:40:37 -04:00
Ilia Mirkin
04b7e65814 mesa/st: provide native integers implementation of ir_unop_any
Previously, ir_unop_any was implemented via a dot-product call, which
uses floating point multiplication and addition. The multiplication was
completely pointless, and the addition can just as well be done with an
or. Since we know that the inputs are booleans, they must already be in
canonical 0/~0 format, and the final SNE can also be avoided.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 09:40:37 -04:00
Rob Clark
209522070e gallium/docs: clarify when query results are reset
It wasn't completely clear from the docs, so I had to figure out by
looking at piglit results.  Hopefully this saves the next driver writer
implementing queries some time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-14 07:54:02 -04:00
José Fonseca
b18b7781b2 gallivm: Remove lp_func_delete_body.
Not necessary, now that we will free the whole module (hence all
function bodies) immediately after compiling.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
a6f5cc66db gallivm: Remove gallivm_free_function.
Unused.  Deprecated by gallivm_free_ir().

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
0b239d9ed9 llvmpipe: Delete unneeded LLVM stuff earlier.
Same as Frank's change to draw module but for llvmpipe module.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
Frank Henigman
ef14f0d59f draw: Delete unneeded LLVM stuff earlier.
Free up unneeded LLVM stuff immediately after generating vertex shader
code.  Saves about 500K per shader.

v2: Don't bother calling gallivm_free_function (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
Frank Henigman
865d0312c0 gallivm: Separate freeing LLVM intermediate data from freeing final code.
Split free_gallivm_state() into two steps.  First step is
gallivm_free_ir() which cleans up the LLVM scaffolding used to generate
code while preserving the code itself.  Second step is
gallivm_free_code() to free the memory occupied by the code.

v2: s/gallivm_teardown/gallivm_free_ir/ (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
Frank Henigman
2c73102dc3 gallivm: One code memory pool with deferred free.
Provide a JITMemoryManager derivative which puts all generated code into
one memory pool instead of creating a new one each time code is generated.
This saves significant memory per shader as the pool size is 512K and
a small shader occupies just several K.

This memory manager also defers freeing generated code until you tell
it to do so, making it possible to destroy the LLVM engine while keeping
the code, thus enabling future memory savings.

v2: Fix compilation errors with LLVM 3.4 (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
2ea923cf57 gallivm: Run passes per module, not per function.
This is how it is meant to be done nowadays.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
920933e09e gallivm: Use LLVM global context.
I saw that LLVM internally uses its global context for some things, even
when we use our own.  Given ours is also global, might as well use
LLVM's.

However, sepearate contexts can still be enabled with a simple source
code modification, for when the need/benefit arises.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
69f0835ff1 gallivm: Stop using module providers.
Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so
its use just causes unnecessary confusion.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:05:00 +01:00
José Fonseca
9cf67e51b0 gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.
Older versions haven't been tested probably don't work anyway.  But more
importantly, code supporting it is hindering further work.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:04:59 +01:00
José Fonseca
ecef2da0b2 configure: Require LLVM 3.1.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:04:59 +01:00
José Fonseca
c0ef9a67d3 scons: Require LLVM 3.1
Support for prior versions will be removed in the following change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-14 11:04:59 +01:00
Matt Turner
2012599abb i965: Reformat brw_set_src1 so it can be easily found with grep. 2014-05-13 22:40:01 -07:00
Samuel Iglesias Gonsalvez
e0dc018fd5 i965: fix size assert for gen7 in brw_init_compaction_tables()
It should compare with it's own size.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
2014-05-13 22:35:42 -07:00
Iago Toral Quiroga
520dfa4b5c i965: Relax accumulator dependency scheduling on Gen < 6
Many instructions implicitly update the accumulator on Gen < 6. The instruction
scheduling code just calls add_barrier_deps() for each accumulator access on
these platforms, but a large class of operations don't actually update the
accumulator -- mostly move and logical instructions. Teaching the scheduling
code about this would allow more flexibility to schedule instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-05-13 22:33:59 -07:00
Jonathan Gray
0c0bbe77d0 glsl: simplify the M_PI*f macros, fixes build on OpenBSD
The M_PI*f macros used a preprocessor paste to append 'f'
to M_PI defines, which works if the values are only numbers
but breaks on OpenBSD where M_PI definitions have casts
and brackets to meet requirements of a future version of POSIX,

http://austingroupbugs.net/view.php?id=801
http://austingroupbugs.net/view.php?id=828

Simplify the M_PI*f macros by using casts directly in the defines
as suggested by Kenneth Graunke.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2014-05-13 22:30:22 -07:00
Carl Worth
a5769ad373 docs: Really add the 10.1.3 release nots this time
Commit a96c3bccf6 intended to add these, but I
forgot to add the file.
2014-05-13 17:30:17 -07:00
Rob Clark
f999c13176 freedreno/a3xx: occlusion query support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13 18:33:19 -04:00
Rob Clark
b8f78e1890 freedreno: add support for hw queries
Real GPU queries need some infrastructure to track samples per tile and
accumulate the results.  But fortunately this can be shared across GPU
generation.

See:
https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13 18:33:19 -04:00