22131 commits

Author SHA1 Message Date
Marek Olšák
02134cfaae radeonsi: use tgsi_shader_info to get a list of GS outputs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
101905d3f7 radeonsi: use tgsi_shader_info in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
6f04cf7fac radeonsi: simplify dereferences in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
639f6b41d2 radeonsi: use tgsi_shader_info in si_shader_vs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
fa933438a2 radeonsi: use tgsi_shader_info in si_shader_ps
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
e23fec1445 radeonsi: use tgsi_shader_info in fetch_input_gs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:51 +02:00
Marek Olšák
7a645c5366 radeonsi: don't rely on shader->output in si_llvm_emit_fs_epilogue
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:16 +02:00
Marek Olšák
216cf86ec4 radeonsi: use tgsi_shader_info in si_llvm_emit_es_epilogue
tgsi_shader_info contains everything we need.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:13 +02:00
Marek Olšák
34e8200599 radeonsi: don't recompile shaders when changing nr_cbufs from 0 to 1
Both cases are equivalent.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:07 +02:00
Marek Olšák
5e0fbe1b63 radeonsi: remove vs.ucps_enabled from the shader key
Written CLIPDIST outputs are simply disabled in PA_CL_VS_OUT_CNTL.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:02 +02:00
Marek Olšák
a9592cd3ac radeonsi: assume ClipDistance usage mask is always 0xf
No code in Mesa sets the usage mask to any other value.
The final mask is AND'ed with enable bits from the rasterizer state anyway.

If somebody implements setting usage masks in st/mesa, we can use
tgsi_shader_info to get it more easily.

This is a prerequisite for the following commit.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:51:44 +02:00
Francisco Jerez
2286edce16 clover: Fix unintended fall-through in kernel::argument::bind. 2014-10-12 11:44:05 +03:00
Jan Vesely
5bffc5e262 clover: Append implicit arguments to the kernel argument list.
[ Francisco Jerez: Split off from a larger patch, and take a slightly
  different approach for passing the implicit arguments around. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-12 01:50:13 +03:00
Francisco Jerez
bf89a97748 clover: Pass execution dimensions and offset to the kernel as implicit arguments.
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-12 01:44:19 +03:00
Francisco Jerez
06139c56fa clover: Add semantic information to module::argument for implicit parameter passing.
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-12 01:39:21 +03:00
Francisco Jerez
27c51b5f58 clover: Use unreachable() from util/macros.h instead of assert(0).
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-11 12:44:09 +03:00
Vinson Lee
5480d6b13f gallium: Add tokens for DragonFly BSD.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Brian Paul <brianp@vmware.com>
2014-10-10 21:32:35 -07:00
Chia-I Wu
566d1889ea ilo: disassemble compacted instructions
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-11 11:55:50 +08:00
Eric Anholt
070b2c2efc vc4: Use the fnv1 hash function instead of gallium util's crc32.
Improves simulated norast performance on a little benchmark by 13.4012%
+/- 2.08459% (n=13).
2014-10-10 15:49:34 +02:00
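For readers unfamiliar with the hash named in the commit above, here is a minimal sketch of 32-bit FNV-1. It is not the code from the vc4 driver; the function name `fnv1_32` is made up for illustration, and only the offset basis and prime are the standard published FNV constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the vc4 code): 32-bit FNV-1 over a byte buffer.
 * FNV-1 multiplies by the prime first, then XORs in the next byte. */
static uint32_t
fnv1_32(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = 2166136261u;      /* standard 32-bit FNV offset basis */

   for (size_t i = 0; i < size; i++) {
      hash *= 16777619u;             /* standard 32-bit FNV prime */
      hash ^= bytes[i];
   }
   return hash;
}
```

The loop does one multiply and one XOR per byte and needs no lookup table, which is one plausible reason it can be cheaper than a table-driven CRC32 on small keys.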
Eric Anholt
d09509da2a vc4: Don't look up the compiled shaders unless state has changed.
Improves simulated norast performance on a little benchmark by 38.0965%
+/- 3.27534% (n=11).
2014-10-10 15:49:22 +02:00
Eric Anholt
c6f50c4086 vc4: Actually clear the context's dirty flags.
I was trying to skip state updates when !dirty, and suspiciously
everything was always dirty.
2014-10-10 15:03:13 +02:00
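The dirty-flag pattern this commit refers to can be sketched as follows. This is an illustration under assumptions, not the vc4 code: the struct, the flag names and the `draw` function are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real driver has its own set. */
#define DIRTY_SHADER   (1u << 0)
#define DIRTY_UNIFORMS (1u << 1)

struct context {
   uint32_t dirty;         /* bitmask of state changed since the last draw */
   unsigned state_updates; /* counts how often state was re-derived */
};

/* Re-derive state only when something is dirty, then clear the flags.
 * Without the final clear, every draw would look dirty and the skip
 * would never trigger. */
static bool
draw(struct context *ctx)
{
   bool updated = false;

   if (ctx->dirty) {
      ctx->state_updates++;
      updated = true;
   }
   ctx->dirty = 0;
   return updated;
}
```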
Eric Anholt
7c474f9f2e vc4: Optimize the other case of SEL_X_Y with a 0 -> SEL_X_0(a).
Cleans up some output to be more obvious in a piglit test I'm looking at.
2014-10-10 15:03:12 +02:00
Eric Anholt
7e67ea994c vc4: Optimize out adds of 0. 2014-10-09 21:47:06 +02:00
Eric Anholt
0401f55fff vc4: Optimize fmul(x, 0) and fmul(x, 1).
This was being generated frequently by matrix multiplies of 2 and
3-channel vertex attributes (which have the 0 or 1 loaded in the shader).
2014-10-09 21:47:06 +02:00
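The algebraic rewrites in this group of commits (adds of 0, fmul by 0 or 1, and the shared turn-it-into-a-mov helper) could look roughly like the sketch below. The instruction representation here is invented for illustration and is not the driver's IR. Folding fmul(x, 0) to 0 also ignores NaN and infinity inputs, which is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical miniature IR, for illustration only. */
enum op { OP_MOV, OP_FADD, OP_FMUL };
enum src_kind { SRC_TEMP, SRC_CONST };

struct src {
   enum src_kind kind;
   int index;     /* temporary number when kind == SRC_TEMP */
   float value;   /* constant value when kind == SRC_CONST */
};

struct inst {
   enum op op;
   struct src src[2];
};

static bool
is_const(const struct src *s, float v)
{
   return s->kind == SRC_CONST && s->value == v;
}

/* Shared helper: rewrite the instruction in place as MOV of one source. */
static void
make_mov(struct inst *inst, struct src s)
{
   inst->op = OP_MOV;
   inst->src[0] = s;
   inst->src[1] = (struct src){ SRC_CONST, 0, 0.0f };
}

/* Returns true if the instruction was simplified. */
static bool
opt_algebraic(struct inst *inst)
{
   for (int a = 0; a < 2; a++) {
      const struct src other = inst->src[1 - a];

      if (inst->op == OP_FADD && is_const(&inst->src[a], 0.0f)) {
         make_mov(inst, other);          /* x + 0 -> x */
         return true;
      }
      if (inst->op == OP_FMUL && is_const(&inst->src[a], 1.0f)) {
         make_mov(inst, other);          /* x * 1 -> x */
         return true;
      }
      if (inst->op == OP_FMUL && is_const(&inst->src[a], 0.0f)) {
         make_mov(inst, inst->src[a]);   /* x * 0 -> 0 (not NaN-safe) */
         return true;
      }
   }
   return false;
}
```

Checking both operand positions in one loop keeps each rule written once, which is the kind of duplication a factored-out mov helper avoids.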
Eric Anholt
1cd8c1aab0 vc4: Factor out the turn-it-into-a-mov in opt_algebraic.
This will be used more in the next commits.
2014-10-09 21:47:06 +02:00
Eric Anholt
40748cf8d9 vc4: Eliminate unused texture instructions. 2014-10-09 21:47:06 +02:00
Eric Anholt
b73cab6826 vc4: Dead code eliminate unused SF instructions. 2014-10-09 21:47:06 +02:00
Eric Anholt
93cac2637b vc4: Prevent copy propagating out the MOVs from r4.
Copy propagating these might result in reading the r4 after some other
instruction has written r4.  Just prevent all copy propagation of this for
now.

Fixes bad rendering with upcoming indirect register access support, where
the copy propagation was consistently happening across another read.
2014-10-09 21:47:06 +02:00
Eric Anholt
c4b0dd5356 vc4: Split the coordinate shader to its own vc4_compiled_shader.
Merging VS and CS into the same struct wasn't winning us anything except
for not allocating a separate BO (but if we want to pack programs into
BOs, we should pack not just those 2 programs together).  What it was
getting us was a bunch of code duplication about hash table lookups and
propagating vc4_compile contents into a vc4_compiled_shader.

I was about to make the situation worse with indirect uniform buffer
access.
2014-10-09 21:47:06 +02:00
Eric Anholt
5c72d7706c vc4: Add #defines for the texture uniform fields.
I wanted to make another set of texture uploads for handling reladdr
constants, and duplicating all the bitshifting looked like a terrible
idea.  In the process, this fixes a swap of the s/t texture wrap modes.
2014-10-09 21:47:06 +02:00
Eric Anholt
5cfab07639 vc4: Initialize undefined temporaries to 0.
Under the simulator, reading registers before writing them triggers an
assertion failure.  c->undef gets treated as r0, which will usually be
written, but not if it's used in the first instruction.  We should
definitely not be aborting in this case, and return some sort of undefined
value instead.

Fixes glsl-user-varying-ff.
2014-10-09 21:47:06 +02:00
Michel Dänzer
7b4276d7ac r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers
Putting those in VRAM can cause long pauses due to buffers being moved
into / out of VRAM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-09 18:11:44 +09:00
Eric Anholt
5a13522898 vc4: Optimize SF(ITOF(x)) -> SF(x).
This is a common production of st_glsl_to_tgsi, because CMP takes a float
argument.
2014-10-09 11:01:18 +02:00
Eric Anholt
00a9aebfe0 vc4: Add some optimization of FADD(FSUB(0, x)).
This is a common production of st_glsl_to_tgsi, which uses negate flags on
source arguments to handle subtraction.
2014-10-09 11:01:18 +02:00
Eric Anholt
67aea92964 vc4: Mostly fix offset calculation for NPOT mipmap levels.
The non-base NPOT levels are stored as POT-aligned images.  We get that
POT alignment by minifying the POT-aligned base level.

This means that level strides are also POT aligned, so we have to tell the
rendering mode config that our resource is larger than the actual
requested area.

Fixes the fbo-generatemipmap-formats NPOT cases.  Regresses
depthstencil-render-miplevels 273 * -- the texture presentation now works
(where it was completely broken before), it looks like there's some
overflow of image bounds happening at the lower miplevels.
2014-10-09 11:01:09 +02:00
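The sizing rule described in this commit (non-base levels derived by minifying the POT-aligned base level) can be illustrated with the sketch below. The helper names are hypothetical and this is not the driver's layout code; it only shows how the aligned level size differs from plain minification of the requested size.

```c
#include <stdint.h>

/* Smallest power of two >= v. Assumes 1 <= v <= 2^31. */
static uint32_t
next_pot(uint32_t v)
{
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* Halve a dimension once per mip level, never going below 1.
 * Assumes level < 32. */
static uint32_t
minify(uint32_t size, unsigned level)
{
   uint32_t m = size >> level;
   return m ? m : 1;
}

/* Storage size of a mip level when the POT-aligned base is minified,
 * as opposed to minifying the requested NPOT size directly. */
static uint32_t
pot_level_size(uint32_t base_size, unsigned level)
{
   return minify(next_pot(base_size), level);
}
```

For a base width of 100, level 1 occupies 64 texels of storage under this rule, while the requested image at that level is only 50 wide, which is why the stride ends up larger than the requested area.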
Eric Anholt
0b96a086cb vc4: Move the mirrored kernel code to a kernel/ directory.
Now this whole setup matches the kernel's file layout much more closely.
2014-10-09 09:46:39 +02:00
Eric Anholt
ef9914aa74 vc4: Enable LIT lowering in TGSI instead of our own code.
This brings us the -128/128 clamping on the w component.
2014-10-08 22:47:39 +02:00
Eric Anholt
9773d45908 vc4: Fix scalar math opcodes to replicate their result from the X channel.
Thanks to robclark for pointing out that I was probably failing to do this
when I reported a "bug" in his lowering code.
2014-10-08 22:47:39 +02:00
Chia-I Wu
4e50a32be6 ilo: fix rectlist on GEN7+
It was broken by 343b014b57.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-09 03:37:04 +08:00
Eric Anholt
581418585e vc4: Add support for two-sided color.
It's fairly easy, thanks to Rob Clark's lowering code.  Fixes
two-sided-lighting and 4 vertex-program-two-side testcases, while
regressing 8 testcases that involve enabling two-sided color while only
initializing one of the two colors in the VS.  If you're enabling two
sided color, it's of course expected that you really do set up both
colors, so this is still an improvement (and when we set up a linker for
TGSI, we'll hopefully fix those 8 fails).
2014-10-08 17:45:16 +02:00
Eric Anholt
4dccdbf5cb vc4: Enable POW lowering in TGSI instead of our own code. 2014-10-08 17:42:59 +02:00
Eric Anholt
1aef5a337f vc4: Enable DP lowering in TGSI instead of our own code. 2014-10-08 17:42:59 +02:00
Eric Anholt
4f6e4c7370 vc4: Start using tgsi_lowering for opcodes we haven't supported before. 2014-10-08 17:42:59 +02:00
Eric Anholt
f9854e169f gallium: Rename freedreno parts of tgsi_lowering.[ch].
Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
19df602b39 gallium: Reformat tgsi_lowering.c for the normal style.
Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
3141dc8e87 gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing.
Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.

Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
84caf5a861 vc4: Set unused raddr fields to QPU_R_NOP.
The simulator assertion fails if you have a write to a reg and then a read
(for example, in the NOP side of an instruction), even if the read isn't
used for anything.  By setting unused raddrs to NOP, we avoid the problem
(since only the physical registers are tracked).
2014-10-08 17:42:59 +02:00
Eric Anholt
48af7426f2 vc4: Abstract out the field-merging logic for instructions.
I'm going to be doing the same logic for some more fields next.
2014-10-08 17:42:59 +02:00
Niels Ole Salscheider
acdcef6788 r600: Use DMA transfers in r600_copy_global_buffer
v2: Do not demote items that are already in the pool

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2014-10-07 15:59:43 -04:00
Michel Dänzer
be0a994fb8 radeonsi: Use dummy pixel shader if compilation of the real shader failed
Instead of crashing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-07 12:07:13 +09:00