22191 commits

Author SHA1 Message Date
EdB
01d94193ac clover: Don't return CL_INVALID_VALUE if there is no header.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:35:10 +03:00
EdB
aa93af809f clover: Add allow_empty_tag.
To allow empty objs() list checks.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:35:10 +03:00
EdB
611d66fe45 clover: Add initial implementation of clCompileProgram for CL 1.2.
[ Francisco Jerez: General clean-up. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:34:51 +03:00
EdB
fead2b0463 clover: Add a simple compat::pair.
std::pair is not c++98/c++11 safe.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:33:02 +03:00
Francisco Jerez
5583459655 clover/util: Allow using key_equals with pair-like objects other than std::pair.
2014-10-20 10:33:02 +03:00
Francisco Jerez
e987fd5dc6 clover/util: Define equality operators for a couple of compat classes.
2014-10-20 10:33:01 +03:00
Francisco Jerez
1441a3c1bb clover/util: Fix construction of compat::vector with a general container as argument.
2014-10-20 10:33:01 +03:00
Eric Anholt
6212d2402d vc4: Translate 4-byte index buffers to 2 bytes.
Fixes assertion failures in 14 piglit tests (half of which now pass).
2014-10-19 08:44:56 +01:00
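Note on 6212d2402d: a minimal sketch of narrowing a 4-byte index buffer to 2-byte indices. The helper name and its error handling are hypothetical and not taken from the vc4 driver; the commit message does not say how out-of-range indices are treated, so rejecting them here is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the driver's code): copy 32-bit indices into a
 * 16-bit index array.  Returns 0 on success, or -1 as soon as an index
 * does not fit in 16 bits. */
static int
narrow_indices_u32_to_u16(const uint32_t *src, uint16_t *dst, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (src[i] > UINT16_MAX)
         return -1;
      dst[i] = (uint16_t)src[i];
   }
   return 0;
}
```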
Eric Anholt
572fba95e4 vc4: Add support for rebasing texture levels so firstlevel == 0.
GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't.  Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.
2014-10-19 08:42:33 +01:00
Eric Anholt
15eb4c59f6 vc4: Apply a Newton-Raphson step to improve RSQ
Fixes all the piglit built-in-functions/*sqrt tests, among others.
2014-10-18 10:08:59 +01:00
Eric Anholt
1fc124b80f vc4: Apply a Newton-Raphson step to improve RCP.
Fixes all the piglit floating-point *-op-div tests, among others.
2014-10-18 10:08:59 +01:00
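Note on 15eb4c59f6 and 1fc124b80f: a sketch of the standard Newton-Raphson refinement steps for a reciprocal (RCP) and a reciprocal square root (RSQ) estimate. The function names are hypothetical, and this is plain C arithmetic rather than the instruction sequence the driver emits.

```c
/* One Newton-Raphson step for x ~= 1/a, derived from f(x) = 1/x - a:
 *    x1 = x0 * (2 - a * x0)                                          */
static float
refine_rcp(float a, float x0)
{
   return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson step for y ~= 1/sqrt(a), from f(y) = 1/y^2 - a:
 *    y1 = y0 * (1.5 - 0.5 * a * y0 * y0)                             */
static float
refine_rsq(float a, float y0)
{
   return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

Each step roughly squares the relative error of the input estimate, which is why a single step can be enough to lift a coarse hardware approximation to the precision a test expects.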
Eric Anholt
0fdc5111b4 vc4: Add a little bit more packet parsing to make dump reading easier.
Probably should have done this *before* staring at all those render lists
today.
2014-10-18 10:08:59 +01:00
Eric Anholt
9ebfb3014e vc4: Make some assertions about how many flushes/EOFs the simulator sees.
This caught the previous commit's bug in the kernel validator.
2014-10-17 13:13:43 +01:00
Eric Anholt
1f7048419e vc4: Fix accidental dropping of the low bits of the store tilebuffer packet.
Notably this included the EOF flag (the other bits are the full buffer
dump selection, but we don't do full dumps), which caused the kernel
checking for frame completion to trigger.
2014-10-17 13:09:29 +01:00
Eric Anholt
afc3aa373d vc4: Set the primitive list format at the start of rendering.
The other driver does this manually before calling into each tile, but we
can just let it get binned into the tiles (saving repeated kernel
validation on the packet).

Fixes simulator assertion failures on polygon-mode and non-auto texwrap.
2014-10-17 13:09:28 +01:00
Eric Anholt
895c904103 vc4: Replace the FLUSH_ALL with FLUSH.
We don't need to emit all of our current state at the end of each bin
list.  We're going to be smashing it all at the start of the next tile's
bin list, anyway.
2014-10-17 13:09:28 +01:00
Eric Anholt
000976ed99 vc4: Add some comments about state management.
2014-10-17 13:09:28 +01:00
Eric Anholt
135287db17 vc4: Make sure there's exactly 1 tile store per tile coords packet.
It's not documented that I can see, but the other driver does it (check
vg_hw_4.c), and one of the HW guys confirmed that you really do need to do
it.
2014-10-17 13:09:25 +01:00
Michel Dänzer
c4db733fac winsys/radeon: Use a single buffer cache manager again
The trick is to generate a unique buffer usage value for each possible
combination of domains and flags, with only one bit set each for the
domains and flags. This ensures pb_check_usage() only returns TRUE when
the domains and flags the cached buffer was created for exactly match
the requested ones.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-17 17:09:49 +09:00
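Note on c4db733fac: a sketch of why a one-hot usage value per combination turns a subset-style check into an exact match. The domain and flag bits, the encoding in `unique_usage()`, and `check_usage()` (a stand-in for a check that accepts any buffer providing at least the requested bits) are all assumptions for illustration; the commit message does not show the actual encoding.

```c
#include <stdbool.h>

/* Hypothetical domain and flag bits, for illustration only. */
enum { DOMAIN_GTT = 1u << 0, DOMAIN_VRAM = 1u << 1 };  /* at least one is set */
enum { FLAG_A = 1u << 0, FLAG_B = 1u << 1 };           /* any subset is valid */

/* Stand-in for a subset-style check: true when every requested bit is
 * also present in the cached buffer's usage value. */
static bool
check_usage(unsigned requested, unsigned provided)
{
   return (requested & provided) == requested;
}

/* Map each (domains, flags) combination to a value with exactly one bit
 * set in the domain field (bits 0..2) and one in the flag field (bits 3..6).
 * Two such values can only pass check_usage() when they are identical. */
static unsigned
unique_usage(unsigned domains, unsigned flags)
{
   return (1u << (domains - 1)) | (1u << (3 + flags));
}
```

With raw bitmasks, a request for `DOMAIN_VRAM` would pass against a buffer created for `DOMAIN_VRAM | DOMAIN_GTT`; with the one-hot values it does not.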
Tom Stellard
e1d363b3ff clover: Add environment variables for dumping kernel code v2
There are two debug variables:

CLOVER_DEBUG which you can set to any combination of llvm,clc,asm
(separated by commas) to dump llvm IR, OpenCL C, and native assembly.

CLOVER_DEBUG_FILE which you can set to a file name for dumping output
instead of stderr.  If you set this variable, the output will be split
into three separate files with different suffixes: .cl for OpenCL C,
.ll for LLVM IR, and .asm for native assembly.  Note that when data
is written, it is always appended to the files.

v2:
  - Code cleanups
  - Add CLOVER_DEBUG_FILE environment variable for dumping to a file.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:52 -04:00
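Note on e1d363b3ff: a sketch of parsing a comma-separated value such as `llvm,clc,asm` into a bitmask. This is not clover's parser; the flag constants and the function name are hypothetical, and silently ignoring unknown tokens is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for the three dump kinds named in the message. */
enum { DBG_LLVM = 1 << 0, DBG_CLC = 1 << 1, DBG_ASM = 1 << 2 };

/* Parse a comma-separated list; NULL (variable unset) yields 0. */
static unsigned
parse_debug_flags(const char *value)
{
   unsigned flags = 0;

   if (!value)
      return 0;

   while (*value) {
      size_t len = strcspn(value, ",");   /* length of the current token */

      if (len == 4 && !strncmp(value, "llvm", 4))
         flags |= DBG_LLVM;
      else if (len == 3 && !strncmp(value, "clc", 3))
         flags |= DBG_CLC;
      else if (len == 3 && !strncmp(value, "asm", 3))
         flags |= DBG_ASM;

      value += len;
      if (*value == ',')
         value++;                         /* skip the separator */
   }
   return flags;
}
```

A caller would typically pass `getenv("CLOVER_DEBUG")` (from `<stdlib.h>`) as the argument.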
Tom Stellard
76136c29bb clover: Register an llvm diagnostic handler v3
This will allow us to handle internal compiler errors.

v2:
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:41 -04:00
Tom Stellard
8e7df519bd clover: Add support for compiling to native object code v3
v2:
  - Split build_module_native() into three separate functions.
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:30 -04:00
Tom Stellard
8b7cc90cef gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir
Drivers can return this value for PIPE_COMPUTE_CAP_IR_TARGET
if they want clover to give them native object code.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:22 -04:00
Tom Stellard
dc39b32c9b clover: Factor kernel argument parsing into its own function v2
v2:
  - Code cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:14 -04:00
Emil Velikov
79d09a4b12 vc4: correctly include the source files
The kernel files are built into a separate static library and
all the functions that require it are already wrapped in ifdef
USE_VC4_SIMULATOR. Don't forget the header file :)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-10-16 10:00:14 +01:00
Rob Clark
652b8fbbbb freedreno/ir3: large const support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:49 -04:00
Rob Clark
e71a3f80fb freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
dd332fe641 freedreno: fix layer_stride
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
8233b36a17 freedreno: inline fd_draw_emit()
Manual LTO

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
368466b7b7 freedreno/ir3: optimize shader key comparison
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
d595987ea3 freedreno/a3xx: refactor/optimize emit
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing.  Refactor
to group these into fd3_emit.  This simplifies fxn signatures, avoids
passing around shader key on the stack, etc.  It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
d5d80b3739 freedreno/a3xx: refactor vertex state emit
Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws.  Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.

Instead of constructing the fd3_vertex_buf array each time for internal
draws, pre-create solid_vbuf_state and blit_vbuf_state at context init
time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Eric Anholt
57de9bbb63 vc4: Fix the uniform debug output.
I dropped the shader index when moving to the compiled shader struct, but
didn't update the format string here.
2014-10-15 18:12:03 +01:00
Eric Anholt
201d4c0b2a vc4: Add support for user clip plane and gl_ClipVertex.
Fixes about 15 piglit tests about interpolation and clipping.
2014-10-15 18:11:46 +01:00
Eric Anholt
6a0bf67048 vc4: Move the output semantics setup to a helper.
I want to reuse it elsewhere to set up outputs that aren't in the TGSI.
2014-10-15 18:11:46 +01:00
Michel Dänzer
159f93cf39 r600g,radeonsi: Only set use_staging_texture = TRUE once
No need to check for setting the flag after we set it already.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:30 +09:00
Michel Dänzer
87da286755 r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled
We set the NO_CPU_ACCESS flag for BO allocation in that case, so direct CPU
access may not work.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:14 +09:00
Michel Dänzer
3ede67a4c6 winsys/radeon: Use separate caching buffer manager for each set of flags
Otherwise the caching buffer manager may return a buffer which was created
with a different set of flags, which can cause trouble.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:11:40 +09:00
Tom Stellard
8cf6482c3d clover: Fix regression in module serialization
We need to serialize semantic information for arguments, which was added
in 06139c56fa.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-14 17:58:06 -04:00
Ilia Mirkin
742158b51e st/gbm: fix order of arguments passed to is_format_supported
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-10-14 12:33:38 -04:00
Rob Clark
abe3b3d1e0 freedreno: use tgsi_lowering
Now that the freedreno_lowering code is moved to tgsi_lowering, remove
our private copy and switch over to using the common version.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-14 12:30:08 -04:00
David Heidelberger
d2c1d9693f r300/compiler: remove useless check
This code is already inside if (!variable->C->is_r500), so there is no
need to check twice.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
2014-10-14 12:18:32 -04:00
Nick Sarnie
e5bf8d38db ilo: Build pipe-loader for ilo
Trivial patch to create the pipe loader for ilo. All the code was already there.

Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Emil Velikov
af897df508 automake: explicitly set TARGET_RADEON_{WINSYS,COMMON}
Originally the variables were set only once via the ?= operator but
that causes issues when doing incremental builds. They appear to be
undefined and missing from the dependency list despite their addition
to LIBADD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84807
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Eric Anholt
a2d8b6dbd5 vc4: Fix render target NPOT alignment at small miplevels.
The texturing hardware takes the POT level 0 width/height and minifies
those.  This is different from what we were doing, for example, for
273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16.

Fixes piglit-depthstencil-render-miplevels 273.
2014-10-14 14:57:50 +01:00
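Note on a2d8b6dbd5: a sketch reproducing the arithmetic in the message for a 273-wide texture at level 5. The helper names are hypothetical; `next_pot()` rounds up to a power of two and `minify()` shifts by the mip level with a floor of 1.

```c
/* Round v up to the next power of two (v >= 1). */
static unsigned
next_pot(unsigned v)
{
   unsigned p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* Size of a mip level: shift by the level, never below 1. */
static unsigned
minify(unsigned v, unsigned level)
{
   unsigned m = v >> level;
   return m ? m : 1;
}

/* Old behaviour per the message: minify first, then round up to POT. */
static unsigned
pot_of_minified(unsigned width, unsigned level)
{
   return next_pot(minify(width, level));
}

/* What the texturing hardware does per the message: round level 0 up
 * to POT, then minify that. */
static unsigned
minified_pot(unsigned width, unsigned level)
{
   return minify(next_pot(width), level);
}
```

For width 273 and level 5, `pot_of_minified()` gives 8 and `minified_pot()` gives 16, matching the two values quoted in the commit message.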
Eric Anholt
b5fc9d5664 vc4: Add support for having 0 vertex elements used.
You have to load at least 1, according to the simulator.  Fixes 4 piglit
tests and even more ES2 conformance tests.
2014-10-14 11:29:48 +01:00
Vinson Lee
a2fd55cfb6 auxiliary/os: Add DragonFly BSD support in os_get_total_physical_memory.
This patch fixes this build error on DragonFly BSD.

  CC       os/os_misc.lo
os/os_misc.c: In function 'os_get_total_physical_memory':
os/os_misc.c:132:2: error: #error Unsupported *BSD

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-13 23:40:46 -07:00
Chia-I Wu
4e2cf84b1f ilo: clear writer pointer after unmapping
It does not look like an issue now but it is good to be future proof.  Spotted
by Courtney Goeltzenleuchter.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-14 08:54:25 +08:00
Eric Anholt
615bbf0ca6 vc4: Write the VPM read setup multiple times to queue all the inputs.
There's a 4-element fifo, and the size (number of dwords per vertex) field
is just 4 bits.

Fixes glsl-routing on sim.
2014-10-13 17:16:05 +01:00
Eric Anholt
e1d1c39626 vc4: Add support for the TXL opcode.
There's a bit at the bottom of cube map stride (which has some formatting
bugs in the docs) which flips the bias coordinate to being an absolute
LOD.
2014-10-13 17:15:47 +01:00