11465 commits

Author SHA1 Message Date
Aaron Watry
01f3622c74 r600/llvm: Free binary.code/binary.config in r600_llvm_compile
radeon_llvm_compile allocates memory for binary.code, binary.config,
or neither depending on what's being done.

We need to make sure to free that memory after it's no longer needed.

v2: Don't bother checking for null before FREE()

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:53:31 -08:00
Aaron Watry
dd73b99420 r600/llvm: initialize radeon_llvm_binary
Use memset to initialize the struct to zeros; otherwise code_size and
config_size could be uninitialized when read later in this method.

It's also hard to do NULL checks on uninitialized pointers.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

v2: Fix indentation

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:53:31 -08:00
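
The two r600/llvm entries above (01f3622c74 and dd73b99420) describe a common C cleanup pattern: zero the struct first, then free its buffers unconditionally. The sketch below illustrates that pattern only. The struct, the compile helper, and the buffer sizes are hypothetical stand-ins, not the actual radeon_llvm_binary or r600_llvm_compile code, and plain free() is used in place of Mesa's FREE() macro.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for radeon_llvm_binary; only the field names
 * mentioned in the commit messages are modelled. */
struct demo_binary {
    unsigned char *code;
    unsigned code_size;
    unsigned char *config;
    unsigned config_size;
};

/* Hypothetical compile step: allocates code, config, or neither. */
static int demo_compile(struct demo_binary *b, int want_code, int want_config)
{
    if (want_code) {
        b->code = malloc(16);
        if (!b->code)
            return -1;
        b->code_size = 16;
    }
    if (want_config) {
        b->config = malloc(8);
        if (!b->config)
            return -1;
        b->config_size = 8;
    }
    return 0;
}

/* Returns the total number of bytes the compile step produced. */
static unsigned demo_build(int want_code, int want_config)
{
    struct demo_binary b;
    unsigned total;

    /* Zero everything so the sizes and pointers are defined even when
     * the compile step fills in nothing or fails part-way. */
    memset(&b, 0, sizeof(b));

    demo_compile(&b, want_code, want_config);
    total = b.code_size + b.config_size;

    /* free(NULL) is a no-op in standard C, so no NULL checks are needed. */
    free(b.code);
    free(b.config);
    return total;
}
```

Calling demo_build(0, 0) exercises the "neither" case: both pointers stay NULL after the memset and both free() calls are harmless.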
Brian Paul
2bc1680665 svga: remove unused vars in svga_hwtnl_simple_draw_range_elements()
And simplify the code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-15 10:27:01 -07:00
Brian Paul
1a36dfb21e svga: print warning for unsupported indirect dest reg indexing
For DX9-level shaders, there's only limited support for indirect
indexing of registers (with the loop counter register, not the
general address register.)

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-15 10:23:49 -07:00
Brian Paul
3969330b47 svga: mark dest image as defined in svga_surface_copy()
After we blit/copy to a dest texture image we need to mark it as
being defined.  This fixes broken mipmap generation for quite a
few texture formats.  Mipgen involves making texture views and
svga_texture_view_surface() skips texture images that are undefined.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-15 10:23:48 -07:00
Brian Paul
79984b9928 svga: do primitive trimming in translate_indices()
The index translation code expects the number of indexes to be
consistent with the primitive type (ex: a multiple of 3 for
PIPE_PRIM_TRIANGLES).  If it's not, we can write out of bounds
in the destination buffer.

Fixes failed assertions in the pipebuffer debug code found with
Piglit primitive-restart-draw-mode test.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-15 10:23:48 -07:00
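
As an illustration of the primitive trimming that 79984b9928 describes, here is a minimal sketch of rounding an index count down to what a primitive type can consume. The enum and function names are hypothetical and only a few primitive types are covered; the svga driver's actual code in translate_indices() is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical subset of primitive types, for illustration only. */
enum demo_prim {
    DEMO_POINTS,
    DEMO_LINES,
    DEMO_LINE_STRIP,
    DEMO_TRIANGLES,
    DEMO_TRIANGLE_STRIP
};

/* Trim *count to the largest value the primitive type can consume.
 * Returns false (and sets *count to 0) if nothing can be drawn. */
static bool demo_trim_prim(enum demo_prim prim, unsigned *count)
{
    unsigned min, incr;

    switch (prim) {
    case DEMO_POINTS:         min = 1; incr = 1; break;
    case DEMO_LINES:          min = 2; incr = 2; break;
    case DEMO_LINE_STRIP:     min = 2; incr = 1; break;
    case DEMO_TRIANGLES:      min = 3; incr = 3; break;
    case DEMO_TRIANGLE_STRIP: min = 3; incr = 1; break;
    default:
        *count = 0;
        return false;
    }

    if (*count < min) {
        *count = 0;
        return false;
    }

    /* Drop the trailing indices that do not complete a primitive. */
    *count -= (*count - min) % incr;
    return true;
}
```

With 7 indices and DEMO_TRIANGLES the count is trimmed to 6, so a translation loop that writes three indices per triangle stays within a destination buffer sized for the trimmed count.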
Aaron Watry
4c6ac9e614 radeonsi/compute: Dispose of LLVM module after compiling kernels
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:16:49 -08:00
Aaron Watry
35dad4a1e2 radeonsi/compute: Free program and program.kernels on shutdown
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:16:49 -08:00
Aaron Watry
d41b10f811 radeon/llvm: Free created llvm memory buffer
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:16:49 -08:00
Aaron Watry
a2b93da84b radeon/llvm: Free libelf resources
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:16:49 -08:00
Aaron Watry
df482fe02f radeon/llvm: fix spelling error
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-15 09:16:49 -08:00
José Fonseca
c5a05a6aef trace: Dump user_buffer members. 2013-11-15 15:32:33 +00:00
Alex Deucher
f5778f152b radeonsi: add support for Hawaii asics (v2)
Update additional register fields.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-11-15 08:51:09 -05:00
Roland Scheidegger
473cb3fe4a llvmpipe: (trivial) fix more fallout from the setup cleanup.
Oops... Should have done some more testing.
2013-11-14 15:49:42 +00:00
Roland Scheidegger
5190c16a04 llvmpipe: (trivial) fix misplaced bld context assignment.
Should fix polygon offset crashes...
2013-11-14 14:44:15 +00:00
Roland Scheidegger
673d5391a2 softpipe: (trivial) fix debug code
The debug printfs wouldn't actually compile when enabled, so kill them off,
insert a new one in another place, and make sure it keeps compiling
by enclosing it in an if-0 clause.
2013-11-14 12:24:55 +00:00
Roland Scheidegger
2dd693412a llvmpipe: clean up state setup code a bit
In particular get rid of home-grown vector helpers which didn't add much.
And while here fix formatting a bit. No functional change.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-14 12:24:55 +00:00
Roland Scheidegger
754319490f gallivm,llvmpipe: fix float->srgb conversion to handle NaNs
d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that, but it mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
I suspect there are more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-14 12:24:55 +00:00
Ben Skeggs
c944bde5be nvc0: release 3d bufctx after drawing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-11-13 08:09:29 +10:00
Roland Scheidegger
50f19e3a66 draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset
Since we explicitly require an integer input, we should avoid using exp2 math
(even if we were using optimized versions). Exponent manipulation turns the
exp2 into an int sub (plus some casts).

v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-12 19:08:58 +00:00
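
To make the "exp2 becomes an int sub" remark in 50f19e3a66 concrete, here is a minimal scalar sketch that computes 2^(e - 23) for a 32-bit float, where e is the exponent of the input, by subtracting from the exponent bit field directly. The function name and the flush-to-zero choice for very small inputs are assumptions for illustration; the commit itself emits LLVM IR and its exact handling of edge cases is not shown in this log.

```c
#include <stdint.h>
#include <string.h>

/* Return 2^(e - 23), where e is the unbiased exponent of max_z.
 * Assumes max_z is a finite, non-negative IEEE 754 binary32 value. */
static float demo_mrd_from_max_z(float max_z)
{
    uint32_t bits;
    int32_t exp_field;
    float r;

    /* Reinterpret the float as an integer without violating aliasing rules. */
    memcpy(&bits, &max_z, sizeof(bits));

    /* Extract the biased exponent and subtract the 23 mantissa bits.
     * The bias cancels out, so no exp2 call is needed. */
    exp_field = (int32_t)((bits >> 23) & 0xff) - 23;

    /* Hypothetical choice: flush results that would be denormal to zero. */
    if (exp_field < 0)
        exp_field = 0;

    /* Rebuild a float with that exponent and a zero mantissa. */
    bits = (uint32_t)exp_field << 23;
    memcpy(&r, &bits, sizeof(r));
    return r;
}
```

For max_z = 1.0f the biased exponent field is 127; subtracting 23 gives 104, which encodes 2^(104 - 127) = 2^-23.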
Ilia Mirkin
08122e151a nouveau/video: mark bitstream-level acceleration as unsupported
Adding a vl_mpeg-based helper didn't seem to work, as it produced data
that the card couldn't handle. (And I didn't investigate further.) This
makes the decoding functionality only accessible via XvMC and avoids
crashes when attempting to use VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-12 10:11:41 +01:00
Ilia Mirkin
e8d5d3409c nouveau/video: don't try on nv3x
It doesn't work, I don't know why, but no point in hanging people's
displays until it gets figured out.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
2013-11-12 10:10:54 +01:00
Tom Stellard
a859131003 radeonsi/compute: Add Sea Islands support 2013-11-11 17:21:34 -05:00
Vincent Lejeune
88c8f19729 r600/llvm: Store inputs in function arguments 2013-11-11 23:14:42 +01:00
Brian Paul
34ce1a8502 svga: improve loops over color buffers
Only loop over the actual number of color buffers supported, not
PIPE_MAX_COLOR_BUFS.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-11 08:12:18 -07:00
Brian Paul
2182d2db28 svga: document magic number of 8 render targets per batch
Grab the comments from commit message b84b7f19df to explain
what the code is doing.
2013-11-11 08:12:18 -07:00
Fredrik Höglund
e420fb887f r600g: Add support for PIPE_FORMAT_R11G11B10_FLOAT vertex elements
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-11-07 23:51:44 +01:00
Brian Paul
6592a6d065 svga: always return 4 for PIPE_MAX_COLOR_BUFS
Even if the query returns 8, only 4 really work.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-07 15:21:40 -07:00
Brian Paul
055dbd5c3e svga: return true for the PIPE_CAP_SM3 query
This just tells the state tracker to turn on the GL_ARB_shader_texture_lod
extension.  This simply allows the GLSL compiler to emit TXL and TXD
instructions for both vertex and fragment shaders.  We already support
these opcodes in the svga driver.  Though, the shadow2DGrad() Piglit
tests are failing.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-07 15:21:40 -07:00
Matthew McClure
f9e2c24326 draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float
With this patch, the llvmpipe and draw modules will calculate the depth bias
according to floating point depth buffer semantics described in the
arb_depth_buffer_float specification, when the driver has a z buffer bound
with a format type of UTIL_FORMAT_TYPE_FLOAT.

By default, the driver will use the existing UNORM calculation for depth bias.

A new function, draw_set_zs_format, was added to calculate the Minimum
Resolvable Depth value and floating point depth sense for the draw module.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-07 18:32:54 +00:00
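
For reference, the depth offset that f9e2c24326 refers to can be written as below, following the OpenGL polygon offset definition as amended by ARB_depth_buffer_float. Here m is the maximum depth slope of the polygon, e is the maximum exponent in the range of z values spanned by the primitive, and n is the number of mantissa bits of the depth format (23 for a 32-bit float). For fixed-point (UNORM) buffers, r is instead an implementation-dependent constant. How the commit maps these terms onto draw_set_zs_format is not shown in this log.

```latex
o = m \cdot \mathrm{factor} + r \cdot \mathrm{units},
\qquad
r = 2^{\,e - n} \quad \text{(floating-point depth buffer)}
```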
Fabio Pedretti
da7daade92 r600/compute: silence unused var warning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-11-06 22:07:58 +01:00
Vincent Lejeune
08556073d1 r600/llvm: Fix isampleBuffer on preEG 2013-11-06 17:36:22 +01:00
Vincent Lejeune
1184f8fd34 r600/llvm: Fix texbuf for pre EG gen 2013-11-06 17:36:22 +01:00
Marek Olšák
6463b94973 r600g: properly unbind a DSA state being deleted in r600_delete_dsa_state
Tested-by: Christian König <christian.koenig@amd.com>
2013-11-04 19:07:57 +01:00
Marek Olšák
a767f57a7d radeonsi: implement ARB_vertex_type_2_10_10_10_rev 2013-11-04 19:07:57 +01:00
Marek Olšák
6a250877ea r600g,radeonsi: properly expose texture buffer formats
This exposes GL_ARB_texture_buffer_object_rgb32.
2013-11-04 19:07:57 +01:00
Marek Olšák
dbeedbb7ab radeonsi: implement texture buffer objects
GLSL 1.40 is done.
2013-11-04 19:07:57 +01:00
Marek Olšák
164de0d2a5 radeonsi: report our border color behavior 2013-11-04 19:07:57 +01:00
Marek Olšák
4569bf9199 radeonsi: bind a dummy constant buffer in place of NULL buffers 2013-11-04 19:07:57 +01:00
Marek Olšák
2fd4200123 radeonsi: implement uniform buffer objects 2013-11-04 19:07:57 +01:00
Marek Olšák
e5f0080d91 radeonsi: try to fix IA_MULTI_VGT_PARAM programming
This doesn't make any difference on Bonaire, but it might help on Hawaii.
2013-11-04 19:07:57 +01:00
Rob Clark
f407ea1f1c freedreno/a3xx/texture: min/max lod
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:22:40 -04:00
Rob Clark
2d10e22f8b freedreno/a3xx: update envytools headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:22:28 -04:00
Rob Clark
f16b084bb9 freedreno/a3xx: fix VS out / FS in linking
Actually link VS out / FS in based on semantic info, keeping in mind
that position/pointsize can also be an input to the FS.  This fixes a
few fragment shaders which were using gl_Position.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:20:47 -04:00
Rob Clark
83318d6511 freedreno/a3xx: allow num_samplers != num_textures
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:20:29 -04:00
Rob Clark
a53fe2221c freedreno/a3xx/compiler: highp frag shader
Fixes use of full-precision in fragment shader (i.e. don't clobber r0.x
since that can be used by future bary instructions for varying fetch).
And makes use of full-precision the default in fragment shader (but can
be overridden via FD_MESA_DEBUG=fraghalf).

Seems like half precision is often not enough for texture coordinates.
The blob compiler is clever enough to keep texture coords in full
precision registers while using half precision for everything else.  But
we aren't quite that clever yet, so better to default to full precision.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:19:42 -04:00
Rob Clark
310fd5839c freedreno/a3xx/compiler: relative addressing fixes.
Handle some relative addressing constraints: cannot handle const or
relative in cat5 and src2 of cat3.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:18:44 -04:00
Rob Clark
4ddd4e83c7 freedreno: we do actually support sqrt
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:17:56 -04:00
Kai Wasserbäch
bbb77fc2f1 radeonsi: Allow longer intrinsic names
Fixes a boatload of Piglit tests for me, which crashed like fdo#70913
before.

Thanks to Michel Dänzer for the tip.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70913
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-30 16:40:06 -07:00
Tom Stellard
6f3465f340 radeon/llvm: Specify the DataLayout when running optimizations
Without DataLayout, a lot of optimization passes aren't run and the ones
that are don't work as well.
2013-10-30 16:40:06 -07:00