Commit graph

11465 commits

Author SHA1 Message Date
Christian König
d3e07bed90 tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
Nobody seems to be using it, and only nv50 had a partial implementation.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Andreas Boll
36320bfa54 radeon/llvm: Link against libgallium.la to fix an undefined symbol
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1

Fixes a regression introduced with
f70c385351

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-03-19 12:07:51 +01:00
Alex Deucher
2da8ee16a8 r600g: properly set non_disp tiling mode for DMA (v2)
Needs to be set for depth, stencil, and fmask just
like other blocks.

v2: drop additional cayman bits for now

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Alex Deucher
4409758a04 r600g: Use blitter rather than DMA for 128bpp on cayman (v3)
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces.  When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.

v2: cayman/TN only

v3: fix comments

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Maarten Lankhorst
f70c385351 gallium/build: Fix visibility CFLAGS in automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Fix formatting - use one CFLAG per line

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-16 12:45:22 +01:00
Brian Paul
f4a2c29d93 softpipe: fix up NUM_ENTRIES confusion
There were two different NUM_ENTRIES #defines for the framebuffer
tile cache and the texture tile cache.  Rename the latter to fix
the warnings:

In file included from sp_flush.c:40:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
In file included from sp_context.c:50:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition

Also, replace occurrences of NUM_ENTRIES with Element() macro to
be safer.
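
A minimal sketch of the two ideas in this message: giving each cache its own constant name avoids the redefinition warning, and deriving the loop bound from the array avoids depending on either constant. All names and sizes below (FB_TILE_NUM_ENTRIES, TEX_TILE_NUM_ENTRIES, ARRAY_COUNT, the structs) are hypothetical and are not the identifiers used in softpipe.

```c
#include <stddef.h>

/* Hypothetical constants: distinct names per cache, so including both
 * headers in one file no longer redefines a shared NUM_ENTRIES macro. */
#define FB_TILE_NUM_ENTRIES  8
#define TEX_TILE_NUM_ENTRIES 4

/* Hypothetical helper: element count taken from the array itself. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

struct fb_tile_cache  { int entries[FB_TILE_NUM_ENTRIES]; };
struct tex_tile_cache { int entries[TEX_TILE_NUM_ENTRIES]; };

/* Marks every entry invalid and returns how many entries were visited.
 * The bound follows the array declaration, not a macro name. */
static size_t fb_tile_cache_clear(struct fb_tile_cache *c)
{
    size_t i;
    for (i = 0; i < ARRAY_COUNT(c->entries); i++)
        c->entries[i] = -1;
    return i;
}

static size_t tex_tile_cache_clear(struct tex_tile_cache *c)
{
    size_t i;
    for (i = 0; i < ARRAY_COUNT(c->entries); i++)
        c->entries[i] = -1;
    return i;
}
```

With this shape, changing the size of one cache cannot silently change the loop bound of the other.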

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-14 18:17:18 -06:00
José Fonseca
6a3d77e13d softpipe: Shrink context size.
- each softpipe_tex_tile_cache is 50*64*64*4*4 = 3,276,800 bytes
- each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e., each softpipe
  context is 314,572,800 bytes, i.e., 300 MB

That is, in a 32-bit process (around 3 GB of virtual memory max), we can
only fit 10 contexts.
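
The arithmetic above can be checked with a short sketch. The factors are taken from the message; the meaning attached to each one (entries, tile width and height, channels, bytes per channel, 3 groups of 32 caches) is an assumption made for readability, and the function names are hypothetical.

```c
#include <stdint.h>

/* Factors from the commit message; the labels are assumed, not confirmed. */
enum {
    CACHE_ENTRIES      = 50,
    TILE_WIDTH         = 64,
    TILE_HEIGHT        = 64,
    CHANNELS           = 4,
    BYTES_PER_CHANNEL  = 4,
    CACHE_GROUPS       = 3,
    CACHES_PER_GROUP   = 32
};

/* Size of one texture tile cache: 50*64*64*4*4 bytes. */
static uint64_t tex_tile_cache_bytes(void)
{
    return (uint64_t)CACHE_ENTRIES * TILE_WIDTH * TILE_HEIGHT *
           CHANNELS * BYTES_PER_CHANNEL;
}

/* Texture-cache share of one context: 3*32 caches. */
static uint64_t context_bytes(void)
{
    return (uint64_t)CACHE_GROUPS * CACHES_PER_GROUP * tex_tile_cache_bytes();
}

/* Whole contexts that fit into a given amount of address space. */
static uint64_t contexts_that_fit(uint64_t address_space_bytes)
{
    return address_space_bytes / context_bytes();
}
```

Calling contexts_that_fit(3ULL << 30) models the roughly 3 GB of address space mentioned above; the integer division rounds down to whole contexts.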

This change is a short-term hack to shrink the context size.  Longer
term we'll need to change how the texture cache works.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-14 11:59:53 +00:00
Roland Scheidegger
9e93d7c4fd llvmpipe: don't assert when trying to render to surfaces with multiple layers
Instead, just warn when creating the surface; rendering will simply happen
to the first layer.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:22:30 +01:00
Roland Scheidegger
81e728982d softpipe: don't assert when creating surfaces with multiple layers
We can't handle them yet, but we can safely just warn (we will
just render to the first layer, which is fine since we can't handle
the rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface;
instead, create the surface but assert when trying to set a buffer
in the framebuffer).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:21:56 +01:00
José Fonseca
4889315619 llvmpipe: Fix geometry shader token leak.
Trivial. Matches softpipe's code.
2013-03-13 21:46:50 +00:00
Tom Stellard
c95177ea88 radeon/llvm: Add missing license headers
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-13 16:01:31 +00:00
Tom Stellard
1c4f283151 radeon/llvm: Make radeon_llvm_util.cpp a C file
All the functions in this file are now implemented in C.
2013-03-13 16:01:31 +00:00
Tom Stellard
3958c104c6 radeon/llvm: Optimize radeon_llvm_strip_unused_kernels()
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.

Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
2013-03-13 16:01:31 +00:00
Tom Stellard
2ace79dce5 radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
b34b8576ec radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
7e9abbea15 radeon/llvm: Implement radeon_llvm_parse_bitcode() using C API
Also make the function static since it is not used anywhere else.
2013-03-13 16:01:30 +00:00
Tom Stellard
97bfcddde0 r600g/llvm: Move llvm wrapper functions into the radeon directory 2013-03-13 16:01:30 +00:00
José Fonseca
7bff1cc3f6 autotools: Add missing top-level include dir.
Fixes autotools build failure.  Not sure if there are more, as I have
difficulties in building the full tree.
2013-03-13 00:25:09 +00:00
Matt Turner
e59fc3faa5 mesa: Use PACKAGE_BUGREPORT macro.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:33 -07:00
Michel Dänzer
4dca602521 radeonsi: Fix off-by-one for maximum vertex element index in some cases
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.
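
A minimal sketch of how this kind of off-by-one can arise, assuming a plain byte-based model of a vertex buffer. It is not the radeonsi code: both functions, their names, and their parameters are hypothetical. The last valid index i is the largest one for which offset + i*stride + element_size still fits in the buffer, so a bound that ignores element_size can come out 1 too low when the element is smaller than the stride.

```c
#include <stdint.h>

/* Largest index whose element lies fully inside the buffer, or -1 if
 * none does. Hypothetical model; assumes stride > 0. */
static int64_t max_vertex_index(uint64_t buffer_size, uint64_t offset,
                                uint64_t stride, uint64_t element_size)
{
    if (stride == 0 || offset + element_size > buffer_size)
        return -1;
    return (int64_t)((buffer_size - offset - element_size) / stride);
}

/* Hypothetical naive bound that counts whole strides only. It matches
 * the function above when element_size == stride, but can be 1 too low
 * when element_size < stride. */
static int64_t naive_max_vertex_index(uint64_t buffer_size, uint64_t offset,
                                      uint64_t stride)
{
    if (stride == 0 || offset >= buffer_size)
        return -1;
    return (int64_t)((buffer_size - offset) / stride) - 1;
}
```

For a 40-byte buffer with a 16-byte stride and 8-byte elements, the elements start at bytes 0, 16, and 32, so index 2 is valid, while the stride-only bound stops at index 1.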

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-12 18:25:54 +01:00
Christoph Bumiller
8aa8b0539e nvc0: avoid crash on updating RASTERIZE_ENABLE state
When doing a blit with the 3D engine, the rasterizer or zsa cso may
be NULL.
2013-03-12 12:55:37 +01:00
Christoph Bumiller
e2dded78ea nvc0: add MP trap handler for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
ae59a7d35d nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
e066f2f62f nvc0: implement compute support for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
75f1f852b0 nvc0/ir: try to fix CAS (CompareAndSwap) 2013-03-12 12:55:37 +01:00
Christoph Bumiller
18fdfbdc32 nv50/ir: add CCTL (cache control) op 2013-03-12 12:55:37 +01:00
Christoph Bumiller
9db7e09cb4 nvc0/ir/emit: fix emission of large address offsets 2013-03-12 12:55:36 +01:00
Christoph Bumiller
175c185941 nvc0: add SHADER/COMPUTE_RESOURCE bind flags to format table 2013-03-12 12:55:36 +01:00
Christoph Bumiller
19ea0bd521 nouveau: align PIPE_BIND_SHADER,COMPUTE_RESOURCEs to 256 bytes 2013-03-12 12:55:36 +01:00
Christoph Bumiller
47f2179844 nv50,nvc0: copy writable flag on surface creation 2013-03-12 12:55:36 +01:00
Christoph Bumiller
7a91d3a2a4 nv50/ir: add support for different sampler and resource index on nve4
And remove non-working code for indirect sampler/resource selection.
Will be added back later.

Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
2013-03-12 12:55:36 +01:00
Christoph Bumiller
99e4eba669 nv50/ir: implement splitting of 64 bit ops after RA 2013-03-12 12:55:36 +01:00
Christoph Bumiller
ac9f19e485 nvc0/ir: skip back edges when determining latest sched value 2013-03-12 12:55:36 +01:00
Christoph Bumiller
f07c46a4f4 nvc0/ir: use large issue delay after RET, too 2013-03-12 12:55:36 +01:00
Christoph Bumiller
b23ec3f8ba nv50/ir: fix size adjustment for sched info for multiple functions 2013-03-12 12:55:36 +01:00
Christoph Bumiller
d39169cb6d nv50/ir: print function inputs and outputs 2013-03-12 12:55:36 +01:00
Christoph Bumiller
1b4faa2b17 nv50/ir/ssa: add a few comments regarding RenamePass 2013-03-12 12:55:36 +01:00
Francisco Jerez
1535b754fb nv50/ir/tgsi: Exclude local declarations from function prototypes. 2013-03-12 12:55:36 +01:00
Christoph Bumiller
9b563ef3f7 nv50/ir/opt: try to make use of SUCLAMP addend 2013-03-12 12:55:36 +01:00
Christoph Bumiller
a788be19e5 nv50/ir: don't assert on type in Modifier.applyTo if it is 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c3a5bc0bdf nv50/ir: add support for barriers
nv50 part by Francisco Jerez.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
a0a25191f2 nv50/ir/tgsi: add support for atomics 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c2dfcd7f0e nv50/ir/tgsi: handle TGSI_OPCODE_LOAD,STORE
Squashed and (heavily) modified original patches by Francisco Jerez:
nv50/ir/tgsi: Implement resource LOAD/STORE (wip).
nv50/ir/tgsi: Emit SUST/SULD for surface access, and add CB LOAD/STORE support
nv50/ir/tgsi: Fix/clean up the LOAD/STORE handling code.

Left out for now:
nv50/ir/tgsi: Resource indirect indexing

Treating raw, read-only surfaces as constant buffers (CBs) was removed
because CBs are limited to a size of 64 KiB, which isn't desirable, and
because this decision should probably be made by the state tracker.
If we used a number of CB slots for surfaces, we might find that we
cannot accommodate the advertised limit.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
d105b3df14 nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH 2013-03-12 12:55:35 +01:00
Christoph Bumiller
4506ed28de nvc0/ir: implement lowering of surface ops for nve4 2013-03-12 12:55:35 +01:00
Christoph Bumiller
8ac68b071d nvc0/ir: add formatted surface load lib code, move to extra header
OpenGL is nice and makes the user specify a format with an image unit.
OpenCL is evil and doesn't, and what's better than adding a huge load
of functions that we call indirectly to handle the conversion?
2013-03-12 12:55:35 +01:00
Christoph Bumiller
ce1951daed nv50/ir: extend moveSources for delta < 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c0fc3463e9 nvc0/ir: lower atomics in s[] 2013-03-12 12:55:35 +01:00
Christoph Bumiller
9c196779bc nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c8f0c43f7a nv50/ir/emit: handle OP_ATOM 2013-03-12 12:55:35 +01:00