Commit graph

22739 commits

Author SHA1 Message Date
Marek Olšák
7c9ec6ca7e radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer
This is easier to read and will work better with shader image stores.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
a1bbccf521 radeonsi: change TC cache flushing strategy for textures
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
ca9c5b2be5 radeonsi: improve and fix streamout flushing
- we don't usually need to flush TC L2
- we should flush KCACHE
  (not really an issue now since we always flush KCACHE when updating
   descriptors, but it could be a problem if we used CE, which doesn't
   require flushing KCACHE)
- add an explicit VS_PARTIAL_FLUSH flag

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
18a30c9778 radeonsi: use TC L2 for CP DMA operations with shader resources on CIK
So that TC L2 doesn't need to be flushed.

The only problem is with index buffers, which don't use TC.
A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
11b76369f5 radeonsi: use TC L2 for updating descriptors on CIK
This allows not flushing TC L2 on CIK later.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
02ba7334d3 radeonsi: don't use TC L2 for updating descriptors on SI
It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA
when updating the same memory.

The solution for SI is to use uncached access here, because CP DMA doesn't
support cached access.

CIK will be handled in the next patch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
edf18da85d radeonsi: only flush the right set of caches for CP DMA operations
That's either framebuffer caches or caches for shader resources.
The motivation is that framebuffer caches need to be flushed very rarely
here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
73c2b0d18c radeonsi: implement separate ICACHE and KCACHE flush for SI
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
0aecf9e2d1 radeonsi: add a combined flag for flushing a framebuffer
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
2bfe9d4538 radeonsi: rename flush flags, split the TC flag into L1 and L2
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d217819e78 r600g,radeonsi: separate cache flush flags
I will rename them for radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d14f2ab4ad r600g: move r6xx-specific streamout flush flagging into r600g
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
0543630d0b radeonsi: only set BC_OPTIMIZE_DISABLE when necessary
SPI_PS_IN_CONTROL is moved into the SPI mapping state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
5d8e838dae radeonsi: do not define FACE as an ordinary PS input
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
15a7fff69a radeonsi: remove flatshade from the shader key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
13de9475fc radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen
It doesn't do anything useful. And colors are floating-point, so we can use
fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE
state only (in the next patch).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
e3d4bdd6a8 radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values
Only done for completeness. Not used by anything yet.

Tested by advertising PIPE_CAP_VERTEXID_NOBASE.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d7c6f397f4 radeonsi: fix VertexID for OpenGL
This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
368b0a7340 radeonsi: clarify a hw bug in shader exports
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d1d2af2398 radeonsi: use ordered compares for SSG and face selection
Ordered compares are what you have in C. Unordered compares are the result
of negating ordered compares (they return true if either argument is NaN).

That special NaN behavior is completely useless here, and unordered
compares produce horrible code with all stable LLVM versions.
(I think that has been fixed in LLVM git)

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
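
The ordered/unordered distinction described in the commit above can be illustrated in plain C. This is only a sketch of the semantics, not the radeonsi code: the function names are hypothetical, and the NaN result of `ssg` is a choice made for this example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered "less than": what C's < does; false if either operand is NaN. */
static bool ordered_lt(float a, float b)
{
    return a < b;
}

/* Unordered "less than": the negation of ordered >=,
 * so it is true if either operand is NaN. */
static bool unordered_lt(float a, float b)
{
    return !(a >= b);
}

/* Sign function built from ordered compares only. NaN fails both
 * tests, so this sketch returns 0 for it (an assumption of the example). */
static float ssg(float x)
{
    if (x > 0.0f)
        return 1.0f;
    if (x < 0.0f)
        return -1.0f;
    return 0.0f;
}

/* The two flavours agree on ordinary numbers and differ only on NaN. */
static void check_compares(void)
{
    assert(ordered_lt(0.0f, 1.0f) && unordered_lt(0.0f, 1.0f));
    assert(!ordered_lt(NAN, 1.0f));
    assert(unordered_lt(NAN, 1.0f));
}
```

For a sign or face-selection test the NaN case carries no useful meaning, which is why the cheaper ordered form is sufficient there.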
Marek Olšák
a38e8de643 radeonsi: remove unused and not useful variables
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
638fa8016a radeonsi: remove init config from states
It really doesn't do anything there.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
9141d88555 radeonsi: reduce the size of si_pm4_state
- the relocs array is unused, remove it
- ndw is at most 115 (init), set 140 as the maximum
- compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
1b82eb677d tgsi: add uses_centroid into tgsi_shader_info
2015-01-07 12:06:43 +01:00
Eric Anholt
426fd535d9 vc4: Fix scaling W projection of the Z coordinate when there's a Z offset.
Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform
gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation
tests.
2015-01-06 17:22:13 -08:00
Eric Anholt
49b5c901e8 vc4: Fix deletion from the program cache.
The key is, oddly enough, in the key field, not in the data field (which
is the vc4_compiled_shader *).  Fixes regular failures in fp-long-alu.
2015-01-06 15:41:36 -08:00
Eric Anholt
b295403971 vc4: Skip storing the Z/S contents when it's invalidated.
Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409%
(n=67).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:41 -08:00
Eric Anholt
239db93888 gallium: Plumb the swap INVALIDATE_ANCILLARY flag through more layers.
v2: Instead of telling the driver that the window system ancillaries have
    been invalidated (when the driver doesn't know which of its buffers
    are the window system's!), introduce a method for invalidating
    specific surfaces.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:41 -08:00
Tom Stellard
a8ef880a1b radeon/llvm: Use amdgcn triple for SI+ on LLVM >= 3.6
2015-01-06 12:53:21 -08:00
Tom Stellard
761e36b4ca radeonsi: Cache LLVMTargetMachine object in si_screen
Rather than building a new one every compile.  This should reduce some
of the overhead of compiling shaders.

One consequence of this change is that we lose the MachineInstrs dumps
when dumping the shaders via R600_DEBUG.  The LLVM IR and assembly are
still dumped, and if you still want to see the MachineInstr dump, you
can run the dumped LLVM IR through llc.
2015-01-06 12:53:21 -08:00
Brian Paul
d294365d06 draw: silence uninitialized variable warning
v2: move initialization of llvm_gs to declaration.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-05 13:50:54 -07:00
Brian Paul
04e35cc4aa gallivm: silence a couple compiler warnings
Silence warnings about possibly uninitialized variables when making a
release build.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-01-05 13:50:54 -07:00
Leonid Shatz
5fea39ace3 gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This lead to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:39 +01:00
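
The failure mode in the commit above (a detected cache line size of zero later used as a divisor) can be sketched as follows. The helper names and the 64-byte fallback are assumptions made for this example, not the values or functions used in gallium/util.

```c
#include <assert.h>

/* Hypothetical fallback used when detection reports 0. */
#define FALLBACK_CACHE_LINE 64u

/* Never hand a zero cache line size to the rest of the code. */
static unsigned sanitize_cache_line(unsigned detected)
{
    return detected != 0 ? detected : FALLBACK_CACHE_LINE;
}

/* Typical consumer: rounds a size up to a multiple of the cache line.
 * A zero line size would be a division by zero here. */
static unsigned round_up_to_line(unsigned size, unsigned line)
{
    assert(line != 0);
    return (size + line - 1) / line * line;
}
```

Sanitizing once at detection time keeps every later consumer free of zero checks.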
Roland Scheidegger
b59c7ed0ab gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:38 +01:00
Ilia Mirkin
21a280f87c nvc0: add name to magic number
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
7228302009 nvc0: regenerate rnndb headers
The headers hadn't been regenerated in a long time and had seen a number
of manual modifications. A few changes:
 - remove nvc0_2d entirely, use the nv50 header which has the nvc0
   values too
 - remove 3ddefs, it's identical to the nv50 file
 - move macros out into a separate file

Also the upstream rnndb changed the overall chip naming convention; this
was fixed up manually in the generated files until a better solution is
determined.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
7ed02b111a nv50: regenerate rnndb headers
The headers hadn't been regenerated in a long time, and there were a few
minor divergences. Among other things, rnndb has changed naming to
G80/etc. For now I've not tackled switching that over and have manually
replaced the nvidia codenames with the chip ids. However, no other
modifications were made to the headergen'd headers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Tobias Klausmann
1f8c0be27e nv50: enable texture compression
Compression seems to be supported for only some formats. Enable it for
those. Previously this was disabled for everything despite the code
looking like it was actually enabled.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
e452cfb149 nv50/ir: enable sat modifier for OP_SUB
SUB is handled the same as ADD, so no reason not to allow a saturate
modifier on it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Roy Spliet
44673512a8 nv50/ir: Add sat modifier for mul
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
ec3e1e6194 nv50,nvc0: avoid doing work inside of an assert
assert is compiled out in release builds - don't put logic into it. Note
that this particular instance is only used for vp debugging and is
normally compiled out.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
fb1afd1ea5 nv50/ir: fix texture offsets in release builds
asserts get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 00:34:33 -05:00
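
The hazard behind the two commits above is standard C behaviour: when `NDEBUG` is defined, `assert(expr)` expands to nothing, so any work done inside `expr` disappears as well. A minimal sketch, with a hypothetical `set_offset` helper standing in for the real texture-offset logic:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper with a side effect: stores the offset, reports success. */
static bool set_offset(int *slot, int value)
{
    *slot = value;
    return true;
}

/* Fragile: with -DNDEBUG the call inside assert() is compiled out,
 * so slot keeps its initial value of 0. */
static int offset_fragile(int value)
{
    int slot = 0;
    assert(set_offset(&slot, value));
    return slot;
}

/* Robust: do the work unconditionally and assert only on the result. */
static int offset_robust(int value)
{
    int slot = 0;
    bool ok = set_offset(&slot, value);
    assert(ok);
    (void)ok; /* silences the unused-variable warning under NDEBUG */
    return slot;
}
```

In a debug build both functions return `value`; in a release build only `offset_robust` does.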
Marek Olšák
3793a1b421 r300g: handle vertex format PIPE_FORMAT_NONE
2015-01-04 23:54:47 +01:00
Roy Spliet
c3260f8d98 nv50/ir: Fold sat into mad
The mad instruction emitter already supported the saturate modifier,
but the ModifierFolding pass never tried folding cvt sat operations
in for NV50.

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-01 21:40:35 -05:00
Ilia Mirkin
9e94b87b60 nv50/ir: fold MAD when one of the multiplicands is const
Fold MAD dst, src0, immed, src2 (or src0/immed swapped) when
 - immed = 0 -> MOV dst, src2
 - immed = +/- 1 -> ADD dst, src0, src2

These types of MAD patterns were observed in some st/nine shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-01 21:40:35 -05:00
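
The folding rule in the commit above can be sketched as a small decision function. This is an illustration in C, not the nv50/ir pass: the types and names are hypothetical, and carrying the -1 case as a negation of src0 is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>

/* What dst = src0 * immed + src2 can be rewritten to. */
enum simple_op { KEEP_MAD, USE_MOV, USE_ADD };

struct fold_result {
    enum simple_op op;
    bool negate_src0; /* only meaningful for USE_ADD (immed == -1) */
};

/* Decide the rewrite from the immediate multiplicand alone. */
static struct fold_result fold_mad(float immed)
{
    struct fold_result r = { KEEP_MAD, false };

    if (immed == 0.0f) {
        r.op = USE_MOV;            /* MOV dst, src2 */
    } else if (immed == 1.0f) {
        r.op = USE_ADD;            /* ADD dst, src0, src2 */
    } else if (immed == -1.0f) {
        r.op = USE_ADD;            /* ADD dst, -src0, src2 (assumed form) */
        r.negate_src0 = true;
    }
    return r;
}

/* Evaluate the rewritten instruction so it can be compared to the MAD. */
static float eval_folded(struct fold_result r, float src0, float immed, float src2)
{
    assert(!r.negate_src0 || r.op == USE_ADD);

    switch (r.op) {
    case USE_MOV:
        return src2;
    case USE_ADD:
        return (r.negate_src0 ? -src0 : src0) + src2;
    default:
        return src0 * immed + src2;
    }
}
```

Like the rule as stated, the immed = 0 case ignores src0 entirely, so it differs from a real multiply-add when src0 is NaN or infinite.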
Alexander von Gluck IV
290553b6d6 gallium/state_tracker: Rewrite Haiku's state tracker
* More gallium-like
* Leverage stamps properly and don't call mesa functions
2015-01-01 21:33:36 -05:00
Marek Olšák
b77eaafcdc radeonsi: fix warnings
2015-01-01 14:42:32 +01:00
Eric Anholt
a6f6d6188c u_primconvert: Fix leak of the upload BO on context destroy.
v2: Conditionalize it on having done any uploads (Turns out
    u_upload_destroy() isn't safe with a NULL arg).

Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2014-12-31 13:50:17 -08:00
Eric Anholt
37478c638a vc4: Fix memory leak as of 0404e7fe0a.
Can't reset the CL before looking at how much we had put in it.
2014-12-31 11:34:28 -08:00
Ilia Mirkin
be0311c962 nv50,nvc0: set vertex id base to index_bias
Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
  arb_draw_indirect-vertexid elements
  gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
2014-12-30 23:30:23 -05:00
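
The relationship the piglit tests above check, namely that gl_VertexID for an indexed draw includes the base vertex offset, can be written out as a small sketch. The struct and function are hypothetical; the field names only mirror the VERTEXID_NOBASE and BASEVERTEX system values mentioned elsewhere in this log.

```c
#include <assert.h>

/* The three values a shader can see for one vertex of an indexed draw. */
struct vertex_ids {
    int no_base;   /* index fetched from the index buffer (VERTEXID_NOBASE) */
    int base;      /* base vertex / index_bias of the draw (BASEVERTEX) */
    int with_base; /* what gl_VertexID reports */
};

static struct vertex_ids compute_vertex_ids(unsigned index, int base_vertex)
{
    struct vertex_ids ids;

    ids.no_base = (int)index;
    ids.base = base_vertex;
    ids.with_base = ids.no_base + ids.base;

    /* Any one of the three can be recovered from the other two. */
    assert(ids.with_base - ids.base == ids.no_base);
    return ids;
}
```

A driver whose hardware reports only the raw index therefore has to add index_bias itself, which is the gap the commit describes for the original G80.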