27608 commits

Author SHA1 Message Date
Ilia Mirkin
0f673db6f0 nvc0: reduce overhead from always marking buffers dirty
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSOs to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Ilia Mirkin
e8ee161b16 nvc0: fix memory barrier flag handling
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Ilia Mirkin
29abbeecd8 nvc0: mark bound buffer range valid
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Marek Olšák
d5491a81ff gallium/radeon: don't use the DMA ring for pipelined buffer uploads
Submitting a DMA IB flushes the GFX IB and all GPU caches.

Vedran Miletić said:
  "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps
   (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling
   from 1200p)."

Some anonymous dude said:
   R9 390 results:
      Tomb Raider (normal settings): 80 -> 88 FPS
      Talos Principle (custom settings): 23 -> 56 FPS
      Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Vedran Miletić <vedran@miletic.net>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
9c35ec2042 r600g: don't flush caches when binding shader resources
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
eff94af794 r600g: only do necessary cache flushes in cp_dma_copy_buffer
The main impact is that {upload, draw, upload, draw, ..} doesn't flush
framebuffer caches before every upload.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
9e62012c30 r600g: only do necessary cache flushes in cp_dma_clear_buffer
The main impact is that fast color clear doesn't flush TC, CONST, DB.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
c92a3ae7e9 r600g: remove a CP DMA workaround that's not needed anymore
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
5ea5ed6050 r600g: fix CP DMA hazard with index buffer fetches (v3)
v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel,
    otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ade16e1f5d r600g: properly sync CP with CP DMA on R6xx
This will allow removing useless cache & IB flushes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
7746903d3a r600g: write WAIT_UNTIL in the correct place
This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ee0c96c11e gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ada3d8f31e gallium/u_suballoc: allow different alignment for each allocation
Just move the alignment parameter from u_suballocator_create
to u_suballocator_alloc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
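
For ada3d8f31e, the effect of taking the alignment per allocation rather than once at creation time can be sketched as below. This is only an illustration under assumptions: `bump_suballoc`, `align_up` and `bump_alloc` are hypothetical names, and the real u_suballocator works on pipe resources rather than plain offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump suballocator; names are illustrative, not Mesa's. */
struct bump_suballoc {
    size_t size;   /* total bytes in the backing buffer */
    size_t offset; /* next free byte */
};

/* Round offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* The alignment is an argument of each allocation, not of the allocator.
 * Returns 1 and stores the start offset on success, 0 if out of space. */
static int bump_alloc(struct bump_suballoc *a, size_t bytes,
                      size_t alignment, size_t *out_offset)
{
    size_t start = align_up(a->offset, alignment);

    if (start > a->size || bytes > a->size - start)
        return 0;
    *out_offset = start;
    a->offset = start + bytes;
    return 1;
}
```

With this shape, one allocator can serve callers that need 4-byte and 256-byte alignment from the same backing buffer.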
Rob Clark
1535519e51 freedreno/ir3: do idiv lowering after main opt loop
Give algebraic-opt pass a chance to catch udiv by const power-of-two,
before running lower-idiv pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-03 16:05:03 -04:00
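
The ordering in 1535519e51 matters because an unsigned divide by a constant power of two is just a right shift, which is far cheaper than a general lowered divide. A minimal sketch of that identity, with hypothetical helper names (`is_pow2`, `log2_pow2`, `udiv`) that are not taken from NIR or ir3:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if d is a nonzero power of two. */
static int is_pow2(uint32_t d)
{
    return d != 0 && (d & (d - 1)) == 0;
}

/* log2 of a power of two, found by counting shifts. */
static unsigned log2_pow2(uint32_t d)
{
    unsigned shift = 0;

    while (d > 1) {
        d >>= 1;
        shift++;
    }
    return shift;
}

/* Unsigned divide: a power-of-two divisor becomes a shift,
 * anything else falls back to the general divide. */
static uint32_t udiv(uint32_t x, uint32_t d)
{
    assert(d != 0); /* division by zero is not handled in this sketch */
    if (is_pow2(d))
        return x >> log2_pow2(d);
    return x / d;
}
```

If the general lowering ran first, the divide would already be expanded and the cheap shift form could no longer be matched.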
Nicolai Hähnle
a64c7cd2ba radeonsi: mark buffer texture range valid for shader images
When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.

This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-03 14:11:05 +02:00
Samuel Pitoiset
28590eb949 nvc0: mark buffer texture range valid for shader images
Loosely based on radeonsi (Thanks to Nicolai).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-06-03 00:12:23 +02:00
Charmaine Lee
0cf0d7c02e svga: allow copy box in svga_transfer_dma_band()
Instead of only allowing a rectangle to be copied in svga_transfer_dma_band(),
this patch allows it to copy a box, so a 3D texture can be copied
in one transfer.

Fixes a black screen when running Heaven after commit fb9fe35. (Bug 1663282)

Tested with Heaven, glretrace, piglit.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-02 15:03:41 -06:00
Rob Clark
94d8fbd217 freedreno: fix bad bitshift warnings
Coverity doesn't realize idx will never be negative.  Throw in some
assert()s to help it out.

(Hopefully assert() isn't getting compiled out for coverity build.. but
there seems to be just one way to find out.  We might have to change
these to assume())

Fixes CID 1362442, 1362443

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 16:29:32 -04:00
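
The pattern described in 94d8fbd217 might look roughly like the following. This is a sketch under assumptions: `slot_for_type` and `mask_for_type` are made-up names standing in for the driver's index lookup and its caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup: returns a slot in [0, 31], or -1 for unknown types. */
static int slot_for_type(int type)
{
    return (type >= 0 && type < 32) ? type : -1;
}

/* Callers only pass valid types, so idx is never negative here. The
 * assert states that invariant so a static analyzer can see it too. */
static uint32_t mask_for_type(int type)
{
    int idx = slot_for_type(type);

    assert(idx >= 0);
    return UINT32_C(1) << idx;
}
```

Note that assert() expands to nothing when NDEBUG is defined, which is the concern the commit message raises about the Coverity build.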
Rob Clark
676c77a923 freedreno: assume builtin shaders do compile
Maybe we should switch to ureg to build the builtin shaders.  But at any
rate, if they fail to compile it is because someone messed them up (or
changed TGSI syntax?).

CID 1362444

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 16:29:32 -04:00
Rob Clark
80c2886033 freedreno/a4xx: silence coverity warning
CID 1362451

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
9b854ce53c freedreno/a3xx+a4xx: fix potential null ptr deref
Coverity spotted the a3xx case (not sure why not the a4xx).

CID 1362452

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
27a97097e1 freedreno/ir3: fix coverity warning
CID 1362453

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
374ad2e2bd freedreno/ir3: use nir_shader_get_entrypoint() helper
Should also fix coverity warning: CID 1362454

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
df64cd6814 freedreno/a4xx: fix incorrect enum type
a4xx has its own enum, different from a2xx/a3xx.

Spotted by coverity: CID 1362458, 1362459

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
1632b0eac0 freedreno: fix coverity negative array index warning
This can never happen, since the query would not have been created in the
first place if pidx(query_type) returned a negative value.  Let's let
Coverity realize this.

CID 1362460

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
ba452d43e0 freedreno: fix dereference before null check
ptr can actually never be null so just drop the check.

CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking ptr suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
228b2b36f4 gallium/util: remove u_staging
Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2016-06-02 15:44:07 -04:00
Rob Clark
18fb922faa freedreno/a3xx: only update/emit bordercolor state when needed
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
11f0652404 freedreno/a4xx: only update/emit bordercolor state when needed
I noticed in stk that it was contributing to a lot of overhead.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Eric Engestrom
17f4c723eb st/osmesa: remove double-write (overwriting)
These two lines have been here since the file was created.
I'm guessing the second one was just for testing during dev, so it's the
one that's going away.

CoverityID: 1296205

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-02 07:05:05 -06:00
Nayan Deshmukh
6c9a352d79 st/vdpau: check for null pointer in get/put bits.
Check for null pointer before accessing arrays in get/put bits
native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-02 09:28:48 +02:00
Christian König
b3e75c3997 radeon/uvd: fix the H264 level for Tonga v2
We have supported 5.2 for a while now.

v2: we even support 5.2 for H264, 5.1 is for HEVC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
2016-06-02 09:27:57 +02:00
Nicolai Hähnle
c7877b9dab winsys/amdgpu: decay max_ib_size over time
So that memory use will eventually decrease again after a temporary peak.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
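
One way to picture the decay in c7877b9dab is a running maximum that shrinks a little on every submission and is bumped back up whenever a larger IB is seen. The sketch below is an assumption-laden illustration: the struct, the function name and the 1/32 decay rate are all hypothetical, not taken from the winsys code.

```c
#include <assert.h>

/* Hypothetical tracker; field and function names are illustrative. */
struct ib_size_tracker {
    unsigned max_ib_size; /* size estimate used for the next IB, in dwords */
};

/* Call once per submitted IB with the number of dwords actually used. */
static void ib_submitted(struct ib_size_tracker *t, unsigned used_dwords)
{
    /* Decay the estimate by 1/32 (an arbitrary rate for this sketch)... */
    t->max_ib_size -= t->max_ib_size / 32;

    /* ...but never let it drop below what was just needed. */
    if (used_dwords > t->max_ib_size)
        t->max_ib_size = used_dwords;
}
```

After a single very large IB, a long run of small submissions gradually pulls the estimate back down, so later allocations stop being sized for the peak.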
Nicolai Hähnle
6aff6377b1 winsys/amdgpu: implement IB chaining on the gfx ring
As a consequence, CE IB size never triggers a flush anymore.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
45be461f55 winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalize
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
89ba076de4 radeon/winsys: introduce radeon_winsys_cs_chunk
We will chain multiple chunks together and will keep pointers to the older
chunks to support IB dumping.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
a7c26bfc0c radeonsi/sid: add packet definitions for IB chaining
While we're at it, add packet printing in si_debug.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
83a01cb498 winsys/amdgpu: start with smaller IBs, growing as necessary
This avoids allocating giant IBs from the outset, especially for CE and DMA.

Since we now limit max_dw only by the size that the buffer happens to be
(which, due to the buffer cache, can be even larger than the rounded-up size
we request), the new function amdgpu_ib_max_submit_dwords controls when we
submit an IB.

With this change, we effectively never flush prematurely due to the CE IB,
after an initial warm-up phase.

v2:
- clean up buffer_size calculation

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
f80c6abb9e winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functions
The latter function allows getting the containing amdgpu_cs from any IB
(including non-main ones).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
9e5ed559ba winsys/amdgpu: extract IB big buffer allocation for re-use
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
9db851b5ee winsys/amdgpu: add IB buffer in amdgpu_get_new_ib
Adding the buffer when we start using it for the IB makes the logic for
chaining a bit simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
d6211a61b0 gallium/radeon: use cs_check_space throughout
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
46ad3561be radeon/winsys: add cs_check_space
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
92d5d97b10 winsys/amdgpu: simplify interface of amdgpu_get_new_ib
We'll want to have an amdgpu_cs pointer for future changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
8396ab4241 winsys/amdgpu: add amdgpu_cs_has_user_fence
v2: style change

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
d9893feb2c gallium/cso: allow saving the first fragment shader image slot
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:15 +02:00
Nicolai Hähnle
fc0352ff9c gallium/u_inlines: allow NULL src in util_copy_image_view
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:12 +02:00
Nicolai Hähnle
57f576f1fb gallium: add PIPE_BARRIER_ALL define
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:36:48 +02:00
Marek Olšák
12740efd29 radeonsi: set correct stencil tile mode for texturing
Sadly, this doesn't affect SI and VI in any way.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-01 17:35:30 +02:00
Marek Olšák
ea68215c54 winsys/amdgpu: set flags correctly when allocating depth-stencil buffers
This mimics Vulkan. It also documents how to fix stencil texturing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-01 17:35:30 +02:00