Commit graph

31608 commits

Author SHA1 Message Date
Ilia Mirkin
5fdcddbeb4 a5xx: update headers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Marek Olšák
156832ee2b gallium/radeon: attempt to fix a compiler failure in radeon_winsys.h
trivial.
2017-07-04 22:40:35 +02:00
Marek Olšák
0591df025b winsys/amdgpu: use 128KB BOs for suballocations of up to 64KB BOs
This decreases the number of BOs, but might also increase memory usage.
It's better for small textures.

The gameplay is on the far right:
https://people.freedesktop.org/~mareko/suballoc.svg

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
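A minimal sketch of the sizing trade-off described in 0591df025b, not the winsys code itself. It assumes slab entries are rounded up to a power of two; SLAB_SIZE, MAX_ENTRY_SIZE and both helper names are hypothetical and only mirror the 128KB / 64KB figures from the subject line.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the commit subject; the names are illustrative. */
#define SLAB_SIZE      (128u * 1024u)  /* size of the backing BO */
#define MAX_ENTRY_SIZE (64u * 1024u)   /* largest request that is suballocated */

/* Round a size up to the next power of two (assumed entry granularity). */
static uint32_t
round_up_pot(uint32_t v)
{
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/*
 * Number of entries of the given size that fit into one slab,
 * or 0 when the request is too large and needs its own BO.
 */
static uint32_t
entries_per_slab(uint32_t size)
{
   assert(size > 0);
   if (size > MAX_ENTRY_SIZE)
      return 0;
   return SLAB_SIZE / round_up_pot(size);
}
```

Under these assumptions a 64KB request shares its backing BO with one other entry, which is where both the lower BO count and the possible extra memory use come from.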
Marek Olšák
c784015643 gallium/radeon: allow suballocating textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
23446eedd1 gallium/radeon: generalize the function for in-place texture reallocation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
91f72975ac gallium/radeon: add radeon_winsys::buffer_is_suballocated
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
0f13451da3 gallium/radeon: clean up pb_cache bucket/usage determination
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
d4fac1e1d7 gallium/radeon: enable suballocations for VRAM with no CPU access
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
64e5577cac gallium/radeon: clean up (domain, flags) <-> (slab heap) translations
This is cleaner, and we are down to 4 slabs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
b09a22ad21 gallium/radeon: remove RADEON_FLAG_CPU_ACCESS
https://lists.freedesktop.org/archives/amd-gfx/2017-June/010591.html

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
03c5ef195d gallium/radeon: disallow exports of sparse and suballocated BOs
I think it's unsafe, because the slabs can reuse exported storage.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
047c34f0ac gallium/radeon: clean up r600_texture_get_handle
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
7525c3e123 gallium/radeon: rename RADEON_FLAG_HANDLE -> RADEON_FLAG_NO_SUBALLOC
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
e6dbe975ef gallium/radeon: fix a possible crash for buffer exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
fee2883bd7 gallium/radeon: ignore PIPE_BIND_SHARED for buffers
BO exports can't be predicted this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
5b373629fc radeonsi: add a HUD query for getting an average GFX BO list size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Juan A. Suarez Romero
2c240a7205 vc4: automake: include vc4_cl_dump.h in
Ensure vc4_cl_dump.h and $(BROADCOM_FILES) are distributed in the
dist-file.

This fixes `make distcheck`

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-04 09:37:19 +02:00
Brian Paul
6158c0b5d8 svga: don't call svga_texture_device_format_has_alpha() for PIPE_BUFFER
svga_texture_device_format_has_alpha() is only intended to work for
texture resources, not buffer resources.  This fixes a failed assertion
in the svga_texture() cast function when running texture buffer tests.

Also, add an assertion in svga_texture_device_format_has_alpha() to
catch the issue sooner.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:14 -06:00
Brian Paul
e6d1cc31fa svga: fix texture buffer object regression
With change 8aba778fa2 we stopped binding
sampler objects for texture buffers.  That broke our texture sample /
sampler view setup code.

Now, we loop over the max(num samplers, num sampler views) and handle
the sampler and view information separately.  For texture buffers,
the sampler will be NULL but the sampler view non-null.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:13 -06:00
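The loop described in e6d1cc31fa can be sketched roughly as follows. This is an illustration under stated assumptions, not the svga code: struct sampler, struct sampler_view and bind_samplers_and_views() are hypothetical stand-ins, and the "setup" is reduced to counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's sampler and sampler-view objects. */
struct sampler { int filter; };
struct sampler_view { int is_buffer; };

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/*
 * Walk max(num_samplers, num_views) slots and handle the two arrays
 * independently.  Returns the number of slots with a view bound;
 * *samplers_bound receives the number of slots with a sampler bound.
 */
static unsigned
bind_samplers_and_views(struct sampler *const *samplers, unsigned num_samplers,
                        struct sampler_view *const *views, unsigned num_views,
                        unsigned *samplers_bound)
{
   unsigned count = MAX2(num_samplers, num_views);
   unsigned views_bound = 0;

   assert(samplers_bound);
   *samplers_bound = 0;

   for (unsigned i = 0; i < count; i++) {
      const struct sampler *s = i < num_samplers ? samplers[i] : NULL;
      const struct sampler_view *v = i < num_views ? views[i] : NULL;

      if (v)
         views_bound++;        /* view state is set up even without a sampler */
      if (s)
         (*samplers_bound)++;  /* sampler state only where a sampler exists */
   }
   return views_bound;
}
```

A texture buffer slot then shows up as a NULL sampler next to a non-NULL view, and the view is still processed.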
Brian Paul
6b4bf7e8be svga: move assertion in draw_vgpu10()
The buffer binding flags aren't ensured until after the
svga_buffer_handle() call, so move the assertion after it.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:13 -06:00
Brian Paul
9bd047aa26 svga: fix buffer binding flags initialization
If a buffer is created/initialized with glNamedBufferData we will
have no target (GL_ARRAY_BUFFER, GL_UNIFORM_BUFFER, etc) so the
svga_buffer::bind_flags will be zero until we try to get the buffer
handle.

This patch initializes the svga_buffer::bind_flags field when it's
zero.

This fixes the Piglit arb_uniform_buffer_object-rendering-dsa test.

Note that there are still issues in this area that'll have to be
addressed in the future.  For example, creating a buffer object
as GL_UNIFORM_BUFFER and later using it as a vertex buffer will
fail.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:11 -06:00
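A minimal sketch of the lazy initialization described in 9bd047aa26, assuming the flags are filled in on the first handle request. The flag values, struct buffer and get_handle_bind_flags() are hypothetical and do not reproduce the svga definitions.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
#define BIND_VERTEX_BUFFER   (1u << 0)
#define BIND_CONSTANT_BUFFER (1u << 1)

/* Stand-in for a buffer created without a target (bind_flags == 0). */
struct buffer { unsigned bind_flags; };

/*
 * Called when a handle is requested for a particular use.  If the
 * buffer was created without a target, the first use decides the
 * flags.  Returns the flags in effect afterwards.
 */
static unsigned
get_handle_bind_flags(struct buffer *buf, unsigned requested)
{
   assert(buf && requested);
   if (buf->bind_flags == 0)
      buf->bind_flags = requested;
   return buf->bind_flags;
}
```

In this sketch a later request with a different use leaves the flags unchanged, which corresponds to the remaining limitation mentioned in the commit message.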
Nicolai Hähnle
b0b4b5e8f7 winsys/radeon: only call pb_slabs_reclaim when slabs are actually used
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100242
Fixes: fb827c055c ("winsys/radeon: enable buffer allocation from slabs")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-03 12:39:41 +02:00
Bruce Cherniak
32c1a54bd0 swr: Limit memory held by defer deleted resources.
This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold.  This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".

This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test.  For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.

Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test, as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations.  But this addresses those as well.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-02 17:38:57 -05:00
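The threshold mechanism described in 32c1a54bd0 can be sketched as below. This is not the swr implementation: MAX_DEFERRED_ITEMS is a made-up value (the log does not state the real threshold), and submit_sync_fence() is a stand-in that simply releases everything queued so far.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold; the driver's actual value is not given in the log. */
#define MAX_DEFERRED_ITEMS 4

struct defer_queue {
   size_t pending;     /* items waiting on the fence work queue */
   size_t bytes_held;  /* memory retained by those items */
   unsigned syncs;     /* number of forced sync fences submitted */
};

/* Stand-in for a sync fence: all queued deletions run when it completes. */
static void
submit_sync_fence(struct defer_queue *q)
{
   q->pending = 0;
   q->bytes_held = 0;
   q->syncs++;
}

/* Queue a resource for deferred deletion; force a sync past the threshold. */
static void
defer_delete(struct defer_queue *q, size_t bytes)
{
   assert(q);
   q->pending++;
   q->bytes_held += bytes;
   if (q->pending > MAX_DEFERRED_ITEMS)
      submit_sync_fence(q);
}
```

With this shape the memory held for deferred deletion is bounded by the threshold times the largest resource size, instead of growing until an unrelated pipeline flush happens.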
Brian Paul
f215f42f1b svga: add texture size/levels sanity check code in svga_texture_create()
The state tracker should never ask us to create a texture with invalid
dimensions / mipmap levels.  Do some assertions to check that.

No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-30 13:37:10 -06:00
Brian Paul
e54fe78e0e gallium/docs: document that TXF is used with PIPE_BUFFER resources
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
Brian Paul
f4091e1638 gallium/docs: clarify that samplers are not used with PIPE_BUFFER resources
Commit 8aba778fa2 "st/mesa: don't set
sampler states for TBOs" changed how texture buffer objects are handled.
Document the new convention.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
Eric Anholt
d623040dd5 vc4: Start using XML unpack functions in CL dump.
For now this is a no-op on the output, but it makes it clear that we've
had weird things going on with things like
V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
2017-06-30 12:25:45 -07:00
Eric Anholt
56541d356d vc4: Replace a couple of magic numbers with #define usage.
2017-06-30 12:25:45 -07:00
Eric Anholt
f6c5c6b9be vc4: Move rasterizer state packing to CSO creation time.
This gets our vc4_emit.c size back down a bit:

before:
   1020       0       0    1020     3fc src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
    968       0       0     968     3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
bd1925562a vc4: Convert the driver to emitting the shader record using pack macros.
2017-06-30 12:25:45 -07:00
Eric Anholt
8d36bd3d08 vc4: Simplify pack header usage
Take the CL pointer in, which will be useful for enabling relocs.
However, our code expands a bit more:

before:
   4449       0       0    4449    1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988       0       0     988     3dc src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
   4481       0       0    4481    1181 src/gallium/drivers/vc4/.libs/vc4_draw.o
   1020       0       0    1020     3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
4cef255872 vc4: Start using the pack header.
This slightly inflates the size of the generated code, in exchange for
getting us some convenient tools.

before:
   4389       0       0    4389    1125 src/gallium/drivers/vc4/.libs/vc4_draw.o
    808       0       0     808     328 src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
   4449       0       0    4449    1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988       0       0     988     3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
7f80a9ff13 vc4: Introduce XML-based packet header generation like Intel's.
I really liked this idea, as it should help with management of packet
parsing tools like the CL dump.  The python script is forked off of theirs
because our packets are byte-based instead of dwords, and the changes to
do so while avoiding performance regressions due to unaligned accesses
were quite invasive.

v2: Fix Android.mk paths, drop shebang for python script, fix overlap
    detection.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Rob Herring <robh@kernel.org>
2017-06-30 12:25:45 -07:00
Bruce Cherniak
6646f6ba0d swr: Minor cleanup of variable usage, no functional change.
In swr_update_derived, for consistency, index buffer validation should
be using the p_draw_info copy "info" rather than referencing
p_draw_info.

No functional change.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
b9b53e2695 swr: use swr_query_result type instead of void
Tag pStat field in swr_draw_context structure so gen_llvm_types.py
can deal with the actual structure type instead of using void.

Code cleanup, no functional change.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
80bd5cd9d0 swr/rast: increase number of possible draws in flight
Increases performance of some large workloads on KNL by ~30%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
ab564c7ab4 swr/rast: move default split size from driver to rasterizer
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
64af92c977 swr/rast: Fix missing setup of psContext.pColorBuffer
Fixes render target read access from pixel shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
fc4f6c44c4 swr/rast: Switch intrinsic usage to SIMDLib
Switch from a macro-based simd intrinsics layer to a more C++
implementation, which also adds AVX512 optimizations to 128-bit
and 256-bit SIMD.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
614de92f10 swr/rast: Fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
0cc7c46cf4 swr/rast: Split rasterizer.cpp to improve compile time
Hardcode split to four files currently.  Decreases swr build
time on KNL by over 50%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
5eecaca911 swr/rast: gen_backends.py remove extraneous semicolon
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
f87ff64850 swr/rast: Support dynamically sized vertex layout
Each shader stage state (VS, TS, GS, SO, BE/CLIP) now has a
vertexAttribOffset to specify the offset to the start of the
general attribute section of the incoming verts for that stage.
It is up to the driver to set this up correctly based on the
active stages. All the shader stages use this value instead of
VERTEX_ATTRIB_START_SLOT to offset to the incoming attributes.

Only the vertex shader stage supports dynamic layout output
currently. The other stages continue to expect the output to be
the fixed layout slots as before. Will be enabling GS next.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
cae53b24d7 swr/rast: Split backend.cpp to improve compile time
Hardcode split to four files currently.  Decreases swr build
time on a quad-core by ~10%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
b89bd3694c swr/rast: gen_backends.py removal of commented debug prints
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
248663f91d swr/rast: gen_backends.py quote cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
ba64ddedc2 swr/rast: generators will create target directories
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Mauro Rossi
84690d06c1 Android: fix typo in symlink for driver loading and 32 bit builds
There is a typo in the mkdir command path;
the correct one is $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)

The other issue is in 32-bit builds: because lib64 does not exist there,
we can use TARGET_IS_64_BIT to refine the post-install command.

Fixes: a3d98ca62f ("Android: use symlinks for driver loading")

Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-30 11:23:51 -05:00
Brian Paul
0782350b80 svga: update a few surface format names
To sync with in-house changes.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-30 08:24:27 -06:00
Brian Paul
d3cbe8c5f3 svga: whitespace fixes in svga_resource_buffer_upload.c
Trivial.
2017-06-30 08:24:27 -06:00