Commit graph

37593 commits

Author SHA1 Message Date
Kenneth Graunke
2208d5a683 iris: Fix DrawTransformFeedback math when there's a buffer offset
We need to subtract the starting offset from the final offset before
dividing by the stride.  See src/intel/vulkan/genX_cmd_buffer.c:3142.

Not known to fix anything.
2019-04-23 15:57:07 -07:00
Kenneth Graunke
38db20245b iris: Make some offset math helpers take a const isl_surf pointer 2019-04-23 15:47:10 -07:00
Chia-I Wu
cc53815ae1 virgl: skip empty cmdbufs
Several empty cmdbufs are submitted by app/xserver per frame, from
glamor_block_handler for example.  Let's skip them.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-04-23 19:07:48 +00:00
Eric Anholt
ec686a66db gallium: Remove the malloc pipebuffer manager.
This has been unused since r600 stopped using it in 2010.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>
2019-04-23 10:36:07 -07:00
Eric Anholt
6345dfc8f3 gallium: Remove the "alt" pipebuffer manager interface.
This one would allocate from two underlying pools, but has never been
used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>
2019-04-23 10:36:07 -07:00
Eric Anholt
8e31a4f27f gallium: Remove the ondemand pipebuffer manager.
I couldn't find any uses in the tree since its introduction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>
2019-04-23 10:36:07 -07:00
Eric Anholt
f5c08d9818 gallium: Remove the pool pipebuffer manager.
Noticed while trying to decide if pipebuffer was of any use to me, and
found that nothing has used it in the last 10 years at least.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>
2019-04-23 10:36:07 -07:00
Jonathan Marek
d133f55a99 freedreno: a2xx: same gmem2mem sequence for all tiles
Set REG_A2XX_RB_COPY_DEST_OFFSET in the tile init as it won't get touched
by the draw batch. Then gmem2mem is the same for all tiles.

Similar to what is done in a6xx, but only for gmem2mem.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-23 17:13:32 +00:00
Jonathan Marek
4107e0678a freedreno: a2xx: enable batch reordering
Batch reordering on a2xx is now tested and functional.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-23 17:13:32 +00:00
Jonathan Marek
7f670ca5fd freedreno: a2xx: use nir_lower_io for TGSI shaders
Allows removing the load_deref/store_deref code in the compiler.

tgsi_to_nir now uses screen instead of options so we can simplify that too.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-23 17:13:32 +00:00
Jonathan Marek
bce4f11dbc freedreno: a2xx: disable PIPE_CAP_PACKED_UNIFORMS
a2xx driver is currently broken when PIPE_CAP_PACKED_UNIFORMS is enabled,
disable it for now.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-23 17:13:32 +00:00
Jonathan Marek
418c3d9a4f freedreno: a2xx: fix builtin blit program compilation
tgsi_to_nir now requires a screen pointer and is used by fd2_prog_init.
fd2_prog_init is used before fd_context_init so set the pointer manually.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-23 17:13:32 +00:00
Jonathan Marek
33cafb41a2 svga: add new ATC formats to the format conversion table
Fixes the static assertion error.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-23 17:11:56 +00:00
Jonathan Marek
0e696416f9 freedreno: a2xx: add GL_AMD_compressed_ATC_texture support
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-23 17:11:56 +00:00
Jonathan Marek
734409096b freedreno: a3xx: add GL_AMD_compressed_ATC_texture support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-23 17:11:56 +00:00
Jonathan Marek
bfa72e4d52 llvmpipe, softpipe: no support for ATC textures
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-23 17:11:56 +00:00
Jonathan Marek
ea254fcd3c gallium: add ATC format support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-23 17:11:56 +00:00
Marek Olšák
951d60f8cd radeonsi: delay adding BOs at the beginning of IBs until the first draw
so that bound compute shader resources won't be added when they are not
needed and same for graphics.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:36:36 -04:00
Marek Olšák
09bb8c8557 radeonsi: add helper si_get_minimum_num_gfx_cs_dwords
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:36:34 -04:00
Marek Olšák
c59d238bb0 radeonsi: add si_cp_copy_data
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:36:33 -04:00
Marek Olšák
694e320643 winsys/amdgpu: clean up and remove nonsensical assertion
The assertion considers max_dw from the current IB in the chain, but
big_ib_buffer is a buffer for the next IB, which can be smaller.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:36:31 -04:00
Marek Olšák
1807f6cfe9 winsys/amdgpu: enable chaining for compute IBs
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:36:06 -04:00
Marek Olšák
b99bed6246 winsys/amdgpu: reorder chunks, make BO_HANDLES first, IB and FENCE last
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák
437d032b7d winsys/amdgpu: make IBs writable and expose their address
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák
64d6cc982d ac: add radeon_info::marketing_name, replacing the winsys callback
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák
9b33465481 tgsi/scan: add uses_drawid
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Kenneth Graunke
77449d7c41 iris: Track valid data range and infer unsynchronized mappings.
Applications frequently call glBufferSubData() to consecutive regions
of a VBO to append new vertex data.  If no data exists there yet, we
can promote these to unsynchronized writes, even if the buffer is busy,
since the GPU can't be doing anything useful with undefined content.
This can avoid a bunch of unnecessary blitting on the GPU.

u_threaded_context would do this for us, and in fact prohibits us from
doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED).  But we haven't
hooked that up yet, and it may be useful to disable u_threaded_context
when debugging...at which point we'd still want this optimization.  At
the very least, it would let us measure the benefit of threading
independently from this optimization.  And it's not a lot of code.

Removes most stall avoidance blits in "Total War: WARHAMMER."

On my Skylake GT4e at 1920x1080, this appears to improve performance
in games by the following (but I did not do many runs for proper
statistics gathering):

   ----------------------------------------------
   | DiRT Rally        | +2% (avg) | + 2% (max) |
   | Bioshock Infinite | +3% (avg) | + 9% (max) |
   | Shadow of Mordor  | +7% (avg) | +20% (max) |
   ----------------------------------------------
2019-04-23 00:24:08 -07:00
Kenneth Graunke
768b17a7ad iris: Make a resource_is_busy() helper
This checks both "is it busy" and "do we have work queued up for it"?
2019-04-23 00:24:08 -07:00
Kenneth Graunke
5ad0c88dbe iris: Replace buffer backing storage and rebind to update addresses.
This implements PIPE_CAP_INVALIDATE_BUFFER and invalidate_resource(),
as well as the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag.  When either
of these happen, we swap out the backing storage of the buffer for a
new idle BO, allowing us to write to it immediately without stalling
or queueing a blit.

On my Skylake GT4e at 1920x1080, this improves performance in games:

   -----------------------------------------------
   | DiRT Rally        | +25% (avg) | +17% (max) |
   | Bioshock Infinite | +22% (avg) | +11% (max) |
   | Shadow of Mordor  | +27% (avg) | +83% (max) |
   -----------------------------------------------
2019-04-23 00:24:08 -07:00
Kenneth Graunke
0a082b6560 iris: Make memzone_for_address non-static
I want to use this in iris_resource.c.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
72277044e2 iris: Make a gl_shader_stage -> pipe_shader_stage helper function
This is probably not the best place for it, but I don't feel like moving
the one out of the TGSI translator today, and we already have the other
direction here, so...*shrug*
2019-04-23 00:24:08 -07:00
Kenneth Graunke
b45dff1da8 iris: Rework image views to store pipe_image_view.
This will be useful when rebinding images.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
2f60850a3f iris: Rework UBOs and SSBOs to use pipe_shader_buffer
This unifies a bunch of the UBO and SSBO code to use common structures.
Beyond iris_state_ref, pipe_shader_buffer also gives us a buffer size,
which can be useful when filling out the surface state.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
00d4019676 iris: Track bound constant buffers
This helps avoid having to iterate over [0, PIPE_MAX_CONSTANT_BUFFERS)
looking to see if any resources are bound.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
4d12236072 iris: Mark constants dirty on transfer unmap even if no flushes occur
I have various conditions in place to try and avoid unnecessary
PIPE_CONTROL flushes, especially to batches which may have never
used the buffer being mapped.  But if we do a CPU map to a bound
constant buffer, we still need to mark push constants dirty, even
if there's nothing happening in batches that would warrant a flush.

Fixes obvious misrendering in the "XCOM 2: War of the Chosen" menus
(lots of rainbow colored triangles).  Fixes lots of blinking elements
in "Shadow of Mordor".  Fixes missing crowd rendering in "DiRT Rally".
2019-04-23 00:24:08 -07:00
Marek Olšák
b58e5fb6f3 radeonsi: use CP DMA for the null const buffer clear on CIK
This is a workaround for a thread deadlock that I have no idea
why it occurs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108879
Fixes: 9b331e462e

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-22 16:05:52 -04:00
Kenneth Graunke
1566054459 iris: Track bound and writable SSBOs
Marek recently extended pipe->set_shader_buffers() to take an extra
writable_bitmask parameter, indicating which SSBOs are writable (some
may be bound read-only).  We can use this to decide whether to set
EXEC_OBJECT_WRITE when pinning.  Avoiding the write flag can save us
some cross-batch flushing if the SSBO is used for reading in both the
render and compute engines.
2019-04-22 11:31:14 -07:00
Chia-I Wu
e9c5e13344 virgl: clear vertex_array_dirty
Clear vertex_array_dirty after the state is emitted.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-04-22 10:19:47 -07:00
Lubomir Rintel
e983a975c6 gallivm: disable NEON instructions if they are not supported
The LLVM project made some questionable decisions about defaults for
armv7 (e.g. they enable NEON that is not there on NVIDIA and Marvell
platforms).

On top of that, getHostCPUFeatures() doesn't disable missing machine
attributes. Finally, -neon alone is not sufficient to disable emmision
of NEON instructions.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 09:47:49 -07:00
Lubomir Rintel
bc6bfc861f gallivm: guess CPU features also on ARM
getHostCPUFeatures() is also available on ARM, for even longer time than
for x86. Use it -- it potentially enables instructions that may speed
things up.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/518
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 09:47:39 -07:00
Kenneth Graunke
36478b9f77 iris: Enable the dual_color_blend_by_location driconf option.
This fixes rendering in Unigine Valley 1.0 and Heaven 4.0.
2019-04-22 09:36:36 -07:00
Kenneth Graunke
faa52e328e iris: Add mechanism for iris-specific driconf options
Based on Nicolai's 0f8c5de869.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-04-22 09:35:36 -07:00
Icenowy Zheng
3e91c7d544 lima: add Android build
Currently only meson build supported is added for lima driver.

Add Android build support for lima.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Acked-by: Qiang Yu <yuq825@gmail.com>
2019-04-21 01:05:19 +00:00
Andre Heider
8b13aac966 st/nine: skip position checks in SetCursorPosition()
For HW cursors, "cursor.pos" doesn't hold the current position of the
pointer, just the position of the last call to SetCursorPosition().

Skip the check against stale values and bump the d3dadapter9 drm version
to expose this change of behaviour.

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2019-04-20 13:06:29 +02:00
Alyssa Rosenzweig
648cda258b panfrost/mdg: Use shared fsign lowering
Fixes failures in shaders.operator.common_functions.sign.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-19 23:15:57 +00:00
Alyssa Rosenzweig
31d9caa239 panfrost: Fixup vertex offsets to prevent shadow copy
Mali attribute buffers have to be 64-byte aligned. However, Gallium
enforces no such requirement; for unaligned buffers, we were previously
forced to create a shadow copy (slow!). To prevent this, we instead use
the offseted buffer's address with the lower bits masked off, and then
add those masked off bits to the src_offset. Proof of correctness
included, possibly for the opportunity to say "QED" unironically.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-19 22:50:20 +00:00
Alyssa Rosenzweig
e008d4f011 panfrost: Track BO lifetime with jobs and reference counts
This (fairly large) patch continues work surrounding the panfrost_job
abstraction to improve job lifetime management. In particular, we add
infrastructure to track which BOs are used by a particular job
(currently limited to the vertex buffer BOs), to reference count these
BOs, and to automatically manage the BOs memory based on the reference
count. This set of changes serves as a code cleanup, as a way of future
proofing for allowing flushing BOs, and immediately as a bugfix to
workaround the missing reference counting for vertex buffer BOs.
Meanwhile, there are a few cleanups to vertex buffer handling code
itself, so in the short-term, this allows us to remove the costly VBO
staging workaround, since this patch addresses the underlying causes.

v2: Use pipe_reference for BO reference counting, rather than managing
it ourselves. Don't duplicate hash-table key removal. Fix vertex buffer
counting.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-19 22:50:20 +00:00
Kristian H. Kristensen
bcb81b4d48 gallium/auxiliary/vl: Fix a couple of warnings
Remove unused functions and mark unhandled default case with
unreachable.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-04-19 16:17:37 +00:00
Kristian H. Kristensen
3ecfe20648 tgsi: Mark tgsi_strings_check() unused
It's there to hold the static asserts, don't warning about it being
unused.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-04-19 16:17:37 +00:00
Erico Nunes
2288b59ddc lima: enable nir fsign lowering in ppir
The mali utgard pp doesn't support a sign instruction.
Use the nir lowering function for fsign to implement fsign in ppir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-04-19 15:42:23 +00:00