Commit graph

37977 commits

Author SHA1 Message Date
Eric Anholt
c2e68bebb4 freedreno: Output the same shader-db format as v3d and intel.
This lets us reuse their report.py, at the expense of fd-report.py no
longer working.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:20 -07:00
Eric Anholt
6d9b45171d freedreno: Remove the ir3_tgsi_to_nir() helper function.
It was more of a hindrance, as it pretended that we could compile in the
driver with a missing screen.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:18 -07:00
Eric Anholt
a0d4d7febf freedreno: Fix assertion failures in context setup in shader-db mode.
The TTN path needs access to the screen to make the right decisions about
lowering, but we didn't have pctx->screen set up at fdN_prog_init time.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:06 -07:00
Marek Olšák
894e017c9c r600+radeonsi: use ctx_query_reset_status on radeon
This allows a nice cleanup, because the winsys always handles it.
2019-05-16 13:15:36 -04:00
Marek Olšák
4549c36788 winsys/radeon: implement ctx_query_reset_status by copying radeonsi
To make it behave like amdgpu. I'm just trying to move this out of
radeonsi. The radeonsi code will be removed in the next commit.
2019-05-16 13:15:36 -04:00
Marek Olšák
6b3343e5d8 winsys/amdgpu: report a CS rejection as a reset only if there's no GPU reset 2019-05-16 13:15:36 -04:00
Marek Olšák
78e35df52a radeonsi: update buffer descriptors in all contexts after buffer invalidation
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
2019-05-16 13:15:36 -04:00
Marek Olšák
0f1b070bad radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets
This is a prerequisite for the next commit.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
2019-05-16 13:14:55 -04:00
Marek Olšák
f3ae455eb0 radeonsi: compute culling - flush CS to remove write references to buffers
Only read-only buffers can use compute culling.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
04122532e3 radeonsi: invalidate caches at the beginning of the prim discard compute IB
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
9f505ce21d radeonsi: disable primitive restart for triangles for DiRT Rally
It may decrease performance and it prevents compute-based primitive culling.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
0252fb92b8 radeonsi: add primitive culling stats to the HUD
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
c9b7a37b8f radeonsi: cull primitives with async compute for large draw calls
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:34 -04:00
Marek Olšák
187f1c999f winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
4eb377d1c3 radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1
The prim discard compute shader bakes InstanceID into the output index buffer.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
b206f007de radeonsi: make some functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
301344008f radeonsi: allow si_shader_select_with_key to return an optimized shader or fail
If a prim discard compute shader hasn't finished compilation, we don't want
to any shader.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
ca9edd7cd0 radeonsi: use pipe_draw_info::instance_count indirectly
It will be modified by compute shader culling.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
d380fabdbb radeonsi: use pipe_draw_info::prim and primitive_restart indirectly
so that the fields can be changed by the driver.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
43aa2f4f7c radeonsi: make functions for creating LLVM functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
b19884e08e winsys/amdgpu: add a parallel compute IB coupled with a gfx IB
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:07:00 -04:00
Marek Olšák
07c83d25fd radeonsi: add a cs parameter into si_cp_copy_data
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:57 -04:00
Marek Olšák
ce264d19a0 radeonsi: add a cs parameter into si_cp_release_mem
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:56 -04:00
Marek Olšák
9624855f13 radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:54 -04:00
Marek Olšák
6e38af0631 radeonsi: move si_*_descriptors_idx functions into si_state.h
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:53 -04:00
Marek Olšák
49a016ec5d radeonsi: make si_initialize_compute reusable
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:51 -04:00
Marek Olšák
c44c6951d4 radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:49 -04:00
Marek Olšák
c7ceeea093 radeonsi: return the last part's return value from @wrapper
The primitive discard compute shader will get the position output this way.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:40 -04:00
Marek Olšák
d569b7cb31 winsys/amdgpu: always set NO_CPU_ACCESS and NO_SUBALLOC on GDS resources
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:18 -04:00
Jan Zielinski
d65b160e6a swr: clean up supported OGL4.0/4.1 extensions list
This commit adjusts the capabilities returned
by the SWR driver and the documentation to correctly
report the following extensions:

GL_ARB_texture_query_lod, GL_ARB_texture_cube_map_array,
GL_ARB_gpu_shader_fp64, GL_ARB_texture_gather,
GL_ARB_vertex_attrib_64bit.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-05-16 17:41:14 +02:00
Leo Liu
aa040d3b3c vl/dri3: set back buffer from output to NULL with front buffer case
Since the using output optimization is only for back buffer case

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-16 10:28:38 -04:00
Roland Scheidegger
4171a26193 auxiliary/draw: fix crash with zero-stride draw auto
transform feedback draws get the number of vertices from the transform
feedback object. In draw, we'll figure this out with the number of bytes
written divided by the stride. However, it is apparently possible we end
up with a stride of 0 there (not entirely sure it could happen with GL).
Probably when nothing was actually ever written (so we don't actually
have a stride set). Just avoid the division by zero by setting the count
to 0.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-05-16 14:01:33 +02:00
Kenneth Graunke
752367b766 iris: Dodge more GLSL IR lowering
This avoids some lower_instructions bits in st.
2019-05-15 19:44:21 -07:00
Alyssa Rosenzweig
74ab80b92d panfrost/midgard: Add load/store opcodes
This commit adds a bunch of new load/store opcodes, largely related to
OpenCL, as well as adjusting the name of existing opcodes to be more
uniform. The immediate effect is compute shaders are substantially
easier to interpret now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:25:25 +00:00
Alyssa Rosenzweig
f73c0b73ec panfrost/midgard: Enable integer constant inlining
Midgard ALU features two types of constants: embedded constants (128-bit
chunk, zero/one per schedule bundle) and inline constants (16-bit
splattered into the op, second source if present). Inline constants are
much more efficient from a space and scheduling freedom standpoint, so
it's desirable to inline when possible. Now that integer ops are well
understood and in use, we enable inlining of integers constants in
addition to floats (which have been inlined since forever).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:20:41 +00:00
Alyssa Rosenzweig
8214aaa3c8 panfrost/midgard: Remove imov workaround
The previous commit fixes the issue this patched around.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:20:41 +00:00
Alyssa Rosenzweig
0a13babdd8 panfrost/midgard: Set int outmod for ops writing integers
By default, the "normal" output modifier is set on ALU ops. This is the
correct default for float outputs -- for floats, it preserves the semantic
value. Unfortunately, when used with integers, it does not preserve the
bitstream encoding, causing misbehaviour. (It's an open question what
happens when `normal` is used with integers -- does it apply some other
transformation? or does it do floating point normalization/etc on the
ints as if they were floats?).

Instead, we default to the "clamp to integer" output modifier for
ops writing integers. Semantically, this makes sense (clamping an
integer to the nearest integer is the identity function). In the
hardware with an integer opcode, this is the actual "normal".

This fixes numerous sporadic and sometimes bizarre bugs relating to
integers, especially integer moves. With this in place, we no longer
care about the types involved; it's just bits on the wire again.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:20:30 +00:00
Alyssa Rosenzweig
81b1053d9b panfrost: Set custom stride for textures when necessary
From Gallium (and our) perspective, the stride of a BO is arbitrary. For
internal buffers, we can make it something nice, but for imported linear
buffers (e.g. EGL clients), we don't always have that luxury. To cope,
we calculate the expected stride of a texture, compare it to the BO's
actual reported stride, and if they differ, set the latter as a custom
stride.

Fixes rendering of windows not on tile boundaries (noticeable in Weston
with es2gears_wayland, for instance). Also, this should fix stride
issues with bufer reloading.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:16:36 +00:00
Alyssa Rosenzweig
cea9352059 panfrost/decode: Stride decoding
With a special flag, texture descriptors can include custom stride(s).
We haven't seen a case of this used for mipmaps/cubemaps, so it's not
clear how that will be encoded, but this dumps correctly for single
one-level 2D textures.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:15:37 +00:00
Alyssa Rosenzweig
d699ffbf0e panfrost/decode: Futureproof texture dumping
One field was not dumped for some reason. It's observed to be 0, but
it's still good to have it available.

Also, extra fields might be snuck in the bitmaps array (it's
variable-lengthed at the end), and we want to guard against that
possibility, so we dump a little more.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 01:15:37 +00:00
Marek Olšák
ccfcb9d818 ac: rename SI-CIK-VI to GFX6-GFX7-GFX8
Acked-by: Dave Airlie <airlied@redhat.com>

We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.

It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
2019-05-15 20:54:10 -04:00
Kenneth Graunke
d305409db5 st/dri: Minor style fixes
Trivial.
2019-05-15 14:49:14 -07:00
Chia-I Wu
659c5800e5 virgl: handle DONT_BLOCK and MAP_DIRECTLY
Handle PIPE_TRANSFER_DONT_BLOCK and PIPE_TRANSFER_MAP_DIRECTLY.
Make virgl_resource_transfer_prepare return an enum instead of a
bool for extensibility (e.g., instruct the callers to map
differently).

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-05-15 20:51:28 +00:00
Chia-I Wu
e87186fc67 virgl: add virgl_resource_transfer_prepare
virgl_resource_transfer_prepare should be called before mapping to
prepare the resource.  It does flush, readback, and wait as needed.
virgl_res_needs_flush and virgl_res_needs_readback become internal
helpers to the new function.

There should be no externally visible change.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-05-15 20:51:28 +00:00
Chia-I Wu
cdcf38b98a virgl: honor DISCARD_WHOLE_RESOURCE in virgl_res_needs_readback
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-05-15 20:51:28 +00:00
Chia-I Wu
a62ab178ce virgl: clean up virgl_res_needs_readback
Add comments and follow the coding style of virgl_res_needs_flush.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-05-15 20:51:28 +00:00
Alyssa Rosenzweig
a9cef4f0e5 gallium: Add default check for PIPE_CAP_FRAGMENT_SHADER_INTERLOCK
Fixes: c704c0226 ("gallium: Add a PIPE_CAP_FRAGMENT_SHADER_INTERLOCK")

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-14 21:34:49 -07:00
Andrii Kryvytskyi
eca53f00aa iris: Check if resource has stencil before returning it
Signed-off-by: Andrii Kryvytskyi <andrii.o.kryvytskyi@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-14 21:16:11 -07:00
Kenneth Graunke
bb5db02bab iris: Enable fragment shader interlock on Gen9+.
There's some debate about whether we should support this on older
hardware as well.  Currently i965 turns it off on Gen8- though, so
we follow suit.  If this changes, we can update this as well.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-14 19:34:33 -07:00
Kenneth Graunke
c704c0226c gallium: Add a PIPE_CAP_FRAGMENT_SHADER_INTERLOCK.
Corresponding to GL_ARB_fragment_shader_interlock and
GL_NV_fragment_shader_interlock.  Currently, only the NIR paths
support this functionality, but someone could conceivably add it
to TGSI too.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-14 19:34:29 -07:00