Commit graph

40679 commits

Author SHA1 Message Date
Rob Clark
a3dc975ee7 freedreno/ir3: also track # of nops for shader-db
The instruction count is (mostly) a measure of what optimization passes
can do, while # of nops is more an indication of how effectively the
scheduler is balancing register pressure vs instruction count.  So track
these independently.

(There could be opportunities to rematerialize values to reduce register
pressure, swapping some nop's with other alu instructions, so nothing is
truely independent.. but it is still useful to break these stats out.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
f3980a8ef7 freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION
Set flag based on actual output reg type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
f0f9ec6882 freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION
We should really be setting this based on the actual output register
type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Marek Olšák
3ef50b023e radeonsi/nir: fix compute shader crash due to nir_binary == NULL
This partially reverts 8b30114dda.

Fixes: 8b30114dda "radeonsi/nir: call nir_serialize only once per shader"
2019-11-08 16:47:59 -05:00
Marek Olšák
8b30114dda radeonsi/nir: call nir_serialize only once per shader
We were calling it twice.

First serialize it, then use it to compute the cache key.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-08 15:30:28 -05:00
Eric Anholt
b1f38aed84 u_format: Fix swizzle of A1R5G5B5.
Found once I started using the generated unpack code from the Mesa side.

Fixes: 4bbaac3782 ("gallium: Add some more channel orderings of packed formats.")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-08 11:56:02 -08:00
David Stevens
0466239aae virgl: support emulating planar image sampling
Mesa emulates planar format sampling with per-plane samplers. Virgl now
supports this by allowing the plane index to be passed when creating a
sampler view from a planar image. With this change, mesa now passes that
information to virgl.

Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Lepton Wu <lepton@chromium.org>
2019-11-08 17:06:56 +00:00
Krzysztof Raszkowski
084431ce45 gallium/swr: Enable some ARB_gpu_shader5 extensions
Enable / add to features.txt:
- Enhanced textureGather.
- Geometry shader instancing.
- Geometry shader multiple streams.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 16:04:47 +00:00
Krzysztof Raszkowski
e5ed9a1b91 gallium/swr: Fix GS invocation issues
- Fixed proper setting gl_InvocationID.
- Fixed GS vertices output memory overflow.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 14:52:16 +00:00
Boris Brezillon
ee82f9f07e panfrost: Try to evict unused BOs from the cache
The panfrost BO cache can only grow since all newly allocated BOs are
returned to the cache (unless they've been exported).

With the MADVISE ioctl that's not a big issue because the kernel can
come and reclaim this memory, but MADVISE will only be available on 5.4
kernels. This means an app can currently allocate a lot memory without
ever releasing it, leading to some situations where the OOM-killer kicks
in and kills the app (or even worse, kills another process consuming
more memory than the GL app) to get some of this memory back.

Let's try to limit the amount of BOs we keep in the cache by evicting
entries that have not been used for more than one second (if the app
stopped allocating BOs of this size, it's likely to not allocate
similar BOs in a near future).

This solution is based on the VC4/V3D implementation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Boris Brezillon
25059cc41f panfrost: Move BO cache related fields to a sub-struct
We will soon introduce an LRU list to evict BOs that have been unused
for more than 1 second. Let's first move all BO cache fields to a
sub-struct to clarify which fields are used by the BO caching logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Kristian H. Kristensen
3699a74a43 freedreno/a6xx: Turn on tessellation shaders
Wow. Very triangle. So shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
53782571ae freedreno/a6xx: Only use merged regs and four quads for VS+FS
When other geometry stages are present, we chose two quads and no
merged regs.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
07aedc367c freedreno/blitter: Save tessellation state
We have tessellation state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
d2d0c8186d freedreno/a6xx: Only set emit.hs/ds when we're drawing patches
At least the gallium blitter helper will call us to draw with
tessellation shaders set but a non-patch primitive.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
e584790885 freedreno: Use bypass rendering for tessellation
It seems like tiling could work in the Adreno architecture, but we've
only ever seen bypass rendering with tessellation.  For now, let's do
that too.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
47e2c19511 freedreno/a6xx: Program state for tessellation stages
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
03a30e7c3d freedreno/a6xx: Emit constant parameters for tessellation stages
Assemble the information the stages need and emit the constants.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
5dd51d2da7 freedreno/a6xx: Allocate and program tessellation buffer
Tessellation needs a couple of buffers that should hold the entire
output from a full VS+TCS draw call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
f0ef3e9697 freedreno/a6xx: Build the right draw command for tessellation
We need to select the right primitive type, set a bit to turn on
tessellation and or in the TES output primitive type.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
8621fbc37b freedreno/ir3: Add tessellation field to shader key
Whether we're tessellating and which primitives the TES outputs
affects the entire pipeline so let's add a field to the key to track
that.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:56 -08:00
Kristian H. Kristensen
d6209a50bb freedreno: Don't count primitives for patches
The gallium helper doesn't like patches and we can't determine how
many primitives it gets tessellated into anyway.  On gens where we
have tessellation, we get the prim count from a HW counter so just
skip counting on the CPU.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:47 -08:00
Kristian H. Kristensen
5d67da13a3 freedreno/ir3: Emit link map as byte or dwords offsets as needed
Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages
that load with ldg (TES) need dwords offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:42 -08:00
Kristian H. Kristensen
3d16ec4a71 freedreno/a6x: Rename z/s formats
What we call eRB6_Z24_UNORM_S8_UINT now is actually
RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually
RB6_Z24_UNORM_S8_UINT.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:36 -08:00
Kristian H. Kristensen
50124afe34 freedreno/a6xx: Fix layered texture type enum
2D array textures and 3D textures are different enum values after all.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:33 -08:00
Kristian H. Kristensen
0276d0766d freedreno: Add nogmem debug option to force bypass rendering
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:31 -08:00
Kristian H. Kristensen
7fed7c2a7d freedreno/a6xx: Clear sysmem with CP_BLIT
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:28 -08:00
Kristian H. Kristensen
b0b443dcab freedreno/a6xx: Fix primitive counters again
We use one mechanism for (REG_A6XX_RBBM_PRIMCTR_8_LO)
PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit
the geometry pipeline, whether or not xfb is on.  Then for
PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction
that writes out per-stream counts for generated and emitted, but only
when xfb is enabled.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:22 -08:00
Erico Nunes
9817bff4da lima: fix bo submit memory leak
Fix memory leak on allocation for lima submit, reported by valgrind.

128 bytes in 1 blocks are definitely lost in loss record 38 of 84
   at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
   by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
   by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
   by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
   by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
   by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
   by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
   by 0x55D0A57: st_draw_vbo (st_draw.c:268)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)
   by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
   by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
   by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
   by 0x411947: MainLoop::step() (main-loop.cpp:108)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Erico Nunes
d939f5d463 lima: fix nir shader memory leak
Fix memory leak on allocation for nir shader, reported by valgrind.

3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
   at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x5750817: ralloc_size (ralloc.c:119)
   by 0x5750977: rzalloc_size (ralloc.c:151)
   by 0x575C173: nir_shader_create (nir.c:45)
   by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
   by 0x55D5003: st_create_fp_variant (st_program.c:1242)
   by 0x55D789F: st_get_fp_variant (st_program.c:1522)
   by 0x55D789F: st_get_fp_variant (st_program.c:1507)
   by 0x56400C3: st_update_fp (st_atom_shader.c:163)
   by 0x563D333: st_validate_state (st_atom.c:261)
   by 0x55D07CB: prepare_draw (st_draw.c:132)
   by 0x55D08DF: st_draw_vbo (st_draw.c:184)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Prodea Alexandru-Liviu
1a05811936 Meson: Remove lib prefix from graw and osmesa when building with Mingw.
Also remove version sufix from osmesa swrast on Windows.

v2: Make sure we don't remove lib prefix on *nix platforms.

Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Cc: "19.3" <mesa-stable@lists.freedesktop.org>
2019-11-07 22:04:50 +00:00
Eric Anholt
b28eb044cd gallium: Add equivalents of packed MESA_FORMAT_*UINT formats.
These are the last formats that MESA_FORMAT had and PIPE_FORMAT
didn't.  The .csv entries channel sizes and swizzles all came from the
corresponding UNORM format.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
6fab4a7b59 gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8.
This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT
didn't.  Note that it's an array format on gallium's side as well,
since it's a NPOT pixel size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
4bbaac3782 gallium: Add some more channel orderings of packed formats.
This covers everything that MESA_FORMAT had for packed unorm.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
6196259d95 gallium: Add defines for FXT1 texture compression.
This texture compression is exposed by 830 and 915, and to make
MESA_FORMAT match PIPE_FORMAT defines I need a corresponding
PIPE_FORMAT.

v2: Set is_hand_written so we don't try to generate pack/unpack code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Dylan Baker
0013af540d osmesa/tests: Extend render test to cover other working cases
Only the GL_UNSIGNED_BYTE cases actually work, the rest all fail, but we
should test the working cases to ensure that they continue to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-07 06:11:19 -08:00
Dylan Baker
7bfb56a135 gallium/osmesa: Convert osmesa test to gtest
This uses a bunch of additional C++ features for niceness and safety.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-07 06:11:19 -08:00
Tomeu Vizoso
072207bc18 panfrost: Pipe the GPU ID into compiler and disassembler
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-07 08:48:45 +00:00
Boris Brezillon
b60ed3c7b2 panfrost: Release the ctx->pipe_framebuffer ref
ctx->pipe_framebuffer contains the last bound FB state, let's release
resources pointed by this FB state when the context is destroyed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-07 08:33:08 +01:00
Boris Brezillon
8c8e4fd5c6 panfrost: Destroy the upload manager allocated in panfrost_create_context()
pipe->stream_uploader has been allocated with u_upload_create_default()
in panfrost_create_context(), let's destroy it in the context destroy
path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-07 08:33:08 +01:00
Lepton Wu
5a40e153fd gallium: dri2: Use index as plane number.
This fix wrong color when playing video under Android + virgl
configuration.

Fixes: 2decad495f ("gallium/dri2: Support images with multiple planes for modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-11-06 21:58:28 +00:00
Tomeu Vizoso
6469c1a445 panfrost: Generate polygon list manually for SFBD
On clears without draws, the SFBD GPUs need for userspace to generate
the trivial polygon list.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:19:31 +01:00
Tomeu Vizoso
8e1ae5fa14 panfrost: Decode blend shaders for SFBD
Also set MALI_HAS_BLEND_SHADER as needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
afeda06062 panfrost: Take into account texture layers in SFBD
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
9447a84f69 panfrost: Rework format encoding on SFBD
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
e40d11ccb2 panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720
Testing shows that it's needed.

Also remove ctx->is_t6xx as it was the last use of it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:17:13 +01:00
Tomeu Vizoso
23fe7cd2d6 panfrost: Add checksum fields to SFBD descriptor
During tests on T720, these fields were discovered.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:17:13 +01:00
Erik Faye-Lund
bc80900b6c zink: do advertize integer support in shaders
This is supported, so let's correct this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-06 13:43:14 +01:00
Erik Faye-Lund
8920689a58 zink/spirv: implement ball_fequal[2-4] 2019-11-06 13:43:14 +01:00
Erik Faye-Lund
ea2d9b3d38 zink/spirv: implement ball_iequal[2-4] 2019-11-06 13:43:14 +01:00