Commit graph

437 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
fb7e4e16e7 radv/amdgpu: Add some debug flags.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 00:10:23 +01:00
Bas Nieuwenhuizen
682248db45 radv: Cache command buffers in command pool.
So that we don't keep allocating BOs for the IBs and upload buffers.

We run some risk of memory increase with e.g. a bimodal size
distribution of command buffers, but I haven't noticed a significant
increase with dota2 and talos.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 00:07:51 +01:00
Dave Airlie
b19caecbd6 radeon/ac: fix intrinsic version check
Reported-by: 375gnu@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 06:05:58 +10:00
Bas Nieuwenhuizen
a247215469 radv: Merge fast clear flushes.
Don't flush multiple times if we clear multiple attachments. Also allows
doing the depth clear in parallel with the fast color clears.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-05 20:40:31 +01:00
Emil Velikov
342e5fdb64 radv: use enum_to_str util functions.
Port of e9dcb17962
vulkan/util: Add generator for enum_to_str functions

Cc: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-04 15:05:14 +00:00
Marek Olšák
7f1446a8a1 ac: normalize build helper names
s/emit/build/

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
8bde7fb3fc ac: replace SI.vs.load.input with amdgcn.buffer.load.format
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
94811dc66c radeonsi: move SI.vs.load.input building into amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
97e21cfa25 ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0
ADD_TID doesn't work. Needs more investigation.

v2: remove leftover dead code

Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2017-03-03 15:29:30 +01:00
Marek Olšák
8cfdbba6c7 ac: remove offen parameter from ac_build_buffer_store_dword
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
27439dfdae radeonsi: merge and simplify tbuffer_store functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
d4324ddb89 radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
9c09592086 radeonsi: move kill intrinsic building into amd/common
just a cleanup

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
e729dc7c46 radeonsi: set readnone on reads from read-only memory 2017-03-03 15:29:30 +01:00
Marek Olšák
653ac0b389 radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz 2017-03-03 15:29:30 +01:00
Marek Olšák
4b2e5b9389 ac: replace old image intrinsics with new ones
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
ad18d7f040 radeonsi: move image intrinsic building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
2b3ebe307c ac: replace SI.export with amdgcn.exp.*
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
369f4a8726 radeonsi: move llvm.SI.export building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
9af03318aa ac: unify build_type_name_for_intr functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
b5744310d4 gallivm, ac: add writeonly and inaccessiblememonly attributes
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Tobias Klausmann
6d600cf632 amd/common: Fix build with new ac_add_function_attr()
Fix usage of ac_add_function_attr() and make it known!

common/ac_nir_to_llvm.c: In function 'create_llvm_function':
common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function
'ac_add_function_attr' [-Werror=implicit-function-declaration]
    ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL);
    ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-01 23:53:38 +01:00
Marek Olšák
940da36a65 gallivm,ac: add function attributes at call sites instead of declarations
They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic.
We need this to force readnone or inaccessiblememonly on some amdgcn
intrinsics.

This is only used with LLVM 4.0 and later. Intrinsics only used with
LLVM <= 3.9 don't need the LEGACY flag.

gallivm and ac code is in the same patch, because splitting would be
more complicated with all the LEGACY uses all over the place.

v2: don't change the prototype of lp_add_function_attr.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
2017-03-01 18:59:36 +01:00
Marek Olšák
408f370710 gallivm,ac: remove unused FUNC_ATTR_LAST enums
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-03-01 18:59:36 +01:00
Dave Airlie
e66be3d3bb radv: fix txs for sampler buffers
I messed this up when I wrote it, this fixes:
dEQP-VK.memory.pipeline_barrier.*uniform_texel_buffer.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-01 08:02:24 +10:00
Marek Olšák
8c838730d0 amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 21:44:30 +01:00
Bas Nieuwenhuizen
6e9fb1de7f radv: Don't allocate space for unused immutable samplers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:18 +01:00
Bas Nieuwenhuizen
137b06b437 radv/ac: Use constants for immutable samplers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:14 +01:00
Bas Nieuwenhuizen
500e6e40f6 radv: Detect if all immutable samplers for a binding are equal.
We can then use constants for indexed loads.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:10 +01:00
Bas Nieuwenhuizen
dd2a0c7aef radv: Store the immutable samplers as uint32_t[4].
So we don't need to know about radv_sampler in ac_nir_to_llvm.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:46:02 +01:00
Timothy Arceri
f0aaa4b3a4 radeon/ac: make ac_shader_binary_config_start() available externally
The read config functions are different for r600 and radeonsi so
we can't just share the one in amd common. So just share this
instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
affc8314cb radeon/ac: add llvm_ir_string to ac_shader_binary struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Dave Airlie
800b82ea13 radv: fix depth format in blit2d.
For blitting we need to use the depth or stencil format, never
the combined.

This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-28 06:11:54 +10:00
Dave Airlie
1121ce4525 radv/formats: add fast clear for 8-bit signed ints.
These formats are used by some CTS tests, may as well fill them in.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-28 06:11:50 +10:00
Bas Nieuwenhuizen
43d833ae97 radv: Use correct size for availability flag.
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen
8ea34a98c0 radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an unitialized value is
better than a hang.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen
bb878db7eb radv: Reset emitted compute pipeline when calling secondary cmd buffer.
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Dave Airlie
15f47027ad radv: add support for NV_dedicated_allocation
This adds initial support for NV_dedicated_allocation, then
uses it for the wsi image/memory allocation paths internally
in the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 00:22:51 +00:00
Andres Rodriguez
35189d3279 radv/winsys: fix freeing imported memory.
This bo->fd wasn't setting some stuff correctly that could
lead to crashes for anything using this path later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 00:22:39 +00:00
Dave Airlie
f695735ed6 vulkan/wsi/radv: add initial prime support (v1.1)
This is a complete rewrite of my previous rfc patches.

This adds the ability to present to a different GPU that rendering
using a driver side operation that can copy from the tiled to
linear shared image.

This does prime support completely in the swapchain present code,
and each queue has a precreated command buffer for each image
and for the each queue family. This means presenting should work
on graphics and compute queues and transfer in the future.

v1.1: initialise needs_linear_copy in swapchain.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 05:42:16 +10:00
Bas Nieuwenhuizen
336b05c49a radv/ac: Add integer->integer casts.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-26 19:59:27 +01:00
Marek Olšák
c7878b0167 ac: silence a warning
trivial
2017-02-25 00:16:38 +01:00
Emil Velikov
e3ad2d40db radv/entrypoints: Only generate entrypoints for supported features
This changes the way radv_entrypoints_gen.py works from generating a
table containing every single entrypoint in the XML to just the ones
that we actually need.  There's no reason for us to burn entrypoint
table space on a bunch of NV extensions we never plan to implement.

RADV implements VK_AMD_draw_indirect_count, so add that to the list.

Port of 114c281e70
"and/entrypoints: Only generate entrypoints for supported features"

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-02-24 17:36:25 +00:00
Dave Airlie
ccb70d6f53 radv: add sample mask output support
This adds support to write to sample mask from the fragment shader.

We can optimise this later like radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:53 +10:00
Dave Airlie
8282c5c771 radv/ac: refactor our fmask sample index fixup.
This refactors out the sample index fixup between
txf and image load.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:49 +10:00
Dave Airlie
5e9ead0fa2 radv: fetch sample index via fmask for image coord as well.
This follows the txf_ms code, I can't figure out why amdgpu-pro
doesn't do this in their shaders, they must know someone we don't.

This fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:44 +10:00
Dave Airlie
bdcbe7c76b radv: add sample mask input support
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:35 +10:00
Dave Airlie
58c97a0791 radv: enable location at sample when persample is forced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:30 +10:00
Dave Airlie
fc430c391b radv: fix interpolation at wrong place for offset interp
The code was interpolating at the offset from the sample,
not the offset from the center. Also fix for persample interpolation
modes we should force the pixel center to be at the sample.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:19 +10:00
Dave Airlie
b71e6538a8 radv/ac: handle gs->copy shader clip distances.
This fixes up the clip distance passing between the geometry
shader and the copy shader. It packs the clip and cull distances
into one or two consecutive slots, and avoids wasting space and
make sure the gs output and copy shader input agree on where
things are stored.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-23 15:31:41 +10:00