Commit graph

3483 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
6a220e67ce radv: Switch to using rtld.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen
5ff651c0a7 radv: Move more stuff to variant create time.
Due to them depending on the linker result.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen
726a31df70 radv: Add the concept of radv shader binaries.
This simplifies a bunch of stuff by
(1) Keeping all the things in a single allocation, making things easier
 for the cache.
(2) creating a shader_variant creation helper.

This is immediately put to use by creating rtld shader binaries. This
is the main reason for the binaries, as we need to do the linking at
upload time, i.e. post caching. We do not enable rtld yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen
43f2f01cc8 radv: Add export_prim_id to the shader variant info.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen
15046ef7c8 radv: use last nir shader to determine stage in postprocessing
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen
7469516244 radv: Merge rsrc1/rsrc2 fields with the config fields.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Samuel Pitoiset
cce2645810 radv: do not crash when generating binning state for unknown chips
These values are only useful if binning is disabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-04 12:22:46 +02:00
Samuel Pitoiset
8a425e057d radv: fix potential crash in the compute resolve path
If the destination attachment is UNUSED.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-04 12:22:43 +02:00
Marek Olšák
969e5176c2 ac: rework ac_build_waitcnt for gfx10
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
3203a74dcb radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
76898a8062 amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
4bdf44724f radeonsi/gfx10: set DLC for loads when GLC is set
This fixes L1 shader array cache coherency.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
1666ee183e radeonsi/gfx10: implement hardware MSAA resolve
MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile
optimization that we use on gfx9 and earlier does not work.

Be very explicit about how the swizzle mode of the temporary surface is
selected.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
016a465d7d radeonsi/gfx10: implement gfx10_shader_ngg
For pipelines without API GS. We will later expand this to cover NGG
geometry shaders as well.

Note that the vtx offset passed into the GS part is just the
vertex index multiplied by VGT_ESGS_RING_ITEMSIZE.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
84e7ee421f ac/surface/gfx10: allow "rotated" micro mode
Standard mode does not support DCC.

The R is retconned to "render target" on gfx10.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
a66be784c3 ac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
97ddcfff7c amd/addrlib/gfx10: forbid DCC for swizzle modes which the hardware does not support
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
9eb4a79345 amd/addrlib/gfx10: fix assertion in Addr2IsValidDisplaySwizzleMode
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
6d416ac7e1 amd/common/gfx10: print gfx10 registers in debug dumps
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
70fd27d1e3 amd/common/gfx10: CMASK is only used for FMASK
All regular color compression is done via DCC.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
b52bf8f12a amd/common/gfx10: support new tbuffer encoding
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
c067aaa580 amd/common/gfx10: pad shader buffers for instruction prefetch
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
227c29a80d amd/common/gfx10: implement scan & reduce operations
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
7ba80c1d19 amd/common/gfx10: add GS_ALLOC_REQ message define
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
4c364c89e2 amd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
74a26af913 amd/common/gfx10: add register JSON
A small number of fields now need new disambiguation.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
536782b0b7 amd/common: add GFX10 chips
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
db7e7a6cb5 radv: gfx10 is not supported
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Marek Olšák
78cdf9a99f amd/addrlib: add gfx10 support
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Bas Nieuwenhuizen
c6cb9b197d radv: Support VK_EXT_queue_family_foreign.
Basically same as external for now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Only case we might need to handle differently in the near future
is Raven's case of displayable DCC which is not renderable. But
we don't support that yet.
2019-07-03 10:56:21 +00:00
Bas Nieuwenhuizen
8a053254b8 radv: Fix interactions between variable descriptor count and inline uniform blocks.
Fixes: d7e6541cc7 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-03 10:43:35 +00:00
Samuel Pitoiset
a7b6a869a7 radv: only allocate a 32-bit value for the TC-compat range metadata
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 08:52:01 +02:00
Samuel Pitoiset
6baa453dd5 radv: remove unused code in radv_update_tc_compat_zrange_metadata()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 08:51:58 +02:00
Samuel Pitoiset
a21f23c811 radv: add radv_get_depth_pipeline() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 08:51:42 +02:00
Samuel Pitoiset
6cc213b3c1 radv: enable DCC for layers on GFX8
It's currently only enabled if dcc_slice_size is equal to
dcc_slice_fast_clear_size because the driver assumes that
portions of multiple layers are contiguous but it's not
always true.

Still not supported on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:38:02 +02:00
Samuel Pitoiset
233224c7f7 radv: do not enable DCC for mipmapped arrays because performance is worse
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:38:00 +02:00
Samuel Pitoiset
e41e575e24 radv: implement clearing DCC layers on GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:56 +02:00
Samuel Pitoiset
e47c68b7b0 radv: merge radv_dcc_clear_level() into radv_clear_dcc()
This will help for clearing DCC arrays because we need to know
the subresource range.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:51 +02:00
Samuel Pitoiset
f772fe6a11 radv: add support for decompressing DCC layers with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:49 +02:00
Samuel Pitoiset
83297baf2d ac: compute the DCC fast clear size per slice on GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:44 +02:00
Samuel Pitoiset
6517d226ac ac: compute the size of one DCC slice on GFX8
Addrlib doesn't provide this info. Because DCC is linear, at least
on GFX8, it's easy to compute the size of one slice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:41 +02:00
Bas Nieuwenhuizen
d7e6541cc7 radv: Only allocate supplied number of descriptors when variable.
Fixes: b5e04e9217 "radv: Support allocating variable size descriptor sets."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-01 20:53:33 +02:00
Sagar Ghuge
456557a837 nir: Add lower_rotate flag and set to true in all drivers
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Samuel Pitoiset
d8b079e4c7 radv: rework how the number of VGPRs is computed
Just a cleanup, it shouldn't change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:59:27 +02:00
Samuel Pitoiset
e3baa54195 radv: gather if a vertex shaders needs the instance ID
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:59:24 +02:00
Samuel Pitoiset
17cb7ea6fc radv: fix decompressing DCC levels with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:59:22 +02:00
Samuel Pitoiset
f4d2c47cf6 radv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8
Just move around the switch case. GFX9+ is handled below.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:59:19 +02:00
Samuel Pitoiset
b4477fa4d4 radv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used
We only need to 2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:59:17 +02:00
Samuel Pitoiset
cc50c85e13 radv: make sure to mark the image as compressed when clearing DCC levels
Found while working on DCC for arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:58:56 +02:00
Emil Velikov
4ec32413f3 ac: change ac_query_gpu_info() signature
Currently libdrm_amdgpu provides a typedef of the various handles. While
the goal was to make those opaque, it effectively became part of the API

To the best of my knowledge there are two ways to have opaque handles:
 - "typedef void *foo;" - rather messy IMHO
 - "stuct foo;" and use "struct foo *" through the API

In our case amdgpu_device_handle is used only internally, plus
respective code is not used or applicable for r300 and r600. Hence we
copied the typedef.

Seemingly this will be a problem since libdrm_amdgpu wants to change the
API, while not updating the code(?).

Either way, we can safely s/amdgpU_device_handle/void */ and carry on.

Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak at amd.com>
2019-06-28 17:49:32 +01:00