fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 00:08:09 +02:00

Author	SHA1	Message	Date
Marek Olšák	8a71f60194	ac: replace glc,slc with cache_policy for loads cosmetic change Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:38:56 -04:00
Marek Olšák	a29e781961	ac: replace glc,slc with cache_policy for stores cosmetic change Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:38:54 -04:00
Nicolai Hähnle	cb07f91489	amd/common: move ac_shader_{binary,reloc} into r600 and rename They are no longer used by radeonsi or radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Nicolai Hähnle	510e74ff48	amd/common: removed unused ac_shader_binary functions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Nicolai Hähnle	b398230e6d	amd/common: remove unused ac_compile_module_to_binary Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Marek Olšák	969e5176c2	ac: rework ac_build_waitcnt for gfx10 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	3203a74dcb	radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	76898a8062	amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	4bdf44724f	radeonsi/gfx10: set DLC for loads when GLC is set This fixes L1 shader array cache coherency. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	1666ee183e	radeonsi/gfx10: implement hardware MSAA resolve MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile optimization that we use on gfx9 and earlier does not work. Be very explicit about how the swizzle mode of the temporary surface is selected. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	016a465d7d	radeonsi/gfx10: implement gfx10_shader_ngg For pipelines without API GS. We will later expand this to cover NGG geometry shaders as well. Note that the vtx offset passed into the GS part is just the vertex index multiplied by VGT_ESGS_RING_ITEMSIZE. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	84e7ee421f	ac/surface/gfx10: allow "rotated" micro mode Standard mode does not support DCC. The R is retconned to "render target" on gfx10. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	a66be784c3	ac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	6d416ac7e1	amd/common/gfx10: print gfx10 registers in debug dumps Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	70fd27d1e3	amd/common/gfx10: CMASK is only used for FMASK All regular color compression is done via DCC. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	b52bf8f12a	amd/common/gfx10: support new tbuffer encoding Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	c067aaa580	amd/common/gfx10: pad shader buffers for instruction prefetch Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	227c29a80d	amd/common/gfx10: implement scan & reduce operations Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	7ba80c1d19	amd/common/gfx10: add GS_ALLOC_REQ message define Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4c364c89e2	amd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	74a26af913	amd/common/gfx10: add register JSON A small number of fields now need new disambiguation. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	536782b0b7	amd/common: add GFX10 chips Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Marek Olšák	78cdf9a99f	amd/addrlib: add gfx10 support Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Samuel Pitoiset	83297baf2d	ac: compute the DCC fast clear size per slice on GFX8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:44 +02:00
Samuel Pitoiset	6517d226ac	ac: compute the size of one DCC slice on GFX8 Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:41 +02:00
Emil Velikov	4ec32413f3	ac: change ac_query_gpu_info() signature Currently libdrm_amdgpu provides a typedef of the various handles. While the goal was to make those opaque, it effectively became part of the API. To the best of my knowledge there are two ways to have opaque handles: - "typedef void foo;" - rather messy IMHO - "struct foo;" and use "struct foo " through the API In our case amdgpu_device_handle is used only internally, plus respective code is not used or applicable for r300 and r600. Hence we copied the typedef. Seemingly this will be a problem since libdrm_amdgpu wants to change the API, while not updating the code(?). Either way, we can safely s/amdgpu_device_handle/void */ and carry on. Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak at amd.com>	2019-06-28 17:49:32 +01:00
Samuel Pitoiset	34bef8a0d7	radv: clear CMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear CMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:28 +02:00
Samuel Pitoiset	476b907a3b	radv: clear FMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear FMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:25 +02:00
Marek Olšák	ac4b1e2f0a	radeonsi: set the calling convention for inlined function calls otherwise the behavior is undefined Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-24 21:04:10 -04:00
Nicolai Hähnle	bd3a3fd25a	amd/rtld: update the ELF representation of LDS symbols The initial prototype used a processor-specific symbol type, but feedback suggests that an approach using processor-specific section name that encodes the alignment analogous to SHN_COMMON symbols is preferred. This patch keeps both variants around for now to reduce problems with LLVM compatibility as we switch branches around. This also cleans up the error reporting in this function. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	0032f6b8a0	ac/surface: remove addrlib_family_rev_id Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Daniel Schürmann	0daeb1d127	amd/common: lower bitfield_extract to ubfe/ibfe. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	48a75e7af0	amd/common: lower bitfield_insert to bfm & bitfield_select Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Nicolai Hähnle	21dd881416	ac/rtld: report better error messages for LDS overallocation Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Marek Olšák	b64bd5887e	ac/rtld: check correct LDS max size Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	1ee0f0d315	radeonsi: add s_sethalt to shaders for debugging Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	87182200c7	ac/rtld: fix sorting of LDS symbols by alignment Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Connor Abbott	53a7649e5d	ac/nir: Set speculatable for buffer loads where allowed This brings the nir path in line with the TGSI path. Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 2792 -> 2652 (-5.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 247380 -> 248072 (0.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 121 -> 132 (9.09 %) Wait states: 0 -> 0 (0.00 %) Most of the change came from DiRT: Showdown, and came from sinking SSBO loads. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	3bf8981c51	ac,radeonsi: Always mark buffer stores as inaccessiblememonly inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis. Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything. statistics with nir enabled: Totals from affected shaders: SGPRS: 152 -> 152 (0.00 %) VGPRS: 128 -> 132 (3.12 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 9324 -> 9244 (-0.86 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 17 -> 17 (0.00 %) Wait states: 0 -> 0 (0.00 %) The only difference was a manhattan31 shader. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-19 14:08:27 +02:00
Samuel Pitoiset	4c7ef1b02e	ac: make ac_compute_cmask() a static function Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 11:30:47 +02:00
Samuel Pitoiset	b5012a0518	ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+ LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 08:58:33 +02:00
Marek Olšák	abe9a51d27	ac: add radeon_info::is_amdgpu instead of checking drm_major == 3 and clean up Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-14 13:31:18 -04:00
Daniel Schürmann	deedc0b31d	amd/common: add support for AMD_shader_ballot functions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Nicolai Hähnle	f8315ae04b	amd/rtld: layout and relocate LDS symbols Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change. Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	1ff2440eee	amd/common: use ARRAY_SIZE for the LLVM command line options This is more convenient for changing it around during debug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	3c958d924a	amd/common: add ac_compile_module_to_elf A new variant of ac_compile_module_to_binary that allows us to keep the entire ELF around. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	77b05cc42d	radeonsi: use ac_shader_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	b3be346c68	amd/common: add a more powerful runtime linker Using an explicit linker instead of just concatenating .text sections will allow us to start using .rodata sections and explicit descriptions of data on LDS that is shared between stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	c129cb3861	amd/common: clarify ac_shader_binary::lds_size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:21 -04:00
Nicolai Hähnle	2e96c01073	amd/common: extract ac_parse_shader_binary_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:08 -04:00

1250 commits