fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-15 22:58:05 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	9e5d2a73c5	ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens The hardware doesn't flush denorms, exactly like fmin/fmax, so we have to do it manually. This doesn't fix anything known. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:33 +01:00
Samuel Pitoiset	30ac733680	ac/llvm: fix 16-bit fmed3 on GFX8 and older gens 16-bit med3 is only supported on GFX9+. Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:30 +01:00
Samuel Pitoiset	50b8c25274	ac/llvm: fix 64-bit fmed3 Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic. Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:28 +01:00
Samuel Pitoiset	6f4c300919	ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	a31bcf2be6	ac/llvm: fix missing casts in ac_build_readlane() Because ac_build_optimization_barrier() overwrites the original src_type, we have to keep track of it before emitting that barrier. Otherwise, wrong conversions are expected for pointers or small bitsizes. By doing this, we no longer need to do the cast dance in ac_build_readlane_no_opt_barrier(), it was just necessary for ac_build_optimization_barrier(). This fixes a bunch of crashes with subgroups related tests when RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash with The Surge 2. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395 Fixes: `0f45d4dc2b` ("ac: add ac_build_readlane without optimization barrier") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>	2020-01-24 07:40:07 +01:00
Samuel Pitoiset	9e477d79b7	ac/nir: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Marek Olšák	4e4b2d13f0	ac: add helper ac_build_triangle_strip_indices_to_triangle Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	0f45d4dc2b	ac: add ac_build_readlane without optimization barrier Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	77393cf39b	ac: add prefix bitcount functions Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	eeb4a11c11	ac/cull: don't read Position.Z if it's not needed for culling It could be NULL. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:20 -05:00
Jason Ekstrand	d3737002ee	nir/lower_atomics_to_ssbo: Also lower barriers This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	e40b11bbcb	nir: Rename nir_intrinsic_barrier to control_barrier This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	60097cc840	nir: Add a new memory_barrier_tcs_patch intrinsic Right now, it's implemented as a no-op for everyone. For most drivers, it's a switch case in the NIR -> whatever which just breaks. For ir3, they already have code to delete tessellation barriers so we just add a case to also delete memory_barrier_tcs_patch. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Timur Kristóf	eccac46cdc	ac/llvm: Fix ac_build_reduce in wave32 mode. Previously, when cluster_size was set to 0, it always worked as if the cluster size was 64. This commit fixes it in wave32 mode by changing to work as if the cluster size was set to 32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-10 12:30:44 +01:00
Samuel Pitoiset	e77ff89914	amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Marek Olšák	9b71041627	ac: add ac_build_s_endpgm Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:03:48 -05:00
Marek Olšák	1c44480538	ac: add 128-bit bitcount Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:41 -05:00
Marek Olšák	d1c8aeb24f	ac: unify primitive export code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:38 -05:00
Marek Olšák	1c77a18cc2	ac: unify build_sendmsg_gs_alloc_req Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:36 -05:00
Marek Olšák	f90cbd18ff	ac: fix the return value in cull_bbox when bbox culling is disabled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	e5e3ffa6b9	ac: fix ac_get_i1_sgpr_mask for Wave32 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Samuel Pitoiset	a0f1a5fa05	ac/nir: fix out-of-bound access when loading constants from global Global load/store instructions can't know if the offset is out-of-bound because they don't use descriptors (no range). Fix this by clamping the offset for arrays that are indexed with a non-constant offset that's greater or equal to the array size. This fixes VM faults and GPU hangs with Dead Rising 4. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148 Fixes: `71a6794200` ("ac/nir: Enable nir_opt_large_constants") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-12 10:12:56 +00:00
Samuel Pitoiset	f63a3132e8	ac/llvm: fix atomic var operations if source isn't a deref Fixes some CTS regressions. Fixes: `e61a826f39` ("ac/llvm: fix pointer type for global atomics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-03 09:41:33 +01:00
Rhys Perry	a814f3d8a7	ac/llvm: improve sync scope for global atomics Stronger ordering is implemented in SPIRV->NIR with barriers. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:27 +00:00
Rhys Perry	e61a826f39	ac/llvm: fix pointer type for global atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:18 +00:00
Samuel Pitoiset	c105e6169c	radv,ac/nir: lower deref operations for shared memory This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered. pipeline-db (NAVI10/LLVM): SGPRS: 9043 -> 9051 (0.09 %) VGPRS: 7272 -> 7292 (0.28 %) Code Size: 638892 -> 621628 (-2.70 %) bytes LDS: 1333 -> 1331 (-0.15 %) blocks Max Waves: 1614 -> 1608 (-0.37 %) Found this while glancing at some F12019 shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-29 21:58:18 +01:00
Bas Nieuwenhuizen	e09426ad6b	amd/llvm: Refactor ac_build_scan. Split out the logic for exclusive scans into a separate function that makes clear what it does instead of having this opaque 60 line if. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-28 11:35:11 +01:00
Samuel Pitoiset	52aadbfd04	ac/llvm: convert src operands to pointers if necessary To avoid generating invalid LLVM IR when both operands don't have the same type. This might happen when performing pointer comparisons with SPIRV 1.4. Fixes invalid LLVM IR for: dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-28 08:26:51 +01:00
Marek Olšák	42318f9197	ac/nir: don't rely on data.patch for tess factors Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:10 -05:00
Samuel Pitoiset	0812dbd403	ac: add 8-bit and 16-bit supports to ac_build_permlane16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:42:42 +01:00
Samuel Pitoiset	c9aa843961	radv/gfx10: fix implementation of exclusive scans This implementation is loosely based on ROCm. https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl This fixes dEQP-VK.subgroups.arithmetic..subgroupexclusive on GFX10. Fixes: `227c29a80d` ("amd/common/gfx10: implement scan & reduce operations") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:39:26 +01:00
Samuel Pitoiset	f6770b9726	ac/llvm: fix warning in ac_build_canonicalize() ../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’: ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4567 \| return ac_build_intrinsic(ctx, intr, type, params, 1, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4568 \| AC_FUNC_ATTR_READNONE); \| ~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-26 08:35:10 +01:00
Marek Olšák	b02e0d2604	radeonsi/nir: don't run si_nir_opts again if there is no change 0.3% less overhead Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-25 16:48:27 -05:00
Marek Olšák	f671cc4d95	ac: set swizzled bit in cache policy as a hint not to merge loads/stores LLVM now merges loads and stores for all opcodes, so this must be set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 16:48:27 -05:00
Connor Abbott	3b143369a5	ac/nir, radv, radeonsi: Switch to using ac_shader_args Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-11-25 14:17:10 +01:00
Connor Abbott	9885af3bdf	ac: Add a shared interface between radv, radeonsi, LLVM and ACO ac_shader_args will be similar to ac_shader_abi, except for being free from LLVM-specific concepts and therefore capable of being shared between LLVM and ACO. This will help us accomplish a few different things: - Decouple setting up SGPR and VGPR arguments from translating to LLVM, so that we can reference these arguments in NIR lowering passes, which will let us lower e.g. descriptor sets in NIR. - Stop using radv-specific structures for things like determining the chip generation in ACO. In the end, we should replace ac_shader_abi with this structure + driver-specific lowering passes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:12:46 +01:00
Samuel Pitoiset	bfb307aea9	ac/llvm: fix the local invocation index for wave32 Fixes dEQP-VK.compute.builtin_var.local_invocation_index with RADV_PERFTEST=cswave32. My initial fix was to lower it but Rhys suggested the shift-right and it's much better like this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 07:25:48 +00:00
Daniel Schürmann	0cbcfc071e	amd/llvm: Add Subgroup Scan functions for SI The idea of this implementation is taken from the ROCm Device Libs: https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-20 20:31:45 +00:00
Marek Olšák	ebe7579655	nir: move data.image.access to data.access The size of the data structure doesn't change. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:05 -05:00
Samuel Pitoiset	80c71cbbd8	ac: add 16-bit float support to ac_build_alu_op() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	670aa24c69	ac: add 8-bit and 16-bit supports to ac_build_optimization_barrier() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	21a9243f5e	ac: add 8-bit and 16-bit supports to ac_build_wwm() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	ef352a2466	ac: add 8-bit and 16-bit supports to get_reduction_identity() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	c8af1d51d4	ac: add 8-bit and 16-bit supports to ac_build_swizzle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	1565118d8f	ac: add 8-bit and 16-bit supports to ac_build_dpp() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	2113867f0c	ac: add 8-bit and 16-bit supports to ac_build_set_inactive() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	c29514bd22	ac: add 8-bit and 16-bit supports to ac_build_readlane() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	58d5ab98a3	ac: add 8-bit and 16-bit supports to ac_build_shuffle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	204cf54b70	ac: remove useless cast in ac_build_set_inactive() The return type is always the src type (32 or 64 bits). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	519d9b30de	radv: remove useless RADV_DEBUG=unsafemath debug option This option is useless and shouldn't be used at all. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-15 09:07:34 +01:00

1 2

60 commits