fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 00:10:10 +01:00

Author	SHA1	Message	Date
Timothy Arceri	b09a3196e0	ac: add load_tes_inputs() to the abi V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Samuel Pitoiset	a4d2782664	amd/common: scan if gl_PrimitiveID is used before translating to LLVM It makes more sense to move all scan stuff in the same place. Also, we don't really need to duplicate the uses_primid field for each stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 18:43:09 +01:00
Bas Nieuwenhuizen	c99426ea83	ac/nir: Handle loading data from compact arrays. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-04 00:14:23 +01:00
Samuel Pitoiset	3260a96c17	amd/common: rework set_userdata_location() and rename to set_loc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:17 +01:00
Samuel Pitoiset	4221a816e2	amd/common: rename set_userdata_location_shader() to set_loc_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:15 +01:00
Samuel Pitoiset	5081fd398e	amd/common: replace set_userdata_location_indirect() by set_loc_desc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:13 +01:00
Samuel Pitoiset	f8202ef683	amd/common: rename radv_define_vs_user_sgprs_phase2() ... to set_vs_specific_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:11 +01:00
Samuel Pitoiset	9d5a1787ee	amd/common: rename radv_define_common_user_sgprs_phase2() ... to set_global_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:08 +01:00
Samuel Pitoiset	9a2393a510	amd/common: rename add_user_sgpr_array_argument() to add_array_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:06 +01:00
Samuel Pitoiset	b6217bdbee	amd/common: replace add_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:04 +01:00
Samuel Pitoiset	32bbc9eb0f	amd/common: replace add_user_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:02 +01:00
Samuel Pitoiset	e946b5360d	amd/common: replace add_vgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:59 +01:00
Samuel Pitoiset	f1242a8976	amd/common: add new add_arg() helper for SGPRs/VGPRs arguments The idea is to clean up the add arguments logic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:57 +01:00
Samuel Pitoiset	bedfa06eaf	amd/common: rename radv_define_common_user_sgprs_phase1() ... to declare_global_input_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:55 +01:00
Samuel Pitoiset	0f58f67abe	amd/common: rename radv_define_vs_user_sgprs_phase1() ... to declare_vs_specific_inputs_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:53 +01:00
Samuel Pitoiset	5c91c1614c	amd/common: do not try to declare input VS SGPRs for GS It's a no-op anyway but it looked strange to me, remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:51 +01:00
Samuel Pitoiset	fc35a071b6	amd/common: add declare_vs_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:49 +01:00
Samuel Pitoiset	3015668cad	amd/common: add declare_tes_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:47 +01:00
Samuel Pitoiset	62942aa8c6	amd/common: remove unnecessary num_user_sgprs_used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:46 +01:00
Samuel Pitoiset	6edf1fcdf5	amd/common: remove unnecessary user_sgpr_count Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:44 +01:00
Samuel Pitoiset	38f9b87af2	amd/common: add ac_export_mrt_z() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:49 +01:00
Samuel Pitoiset	03ef264146	amd/common: pass the family to ac_llvm_context_init() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:44 +01:00
Samuel Pitoiset	4237c3d645	radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components F1 2017 looks good now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:26:25 +01:00
Dave Airlie	dd517ad96d	ac/nir: fix lds store for patch outputs. This wasn't calculating the correct value, this along with a nir patch fixes a regression in: dEQP-VK.tessellation.shader_input_output.barrier Fixes: `043d14db30` (ac/nir: don't write tcs outputs to LDS that aren't read back.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:24 +10:00
Samuel Pitoiset	225b198802	amd/common: add ac_build_waitcnt() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:44 +01:00
Samuel Pitoiset	24601810e9	amd/common: more use of i32_1 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:42 +01:00
Samuel Pitoiset	ec4e566560	amd/common: more use of i32_0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:41 +01:00
Samuel Pitoiset	88522e2bcd	radv: export SampleMask from pixel shaders at full rate Use 16_ABGR instead of 32_ABGR if Z isn't written. Ported from RadeonSI. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:28 +01:00
Samuel Pitoiset	90c3bf0789	radv: do not load the local invocation index when it's unused Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:26 +01:00
Samuel Pitoiset	2e58ef46a8	radv: replace grid_components_used by uses_grid_size Use a boolean instead because the number of needed SGPRs is always 3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:42 +01:00
Samuel Pitoiset	97e57740d8	radv: always emit all compute block components The number of grid components is always 3 when gl_NumWorkGroups is declared, because it relies on the number of components of nir_intrinsic_load_num_work_groups. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:39 +01:00
Timothy Arceri	a5f9ac2928	ac: fix nir_op_f2f64 Without this we get the error "FPExt only operates on FP" when converting the following: `vec1 32 ssa_5 = b2f ssa_4`; `vec1 64 ssa_6 = f2f64 ssa_5`. Which results in: `%44 = and i32 %43, 1065353216`; `%45 = fpext i32 %44 to double`. With this patch we now get: `%44 = and i32 %43, 1065353216`; `%45 = bitcast i32 %44 to float`; `%46 = fpext float %45 to double`. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-13 13:20:28 +11:00
Bas Nieuwenhuizen	3342a432fa	ac/nir: Support vulkan_resource_reindex. Fixes: `93b4cb61eb` "spirv: Allow OpPtrAccessChain for block indices" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen	368f49b284	ac/nir: Don't load the descriptor in vulkan_resource_index. To support the reindex intrinsic, we need the result to be something on which we can adjust the index/address. Since it is all within a basic block, the compiler should be able to merge any extra loads. v2: Change visit_get_buffer_size too. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Samuel Pitoiset	5f81a43535	radv: use a faster version for nir_op_pack_half_2x16 This patch is ported from RadeonSI and it has two effects. It fixes a rendering issue which affects F1 2017 and Dawn of War 3 (Vega only) because LLVM ended up generating the new v_mad_mix_{hi,lo} instructions which appear to be buggy in some way. Not sure if Mesa is generating something wrong or if the issue is in LLVM only. Anyway, that explains why the DOW3 issue can't be reproduced with GL on Vega. It also improves performance because v_cvt_pkrtz_f16 is faster, and because I guess the rounding mode behaviour is similar between GL and VK, we can use it. About performance, it improves Talos by +3/4% but I don't see any other impacts. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-07 17:21:50 +01:00
Timothy Arceri	ccd1810bba	ac: add si_nir_load_input_gs() to the abi V2: make use of driver_location and don't expose NIR to the ABI. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	caf15ce670	ac: move build_varying_gather_values() to ac_llvm_build.h and expose Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	6fd6cb6616	ac: add basic nir -> llvm type helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Dave Airlie	043d14db30	ac/nir: don't write tcs outputs to LDS that aren't read back. If the TCS doesn't read back the outputs, no need to store them to LDS in the first place. (except for tess factors). This seems to give about 50fps (3290->3330) with tessellation demo. I haven't tested if it impacts DoW3 at all. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 13:50:24 +10:00
Timothy Arceri	b73ce64fb8	ac: add gs_{prim,invocation}_id to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 10:54:03 +11:00
Timothy Arceri	8fe6abd964	ac: add emit_vertex to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Dave Airlie	6bec8bcd79	ac/nir: add support for all intrinsics. (v2) This is derived from tgsi/radeonsi code from the GLSL intrinsics. This should pre-fix radv for the upcoming spirv patches. v2: actually use wait_cnt, sleep deprived dad time! (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-09 01:25:59 +00:00
Dave Airlie	0084f4a422	ac/nir: for ubo load use correct num_components I was hacking something stupid in doom, and hit an assert for the bitcast following this, it definitely looks like this should be the number of 32-bit components, not the instr level ones. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-07 14:54:19 +10:00
Timothy Arceri	6e2eb96b64	ac: remove the remaining duplicate llvm types Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	e73a467005	ac: remove unused v4f32 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	7f4966731f	ac: add v2f32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	cd6cfd1095	ac: use the ac f16 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	8f651ae062	ac: use the ac f32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	368654a299	ac: use the ac f64 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	d927db0672	ac: use the common v8i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00

352 commits