fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 22:38:06 +02:00

Author	SHA1	Message	Date
Rhys Perry	e192e268de	aco: explicitly mark end blocks for exports For GS copy shaders, whether we want to do exports is conditional. By explicitly marking the end blocks, we can mark an IF's then branch as an export block and ensure that's where the assembler inserts null exports. v6: only fixup exports in the end block, like before v8: simplify some code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	d46a54ecff	radv/aco: allow ACO for GS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	8bad100f83	aco: implement GS on GFX7-8 GS is the same on GFX6, but GFX6 isn't fully supported yet. v4: fix regclass v7: rebase after shader args MR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	40bb81c9dd	radv/aco,aco: implement GS on GFX9+ v2: implement GFX10 v3: rebase v7: rebase after shader args MR v8: fix gs_vtx_offset usage on GFX9/GFX10 v8: use unreachable() instead of printing intrinsic v8: rename output_state to ge_output_state v8: fix formatting around nir_foreach_variable() v8: rename some helpers in the scheduler v8: rename p_memory_barrier_all to p_memory_barrier_common v8: fix assertion comparing ctx.stage against vertex_geometry_gs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	70f63c1988	aco: improve support for s_sendmsg In particular, the messages needed for GS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	0da7b3b18b	radv: move gs copy shader creation before other variants ACO lowers output derefs which breaks the shader_info pass used by gs copy shader creation. v3: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Timur Kristóf	23edcf6490	aco: Make a better guess at which instructions need the VCC hint. Previously, bool_to_vector_condition would always set the VCC hint on its result. This commit improves it by having the optimizer set the VCC hint only when the result really needs to be in the VCC. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>	2020-01-24 13:14:23 +00:00
Bas Nieuwenhuizen	0890482969	radv: Allow DCC & TC-compat HTILE with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT. I misunderstood the flag when initially disabling. But this flag only does something with mutable formats. If we have DCC and mutable formats, the formats are close enough that the allowed usage flags are not meaningfully different nor used during allocation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424>	2020-01-24 11:16:39 +00:00
Bas Nieuwenhuizen	1b447bd2e6	radv: Expose VK_KHR_swapchain_mutable_format. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2354 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425>	2020-01-24 10:47:07 +00:00
Samuel Pitoiset	a31bcf2be6	ac/llvm: fix missing casts in ac_build_readlane() Because ac_build_optimization_barrier() overwrites the original src_type, we have to keep track of it before emitting that barrier. Otherwise, wrong conversions are expected for pointers or small bitsizes. By doing this, we no longer need to do the cast dance in ac_build_readlane_no_opt_barrier(), it was just necessary for ac_build_optimization_barrier(). This fixes a bunch of crashes with subgroups related tests when RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash with The Surge 2. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395 Fixes: `0f45d4dc2b` ("ac: add ac_build_readlane without optimization barrier") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>	2020-01-24 07:40:07 +01:00
Samuel Pitoiset	8d5203dad2	aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6 V_TRUNC_F64 and V_FLOOR_F64 needs to be lowered on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:48 +01:00
Samuel Pitoiset	4d92601715	aco: implement 64-bit nir_op_ffloor on GFX6 GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:45 +01:00
Samuel Pitoiset	fbd169e421	aco: implement 64-bit nir_op_fround_even on GFX6 GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:42 +01:00
Samuel Pitoiset	87588801d3	aco: implement 64-bit nir_op_fceil on GFX6 GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:38 +01:00
Samuel Pitoiset	aad5176c58	aco: implement 64-bit nir_op_ftrunc on GFX6 GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:34 +01:00
Samuel Pitoiset	36e7a5f5b9	aco: implement nir_intrinsic_global_atomic_* on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:30 +01:00
Samuel Pitoiset	22d8822683	aco: implement nir_intrinsic_load_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:27 +01:00
Samuel Pitoiset	d6af7571c2	aco: implement nir_intrinsic_store_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:24 +01:00
Samuel Pitoiset	01f0bef71e	aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample Only GFX6 was affected, my mistake. The total number of SGPR operands should be 4 when we want to create a vec4. Fixes: `dbdf3b3ef9` ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:21 +01:00
Samuel Pitoiset	54e54ec3e8	aco: fix printing assembly with CLRXdisasm on GFX6 We thought that CLRXdisasm allowed gfx600 as well as gfx700 but it actually doesn't. Use the family for GFX6 chips instead. Fixes: `0099f85232` ("aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531>	2020-01-23 11:34:37 +00:00
Samuel Pitoiset	12fe19ba3b	radv: advertise VK_AMD_shader_fragment_mask Only for GFX8+ because it's untested on older generations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	e030aef32c	aco: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	9e477d79b7	ac/nir: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	e60de08547	radv: handle missing implicit subpass dependencies When a subpass doesn't declare an explicit dependency from/to VK_SUBPASS_EXTERNAL, Vulkan says there is an implicit dependency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>	2020-01-23 11:25:41 +01:00
Samuel Pitoiset	0d2da2a8c0	radv: add explicit external subpass dependencies to meta operations No functional changes because a subpass dependency with dstStageMask set to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is a no-op. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>	2020-01-23 11:25:38 +01:00
Rhys Perry	15a1cc00d3	aco: fix off-by-one error when initializing sgpr_live_in Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2394 Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>	2020-01-22 17:23:30 +00:00
Samuel Pitoiset	bd51538d28	radv: fix double free corruption in radv_alloc_memory() If the driver fails to allocate memory for some reasons, it shouldn't free the 'mem' object twice. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2302 Fixes: `825ddfee59` ("radv: Handle device memory alloc failure with normal free.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508>	2020-01-22 17:01:16 +00:00
Rhys Perry	3f96a1ed86	aco: fix operand kill flags when a temporary is used more than once Helps create v_mac_f32 from v_mad_f32(b, a, b) Totals from affected shaders: SGPRS: 35824 -> 35824 (0.00 %) VGPRS: 33460 -> 33456 (-0.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2187264 -> 2180976 (-0.29 %) bytes LDS: 127 -> 127 (0.00 %) blocks Max Waves: 3802 -> 3802 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486>	2020-01-22 15:55:00 +00:00
Timur Kristóf	1c9ecb2123	aco: Fix signedness compare warning. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:17 +01:00
Timur Kristóf	533a20dbd5	aco: Fix maybe-uninitialized warnings. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:14 +01:00
Timur Kristóf	6fb3df2786	aco: Fix -Wstringop-overflow warnings in aco_span. GCC does not understand how aco_span works. This patch fixes it by casting the aco_span's this pointer to uintptr_t rather than to a char pointer, effectively telling GCC not to try to figure it out. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:10 +01:00
Bas Nieuwenhuizen	bd4380c63c	radv: Remove syncobj_handle variable in header. I strongly suspect it was supposed to be a typedef. However, used nowhere, we should remove it. Fixes: `eaa56eab6d` "radv: initial support for shared semaphores (v2)" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2385 Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479>	2020-01-21 12:28:00 +00:00
Marek Olšák	4e4b2d13f0	ac: add helper ac_build_triangle_strip_indices_to_triangle Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	0f45d4dc2b	ac: add ac_build_readlane without optimization barrier Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	77393cf39b	ac: add prefix bitcount functions Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Samuel Pitoiset	dbdf3b3ef9	aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6 GFX6 doesn't have FLAT instructions which means we have to emit a 64-bit MUBUF load. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	9e2fde84fc	aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7 According to the different ISA docs (and to LLVM), this bit seems to only exists on GFX6-GFX7. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	fe9157a700	aco: do not use the vec3 variant for loads on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	1b5bb204d9	aco: do not use the vec3 variant for stores on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	b8abfafe86	aco: fix constant folding of SMRD instructions on GFX6 SMRD instructions have an 8-bit dword offset on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Rhys Perry	29bfe18abd	aco: fix fall-through test in try_remove_simple_block() with back-edges `3bca0af2` enhanced empty block determination which exposed this bug and created an infinite loop in a Guild Wars 2 shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `3bca0af25d` ('aco: ignore parallelcopies to the same register on jump threading') Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2364 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>	2020-01-20 11:51:45 +00:00
Rhys Perry	e151398de6	aco: fix stack buffer overflow in apply_sgprs() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `cef7879719` ('aco: rewrite apply_sgprs()') Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>	2020-01-20 11:13:11 +00:00
Samuel Pitoiset	0099f85232	aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system LLVM only supports GFX8+. Using CLRXdisasm works most of the time, so it's useful to add support for it. Original patch by Daniel Schürmann. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>	2020-01-17 17:41:32 +00:00
Samuel Pitoiset	b9b393f0ce	aco: fix emitting slc for MUBUF instructions on GFX6-GFX7 Same as GFX10, only GFX8/GFX9 moved that bit near the opcode. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>	2020-01-17 16:56:04 +01:00
Daniel Schürmann	3bca0af25d	aco: ignore parallelcopies to the same register on jump threading The more conservative lowering to CSSA inserts unnecessary parallelcopies which might get coalesced and can be ignored on jump threading. v2: outline is_empty_block() check. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Daniel Schürmann	427e5eeb02	aco: handle phi affinities transitively through parallelcopies This can coalesce most unnecessarily inserted parallelcopies from lowering to CSSA. v2: refactor loop a bit to make it more efficient and readable. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Daniel Schürmann	d098024c40	aco: rework lower_to_cssa() This patch changes lower_to_cssa to be much more conservative about assumptions which phi operands might interfere. Previously, this pass wasn't exhaustive and could miss some corner cases. v2: remove optimizations to find better insertion points as it's hard to guarantee that they are always correct and have overall no benefit. Fixes: `0b8216b2cd` ('aco: Lower to CSSA') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Samuel Pitoiset	300f8dec76	aco: implement stream output with vec3 on GFX6 GFX6 doesn't support vec3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Samuel Pitoiset	a445cb35bd	aco: do not combine additions of DS instructions on GFX6 The offset field doesn't work as expected on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Samuel Pitoiset	923005bf54	aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6 Only GFX7 and later support large ds_read/ds_write. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00


4814 commits