fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
Timur Kristóf	5107b0312a	aco: Implement load_patch_vertices_in. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>	2020-03-11 08:34:10 +00:00
Timur Kristóf	6edf6ad130	aco: Implement load_primitive_id for tessellation shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>	2020-03-11 08:34:10 +00:00
Timur Kristóf	754837f3b5	aco: Implement load_tess_coord. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>	2020-03-11 08:34:10 +00:00
Samuel Pitoiset	7618fe1b48	aco: fix image load/store with lod and 1D images Make sure to add the lod value if non-null as the 2nd operand. Fixes dEQP-VK.image.load_store_lod.with_format.1d.* on all gens except GFX9. Fixes: `4d49a7ac73` ("aco: handle nir_intrinsic_image_deref_{load,store} with lod") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>	2020-03-05 14:29:27 +01:00
Rhys Perry	8291d728dc	aco: improve GFX9 1D ddx/ddy assertion Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2547 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890>	2020-02-20 15:41:26 +00:00
Samuel Pitoiset	4b978cd950	aco: do not use ds_{read,write}2 on GFX6 According to LLVM, these instructions have a bounds checking bug. LLVM only uses them on GFX7+. This fixes broken geometry in Assassin's Creed Origins. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2489 Fixes: `4a553212fa` ("radv: enable ACO support for GFX6") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>	2020-02-07 14:17:06 +01:00
Samuel Pitoiset	0d14f41625	aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6 When some unused channels are skipped and we expand vec3 loads to vec4 loads, we have to adjust the fourth component. While we are at it, add an assertion to make sure we don't use MUBUF for vec3 loads on GFX6. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2450 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2442 Fixes: `6aecc316` ("aco: fix VS input loads with MUBUF on GFX6") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>	2020-01-31 13:48:56 +01:00
Daniel Schürmann	6f718edced	aco: simplify gathering of MIMG address components This patch has a slight effect on pipelinedb: Totals from affected shaders: SGPRS: 23616 -> 21504 (-8.94 %) VGPRS: 15088 -> 14444 (-4.27 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 662660 -> 664600 (0.29 %) bytes LDS: 49 -> 49 (0.00 %) blocks Max Waves: 3079 -> 3204 (4.06 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	901f06e9ad	aco: simplify adjust_sample_index_using_fmask() & get_image_coords() Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	71440ba0f5	aco: reorder VMEM operands in ACO IR For all VMEM instructions, the resource constant is now in operands[0]. For MIMG instructions, the sampler shares operands[1] with write data in case this instruction writes memory. Moving the VADDR to be the last operand for MIMG is the first step to support Navi NSA encoding. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Rhys Perry	5ea23ba659	aco: set exec_potentially_empty after continues/breaks in nested IFs Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	d282a292ec	aco: don't always add logical edges from continue_break blocks to headers Otherwise, code like this will be broken: loop { if (...) { break; } else { break; } } The continue_or_break block doesn't have any logical predecessors but it's a logical predecessor of the header block. This liveness error breaks the spiller in init_live_in_vars() (under "keep variables spilled on all incoming paths") and eventually creates garbage reloads. Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Samuel Pitoiset	6aecc316c0	aco: fix VS input loads with MUBUF on GFX6 Only MTBUF supports vec3. Fixes: `03a0d39366` ("aco: use MUBUF in some situations instead of splitting vertex fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>	2020-01-29 13:58:37 +00:00
Samuel Pitoiset	3922d95b51	aco: implement VK_AMD_shader_explicit_vertex_parameter Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Rhys Perry	03a0d39366	aco: use MUBUF in some situations instead of splitting vertex fetches Fixes most of the regressions from splitting vertex fetches in an earlier commit. pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 562696 -> 558344 (-0.77 %) VGPRS: 395596 -> 393752 (-0.47 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 11600912 -> 11311804 (-2.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 101839 -> 102372 (0.52 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:44:52 +00:00
Rhys Perry	d39f5519a1	aco: handle unaligned vertex fetch on GFX10 pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 795000 -> 802368 (0.93 %) VGPRS: 579632 -> 581280 (0.28 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 17208408 -> 17583652 (2.18 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 145731 -> 145279 (-0.31 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:40:10 +00:00
Rhys Perry	d9e357e35b	aco: skip unused channels at the start when fetching vertices pipeline-db (Vega): Totals from affected shaders: SGPRS: 161320 -> 161224 (-0.06 %) VGPRS: 153968 -> 149408 (-2.96 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 4331496 -> 4331308 (-0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 27814 -> 28594 (2.80 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 161504 -> 161408 (-0.06 %) VGPRS: 153836 -> 149440 (-2.86 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 4327572 -> 4327604 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 27837 -> 28618 (2.81 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:40:01 +00:00
Rhys Perry	525b107347	aco: rework vertex fetching a bit This will make it easier to skip unused channels at the start and to split unaligned loads on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:39:57 +00:00
Rhys Perry	92970adb4b	aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Samuel Pitoiset	d4b4f40595	aco: copy the literal offset of SMEM instructions to a temporary GFX6 only supports up to 8-bit for the literal offset, so make sure it's copied to a temporary SGPR before emitting a SMEM instruction. The optimizer will propagate the literal offset if possible anyways. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Samuel Pitoiset	b9cc50fbce	aco: fix a hardware bug for MRTZ exports on GFX6 GFX6 (except OLAND and HAINAN) has a bug where it only looks at the X writemask component. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Samuel Pitoiset	918f00eef8	aco: combine MRTZ (depth, stencil, sample mask) exports Instead of emitting up to 3 exports, one for each component (depth, stencil and sample mask). This is needed to fix a hw bug on GFX6. Totals from affected shaders: SGPRS: 34728 -> 35056 (0.94 %) VGPRS: 26440 -> 26476 (0.14 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1346088 -> 1344180 (-0.14 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3922 -> 3915 (-0.18 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>	2020-01-24 16:42:15 +00:00
Rhys Perry	f8f7712666	aco: implement GS copy shaders v5: rebase on float_controls changes v7: rebase after shader args MR and load/store vectorizer MR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	de4ce66f5c	aco: remove needs_instance_id Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	e192e268de	aco: explicitly mark end blocks for exports For GS copy shaders, whether we want to do exports is conditional. By explicitly marking the end blocks, we can mark an IF's then branch as an export block and ensure that's where the assembler inserts null exports. v6: only fixup exports in the end block, like before v8: simplify some code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	8bad100f83	aco: implement GS on GFX7-8 GS is the same on GFX6, but GFX6 isn't fully supported yet. v4: fix regclass v7: rebase after shader args MR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	40bb81c9dd	radv/aco,aco: implement GS on GFX9+ v2: implement GFX10 v3: rebase v7: rebase after shader args MR v8: fix gs_vtx_offset usage on GFX9/GFX10 v8: use unreachable() instead of printing intrinsic v8: rename output_state to ge_output_state v8: fix formatting around nir_foreach_variable() v8: rename some helpers in the scheduler v8: rename p_memory_barrier_all to p_memory_barrier_common v8: fix assertion comparing ctx.stage against vertex_geometry_gs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Timur Kristóf	23edcf6490	aco: Make a better guess at which instructions need the VCC hint. Previously, bool_to_vector_condition would always set the VCC hint on its result. This commit improves it by having the optimizer set the VCC hint only when the result really needs to be in the VCC. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>	2020-01-24 13:14:23 +00:00
Samuel Pitoiset	8d5203dad2	aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6 V_TRUNC_F64 and V_FLOOR_F64 need to be lowered on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:48 +01:00
Samuel Pitoiset	4d92601715	aco: implement 64-bit nir_op_ffloor on GFX6 GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:45 +01:00
Samuel Pitoiset	fbd169e421	aco: implement 64-bit nir_op_fround_even on GFX6 GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:42 +01:00
Samuel Pitoiset	87588801d3	aco: implement 64-bit nir_op_fceil on GFX6 GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:38 +01:00
Samuel Pitoiset	aad5176c58	aco: implement 64-bit nir_op_ftrunc on GFX6 GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:34 +01:00
Samuel Pitoiset	36e7a5f5b9	aco: implement nir_intrinsic_global_atomic_* on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:30 +01:00
Samuel Pitoiset	22d8822683	aco: implement nir_intrinsic_load_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:27 +01:00
Samuel Pitoiset	d6af7571c2	aco: implement nir_intrinsic_store_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:24 +01:00
Samuel Pitoiset	01f0bef71e	aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample Only GFX6 was affected, my mistake. The total number of SGPR operands should be 4 when we want to create a vec4. Fixes: `dbdf3b3ef9` ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:21 +01:00
Samuel Pitoiset	e030aef32c	aco: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Timur Kristóf	533a20dbd5	aco: Fix maybe-uninitialized warnings. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:14 +01:00
Samuel Pitoiset	dbdf3b3ef9	aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6 GFX6 doesn't have FLAT instructions which means we have to emit a 64-bit MUBUF load. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	fe9157a700	aco: do not use the vec3 variant for loads on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	1b5bb204d9	aco: do not use the vec3 variant for stores on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	300f8dec76	aco: implement stream output with vec3 on GFX6 GFX6 doesn't support vec3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Samuel Pitoiset	923005bf54	aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6 Only GFX7 and later support large ds_read/ds_write. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Timur Kristóf	dfaa3c0af6	aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible. When possible, get rid of an s_not when all it does is invert the SCC, and its successor s_cbranch / s_cselect can be inverted instead. Also modify some parts of instruction_selection to take advantage of this feature. Example: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 s2: %3902 = s_cselect_b64 -1, 0, %3900:scc s2: %407, s1: %3903:scc = s_not_b64 %3902 s2: %3906, s1: %3905:scc = s_and_b64 %407, %0:exec p_cbranch_z %3905:scc Can now be optimized to: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 p_cbranch_nz %3900:scc Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	338d03090f	aco: Allow optimizing vote_all and nir_op_iand. By adding an extra instruction, we can replace the operands of the s_cselect_b64, which allows it to get picked up by the optimizer when it looks for uniform booleans. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Rhys Perry	f92a89a979	aco: improve readfirstlane after uniform LDS loads Totals from affected shaders: SGPRS: 976 -> 968 (-0.82 %) VGPRS: 580 -> 584 (0.69 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 106032 -> 103076 (-2.79 %) bytes Max Waves: 237 -> 237 (0.00 %) Instructions: 19452 -> 18740 (-3.66 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Daniel Schürmann	05c81875d7	aco: fix unconditional demote_to_helper This patch fixes an out-of-bounds access on p_exit_early and binds the exec register to the correct operand. Fixes: `2ea9e59e8d` ('aco: move s_andn2_b64 instructions out of the p_discard_if') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>	2020-01-13 21:08:41 +00:00
Jason Ekstrand	d3737002ee	nir/lower_atomics_to_ssbo: Also lower barriers This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	e40b11bbcb	nir: Rename nir_intrinsic_barrier to control_barrier This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00


186 commits