Commit graph

394 commits

Author SHA1 Message Date
Timur Kristóf
4fc1da208e aco: Remove esgs_itemsize from LDS alignment calculation.
It was problematic to have it, because some shader stages might
not even know about the esgs_itemsize, for example TCS and
the merged VS+TCS stages.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
ca342701c5 aco: Extract LDS alignment calculation to a separate function.
This function is going to be reused in multiple functions when
storing or loading something in the LDS.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
fe80f22470 aco: Remove vertex_geometry_gs assertion from merged shaders.
We are going to support more kinds of merged shaders, such
as vertex_tess_control_hs and tess_eval_geometry_gs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
f53d31fb9b aco: Use mesa shader stage when loading inputs.
This makes it more clear which stages should load these inputs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
89ff5b1e51 aco: Implement load_view_index for TCS and TES.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
aa5eed673c aco: Implement memory_barrier_tcs_patch.
TCS outputs are going to be written to LDS, so it
has to use memory_barrier_shared in order to ensure
that it waits for LDS writes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
a8d15ab6da aco: Implement control_barrier for tessellation control shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
2489e4dfd1 aco: Implement load_invocation_id for tessellation control shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
5107b0312a aco: Implement load_patch_vertices_in.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
6edf6ad130 aco: Implement load_primitive_id for tessellation shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Timur Kristóf
754837f3b5 aco: Implement load_tess_coord.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Samuel Pitoiset
7618fe1b48 aco: fix image load/store with lod and 1D images
Make sure to add the lod value if non-null as the 2nd operand.

Fixes dEQP-VK.image.load_store_lod.with_format.1d.* on all gens
except GFX9.

Fixes: 4d49a7ac73 ("aco: handle nir_intrinsic_image_deref_{load,store} with lod")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
2020-03-05 14:29:27 +01:00
Rhys Perry
8291d728dc aco: improve GFX9 1D ddx/ddy assertion
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2547
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890>
2020-02-20 15:41:26 +00:00
Samuel Pitoiset
4b978cd950 aco: do not use ds_{read,write}2 on GFX6
According to LLVM, these instructions have a bounds checking bug.
LLVM only uses them on GFX7+.

This fixes broken geometry in Assassins Creed Origins.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2489
Fixes: 4a553212fa ("radv: enable ACO support for GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
2020-02-07 14:17:06 +01:00
Samuel Pitoiset
0d14f41625 aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6
When some unused channels are skipped and that we expand vec3 loads
to vec4 loads, we have to adjust the fourth component.

While we are at it, add an assertion to make sure we don't use
MUBUF for vec3 loads on GFX6.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2450
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2442
Fixes: 6aecc316 ("aco: fix VS input loads with MUBUF on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
2020-01-31 13:48:56 +01:00
Daniel Schürmann
6f718edced aco: simplify gathering of MIMG address components
This patch has a slight effect on pipelinedb:
Totals from affected shaders:
SGPRS: 23616 -> 21504 (-8.94 %)
VGPRS: 15088 -> 14444 (-4.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 662660 -> 664600 (0.29 %) bytes
LDS: 49 -> 49 (0.00 %) blocks
Max Waves: 3079 -> 3204 (4.06 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
901f06e9ad aco: simplify adjust_sample_index_using_fmask() & get_image_coords()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
71440ba0f5 aco: reorder VMEM operands in ACO IR
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Rhys Perry
5ea23ba659 aco: set exec_potentially_empty after continues/breaks in nested IFs
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
d282a292ec aco: don't always add logical edges from continue_break blocks to headers
Otherwise, code like this will be broken:
loop {
   if (...) {
      break;
   } else {
      break;
   }
}
The continue_or_break block doesn't have any logical predecessors but it's
a logical predecessor of the header block. This liveness error breaks the
spiller in init_live_in_vars() (under "keep variables spilled on all
incoming paths") and eventually creates garbage reloads.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Samuel Pitoiset
6aecc316c0 aco: fix VS input loads with MUBUF on GFX6
Only MTBUF supports vec3.

Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
2020-01-29 13:58:37 +00:00
Samuel Pitoiset
3922d95b51 aco: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Rhys Perry
03a0d39366 aco: use MUBUF in some situations instead of splitting vertex fetches
Fixes most of the regressions from splitting vertex fetches in an earlier
commit.

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 562696 -> 558344 (-0.77 %)
VGPRS: 395596 -> 393752 (-0.47 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11600912 -> 11311804 (-2.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 101839 -> 102372 (0.52 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:44:52 +00:00
Rhys Perry
d39f5519a1 aco: handle unaligned vertex fetch on GFX10
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 795000 -> 802368 (0.93 %)
VGPRS: 579632 -> 581280 (0.28 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17208408 -> 17583652 (2.18 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 145731 -> 145279 (-0.31 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:10 +00:00
Rhys Perry
d9e357e35b aco: skip unused channels at the start when fetching vertices
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 161320 -> 161224 (-0.06 %)
VGPRS: 153968 -> 149408 (-2.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4331496 -> 4331308 (-0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27814 -> 28594 (2.80 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 161504 -> 161408 (-0.06 %)
VGPRS: 153836 -> 149440 (-2.86 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4327572 -> 4327604 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27837 -> 28618 (2.81 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:01 +00:00
Rhys Perry
525b107347 aco: rework vertex fetching a bit
This will make it easier to skip unused channels at the start and to split
unaligned loads on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:57 +00:00
Rhys Perry
92970adb4b aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Samuel Pitoiset
d4b4f40595 aco: copy the literal offset of SMEM instructions to a temporary
GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
b9cc50fbce aco: fix a hardware bug for MRTZ exports on GFX6
GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
918f00eef8 aco: combine MRTZ (depth, stencil, sample mask) exports
Instead of emitting up to 3 for each different components (depth,
stencil and sample mask). This is needed to fix a hw bug on GFX6.

Totals from affected shaders:
SGPRS: 34728 -> 35056 (0.94 %)
VGPRS: 26440 -> 26476 (0.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1346088 -> 1344180 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3922 -> 3915 (-0.18 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
2020-01-24 16:42:15 +00:00
Rhys Perry
f8f7712666 aco: implement GS copy shaders
v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
de4ce66f5c aco: remove needs_instance_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
e192e268de aco: explicitly mark end blocks for exports
For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.

v6: only fixup exports in the end block, like before
v8: simplify some code

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
8bad100f83 aco: implement GS on GFX7-8
GS is the same on GFX6, but GFX6 isn't fully supported yet.

v4: fix regclass
v7: rebase after shader args MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
40bb81c9dd radv/aco,aco: implement GS on GFX9+
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Timur Kristóf
23edcf6490 aco: Make a better guess at which instructions need the VCC hint.
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
2020-01-24 13:14:23 +00:00
Samuel Pitoiset
8d5203dad2 aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6
V_TRUNC_F64 and V_FLOOR_F64 needs to be lowered on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:48 +01:00
Samuel Pitoiset
4d92601715 aco: implement 64-bit nir_op_ffloor on GFX6
GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:45 +01:00
Samuel Pitoiset
fbd169e421 aco: implement 64-bit nir_op_fround_even on GFX6
GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:42 +01:00
Samuel Pitoiset
87588801d3 aco: implement 64-bit nir_op_fceil on GFX6
GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:38 +01:00
Samuel Pitoiset
aad5176c58 aco: implement 64-bit nir_op_ftrunc on GFX6
GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:34 +01:00
Samuel Pitoiset
36e7a5f5b9 aco: implement nir_intrinsic_global_atomic_* on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:30 +01:00
Samuel Pitoiset
22d8822683 aco: implement nir_intrinsic_load_global on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:27 +01:00
Samuel Pitoiset
d6af7571c2 aco: implement nir_intrinsic_store_global on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:24 +01:00
Samuel Pitoiset
01f0bef71e aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample
Only GFX6 was affected, my mistake. The total number of SGPR operands
should be 4 when we want to create a vec4.

Fixes: dbdf3b3ef9 ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:21 +01:00
Samuel Pitoiset
e030aef32c aco: add support for nir_texop_fragment_{mask}_fetch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Timur Kristóf
533a20dbd5 aco: Fix maybe-uninitialized warnings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
2020-01-22 11:09:14 +01:00
Samuel Pitoiset
dbdf3b3ef9 aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6
GFX6 doesn't have FLAT instructions which means we have to emit
a 64-bit MUBUF load.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
fe9157a700 aco: do not use the vec3 variant for loads on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
1b5bb204d9 aco: do not use the vec3 variant for stores on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00