Commit graph

845 commits

Author SHA1 Message Date
Daniel Schürmann
6f718edced aco: simplify gathering of MIMG address components
This patch has a slight effect on pipelinedb:
Totals from affected shaders:
SGPRS: 23616 -> 21504 (-8.94 %)
VGPRS: 15088 -> 14444 (-4.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 662660 -> 664600 (0.29 %) bytes
LDS: 49 -> 49 (0.00 %) blocks
Max Waves: 3079 -> 3204 (4.06 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
901f06e9ad aco: simplify adjust_sample_index_using_fmask() & get_image_coords()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
99d032f3cd aco: fix register allocation with multiple live-range splits
This patch fixes register allocation if multiple live-range splits
occur to the same variable within one instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
71440ba0f5 aco: reorder VMEM operands in ACO IR
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Rhys Perry
db19e96c8c aco: fix exec mask consistency issues
There seems to be more, these are just the ones found in
Detroit: Become Human shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
c7d0514168 aco: parallelcopy exec mask before s_wqm
It can be used later and we want any uses to not be fixed to exec, so it's
definition can't be fixed to exec because of how exec masks interact with
register demand calculation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
517fc3abc4 aco: fill reg_demand with sensible information in add_coupling_code()
process_block() will use this to determine the register demand of the
before the current instruction. Previously, it was filled with zeroes
which could result in process_block() only using the register demand
of after the current instruction.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
26d2511bcb aco: improve assertion at the end of spiller
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
5ea23ba659 aco: set exec_potentially_empty after continues/breaks in nested IFs
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4e83e05e62 aco: error when block has no logical preds but VGPRs are live at the start
This would have caught the liveness error fixed in the previous commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
d282a292ec aco: don't always add logical edges from continue_break blocks to headers
Otherwise, code like this will be broken:
loop {
   if (...) {
      break;
   } else {
      break;
   }
}
The continue_or_break block doesn't have any logical predecessors but it's
a logical predecessor of the header block. This liveness error breaks the
spiller in init_live_in_vars() (under "keep variables spilled on all
incoming paths") and eventually creates garbage reloads.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
dba71de5c6 aco: only create parallelcopy to restore exec at loop exit if needed
The operand isn't fixed to exec, which can mess up the spiller. This also
adds a new situation where a phi is needed.

Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion
when compiling a Detroit: Become Human shader.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4537b97410 aco: don't update demand in add_coupling_code() for loop headers
We don't need to update it since it won't be used later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
521525fc0a aco: don't consider loop header blocks branch blocks in add_coupling_code
Loops without continues create header blocks with only 1 predecessor.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
590c26beab aco: fix target calculation when vgpr spilling introduces sgpr spilling
A shader might require vgpr spilling but not require sgpr spilling. In
that case, the spiller lowers the sgpr target by 5 which could mean sgpr
spilling is then required. Then the vgpr target has to be lowered to make
space for the linear vgprs. Previously, space wasn't make for the linear
vgprs.

Found while testing the spiller on the pipeline-db with a lowered limit

Fixes: a7ff1bb5b9
   ('aco: simplify calculation of target register pressure when spilling')

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Samuel Pitoiset
6aecc316c0 aco: fix VS input loads with MUBUF on GFX6
Only MTBUF supports vec3.

Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
2020-01-29 13:58:37 +00:00
Rhys Perry
404818dd28 aco: run p_wqm instructions in WQM
If the p_wqm ends up creating copies, these need to be in WQM. Helps (but
doesn't completely fix) artifacts in Strange Brigade. The actual issue
still exists and is harder to fix.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Rhys Perry
2d7386a2d0 aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM
We want any copies to be in WQM. I don't know if this fixes any real
application, but I can create a vkrunner test than reproduces the issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Samuel Pitoiset
3922d95b51 aco: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Daniel Schürmann
396be00640 aco: fix combine_salu_not_bitwise() when SCC is used
Previously, we didn't use the SCC bit, and thus, we didn't care about it.
With 'aco: Transform uniform bitwise instructions to 32-bit if possible.'
that changed, so that we have to handle it.

Fixes: 8a32f57fff ('aco: Transform uniform bitwise instructions to 32-bit if possible.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
2020-01-28 18:14:02 +01:00
Rhys Perry
7edcf4a59d aco: fix rebase error from GS copy shader support
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8f7712666 ('aco: implement GS copy shaders')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
2020-01-28 13:50:53 +00:00
Rhys Perry
03a0d39366 aco: use MUBUF in some situations instead of splitting vertex fetches
Fixes most of the regressions from splitting vertex fetches in an earlier
commit.

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 562696 -> 558344 (-0.77 %)
VGPRS: 395596 -> 393752 (-0.47 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11600912 -> 11311804 (-2.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 101839 -> 102372 (0.52 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:44:52 +00:00
Rhys Perry
21d2799cee aco: value-number MUBUF instructions
We will have to do this when we start creating MUBUF instructions for
load_input because NIR might not be able to tell they are identical since
it doesn't know whether two vertex attributes have the same offset.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:22 +00:00
Rhys Perry
d39f5519a1 aco: handle unaligned vertex fetch on GFX10
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 795000 -> 802368 (0.93 %)
VGPRS: 579632 -> 581280 (0.28 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17208408 -> 17583652 (2.18 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 145731 -> 145279 (-0.31 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:10 +00:00
Rhys Perry
d9e357e35b aco: skip unused channels at the start when fetching vertices
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 161320 -> 161224 (-0.06 %)
VGPRS: 153968 -> 149408 (-2.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4331496 -> 4331308 (-0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27814 -> 28594 (2.80 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 161504 -> 161408 (-0.06 %)
VGPRS: 153836 -> 149440 (-2.86 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4327572 -> 4327604 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27837 -> 28618 (2.81 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:01 +00:00
Rhys Perry
525b107347 aco: rework vertex fetching a bit
This will make it easier to skip unused channels at the start and to split
unaligned loads on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:57 +00:00
Rhys Perry
2dc63d39d3 aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
827681f921 aco: always add sgprs to sgpr_ids when choosing literals
Even if it's a literal, we should add this to sgpr_ids.
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
92970adb4b aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
e6c90e4af9 aco: fix WaR check for >64-bit FLAT/GLOBAL instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 5986e0019 ('aco: improve WAR hazard workaround with >64bit stores')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Samuel Pitoiset
d4b4f40595 aco: copy the literal offset of SMEM instructions to a temporary
GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
1ac49ba908 aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6
It's required to insert 1 wait state if the dst VGPR of any v_interp_*
is followed by a read with v_readfirstlane or v_readlane to fix GPU
hangs on GFX6. Note that v_writelane_* is apparently not affected.
This hazard isn't documented anywhere but AMD confirmed it.

This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
b9cc50fbce aco: fix a hardware bug for MRTZ exports on GFX6
GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
918f00eef8 aco: combine MRTZ (depth, stencil, sample mask) exports
Instead of emitting up to 3 for each different components (depth,
stencil and sample mask). This is needed to fix a hw bug on GFX6.

Totals from affected shaders:
SGPRS: 34728 -> 35056 (0.94 %)
VGPRS: 26440 -> 26476 (0.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1346088 -> 1344180 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3922 -> 3915 (-0.18 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
2020-01-24 16:42:15 +00:00
Timur Kristóf
c787b8d2a1 aco/gfx10: Fix VcmpxExecWARHazard mitigation.
The SOPP instruction shouldn't have a definition, and its block
should be set to -1 in order to prevent it from being recognized
as a branch.
Also fix a typo in the readme.

Fixes: d6dfce02d0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
2020-01-24 16:21:08 +00:00
Timur Kristóf
8a32f57fff aco: Transform uniform bitwise instructions to 32-bit if possible.
This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.

v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
2020-01-24 14:40:45 +00:00
Rhys Perry
b046f55086 aco: use nir_move_copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
f8f7712666 aco: implement GS copy shaders
v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
de4ce66f5c aco: remove needs_instance_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
e192e268de aco: explicitly mark end blocks for exports
For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.

v6: only fixup exports in the end block, like before
v8: simplify some code

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
8bad100f83 aco: implement GS on GFX7-8
GS is the same on GFX6, but GFX6 isn't fully supported yet.

v4: fix regclass
v7: rebase after shader args MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
40bb81c9dd radv/aco,aco: implement GS on GFX9+
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
70f63c1988 aco: improve support for s_sendmsg
In particular, the messages needed for GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Timur Kristóf
23edcf6490 aco: Make a better guess at which instructions need the VCC hint.
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
2020-01-24 13:14:23 +00:00
Samuel Pitoiset
8d5203dad2 aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6
V_TRUNC_F64 and V_FLOOR_F64 needs to be lowered on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:48 +01:00
Samuel Pitoiset
4d92601715 aco: implement 64-bit nir_op_ffloor on GFX6
GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:45 +01:00
Samuel Pitoiset
fbd169e421 aco: implement 64-bit nir_op_fround_even on GFX6
GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:42 +01:00
Samuel Pitoiset
87588801d3 aco: implement 64-bit nir_op_fceil on GFX6
GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:38 +01:00
Samuel Pitoiset
aad5176c58 aco: implement 64-bit nir_op_ftrunc on GFX6
GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:34 +01:00
Samuel Pitoiset
36e7a5f5b9 aco: implement nir_intrinsic_global_atomic_* on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:30 +01:00