Commit graph

54 commits

Author SHA1 Message Date
Dave Airlie
5247b311e9 radv/gfx9: fix set predication packet.
The predication packet changed format on GFX9, update the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:52:50 +10:00
Dave Airlie
9ee67467c9 radv: predicate cmask eliminate when using DCC.
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.

This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.

This increases the fps on Sascha Willems deferred.

Tonga: 580fps->670fps on a Tonga PRO card.
Polaris 730->850fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:43 +01:00
Dave Airlie
a6c2001ace radv: add support for cmd predication.
This doesn't get used yet, it just adds support to various PKT3
emissions to enable it later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 02:06:49 +01:00
Dave Airlie
6a68170c83 radv: handle primitive id input into fragment shader with no geom shader
Fixes:
dEQP-VK.pipeline.framebuffer_attachment.no_attachments
dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:30 +10:00
Dave Airlie
a563f611c3 radv: set prim_id for geometry shaders
Noticed in passing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:22 +10:00
Grazvydas Ignotas
f490200973 radv: assert on CP_DMA_USE_L2 for SI
The register header (and radeonsi comment) states V_411_SRC_ADDR_TC_L2
is for CIK+ only, so let's assert on earlier ASICs.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-11 14:28:08 +03:00
Dave Airlie
86eff151b1 radv: move chip_class extraction down further.
This seems to matter here in a profile, without this we spend a lot
more time exiting this function with no flush bits.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:25:20 +10:00
Dave Airlie
f0b82bc545 radv/gfx9: use correct register setting for uconfig regs
Thanks to Marek for pointing this out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 08:09:03 +10:00
Bas Nieuwenhuizen
e08f741678 radv: Add early exit for cache flushes.
No sense checking each bit separately in the common case of none
being set.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Dave Airlie
5c8f8cae3e radv: add IA_MULTI_VGT_PARAM support for GFX9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:55 +10:00
Dave Airlie
67655cb24f radv: add rb+ support for GFX9
This adds some rb+ support, as on GFX9 we have to disable
it as per radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:45 +10:00
Dave Airlie
c2fbeb7ca0 radv: add GFX9 cache flushing support.
GFX9 needs to write event EOP to a fence buffer, allocate some
space for this, and just write an ever increasing number to it,
this isn't exactly what radeonsi does, but it seems to work.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:40 +10:00
Dave Airlie
87b3799493 radv: add GFX9 to initialisation cmd buffer.
This just adds support for initialising some GFX9 registers,
and handles the different init for the VGT reuse reg.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:35 +10:00
Dave Airlie
98f27b9cce radv: don't setup raster_config on gfx9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:32 +10:00
Dave Airlie
77b8aa4d95 radv: add gfx9 cp dma support.
This adds support to the CP dma code for GFX9, ported from
radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:29 +10:00
Dave Airlie
0063da8393 radv: add some misc gfx9 pieces.
This just adds the strings and includes the gfx9 register defs
in some files that we need them in.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:21 +10:00
Dave Airlie
04924c09be radv: fix typo in comment. 2017-06-06 08:59:30 +10:00
Dave Airlie
114d29e7fe radv: add a comment from radeonsi before cp dma function.
This is just copied over.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:44:01 +10:00
Dave Airlie
bcae327469 radv: realign cp dma code with radeonsi
This reworks this code to be like radeonsi, which will make it
easier to add GFX9 support to it in the future.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:49:11 +10:00
Dave Airlie
ad61eac250 radv: factor out eop event writing code. (v2)
In prep for GFX9 refactor some of the eop event writing code
out.

This changes behaviour, but aligns with what radeonsi does,
it does double emits on CIK/VI, whereas previously it only
did this on CIK.

v2: bump the size checks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:56 +10:00
Dave Airlie
7205431e73 radv: factor out si_emit_wait_fence code.
This code was in a few places, consolidate into one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:20 +10:00
Dave Airlie
2add79a732 radv: apply the tess+GS hang workaround to Polaris12 as well
As I pointed out for radeonsi, and AMD confirmed, so fix this
in radv as well.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 11:17:48 +01:00
Dave Airlie
a096d8d3f7 radv: enable POLARIS12 support.
This just adds the chip in the right places.

We don't set the partial_vs_wave workaround, as radeonsi
doesn't, but have to confirm it's not required.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-05 11:07:40 +10:00
Bas Nieuwenhuizen
1e1165389c radv: Add shader prefetch.
Gives me approximately a 2% perf increase in bot dota2 & talos.

Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-19 23:47:27 +02:00
Bas Nieuwenhuizen
a4c4efad89 radv: Rework guard band calculation.
We want the guardband_x/y to be the largerst scalars such that each
viewport scaled by that amount is still a subrange of [-32767, 32767].

The old code has a couple of issues:
1) It used scissor instead of viewport_scissor, potentially taking into
   account a viewport that is too small and therefore selecting a scale
   that is too large.
2) Merging the viewports isn't ideal, as for example viewports with
   boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k,
   while their union [0, 1001] only allows a scale of ~32.

The new code just determines the guardband per viewport and takes the minimum.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-04-03 23:03:46 +02:00
Dave Airlie
03a67fbbf7 radv: fix order of the guardband register emission.
y is vert, x is horiz.

Noticed in visual inspection compared to radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-02 20:17:30 +10:00
Dave Airlie
3f0d69af20 radv: add ia_multi_vgt_param tessellation support.
This just ports the relevant radeonsi pieces.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:17:08 +10:00
Dave Airlie
aeb49bc2b9 radv: port polaris vgt vertex reuse workaround.
This ports the VGT_VERTEX_REUSE register settings
for Polaris GPUs from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:51 +10:00
Dave Airlie
46a820b383 radv: configure tessellation distribution register.
This just takes the radeonsi values.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:45 +10:00
Bas Nieuwenhuizen
0f3de89a56 radv: Use the guard band.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen
76603aa90b radv: Drop the default viewport when 0 viewports are given.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Marek Olšák
5691e14735 amd: GFX9 packet changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Dave Airlie
ae0551b4b3 radv: fix ia_multi_vgt_param for instanced vs indirect draw.
The logic was different than radeonsi, fix it up before adding
tess support.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:55 +10:00
Bas Nieuwenhuizen
ad4dee521d Revert "radv: Emit cache flushes before CP DMA."
This reverts commit cce43f6d8c.

Redundant, as the flush already happens at si_cp_dma_prepare.

Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-16 00:55:03 +01:00
Bas Nieuwenhuizen
cce43f6d8c radv: Emit cache flushes before CP DMA.
The flushes could be due to TRANSFER barriers.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-14 22:16:34 +01:00
Bas Nieuwenhuizen
66e12d4073 radv: Add L2 writeback.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 09:15:51 +01:00
Bas Nieuwenhuizen
5241fb0ffb radv: Flush in the initial preamble CS.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:19:58 +01:00
Bas Nieuwenhuizen
eac790811b radv: Split emitting the cache flush out.
So that we can use it without a cmd_buffer.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:19:45 +01:00
Dave Airlie
1f6376935b Revert "radv: detect command buffers that do no work and drop them (v2)"
This just keeps popping up minor problems and regressions we should
revisit in a more sustainable manner later.

This also reverts:
Revert "radv: query cmds should mark a cmd buffer as having draws."
Revert "radv: also fixup event emission to not get culled."

This reverts commit d1640e7932.
This reverts commit 8b47b97215.
This reverts commit b4b19afebe.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-20 09:00:40 +10:00
Dave Airlie
3360dbe0c1 radv: fixup IA_MULTI_VGT_PARAM handling.
This ports the remains of the workarounds from radeonsi for
the non-TESS cases. It should provide equivalent workarounds
for hawaii and bonarie.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 20:29:19 +00:00
Dave Airlie
09bf5491c4 radv: adopt some init config workarounds from radeonsi.
Just one bonaire fix.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:33 +10:00
Dave Airlie
5e988ac61f radv: align the initial state command buffer.
This just adds the padding to align this to an 8 dword boundary.

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:21 +10:00
Dave Airlie
592069c1fb radv: use indirect buffer for initial gfx state.
This puts the common gfx state for the device into an
indirect buffer, and just calls out to it, on CIK and above.

This is taken from what radeonsi does.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:45 +00:00
Dave Airlie
b26253b34d radv: start splitting init config up
This is just prep work for the following patch to use
a common gfx init indirect buffer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:34 +00:00
Dave Airlie
604e562e5b radv: don't pass physical device to si_init_ fns.
This is just a trivial cleanup.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:06 +00:00
Dave Airlie
8b47b97215 radv: detect command buffers that do no work and drop them (v2)
If a buffer is just full of flushes we flush things on command
buffer submission, so don't bother submitting these.

This will reduce some CPU overhead on dota2, which submits a fair
few command streams that don't end up drawing anything.

v2: reorganise loop to count first then malloc,
rename some vars (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:00:28 +00:00
Dave Airlie
3b507855cb radv: fixup ia multi vgt param code to handle geom shaders.
This fixes up a few of the commented out blocks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:28 +10:00
Bas Nieuwenhuizen
8406f79d6a radv: Get physical device from radv_device instead of the instance.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-16 22:15:22 +01:00
Dave Airlie
e9d3cbca31 radv: fix multi-viewport emission
This set context req seq was in the wrong place.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-11 09:08:51 +01:00
Bas Nieuwenhuizen
97dfff5410 radv: Dump command buffer on hang.
v2:
  - Now use the filename specified by RADV_TRACE_FILE env var.
  - Use the same var to enable tracing.

I thought we could as well always set the filename explicitly
instead of having some arbitrary defaults, and at that point
we don't need a separate feature enable.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-09 21:44:03 +01:00