Commit graph

580 commits

Author SHA1 Message Date
Marek Olšák
8904fcca6d gallium: inline struct u_suballocator to remove dereferences
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7901>
2020-12-03 21:41:19 +00:00
Marek Olšák
86675a07f8 radeonsi: don't check for GS fast launch for NOT_EOP in the indexed case
GS fast launch always uses the non-indexed path.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
c7470c1760 radeonsi: don't set DrawID and StartInstance if they are unused
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
c4ddf67ee1 radeonsi: don't invalidate emitted NUM_INSTANCES for u_blitter
invalidate_draw_sh_constants should invalidate only SGPRs.
invalidate_draw_constants invalidates SGPRs and NUM_INSTANCES.

u_blitter called invalidate_draw_sh_constants, which previously
invalidated NUM_INSTANCES as well. This commit fixes that.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
623ea81530 radeonsi: don't update provoking vertex and outprim states in SGPR if unused
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
4641dca269 radeonsi: don't update indexed flag in SGPR if it's unused
to skip the register update when switching between indexed and non-indexed

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
509142876b radeonsi: add AMD_DEBUG=nofastlaunch for debugging
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
530c276c4c radeonsi: fix max_lds_size warning in release builds
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
9d21031265 radeonsi: fix line stippling with LINES_ADJACENCY without GS
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:03 -05:00
Marek Olšák
a287ab2020 radeonsi: use util_logbase2 instead of division by index_size
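
A minimal sketch of the idea, not the driver's code: index sizes are 1, 2, or 4 bytes, so dividing by index_size can be replaced by a right shift by log2(index_size). The helper logbase2_sketch is a hypothetical stand-in for Mesa's util_logbase2, and bytes_to_indices is an assumed name used only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for util_logbase2: floor(log2(n)) for n > 0. */
static unsigned logbase2_sketch(unsigned n)
{
    unsigned log = 0;
    while (n >>= 1)
        log++;
    return log;
}

/* Convert a byte offset into a number of indices.
 * Assumes index_size is a power of two (1, 2, or 4), so the shift
 * gives the same result as byte_offset / index_size. */
static uint32_t bytes_to_indices(uint32_t byte_offset, unsigned index_size)
{
    return byte_offset >> logbase2_sketch(index_size);
}
```

The shift is only equivalent to the division because every valid index size is a power of two.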
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
2020-12-01 15:33:02 -05:00
Marek Olšák
aaed7a29be radeonsi: implement GS fast launch for indexed triangle strips
This increases performance for indexed triangle strips up to +100%.
In practice, it's limited by memory bandwidth and compute power,
so 256-bit memory bus and a lot of CUs are recommended.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7681>
2020-11-27 06:16:59 +00:00
Marek Olšák
f7364c9fe0 radeonsi: don't allocate LDS for TCS inputs if it's not used
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák
1190808eca radeonsi: if VS and TCS have the same number of threads, merge the conditionals
Instead of:
    if (VS) {
        VS;
    }
    if (TCS) {
        TCS;
    }

Do this if the number of threads is the same in VS and TCS:
    exec = enabled_threads;
    VS;
    TCS;

Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return
values.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák
5df5ee2722 radeonsi: limit HS LDS usage per workgroup to 16K to allow at least 2 WGs/CU
This increases occupancy when the LDS size is e.g. 20K for 3 waves.
If we limit the size to 16K, we can fit 2 workgroups with 2 waves each,
so 4 waves in total.
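
The occupancy arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not the driver's logic: the function name is hypothetical, LDS is treated as the only limit, and the 32 KB of LDS per CU is an assumption taken from the per-CU limit quoted in a neighboring commit message.

```c
/* Waves resident on one CU when LDS is the only limiting resource.
 * lds_per_cu:   LDS bytes available per CU (assumed 32 KB in the examples)
 * lds_per_wg:   LDS bytes used by one workgroup
 * waves_per_wg: waves in one workgroup */
static unsigned waves_per_cu_sketch(unsigned lds_per_cu, unsigned lds_per_wg,
                                    unsigned waves_per_wg)
{
    /* Only whole workgroups fit, so the division truncates. */
    unsigned workgroups = lds_per_cu / lds_per_wg;
    return workgroups * waves_per_wg;
}
```

With 32 KB per CU, a 20 KB workgroup of 3 waves yields 3 resident waves, while a 16 KB workgroup of 2 waves yields 4.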

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák
bdee9dc633 radeonsi: don't allocate LDS for TCS outputs if they are not read
This reduces LDS usage by 50% in Unigine Heaven.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák
10beddf659 radeonsi: don't leave more than 8 unoccupied lanes in HS
Previously it was 16 and bigger patches would always trim the patch count
needlessly.

There are 2 variables to consider:
- lane occupancy
- LDS usage (limiting wave occupancy)

If LDS size is 32 KB (max limit per CU) for 3 waves and we can't maximize
occupancy, it's better to leave some lanes unoccupied because using 2
waves would decrease the LDS size to 21 KB, which is not enough to fit
another workgroup on the CU.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák
9b5b5cbc53 radeonsi: adjust tess SGPRs to allow fully occupied 3 HS waves of triangles
With triangles and 3 HS waves, 3 lanes were unoccupied. Adjust the SGPR
encoding to allow 1 more triangle to fit there.

Some of the fields are not large enough, but they weren't large enough
before either.
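
The lane count in the first paragraph can be reproduced with a short sketch. The wave size of 64 and the patch-count caps of 63 and 64 are assumptions chosen to match the numbers in the message; the function name is hypothetical and the actual SGPR encoding is not shown.

```c
/* Lanes left idle when each patch needs verts_per_patch lanes and the
 * patch count is capped at max_patches (for example by a field width). */
static unsigned unoccupied_lanes_sketch(unsigned num_waves, unsigned wave_size,
                                        unsigned verts_per_patch,
                                        unsigned max_patches)
{
    unsigned lanes = num_waves * wave_size;
    unsigned patches = lanes / verts_per_patch;

    if (patches > max_patches)
        patches = max_patches;

    return lanes - patches * verts_per_patch;
}
```

Under these assumptions, 3 waves of triangles capped at 63 patches leave 3 lanes idle, and raising the cap to 64 patches fills all 192 lanes.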

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:20 +00:00
Marek Olšák
4753235406 radeonsi: don't do VGT_FLUSH before fast launch on gfx10.3
I don't see any hangs here. Blender and the Factorio trace work fine.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7542>
2020-11-18 06:19:59 +00:00
Marek Olšák
e29e41a3cd radeonsi: determine correctly if switching from normal launch to fast launch
Fixes: 3da91b3327 - radeonsi/ngg: add VGT_FLUSH when enabling fast launch

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7542>
2020-11-18 06:19:59 +00:00
Marek Olšák
8d2876a343 radeonsi: only do VGT_FLUSH for fast launch if previous draw was normal launch
Fixes: 3da91b3327 - radeonsi/ngg: add VGT_FLUSH when enabling fast launch

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7542>
2020-11-18 06:19:58 +00:00
Marek Olšák
74ea26f613 radeonsi: fix min_direct_count value
It was always 0.

Fixes: 0ce68852c "radeonsi: implement multi_draw but supporting only 1 draw"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7542>
2020-11-18 06:19:58 +00:00
Marek Olšák
602d4a78bc radeonsi: handle pipe_draw_info::increment_draw_id
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:25 +00:00
Marek Olšák
c4310f70aa radeonsi: swap DrawId and StartInstance SGPR locations
We need to change both values at the same time, so they need to be next
to each other.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:25 +00:00
Marek Olšák
f14a05d618 radeonsi: don't load DrawID for indirect draws if it's unused
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:25 +00:00
Marek Olšák
1cd455b17b gallium: extend draw_vbo to support multi draws
Essentially rename multi_draw to draw_vbo and remove start and count
from pipe_draw_info.

This is only an interface change. It doesn't add multi draw support
anywhere.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:25 +00:00
Marek Olšák
abe8ef862f gallium: make pipe_draw_indirect_info * a draw_vbo parameter
This removes 8 bytes from pipe_draw_info (think u_threaded_context)
and a lot of info->indirect pointer indirections.
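
A minimal sketch of the kind of interface change described, using hypothetical structs rather than the real pipe_draw_info layout: moving a pointer member out of the struct and into a function parameter shrinks the struct by one pointer and removes one level of indirection at each use.

```c
/* Hypothetical stand-in for the indirect draw parameters. */
struct indirect_info_sketch {
    unsigned offset;
    unsigned stride;
};

/* Before: the draw info carries a pointer to the indirect parameters. */
struct draw_info_before {
    unsigned start;
    unsigned count;
    const struct indirect_info_sketch *indirect;
};

/* After: the pointer is passed separately, so the struct is smaller. */
struct draw_info_after {
    unsigned start;
    unsigned count;
};

static unsigned first_offset_before(const struct draw_info_before *info)
{
    /* Two dependent loads: info->indirect, then ->offset. */
    return info->indirect ? info->indirect->offset : info->start;
}

static unsigned first_offset_after(const struct draw_info_after *info,
                                   const struct indirect_info_sketch *indirect)
{
    /* The pointer arrives as an argument, typically in a register. */
    return indirect ? indirect->offset : info->start;
}
```

On a typical 64-bit target the two sketch structs differ by 8 bytes, which matches the saving mentioned in the message.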

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:24 +00:00
Marek Olšák
1a717dca04 gallium: move count_from_stream_output into pipe_draw_indirect_info
This removes some overhead from tc_draw_vbo and increases the maximum number
of draws per batch from 153 to 192 in u_threaded_context.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:24 +00:00
Marek Olšák
a44868beda radeonsi: implement multi_draw for compute-based primitive culling
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
cc24ec8c07 radeonsi: set NOT_EOP for back-to-back draws on gfx10+
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
ca40dc01cc radeonsi: add support for multi draws
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
0ce68852c1 radeonsi: implement multi_draw but supporting only 1 draw
just adapting to the new interface

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
ae8d89260c radeonsi: don't check info->count == 0
it won't work with multi draws

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
d9c4ca2b7b radeonsi: don't get count from pipe_draw_info in si_num_prims_for_vertices
This is needed for multi draws.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
7cc939f7dd radeonsi: add num_draws parameter into si_need_gfx_cs_space
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
2020-10-31 00:18:11 +00:00
Marek Olšák
b7501184b9 radeonsi: implement inlinable uniforms
This improves performance for uber shaders.

It must be enabled using the new driconf option.

The driver compiles the specialized shaders in another thread without stalls,
same as all other optimizations.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7057>
2020-10-30 11:07:22 +00:00
Marek Olšák
6810e6e4d0 Revert "radeonsi/gfx10: disable vertex grouping"
This reverts commit 42f921387b.

It causes GPU hangs on gfx10.3.

Fixes: a23802bcb9 - ac,radeonsi: start adding support for gfx10.3

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>
2020-10-17 01:58:19 +00:00
Marek Olšák
30c3b2c0b6 radeonsi: simplify NGG culling enablement and add radeonsi_shader_culling option
Add a vertex count threshold into si_shader_selector to simplify
the draw_vbo code.

The new option is supposed to be used in 00-mesa-defaults.conf and should be
tweaked for best performance unlike the AMD_DEBUG experimental options.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6948>
2020-10-01 16:29:46 +00:00
Pierre-Eric Pelloux-Prayer
90b98c0649 amd/tmz: move uses_secure_bos to radeon_winsys
This allows radeon_uses_secure_bos calls to be inlined, which reduces CPU overhead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer
8e2768bbfb radeonsi/tmz: add tmz variant for sctx::tess_rings
tess_rings must be encrypted when used in a secure job so this commit
introduces a tess_rings_tmz resource.

The cs_preamble_state doesn't contain the tess_rings address anymore since
it can change. The tess_rings related registers go in a separate preamble.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer
2589888ce9 radeonsi/tmz: add tmz variant of sctx::wait_mem_scratch
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer
55b018b634 amd/winsys: add RADEON_FLUSH_TOGGLE_SECURE_SUBMISSION
Instead of exposing a cs_set_secure() callback that always needs a call
to si_flush_gfx_cs before a switch, this commit introduces a new
flag to switch between secure and non-secure on submissions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer
1b0d660cbc radeonsi/tmz: allow secure job if the app made a tmz allocation
This commit makes TMZ always allowed instead of being either off or forced-on
with AMD_DEBUG=tmz.

With this change:
- secure job can be used as soon as the application made a tmz allocation. Driver
  internal allocations are not enough to enable secure jobs (if tmz is supported
  and enabled by the kernel)
- AMD_DEBUG=tmz forces all scanout/depth/stencil buffers to be allocated as TMZ.
  This is useful to test apps that don't explicitly support protected content.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Marek Olšák
32d754825c radeonsi: always inline draw-related functions that have only one use
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
f24b5894f8 radeonsi: lift the conditional for skipping si_upload_vertex_buffer_descriptors
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
0b2f75f9ac radeonsi: add unlikely statements into si_draw_vbo
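
For readers unfamiliar with branch hints, here is a sketch of what an "unlikely" annotation usually looks like. The macro and function names are hypothetical; the sketch relies on the GCC/Clang builtin __builtin_expect and falls back to the plain condition on other compilers. It does not reproduce the conditions annotated in si_draw_vbo.

```c
/* Branch hint: tell the compiler the condition is rarely true so the
 * common path can be laid out as the fall-through. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)
#else
#define unlikely_sketch(x) (x)
#endif

/* Hypothetical draw entry point: returns 1 if a draw was issued. */
static int draw_sketch(unsigned count)
{
    if (unlikely_sketch(count == 0))
        return 0; /* rare early-out */

    /* ... the hot path would continue here ... */
    return 1;
}
```

The hint changes code layout only; the function's result is the same with or without it.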
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
8ab15c9e33 radeonsi: move si_upload_vertex_buffer_descriptors into si_state_draw.c
It will be inlined there.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
12b1e8a35d radeonsi: reorganize the code around the gfx9 scissor bug
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
d647065b06 radeonsi: move a displaced comment in si_draw_vbo
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
816a867bbd radeonsi: call si_upload_graphics_shader_descriptors before the big conditional
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák
22253e6b65 gallium: rename PIPE_TRANSFER_* -> PIPE_MAP_*
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5749>
2020-09-22 03:20:54 +00:00