Marek Olšák
c252273f98
radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
e402961e1d
radeonsi: correct WRITE_DATA.DST_SEL definitions
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák
5183e794af
radeonsi: also apply the GS hang workaround to draws without tessellation
...
ported from AMDVLK.
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 18:55:58 -05:00
Marek Olšák
cba475b3e7
radeonsi: use u_decomposed_prims_for_vertices instead of u_prims_for_vertices
...
It seems to be the same, but this doesn't use integer division with
a variable divisor.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:56 -05:00
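A minimal sketch of the idea in the commit above, not the Mesa implementation: a table-driven count divides by a per-primitive increment that is only known at run time, while a switch over the primitive type lets every division use a compile-time constant. The enum, the struct, and both function names are hypothetical and exist only for illustration.

```c
/* Hypothetical primitive kinds; not Mesa's enum. */
enum prim_kind {
    PRIM_LINES,
    PRIM_LINE_STRIP,
    PRIM_TRIANGLES,
    PRIM_TRIANGLE_STRIP
};

/* Hypothetical table entry: minimum vertex count and per-primitive increment. */
struct prim_info {
    unsigned min;
    unsigned incr;
};

/* Table-driven count: divides by info.incr, a divisor known only at run time. */
static unsigned prims_generic(struct prim_info info, unsigned verts)
{
    if (verts < info.min)
        return 0;
    return 1 + (verts - info.min) / info.incr;
}

/* Switch-based count: each case divides by a constant or does not divide. */
static unsigned prims_decomposed(enum prim_kind kind, unsigned verts)
{
    switch (kind) {
    case PRIM_LINES:
        return verts / 2;
    case PRIM_LINE_STRIP:
        return verts >= 2 ? verts - 1 : 0;
    case PRIM_TRIANGLES:
        return verts / 3;
    case PRIM_TRIANGLE_STRIP:
        return verts >= 3 ? verts - 2 : 0;
    }
    return 0;
}
```

For 7 vertices, both variants give 2 triangles for a triangle list and 5 for a triangle strip, which matches the "seems to be the same" remark in the commit message.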
Marek Olšák
54bc87469a
radeonsi: make si_cp_wait_mem more configurable
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:54 -05:00
Marek Olšák
d28e208213
radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-01-02 15:01:50 -05:00
Nicolai Hähnle
23af72af25
radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:32 +01:00
Nicolai Hähnle
f18b2ac0db
radeonsi: add si_init_draw_functions and make some functions static
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:30 +01:00
Marek Olšák
eae8f49fc6
radeonsi: fix a VGT hang with primitive restart on Polaris10 and later
...
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
fcc70e4855
radeonsi: track context rolls better for the Vega scissor bug workaround
...
We should get fewer context rolls with the SET_CONTEXT_REG optimization,
but it would have been for nothing if the scissor state rolled the context
anyway. Don't emit the scissor state if there is no context roll.
2018-10-16 17:23:25 -04:00
Marek Olšák
0d05581578
radeonsi: rename si_gfx_* functions to si_cp_*
...
and write_event_eop -> release_mem
2018-10-16 15:28:22 -04:00
Marek Olšák
6e1cf6532d
radeonsi: make si_gfx_write_event_eop more configurable
2018-10-16 15:28:22 -04:00
Marek Olšák
066aa44fc5
radeonsi: fix a typo at CS_PARTIAL_FLUSH
...
harmless
2018-10-06 21:50:52 -04:00
Marek Olšák
fa023f293e
ac: correct PKT3_COPY_DATA definitions
2018-10-06 21:50:09 -04:00
Marek Olšák
ac72a6bd0b
radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:31 -04:00
Marek Olšák
86b52d4236
radeonsi: reduce LDS stalls by 40% for tessellation
...
40% is the decrease in the LGKM counter (which includes SMEM too)
for the GFX9 LSHS stage.
This will make the LDS size slightly larger, but I wasn't able to increase
the patch stride without corruption, so I'm increasing the vertex stride.
2018-07-23 20:23:52 -04:00
Marek Olšák
5a6414f135
radeonsi: implement vertex color clamping for tess and GS
2018-06-28 22:41:12 -04:00
Marek Olšák
6703fec58c
amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-19 13:08:50 -04:00
Marek Olšák
166c00e28e
radeonsi: set a better NUM_PATCHES hard limit
...
AMDVLK uses 64 (distributed) and 16 (non-distributed).
radeonsi will use 63 and 16.
* This might improve tessellation performance on Hawaii, Bonaire, Tahiti,
Pitcairn. (they will use 16)
* I'm not sure if this matters for 1 SE configs.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
0d685ba290
radeonsi: make sure LS-HS vector lanes are reasonably occupied
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
Marek Olšák
e93fe403bc
radeonsi: properly compute an LS-HS thread group size limit
...
"64 / max * 4" can be less than "64 * 4 / max" because integer division truncates.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-19 12:52:28 -04:00
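The arithmetic in the commit above can be checked with a small sketch. The function names and the parameter name are hypothetical; only the two expressions come from the commit message. Dividing first truncates before the multiplication, so the result can fall below the multiply-first form whenever max does not divide 64.

```c
#include <assert.h>

/* "64 / max * 4": the division truncates before the multiply. */
static unsigned limit_divide_first(unsigned max)
{
    assert(max != 0);
    return 64 / max * 4;
}

/* "64 * 4 / max": the multiply happens first, so less is lost to truncation. */
static unsigned limit_multiply_first(unsigned max)
{
    assert(max != 0);
    return 64 * 4 / max;
}
```

With max = 3 the two forms give 84 and 85; with max = 5 they give 48 and 51; with max = 4 both give 64.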
Marek Olšák
22e994bb75
radeonsi: assume that rasterizer state is non-NULL in draw_vbo
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:36 -04:00
Marek Olšák
f3b3ee6974
radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:34 -04:00
Marek Olšák
d6974feb90
radeonsi: move the guardband registers into a separate state atom
...
They have a different frequency of updates and don't change when scissors
change.
I think this even fixes something in si_update_vs_viewport_state.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:31 -04:00
Marek Olšák
68b1c669e7
radeonsi/gfx9: implement the scissor bug workaround without performance drop
...
This might improve performance on Vega10 and Raven.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:27 -04:00
Marek Olšák
73b0d10152
radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:25 -04:00
Marek Olšák
28ee825e19
radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs
...
Same as AMDVLK.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:23 -04:00
Marek Olšák
6fadfc01c6
radeonsi: use r600_resource() typecast helper
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
3160ee876a
radeonsi: remove unused atom parameter from si_atom::emit
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
ccebcba893
radeonsi: remove si_atom::id
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
639b673fc3
radeonsi: don't use an indirect table for state atoms
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
9054799b39
radeonsi: rename r600_atom -> si_atom
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
c6f1d36019
radeonsi: add support for VegaM
...
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:33 -04:00
Marek Olšák
9a1363427e
radeonsi: always prefetch later shaders after the draw packet
...
so that the draw is started as soon as possible.
v2: only prefetch the API VS and VBO descriptors
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
e4b7974ec7
radeonsi: emit shader pointers before cache flushes & waits
...
This code was written with the constant engine in mind.
We can simplify it now.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
6a93441295
radeonsi: remove r600_common_context
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f77361d2e
radeonsi: remove r600_pipe_common::screen
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5777488406
radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
076afb4f0e
radeonsi: rename a few R600/r600_ -> SI_/si_
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c424f86180
radeonsi: use si_context instead of pipe_context in parameters pt1
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4c5efc40f4
radeonsi: update copyrights
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
95bc30275b
radeonsi: switch radeon_add_to_buffer_list parameter to si_context
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3069cb8b78
radeonsi: use r600_common_context less pt2
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0606190059
radeonsi: don't use r600_common_context in si_emit_cache_flush
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3de323f9bb
radeonsi: switch r600_atom::emit parameter to si_context
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2b70dd8c8a
radeonsi: flatten / remove struct r600_ring
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fa09388704
radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2d03c4cac8
radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs
...
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
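The "13-bit pointer" remark in the commit above follows from 512 KB being 2^19: a 32-bit address aligned to 512 KB has its low 19 bits clear, so only the upper 32 - 19 = 13 bits carry information. The sketch below packs such an address into 13 spare bits of a 32-bit word. The position of the spare bits (assumed here to be the top 13) and all names are hypothetical; the commit message does not say where the unused bits of TCS_OUT_LAYOUT are.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ALIGN_SHIFT 19u     /* 512 KB == 1 << 19 */
#define ADDR_FIELD_SHIFT 19u     /* assumed position of the 13 spare bits */
#define ADDR_FIELD_MASK  0x1fffu /* 13 bits */

/* Store the significant 13 bits of a 512 KB-aligned address in the spare bits. */
static uint32_t pack_ring_addr(uint32_t layout, uint32_t addr)
{
    assert((addr & ((1u << RING_ALIGN_SHIFT) - 1)) == 0); /* must be aligned */
    assert((layout >> ADDR_FIELD_SHIFT) == 0);            /* spare bits must be free */
    return layout | ((addr >> RING_ALIGN_SHIFT) << ADDR_FIELD_SHIFT);
}

/* Recover the full 32-bit address from the packed word. */
static uint32_t unpack_ring_addr(uint32_t word)
{
    return ((word >> ADDR_FIELD_SHIFT) & ADDR_FIELD_MASK) << RING_ALIGN_SHIFT;
}
```

Under these assumptions, packing the address 0x80080000 into a word whose low bits hold 0x1234 yields 0x80081234, and unpacking returns 0x80080000 unchanged.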
Marek Olšák
fca7dee9c6
radeonsi: put both tessellation rings into 1 buffer
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
41895c26d3
radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits
...
For a later patch.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00