Marek Olšák
442ef8c3e3
radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
...
This decreases memory usage, because serialized NIR is more compact.
The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
62229e8949
radeonsi: use IR SHA1 as the cache key for the in-memory shader cache
...
instead of using whole IR binaries. This saves some memory.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:42 -05:00
Marek Olšák
4d1e43badb
radeonsi: initialize shader compilers in threads on demand
...
It takes a noticable amount of time with piglit.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-28 21:36:18 -04:00
Marek Olšák
fff884e09d
radeonsi/nir: implement pipe_screen::finalize_nir
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
268e0e01f3
radeonsi/nir: simplify si_lower_nir signature
...
just a cleanup
2019-10-15 21:52:09 -04:00
Marek Olšák
dd4cc56ebd
nir: add a strip parameter to nir_serialize
...
so that drivers don't have to call nir_strip manually.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-10 15:47:07 -04:00
Marek Olšák
743a9d85e2
radeonsi: add FMASK slots for shader images (for MSAA images)
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:31 -04:00
Marek Olšák
eec7b0a865
radeonsi: use simple_mtx_t instead of mtx_t
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:05:07 -04:00
Timothy Arceri
896885025f
util/u_queue: track job size and limit the size of queue growth
...
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.
This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.
The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-19 15:03:27 +10:00
Marek Olšák
360cf3c4b0
radeonsi: fix scratch buffer WAVESIZE setting leading to corruption
...
Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:32 -04:00
Marek Olšák
40e5ac45ae
radeonsi: align scratch and ring buffer allocations for faster memory access
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:28 -04:00
Marek Olšák
d8f27552f4
radeonsi: consolidate determining VGPR_COMP_CNT for API VS
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
4dde40908f
radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags
...
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.
Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
1426acf9e7
radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables
...
It varies depending on si_shader_key::as_ngg.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
28f44ee533
radeonsi/gfx10: fix InstanceID for legacy VS+GS
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
e121d75de9
radeonsi/gfx10: add as_ngg variant for VS as ES to select Wave32/64
...
Legacy GS only works with Wave64.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
f34d023f1a
radeonsi/gfx10: create the GS copy shader if using legacy streamout
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
776f05a307
radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
cab5b3861d
radeonsi/gfx10: fix tessellation for the legacy pipeline
...
ported from PAL
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
a9bb566955
radeonsi: move some global shader cache flags to per-binary flags
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
810846e157
radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache
...
It could load an NGG shader when we want a legacy shader and vice versa.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Samuel Pitoiset
021feb1bf6
ac: add rbplus_allowed to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:41 +02:00
Samuel Pitoiset
c08401f035
ac: add has_distributed_tess to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:11 +02:00
Marek Olšák
223b3174bd
radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
...
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.
This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
Marek Olšák
bdcbac9459
radeonsi: handle the use_ngg_streamout flag in si_update_ngg
2019-08-19 17:23:38 -04:00
Marek Olšák
a6b3ca1c70
radeonsi: move the tess factor ring size assertion to a place where it matters
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
8ce4f9bbc3
radeonsi: remove the always_nir option
...
tgsi_to_nir is no longer optional if NIR is enabled.
2019-08-12 14:52:17 -04:00
Marek Olšák
6a2bdb8d01
gallium: add TGSI_PROPERTY_VS_BLIT_SGPRS_AMD for tgsi_to_nir
...
needed by radeonsi NIR support
2019-08-12 14:52:17 -04:00
Marek Olšák
91227a1e17
radeonsi/gfx10: add global use_ngg and use_ngg_streamout flags
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:09:02 -04:00
Marek Olšák
f064b530f6
radeonsi/gfx10: remove an obsolete VGT_REUSE_OFF workaround
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:09:01 -04:00
Marek Olšák
c5a6ecf61a
radeonsi/gfx10: implement a bug workaround for GE_PC_ALLOC
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:58 -04:00
Marek Olšák
8f8c28767e
radeonsi/gfx10: implement a bug workaround for NGG -> legacy transitions
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:57 -04:00
Marek Olšák
cb9d95623b
radeonsi/gfx10: implement a GE bug workaround
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:56 -04:00
Marek Olšák
71b53020b7
radeonsi/gfx10: simplify NGG code in si_update_shaders
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:53 -04:00
Marek Olšák
a232f5e07c
radeonsi/gfx10: fix input VGPRs for legacy VS
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:51 -04:00
Marek Olšák
8b8819e88a
radeonsi: make sure that DSA state != NULL and remove all NULL checking
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
b758eed9c3
radeonsi: make sure that blend state != NULL and remove all NULL checking
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
e777720173
radeonsi/nir: lower PS inputs before scanning the shader
...
Lowering PS inputs can eliminate some of them, which messes up
persp/linear barycentric coord usage info.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:46 -04:00
Marek Olšák
5f16fdefdf
radeonsi/nir: add an option to convert TGSI to NIR
...
Use at your own risk.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-30 22:06:23 -04:00
Marek Olšák
e718f8e713
radeonsi: simplify si_get_input_prim and remove incorrect TODO comment
...
u_vertices_per_prim(QUADS) is the same as TRIANGLES.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:49 -04:00
Marek Olšák
ad642d5b3a
radeonsi: stop using info.opcode_count[TGSI_OPCODE_INTERP_SAMPLE]
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:46 -04:00
Marek Olšák
8f72f137ad
radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64
...
Legacy GS has to use Wave64, so TES before GS has to use Wave64 too.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
88efb63caf
radeonsi/gfx10: implement Wave32
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
d3a80f2dda
radeonsi/gfx10: remove the disable_ngg option
...
because legacy VS hangs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
0f30223cf4
radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
985a59e0d1
radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
7f0ada3f3e
radeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
e08463ac22
radeonsi/gfx10: update a tunable max_es_verts_base for NGG
...
We have to fix the computation so as not to break quads.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
79d56e6a4a
radeonsi/gfx10: implement ARB_post_depth_coverage
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
4985c3ee22
radeonsi/gfx10: set HS/GS/CS.WGP_MODE
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00