Marek Olšák
a803008c7f
radeonsi: replace TGSI_INTERPOLATE with INTERP_MODE
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
6925401a38
radeonsi: remove si_shader_selector::type
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
966307983b
radeonsi: precompute si_*_descriptors_idx in si_shader_selector
...
It helps remove one use of sel->type.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
3c54d73e4b
radeonsi: change PIPE_SHADER to MESA_SHADER (debug flags)
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
b1cb72c449
radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_selector::type)
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
14391533f8
radeonsi: simplify handling color interp modes in si_emit_spi_map
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340 >
2020-09-02 23:03:00 -04:00
Marek Olšák
81d106d6ec
radeonsi: lower IO intrinsics - complete rewrite of input/output scanning
...
Input and output info is gathered from intrinsics. nir_variables are
ignored (and we'll remove them anyway).
This is a prerequisite for ACO, but also makes the IR prettier.
The ac_nir_to_llvm change has to be in this commit.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445 >
2020-09-02 22:45:38 -04:00
Marek Olšák
b7a6333ee4
amd/registers: switch to new generated register definitions
...
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6423 >
2020-09-01 08:45:54 -04:00
Marek Olšák
633d2aa915
radeonsi: use the same units for esgs_ring_size and ngg_emit_size
...
for consistency
Fixes: a23802bcb9 - ac,radeonsi: start adding support for gfx10.3
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6137 >
2020-08-07 11:22:21 -04:00
Marek Olšák
8af0f91fd3
radeonsi: add reg shadowing codepaths to GS and tess ring setup
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5798 >
2020-07-22 12:08:33 -04:00
Marek Olšák
b84dbd2936
radeonsi: reorder code in update_gs_ring_buffers and init_tess_factor_ring
...
to reduce the churn when adding codepaths for shadowed registers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5798 >
2020-07-22 12:08:19 -04:00
Marek Olšák
081691b5ae
radeonsi: prevent a gfx10_ngg_calculate_subgroup_info failure for TES+NGG GS
...
arb_tessellation_shader-tes-gs-max-output -small -scan 1 50 -auto -fbo
doesn't pass, but at least all shaders are compiled successfully.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5700 >
2020-07-16 04:04:52 +00:00
Marek Olšák
50d7553600
radeonsi: add a debug option to enable NGG culling for tessellation
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5524 >
2020-06-30 10:56:41 +00:00
Marek Olšák
9049e39804
radeonsi: always use Wave32 for GS fast launch, because Wave64 hangs
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5524 >
2020-06-30 10:56:41 +00:00
Pierre-Eric Pelloux-Prayer
5a05f9714b
radeonsi: bump SI_NUM_SHADER_BUFFERS to 32
...
Some app uses more than 8 SSBOs (https://gitlab.freedesktop.org/mesa/mesa/-/issues/2946 ),
so increase SI_NUM_SHADER_BUFFERS to 32 (which allows 16 SSBOs).
Since we're now using a 64 bits number to track buffers, we could bump
SI_NUM_SHADER_BUFFERS to 48 but that would conflict with Mesa's
MAX_COMBINED_ATOMIC_BUFFERS limit (= 90).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2122
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5632 >
2020-06-30 09:23:14 +02:00
Marek Olšák
da78d50bc8
radeonsi: make si_pm4_cmd_begin/end static and simplify all usages
...
There is no longer the confusing trailing si_pm4_cmd_end call.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5603 >
2020-06-26 07:02:57 +00:00
Marek Olšák
7b2a0f880b
radeonsi: disallow adding BOs into si_pm4_state except 1 shader BO per state
...
The si_shader pointer is already there, so use it and remove the array
of BOs.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5603 >
2020-06-26 07:02:57 +00:00
Marek Olšák
428360662f
radeonsi: don't add the tess ring buffers into the cs_preamble state
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5603 >
2020-06-26 07:02:57 +00:00
Marek Olšák
1c1d34a67a
radeonsi: rename init_config states to cs_preamble states
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5603 >
2020-06-26 07:02:57 +00:00
Marek Olšák
3fec2f67c3
radeonsi: compact MRTs to save PS export memory space
...
If there are holes between color outputs (e.g. a shader exports MRT1, but
not MRT0), we can remove the holes by moving higher MRTs lower.
The hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK.
This is a performance optimization, but MRTs with holes are pretty rare,
so there is most likely no effect on any app.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5535 >
2020-06-23 00:23:51 -04:00
Marek Olšák
a23802bcb9
ac,radeonsi: start adding support for gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
a1602516d7
ac,radeonsi: replace == GFX10 with >= GFX10 where it's needed
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
2a3806ffa3
amd: replace SH -> SA (shader array) in comments
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5184 >
2020-05-26 06:00:54 -04:00
Marek Olšák
2cf46f2e3d
ac/gpu_info: replace num_good_cu_per_sh with min/max_good_cu_per_sa
...
Perf counters use the new max number.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5184 >
2020-05-26 06:00:54 -04:00
Marek Olšák
3cd96b5109
radeonsi: don't use INDIRECT_BUFFER within IBs
...
It's fragile. If I change the size or alignment, it hangs. Better safe than
sorry.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5095 >
2020-05-23 03:44:44 -04:00
Axel Davy
45e69e7d11
radeonsi: Enable tgsi to nir disk cache
...
Enable the tgsi to nir cache for radeonsi.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4993 >
2020-05-13 19:43:05 +00:00
Axel Davy
522bd414f3
ttn: Add new allow_disk_cache parameter
...
For now this parameter doesn't do anything.
It means the implementation is allowed to use
a cache on disk.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4993 >
2020-05-13 19:43:05 +00:00
Pierre-Eric Pelloux-Prayer
547e81655a
radeonsi: don't print gs_copy_shader stats for shaderdb
...
Fixes: dbc86fa3de ("radeonsi: dump shader stats when hitting the live cache")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607 >
2020-05-05 12:26:02 +02:00
Marek Olšák
b4fd8f1919
ac,radeonsi: simplify checking for Navi1x chips
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698 >
2020-04-24 10:38:54 +00:00
Pierre-Eric Pelloux-Prayer
dbc86fa3de
radeonsi: dump shader stats when hitting the live cache
...
With the introduction of the live shader cache, when a shader is
fetched from the cache no stats are printed for shaderdb.
So in a sequence like this: vs1, fs1, vs1, fs2, shaderdb may see
3 or 4 lines, depending on the threads being used.
If one run produces 3 lines while the other produces 4 lines, it
would compare vs1 stats with fs2 stats.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355 >
2020-04-02 08:31:37 +02:00
Pierre-Eric Pelloux-Prayer
d7008fe46a
radeonsi: switch to 3-spaces style
...
Generated automatically using clang-format and the following config:
AlignAfterOpenBracket: true
AlignConsecutiveMacros: true
AllowAllArgumentsOnNextLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: false
AlwaysBreakAfterReturnType: None
BasedOnStyle: LLVM
BraceWrapping:
AfterControlStatement: false
AfterEnum: true
AfterFunction: true
AfterStruct: false
BeforeElse: false
SplitEmptyFunction: true
BinPackArguments: true
BinPackParameters: true
BreakBeforeBraces: Custom
ColumnLimit: 100
ContinuationIndentWidth: 3
Cpp11BracedListStyle: false
Cpp11BracedListStyle: true
ForEachMacros:
- LIST_FOR_EACH_ENTRY
- LIST_FOR_EACH_ENTRY_SAFE
- util_dynarray_foreach
- nir_foreach_variable
- nir_foreach_variable_safe
- nir_foreach_register
- nir_foreach_register_safe
- nir_foreach_use
- nir_foreach_use_safe
- nir_foreach_if_use
- nir_foreach_if_use_safe
- nir_foreach_def
- nir_foreach_def_safe
- nir_foreach_phi_src
- nir_foreach_phi_src_safe
- nir_foreach_parallel_copy_entry
- nir_foreach_instr
- nir_foreach_instr_reverse
- nir_foreach_instr_safe
- nir_foreach_instr_reverse_safe
- nir_foreach_function
- nir_foreach_block
- nir_foreach_block_safe
- nir_foreach_block_reverse
- nir_foreach_block_reverse_safe
- nir_foreach_block_in_cf_node
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '<[[:alnum:].]+>'
Priority: 2
- Regex: '.*'
Priority: 1
IndentWidth: 3
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyExcessCharacter: 100
SpaceAfterCStyleCast: false
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: false
SpacesInContainerLiterals: false
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319 >
2020-03-30 11:05:52 +00:00
Marek Olšák
4ef1c8d60b
radeonsi/gfx10: fix the wave size for compute-based culling
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4269 >
2020-03-28 00:58:34 +00:00
Marek Olšák
65e9239977
radeonsi: add num_vbos_in_user_sgprs into the shader cache key
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4269 >
2020-03-28 00:58:34 +00:00
Marek Olšák
7ba5e94c50
ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143 >
2020-03-12 17:27:23 +00:00
Marek Olšák
7481c4be58
radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS
...
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079 >
2020-03-09 16:08:10 -04:00
Daniel Schürmann
9d64ad2fe7
radeonsi: lower discard to demote when FS_CORRECT_DERIVS_AFTER_KILL is enabled
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047 >
2020-03-09 12:29:32 +00:00
Marek Olšák
c046551e60
radeonsi: print shader cache stats with AMD_DEBUG=cache_stats
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929 >
2020-01-24 20:29:29 -05:00
Marek Olšák
2fd3bb23ab
radeonsi: restructure si_shader_cache_load_shader
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929 >
2020-01-24 20:29:29 -05:00
Marek Olšák
0db74f479b
radeonsi: use the live shader cache
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929 >
2020-01-24 20:29:29 -05:00
Marek Olšák
7ce84b256e
radeonsi: make si_compile_shader return bool
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421 >
2020-01-23 19:10:21 +00:00
Marek Olšák
735a3ba007
radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling
...
Only non-indexed triangle lists and strips are supported. This increases
performance if there is something to cull.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
c377f45c18
radeonsi/gfx10: rewrite late alloc computation
...
- Use conservative late alloc when the number of CUs <= 6.
- Move the late alloc GS register to the GS shader state, so that it can be
tuned for NGG culling.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
8db00a51f8
radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
aa2d846604
radeonsi/gfx10: move GE_PC_ALLOC setting to shader states
...
The value is not changed. I just use a different way to compute it.
The value will vary with NGG culling.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
41fef6fc09
radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough
...
v2: TES doesn't use the GS PrimitiveID
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
42112010a3
radeonsi: rename si_shader_create -> si_create_shader_variant for clarity
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
03950473df
radeonsi: merge si_tessctrl_info into si_shader_info
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
5fa2ab831e
radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
363b4027fc
radeonsi: put up to 5 VBO descriptors into user SGPRs
...
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs
We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.
Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
312e04689a
radeonsi: don't allow draw calls with uninitialized VS inputs
...
These always hang, because vertex buffer descriptors are not set up.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00