Commit graph

195 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
b55a2065e0 radeonsi/sqtt: rework pm4.reg_va_low_idx
The initial logic was to remember the place were SPI_SHADER_PGM_LO_*
are written, then assume that we can get the register offset because
the sequence would always be:

   PKT3_SET_SH_REG
   SPI_SHADER_PGM_LO_* register offset
   VA low 32 bits value <- reg_va_low_idx

The problem is that this sequence isn't guaranteed, for instance we
can get this instead:

   0   c0067600 |
   1   00000046 |
   2   003ffffd | SPI_SHADER_PGM_RSRC3_VS
   3   00000020 | SPI_SHADER_LATE_ALLOC_VS
   4 * 00002080 | SPI_SHADER_PGM_LO_VS
   5   00000080 | SPI_SHADER_PGM_HI_VS

So the assert in si_state_draw.cpp would fail as well as the VA
update logic.

So instead remember which the SPI_SHADER_PGM_LO_* offset, and the low
32 bits of the VA in si_update_shaders.

Fixes: 8034a71430 ("radeonsi/sqtt: re-export shaders in a single bo")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
2023-12-20 12:23:27 +00:00
Bas Nieuwenhuizen
d04ee07712 radeonsi: Add support to clear LDS at the end of a shader.
No hash updates as I didn't find a facility to do it in radeonsi
(even though there are flags like forcing fma32).

Note that we do this very late to avoid any optimizations that
might remove the dead stores. (Checked that LLVM doesn't remove
them, but it is admittedly potentially brittle)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
2023-12-20 09:15:45 +00:00
Marek Olšák
197af03698 radeonsi: add PS input info into si_shader_binary_info
It will be modified to reflect PS inputs after uniform inlining.
For now, it's just a copy of selector->info.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
c77bcf00a3 radeonsi/gfx11: prefer Wave64 for VS/TCS/TES/GS because it's slightly faster
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
257f07f499 radeonsi: clean up how debug flags and shader profiles determine the wave size
- remove DBG_W32_PS_DISCARD
- just return the wave size instead of setting local variables dbg_wave_size
  and profile_wave_size

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
716b521515 radeonsi/gfx11: disable the shader profile for Medical that disables binning
GFX11 performs better with the default behavior.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
f85488824e radeonsi/gfx11: disable the shader profile for Medical that forces Wave64
GFX10 should keep using it, but not GFX11.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
65b3b0b355 radeonsi/gfx11: prefer Wave64 for PS without inputs for better VALU perf
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
55d81214c9 radeonsi: replace gl_FrontFacing with a constant if one side is always culled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
bf7debee82 radeonsi: in bind_{blend,rs}_state, only call 1 update function per if
Also don't use "key.ps.part.prolog.color_two_side" during updates
because it would depend on the order the update functions are called,
which is not a problem now, but it's a trap for the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
53aa36772a radeonsi: rewrite si_get_total_colormask as si_any_colorbuffer_written
The result is only used as bool.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
e2b817b948 radeonsi: rewrite how shader key bits dependent on current_rast_prim are updated
Don't set do_update_shaders every time current_rast_prim changes, which can
be EVERY DRAW. Instead, just update the shader key bits and set
do_update_shaders only if any bits are different.

When we bind a new rasterizer state, do the same.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
4ab5374ec3 radeonsi: clean up setting poly/line/stipple shader key bits
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
f9c4ac3477 radeonsi: update shaders for rasterizer state only if the shader key changed
Check if any key bit changed before setting do_update_shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
613ea16aab radeonsi: update shaders for blend state only if the shader key changed
Check if any key bit or state changed before setting do_update_shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
c8411ddf17 radeonsi: change the low-priority compiler queue to normal priority
I'm guessing that low priority could cause us to get optimized shaders later
than we need.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
98e7a7123b radeonsi: don't set non-existent VGT_GS_MAX_PRIMS_PER_SUBGROUP on gfx10
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Pierre-Eric Pelloux-Prayer
945288ffae radeonsi: check sctx->tess_rings is valid before using it
Fixes: c89ca3b47f ("radeonsi: change si_emit_derived_tess_state into a state atom")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10015

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26190>
2023-11-25 15:33:03 +00:00
Marek Olšák
b07a58157d radeonsi: remove the LAYER output if the framebuffer state has only 1 layer
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26274>
2023-11-24 15:37:24 +00:00
Marek Olšák
3a0a3a5c35 radeonsi: implement gl_Layer in FS as a system value
This replaces the vec4 FS input with the Ancillary VGPR input.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26274>
2023-11-24 15:37:24 +00:00
Marek Olšák
130428e758 radeonsi: don't allocate output space for LAYER/VIEWPORT before TES and GS
The outputs are ignored according GL_ARB_shader_viewport_layer_array.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26274>
2023-11-24 15:37:24 +00:00
Marek Olšák
2ac6816b70 radeonsi/gfx11: use SET_CONTEXT_REG_PAIRS_PACKED for other states
It's used where registers are non-contiguous.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25941>
2023-11-10 18:03:57 -05:00
Marek Olšák
6a31c7a841 radeonsi: move SPI_SHADER_IDX_FORMAT into the preamble (it's immutable)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:44 +00:00
Marek Olšák
15293217e2 radeonsi: remove num_params variable from gfx10_shader_ngg
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:44 +00:00
Marek Olšák
8edb0c7038 radeonsi: move emitting VGT_TF_PARAM into gfx10_emit_shader_ngg
so that it's next to other registers instead of separated

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:44 +00:00
Marek Olšák
ac22440859 radeonsi: rename radeon_*push_*_sh_reg -> gfx11_*push_*_sh_reg
Those will only be used by gfx11.x.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:44 +00:00
Marek Olšák
df87c593f8 radeonsi: rewrite PM4 packet building helpers with less duplication
First, the following universal helpers are defined:
- radeon_set_reg_seq
- radeon_set_reg
- radeon_opt_set_reg
- radeon_opt_set_reg2
- radeon_opt_set_reg3
- radeon_opt_set_reg4
- radeon_opt_set_reg5
- radeon_opt_set_regn
- gfx11_push_sh_reg
- gfx11_opt_push_sh_reg

Then the config, context, sh, uconfig, push_gfx and push_compute helpers
are implemented calling the above.

A lot of macros were receiving sctx via a parameter, which is changed to
use sctx directly in the macro (and the parameter is renamed to "_unused").

The only functional change is that the perfctr registers that incorrectly
set the predicate bit now correctly set the RESET_FILTER_CAM bit.

The helpers no longer check info.uses_kernel_cu_mask.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:44 +00:00
Marek Olšák
b74d849a29 ac/gpu_info: split has_set_pairs_packets into context and sh flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26095>
2023-11-07 19:27:43 +00:00
Marek Olšák
12c239f829 radeonsi: various isolated cosmetic changes
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26055>
2023-11-05 14:06:56 -05:00
Marek Olšák
6708ccd3bf radeonsi: remove and inline si_shader::ngg::prim_amp_factor
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26055>
2023-11-05 12:39:42 -05:00
Marek Olšák
738babc67a radeonsi: inline si_allocate_gds and si_add_gds_to_buffer_list
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26055>
2023-11-05 12:39:42 -05:00
Qiang Yu
bad8fbe7f8 radeonsi: include ac_llvm_util.h when llvm available
Remove unused include.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25632>
2023-10-26 10:27:55 +08:00
Qiang Yu
032c592619 radeonsi: stop llvm context creation when use aco
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25632>
2023-10-26 10:27:55 +08:00
Qiang Yu
5bae345fb7 radeonsi: move llvm compiler alloc/free into create/destroy funcntion
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25632>
2023-10-26 10:27:55 +08:00
Marek Olšák
59e49cc6ab radeonsi: simplify/merge emit_shader_ngg functions
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24759>
2023-08-19 19:36:56 +00:00
Marek Olšák
1c82067b60 radeonsi: improve the heuristic when to use Wave32 for compute shaders
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24759>
2023-08-19 19:36:56 +00:00
Marek Olšák
e359254a19 radeonsi: allow setting any index in radeon_set_sh_reg_idx
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24759>
2023-08-19 19:36:56 +00:00
Marek Olšák
eb90fffa58 radeonsi: move GE_CNTL emission from si_draw into si_emit_vgt_pipeline_state
It doesn't depend on pipe_draw_info since pipe_context::set_patch_vertices
was added.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
1e4b539042 radeonsi: handle deferred cache flushes as a state (si_atom)
This allows us to remove a little bit of code from si_draw, and enable
removing more code in the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
c3129b2b83 radeonsi: add a simple version of si_pm4_emit_state for non-shader states
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
95cbdcee83 radeonsi: add index parameter into si_atom::emit
si_pm4_state will use si_atom, and both loops in si_emit_all_states will
be merged. This is a preparation for that because si_pm4_emit needs to know
the state index.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
7d67e10b02 radeonsi: remove splitting IBs that use too much memory
It was needed for r300, not so much for GCN/RDNA.
This reduces draw overhead.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
3a9de499b8 radeonsi: move si_emit_spi_map into si_state_shaders.cpp
to reduce the amount of code in si_state_draw.cpp.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Marek Olšák
e234c9fc21 radeonsi: move si_update/emit_tess_io_layout_state into si_state_shaders.cpp
to reduce the amount of code in si_state_draw.cpp.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24732>
2023-08-17 15:34:06 +00:00
Qiang Yu
85c0f31099 radeonsi: add exec_size to shader binary
Used by aco binary to split exec code and const data when combine
multi part shader binary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24443>
2023-08-16 11:25:28 +08:00
Mike Blumenkrantz
7672545223 gallium: move vertex stride to CSO
this simplifies code in most place and enables some optimizations in
frontends

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24117>
2023-08-14 01:23:25 +00:00
Marek Olšák
146a92dd9f radeonsi/gfx11: only use SET_*_PAIRS* packets on dGPUs
They are not available on APUs.

This adds a new template parameter HAS_PAIRS. into draw functions.
Other places add back the non-pairs code for gfx11.

Fixes: 22f3bcfb - radeonsi/gfx11: use SET_*_REG_PAIRS_PACKED packets for pm4 states
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9259

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24010>
2023-07-09 04:18:05 -04:00
Yonggang Luo
e53915828f treewide: Replace the usage of ubyte/ushort with uint8_t/uint16_t
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23577>
2023-06-27 18:18:29 +08:00
Marek Olšák
77f5b1cce0 radeonsi: clean up #includes
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23517>
2023-06-22 08:35:55 +00:00
Marek Olšák
56c787b36d radeonsi: declare compiler[] and nir_options as pointers to reduce #includes
so that we don't have to include the structure definitions.
(ac_llvm_compiler includes LLVM, and nir_options includes NIR)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23517>
2023-06-22 08:35:55 +00:00