Commit graph

258 commits

Author SHA1 Message Date
Marek Olšák
5003465c42 radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs
Compile a monolithic optimized shader to do that, and clean up the comments.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
58132d6fc8 radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass
This uses si_shader_info to store whether gl_FragDepth can be removed,
and it uses the kill_z epilog flag to do the removal without recompilation.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
b56f47611a radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3
It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1
to indicate that alpha for alpha-to-coverage should be read from mrtz.a.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
08abddd235 radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together
alpha-to-coverage must be applied before alpha-to-one. The only way to do
that is to export alpha for alpha-to-coverage via mrtz, and export 1 via
mrt0.a.

ACO and monolithic shader support is already in place thanks to RADV,
so we only need to change the LLVM PS epilog and the shader key.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
de996ac481 radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled
This adds kill_z and kill_stencil flags to the shader PS epilog key, which
removes those outputs if depth or stencil are disabled.

It must be implemented in:
* ACO PS epilog
* LLVM PS epilog
* ac_nir_lower_ps for monolithic shaders

Some of the samplemask code wasn't completely correct, but probably harmless.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
823e9e846e radeonsi: switch to the new TCS LDS/offchip size computation
The new TCS LDS size should be less than what it was before.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>
2024-12-18 11:07:59 +00:00
Marek Olšák
85c20def94 ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads
Cross-invocation TCS input access doesn't prevent same-invocation access.
This improves shaders that use both for the same inputs.

Also, if some components of a vec4 slot only use same-invocation access and
other components only use cross-invocation access (it's possible after
compaction), this takes the VGPR path for the components with
same-invocation access, which didn't happen previously because all masks
only describe whole vec4s.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>
2024-12-18 11:07:59 +00:00
Alyssa Rosenzweig
41076b2a55 radeonsi: use mesa_prim_has_adjacency
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Marek Olšák
680f7afe0b radeonsi: don't use nir_io_dont_optimize because it's deprecated
There is a new environment variable that can be used instead.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:30 +00:00
Marek Olšák
06292538ae radeonsi: add helper si_shader_culling_enabled
it will contain more logic

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:29 +00:00
Marek Olšák
d7415d3717 radeonsi: clean up and rename gfx10_edgeflags_have_effect
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:29 +00:00
Marek Olšák
6988967a1f radeonsi: rewrite/replace gfx10_ngg_get_vertices_per_prim
Reuse si_get_input_prim (which is similar) and split it into 2 functions:
- si_get_output_prim_simplified
- si_get_num_vertices_per_output_prim

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:29 +00:00
Marek Olšák
963a84677e radeonsi: optionally return MESA_PRIM_UNKNOWN from si_get_input_prim
it will be used later

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:29 +00:00
Marek Olšák
691a9ccb33 radeonsi: prepare for making SI_NGG_CULL_TRIANGLES/LINES VS only, rename them
They will have no effect on TES and GS, so this will make it more obvious.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32257>
2024-11-20 21:08:29 +00:00
Marek Olšák
51aa1d8381 radeonsi: fix gl_FrontFace elimination when one side is culled
Fixes: 55d81214c9 - radeonsi: replace gl_FrontFacing with a constant if one side is always culled

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>
2024-11-19 11:43:20 +00:00
Marek Olšák
5be9d76861 radeonsi: fix an assertion failure in si_shader_ps with AMD_DEBUG=mono
assert(!shader->key.ps.part.prolog.force_persp_center_interp ||
       (!G_0286CC_PERSP_SAMPLE_ENA(input_ena) && !G_0286CC_PERSP_CENTROID_ENA(input_ena)));
failed when all FS inputs have been eliminated by optimizations, which
causes LLVM to set PERSP_SAMPLE_ENA because at least 1 of those must be
enabled, which this code didn't expect.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>
2024-11-19 11:43:20 +00:00
Marek Olšák
8deb32ac2e radeonsi: split outputs_written_before_tes_gs into ls_es_* and tcs_* masks
these will have different values later

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>
2024-11-16 21:59:29 -05:00
Marek Olšák
40d9616bd3 radeonsi: don't pad esgs_vertex_stride if it's 0
so that we don't allocate any LDS for ES->GS varyings if it's unused.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31291>
2024-09-20 19:49:44 +00:00
Marek Olšák
ce72376641 radeonsi: rename SI_CONTEXT_* flags to SI_BARRIER_* flags
some of the definition names are changed completely

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>
2024-09-17 20:44:58 +00:00
Marek Olšák
834aa812ea radeonsi: rename si_context::flags -> barrier_flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>
2024-09-17 20:44:58 +00:00
Marek Olšák
dac99e75af radeonsi: rename "cache_flush" -> "barrier"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31193>
2024-09-17 20:44:58 +00:00
Marek Olšák
1d5ffb13d6 radeonsi: add ACQUIRE_MEM, RELEASE_MEM PWS packet helpers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31168>
2024-09-14 11:03:44 -04:00
Marek Olšák
1a1138817c radeonsi: add a new PM4 helper radeon_event_write
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31168>
2024-09-14 11:03:44 -04:00
Marek Olšák
b7136d0890 radeonsi: pass TCS inputs_read mask to LS output lowering on GFX9 + monolithic
This will allocate less LDS for LS outputs if there are holes between
varyings when we have monolithic merged LS+TCS. (it removes the holes)

There are 2 steps to this:
- add helper si_shader_lshs_vertex_stride and use it everywhere
- pass the TCS inputs_read bitmask instead of the "map" callback
  to si_lower_ls_outputs_mem

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
2024-09-05 19:54:29 +00:00
Samuel Pitoiset
80e8e18cc6 ac: add ac_gfx103_get_cu_mask_ps()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Qiang Yu
1ee612e1ac radeonsi: use wave64 for KHR_shader_subgroup enabled shader
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:20 +08:00
Qiang Yu
a78d1d49e6 radeonsi: consider both stages to determine merged shader wave_size
Previously we determine wave_size of merged shader stages separately,
and ignore the condition which may cause them to be different.

Now we determine the wave_size of the TCS/GS part first, then use the
wave_size for VS/TES part. So that we can condider the previous shader
stage's information when determine the wave_size of TCS/GS, and two
stages in the merged shader can affect each other's wave_size.

This requires si_shader_selector to have two kinds of main part for
wave32 and wave64 when part mode, to be combined with other shader
part with various wave size.

This also enables merged shader stages with different
si_shader_info->has_divergent_loop to use wave32. We'll add another
condition for KHR_shader_subgroup latter.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:20 +08:00
Qiang Yu
196d91ed78 radeonsi: remove NULL check in si_determine_wave_size
This function is always called with non-NULL shader now.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:20 +08:00
Marek Olšák
0e27df4521 radeonsi/gfx12: fix VS output corruption with streamout
We increased VS_EXPORT_COUNT to 8 for streamout in gfx10_shader_ngg,
but we forgot to increase the attribute ring stride, causing all waves
except the first one to get corrupted VS outputs.

Fixes: f703dfd1bb - radeonsi: add gfx12

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
2024-08-05 19:35:39 +00:00
Marek Olšák
a5b4ae67ae ac: add radeon_info::has_scratch_base_registers
Fixes: 3b0bfd254f - radeonsi/gfx11: make flat_scratch changes for compute

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30071>
2024-07-15 13:52:25 -04:00
Marek Olšák
c353394a21 radeonsi: replace si_shader::scratch_bo with scratch_va, don't set it on gfx11+
This removes the unnecessary buffer reference and improves this fragile
code.

Fixes: 3b0bfd254f - radeonsi/gfx11: make flat_scratch changes for compute
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11463

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30071>
2024-07-15 13:52:18 -04:00
Pierre-Eric Pelloux-Prayer
a7a1e3d329 radeonsi: fix crash in si_update_tess_io_layout_state for gfx8 and earlier
si_set_patch_vertices was only called if tcs.current was non-NULL but
this condition is not enough for GFX9+ since vs is used as ls.

Add a check in si_update_tess_io_layout_state instead, and set
sctx->do_update_shaders for case where the ls_current is not yet
available.
This fix crashes on GFX6.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29876>
2024-07-09 15:07:27 +02:00
Patrick Lerda
301a3bacce radeonsi: fix assert triggered on gfx6 after the tessellation update
This change updates the affected calls to the proper function
which is radeon_set_config_reg().

For instance, this issue is triggered with
"piglit/bin/textureSize tes isampler2DMSArray -auto -fbo":
vertex-program-two-side: ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:4981: void si_emit_spi_ge_ring_state(si_context*, unsigned int): Assertion `(0x008988) >= CIK_UCONFIG_REG_OFFSET && (0x008988) < CIK_UCONFIG_REG_END' failed.

Fixes: bd71d62b8f ("radeonsi: program tessellation rings right before draws")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29645>
2024-06-11 14:01:21 +00:00
Marek Olšák
fe7a4ed708 radeonsi: use shader_info::use_aco_amd to determine whether to use ACO
It's set by si_nir_scan_shader, so we need to use it after that.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Samuel Pitoiset
428601095c ac,radeonsi import PM4 state from RadeonSI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>
2024-06-06 20:26:47 +00:00
Mike Blumenkrantz
2aaa6ebba1 build/amd: add amd-use-llvm build option
this allows amd drivers to disable llvm support while still allowing
llvmpipe/lavapipe to be built

by disabling llvm support in amd drivers, the load times for these drivers
decreases by 5-10ms

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969>
2024-05-30 19:05:00 +00:00
Marek Olšák
90b0925588 radeonsi: constify struct pipe_vertex_buffer *
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
2024-05-15 06:42:34 +00:00
Marek Olšák
96cf96f611 radeonsi: serialize shader disassembly string to fix asm dumps for ACO
Shaders loaded from the shader cache should be printable. Before this,
sometimes only "(null)" was printed.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
2024-05-15 06:42:33 +00:00
Samuel Pitoiset
8b85c58429 radeonsi: remove the _unused parameter in all radeon_xxx macros
I plan to re-use all these macros in RADV, mostly for GFX11 paired
packets and for GFX12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29130>
2024-05-13 12:24:18 +00:00
Marek Olšák
f703dfd1bb radeonsi: add gfx12
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29007>
2024-05-11 22:14:06 -04:00
Marek Olšák
256cc77f84 radeonsi: don't add whether NIR is used into the shader key
This is from when we had TGSI and NIR was a debug option.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Marek Olšák
38f74d6277 radeonsi: disable VRS flat shading for selected 8xMSAA and thick tiling cases
for better slow clear performance

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Marek Olšák
26a5955821 radeonsi: change allow_flat_shading to make it a single condition
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Marek Olšák
aad2302cf5 radeonsi: move TCS epilog key bits to the key->ge.opt section
Since the TCS epilog is no more, this is required to apply those bits
to monolithic shaders.

tessfactors_are_def_in_all_invocs was unused.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Marek Olšák
a094339d64 radeonsi: add the radeonsi_optimize_io option into the shader cache key
otherwise the options would be ignored if the shader cache had already
cached the same shader with the option inverted.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Marek Olšák
d478693dc6 radeonsi/gfx11: don't prefetch constants in binaries into the instruction cache
Only prefetch shader instructions. There will be more GFX versions
in that list.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
2024-04-24 19:17:10 +00:00
Samuel Pitoiset
758e6d9005 ac,radeonsi: add helpers to compute the number of tess patches/lds size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28015>
2024-04-23 17:20:40 +00:00
Timur Kristóf
b34e99d021 radeonsi: Use one more bit for number of patches in TCS offchip layout.
There was 1 more bit left, may as well use it for something.
In the future, this may allow increasing the maximum number of
patches per workgroup.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 21:56:48 +01:00
Timur Kristóf
04dea4aef2 radeonsi: Remove tess bits from VS state.
These parts are not used anymore, therefore we no longer need to
change the VS state when tessellation states change.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 21:56:45 +01:00
Timur Kristóf
b82614e06b radeonsi: Add number of VS outputs to TCS output layout.
Use tcs_offchip_layout instead of VS state to determine the
number of LS outputs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 21:56:42 +01:00