This uses si_shader_info to store whether gl_FragDepth can be removed,
and it uses the kill_z epilog flag to do the removal without recompilation.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1
to indicate that alpha for alpha-to-coverage should be read from mrtz.a.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
alpha-to-coverage must be applied before alpha-to-one. The only way to do
that is to export alpha for alpha-to-coverage via mrtz, and export 1 via
mrt0.a.
ACO and monolithic shader support is already in place thanks to RADV,
so we only need to change the LLVM PS epilog and the shader key.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
This adds kill_z and kill_stencil flags to the shader PS epilog key, which
removes those outputs if depth or stencil are disabled.
It must be implemented in:
* ACO PS epilog
* LLVM PS epilog
* ac_nir_lower_ps for monolithic shaders
Some of the samplemask code wasn't completely correct, but probably harmless.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
Cross-invocation TCS input access doesn't prevent same-invocation access.
This improves shaders that use both for the same inputs.
Also, if some components of a vec4 slot only use same-invocation access and
other components only use cross-invocation access (it's possible after
compaction), this takes the VGPR path for the components with
same-invocation access, which didn't happen previously because all masks
only describe whole vec4s.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>
assert(!shader->key.ps.part.prolog.force_persp_center_interp ||
(!G_0286CC_PERSP_SAMPLE_ENA(input_ena) && !G_0286CC_PERSP_CENTROID_ENA(input_ena)));
failed when all FS inputs have been eliminated by optimizations, which
causes LLVM to set PERSP_SAMPLE_ENA because at least 1 of those must be
enabled, which this code didn't expect.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32186>
This will allocate less LDS for LS outputs if there are holes between
varyings when we have monolithic merged LS+TCS. (it removes the holes)
There are 2 steps to this:
- add helper si_shader_lshs_vertex_stride and use it everywhere
- pass the TCS inputs_read bitmask instead of the "map" callback
to si_lower_ls_outputs_mem
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
Previously we determine wave_size of merged shader stages separately,
and ignore the condition which may cause them to be different.
Now we determine the wave_size of the TCS/GS part first, then use the
wave_size for VS/TES part. So that we can condider the previous shader
stage's information when determine the wave_size of TCS/GS, and two
stages in the merged shader can affect each other's wave_size.
This requires si_shader_selector to have two kinds of main part for
wave32 and wave64 when part mode, to be combined with other shader
part with various wave size.
This also enables merged shader stages with different
si_shader_info->has_divergent_loop to use wave32. We'll add another
condition for KHR_shader_subgroup latter.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
We increased VS_EXPORT_COUNT to 8 for streamout in gfx10_shader_ngg,
but we forgot to increase the attribute ring stride, causing all waves
except the first one to get corrupted VS outputs.
Fixes: f703dfd1bb - radeonsi: add gfx12
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
si_set_patch_vertices was only called if tcs.current was non-NULL but
this condition is not enough for GFX9+ since vs is used as ls.
Add a check in si_update_tess_io_layout_state instead, and set
sctx->do_update_shaders for case where the ls_current is not yet
available.
This fix crashes on GFX6.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29876>
This change updates the affected calls to the proper function
which is radeon_set_config_reg().
For instance, this issue is triggered with
"piglit/bin/textureSize tes isampler2DMSArray -auto -fbo":
vertex-program-two-side: ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:4981: void si_emit_spi_ge_ring_state(si_context*, unsigned int): Assertion `(0x008988) >= CIK_UCONFIG_REG_OFFSET && (0x008988) < CIK_UCONFIG_REG_END' failed.
Fixes: bd71d62b8f ("radeonsi: program tessellation rings right before draws")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29645>
this allows amd drivers to disable llvm support while still allowing
llvmpipe/lavapipe to be built
by disabling llvm support in amd drivers, the load times for these drivers
decreases by 5-10ms
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28969>
Shaders loaded from the shader cache should be printable. Before this,
sometimes only "(null)" was printed.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
Since the TCS epilog is no more, this is required to apply those bits
to monolithic shaders.
tessfactors_are_def_in_all_invocs was unused.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
otherwise the options would be ignored if the shader cache had already
cached the same shader with the option inverted.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
There was 1 more bit left, may as well use it for something.
In the future, this may allow increasing the maximum number of
patches per workgroup.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
These parts are not used anymore, therefore we no longer need to
change the VS state when tessellation states change.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
Use tcs_offchip_layout instead of VS state to determine the
number of LS outputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>