The Wave32 pass manager has been removed a while ago.
Fixes: 94a1f45e15 ("ac/llvm: set target features per function instead of per target machine")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12833>
Some tokens can be excluded without instruction timing. This reduces
RGP capture sizes significantly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12853>
Additional work is needed for storage images with DCC without DCC image stores to not be broken.
Fixes black screens in Doom Eternal.
Fixes: #5345
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12818>
By replacing the 48-byte ralloc header with our exec_node gc_node (16
bytes), runtime of shader-db on my system across this series drops
-4.21738% +/- 1.47757% (n=5).
Inspired by discussion on #5034.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
Right now we're using ralloc to GC our NIR instructions, but ralloc has
significant overhead for its recursive nature so it would be nice to use a
simpler mechanism for GCing instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
We were using the ralloc parent in some places, which should work out to
be the shader I think, but to de-ralloc the instrs we should just pass the
existing shader pointer in.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
This code was being tricky with passing a mem_ctx instead of the shader,
then freeing the mem_ctx when the pass was done and all the parallel
copies had been removed from the shader. Use the right type for instr
creation and do a bit of manual list management to prepare the way for
non-ralloc NIR instrs.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
Calling this lower pass twice in a row would cause spurious
set_vertex_and_primitive_count(0, undef) intrinsics after the proper
set_vertex_and_primitive_count intrinsic. This pretty much turns any
geometry shader into garbage.
Fix this by treating nir_intrinsic_emit_vertex_with_counter and
nir_intrinsic_end_primitive_with_counter just like the non-_with_counter
versions. If no blocks would need set_vertex_and_primitive_count
intrinsics added, exit the pass before doing any work. This prevents
the need for DCE to do extra clean up later.
Since this pass is potentially called multiple times via multiple
invocations of a finalize_nir callback, it is (hypothetically?) possible
that control flow could be changed to add new blocks that need this
intrinsic. The check implemented in this commit should be robust
against that possibility.
v2: Add a_block_needs_set_vertex_and_primitive_count. Suggested by
Timur.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12802>
This replaces vs_output_param_offset by vs_output_ps_input_cntl,
which is easier to use.
For geometry shaders, vs_output_ps_input_cntl is stored in the GS si_shader
structure, not gs_copy_shader. This requires that gs_copy_shader compilation
is finished before the GS main shader part, so that GS can initialize
vs_output_ps_input_cntl using the compiled GS copy shader.
output_semantic_to_slot becomes unused, so it's removed.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
This decreases overhead of si_update_shaders and overall driver overhead.
The VS shader key portion related to VS inputs is updated in set & bind
functions. Other fields related to outputs are still updated
in si_shader_selector_key.
Now that all modified fields are set to 0 when not needed, and remove
the memsets.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
This decreases overhead of si_update_shaders and overall driver overhead.
There is only one function that depends on the rasterized primitive type,
and thus it can't be moved to set & bind functions.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>