Commit graph

208499 commits

Author SHA1 Message Date
Marek Olšák
f6aecfb886 ac/llvm: don't declare LDS as an array for HS & GS & CS, use IntToPtr(0)
We don't need all this stuff anymore.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
5ded4f3c7d aco: remove unused aco_symbol_lds_ngg_gs_out_vertex_base
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
404d242809 radeonsi: use shader_info::next_stage correctly
Separate shaders have next_stage == MESA_SHADER_NONE.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
30676319c7 radeonsi: remove all uses of NIR_PASS_V
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
ab8b5499bc radeonsi: add a comment about early prim exports
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
ece9b47196 radeonsi: determine compute shader LDS size from NIR instead of LLVM
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
24260644e8 radeonsi: remove now unused LLVM LDS logic for NGG
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
65c5ee1628 radeonsi: stop using LLVM LDS linking logic for the GS out LDS offset
This will enable large code removal.

shader->config.lds_size is now always computed the same as ACO except for
compute shaders.

We have to add a new 8-bit user SGPR bitfield called
GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset
that was previously set by the relocation.

Since the offset must be a multiple of 256, we have to add padding
to the LDS size computation to make sure the alignment to 256 for the ESGS
LDS size doesn't cause us to exceed the maximum LDS size.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
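The alignment and padding reasoning in 65c5ee1628 can be illustrated with a small sketch. This is not the driver's code: the helper names and the layout (ES/GS data first, GS output data starting at the next 256-byte boundary) are assumptions made for illustration. It only shows why an offset that is a multiple of 256 fits an 8-bit field when stored in 256-byte units, and why the alignment padding has to be counted in the total LDS size.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment` (must be a power of two). */
static uint32_t align_up_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: express a 256-byte-aligned byte offset in 256B units
 * so that it fits in an 8-bit field. */
static uint8_t encode_offset_256b(uint32_t offset_bytes)
{
   assert(offset_bytes % 256 == 0);
   assert(offset_bytes / 256 <= UINT8_MAX);
   return (uint8_t)(offset_bytes / 256);
}

/* Hypothetical layout: ES/GS data first, GS output data starting at the
 * next 256-byte boundary. Returns the total size including the padding. */
static uint32_t lds_total_with_padding(uint32_t esgs_bytes, uint32_t gs_out_bytes,
                                       uint32_t *gs_out_offset)
{
   *gs_out_offset = align_up_pot(esgs_bytes, 256);
   return *gs_out_offset + gs_out_bytes;
}
```

With 300 bytes of ES/GS data and 100 bytes of GS output data, the GS output offset becomes 512 (encoded as 2) and the total is 612 bytes rather than 400, which is the extra space the size computation has to budget for.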
Marek Olšák
fbbf029529 radeonsi: enable 16-bit mediump IO for PS outputs only, and VS->PS with env var
It has been implemented and works for PS outputs already.

The lowering callback needs 2 variants because we can't access
pipe_screen from it. The callback is rewritten to be more general.

We also need to do nir_clear_mediump_io_flag for any outputs we don't
lower because the mediump flag might prevent optimizations if it's not
cleared.

v2: fix si_nir_optim

Acked-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
5a7ff54aaa radeonsi: remove gs_input_verts_per_prim from si_shader_info
It can be computed from input_primitive.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
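As a rough illustration of why 5a7ff54aaa can drop the field, the vertex count per input primitive follows directly from the geometry shader input primitive type. The enum and function below are hypothetical stand-ins, not Mesa's actual types or helpers; the counts themselves are the standard ones for geometry shader inputs.

```c
/* Hypothetical enum for illustration; Mesa's real enum and helper differ. */
enum gs_input_prim {
   GS_IN_POINTS,
   GS_IN_LINES,
   GS_IN_TRIANGLES,
   GS_IN_LINES_ADJACENCY,
   GS_IN_TRIANGLES_ADJACENCY,
};

/* Number of vertices a geometry shader receives per input primitive. */
static unsigned gs_input_verts_per_prim(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_IN_POINTS:              return 1;
   case GS_IN_LINES:               return 2;
   case GS_IN_TRIANGLES:           return 3;
   case GS_IN_LINES_ADJACENCY:     return 4;
   case GS_IN_TRIANGLES_ADJACENCY: return 6;
   }
   return 0; /* unknown primitive type */
}
```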
Marek Olšák
1a197aa057 radeonsi: remove unused output_type and output_usage from si_shader_info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
58f12b3c81 radeonsi: don't count outputs with GS streams > 0 for outputs_written_before_ps
outputs_written_before_ps is used to determine kill_outputs, which removes
param exports, but non-zero GS streams are xfb-only and not exported.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
0b3b105bde radeonsi: use si_assign_param_offsets for legacy GS too
The result of that function was overwritten by other code, so just remove it.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
cc497fd0e4 radeonsi: move gfx10_shader_ngg.c contents into si_shader.c
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
d3c1c638c4 radeonsi: cull against cull distances in the shader and don't export them
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
5b5addd9e9 radeonsi: enable culling against clip/cull distances and clip planes in GS
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
cee54211df radeonsi: reduce the size of 2 fields in si_shader_variant_info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
45acb5857d radeonsi: pack clip/cull distance export components
This removes unused and no-op clip/cull distance components, though
it's not very common.

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
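The packing idea in 45acb5857d can be sketched as follows, assuming an 8-bit mask with one bit per clip/cull distance component and vec4-sized exports. The function names and the unpacked layout are assumptions for illustration, not the driver's implementation. Surviving components are moved into consecutive slots, so holes in the mask no longer cost an extra export.

```c
#include <stdint.h>

/* Count the set bits of an 8-bit component mask. */
static unsigned count_bits8(uint8_t mask)
{
   unsigned n = 0;
   while (mask) {
      n += mask & 1u;
      mask >>= 1;
   }
   return n;
}

/* Hypothetical unpacked layout: components 0-3 in one vec4, 4-7 in another. */
static unsigned num_vec4_exports_unpacked(uint8_t mask)
{
   if (mask & 0xf0)
      return 2;
   return mask ? 1 : 0;
}

/* After packing, only the number of surviving components matters. */
static unsigned num_vec4_exports_packed(uint8_t mask)
{
   return (count_bits8(mask) + 3) / 4;
}

/* Slot of component `comp` after packing, or -1 if it was removed. */
static int packed_slot(uint8_t mask, unsigned comp)
{
   if (comp >= 8 || !(mask & (1u << comp)))
      return -1;
   return (int)count_bits8((uint8_t)(mask & ((1u << comp) - 1)));
}
```

For a mask of 0x11 (only components 0 and 4 survive), the unpacked layout needs two vec4 exports while the packed layout needs one, with component 4 landing in slot 1.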
Marek Olšák
fec40557d3 radeonsi: use nir_opt_clip_cull_const
It eliminates no-op (>= 0) clip/cull distance output components by setting
no_sysval_output = true.

We have to gather clip/cull distances manually to get reduced clip/cull
masks.

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
c743c3dd1a radeonsi: support 8 non-ClipVertex clip planes instead of 6
If there are more than 6 planes without gl_ClipVertex and gl_ClipDistance,
add "gl_ClipVertex = gl_Position;" to support up to 8 planes.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
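For context on c743c3dd1a: with user clip planes, the distance for each enabled plane is the dot product of the plane equation with gl_ClipVertex, so injecting "gl_ClipVertex = gl_Position;" gives the shader a vertex to evaluate all 8 planes against. The CPU-side sketch below only illustrates that arithmetic; the names are hypothetical and it is not how the driver lowers the shader.

```c
#define MAX_CLIP_PLANES 8

/* 4-component dot product. */
static float dot4(const float a[4], const float b[4])
{
   return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Hypothetical helper: compute one clip distance per enabled plane.
 * Disabled planes get 0.0f, which never clips anything. */
static void compute_clip_distances(const float planes[MAX_CLIP_PLANES][4],
                                   unsigned enabled_mask,
                                   const float clip_vertex[4],
                                   float out[MAX_CLIP_PLANES])
{
   for (unsigned i = 0; i < MAX_CLIP_PLANES; i++) {
      if (enabled_mask & (1u << i))
         out[i] = dot4(planes[i], clip_vertex);
      else
         out[i] = 0.0f;
   }
}
```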
Marek Olšák
1b594e6745 radeonsi: gather nr_pos_exports from the final NIR
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
2c0eb09e39 radeonsi: simplify old_vs & old_ps checking in si_update_shaders
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
e73f70e135 radeonsi: add si_shader_variant_info::clip/culldist_mask
so that it can be different between shader variants

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Valentine Burley
391c40f9fc freedreno/ci: Add ASan jobs on a618
Introduce nightly Address Sanitizer jobs for GLES and Vulkan on a618.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053>
2025-07-12 09:21:03 +00:00
Valentine Burley
08152633fb ci/lava: Add arm64 ASan job templates
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053>
2025-07-12 09:21:03 +00:00
Valentine Burley
201ac3bf49 turnip/ci: Skip Vulkan Video tests
Vulkan Video isn't supported, since video isn't part of the GPU.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053>
2025-07-12 09:21:02 +00:00
Georg Lehmann
92d433c54a aco: vectorize conversions from 8bit to 16bit
Massively helps emulated fp8 performance.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:15 +00:00
Georg Lehmann
7fece5592c aco: vectorize 16bit extracts
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:14 +00:00
Georg Lehmann
a045e9a624 ac/nir: lower uniform extract_i8/u8 to 32bit
To prevent vectorizing this later.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:13 +00:00
Georg Lehmann
2cc3e1876c ac/llvm: support vec2 extract
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:13 +00:00
Julia Zhang
d34b069e9b radeonsi: small fixes of radeonsi renderstage
Remove redundant lines in si_perfetto.cpp.

Convert offset_B of buffer gpu_address to uint64_t ptr offset when
si_buffer_map is being called to read timestamp.

Destroy sctx->trace by calling u_trace_fini in si_utrace_fini which
will be called in si_destroy_context.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36066>
2025-07-12 07:28:46 +00:00
Marek Olšák
f8918ed6c6 radv: stop using LLVM LDS linking logic
Not needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
44dd39d121 radv: pack clip and cull distance outputs for both legacy and NGG pipelines
This increases primitive throughput when packing reduces the number
of pos exports due to holes in clip and cull distance arrays that could be
punched out by nir_opt_clip_cull_const. This applies to all chips.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
2751d488ce radv: enable nir_opt_clip_cull_const for GS too
The pass also supports GS now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
bdcfe15457 radv: don't export cull distances if the shader culls against them
This increases primitive throughput for all hw with NGG if the shader
culls and the removal of cull distances reduces the number of position
exports.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
0cce0505cc radv: compute the number of position outputs after compilation
It will be different between NGG and legacy because NGG with culling
will not export cull distances.

The number of position exports could also be gathered from final NIR
to reduce logic duplication.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
21646b0124 radv: don't include position exports in pipeline executable stats
It will be different between NGG and legacy because NGG with culling
will not export cull distances.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:04 +00:00
Marek Olšák
88a1c1f881 radv: enable NGG culling for GS
This is very useful for increasing raw primitive throughput for GS
(mostly just RDNA 2), increasing raw primitive throughput with clip
and cull distance outputs when they actually cull anything (RDNA 1-4),
and reducing attribute store bandwidth usage (RDNA 3-4).

It will also replace fixed-func culling against cull distances when
culling in the shader is enabled, which will increase primitive throughput
even further.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:04 +00:00
Marek Olšák
ae4d539540 radv: rework radv_link_shaders_info so as not to be called in a loop
It receives all shaders and decides how to link them.

When culling is enabled for GS, we will need ES, GS, and FS in this
function at the same time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
Marek Olšák
b97c4bfd58 radv: enable W/front/back face NGG culling with multiple viewports
This is supported.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
Marek Olšák
89e1ec92c5 radv: cull against clip and cull distances in the shader
Clip and cull distance outputs decrease primitive throughput, so culling
against them in the shader has even more benefit than other culling
options.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
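The culling test that 89e1ec92c5 refers to can be sketched on the CPU as follows. A primitive can be discarded when, for at least one clip or cull distance, every one of its vertices has a negative value. The function name and array layout are assumptions for illustration; the actual culling is done in the shader and is not reproduced here.

```c
#include <stdbool.h>

#define MAX_DISTANCES 8

/* Returns true if all vertices of the primitive are on the negative side
 * of at least one distance, i.e. the primitive is entirely outside. */
static bool prim_culled_by_distances(const float dist[][MAX_DISTANCES],
                                     unsigned num_verts, unsigned num_dists)
{
   for (unsigned d = 0; d < num_dists; d++) {
      bool all_outside = num_verts > 0;

      for (unsigned v = 0; v < num_verts; v++) {
         if (!(dist[v][d] < 0.0f)) {
            all_outside = false;
            break;
         }
      }
      if (all_outside)
         return true;
   }
   return false;
}
```

A triangle whose three vertices all have a negative value for distance 0 is culled even if distance 1 is positive everywhere; a triangle with mixed signs for every distance is kept.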
Marek Olšák
ae78e8d198 ac/nir: handle VARYING_SLOT_VARn_16BIT the same as other slots
They are the same as regular VARn.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:02 +00:00
Marek Olšák
762fdf8236 ac/nir: fix mediump XFB
The previous code was completely wrong and untested. This is tested.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:02 +00:00
Marek Olšák
56f80479fc ac/nir: remove unnecessary 16-bit handling from pre-rast GS and XFB loads/stores
All callers always pass 32 bits in there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:02 +00:00
Marek Olšák
65972f2301 ac/nir: return GSVS emit sizes from legacy GS lowering and simplify shader info
This simplifies shader info in drivers by returning GSVS emit sizes from
ac_nir_lower_legacy_gs. The pass knows the sizes, so drivers shouldn't
have to determine them independently.

This also makes the values more accurate because both drivers were
computing the GSVS emit sizes inaccurately and had redundant fields
in shader info. RADV had a lot of redundancy there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:02 +00:00
Marek Olšák
c1d3108855 radv: call radv_get_legacy_gs_info after ac_nir_lower_legacy_gs
The pass will determine the GSVS ring size, so radv_get_legacy_gs_info
must be called after that.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:01 +00:00
Marek Olšák
76ce37058d radv: set the maximum possible workgroup size for legacy GS before linking
The optimal workgroup size will be set after lowering.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:00 +00:00
Marek Olšák
d674e97d5c radv: use shared ac_legacy_gs_compute_subgroup_info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:00 +00:00
Marek Olšák
8a1e357f71 radv: use shared ac_ngg_compute_subgroup_info
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12496

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:19:59 +00:00
Connor Abbott
a3a53b7cee tu: Implement VK_VALVE_fragment_density_map_layered
In order to implement the extension we have to override the last
pre-rasterization shader to inject "gl_ViewportIndex = gl_Layer" at the
end, because there is no layered rendering equivalent to the
VIEWPORTINDEXINCR bit that adds gl_ViewIndex to gl_ViewportIndex in HW.
We also have to deal with the case where layered rendering is enabled
but the bit isn't set, in which case patchpoints that depend on the view
will see num_views = 1 but the patchpoint is for a higher view (aka
layer). This requires changing all of the patchpoints to handle this
case. Finally we have to change a number of cases which needed the
number of FDM layers to stop using num_views directly from the
renderpass and take into account whether per-layer rendering is enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35594>
2025-07-11 22:05:20 +00:00