fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 13:28:09 +02:00

Author	SHA1	Message	Date
Qiang Yu	d9df597042	ac,radv: move mesh_fast_launch_2 to ac To be shared with radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Marek Olšák	3bc31c307f	ac/nir: fix indexing GS inputs with non-constant vertex index on gfx9-11 This hasn't been reproducible because RADV and GLSL always lower non-constant slot and vertex indexing of GS inputs, but we'll stop lowering it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>	2025-07-10 16:37:45 +00:00
Daniel Schürmann	764ee3a834	radv: don't lower subdword phis to scalar Totals from 193 (0.24% of 79839) affected shaders: (Navi48) MaxWaves: 6004 -> 6024 (+0.33%) Instrs: 169276 -> 166784 (-1.47%); split: -3.01%, +1.53% CodeSize: 940608 -> 915768 (-2.64%); split: -4.29%, +1.64% VGPRs: 8012 -> 7716 (-3.69%); split: -3.99%, +0.30% SpillVGPRs: 185 -> 0 (-inf%) Scratch: 13568 -> 0 (-inf%) Latency: 2159787 -> 2147084 (-0.59%); split: -2.86%, +2.28% InvThroughput: 664022 -> 395859 (-40.38%); split: -42.59%, +2.21% VClause: 2998 -> 2880 (-3.94%); split: -4.27%, +0.33% SClause: 3117 -> 3120 (+0.10%) Copies: 21290 -> 16278 (-23.54%); split: -24.74%, +1.20% Branches: 4757 -> 4760 (+0.06%); split: -0.34%, +0.40% PreSGPRs: 7369 -> 7378 (+0.12%); split: -0.11%, +0.23% PreVGPRs: 4257 -> 3859 (-9.35%); split: -9.94%, +0.59% VALU: 83173 -> 79804 (-4.05%); split: -5.68%, +1.63% SALU: 36672 -> 37318 (+1.76%); split: -0.02%, +1.78% VMEM: 4012 -> 3762 (-6.23%); split: -6.83%, +0.60% SMEM: 4300 -> 4303 (+0.07%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>	2025-07-09 14:10:36 +00:00
Daniel Schürmann	2c51a8870d	nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() Similar to nir_lower_alu_width(), the callback can return the desired number of components for a phi, or 0 for no lowering. The previous behavior of nir_lower_phis_to_scalar() with lower_all=true can be elicited via nir_lower_all_phis_to_scalar() while the previous behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar() with NULL callback. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
Marek Olšák	b31f73a1b1	ac/nir: use u_foreach_bit more Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Marek Olšák	896dd9bc93	ac/nir: eliminate sample_id/sample_pos if MSAA is disabled Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Marek Olšák	1c2007005e	ac/nir: rename force_center_interp_no_msaa to msaa_disabled Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Alyssa Rosenzweig	d31cb824df	treewide: use VARYING_BIT_* Via Coccinelle patch generated by the following Python: varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4", "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX", "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID", "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE", "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER", "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0", "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ] t = """ @@ @@ -(1 << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -(1ull << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD64_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} """ for v in varys: from mako.template import Template print(Template(t).render(V = v)) Closes: #13453 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common] Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom] Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl] Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>	2025-07-04 19:01:04 +00:00
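
The generator quoted in the message above is flattened onto one line. A readable sketch of the same idea follows. It is not the author's script: it swaps the mako `Template` for plain `str.replace` so that it runs with only the standard library, it shortens the slot list to three entries, and `render_rules` is a hypothetical helper name.

```python
# Sketch: emit one set of Coccinelle rules per varying slot.
# Assumption: str.replace stands in for the mako Template used in the commit.
VARYS = ["POS", "COL0", "PSIZ"]  # shortened; the commit message lists the full set

# Four rewrite rules, one per spelling of "bit for this slot".
RULES = """\
@@
@@
-(1 << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-(1ull << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD64_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}
"""


def render_rules(slot):
    """Return the four rules with ${V} replaced by the slot name."""
    return RULES.replace("${V}", slot)


if __name__ == "__main__":
    for v in VARYS:
        print(render_rules(v))
```

Redirecting the output to a `.cocci` file and feeding it to Coccinelle would be the natural next step, but the exact invocation used for the commit is not given in the message.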
Marek Olšák	028591aead	ac/nir: remove kill_pointsize and kill_layer options from lowering passes The outputs are removed by a separate pass. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:46 +00:00
Marek Olšák	42ad7543b8	ac/nir: switch legacy GS lowering to ac_nir_prerast_out completely This changes legacy GS outputs to use the same logic as NGG GS. It enables the same optimizations that NGG has such as forwarding constant GS output components to the GS copy shader at compile time. ac_nir_gs_output_info is removed. GS output info is no longer passed to ac_nir_lower_legacy_gs and ac_nir_create_gs_copy_shader separately. ac_nir_lower_legacy_gs now gathers ac_nir_prerast_out, generates GSVS ring stores, and also generates the GS copy shader with GSVS ring loads. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:45 +00:00
Marek Olšák	723ce13f90	ac/nir: move gs_output_component_mask_with_stream to prerast utils Legacy GS will use it. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:45 +00:00
Marek Olšák	2c64cdc047	ac/nir: return the GS copy shader from ac_nir_lower_legacy_gs This way we won't have to pass output info between the two functions. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:44 +00:00
Marek Olšák	98f3fc494e	ac/nir: remove no-op loop from ac_nir_create_gs_copy_shader Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:43 +00:00
Marek Olšák	4263b49778	ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass This is a cleanup. Old gs LDS layout: [es outputs][gs outputs][scratch] Old nogs LDS layout: [xfb/cull][scratch] New gs LDS layout: [es outputs][scratch|gs outputs] New nogs LDS layout: [scratch|xfb/cull] The LDS scratch is moved to the beginning of the preceding buffer in LDS, while the addresses in that LDS buffer are offset by the scratch size. It effectively merges the LDS scratch with the preceding buffer in LDS. Thanks to that, we no longer need the ngg_scratch ABI and the offset in a user SGPR. The lowering passes now return the LDS scratch size, which is used by the drivers to determine the final LDS size. The ngg_lds_layout SGPR is now unused without GS in RADV. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:41 +00:00
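
The GS layout change described in the message above can be illustrated with a little offset arithmetic. This is only a sketch: the function names and the byte sizes are hypothetical and are not taken from the Mesa sources.

```python
# Sketch of the old and new NGG GS LDS layouts; all sizes are made-up byte counts.

def old_gs_layout(es_size, gs_size, scratch_size):
    # [es outputs][gs outputs][scratch]
    # The scratch offset depends on both preceding sizes, so it had to be
    # communicated separately (the message mentions a user SGPR).
    return {
        "es": 0,
        "gs": es_size,
        "scratch": es_size + gs_size,
        "total": es_size + gs_size + scratch_size,
    }


def new_gs_layout(es_size, gs_size, scratch_size):
    # [es outputs][scratch|gs outputs]
    # Scratch sits at the start of the GS output buffer, and every GS output
    # address is shifted up by the scratch size.
    gs_buffer_base = es_size
    return {
        "es": 0,
        "scratch": gs_buffer_base,
        "gs": gs_buffer_base + scratch_size,
        "total": es_size + scratch_size + gs_size,
    }


if __name__ == "__main__":
    print(old_gs_layout(256, 512, 32))
    print(new_gs_layout(256, 512, 32))
```

The total size is the same in both layouts; what changes is that the scratch offset no longer depends on the size of the GS outputs.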
Marek Olšák	b1b581f855	ac/nir/lower_ngg: add an option not to export cull distances if the shader culls them Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	8c04a91d12	ac/nir: rename clip_cull_mask parameter to clearer export_clipdist_mask Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	ed0f393607	ac/nir/lower_ngg: rename clip_cull_dist_mask and use it correctly We incorrectly used it to determine whether the shader should cull, which luckily had no effect because it wasn't used everywhere. cull_clipdist_mask should be used instead, which also reflects whether clip planes are enabled in GL. clip_cull_dist_mask is renamed to export_clipdist_mask to make it clear. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	f6af3c0e17	ac/nir/lower_ngg: forward constant GS & XFB output components from stores to loads for LDS This removes LDS space and loads/stores for constant GS & XFB output components. Constant output components skip LDS stores, and LDS loads are replaced with the gathered constants. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	0ba4e3ae83	ac/nir/lower_ngg: add & use new scalar helpers for XFB loads/stores This simplifies the code and scalarizes the loads/stores. Scalar loads/stores will allow forwarding constant output components from stores to loads easily. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	4b6ae11207	ac/nir/lower_ngg: add & use new scalar helpers for GS loads/stores This simplifies the code and scalarizes the loads/stores. Scalar loads/stores will allow forwarding constant output components from stores to loads easily. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	f407129b7f	ac/nir/lower_ngg_gs: cull against clip/cull distances & clip planes in GS This is finally implemented. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Alyssa Rosenzweig	3c2f46fcac	treewide: use nir_break_if with named if Via Coccinelle patch: @@ expression builder, condition; identifier nif; @@ -nir_if *nif = nir_push_if(builder, condition); -{ -nir_jump(builder, nir_jump_break); -} -nir_pop_if(builder, nif); +nir_break_if(builder, condition); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>	2025-06-30 14:51:54 -04:00
Marek Olšák	6afa638b18	ac/nir/lower_ngg: rename user_clip_plane_enable_mask -> cull_clipdist_mask Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	814990684d	ac/nir/lower_ngg: pack GS outputs and XFB outputs in LDS optimally This switches the code to the new slot offsets from ac_nir_prerast_out instead of using a prefix bitmask over outputs_written. The LDS layout no longer includes these: - GS: output components that are not written by GS - VS/TES+XFB: output components that are not written by XFB - VS/TES+XFB: slots that are not written by XFB (this could be significant) This is also a cleanup because it unduplicates the bitcounts. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	75b1602c14	ac/nir/lower_ngg_gs: return LDS size from the pass instead of computing it separately. This is better because ac_nir_lower_ngg_gs knows the final LDS size anyway, and it will be easier to modify the size calculation this way. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	d79f28e9b3	ac/nir/lower_ngg: return LDS size for NGG VS and TES from the pass instead of computing it separately. This is better because ac_nir_lower_ngg_nogs knows the final LDS size anyway, and it will be easier to modify the size calculation this way. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	8346469ec0	ac/nir/lower_ngg_gs: split lower_ngg_gs_intrinsic into gathering and lowering We need to gather outputs before lowering because lowering requires that we know the LDS vertex stride, so that we can lower output stores to LDS stores. The pass will determine the LDS vertex stride, not drivers. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	84e8e899cd	ac/nir: add an option not to gather values in ac_nir_gather_prerast_store_output_info This will be needed in the next commit. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:26 +00:00
Marek Olšák	ebdd97a993	ac/nir: add LDS layout info for GSVS and XFB to ac_nir_prerast_per_output_info This will be used to reduce the NGG LDS size for uncompacted GS and XFB outputs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	39a9dce5fc	ac/nir: add an option to pack clip/cull distance components to remove holes Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	6cd813810e	ac/nir: add an option write_pos_to_clip_vertex to clip against POS This enables emulating clip planes without ClipVertex via clip distances (max 8) instead of the fixed-func hw (max 6 planes). Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	3dd3f2f889	ac/nir/lower_ngg_gs: build streamout after lowering intrinsics Streamout will require prerast info, which is gathered by lower_ngg_gs_intrinsics. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	83dc5917fe	ac/nir: lower ClipVertex before all position exports just code reordering (position exports should be at the end for perf) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	c9b6a95038	ac/nir: remove the done parameter from ac_nir_export_position Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	7c3760201d	ac/nir/lower_ngg: never export edge flags via position exports It has no effect, but the extra export instruction is unnecessary and we can't gather the effective number of position exports from NIR if we insert incorrect exports. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35351>	2025-06-28 08:20:25 +00:00
Marek Olšák	1754507d49	nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:54 +00:00
Georg Lehmann	f047a67fba	nir,aco: optimize FP16_OFVL pattern created by vkd3d-proton Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Rhys Perry	ac2e36b377	ac/nir: create lowered inverse_ballot Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `b49eab68a8` ("ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:10 +00:00
Karol Herbst	4ff66b4343	ac/llvm: fix bitfield ops Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35423>	2025-06-13 07:33:03 +00:00
Rhys Perry	bc2edf14d8	ac/nir: run nir_lower_vars_to_ssa after nir_lower_task_shader nir_lower_task_shader does nir_lower_returns, so we need this if the launch_mesh_workgroups was in control flow. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13326 Backport-to: 25.1 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35411>	2025-06-11 09:01:39 +00:00
Marek Olšák	edd2fc3c7f	radeonsi: use AC_EXP_PARAM_UNDEFINED for clarity The code was slightly confusing. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35392>	2025-06-10 03:31:20 +00:00
Marek Olšák	d279d019d4	ac/nir/tess: remove parameter from and simplify hs_per_patch_output_vmem_offset Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	fa5e07d5f7	ac/nir/tess: write TCS patch outputs to memory as vec4 stores at the end This moves per-patch output VMEM stores to the end of the shader where they execute only once. They are skipped if the whole workgroup discards all patches. If tcs_vertices_out == 1, per-patch output VMEM stores use the same lanes as per-vertex output VMEM stores, which are aligned to 4 or 8 lanes to get cached bandwidth for the stores. Previously, per-patch outputs were stored to memory for every store_output intrinsic in TCS. Additionally, LDS is no longer allocated for per-patch outputs that are only written and read by invocation 0, or they are written by all invocations but not read, and don't have indirect indexing. This reduces LDS usage and LDS traffic. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	c732306c5a	ac/nir/tess: unify computing LDS output patch size, minimize LDS bank conflicts This unifies the duplicated LDS output patch size computation between hs_output_lds_offset and ac_nir_compute_tess_wg_info. "+ 4" to the output patch stride minimizes LDS bank conflicts by making the beginning of each patch start on a different LDS bank. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
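
The bank-conflict argument in the message above can be checked with a small calculation. The sketch below rests on assumptions that the message does not state: 32 LDS banks, each 4 bytes wide, and the `+ 4` read as bytes (one dword). `start_banks` is a hypothetical helper, not a Mesa function.

```python
# Sketch: which LDS bank does each patch start on?
NUM_BANKS = 32   # assumption, not stated in the commit message
BANK_WIDTH = 4   # bytes per bank, assumption


def start_banks(patch_stride, num_patches):
    """Return the bank index of the first byte of each patch."""
    return [(i * patch_stride // BANK_WIDTH) % NUM_BANKS
            for i in range(num_patches)]


if __name__ == "__main__":
    # A 256-byte stride is a multiple of 32 banks * 4 bytes,
    # so every patch starts on the same bank.
    print(start_banks(256, 4))
    # Padding the stride by 4 bytes shifts each patch start by one bank.
    print(start_banks(256 + 4, 4))
```

Under these assumptions, lanes that access the same offset in different patches hit different banks once the stride is padded, which is the effect the message describes.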
Marek Olšák	37dc376395	ac/nir/tess: use if-ladder to determine valid tess level components for the vote Checking whether every component is valid in tess_level_has_effect() when prim_mode is unknown generated too many SALU. Do this instead: if (triangles) ... subgroup vote for triangles else if (quads) ... subgroup vote for quads else // isoline subgroup vote for isolines Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	2f0d9495c5	ac/nir/tess: inline mask helpers Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	10ae5b2fbf	ac/nir/tess: rewrite tess level tracking, don't use LDS for more cases This rewrites tess level value tracking to use the 2-bit masks, which means LDS allocation is determined separately for outer and inner levels. LDS is not allocated for tess levels that are only written by invocation 0 and never read or only read by invocation 0. If the number of output patch vertices is 1, LDS is also not allocated for tess levels. Tess level outputs for TES are always written as whole vec4 to get cached bandwidth. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	9d9cfd89da	ac/nir/tess: compute the number of remapped VRAM outputs in common code This unifies it for both drivers. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	ea70060826	ac/nir/tess: stop using tes_inputs_read / tes_patch_inputs_read for TCS & TES Use ac_nir_tess_io_info instead. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	c38bc4824f	ac/nir/tess: apply no_varying to ac_nir_tess_io_info This has the effect that no_varying is finally honored for per-patch outputs, skipping VMEM stores that TES doesn't read. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00


177 commits