fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 23:58:10 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	a75bd251df	ac/surface: add a flag to forbid some swizzles for surface<->memory copies 256KiB (also block variables) aren't supported on GFX11. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>	2025-07-15 09:12:13 +00:00
Samuel Pitoiset	f5f2392cf7	ac/surface: add support for surface<->memory copy using addrlib Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>	2025-07-15 09:12:13 +00:00
Samuel Pitoiset	16be376cc5	ac/surface: constify bpe_to_format() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>	2025-07-15 09:12:12 +00:00
David Rosca	e2554c8f51	ac/surface: Support RADEON_SURF_FORCE_SWIZZLE_MODE on gfx12 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>	2025-07-14 07:42:26 +00:00
Yogesh Mohan Marimuthu	0068dbd76b	ac: enable kernelq reg shadowing only when userq is disabled Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35106>	2025-07-13 20:05:25 +00:00
Yogesh Mohan Marimuthu	1000ee3d2f	ac,radeonsi,radv: rename register_shadowing_required rename register_shadowing_required to has_kernelq_reg_shadowing Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35106>	2025-07-13 20:05:25 +00:00
Marek Olšák	34580a32ff	ac/nir: remove redundant option dont_export_cull_distances It has the same value as can_cull. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:21 +00:00
Marek Olšák	54c969882b	ac/nir: rename ac_nir_get_lds_gs_out_slot_offset -> ac_nir_get_gs_out_lds_offset Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:21 +00:00
Marek Olšák	fde3384cfd	ac/nir: remove pack_clip_cull_distances option it's always true Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:21 +00:00
Marek Olšák	0fbdefd770	ac/llvm: remove LDS linking code LDS sizes and offsets from LLVM are no longer used. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:21 +00:00
Marek Olšák	65c5ee1628	radeonsi: stop using LLVM LDS linking logic for the GS out LDS offset This will enable large code removal. shader->config.lds_size is now always computed the same as ACO except for compute shaders. We have to add a new 8-bit user SGPR bitfield called GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset that was previously set by the relocation. Since the offset must be a multiple of 256, we have to add padding to the LDS size computation to make sure the alignment to 256 for the ESGS LDS size doesn't cause us to exceed the maximum LDS size. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:20 +00:00
Georg Lehmann	a045e9a624	ac/nir: lower uniform extract_i8/u8 to 32bit To prevent vectorizing this later. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>	2025-07-12 08:39:13 +00:00
Marek Olšák	44dd39d121	radv: pack clip and cull distance outputs for both legacy and NGG pipelines This increases primitive throughput when packing reduces the number of pos exports due to holes in clip and cull distance arrays that could be punched out by nir_opt_clip_cull_const. This applies to all chips. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:06 +00:00
Marek Olšák	ae78e8d198	ac/nir: handle VARYING_SLOT_VARn_16BIT the same as other slots They are the same as regular VARn. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:02 +00:00
Marek Olšák	762fdf8236	ac/nir: fix mediump XFB The previous code was completely wrong and untested. This is tested. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:02 +00:00
Marek Olšák	56f80479fc	ac/nir: remove unnecessary 16-bit handling from pre-rast GS and XFB loads/stores All callers always pass 32 bits in there. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:02 +00:00
Marek Olšák	65972f2301	ac/nir: return GSVS emit sizes from legacy GS lowering and simplify shader info This simplifies shader info in drivers by returning GSVS emit sizes from ac_nir_lower_legacy_gs. The pass knows the sizes, so drivers shouldn't have to determine them independently. This also makes the values more accurate because both drivers were computing the GSVS emit sizes inaccurately and had redundant fields in shader info. RADV had a lot of redundancy there. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:02 +00:00
Marek Olšák	76ce37058d	radv: set the maximum possible workgroup size for legacy GS before linking The optimal workgroup size will be set after lowering. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:00 +00:00
Rhys Perry	5ad04c02d4	ac/nir: don't combine multiple non-constant offsets into a global access This isn't correct if the addition overflows. No fossil-db changes (gfx1201, navi10, pitcairn). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>	2025-07-11 12:15:03 +00:00
Samuel Pitoiset	01fccec1dc	ac/descriptors,radv: move the nbc view param to the gfx10 union This is only GFX10+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36043>	2025-07-11 05:46:50 +00:00
Qiang Yu	88c79a13b9	ac,radv: move nir_load_ring_mesh_scratch_offset_amd to ac To be shared with radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Qiang Yu	5ddbd8c83b	ac,radv: move mesh scratch ring constants to ac To be shared with radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Qiang Yu	78fed5fc13	ac,radv: move nir_load_task_ring_entry_amd to ac To be shared with radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Qiang Yu	79ecca962a	ac: parse ib for mesh shader dispatch packets Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Qiang Yu	d9df597042	ac,radv: move mesh_fast_launch_2 to ac To be shared with radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Marek Olšák	3bc31c307f	ac/nir: fix indexing GS inputs with non-constant vertex index on gfx9-11 This hasn't been reproducible because RADV and GLSL always lower non-constant slot and vertex indexing of GS inputs, but we'll stop lowering it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>	2025-07-10 16:37:45 +00:00
Daniel Schürmann	764ee3a834	radv: don't lower subdword phis to scalar Totals from 193 (0.24% of 79839) affected shaders: (Navi48) MaxWaves: 6004 -> 6024 (+0.33%) Instrs: 169276 -> 166784 (-1.47%); split: -3.01%, +1.53% CodeSize: 940608 -> 915768 (-2.64%); split: -4.29%, +1.64% VGPRs: 8012 -> 7716 (-3.69%); split: -3.99%, +0.30% SpillVGPRs: 185 -> 0 (-inf%) Scratch: 13568 -> 0 (-inf%) Latency: 2159787 -> 2147084 (-0.59%); split: -2.86%, +2.28% InvThroughput: 664022 -> 395859 (-40.38%); split: -42.59%, +2.21% VClause: 2998 -> 2880 (-3.94%); split: -4.27%, +0.33% SClause: 3117 -> 3120 (+0.10%) Copies: 21290 -> 16278 (-23.54%); split: -24.74%, +1.20% Branches: 4757 -> 4760 (+0.06%); split: -0.34%, +0.40% PreSGPRs: 7369 -> 7378 (+0.12%); split: -0.11%, +0.23% PreVGPRs: 4257 -> 3859 (-9.35%); split: -9.94%, +0.59% VALU: 83173 -> 79804 (-4.05%); split: -5.68%, +1.63% SALU: 36672 -> 37318 (+1.76%); split: -0.02%, +1.78% VMEM: 4012 -> 3762 (-6.23%); split: -6.83%, +0.60% SMEM: 4300 -> 4303 (+0.07%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>	2025-07-09 14:10:36 +00:00
Daniel Schürmann	2c51a8870d	nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() Similar to nir_lower_alu_width(), the callback can return the desired number of components for a phi, or 0 for no lowering. The previous behavior of nir_lower_phis_to_scalar() with lower_all=true can be elicited via nir_lower_all_phis_to_scalar() while the previous behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar() with NULL callback. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
jesse.zhang	56d758d321	amd: Add user queue HQD count to hw_ip info Add a new field userq_num_hqds to drm_amdgpu_info_hw_ip to expose the number of available hardware queue descriptors (HQDs) for user queues. This allows userspace to query the maximum number of user queues that can be created for a particular IP block. the patch link in driver side: https://lists.freedesktop.org/archives/amd-gfx/2025-June/126686.html v2: we should also put userq_num_hqds into radeon_info and print it where other fields are printed. (Marek Olšák) v3: rename num_userqs to num_queue_slots and add print log in ac_print_gpu_info. (Marek Olšák) v4: rename userq_num_hqds to userq_num_slots in hw_ip_info, and update the hw information (Marek Olšák) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35850>	2025-07-08 10:17:51 +00:00
Marek Olšák	b31f73a1b1	ac/nir: use u_foreach_bit more Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Marek Olšák	896dd9bc93	ac/nir: eliminate sample_id/sample_pos if MSAA is disabled Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Marek Olšák	1c2007005e	ac/nir: rename force_center_interp_no_msaa to msaa_disabled Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35345>	2025-07-07 11:41:57 +00:00
Alyssa Rosenzweig	d31cb824df	treewide: use VARYING_BIT_* Via Coccinelle patch generated by the following Python: varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4", "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX", "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID", "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE", "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER", "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0", "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ] t = """ @@ @@ -(1 << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -(1ull << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD64_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} """ for v in varys: from mako.template import Template print(Template(t).render(V = v)) Closes: #13453 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common] Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom] Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl] Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>	2025-07-04 19:01:04 +00:00
Pierre-Eric Pelloux-Prayer	fab2c9a923	ac: fix invalid array size Reported by static analysis. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35877>	2025-07-04 15:26:38 +00:00
Pierre-Eric Pelloux-Prayer	6e371f0a8a	ac: fix potential overflows Reported by static analysis. Multiplication may overflow before being converted to the larger type, so fix this by casting one of the operands to the destination type. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35877>	2025-07-04 15:26:38 +00:00
Samuel Pitoiset	2af3ef9305	ac/surface: select a different swizzle mode for ASTC formats on GFX12 It seems only 4KiB swizzle works fine with ASTC. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34877>	2025-07-03 15:31:04 +00:00
Samuel Pitoiset	cb6f2d9409	ac/surface: use align with NPOT for estimating surface size ac_estimate_size() triggers an assertion because the block size isn't aligned to a power of two for ASTC formats. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35879>	2025-07-03 08:02:17 +00:00
Marek Olšák	028591aead	ac/nir: remove kill_pointsize and kill_layer options from lowering passes The outputs are removed by a separate pass. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:46 +00:00
Marek Olšák	42ad7543b8	ac/nir: switch legacy GS lowering to ac_nir_prerast_out completely This changes legacy GS outputs to use the same logic as NGG GS. It enables the same optimizations that NGG has such as forwarding constant GS output components to the GS copy shader at compile time. ac_nir_gs_output_info is removed. GS output info is no longer passed to ac_nir_lower_legacy_gs and ac_nir_create_gs_copy_shader separately. ac_nir_lower_legacy_gs now gathers ac_nir_prerast_out, generates GSVS ring stores, and also generates the GS copy shader with GSVS ring loads. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:45 +00:00
Marek Olšák	723ce13f90	ac/nir: move gs_output_component_mask_with_stream to prerast utils Legacy GS will use it. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:45 +00:00
Marek Olšák	2c64cdc047	ac/nir: return the GS copy shader from ac_nir_lower_legacy_gs This way we won't have to pass output info between the two functions. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:44 +00:00
Marek Olšák	98f3fc494e	ac/nir: remove no-op loop from ac_nir_create_gs_copy_shader Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:43 +00:00
Marek Olšák	098d33766a	ac: add legacy GS subgroup size computation from radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:43 +00:00
Marek Olšák	fa8db1ccd3	ac: add NGG subgroup size computation from radeonsi RADV will use it. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:42 +00:00
Marek Olšák	4263b49778	ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass This is a cleanup. Old gs LDS layout: [es outputs][gs outputs][scratch] Old nogs LDS layout: [xfb/cull][scratch] New gs LDS layout: [es outputs][scratch|gs outputs] New nogs LDS layout: [scratch|xfb/cull] The LDS scratch is moved to the beginning of the preceding buffer in LDS, while the addresses in that LDS buffer are offset by the scratch size. It effectively merges the LDS scratch with the preceding buffer in LDS. Thanks to that, we no longer need the ngg_scratch ABI and the offset in a user SGPR. The lowering passes now return the LDS scratch size, which is used by the drivers to determine the final LDS size. The ngg_lds_layout SGPR is now unused without GS in RADV. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:41 +00:00
Marek Olšák	b1b581f855	ac/nir/lower_ngg: add an option not to export cull distances if the shader culls them Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	8c04a91d12	ac/nir: rename clip_cull_mask parameter to clearer export_clipdist_mask Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	ed0f393607	ac/nir/lower_ngg: rename clip_cull_dist_mask and use it correctly We incorrectly used it to determine whether the shader should cull, which luckily had no effect because it wasn't used everywhere. cull_clipdist_mask should be used instead, which also reflects whether clip planes are enabled in GL. clip_cull_dist_mask is renamed to export_clipdist_mask to make it clear. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	f6af3c0e17	ac/nir/lower_ngg: forward constant GS & XFB output components from stores to loads for LDS This removes LDS space and loads/stores for constant GS & XFB output components. Constant output components skip LDS stores, and LDS loads are replaced with the gathered constants. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
Marek Olšák	0ba4e3ae83	ac/nir/lower_ngg: add & use new scalar helpers for XFB loads/stores This simplifies the code and scalarizes the loads/stores. Scalar loads/stores will allow forwarding constant output components from stores to loads easily. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:40 +00:00
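
Commit d31cb824df above describes generating a Coccinelle semantic patch from a Python template, one set of rules per varying slot. The following is a minimal sketch of that idea, not the author's script: it swaps mako for the standard library's `string.Template` (which happens to share the `${V}` placeholder syntax), uses only a small illustrative subset of the slot names, and the names `RULES` and `render_rules` are made up for this example.

```python
from string import Template

# Illustrative subset of the slot names listed in the commit message.
VARYINGS = ["POS", "PSIZ", "LAYER", "VIEWPORT"]

# Four Coccinelle rules per slot, one for each spelling of the bit that the
# commit message replaces. ${V} is substituted with the slot name.
RULES = Template("""\
@@
@@
-(1 << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-(1ull << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD64_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}
""")


def render_rules(names):
    """Return the semantic patch text for the given slot names."""
    return "\n".join(RULES.substitute(V=name) for name in names)


if __name__ == "__main__":
    print(render_rules(VARYINGS))
```

The output is plain text intended to be saved to a file and handed to Coccinelle; how the patch was actually applied to the tree is not stated in the commit message.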

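Commit 65c5ee1628 above mentions an 8-bit user SGPR bitfield, GS_STATE_GS_OUT_LDS_OFFSET_256B, holding an offset that must be a multiple of 256. The arithmetic behind such a field can be sketched as follows. This is an assumption-laden illustration rather than radeonsi code: the helper names are hypothetical, and it assumes the GS output region starts at the ESGS LDS size rounded up to 256 bytes.

```python
OFFSET_GRANULARITY = 256  # bytes; the offset must be a multiple of this
FIELD_BITS = 8            # width of the bitfield named in the commit message


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def encode_gs_out_lds_offset(esgs_lds_size):
    """Return the field value for a GS out region placed after the ESGS data.

    Hypothetical helper: assumes the region begins at the aligned ESGS size.
    """
    offset = align_up(esgs_lds_size, OFFSET_GRANULARITY)
    field = offset // OFFSET_GRANULARITY
    if field >= (1 << FIELD_BITS):
        raise ValueError("offset does not fit in the 8-bit field")
    return field


def decode_gs_out_lds_offset(field):
    """Return the byte offset represented by a field value."""
    return field * OFFSET_GRANULARITY
```

Rounding up can add as much as 255 bytes of padding, which is why the commit message says the LDS size computation has to account for it to stay within the maximum LDS size.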

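Commit 4263b49778 above describes moving the NGG scratch area from the end of LDS to the start of the preceding buffer. One reading of the GS layouts in that message can be sketched as below. The function names and byte sizes are hypothetical, and the real lowering passes may apply alignment that this model ignores.

```python
def old_gs_layout(es_size, gs_size, scratch_size):
    """Old layout: [es outputs][gs outputs][scratch]."""
    return {
        "es_outputs": 0,
        "gs_outputs": es_size,
        "scratch": es_size + gs_size,
        "total": es_size + gs_size + scratch_size,
    }


def new_gs_layout(es_size, gs_size, scratch_size):
    """New layout: [es outputs][scratch|gs outputs].

    Scratch sits at the start of the GS output buffer, and GS output
    addresses are offset by the scratch size.
    """
    return {
        "es_outputs": 0,
        "scratch": es_size,
        "gs_outputs": es_size + scratch_size,
        "total": es_size + gs_size + scratch_size,
    }
```

In this model the total size is unchanged, but the scratch offset no longer depends on the GS output size, which is consistent with the message's point that a separate scratch offset in a user SGPR is no longer needed.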
3308 commits