fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-06 02:58:05 +02:00

Author	SHA1	Message	Date
Karol Herbst	a0131b53ad	nvk: use hardware limits for maxComputeSharedMemorySize It doesn't change the reported values, but it will allow us to easily advertise real hardware limits in the future. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:02 +00:00
Karol Herbst	1d5a1b11db	nak/qmd: base shared mem size allocation on hardware limits We can allocate more than 48k of shared memory, but the limits differ across hardware, so we need to take it all into account to create the shared memory splits the hardware can accept. This does change behavior on Turing, but the assumption is, that the hardware has simply rounded up. Might need performance testing on Turing to verify nothing regresses here. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:02 +00:00
Karol Herbst	b09deba713	nouveau/winsys: add shared memory size tables It's a bit of a disaster, but each generation supports a different set of shared memory configurations. Knowing the maximum is important for compute shader performance, knowing all the legal sizes for QMD generation. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:01 +00:00
Karol Herbst	3c9fa18069	nvk: prepare for higher shared memory sizes On hw we have up to 228k of available Shared memory so a 16 bit int isn't enough for that. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:01 +00:00
Karol Herbst	083a3dc545	util: move typed_memcpy into macros.h Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37135>	2025-09-05 20:01:00 +00:00
Mel Henning	1c764357e8	nvk: Only copy 32-bits for cond render operand A Now that we're guaranteed the upper 32 bits are zero initialized, there's no reason we need to do a 64-bit write here. This is a 0.3% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (638 fps -> 640 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	4d8e2f7768	nvk: Don't re-initialize cond rendering operand B We can initialize this just once from the CPU side instead of overwriting it each time using the copy engine. This is a 5% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (607 fps -> 638 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	966a1b5380	nvk: Reuse the same cond render temp in a cmd_buf Within a single command buffer, we know that our operations will happen sequentially so we don't need to allocate a unique address per vkCmdBeginConditionalRenderingEXT - we can re-use the same address instead. Improves perf on the Sascha Willems conditionalrender demo with all rendering disabled by about 2% (595 fps -> 607 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:33 +00:00
Mel Henning	64b4e52755	nvk: Move cond rendering memory out of gart This is a 41% performance improvement on the Sascha Willems conditionalrender demo with all rendering disabled (422 fps -> 595 fps) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:32 +00:00
Mel Henning	0b43a625f4	nvk: Remove gart from the name of cond_render_mem Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37187>	2025-09-05 18:57:32 +00:00
Connor Abbott	a89f897870	freedreno/ci: Add a750 sparse skips Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	de60f2ff68	tu: Advertise shaderResourceMinLod Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	e72fed3faa	ir3: Support min_lod tex source Use the .clp modifier. In order to fix dEQP-VK.glsl.texture_functions.textureoffsetclamp.* we need to add a workaround for an empirically-discovered problem. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	63959bb716	ir3: Assemble and disassemble .clp modifier Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	655934eef7	tu: Expose shaderResourceResidency Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	70cf40086c	ir3: Implement sparse residency check Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	120f755bdb	ir3: Assemble and disassemble rck modifier Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	918e25e158	tu: Support sparse residency for images The tricky thing here is that we have to emulate the 64k "standard" tile sizes in terms of the native 4k macrotiles. We do this by manipulating which 4k pages get mapped, dividing the 64k tile into 4k macrotiles and mapping each tile in such a way that, when viewed in terms of the final swizzled image coordinates, the 4k tiles linearly tile the image region that's supposed to be mapped to the 64k "tile". Supporting the standard block sizes allows emulation layers to claim D3D Tiled Resources Tier 2, which is required for the 12.0 feature level. It's also required for ARB_sparse_texture2. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	ae53234414	freedreno/fdl: Add sparse layout support Compute the Vulkan "sparse miptail," add support for padding the array stride in order to make sure that the sparse miptail is large enough as mandated by the Vulkan spec, and add a function to compute the standard sparse block size. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	166bda02aa	freedreno/fdl: Handle layout differences for r8g8 images We don't handle copying r8g8 tiled images yet, but at least return the correct tile size and bank swizzle so that r8g8 sparse textures work. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	7225334589	freedreno/fdl: Handle cpp=32 and cpp=64 when getting macrotile size These can only happen with multisampled images, which aren't supported by fdl_tiled_memcpy. However these cases can be hit by multisampled sparse textures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	8ef64f2042	freedreno/fdl: Refactor and expose bank swizzling logic For sparse, we will need to handle bank swizzling and macrotiles when mapping sparse textures. However the functions for handling this were leaking internal tiled_memcpy implementation details, like the concept of a 256-byte "block" that doesn't really exist in the tiling (instead everyone else deals with UBWC blocks, which may be 256 bytes or smaller, and 4K macrotiles). Rewrite them to work in terms of macrotiles, and take an fdl_layout. In order to avoid having to pass an fdl_layout everywhere, pass around the computed bank_mask and bank_swizzle everywhere. This also means that we don't recompute several times. Finally, expose a function to compute the macrotile size, which will also be needed to work with bank swizzling. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Connor Abbott	348ffdc996	freedreno/fdl: Expose fdl6_is_r8g8_layout() publicly We will need to use this in other places in fdl. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>	2025-09-05 16:58:09 +00:00
Mike Blumenkrantz	6596bf69c6	zink: add another flag to determine whether linked program compile is done it's otherwise possible for this to race and hit the draw before precompile finishes without ever waiting on the fence I guess this just worked coincidentally before? cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37197>	2025-09-05 16:29:15 +00:00
Mike Blumenkrantz	0b586d546d	zink: remove rebar requirement for descriptor buffer support this is not really relevant; if db is supported, use it Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37197>	2025-09-05 16:29:15 +00:00
Rhys Perry	efe536dbe9	vtn: use vtn_has_decoration more Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37175>	2025-09-05 15:58:03 +00:00
Mike Blumenkrantz	721af20a58	aux/trace: dump more mesh draw info Reviewed-by: Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37196>	2025-09-05 14:53:01 +00:00
Mike Blumenkrantz	a70e247b9b	mesa: add task/mesh to _mesa_shader_stage_to_subroutine_prefix() Reviewed-by: Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37196>	2025-09-05 14:53:01 +00:00
Rob Clark	76fece61c6	freedreno/registers: Add A7XX_CX_DBGC This was added on kernel side in commit 13ed0a1af263 ("drm/msm: Fix a7xx debugbus read"), but mesa copy of the registers was updated from an earlier revision of that patch which did not have A7XX_CX_DBGC. Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37192>	2025-09-05 14:23:13 +00:00
Karmjit Mahil	651df8029a	freedreno/registers: Fix SP_READ_SEL_LOCATION Five possible values are defined by `enum a7xx_state_location` so SP_READ_SEL_LOCATION must be at least 3 bit wide. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13836 Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37192>	2025-09-05 14:23:13 +00:00
Valentine Burley	ed15433c35	zink/ci: Document recent a618 EGL flakes Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36787>	2025-09-05 14:03:55 +00:00
Valentine Burley	2bcb25ee27	zink/ci: Enable VVL for Turnip on a618 Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36787>	2025-09-05 14:03:55 +00:00
Valentine Burley	31f6235126	tu: Enable robustBufferAccessUpdateAfterBind This is supported and must be enabled when descriptorBindingUpdateAfterBind is active. Fixes the following VVL error: Validation Error: [ VUID-VkDeviceCreateInfo-robustBufferAccess-10247 ] vkCreateDevice(): robustBufferAccessUpdateAfterBind is false, but both robustBufferAccess and a descriptorBindingUpdateAfterBind feature are enabled. Fixes: `d9fcf5de55` ("turnip: Enable nonuniform descriptor indexing") Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36787>	2025-09-05 14:03:55 +00:00
Timur Kristóf	038aac57a3	radeonsi: Fix some comments to also include GFX11.5 Just a nitpick. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	637f618ac5	radeonsi: Flush L2 for render condition when CP can't use L2 If CP can't use L2 then it also can't read the render condition through L2, so we need a flush, just like on GFX6-8. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	78efa4157a	radv: Don't use V_370_PFP or V_028A90_PS_DONE on compute queues The compute queue doesn't support these things. This change doesn't fix any known issues, but better to be safe. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	8447a4bfca	radv: Clean up use of RELEASE_MEM on GFX7 MEC MEC probably doesn't support EVENT_WRITE_EOP. Both PAL and RadeonSI use RELEASE_MEM. RADV used RELEASE_MEM too but "is_gfx8_mec" was very misleading. This commit just cleans that up. No functional changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	c56c746b71	radv: Don't use EVENT_WRITE_EOS on GFX7 EOS events are buggy on GFX7 and can cause hangs when used together in the same IB with CP DMA packets that use L2. While we don't use the L2 for CP DMA copies, we still use it with CP DMA prefetches, so the issue needs to be mitigated. As a mitigation, avoid using EVENT_WRITE_EOS and prefer to use the BOTTOM_OF_PIPE event instead of PS_DONE/CS_DONE, which should be close enough. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	2f587ea8be	radv: Don't set SWITCH_ON_EOI without tessellation For reference, see si_get_init_multi_vgt_param. The SWITCH_ON_EOI bit is only needed with tessellation. Also remove some useless lines. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	e8d1e935fb	radv/amdgpu: Don't use IB2 on GFX6 (for now) GFX6 actually supports IB2, but doesn't support chaining between chunks inside the IB2. See WaCpIb2ChainingUnsupported in PAL. Disable IB2 on GFX6 for now. The proper fix will be to disable use_ib in just secondary command buffers on GFX6 and emit multiple IB2 packets in the main command buffer. This will be implemented later. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	3056279d09	radv/amdgpu: Use correct NOP packets when unchaining a CS GFX6 doesn't support single-dword PKT3 NOP packets, so they shouldn't be used when unchaining a CS. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Timur Kristóf	132a61c6b7	radv/amdgpu: Fix crash with RADV_DEBUG=noibs After a refactor last year, the noibs option stopped working because it hits an assertion when empty IBs are submitted. Emit a single large NOP packet to avoid submitting empty IBs. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>	2025-09-05 13:42:56 +00:00
Christoph Pillmayer	f81f3c85e2	nir/opt_algebraic: Convert a + b + a to b + 2a This allows fusing into one FMA later. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37113>	2025-09-05 11:39:51 +00:00
Samuel Pitoiset	8233d9d571	radv: rename RADV_CMD_DIRTY_FS_STATE to RADV_CMD_DIRTY_PS_STATE It's called PS everywhere else. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37087>	2025-09-05 10:16:20 +00:00
Samuel Pitoiset	f180682441	radv: add a new dirty bit for emitting a PS epilog Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37087>	2025-09-05 10:16:19 +00:00
Samuel Pitoiset	211e0823ec	radv: add a new dirty bit for compiling/binding a PS epilog Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37087>	2025-09-05 10:16:18 +00:00
Samuel Pitoiset	11e5f86a94	radv: add a function to bind a PS epilog The idea would be to separate compiling and emitting PS epilog in two separate states. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37087>	2025-09-05 10:16:17 +00:00
Samuel Pitoiset	bc71787ea3	radv: remove unnecessary NULL check when creating PS epilogs It's already checked in the caller. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37087>	2025-09-05 10:16:15 +00:00
Samuel Pitoiset	d771f2c462	radv: add small helper to dispatch RT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37141>	2025-09-05 09:21:26 +00:00
Samuel Pitoiset	1b6aad9def	radv/meta: use radv_CmdDispatchBase() directly for ASTC decode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37141>	2025-09-05 09:21:25 +00:00

211847 commits