fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-18 19:28:24 +02:00

Author	SHA1	Message	Date
José Roberto de Souza	dbf64e9ad5	anv: Use anv_device_get_general_state_pool() Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42133>	2026-06-10 22:49:10 +00:00
José Roberto de Souza	9d06679d89	anv: Add function to get each anv_state_pool Xe3P will allow us to reduce the number of anv_state_pool in use. This will improve performance, as it will result in fewer uAPI calls to allocate memory and less memory wasted in anv_state_pool instances that see little use. As this will be a run-time decision, here I'm adding a function to get each anv_state_pool, then we can just change the function and all the callers will use the correct anv_state_pool. The next patches will replace direct access to each anv_state_pool with a function call. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42133>	2026-06-10 22:49:10 +00:00
Julien Schueller	fd616bab71	glx: avoid crash on glXBindTexImageEXT when no texture target set If a GLXPixmap is created without GLX_TEXTURE_TARGET_EXT, textureTarget remains 0. Calling glXBindTexImageEXT on such a drawable would pass 0 to _mesa_get_current_tex_object(), triggering an internal implementation error and a null-pointer segfault. Return early when textureTarget is 0 - the drawable was never set up for texturing, so bind is a no-op. Reviewed-by: Adam Jackson <ajax@redhat.com> Assisted-by: DeepSeek V4 Flash Closes: #58 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42093>	2026-06-10 20:57:35 +00:00
Benjamin Cheng	69c7f6d456	radv/video: Use {min,max}_qp caps from ac Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42136>	2026-06-10 20:34:33 +00:00
Benjamin Cheng	880fbcbeee	ac/video: Add {min,max}_qp to video enc caps Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42136>	2026-06-10 20:34:33 +00:00
Benjamin Cheng	c2e76e111d	radv/video: Report MULTIPLE_SLICE_SEGMENTS_PER_TILE_BIT VCN supports one tile only, but with multiple slice segments. Cc: mesa-stable Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42136>	2026-06-10 20:34:33 +00:00
Benjamin Cheng	b8b8035c6b	radv/video: Set accurate minQp/QIndex The spec requires us to follow the constantQp/base_q_idx from the app, which is constrained by the caps. Report the more accurate caps. Cc: mesa-stable Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42136>	2026-06-10 20:34:33 +00:00
Job Noorman	11334c438a	ir3: fix possible signed overflow in ir3_link_add `1 << 31` is undefined since `1` is a signed integer. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `1f9839907a` ("ir3: Skip missing VS outputs in VS out map when linking") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42147>	2026-06-10 20:07:01 +00:00
Christian Gmeiner	b83f446642	panvk: Advertise VK_KHR_shader_fma vtn lowers OpFmaKHR to nir_op_ffma and every Mali has a native fused multiply-add, so there is nothing to do in the backend. fp16 is gated on shaderFloat16. A 16-bit OpFmaKHR also needs the Float16 capability and only shaderFloat16 turns that on, so without it the bit would not be usable. Mali has no fp64, so that one stays off. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42075>	2026-06-10 19:42:49 +00:00
Danylo Piliaiev	67471fed86	tu: Enable texel buffer / SSBO emulation for known problematic games Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Job Noorman	9b32234726	tu: Add option to raise the maximum SSBO size Emulates SSBOs via global memory; the real SSBO size and global base address are stored in the descriptor. The size can be accessed using resbase, and the base address is parsed manually from the descriptor by passing the bindless base address into the shader via a driver UBO or const file. nir_lower_ssbo is used to lower SSBO accesses to global memory when the buffer size exceeds the limit. We also use it to insert bounds checks on global memory. The final code for SSBO accesses looks like this: if (@get_ssbo_size >= max_storage_buffer_range_bytes) { if (offset < @get_ssbo_size) { // global memory access using base (from resbase) + offset } else { // do nothing (stores) or return 0 (loads) } } else { // original SSBO access } A new pass is added to lower @load_ssbo_address generated by nir_lower_ssbo. We set native_offset=true for nir_lower_ssbo to make sure it doesn't generate 64 bit address math. The new pass then transforms @load/store_global into @load/store_global_ir3 passing the 32 bit offset from @load_ssbo_address. Signed-off-by: Job Noorman <jnoorman@igalia.com> Co-authored-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Danylo Piliaiev	dc1bb7bbf4	tu: Add option to raise the maximum texel buffer size Emulates texel buffers via 3D image access; the real texel buffer size and start offset (due to image alignment requirements) are stored in the descriptor and accessed via resbase. - Read-only access: isam.a.1d to read as 3d image. - RW access: stib.b.typed.3d/ldib.b.typed.3d to read as 3d image. Verified that the proprietary D3D12 driver uses the same workaround; the only difference is that the proprietary driver uses an arrayed 2d load for read-only access instead of a 3d load, but the benefits are not verified. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Danylo Piliaiev	652864e385	tu/a8xx: Set real storage/texel buffer size limits From tests, A8XX seems to fix the limits that were incompatible with D3D12. However, the proprietary driver exposes the old texel buffer element limit. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Danylo Piliaiev	d18b637a7c	tu: Specify max texel buffer and storage buffer limits via GPU props A8XX has different storage buffer range limit. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Danylo Piliaiev	fd99d813af	tu: Add allow_oob_indirect_ubo_loads to device cache uuid Fixes: `f4c40fc89c` ("tu: Add workaround for D3D11 games accessing UBO out of bounds") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Danylo Piliaiev	3c36e3b7b1	ir3: Add resbase_ir3 intrinsic Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Job Noorman	2fee7ac87f	nir/lower_ssbo: add option to insert bounds checks This is mostly useful in combination with `min_ssbo_size` when the native SSBO access instructions do the bounds check in HW so we don't want to add bounds checks for all SSBO accesses. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Job Noorman	7b2dfdf15d	nir/lower_ssbo: add option to only lower large SSBOs Some HW may have native SSBO instructions that only support a limited buffer size. It may be beneficial to use those instructions for small SSBOs and only fall back to global memory accesses for large ones. This commit adds an option (min_ssbo_size) that, if non-zero, will cause code like this to be emitted: if (@get_ssbo_size >= min_ssbo_size) { // global memory access } else { // original SSBO access } Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Job Noorman	ca9c01ddc5	nir/lower_ssbo: take offset_shift into account Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
Job Noorman	c0c1a2b0af	nir/get_io_index_src_number: support @load_ssbo_address Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>	2026-06-10 18:15:01 +00:00
squidbus	6e5773687f	kk,wsi/metal: Support VK_(KHR/EXT)_swapchain_maintenance1 Primary additions are support for releasing images and changing present mode in the Metal WSI backend. Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42062>	2026-06-10 17:41:01 +00:00
squidbus	5882459c45	kk,wsi/metal: Support VK_EXT_hdr_metadata HDR metadata is packed and passed through as the `CAMetalLayer` `EDRMetadata`. Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42045>	2026-06-10 17:07:45 +00:00
squidbus	621b816aeb	wsi/metal: Support HDR10 color spaces HDR color spaces also should enable `wantsExtendedDynamicRangeContent`. Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42045>	2026-06-10 17:07:45 +00:00
Michel Dänzer	178a3d7396	egl/gbm: Eliminate local variable "max_age" in get_back_bo Use dri2_surf->back->age instead. No functional change intended. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:59 +00:00
Michel Dänzer	a668971bb5	egl/gbm: Use continue instead of nested block Suggested by Daniel Stone. No functional change intended. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:59 +00:00
Michel Dänzer	aa3ef4dd42	egl/gbm: Eliminate local variable "age" in get_back_bo Use buffer->age instead. No functional change intended. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:59 +00:00
Michel Dänzer	5fdaab9ef8	egl/gbm: Use local variable for better readability in get_back_bo No functional change intended. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:58 +00:00
Michel Dänzer	ab8e57cf31	egl/gbm: Ignore current front buffer in get_back_bo Even if the front buffer isn't locked yet, it will normally get locked, so we can't reuse it as a back buffer. Pointed out by Daniel Stone. Fixes: `4a976b60b1` ("egl_dri2: use gbm_surface as the native window type in drm platform") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:58 +00:00
Michel Dänzer	962fd789c8	egl/gbm: Ignore buffers with no BO for destroying excess BOs Can't destroy a BO that doesn't exist in the first place. Should fix the crash described in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41845#note_3510521 . Fixes: `dd7ae41091` ("egl/gbm: Destroy excess BOs") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>	2026-06-10 16:21:58 +00:00
Jose Maria Casanova Crespo	519f631e6b	v3dv: gate Dawn-required limits and features behind V3D_WEBGPU_OVERRIDE Exposes higher limits than the ones supported by the HW and several ArrayDynamicIndexing features not yet implemented so the Dawn WebGPU implementation can be used while it doesn't exercise these limits or features. The override is enabled using the V3D_WEBGPU_OVERRIDE=1 envvar. When it is enabled it: - Increases the framebuffer dimension limit from the real HW value (4096 on RPi4, 7680 on RPi5) to 8192. - Bumps the advertised maxMipLevels reported per format from 13 to 14 to match the bumped 8192-wide images and 15 for non-2D images. The TMU HW already supports that for sampling. - Increases max_varying_components from 64 to 72 (HW limit is 64). - Exposes features that are not actually implemented; CTS tests that exercise them will hit asserts in debug builds: - shaderUniformBufferArrayDynamicIndexing - shaderSampledImageArrayDynamicIndexing - shaderStorageBufferArrayDynamicIndexing - shaderStorageImageArrayDynamicIndexing - Increases maxImageDimension1D from 4096 to 16384 - Increases maxImageDimension2D from 4096 to 8192 - Increases maxImageDimension3D from 4096 to 16384 - Increases maxImageDimensionCube from 4096 to 16384 When V3D_WEBGPU_OVERRIDE is unset (the default), the driver advertises the real HW limits already set up by the preceding "use real HW framebuffer limit" change, so Vulkan CTS conformance is unaffected. To help diagnose applications that hit the over-advertised paths, mesa_loge errors are emitted from three places: - lower_vulkan_resource_index() warns before the existing UNREACHABLE for dynamic descriptor indexing, so the cause is visible in release builds where the assertion is compiled out. - create_image() warns when vkCreateImage is called with attachment usage and dimensions above the real HW framebuffer limit. Storage and sampled-only images above that limit work fine via the TMU. - job_compute_frame_tiling() errors when a render job width/height exceeds the real HW framebuffer limit. The per-plane slices[] array in struct v3dv_image is sized at V3D_MAX_MIP_LEVELS + 2 so the override case (which advertises 14/15 mip levels for the bumped 8192-wide 2D images and 16384 for 1D/3D images) still fits without enlarging the default array. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>	2026-06-10 15:22:27 +00:00
Jose Maria Casanova Crespo	0ae28c9056	broadcom: raise framebuffer size to 7680 on V3D 7.1 Add a new max_framebuffer_size to devinfo so V3D 4.2 and V3D 7.1 can expose different framebuffer dimensions: 4096 on RPi4 and 7680 on RPi5. This is bounded by the maximum clip size supported by the framebuffer. Take advantage of this to also raise maxImageDimensions* to max_framebuffer_size. A non-power-of-two framebuffer means framebuffer_size_for_pixel_count can compute a height larger than max_framebuffer_size. Clamp the height to the maximum and recompute the width from the division so w * h <= num_pixels. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>	2026-06-10 15:22:27 +00:00
Jose Maria Casanova Crespo	5242d4c171	broadcom: add and use max_render_targets to devinfo Use the new devinfo value instead of the V3D_MAX_RENDER_TARGETS macro. We only maintain the usage of the macro in devinfo initialization and in the V3D versioned file src/gallium/drivers/v3d/v3dx_state.c Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>	2026-06-10 15:22:27 +00:00
Samuel Pitoiset	f21a95f890	radv/ci: skip all WSI tests also on NAVI21/NAVI31 To make sure pre-merge jobs don't hit the random issue, it would still be tested by nightly jobs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42146>	2026-06-10 14:58:31 +00:00
Jose Maria Casanova Crespo	94abf86561	v3dv: set non-zero array stride in null texture descriptor state pack_null_texture_state(), introduced to support VK_KHR_robustness2 nullDescriptor for image bindings, left the TEXTURE_SHADER_STATE "Array Stride (64-byte aligned)" field at 0. On real V3D HW it is fine: a TMU read against a null descriptor returns zero regardless of the descriptor contents, but the V3D simulator validates the TMU array stride before issuing the read. Setting array_stride_64_byte_aligned = 1 (64 bytes raw) fixes the failing dEQP-VK.robustness.robustness2.bind..null_descriptor.samples_1.3d. test cases under the simulator. Fixes: `990d76eae6` ("v3dv: Implement and enable nullDescriptor support") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42112>	2026-06-10 14:38:50 +00:00
Eric Engestrom	127acbb126	ci: bump fedora from 42 to 44 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>	2026-06-10 13:53:26 +00:00
Eric Engestrom	4ebf2e3baa	ci: bump bindgen version from 0.71.1 to 0.72.1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>	2026-06-10 13:53:26 +00:00
Eric Engestrom	dae8bc711d	ci: bump rust version from 1.90 to 1.96 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>	2026-06-10 13:53:25 +00:00
Eric Engestrom	47570e74ec	meson: exclude known buggy versions of bindgen Backport-to: * Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>	2026-06-10 13:53:25 +00:00
Eric Engestrom	09ea05cf23	rusticl: skip bindgen for pipe_shader_state_from_tgsi bindgen up to at least 0.72.1 generates invalid code (see below) and that function is not used, so simply skip it. src/gallium/frontends/rusticl/rusticl_mesa_bindings.c:795:81: error: duplicate ‘const’ declaration specifier [-Werror=duplicate-decl-specifier] 795 | void pipe_shader_state_from_tgsi__extern(struct pipe_shader_state state, const const struct tgsi_token tokens) { pipe_shader_state_from_tgsi(state, tokens); } | ^~~~~ Backport-to: * Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>	2026-06-10 13:53:25 +00:00
yserrr	38a98a4803	v3d: remove duplicate util_blitter_save_so_targets() call v3d_blitter_save() saves the stream output targets twice. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42139>	2026-06-10 13:36:26 +00:00
Rhys Perry	addc719ec2	radv: workaround has_smem_partial_oob_access_bug Just use an existing flag to increase the bo size slightly. Fixes a ring gfx timeout with dEQP-VK.spirv_assembly.instruction.compute.opfma.fp32.vec3.undef.denorm_flush.directed on vega10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: * Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41937>	2026-06-10 13:01:47 +00:00
Rhys Perry	f7a3884278	ac/gpu_info: add has_smem_partial_oob_access_bug Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: * Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41937>	2026-06-10 13:01:46 +00:00
Rhys Perry	bed7ba2780	aco: schedule split barriers Move the s_barrier_signal earlier and the s_barrier_wait later. fossil-db (gfx1201): Totals from 2152 (1.03% of 208640) affected shaders: Instrs: 1463236 -> 1463248 (+0.00%); split: -0.00%, +0.01% CodeSize: 7710732 -> 7710720 (-0.00%); split: -0.00%, +0.00% Latency: 7164883 -> 7159042 (-0.08%); split: -0.10%, +0.01% InvThroughput: 1593643 -> 1593651 (+0.00%); split: -0.00%, +0.00% VClause: 30170 -> 30166 (-0.01%) SClause: 26771 -> 26772 (+0.00%) Copies: 123002 -> 123004 (+0.00%) SALU: 221966 -> 221967 (+0.00%) VOPD: 1680 -> 1681 (+0.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:19 +00:00
Rhys Perry	26b942c306	aco: use split barrier instructions fossil-db (gfx1201): Totals from 135 (0.06% of 208640) affected shaders: Instrs: 155940 -> 155932 (-0.01%); split: -0.02%, +0.02% CodeSize: 905460 -> 905432 (-0.00%); split: -0.02%, +0.01% Latency: 1910087 -> 1909703 (-0.02%); split: -0.02%, +0.00% InvThroughput: 886321 -> 886280 (-0.00%) Copies: 12025 -> 12024 (-0.01%) VALU: 89681 -> 89679 (-0.00%) VOPD: 177 -> 178 (+0.56%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:19 +00:00
Rhys Perry	a95f841125	aco: add split barrier instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:19 +00:00
Rhys Perry	7c6be36cf4	aco: don't emit waitcnts before subgroup-scope execution barriers This delays the waitcnt for has_attr_ring_wait_bug by a few instructions. fossil-db (gfx1201): Totals from 9 (0.00% of 208640) affected shaders: Instrs: 19352 -> 19506 (+0.80%) CodeSize: 101180 -> 101716 (+0.53%) Latency: 660221 -> 678782 (+2.81%); split: -0.00%, +2.81% InvThroughput: 95106 -> 97398 (+2.41%) fossil-db (navi33): Totals from 58834 (28.20% of 208626) affected shaders: Instrs: 22424304 -> 22424571 (+0.00%) CodeSize: 110198112 -> 110199184 (+0.00%) Latency: 115894319 -> 126491124 (+9.14%); split: -0.00%, +9.14% InvThroughput: 19424631 -> 19754358 (+1.70%); split: -0.00%, +1.70% I don't think the stats are very accurate. This seems to often move the s_waitcnt down into a divergent branch, but the wait still happens later if the branch isn't taken, so the wait is counted twice. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:18 +00:00
Rhys Perry	3676c3860e	aco: only assume load/store with semantic_atomic is atomic ACCESS_ATOMIC was added a while ago. fossil-db (gfx1201): Totals from 84 (0.04% of 208640) affected shaders: Instrs: 74569 -> 74402 (-0.22%) CodeSize: 379220 -> 378552 (-0.18%) Latency: 589791 -> 575984 (-2.34%) InvThroughput: 56042 -> 54921 (-2.00%) fossil-db (navi33): Totals from 79 (0.04% of 208626) affected shaders: Instrs: 69170 -> 69015 (-0.22%) CodeSize: 349580 -> 348928 (-0.19%) Latency: 563270 -> 549156 (-2.51%) InvThroughput: 61245 -> 59887 (-2.22%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:18 +00:00
Rhys Perry	a0d5c117fc	aco: optimize redundant s_wait_alu vm_vsrc(0) during waitcnt insertion fossil-db (gfx1201): Totals from 143 (0.07% of 208640) affected shaders: Instrs: 104804 -> 104588 (-0.21%) CodeSize: 543148 -> 542320 (-0.15%) Latency: 751702 -> 751446 (-0.03%); split: -0.04%, +0.00% InvThroughput: 78599 -> 78588 (-0.01%); split: -0.02%, +0.00% fossil-db (navi33): Totals from 170 (0.08% of 208626) affected shaders: Instrs: 107230 -> 106983 (-0.23%) CodeSize: 554952 -> 553940 (-0.18%) Latency: 746901 -> 746628 (-0.04%); split: -0.04%, +0.00% InvThroughput: 102412 -> 102390 (-0.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:18 +00:00
Rhys Perry	650715b077	aco: fix printing of primitive exports Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:17 +00:00
Rhys Perry	c815c51dcb	aco/waitcnt: always use uint32_t for event masks This shouldn't fix anything, because event_vmem_bvh was never used here. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>	2026-06-10 12:13:17 +00:00
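
Note on 11334c438a (ir3: fix possible signed overflow in ir3_link_add): the commit message states that `1 << 31` is undefined because `1` is a signed integer. A minimal sketch of the usual remedy follows, assuming a 32-bit `int`. The helper name `slot_bit` is hypothetical and is not taken from the ir3 sources; it only illustrates building the mask from an unsigned constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from Mesa): return a mask with only bit `slot` set.
 * UINT32_C(1) is unsigned, so shifting it into bit 31 is well defined,
 * whereas `1 << 31` shifts a signed int into its sign bit. */
static uint32_t slot_bit(unsigned slot)
{
    assert(slot < 32); /* shifting by the type width or more is also undefined */
    return UINT32_C(1) << slot;
}
```

For instance, `slot_bit(0) | slot_bit(31)` yields `0x80000001` without relying on signed-shift behaviour.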

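Note on 0ae28c9056 (broadcom: raise framebuffer size to 7680 on V3D 7.1): the message describes clamping the height to the maximum and recomputing the width by division so that w * h <= num_pixels. The sketch below shows only that clamp step under stated assumptions. `struct fb_size`, `clamp_fb_size` and the `candidate_h` parameter are hypothetical names, and the way Mesa's framebuffer_size_for_pixel_count picks its initial height is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
struct fb_size {
    uint32_t w;
    uint32_t h;
};

/* Hypothetical helper (not the Mesa function): clamp a candidate height to
 * max_size, then derive the width from the pixel budget. Integer division
 * rounds down, so w * h never exceeds num_pixels. */
static struct fb_size
clamp_fb_size(uint32_t num_pixels, uint32_t candidate_h, uint32_t max_size)
{
    struct fb_size s;

    assert(max_size > 0);

    s.h = candidate_h > max_size ? max_size : candidate_h;
    if (s.h == 0)
        s.h = 1; /* avoid dividing by zero */

    s.w = num_pixels / s.h;
    if (s.w > max_size)
        s.w = max_size; /* the width obeys the same limit */

    return s;
}
```

With a budget of 1000 pixels, a candidate height of 300 and a limit of 100, this returns a height of 100 and a width of 10.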

224098 commits