fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-27 06:10:13 +01:00

Author	SHA1	Message	Date
Romaric Jodin	a25e88cd84	aux/trace: remove -I argument enums2names.py is only used in one place. I propose to remove the -I argument that is not strictly necessary as we can already get the header name from the `-H` argument. That modification is motivated by the need to help ninja-to-soong to generate a proper rule for the Android build system. ninja-to-soong can't differentiate output file location and a string matching the output file name. Ref #14072 Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37785>	2025-10-23 07:03:01 +00:00
Faith Ekstrand	aa0f404f7b	nvk: Disable sampleLocationsSampleCounts for 1x MSAA Suggested-by: Mel Henning <mhenning@darkrefraction.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14108 Fixes: `a34edc7500` ("nvk: Fill out sample locations on Maxwell B+") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38018>	2025-10-22 21:55:22 +00:00
Faith Ekstrand	d1793c7a59	nvk: Include the chipset in the pipeline/binary cache UUID Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38018>	2025-10-22 21:55:22 +00:00
Lionel Landwerlin	aa929ea706	nir/lower_io: add missing levels intrinsics to get_io_index_src_number Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c7ac46a1d8` ("nir/lower_io: add get_io_index_src_number support for image intrinsics") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38012>	2025-10-22 21:21:47 +00:00
Aitor Camacho	cf49338ccd	kk: Expose missing BC formats Exposing textureCompressionBC requires all BC formats in Vulkan. Missing BC formats added in this commit: - VK_FORMAT_BC1_RGB_UNORM_BLOCK - VK_FORMAT_BC1_RGB_SRGB_BLOCK Fixes: `fa6fa8c19e` ("kk: BCn Formats") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38014>	2025-10-22 21:08:27 +00:00
Lionel Landwerlin	f3df267735	brw: handle GLSL/GLSL tessellation parameters Apparently various tessellation parameters come specified from TESS_EVAL stage in GLSL while they come from the TESS_CTRL stage in HLSL. We switch to store the tessellation params more like shader_info with 0 values for unspecified fields. That lets us merge it with a simple OR with values from tcs/tes and the resulting merge can be used for state programming. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a91e0e0d61` ("brw: add support for separate tessellation shader compilation") Fixes: `50fd669294` ("anv: prep work for separate tessellation shaders") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37979>	2025-10-22 20:48:59 +00:00
Lionel Landwerlin	8d05b7b72e	anv: rename structure holding 3DSTATE_WM_DEPTH_STENCIL state Cc stable for the next commit. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37979>	2025-10-22 20:48:59 +00:00
Mel Henning	fafb81cd02	treewide: Use vk_collect_dependency_info_src_stages Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37961>	2025-10-22 19:22:17 +00:00
Mel Henning	90d17c0088	vulkan: Add vk_collect_dependency_info_src_stages Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37961>	2025-10-22 19:22:17 +00:00
Valentine Burley	fd2fa0fbc9	tu: Fix maxVariableDescriptorCount with inline uniform blocks It must not be larger than maxInlineUniformBlockSize. Fixes VKCTS 1.4.4.0's dEQP-VK.api.maintenance3_check.support_count_inline_uniform_block*. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38002>	2025-10-22 18:14:00 +00:00
Valentine Burley	17e25b4983	tu: Fix indexing with variable descriptor count Based on RADV. The Vulkan spec says: "If bindingCount is zero or if this structure is not included in the pNext chain, the VkDescriptorBindingFlags for each descriptor set layout binding is considered to be zero. Otherwise, the descriptor set layout binding at VkDescriptorSetLayoutCreateInfo::pBindings[i] uses the flags in pBindingFlags[i]." Fixes dEQP-VK.api.maintenance3_check.* in VKCTS 1.4.4.0. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38002>	2025-10-22 18:14:00 +00:00
Simon Perretta	ff51e6dc9e	nir: commonize barycentric intrinsic opt pass Introduces an opt pass that attempts to optimize load_barycentric_at_{sample,offset} with simpler load_barycentric_* equivalents where possible, and optionally lowers load_barycentric_at_sample to load_barycentric_at_offset with a position derived from the sample ID instead. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37658>	2025-10-22 16:48:01 +00:00
Caio Oliveira	e38491eb18	mesa/st: Lower to ALU scalar after fp64 subgroup lowering The subgroup lowering may generate new fp64 vector operations, so ensure that those are lowered before calling nir_lower_doubles(). Issue spotted by Georg Lehmann. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38003>	2025-10-22 15:35:34 +00:00
Isaac Marovitz	fa6fa8c19e	kk: BCn Formats Signed-off-by: Isaac Marovitz <isaacryu@icloud.com> Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37967>	2025-10-22 15:21:29 +00:00
Jarrett Johnson	a62f285607	kk: advertise multiDrawIndirect Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37962>	2025-10-22 15:08:13 +00:00
Olivia Lee	bca29b1c92	hk: fix data race when initializing poly_heap hk_heap is called during command buffer recording, which may be concurrent, so writing dev->heap without synchronization is a data race. Signed-off-by: Olivia Lee <olivia.lee@collabora.com> Fixes: `5bc8284816` ("hk: add Vulkan driver for Apple GPUs") Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37973>	2025-10-22 14:54:24 +00:00
Benjamin Cheng	b6d6c1af73	radv/video_enc: Cleanup slice count assert This was left over when first enabling multiple slice encoding. Fixes: `63e952ff2c` ("radv/video: Support encoding multiple slices") Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37999>	2025-10-22 14:31:45 +00:00
Alyssa Rosenzweig	05481f56a0	brw: use the right int8/int16 division lowering lowering bitsize before lowering idiv is silly, since then it forces us down the software int32 division path instead of the much faster int8/int16 lowered path. Relevant CTS tests: dEQP-VK.spirv_assembly.type.scalar.i16.div_comp, dEQP-VK.spirv_assembly.type.scalar.i8.rem_comp, Go from: SIMD8 shader: 46 instructions. 1 loops. 4716 cycles. 0:0 spills:fills SIMD8 shader: 1008 instructions. 0 loops. 3600 cycles. 0:0 spills:fills, 8 sends to: SIMD8 shader: 17 instructions. 1 loops. 2556 cycles. 0:0 spills:fills SIMD8 shader: 464 instructions. 0 loops. 1394 cycles. 0:0 spills:fills, 8 sends No stats change on fossil-db (which has very little int8/int16 and even less integer division, apparently). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37966>	2025-10-22 10:00:36 -04:00
Aksel Hjerpbakk	c2a6fb6419	panvk: cull semaphores in unrelated subqueues Skip waiting/signaling on semaphores with stages not related to a given subqueue Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>	2025-10-22 11:48:50 +00:00
Aksel Hjerpbakk	1381228329	panvk: refactor vk_stage_to_subqueue_mask - rename vk_stage_to_subqueue_mask -> vk_stages_to_subqueue_mask - handle stage masks instead of single stages. - Add which sync scope it is reading to better reason with the mask semantics. - Handle ALL_COMMANDS as well as TOP/BOTTOM (using sync scopes) - add timestamp utility vk_stage_to_timestamp_subqueue_mask Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>	2025-10-22 11:48:50 +00:00
Julian Orth	9fde755357	kopper: disable color management for wayland surfaces Otherwise clients cannot use color management themselves. Signed-off-by: Julian Orth <ju.orth@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37693>	2025-10-22 10:22:41 +00:00
Pierre-Eric Pelloux-Prayer	90103fe618	radeonsi: propagate shader updates for merged shaders In case of merged shaders (eg: VS+GS), a change to VS should trigger a GS update. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13935 Fixes: `b1a34ac95d` ("radeonsi: change do_update_shaders boolean to a bitmask") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37882>	2025-10-22 09:45:49 +00:00
Pierre-Eric Pelloux-Prayer	091f18ea57	radeonsi: set VS dirty bit from si_vs_key_update_inputs Since it has to be done for every caller we might as well do it from a single place. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37882>	2025-10-22 09:45:49 +00:00
Pierre-Eric Pelloux-Prayer	d3f8571e84	radeonsi: limit the sqtt buffer size We don't want the buffer size to wraparound to 0. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37882>	2025-10-22 09:45:49 +00:00
Lars-Ivar Hesselberg Simonsen	a7bf37e3bd	panvk/v9+: Reduce maxBoundDescriptorSets to 7 Until now, the driver has been using a single set for internal state while leaving maxBoundDescriptorSets to the remaining 15. This gives us no room for optimizations of driver sets, which might become an issue in the future. To remedy this, we therefore reduce maxBoundDescriptorSets to 7. This aligns with the proprietary driver and gives us the space to optimize the driver sets. We might increase this in the future if we see that we don't need all the driver sets we now reserve. Reviewed-by: John Anthony <john.anthony@arm.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37978>	2025-10-22 10:38:35 +02:00
Eric Engestrom	4ab65cdaa4	docs: update/fix vk spec urls Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37993>	2025-10-22 09:23:34 +02:00
Faith Ekstrand	efbecd93ba	util: Build util/cache_ops_x86.c with -msse2 __builtin_ia32_clflush() requires -msse2 so we need to set -msse2 at least for building that file. Fortunately, there are no GPUs that actually need userspace cache flushing that can ever be bolted onto a pre-SSE2 x86 CPU. Fixes: `555881e574` ("util/cache_ops: Add some cache flush helpers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14134 Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37958>	2025-10-22 03:36:10 +00:00
Faith Ekstrand	3739d7a90c	util: Don't advertise cache ops on x86 without SSE2 Fixes: `555881e574` ("util/cache_ops: Add some cache flush helpers") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37958>	2025-10-22 03:36:10 +00:00
Olivia Lee	a410d90fd2	panfrost: fix cl_local_size for precompiled shaders nir_lower_compute_system_values will attempt to lower load_workgroup_size unless workgroup_size_variable is set. For precomp shaders, the workgroup size is set statically for each entrypoint by nir_precompiled_build_variant. Because we call lower_compute_system_values early, it sets the workgroup size to zero. Temporarily setting workgroup_size_variable while we are still processing all the entrypoints together inhibits this. Signed-off-by: Olivia Lee <olivia.lee@collabora.com> Fixes: `20970bcd96` ("panfrost: Add base of OpenCL C infrastructure") Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37799>	2025-10-22 00:15:49 +00:00
Aitor Camacho	02a1ec1021	kk: Hash vertex input state We embed vertex fetching into vertex shaders and therefore we require hashing that state so the hash works as expected. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37992>	2025-10-21 23:59:33 +00:00
Lorenzo Rossi	dc0dcc993b	nvk: implement VK_EXT_discard_rectangles Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33476>	2025-10-21 23:17:38 +00:00
Lorenzo Rossi	4c62e09505	vulkan: increase MESA_VK_MAX_DISCARD_RECTANGLES Turing and newer Nvidia cards can work with up to 8 discard rectangles Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33476>	2025-10-21 23:17:38 +00:00
Rhys Perry	b18421ae3d	amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8 This can happen with (for example) 32x2 loads with align_mul=4,align_offset=2. This patch does bit_size=min(bit_size,bytes) to prevent num_components from being 0. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `52cd5f7e69` ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	64ec757688	nir/lower_mem_access_bit_sizes: increase chunk limit Not sure about creating u64vec16 loads, but creating unaligned loads is possible with opt_if_rewrite_uniform_uses. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	e89b22280f	amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.3 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	8829fc3bd6	amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering Summary of changes: - handle unaligned 16-bit scalar loads when supported_dword=true - increases the size of 8/16/32/64-bit buffer loads which are not dword aligned, which can create less SMEM loads. - handles when "bytes" is less than "bit_size / 8" fossil-db (gfx1201): Totals from 26 (0.03% of 79839) affected shaders: Instrs: 12676 -> 12710 (+0.27%); split: -0.30%, +0.57% CodeSize: 67272 -> 67384 (+0.17%); split: -0.24%, +0.40% Latency: 44399 -> 44375 (-0.05%); split: -0.09%, +0.04% SClause: 352 -> 344 (-2.27%) SALU: 3972 -> 3992 (+0.50%) SMEM: 554 -> 528 (-4.69%) fossil-db (navi21): Totals from 6 (0.01% of 79825) affected shaders: Instrs: 2192 -> 2186 (-0.27%) CodeSize: 12188 -> 12140 (-0.39%) Latency: 10037 -> 10033 (-0.04%); split: -0.12%, +0.08% SMEM: 124 -> 118 (-4.84%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	79b2fa785d	amd/lower_mem_access_bit_sizes: don't create subdword UBO loads with LLVM These are unsupported. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14127 Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:33 +00:00
Dylan Baker	38e1a43f53	intel/mda: Fix potential underflow in printing code The actual chances of this happening seem dubious, but the cleaned up code seems nice. printf returns a value >= 0 on success, which is the number of characters it writes; a return < 0 means that an error occurred, and then errno is set. Which negative value doesn't seem to be specified, but it also seems unlikely that any implementation would return `-MAX_INT`... Anyway, this is fixed by converting the generic `print_repeated` to a `print_separator` that avoids the need to do arithmetic at all by just stopping the loop at 1 instead of 0, and then printing a newline. CID: 1666497 CID: 1666256 CID: 1666531 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37746>	2025-10-21 21:55:53 +00:00
Dylan Baker	f25e59b951	intel/mda/tests: use an ASSERT on fread() Coverity is pointing out that we should check this, and in reality if this isn't what we expect the rest of the test is probably invalid anyway. CID: `1666504` CID: 1666544 CID: 1666552 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37750>	2025-10-21 14:39:18 -07:00
Mel Henning	28fbc6addb	nvk: VK_DEPENDENCY_ASYMMETRIC_EVENT_BIT_KHR This was missed in the original maintenance9 MR. Fixes the flakes in test dEQP-VK.synchronization2.op.single_queue.event.write_ssbo_compute_read_ssbo_compute.buffer_16384_maintenance9 Fixes: `7692d3c0` ("nvk: Advertise VK_KHR_maintenance9") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37964>	2025-10-21 20:57:41 +00:00
Karol Herbst	e7dca5a6ca	nak: fix MMA latencies on Ampere Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Fixes: `7a01953a39` ("nak: Add Ampere and Ada latency information") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37941>	2025-10-21 20:12:30 +00:00
Karol Herbst	cf4df97093	nak: improve fp16 latencies on Ampere Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37941>	2025-10-21 20:12:30 +00:00
Karol Herbst	85480200f8	nak: simplify SM80 HMMA latency categorization Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37941>	2025-10-21 20:12:30 +00:00
Karol Herbst	3bbf3f7826	nak: ensure deref has a ptr_stride in cmat load/store lowering With untyped pointer we might get a deref_cast with a 0 ptr_stride. But we were supposed to ignore the stride information on the pointer anyway, so let's do that properly now. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Fixes: `05dca16143` ("nak: extract nir_intrinsic_cmat_load lowering into a function") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37941>	2025-10-21 20:12:30 +00:00
Karol Herbst	f632bfc715	nak: extract cmat load/store element offset calculation Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Fixes: `05dca16143` ("nak: extract nir_intrinsic_cmat_load lowering into a function") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37941>	2025-10-21 20:12:30 +00:00
Konstantin Seurer	d423554e9e	radv/bvh: Pair compress triangles in more cases Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>	2025-10-21 19:32:55 +00:00
Konstantin Seurer	c0f332f1cb	vulkan/bvh: Add leaf.h to vk_bvh_includes Otherwise, the shader will not recompile when the file was modified. Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>	2025-10-21 19:32:55 +00:00
Konstantin Seurer	020bd86d30	vulkan: Remove the vk_ir_triangle_node::id field Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>	2025-10-21 19:32:55 +00:00
Konstantin Seurer	c18a7d0e2b	radv: Emit compressed primitive nodes on GFX12 The normal encode pass writes batches to a section in build scratch memory. Those batches contain information about the internal node and the primitive nodes. The encoder is split to avoid the register pressure of the compressor and maximize occupancy. The compressor works in two passes because one pass can not guarantee that every primitive node (except) has at least two triangles. This guarantee is used to advertise a smaller acceleration structure size to the application. During compression, every invocation processes at most two triangles. Groups of 8 invocations are used to support the maximum triangle count of 16 that the hardware supports. The first step of compression is loading the triangle(s). Shared vertices are deduplicated early to avoid doing it in the compression loop. The compression loop tries to add triangles to a list of triangles until the computed node size needed for storing the triangles reaches the hardware node size. For this, each invocation first deduplicates vertices with the triangles that have already been picked. It then computes the node size of the picked triangles plus the candidate triangles of the current invocation. The invocation that computed the smallest size is added to the list. Because it may not be possible to fit every triangle into the same node, there can be multiple hardware nodes which are written in parallel for optimal performance. If there are no nodes with only one triangle, all nodes are written. If there is, compression of the batch is aborted and the index of the batch is written to build scratch memory. The second compression pass will repeat the steps above but only for those aborted batches. The nodes with only one triangle can and are now merged. It can not be determined during box node encode which triangles will be compressed together so the encoder also has to fix up the parent box node's child infos. 
Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>	2025-10-21 19:32:55 +00:00
Konstantin Seurer	c5f9fe5e3b	radv/rra/gfx12: Properly validate geometry indices Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>	2025-10-21 19:32:54 +00:00


213920 commits