fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-21 21:40:22 +01:00

Author	SHA1	Message	Date
David Rosca	e3f886ac15	radeonsi/vcn: Use correct frame context buffer for preencode on VCN5 Fixes: `3c5fe03b92` ("radeonsi/vcn: Add support for VCN5 dpb tier2") Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> (cherry picked from commit `4ec43c59da`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-19 14:29:03 -08:00
Francisco Jerez	f35c690b12	intel/fs/xe2: Fix up subdword integer region restriction with strided byte src and packed byte dst. This fixes a corner case of the LNL sub-dword integer restrictions that wasn't being detected by has_subdword_integer_region_restriction(), specifically: > if(Src.Type==Byte && Dst.Type==Byte && Dst.Stride==1 && W!=2) { > // ... > if(Src.Stride == 2) && (Src.UniformStride) && (Dst.SubReg%32 == Src.SubReg/2 ) { Allowed } > // ... > } All the other restrictions that require agreement between the SubReg number of source and destination only affect sources with a stride greater than a dword, which is why has_subdword_integer_region_restriction() was returning false except when "byte_stride(srcs[i]) >= 4" evaluated to true, but as implied by the pseudocode above, in the particular case of a packed byte destination, the restriction applies for source strides as narrow as 2B. The form of the equation that relates the subreg numbers is consistent with the existing calculations in brw_fs_lower_regioning (see required_src_byte_offset()), we just need to enable lowering for this corner case, and change lower_dst_region() to call lower_instruction() recursively, since some of the cases where we break this restriction are copy instructions introduced by brw_fs_lower_regioning() itself trying to lower other instructions with byte destinations. This fixes some Vulkan CTS test-cases that were hitting these restrictions with byte data types. Fixes: `217d412360` ("intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-19 14:28:55 -08:00
Lionel Landwerlin	7dc34f1147	anv: fix missing push constant reallocation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `62d96a6546` ("anv: add dirty tracking for push constant data") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12151 (cherry picked from commit `8845255881`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:16 -08:00
Kenneth Graunke	8e45bd6365	brw: Fix try_rebuild_source's ult32/ushr handling to use unsigned types We were accidentally doing a signed integer comparison here for ult32, or a sign-extending shift for ushr. One notable bit of fallout was that load_global_uniform_block_intel address calculations broke on platforms that don't have native 64-bit integer support, as the iadd64 lowering for "do I need to carry?" was using ult32...and performing the wrong comparison. We spotted this in Borderlands 3 on Alchemist once we turned on other optimizations. Thanks to Lionel Landwerlin for helping spot the problem! Fixes: `c7b312ad45` ("brw: factor out source extraction for rematerialization") Fixes: `339630ab05` ("brw: enable A64 loads source rematerialization") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `5848035443`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:15 -08:00
Karol Herbst	9728a9075c	vtn: handle struct kernel arguments passed by value Due to LLVM ABI reasons the SPIRV-LLVM-Translator always uses pointers to private memory for struct function parameters. This includes kernel entry points. However technically it's also legal to pass those parameters by value according to the OpenCL SPIR-V Env spec. One compiler making use of this is e.g. artic based on Thorin. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12149 Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (cherry picked from commit `d0560f59ce`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:14 -08:00
Erik Faye-Lund	f4d83eb508	glx: avoid null-deref psc can be NULL here, so let's avoid dereferencing it. Fixes: `34dea2b38e` ("glx: unify extension binding") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `5ced8b0ea2`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:13 -08:00
Mary Guillemard	3567dac750	bi: Execute nir_opt_algebraic after nir_lower_pack nir_lower_pack can generate split operations, execute algebraic again to handle them. This fixes an assert on "dEQP-VK.spirv_assembly.instruction.compute.opphi.vartype_float16" and probably other tests. Fixes: `3904cfabd6` ("bi: Use nir_opt_load_store_vectorize") Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: John Anthony <john.anthony@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (cherry picked from commit `e5d64ca69c`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:12 -08:00
Lionel Landwerlin	d857c4a418	anv: fix incorrect aspect flag for depth/stencil formats We're asking if compression is supported and anv_formats_ccs_e_compatible() is assuming color aspect. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0317c44872` ("anv: add VK_EXT_host_image_copy support") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12155 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> (cherry picked from commit `431f353bfe`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:12 -08:00
Matt Turner	a3a064b92c	vulkan: Avoid pointer aliasing Avoids the sanitizer errors: ``` Test case 'dEQP-VK.pipeline.monolithic.spec_constant.graphics.vertex.basic.mixed_packed'.. ../src/vulkan/util/vk_util.c:111:38: runtime error: load of misaligned address 0x603002b1c591 for type 'const uint16_t', which requires 2 byte alignment 0x603002b1c591: note: pointer points here 00 00 00 98 76 98 54 76 98 ba 10 32 54 76 98 ba dc fe ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ../src/vulkan/util/vk_util.c:108:38: runtime error: load of misaligned address 0x603002b1c593 for type 'const uint32_t', which requires 4 byte alignment 0x603002b1c593: note: pointer points here 00 98 76 98 54 76 98 ba 10 32 54 76 98 ba dc fe ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ../src/vulkan/util/vk_util.c:105:38: runtime error: load of misaligned address 0x603002b1c597 for type 'const uint64_t', which requires 8 byte alignment 0x603002b1c597: note: pointer points here 54 76 98 ba 10 32 54 76 98 ba dc fe ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 03 11 00 ^ ``` Fixes: `476dc3c050` ("vulkan: add vk_spec_info_to_nir_spirv util method") (cherry picked from commit `3d24f0ece1`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:11 -08:00
Boris Brezillon	de9faec619	panvk/csf: Fix a wait-LS operation in finish_cs() cs_wait_slots() expects a mask, cs_wait_slot() a slot ID. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> (cherry picked from commit `c3ff3f2405`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:10 -08:00
Karol Herbst	1c6b2f701c	rusticl/kernel: fix kernel variant selection Apparently I messed up enough so that the optimized kernel variant was almost never selected. This fixes that :) Fixes: `f098620c21` ("rusticl/kernel: add optimized Kernel variant") (cherry picked from commit `a5149f3fef`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:09 -08:00
Robert Mader	436e5c06b9	v3d: Support SAND128 base modifier The BROADCOM_SAND128 modifier is usually used with an extra parameter to pass in the stride via a side channel. Quoting from drm_fourcc.h: > The pitch between the start of each column is set to optimally > switch between SDRAM banks. This is passed as the number of lines > of column width in the modifier (we can't use the stride value due > to various core checks that look at it , so you should set the > stride to width*cpp). So apparently this is just a workaround for limitations in some kernel APIs. DRM modifiers, however, are arguably a bad fit for extra parameters that aren't known in advance. In the Wayland/KMS ecosystem many components depend on being able to treat modifiers as opaque, e.g. for negotiations etc. In practice the current approach requires various software components to manually use the `DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT()` macro - using the `DRM_FORMAT_MOD_BROADCOM_SAND128` modifier directly with formats like `NV12` results in a rejection in the KMS driver and corrupted output in Mesa (because we'd bail out early in `v3d_sand8_blit()`). Fortunately the stride check limitations mentioned above don't seem to apply to Mesa though. Thus we can just add support for the base modifier and stride (coming from V4L2), allowing various toolkits, Wayland compositors and V4L2 decoder implementations to support e.g. `NV12` + `DRM_FORMAT_MOD_BROADCOM_SAND128` (`NC12` in V4L2) in a generic way. Notes: 1. Wayland compositors trying to offload composition to KMS will still fail when doing a test commit. 2. There is another limitation - in the V4L2 MPLANE API - that requires userspace to know the correct offset of the second plane. That's a known API limitation though and only affects V4L2 decoder implementations. Cc: mesa-stable Signed-off-by: Robert Mader <robert.mader@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> (cherry picked from commit `758941ab0c`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:07 -08:00
Sam Lantinga	922a339d91	util: Fixed crash in HEVC encoding on 32-bit systems This builds on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25059, and extends that change to all 32-bit systems. This fixes a crash on SteamOS with the following test case: unsigned char data[] = { 0x00, 0x00, 0x00, 0x01, 0x40, 0x01, 0x0c, 0x01, 0xff, 0xff, 0x01, 0x60, 0x00, 0x00, 0x03, 0x00, 0xb0, 0x00, 0x00, 0x03, 0x00, 0x00, 0x03, 0x00, 0x99, 0x2c, 0x0c, 0x01, 0x64, 0x7c, 0x00, 0x7c, 0xd2, 0x56, 0x01, 0x40, 0x00, 0x00, 0x00, 0x01, 0x42, 0x01, 0x01, 0x01, 0x60, 0x00, 0x00, 0x03, 0x00, 0xb0, 0x00, 0x00, 0x03, 0x00, 0x00, 0x03, 0x00, 0x99, 0xa0, 0x02, 0x80, 0x80, 0x32, 0x16, 0x24, 0xbb, 0x90, 0x84, 0x48, 0x9a, 0x83, 0x03, 0x03, 0x02, 0x00, 0xb2, 0x3e, 0x00, 0x3e, 0x69, 0x2b, 0x00, 0x5f, 0x08, 0x04, 0x10, 0x00, 0x00, 0x00, 0x01, 0x44, 0x01, 0xc0, 0x62, 0x0f, 0x02, 0x24 }; vlVaContext context; vlVaBuffer buf; memset(&context, 0, sizeof(context)); memset(&buf, 0, sizeof(buf)); context.packed_header_emulation_bytes = true; buf.data = data; buf.size = sizeof(data); vlVaHandleVAEncPackedHeaderDataBufferTypeHEVC(&context, &buf); Cc: mesa-stable Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `4ed8ef74b4`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:06 -08:00
Samuel Pitoiset	b4b12c6708	radv: fix ignoring src stage mask when dst stage mask is BOTTOM_OF_PIPE Otherwise the driver doesn't synchronize if there are image layout transitions. This fixes rendering issues with displayable DCC (usually black squares in the bottom of screen). This mostly happens when an application uses a lower resolution than the screen supports and fshack (wine/proton) which upscales images uses COMPUTE_SHADER->BOTTOM_OF_PIPE for the barrier after a dispatch. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11547 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11600 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11789 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8705 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9890 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `c08d2c40ed`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:40:05 -08:00
Iván Briano	232c6b2d8e	anv: remove unused/misleading/wrong parameters from the RT trampoline Since the shader parameters are passed as inline data, push constants are no longer used and so, not actually set on dispatch. But the nr_params = 4 was still making the shader emit the code to load them, causing page faults on simulation, and would also on HW if we didn't always have a scratch page set. The uses_inline_data parameter will be set from brw_compile_cs(), called shortly after this point, so we don't need it here. The subgroup_size is misleading, as we don't actually require that size and the code that checks for it isn't even running for this shader. Fixes: `97b17aa0b1` ("brw/nir: rework inline_data_intel to work with compute") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `d32a26b3e6`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:39:58 -08:00
David Heidelberg	1e9229fd09	compiler/rust: drop duplicated bindgen check The same check is present in meson file in root directory. Cc: mesa-stable # 24.3 Reviewed-by: Eric Engestrom <eric@igalia.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `1368ee5e1a`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-18 09:39:57 -08:00
Michel Dänzer	14f9d6456a	Revert "util: Use persistent array of index entries" This reverts commit `031f2c2a69`. It broke the macOS build. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12083 (cherry picked from commit `fdc1c61306`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:43 -08:00
Michel Dänzer	72271ed3fc	Revert "util/mesa-db: Further simplify mesa_db_compact" This reverts commit `92893309bc`. Need to revert this as well for the next revert. (cherry picked from commit `66d68263f8`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:42 -08:00
Danylo Piliaiev	bd8fb8a930	nir/nir_opt_offsets: Do not fold load/store with const offset > max When (off_const > max) there is a wrap around uint when calling try_extract_const_addition. Exit early since folding doesn't make sense in this case. Cc: mesa-stable Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (cherry picked from commit `b501cbf153`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:40 -08:00
Friedrich Vock	743b2fdf8e	vulkan/rmv: Correctly set heap size RMV expects the size to be in bits 5-68, not 4-68. Fixes: `845792db` ("vulkan: Add RMV file exporter") (cherry picked from commit `73d513c5be`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:39 -08:00
Lionel Landwerlin	9c55d78353	brw: allocate physical register sizes for spilling All of the spilling code should work with physical register units because for example SEND messages will expect a physical register as destination. So always allocate a full physical register for the spilled/unspilled values and adjust the offsets of the registers to physical sizes too. Cc: mesa-stable Fixes: `aa494cba` ("brw: align spilling offsets to physical register sizes") Closes: mesa/mesa#11967 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Found-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `a21cd8c5b6`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:39 -08:00
David Rosca	c1517edde6	radv/video: Avoid selecting rc layer over maximum Vulkan spec doesn't say if this is allowed or not, but trying to do this will hang. Fixes: `4a19047d32` ("radv/video: Select temporal layer when encoding each frame") Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `d1c1a33b35`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:38 -08:00
David Rosca	bab3391381	radv/video: Report correct encodeInputPictureGranularity Only aligned size can be encoded. Fixes: `54d499818c` ("radv/video: add initial support for encoding with h264.") Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `e941acfb9d`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:37 -08:00
David Rosca	42822bbca2	radv/video: Fix HEVC slice control This needs to use aligned size, otherwise it will output two slices when the size is not 64 aligned. Fixes: `967e4e09de` ("radv/video: add h265 encode support") Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `e4ec135d8b`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:36 -08:00
David Rosca	ecc3f03d83	radv/video: Fix H264 slice control This needs to use aligned size, otherwise it will output two slices when the size is not 16 aligned. Fixes: `54d499818c` ("radv/video: add initial support for encoding with h264.") Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `6a121f1507`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:36 -08:00
David Heidelberg	a725b1373e	llvmpipe: align with u_cpu_detect struct changes Cc: mesa-stable # 24.3 Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `d21f7f75ff`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:35 -08:00
David Heidelberg	9116861d3c	util: drop XOP detection code Introduced in 2013 with prospect of being used in future. ... 11 years later. Fixes: `4b45b61fef` ("util: add avx2 and xop detection to cpu detection code") # 24.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `962b996d4c`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:34 -08:00
David Heidelberg	cbb58f2623	util: Drop ancient Intel CPU detection We don't use it for anything. Cc: mesa-stable # 24.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `ca947e1295`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:34 -08:00
David Heidelberg	f6653b1f59	util: Remove MMX/MMXext detection code Currently pointless, Pentium II or Celeron and later has SSE. Cc: mesa-stable # 24.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `a78c2bf2a4`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:33 -08:00
David Heidelberg	41af3ea120	util: Drop 3Dnow optimisation leftovers Fixes: `a3218e65d1` ("mesa: remove long dead 3Dnow optimisation") # 24.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: David Heidelberg <david@ixit.cz> (cherry picked from commit `ae85e6920c`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:32 -08:00
Timothy Arceri	439879abd3	glsl/nir: fix function cloning at link time As per the code comment added in this commit the nir produced from glsl to nir doesn't always keep function declarations before the code that calls them e.g. calls from within other function implementations. The change in this commit works around this problem by first cloning all function declarations in a first pass, then cloning the implementations in a second pass once we have filled the remap table. Fixes: `cbfc225e2b` ("glsl: switch to a full nir based linker") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12115 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Acked-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `59b2549279`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>	2024-11-14 09:31:31 -08:00
Tomeu Vizoso	e86386df89	etnaviv/nn: Fix use of etna_core_info Right now we were retrieving the properties of the NPU from the etna_core_info of the GPU. Fixes: `92a6f697d5` ("etnaviv: npu: Switch to use etna_core_info") Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> (cherry picked from commit `f9bb9aa7d5`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-13 08:30:59 -08:00
Tomeu Vizoso	e839ff344e	etnaviv/ml: Fix includes etnaviv_ml.h uses dynarray, but the u_inlines.h header is needed by some of the files that include it. Fixes: `d6473ce28e` ("etnaviv: Use NN cores to accelerate convolutions") Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> (cherry picked from commit `70bff0c971`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-13 08:30:59 -08:00
M Henning	3e45c3eec2	nvk: Fix invalidation of NVK_CBUF_TYPE_DYNAMIC_UBO Because dyn_start and dyn_end are indices into nvk_root_descriptor_table->dynamic_buffers, we would need to offset cbuf->dynamic_idx by nvk_root_descriptor_table->set_dynamic_buffer_start[cbuf->desc_set] in order to do those comparisons correctly. We could do that, but it's simpler and no less precise to simply re-use the same comparison that we do in the other cases here. This fixes a rendering artifact in Baldur's Gate 3 (Vulkan), which regressed with the commit listed below. Fixes: `091a945b57` ("nvk: Be much more conservative about rebinding cbufs") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> (cherry picked from commit `dc12c78235`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-13 08:30:58 -08:00
M Henning	e7ebb97fdf	nvk/cmd_buffer: Pass count to set_root_array Previously, we were passing the end index which was incorrect. Also, improve the macros so that they can take an expression for the count. Fixes: `b2d85ca36f` ("nvk: Use helper macros for accessing root descriptors") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> (cherry picked from commit `64f17c1391`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-13 08:30:57 -08:00
Lionel Landwerlin	97d974a3ad	anv: update shader descriptor resource limits Some limits got stuck to the old binding table limits. Those don't apply anymore since EXT_descriptor_indexing was implemented. Fixes: `6e230d7607` ("anv: Implement VK_EXT_descriptor_indexing") Fixes: `96c33fb027` ("anv: enable direct descriptors on platforms with extended bindless offset") Reviewed-by: Ivan Briano <ivan.briano@intel.com> (cherry picked from commit `d6acb56f11`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-13 08:30:56 -08:00
Jose Maria Casanova Crespo	dc8e19aede	v3d: Enable Early-Z with discards when depth updates are disabled The Early-Z optimization is disabled when there is a discard instruction in the shader used in the draw call. But if discard is the only reason to disable Early-Z, and at draw call time the updates in the draw call are disabled we can enable Early-Z using a shader variant. If there are occlusion queries active we also need to disable Early-Z optimization. So this patch enables Early-Z in this scenario. The performance improvement is significant when running gfxbench benchmark showing an average improvement of 11.15% fps_avg helped: gl_gfxbench_aztec_high.trace: 3.13 -> 3.73 (19.13%) fps_avg helped: gl_gfxbench_aztec.trace: 4.82 -> 5.68 (17.88%) fps_avg helped: gl_gfxbench_manhattan31.trace: 5.10 -> 6.00 (17.59%) fps_avg helped: gl_gfxbench_manhattan.trace: 7.24 -> 8.36 (15.52%) fps_avg helped: gl_gfxbench_trex.trace: 19.25 -> 20.17 ( 4.81%) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable (cherry picked from commit `5b951bcdd7`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:20 -08:00
Karmjit Mahil	d185a4658e	nir: Fix `no_lower_set` leak on early return Addresses: ``` Indirect leak of 256 byte(s) in 2 object(s) allocated from: #0 0x7faaf53ee0 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x7fa8cfe900 in ralloc_size ../src/util/ralloc.c:118 #2 0x7fa8cfeb20 in rzalloc_size ../src/util/ralloc.c:152 #3 0x7fa8cff004 in rzalloc_array_size ../src/util/ralloc.c:232 #4 0x7fa8d06a84 in _mesa_set_init ../src/util/set.c:133 #5 0x7fa8d06bcc in _mesa_set_create ../src/util/set.c:152 #6 0x7fa8d0939c in _mesa_pointer_set_create ../src/util/set.c:613 #7 0x7fa95e5790 in nir_lower_mediump_vars ../src/compiler/nir/nir_lower_mediump.c:574 #8 0x7fa862c1c8 in tu_spirv_to_nir(tu_device, void, unsigned long, VkPipelineShaderStageCreateInfo const, tu_shader_key const, pipe_shader_type) ../src/freedreno/vulkan/tu_shader.cc:116 #9 0x7fa8646f24 in tu_compile_shaders(tu_device, unsigned long, VkPipelineShaderStageCreateInfo const, nir_shader, tu_shader_key const, tu_pipeline_layout, unsigned char const, tu_shader, char, void, nir_shader, VkPipelineCreationFeedback) ../src/freedreno/vulkan/tu_shader.cc:2741 #10 0x7fa85a16a4 in tu_pipeline_builder_compile_shaders ../src/freedreno/vulkan/tu_pipeline.cc:1887 #11 0x7fa85eb844 in tu_pipeline_builder_build<(chip)7> ../src/freedreno/vulkan/tu_pipeline.cc:3923 #12 0x7fa85e6bd8 in tu_graphics_pipeline_create<(chip)7> ../src/freedreno/vulkan/tu_pipeline.cc:4203 #13 0x7fa85c2588 in VkResult tu_CreateGraphicsPipelines<(chip)7>(VkDevice_T, VkPipelineCache_T, unsigned int, VkGraphicsPipelineCreateInfo const, VkAllocationCallbacks const, VkPipeline_T**) ../src/freedreno/vulkan/tu_pipeline.cc:4234 ``` seen in: dEQP-VK.binding_model.mutable_descriptor.single.switches.uniform_texel_buffer_storage_image.update_write.no_source.no_source.pool_expand_types.pre_update.no_array.vert Fixes: `7e986e5f04` ("nir/lower_mediump_vars: Don't lower mediump shared vars with atomic access.") Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> (cherry picked from commit `2a7df331af`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:20 -08:00
Karmjit Mahil	e3f3e315af	tu: Fix potential alloc of 0 size We can end up calling vk_multialloc_alloc with 0 size when `attachment_count` is 0 and `clearValueCount` is 0. Addressed: ``` Direct leak of 1 byte(s) in 1 object(s) allocated from: #0 0x7faf033ee0 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x7fada5cc10 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26 #2 0x7fac50b270 in vk_alloc ../src/vulkan/util/vk_alloc.h:48 #3 0x7fac555040 in vk_multialloc_alloc ../src/vulkan/util/vk_alloc.h:234 #4 0x7fac555040 in void tu_CmdBeginRenderPass2<(chip)7>(VkCommandBuffer_T, VkRenderPassBeginInfo const, VkSubpassBeginInfo const*) ../src/freedreno/vulkan/tu_cmd_buffer.cc:4634 #5 0x7fac900760 in vk_common_CmdBeginRenderPass ../src/vulkan/runtime/vk_render_pass.c:261 ``` seen in: dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_texel_buffer.no_fmt_qual.len_252.samples_1.1d.frag Fixes: `4cfd021e3f` ("turnip: Save the renderpass's clear values in the cmdbuf state.") Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> (cherry picked from commit `c923eff742`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:19 -08:00
Karmjit Mahil	27b2c2b869	tu: Fix push_set host memory leak on command buffer reset Addresses: ``` Direct leak of 192 byte(s) in 1 object(s) allocated from: #0 0x7fbe5e4230 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164 #1 0x7fbd008bf4 in vk_default_realloc ../src/vulkan/util/vk_alloc.c:37 #2 0x7fbbabb2fc in vk_realloc ../src/vulkan/util/vk_alloc.h:70 #3 0x7fbbaead38 in tu_push_descriptor_set_update_layout ../src/freedreno/vulkan/tu_cmd_buffer.cc:3173 #4 0x7fbbaeb0b4 in tu_push_descriptor_set ../src/freedreno/vulkan/tu_cmd_buffer.cc:3203 #5 0x7fbbaeb500 in tu_CmdPushDescriptorSet2KHR(VkCommandBuffer_T, VkPushDescriptorSetInfoKHR const) ../src/freedreno/vulkan/tu_cmd_buffer.cc:3235 #6 0x7fbbe35c80 in vk_common_CmdPushDescriptorSetKHR ../src/vulkan/runtime/vk_command_buffer.c:300 ``` seen in: dEQP-VK.binding_model.shader_access.secondary_cmd_buf.bind.with_push.sampler_mutable.tess_eval.multiple_discontiguous_descriptors.1d_array Fixes: `03294e1dd1` ("turnip: Keep a host copy of push descriptor sets.") Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> (cherry picked from commit `53c2d5e426`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:19 -08:00
Job Noorman	a9f1c10a10	ir3/ra: prevent moving source intervals for shared collects Non-trivial collects (i.e., ones that will introduce moves because the sources don't line-up with the destination) may cause source intervals to get implicitly moved when they are inserted as children of the destination interval. Since we don't support moving intervals in shared RA, this may cause illegal register allocations. Prevent this by creating a new top-level interval for the destination so that the source intervals will be left alone. Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `fa22b0901a` ("ir3/ra: Add specialized shared register RA/spilling") (cherry picked from commit `b36a7ce0f1`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:18 -08:00
Matt Turner	739c3615ce	anv: Align anv_descriptor_pool::host_mem Otherwise anv_descriptor_set is accessed through an unaligned pointer, which is undefined behavior in C. ``` anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5 for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5 ``` Fixes: `2570a58bcd` ("anv: Implement descriptor pools") (cherry picked from commit `a2c4a34303`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:17 -08:00
Alyssa Rosenzweig	4a71355172	asahi: fix a2c with sample shading, harder Fixes: `9bbe93d158` ("hk: fix alpha-to-coverage with sample shading") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (cherry picked from commit `b94bcf0318`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:17 -08:00
Russell Greene	dd14b60b49	perfetto: fix macos compile On macos, <sys/types.h> does not declare clockid_t, but it's instead in <time.h>, which also includes <sys/types.h> on Linux, so just include <time.h> on all UNIX platforms. Fixes: `a871eabc` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12064 Tested-by: Vinson Lee <vlee@freedesktop.org> (cherry picked from commit `ae9d365686`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:16 -08:00
Chia-I Wu	67bd351553	panvk: ensure res table is restored after meta Set res_table to 0 to ensure that the res table is re-emitted. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Fixes: `5067921349` ("panvk: Switch to vk_meta") (cherry picked from commit `015f6a7aff`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:14 -08:00
Iván Briano	ea9b3f928d	intel/rt: fix ray_query stack address calculation While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack address calculation, what the HW actually uses is NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform, while the latter is fixed to 2048 for all current platforms. Fixes: `6c84cbd8c9` ("intel/dev/xe: Set max_eus_per_subslice using topology query") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `aee04bf4fb`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:11 -08:00
Ian Romanick	7994534fe9	brw/cse: Don't eliminate instructions that write flags With other changes in my tree, I observed this code from dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second cmp.z removed. undef(8) %69:UD cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F mov(1) v58+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0 cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d ... undef(8) %72:UD cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F mov(1) v63+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0 This was also fixed by running dead-code elimination before CSE. That seems more like avoiding the problem than fixing it, though. I believe this affects shader-db results because leaving the second CMP in the shader can give more opportunities for cmod propagation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `234c45c929` ("intel/brw: Write a new global CSE pass that works on defs") shader-db: All Intel platforms had similar results. (Lunar Lake shown) total cycles in shared programs: 922097690 -> 922260862 (0.02%) cycles in affected programs: 3178926 -> 3342098 (5.13%) helped: 130 HURT: 88 helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16 helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18% HURT stats (abs) min: 4 max: 11992 x̄: 2292.55 x̃: 47 HURT stats (rel) min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61% 95% mean confidence interval for cycles value: 320.36 1176.63 95% mean confidence interval for cycles %-change: 1.59% 5.73% Cycles are HURT. LOST: 2 GAINED: 1 fossil-db: Lunar Lake, Meteor Lake, Tiger Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00% Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00% Max live registers: 48013385 -> 48013343 (-0.00%) Totals from 507 (0.09% of 551441) affected shaders: Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01% Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86% Max live registers: 94413 -> 94371 (-0.04%) DG2 Totals: Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00% Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00% Fill count: 150673 -> 150631 (-0.03%) Max live registers: 31871520 -> 31871476 (-0.00%) Totals from 506 (0.08% of 633197) affected shaders: Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01% Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74% Fill count: 2779 -> 2737 (-1.51%) Max live registers: 51383 -> 51339 (-0.09%) Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00% Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00% Fill count: 106610 -> 106614 (+0.00%) Max live registers: 32195006 -> 32194966 (-0.00%) Totals from 411 (0.06% of 637268) affected shaders: Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01% Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02% Fill count: 2865 -> 2869 (+0.14%) Max live registers: 42883 -> 42843 (-0.09%) (cherry picked from commit `9aba731d03`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:10 -08:00
Ian Romanick	1e792b0933	brw/copy: Don't copy propagate through smaller entry dest size Copy propagation would incorrectly occur in this code mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 to create mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, u0<0>:UD NoMask group0 This has different behavior. I think I just made a mistake when I changed this condition in `e3f502e007`. It seems like this condition could be relaxed to cover cases like (note the change of destination stride) mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 I'm not sure it's worth it. No shader-db or fossil-db changes on any Intel platform. Even the code for the test case mentioned in the original commit did not change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e3f502e007` ("intel/fs: Allow copy propagation between MOVs of mixed sizes") Closes: #12116 (cherry picked from commit `80a5d158ae`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-12 09:12:07 -08:00
Ian Romanick	8f53de4a5d	brw/emit: Add correct 3-source instruction assertions for each platform Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled on this while trying some stuff with !31852. v2: Don't be lazy. Add proper assertions for all the things on all the platforms. Based on a suggestion by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7bed11fbde` ("intel/brw: Allow immediates in the BFE instruction on Gfx12+") (cherry picked from commit `c1c09e3c4a`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-08 10:03:27 -08:00
Hans-Kristian Arntzen	baba2805ca	vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO. This is required, otherwise we regress latency in cases where applications are using FIFO without explicit KHR_present_wait. This is an unacceptable regression. The fix is to normalize the behavior to X11 WSI. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Fixes: `d052b0201e` ("vulkan/wsi/wayland: Use fifo protocol for FIFO") (cherry picked from commit `5f70858ece`) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>	2024-11-08 10:03:26 -08:00


182906 commits