fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-21 03:48:22 +02:00

Author	SHA1	Message	Date
Francisco Jerez	c3cdcd09ed	intel/brw: Add NIR pass to vectorize dot products into DPAS matrix multiplications. Add a new optimization pass that identifies sequences of scalar dot product operations and combines them into DPAS (Dot Product Accumulate Systolic) matrix multiplication instructions for XeHP+ EUs that have a systolic array pipeline (AKA XMX engine). This is possible because a matrix multiplication as performed by DPAS can be expressed like: E^i_k = D^i_k + Sum_j A^i_j B^j_k I.e. each scalar component of a matrix multiplication is just a (possibly large) dot product. This pass identifies such chains of sdot_4x8_iadd dot products in the program and bins them according to the A and B arguments used. Sets of dot products with consecutive components are transformed into a matrix product for each densely occupied interval of indices within each bin, as long as there is an efficient way to transpose one of the arguments in the register file. This enables programs to opportunistically take advantage of the systolic array pipeline for linear arithmetic, which has massively greater throughput than the regular FPUs (roughly a factor of 4x the throughput for the specific instructions replaced currently), without the application having to be updated in order to take advantage of it through a matrix multiplication API like KHR_cooperative_matrix. The immediate motivation for this is getting the open source driver to accelerate the matrix multiplications used for inference by the XeSS ML-driven upscaling library, since the Mesa driver was currently limited to the generic HLSL path that doesn't take advantage of the XMX pipeline. 
Alternative AI-driven upscaling libraries can be supported in theory though this hasn't been pursued yet, and there are some assumptions in the optimization pass that might get in the way currently:
- Currently only the sdot_4x8_iadd intrinsic is supported for no particular reason other than it being the intrinsic generated by the XeSS library in its multivendor path. It would be straightforward to add support for additional types supported by the systolic pipeline.
- Currently one of the arguments of the dot products is restricted to be an SSBO load because that's what we encounter in the XeSS library, but any other kind of memory load intrinsic could be supported easily.
- Also accidental is the current limitation to run on Xe2+ hardware. Getting it to work on XeHP (e.g. DG2) is theoretically possible beyond some minor differences so it will probably be a future area for improvement.
- The limitation of the shader subgroup size to 16 done at the end of the optimization pass is less accidental, because on all Intel Xe platforms released so far the DPAS instruction is limited to run at a fixed execution width (8 on XeHP and 16 on Xe2-3), so the backend would need a way to expose variable-width DPAS intrinsics e.g. by lowering them using SIMD splitting. I have some code to try to achieve that, but the naïf SIMD splitting approach of DPAS instructions appears to hurt more cases than it helps so I don't have a ready solution to lift this restriction yet.
Evaluating the impact of this on the performance of XeSS kernels using our internal microbenchmarks shows a performance improvement for XeSS inference between 26% and 44% depending on the quality preset and resolution, with a geomean improvement of 35% across the rendering modes tested. 
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>	2026-06-15 08:10:51 +00:00
Francisco Jerez	8857da4db5	intel/brw/swsb: Omit redundant read-after-read synchronization for back-to-back DPAS. Multiple DPAS instructions executed on the same functional unit are guaranteed to read their source operands in program order, so no scoreboard synchronization is required between a DPAS read and another DPAS read of the same register. In order to achieve that track the pipeline (DPAS vs. other) of each out-of-order dependency via a new field on the dependency struct along with the token ID of the out-of-order dependency. When a read dependency for a DPAS instruction is encountered whose producer is also a DPAS unit, strip the SRC synchronization flag so that no redundant wait is emitted. The DST synchronization flag is preserved since write-after-read hazards still require ordering. This reduces the number of scoreboard stalls emitted within chains of DPAS instructions that have overlapping sources (common in matrix multiplication kernels), improving occupancy of the systolic pipeline. It avoids performance regressions in XeSS kernels in combination with the following vectorization optimization, and could also be helpful in theory with other workloads that utilize the systolic pipeline via KHR_cooperative_matrix. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>	2026-06-15 08:10:51 +00:00
Francisco Jerez	b394292085	intel/brw: Sort scheduling modes by performance after initial RA failure. Previously when the first register allocation attempt failed, brw_allocate_registers() would iterate over scheduling modes in the fixed order specified by pre_modes[], assuming that the first successful mode would be the most performant. However that wasn't ever a very reliable guarantee, and it becomes less so on Xe3+ where a lower-register-pressure schedule can have higher thread parallelism. But actually that's a bit of a silly situation since the pre_modes[] loop that runs before the first brw_assign_regs() attempt already iterates over multiple scheduling heuristics in order to choose which one to try first, so it has a static analysis model of the relative performance of the different heuristics which we can use in order to properly sort the pre_modes[] list and make a more informed decision about the iteration order at little extra cost. This seems to be helpful even before Xe3 in cases where BRW_SCHEDULE_PRE_(NON_)LIFO outperforms BRW_SCHEDULE_PRE(_LATENCY), in particular when the critical path heuristic used by BRW_SCHEDULE_PRE_(NON_)LIFO does a better job at minimizing the latency of the program than the mostly backward-looking heuristic of BRW_SCHEDULE_PRE(_LATENCY). That is apparently the case in several shaders from the XeSS library, where the BRW_SCHEDULE_PRE heuristic hoists most of the memory loads of the shader aggressively to the top creating a bottleneck instead of interleaving the messages more effectively with the arithmetic along the critical path of the program. This patch avoids performance regressions with the subsequent DPAS vectorization patch as a result of this inversion of performance between the PRE and PRE_NON_LIFO scheduling heuristics. 
Note that this doesn't necessarily run the scheduler more times, it just changes the order in which the different scheduling modes are attempted. No significant difference in the compile-time of shader-db or fossil-db has been observed. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>	2026-06-15 08:10:51 +00:00
Francisco Jerez	f638b30c03	nir/divergence: Allow local_invocation_id.z to be treated as uniform. Add a new nir_divergence_uniform_local_invocation_id_z divergence option that allows the Z component of the local invocation ID to be treated as uniform across the subgroup, for cases where the driver knows that as a result of the hardware's subgroup walk order the Z component is guaranteed to remain constant across a subgroup. On Intel hardware for the walk order currently in use all invocations within a single subgroup are guaranteed to share the same local_invocation_id.z value when the product of the X and Y workgroup dimensions is a multiple of the SIMD width (32 at most). This allows the subsequent vectorization optimization to have an effect for many dot products in XeSS kernels whose two arguments currently appear divergent, however one of them only appears divergent due to the dependency on local_invocation_id.z, which is actually subgroup-uniform for these kernels. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>	2026-06-15 08:10:51 +00:00
Sergi Blanch Torne	7c018be258	ci: disable Collabora's farm due to maintenance Planned downtime in the farm: * Start: 2026-06-15 07:00 UTC * End: 2026-06-15 13:00 UTC Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41745>	2026-06-15 06:47:55 +00:00
Marek Olšák	ce4654ead6	radv: rename vrs_coarse_shading -> vrs_flat_shading All coarse shading is VRS, but this code is about flat shading. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42234>	2026-06-13 19:29:59 +00:00
Marek Olšák	339945833c	radv,radeonsi: disallow VRS flat shading if SubgroupInvocationID is used The sysval is affected by VRS. More subgroup sysvals might have to be added here. Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42234>	2026-06-13 19:29:59 +00:00
Mary Guillemard	8f272b1fe1	nouveau/mme: Add a test for MME Shadow RAM behavior Add a test to prove MME Shadow RAM behavior. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41506>	2026-06-13 14:16:39 +02:00
Mary Guillemard	8a1092712c	nouveau/mme: Add some simple MME shadow RAM dumper Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41506>	2026-06-13 14:16:39 +02:00
Mike Blumenkrantz	8faf71d84f	aux/tc: enforce strict resolve semantics chromium/skia (stupidly) hits this path when drawing transparent svgs, and it's definitely a bug in the browser engine, but no human can possibly comprehend how any of that works Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>	2026-06-12 21:55:41 +00:00
Mike Blumenkrantz	2c5a2d8b39	util/tc: iterate the rp info more accurately during batch execution these cases all trigger rp ends, but the info wasn't being iterated to reflect the driver's expectation, leading to desync Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>	2026-06-12 21:55:41 +00:00
Mike Blumenkrantz	a4c07ed881	zink: tag tc info update in a few more places these are places which might trigger rp ends Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42222>	2026-06-12 21:55:41 +00:00
Lionel Landwerlin	4e2abd872a	anv: align storage texel buffer support on image support We've enabled image support based on typed write without format support. Let's do the same for texel buffers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/12384 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42225>	2026-06-12 21:28:55 +00:00
Faith Ekstrand	d7f9fede84	kraid: Re-materialize constants This isn't a great long-term solution but it cuts down on register pressure for now and lets more shaders compile. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Faith Ekstrand	d7a3276386	kraid: Allocate whole registers for staging destinations The LOAD and LD_PKA instructions have i8, i16, and i24 forms that can, in theory, operate on partial registers. However, there are issues with races between ALU and message instructions on partial registers. We could probably come up with a complex model for this but for now it's easiest to just force whole registers for message destinations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Faith Ekstrand	5ebd05b8ea	kraid: Add a Model::op_dst_is_staging_reg() helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Faith Ekstrand	fb7817bc71	kraid: Add a Model::op_src_is_staging_reg() helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Faith Ekstrand	cda7d27ad7	kraid: Fix RA for dead destinations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Faith Ekstrand	33e2ed7168	kraid: Only dump shaders if KRAID_DEBUG=print is set Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42226>	2026-06-12 17:10:28 -04:00
Collabora's Gfx CI Team	8aa1ad6cd4	Uprev ANGLE to 8e09325ebad45c7e11630a79754361e965e5fab0 `196d1b79ea...8e09325eba` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41977>	2026-06-12 18:04:00 +00:00
Pohsiang (John) Hsu	6617b5b3fb	mediafoundation: detach xThreadProc frame processing from apiLock to unblock concurrent ProcessOutput calls Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>	2026-06-12 17:49:48 +00:00
Pohsiang (John) Hsu	6d4f890182	mediafoundation: fix a few minor variant bool handling Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>	2026-06-12 17:49:48 +00:00
Pohsiang (John) Hsu	6c49c1083c	mediafoundation: extract code to ProcessDX12EncodeContext Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42217>	2026-06-12 17:49:48 +00:00
Valentine Burley	51325d9ac3	venus/ci: Revert ADL jobs to stable 6.17 kernel Xe is unstable on 6.18+, so we need to revert to the previous stable kernel if we want to have pre-merge jobs on ADL. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>	2026-06-12 17:23:02 +00:00
Valentine Burley	24d707d7e2	venus/ci: Retire Intel Comet Lake runner The Flip-hatch devices are getting retired in the Collabora lab. We can also drop a few skips that were only needed for CML. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>	2026-06-12 17:23:02 +00:00
Valentine Burley	057ff58fe6	venus/ci: Move pre-merge ANV coverage from Comet Lake to Alder Lake Swap the pre-merge CML Venus-on-ANV job with the nightly ADL one, making the ADL job pre-merge. Also move the nightly Android CTS job to ADL. The ADL runners are available after disabling anv-adl-vk, and this keeps Venus-on-ANV coverage in Cuttlefish Android VMs. Using 4 parallel runners allows the pre-merge VK CTS test suite to run with a lower fraction. Update the xfails to match, since the new fraction covers a different subset of tests. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>	2026-06-12 17:23:01 +00:00
Valentine Burley	6f867bd317	anv/ci: Disable anv-adl-vk job We already have VK CTS coverage on TGL and RPL, with the latter running the full test suite pre-merge. Having three gfx12 VK CTS jobs is redundant, so disable anv-adl-vk. The ADL runners will be reused for a different job. Cc: mesa-stable Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42041>	2026-06-12 17:23:01 +00:00
Danylo Piliaiev	860fe2d793	tu: Don't process A2R10G10B10 clear values via new pack function tu_pack_float32_for_color doesn't correctly handle that format, and the value is actually quantized correctly without it. Fixes: `38a10950e3` ("tu: Match SW color clear value packing with HW") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42215>	2026-06-12 16:58:38 +00:00
Georg Lehmann	18d7ab925a	radv: inline 8 and 16bit push constant loads We already lower all 8/16bit push constant loads to 32bit later using ac_nir_lower_mem_access_bit_sizes, so we just need to adjust how we gather the initial inlinable push constant mask. No changes to the actual push constant lowering are needed. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14415 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42182>	2026-06-12 16:12:00 +00:00
Georg Lehmann	aacdaffc5e	radv: fix setting inline push constants when only the last one is used The 64bit mask was truncated, and then when the low half is 0, the base was -1. By accident, u_bit_consecutive64(-1, 65) is the original mask, so we uploaded a single garbage value. Fixes: `7f6262bb85` ("radv: allow holes in inline push constants") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42182>	2026-06-12 16:12:00 +00:00
Rob Clark	92e4fb7aac	freedreno/crashdec: Add additional HFI queue See kernel commit 61957ab99d8c ("drm/msm/a6xx: Add support for Debug HFI Q"). Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42193>	2026-06-12 15:55:32 +00:00
Rob Clark	31e53276fb	freedreno/crashdec: Update gpu revision parsing Newer kernels just print hex chip-id rather than unsigned "ipv4" style. Update parsing to handle this. See kernel commit cc53487e01fc ("drm/msm/adreno: Change chip_id format"). Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42193>	2026-06-12 15:55:32 +00:00
Lucas Fryzek	db4159c431	Modify x11_xcb_display_supports_xshm to get xshm opcode Previously this helper function would not capture the xshm opcode from the server's shm reply and drisw_glx requires the value to work properly. Fixes: `5f4eccf1` ("glx: Check that xshm can be attached") Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40926>	2026-06-12 15:16:05 +00:00
Matt Turner	44c03f8ba6	util, llvmpipe: flush subnormals to zero on ARM/AArch64 NEON always flushes subnormals to zero; previously lp_test_arit special-cased vector paths to suppress the resulting failures. The proper fix mirrors x86: set FPSCR/FPCR FZ so VFP also flushes, keeping scalar and vector paths consistent with the C reference. util_fpstate_{get,set,set_denorms_to_zero} now read/write FPSCR (ARMv7) or FPCR (AArch64) via inline asm. flush_denorm_to_zero in lp_test_arit flushes subnormal inputs on ARM/AArch64 to match. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42178>	2026-06-12 14:34:14 +00:00
Jakob Sinclair	7606854f43	pan/compiler: fix spilling for 64-bit values For the BIR-compiler, 64-bit values were not properly tracked in the spill logic and PHIs were always assumed to be 32 bits. This could create issues where only one half of the value was reloaded or spills would overlap each other, leading to garbage values. This patch fixes these issues by keeping track of how many words each value needs. Also, it adds a constraint for SHADD sources where it splits and collects them right before the SHADD instruction itself to make it easier for RA to handle the register pairs. Fixes: `4542982062` ("pan/compiler: Use SHADDX instruction for i64 add") Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42080>	2026-06-12 14:12:27 +00:00
Marc Alcala Prieto	990d4a19f8	pan/compiler: Rename multiview to per_view_outputs On v14+, multiview is not lowered to per-view output stores. Rename "multiview" to "per_view_outputs" to make it clear that this logic only applies when the shader uses nir_intrinsic_store_per_view_output. Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>	2026-06-12 13:47:31 +00:00
Marc Alcala Prieto	a4b0c79ee4	pan: Add helper for max multiview view count and raise it to 16 on v14+ Replace PAN_MAX_MULTIVIEW_VIEW_COUNT with a helper taking the GPU architecture, so both the compiler and PanVK can query the right limit. Also raise the maximum multiview view count to 16 on v14+, up from 8 on older generations. Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>	2026-06-12 13:47:31 +00:00
Marc Alcala Prieto	e4cb605fc5	panvk: Fix multiview support on v14+ On v14+, the view mask moved from PRIMITIVE_FLAGS to PRIMITIVE_FLAGS_2. The multiview vertex shader unrolling no longer needs to be handled in software. The GPU now runs one shader invocation per view, where each writes a single view and the view index is passed through a preload. Fixes: `4258888f4d` ("pan/genxml: Add v14 definition") Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>	2026-06-12 13:47:31 +00:00
Marc Alcala Prieto	59ed63c259	pan/bi: Load vertex view index from preload on v14+ On v14+, the GPU runs one vertex shader invocation per view, where each writes a single view and the view index is passed through BI_PRELOAD_VIEW_ID. Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>	2026-06-12 13:47:30 +00:00
Juan A. Suarez Romero	25f698d6de	Revert "people: update Marek's email" This reverts commit `b5063953ca`. Marek is signing with his GMail account. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42213>	2026-06-12 13:43:04 +00:00
Caius-Moldovan-img	f6fbfe3967	pco: Fix metadata invalidation Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Signed-off-by: Caius Moldovan <caius.moldovan@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42113>	2026-06-12 13:28:07 +00:00
Christian Gmeiner	011c7a1c2c	compiler/rust: move ACORN PRNG to shared location Move the ACORN random number generator from src/nouveau/compiler/acorn/ to src/compiler/rust/acorn/ so it can be shared between different driver hardware test infrastructures. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42165>	2026-06-12 13:02:04 +00:00
Jakob Sinclair	d7f8097adf	pan: Add G52 skip for xlib wsi failure This started failing and it looks like a CI issue: ResourceError (Failed to open display: '' at tcuLnxX11.cpp:83) Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42040>	2026-06-12 12:40:04 +00:00
Jakob Sinclair	36c57e62da	pan/va: Decode support for ARSHIFT_OR on Valhall The ISA.xml for Valhall did not exactly match ARSHIFT as it was based on RSHIFT. However, we could still generate ARSHIFT_OR, so in certain trace dumps the output would be empty. Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42040>	2026-06-12 12:40:04 +00:00
Juan A. Suarez Romero	e8b5f93c31	v3dv: increase max push constants size There is no hardware restriction that limits the current size, it was selected manually. Increase it to 256 as this aligns more with other hardware, and this is the minimum requirement for Vulkan 1.4. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42212>	2026-06-12 12:20:37 +00:00
Icenowy Zheng	425a4177b6	pvr: apply the culling everything viewport shift for only triangles Currently we shift the viewport as an implementation of FRONT_AND_BACK culling mode. However, as culling should only take effect on triangles, this shift should only be applied when the active rasterizing primitive is triangles. Check the primitive topology before applying the viewport shift. This fixes the new Vulkan CTS test `dEQP-VK.glsl.builtin_var.frontfacing.add_ubo_load.{point,line}_list.front_and_back` introduced in CTS 1.4.6.0. Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42164>	2026-06-12 12:02:46 +00:00
Rhys Perry	efefed1e35	nir/opt_undef: fix prefer_nan fossil-db (gfx1201): Totals from 279 (0.13% of 210263) affected shaders: Instrs: 661579 -> 653245 (-1.26%); split: -1.29%, +0.03% CodeSize: 3612816 -> 3572600 (-1.11%); split: -1.14%, +0.03% SpillSGPRs: 313 -> 305 (-2.56%) Latency: 5147724 -> 5139048 (-0.17%); split: -0.18%, +0.01% InvThroughput: 939696 -> 937981 (-0.18%); split: -0.19%, +0.00% VClause: 14732 -> 14696 (-0.24%); split: -0.29%, +0.05% SClause: 12517 -> 12495 (-0.18%); split: -0.19%, +0.02% Copies: 60783 -> 60472 (-0.51%); split: -0.61%, +0.09% Branches: 20669 -> 20488 (-0.88%); split: -1.16%, +0.28% PreSGPRs: 14960 -> 14968 (+0.05%); split: -0.03%, +0.08% PreVGPRs: 15948 -> 15960 (+0.08%) VALU: 306409 -> 304055 (-0.77%); split: -0.79%, +0.02% SALU: 134363 -> 131367 (-2.23%); split: -2.27%, +0.04% VMEM: 21760 -> 21715 (-0.21%); split: -0.24%, +0.04% SMEM: 21358 -> 21323 (-0.16%) VOPD: 32352 -> 32184 (-0.52%); split: +0.02%, -0.53% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `68dc336af7` ("nir: handle new multadd opcodes in lowerings and opts") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42145>	2026-06-12 11:29:14 +00:00
Juan A. Suarez Romero	45b3d733e9	mesa: allow GL_TEXTURE_COMPARE_{MODE,FUN} with EXT_shadow_samplers These texture parameters are not only defined by ARB_shadow extension but also by EXT_shadow_samplers. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15626 Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42166>	2026-06-12 10:56:05 +00:00
Mary Guillemard	5435618d2e	nvk: Implement support for non graphics timestamp Now that we have a unified layout for timestamps, we can implement timestamp writes on DMA and Compute sub channels. This also exposes timestamps on non-graphics queues. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42208>	2026-06-12 10:00:44 +02:00
Faith Ekstrand	a07ded8e8c	kraid: Better document swizzles Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42204>	2026-06-12 00:13:08 -04:00


224275 commits