fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 00:38:48 +02:00

Author	SHA1	Message	Date
Mary Guillemard	5a5febfccd	nvk: Ensure that shader I-cache prefetch is enabled on Ada+ Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40700>	2026-04-08 00:05:40 +00:00
Mary Guillemard	55a279e8b8	nvk: Wire up shader program prefetch method On Ampere B and later, we can specify the prefetch size, in blocks, of a gfx shader we are binding. The NVIDIA proprietary driver always sets it to the maximum possible size (up to 127 blocks). Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40700>	2026-04-08 00:05:40 +00:00
Mary Guillemard	742c91ce68	nvk: Move shader size and offset calculations to nvk_shader_get_shader_size We are going to need the total shader size (without embedded data), let's move this out of the upload codepath. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40700>	2026-04-08 00:05:40 +00:00
Ian Romanick	ffdc310bf1	brw/const: Don't allow type changes when accumulators are involved Integer accumulators and float accumulators do not occupy the same bits, so the types cannot be arbitrarily changed. No shader-db or fossil-db changes on any Intel platform. v2: Use is_accumulator() instead of brw_reg_is_arf(). Add an extra test to show the desired behavior when an accumulator is not involved. Suggested by Caio. Fixes: `64c251bb3a` ("intel/fs: Combine constants for SEL instructions too") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40638>	2026-04-07 23:37:26 +00:00
Mixie	11c3173890	xlib: clear currentDpy when switching current context When making a new context current, the previously current context may be unbound as part of the transition. In this path, current->currentDpy was not cleared for the old context, leaving a stale display association behind after the context was no longer current. Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947 Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Mixie	96cbc791d5	xlib: remove vishandle from XMesaVisual and fix XVisualInfo leak Remove the unused vishandle pointer and rely solely on visualid-based matching. This also eliminates the leak. This mirrors the cleanup previously done in fakeglx.c. (`781232e0ac`) Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Mixie	59ef73d71f	xlib: fix skipping visuals in destroy_visuals_on_display This commit decrements the loop index after deletion to ensure all visuals for the display are properly destroyed. Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947 Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Mixie	61d1e8fc85	xlib: use XMesaDestroyVisual instead of manual free Replace it with XMesaDestroyVisual() to properly handle deallocation. Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947 Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Mixie	8131af3f77	xlib: use XMesaDestroyVisual when destroying display visuals destroy_visuals_on_display() frees XMesaVisual objects directly, but XMesaVisual has a dedicated destructor. Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947 Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Mixie	447a1d2e8d	xlib: clear currentDpy when releasing the current context After `abe6d750e5`, glXDestroyContext() can defer destruction by marking the context with xid == None while it is still current. However, the release-current path did not clear current->currentDpy, so a context that had already been marked for deletion could remain associated with a display after unbinding. Fixes: `abe6d750e5` ("xlib: fix glXDestroyContext in Gallium frontends") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947 Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40422>	2026-04-07 22:39:49 +00:00
Marek Olšák	cec1024b22	ac,radv: remove AC_TRACKED_DB_VRS_OVERRIDE_CNTL as well AC_TRACKED_DB_PA_SC_VRS_OVERRIDE_CNTL can be used instead because the DB and PA registers are mutually exclusive. 2 definitions are moved because consecutive enums aren't allowed to cross a multiple of 32 because of static assertions in the bitset. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40586>	2026-04-07 22:07:48 +00:00
Marek Olšák	623d2a9f3c	radv,radeonsi: don't set PA_SC_HIS_INFO the preamble sets it Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40586>	2026-04-07 22:07:48 +00:00
Marek Olšák	9c26b8b924	ac,radv: use AC_TRACKED_DB_PA_SC_VRS_OVERRIDE_CNTL for PA_SC_VRS_OVERRIDE_CNTL The enum is meant to be used for both. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40586>	2026-04-07 22:07:48 +00:00
Caio Oliveira	3b4a7f2d1a	brw: In "Clear Accumulator" workaround, never set predicate_inverse Since there's no predicate, the inverse bit is not relevant, so always set it to false instead of using whatever was set by the previous instruction. Hardware already ignores this but will make verifying later changes easier. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40800>	2026-04-07 20:33:46 +00:00
Caio Oliveira	e382d82ca9	anv: Fix assert in anv_nir_compute_push_layout When per-primitive padding is needed, max_push_buffers is set to 3 (instead of 4) to reserve the last slot for it. The assert was requiring `n_push_ranges < max_push_buffers`, which incorrectly fired when the 3 ranges were used. Fixes: `a8ba682919` ("anv: assert we haven't gone over the maximum number of push_buffers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15155 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40803>	2026-04-07 19:56:55 +00:00
Alyssa Rosenzweig	959ec01ac8	brw/nir_lower_fs_load_output: optimize pixel coord this saves a conversion or two. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>	2026-04-07 19:32:15 +00:00
Alyssa Rosenzweig	1d0f42c264	brw/eu_emit: relax assertion to allow ARF NULL On new platforms, it's valid to use a NULL destination in conjunction with a cmod, where you care about the implicit flag write but you don't need to clobber any GRF. Something like: if (x * y > z) { compiling (with fast-math) to mad.gt.f0 _, -z, x, y (f0) if This patch allows us to emit that instruction. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>	2026-04-07 19:32:15 +00:00
Alyssa Rosenzweig	2ed6ff728a	brw: explicitly pad tgl_swsb This lets us treat it as a packed data structure without worrying about garbage. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>	2026-04-07 19:32:15 +00:00
Samuel Pitoiset	33676d8296	spirv: mark all resources as non-uniform by default with descriptor heap It's required by descriptor heap. There is already a NIR pass that optimizes non-uniform access, so this should be mostly safe. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	74aa40f6ed	nir: remove resource/sampler heap ptrs sysvals They are no longer used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	d2b9ccf20b	vulkan: adjust lowering of descriptor heaps Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	fb96f85d19	spirv: implement SpvOpUntypedImageTexelPointerEXT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	7088621874	spirv: emit nir_intrinsic_image_heap when resource/sampler ptrs are used This seems better because there are no variables at all. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	f1afce673b	spirv: set the image format for image intrinsics This isn't required for deref instructions because it's possible to get the image format back from the variable but it will be useful for descriptor heap. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	7dbef85365	spirv: change the resource/sampler builtins variable mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Samuel Pitoiset	457eac7d66	nir: add new variable modes for the resource/sampler heap pointers These two new variable modes are used to relax restrictions on deref casts through because it's possible to cast different modes from the heap pointers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Valentine Burley	e5144bcd6a	Revert "ci: Disable Collabora's farm due to network issues" Internet is back. This reverts commit `994ead31bd`. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40827>	2026-04-07 18:25:48 +00:00
Daniel Schürmann	daa3d5292f	nir/opt_if: allow undef instructions on ELSE side for if-simplification Totals from 72 (0.04% of 202440) affected shaders: (Navi48) Instrs: 213900 -> 213873 (-0.01%); split: -0.03%, +0.02% CodeSize: 1215012 -> 1214924 (-0.01%); split: -0.01%, +0.01% Latency: 4458993 -> 4458679 (-0.01%) Copies: 18840 -> 18816 (-0.13%) Branches: 5044 -> 5043 (-0.02%) VALU: 116547 -> 116529 (-0.02%) SALU: 28686 -> 28669 (-0.06%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40738>	2026-04-07 17:28:11 +00:00
Daniel Schürmann	1394f79517	nir/opt_if: allow load_const instructions on ELSE-side for if-simplification Totals from 1974 (0.98% of 202440) affected shaders: (Navi48) Instrs: 6438920 -> 6437055 (-0.03%); split: -0.06%, +0.03% CodeSize: 35080732 -> 35075136 (-0.02%); split: -0.04%, +0.02% SpillSGPRs: 2106 -> 2122 (+0.76%) Latency: 52684517 -> 52677236 (-0.01%); split: -0.02%, +0.01% InvThroughput: 8977644 -> 8976740 (-0.01%); split: -0.01%, +0.00% VClause: 124447 -> 124444 (-0.00%) SClause: 117561 -> 117560 (-0.00%); split: -0.00%, +0.00% Copies: 413450 -> 410708 (-0.66%); split: -0.67%, +0.01% Branches: 136429 -> 136169 (-0.19%); split: -0.20%, +0.01% PreSGPRs: 114813 -> 114918 (+0.09%); split: -0.01%, +0.10% PreVGPRs: 108142 -> 108145 (+0.00%); split: -0.00%, +0.00% VALU: 3275624 -> 3274927 (-0.02%); split: -0.03%, +0.00% SALU: 1166159 -> 1165039 (-0.10%); split: -0.17%, +0.07% VOPD: 333456 -> 333183 (-0.08%); split: +0.02%, -0.10% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40738>	2026-04-07 17:28:11 +00:00
Alyssa Rosenzweig	dbb7050ca7	util/sparse_bitset: add u_sparse_bitset_clear_all It is sometimes useful to remove all elements of a bitset while retaining the backing storage. With a dense bitset, we would just memset everything to 0, which is O(capacity). With a sparse bitset, previously we would have to free and reallocate, which is O(capacity) in the dense case and O(cardinality) in the sparse case. That is the correct asymptotic behaviour O(cardinality) in the worst case, but there is an unfortunate constant-factor associated with the redundant allocation & free in the dense case. Therefore, we add a new helper to clear all elements of the sparse bitset in one go, avoiding reallocation in the dense case. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40806>	2026-04-07 16:59:11 +00:00
Valentine Burley	994ead31bd	ci: Disable Collabora's farm due to network issues Cambridge office has lost internet connection. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40824>	2026-04-07 14:30:35 +00:00
Mary Guillemard	6d700284ac	nvk: Use SET_PRIMITIVE_TOPOLOGY instead of MME scratch Instead of keeping track of the topology with some scratch value in MME, we can rely on SET_PRIMITIVE_TOPOLOGY to directly set it. This simplifies some of the MME codegen but does not seem to have any impact on performance in general. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40749>	2026-04-07 14:11:16 +00:00
Valentine Burley	bbed00ac81	ci/freedreno: Move remaining lazor a618 jobs, retire device type The sc7180-trogdor-lazor-limozeen devices have been dying off over the past few weeks, so move the last two jobs to sc7180-trogdor-kingoftown and retire the device type. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40818>	2026-04-07 11:55:59 +00:00
Natalie Vock	fded5e321d	aco: Nuke ACO-side prolog selection Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	afe519406b	radv: Rewrite the RT prolog in NIR Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	b53dc3f052	aco/lower_to_hw_instr: Run p_init_scratch if the program has a call Callees may use scratch even if the caller doesn't. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	378c9536de	aco/isel: Fix stack_ptr synthesis info.stack_ptr.is_reg is always true. We have a stack pointer to use if and only if the program is a callee. Also, apply_scratch_offset needs to be true in a few more places. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Natalie Vock	31e08322d7	aco/spill_preserved: Only compute preserved registers if in a callee Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>	2026-04-07 11:28:05 +00:00
Xianzhong Li	248b0b47b7	panfrost: Fix GEM handle refcount leak in panfrost_bo_import panfrost_bo_import() calls drmPrimeFDToHandle() then pan_kmod_bo_import(), which also calls drmPrimeFDToHandle() internally. This double import causes GEM handle refcount leaks because each drmPrimeFDToHandle() increments the kernel's GEM handle refcount, but only one drmCloseBufferHandle() is called during cleanup by panfrost_kmod_bo_free (or panthor_kmod_bo_free). Fix by removing the redundant drmPrimeFDToHandle() and using pan_kmod_bo_import() directly. On re-import of existing buffers, properly release the extra pan_kmod_bo reference with pan_kmod_bo_put(). This ensures GEM handle refcount, pan_kmod_bo refcount, and panfrost_bo refcount are all properly balanced. Fixes: `5089a758df` ("panfrost: Back panfrost_bo with pan_kmod_bo object") Signed-off-by: Xianzhong Li <xianzhong.li@nxp.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40778>	2026-04-07 11:06:34 +00:00
Natalie Vock	436acc321a	radv: Disable RADV_DEBUG=llvm in release builds The LLVM backend is unmaintained. Let's not encourage users to swap out entire parts of the driver with an unsupported codepath. Enabling this option is a footgun nowadays anyway, given that it disables many features and thus may trigger bigger changes in behavior than intended. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40815>	2026-04-07 09:55:25 +00:00
Daniel Schürmann	58390ceb98	radv: increase limit for peephole_select in radv_optimize_nir_algebraic_early() Totals from 4868 (2.40% of 202440) affected shaders: (Navi48) MaxWaves: 128008 -> 128004 (-0.00%); split: +0.04%, -0.05% Instrs: 10006725 -> 9978721 (-0.28%); split: -0.31%, +0.03% CodeSize: 54085500 -> 54018184 (-0.12%); split: -0.19%, +0.07% VGPRs: 299524 -> 299584 (+0.02%); split: -0.10%, +0.12% SpillSGPRs: 8707 -> 8669 (-0.44%); split: -0.48%, +0.05% Latency: 79101292 -> 79243875 (+0.18%); split: -0.55%, +0.73% InvThroughput: 13645193 -> 13731338 (+0.63%); split: -0.08%, +0.71% VClause: 181709 -> 181485 (-0.12%); split: -0.23%, +0.10% SClause: 222587 -> 221191 (-0.63%); split: -1.26%, +0.63% Copies: 708979 -> 690992 (-2.54%); split: -2.71%, +0.17% Branches: 232868 -> 223146 (-4.17%) PreSGPRs: 275370 -> 274818 (-0.20%); split: -0.25%, +0.05% PreVGPRs: 238859 -> 238907 (+0.02%); split: -0.01%, +0.03% VALU: 5291185 -> 5291617 (+0.01%); split: -0.08%, +0.09% SALU: 1610496 -> 1604458 (-0.37%); split: -0.68%, +0.30% VMEM: 303401 -> 303037 (-0.12%) SMEM: 358335 -> 357964 (-0.10%) VOPD: 377180 -> 376374 (-0.21%); split: +0.05%, -0.27% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40708>	2026-04-07 08:00:04 +00:00
Samuel Pitoiset	71b6db06e1	ac/nir: add descriptor heap support to opt_flip_if_for_mem_loads() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40702>	2026-04-07 06:15:24 +00:00
Samuel Pitoiset	1184610de4	ac/nir: add descriptor heap support to ac_nir_lower_image_tex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40702>	2026-04-07 06:15:24 +00:00
Samuel Pitoiset	d2132ae011	ac/nir: adjust lowering of query size for descriptor heap Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40702>	2026-04-07 06:15:24 +00:00
Mel Henning	001de6d71b	nak: Fix mufu's f16 bit on sm90+ Fixes multiple cts tests on blackwell, including eg. dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_2.opfdiv_tessc Fixes: `d031365f7c` ("nak: support MUFU.F16") Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40804>	2026-04-07 05:10:16 +00:00
Faith Ekstrand	0d5cae97b7	pan/bi: Vectorize 8-bit ops up to v4i8 Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40720>	2026-04-06 21:39:25 +00:00
Faith Ekstrand	15d5675e8e	pan/bi: Pack 8-bit vec2s We used to splat out 8-bit vec2s to 16-bit by repeating both 8-bit halves twice with the B0011 swizzle. I think the original idea here was that 16-bit swizzles were more widely available in the hardware and that this would make swizzling things easier. The problem is that nothing actually knows that the value is half-repeated like this so nothing knows it can upgrade a swizzle from B0022 to B0123 (H01). So instead we get a bunch of B0022 swizzles, which nothing supports. We can shave a lot of instructions if we just stop trying to be so clever and instead repeat the whole thing with a B0101 swizzle. The only real issue here is that v2[fiu]8_to_v2[fiu]16 needs a B0011 swizzle, which we have to apply on-the-fly. Fortunately, any swizzle can be composed with B0011. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40720>	2026-04-06 21:39:25 +00:00
Faith Ekstrand	db8cb73b34	pan/bi: Add bytewise copy propagation This adds a new bytewise copy propagation pass which chews through MKVEC and SWZ instructions. The word-based copy propagation pass only existed to chew through SPLIT/COLLECT but MKVEC is COLLECT for bytes and we had nothing to help with that. This is actually two passes in one: Byte propagation and swizzle propagation. Any time we see a MKVEC, we look at its sources only as bytes and chase individual bytes back, through other MKVEC and SWZ, to their generating instruction and make the MKVEC only consume the original bytes. If the MKVEC happens to construct something that's just a swizzle of another def (this is fairly common), we record that as well. The idea here is that a lot of MKVEC just consume other MKVEC and we can get rid of the intermediate ones or even the whole chain if it just ends up being a swizzle in the end. For SWZ instructions, we first look at them like a MKVEC of the individual bytes they consume. If that doesn't yield a single swizzled word, we then crawl through the words table, just accumulating swizzles. This gives us the best (closest to the generating instructions) coherent word. We could also replace SWZ with MKVEC and just do byte propagation but MKVEC is often 2 instructions whereas SWZ is often one (or folded into a source) so this is probably the better balance. Finally, we not only replace the MKVEC and SWZ instructions but we also attempt to propagate swizzles into individual ALU op sources. For v4i8 ops, this often fails since the full generality isn't always available but for fp16, we can almost always fold the swizzle into the consuming instruction. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40720>	2026-04-06 21:39:25 +00:00
Faith Ekstrand	a4e9002660	pan/bi: Emit MKVEC directly Now that we have bi_lower_mkvec_swz(), there's no need to be so careful in the NIR -> bi translation. We can just emit MKVEC and move on. The lowering pass will sort out the details. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40720>	2026-04-06 21:39:25 +00:00
Faith Ekstrand	b9e33c7897	pan/bi: Stop lowering swizzles on mkvec and swz The new lowering can handle all the swizzle cases and is generally better at it than swizzle lowering. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40720>	2026-04-06 21:39:25 +00:00

220795 commits