fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-01 05:58:05 +02:00

Author	SHA1	Message	Date
Faith Ekstrand	51a68ecc87	panvk: Optimize in the preprocess hook NIR is actually pretty good at optimizing UBO, SSBO, and shared memory access but in order to do so, we actually have to run the optimizations before we lower it all. Same for I/O. By doing all our lowering in panvk before we ever run the optimization loop, we risk hampering it significantly. Ignoring loop changes (several get unrolled now), fossil-db on Sascha Willems demos and a few others looks like Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01% CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01% Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36% Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01% Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%) Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58% Number of hardware loops: 283 -> 278 (-1.77%) Totals from 186 (19.81% of 939) affected shaders: Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01% CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02% Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59% Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02% Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%) Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47% Number of hardware loops: 63 -> 58 (-7.94%) Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Faith Ekstrand	1a9c7f8c8a	panvk: Only lower outputs to temporaries We need to lower outputs to get rid of output reads and so that we can fix up layer writes on Bifrost. However, there's really no point in lowering reads besides moving them to the top. Even then, NIR can probably copy propagate the copies and we'll end up reading straight from the input variable anyway. Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Faith Ekstrand	a8b6213983	panvk: Lower copy_deref and indirect derefs before nir_lower_io Neither nir_lower_io() nor nir_lower_indirect_derefs() know what to do with copy_deref so we need to get rid of those first. Also, there are some NIR passes which can insert more copy_deref or propagate an indirect load to the I/O variable so we want to lower those away right before lowering I/O. Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Faith Ekstrand	d6dc0ea5ae	panvk: Split var copies and lower local vars early These two passes are a prerequisite for basically anything that optimizes on variables. Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Faith Ekstrand	586e1ac2b8	pan/compiler: Expose the bifrost optimization loop Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Faith Ekstrand	0e9fcb33c3	nir: Add a couple panfrost sysvals to divergence analysis Fixes: `2af6e4beeb` ("pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex}") Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Daniel Schürmann	5682e39e6b	amd: enable load/store_shared2_amd for GFX6 Totals from 1509 (2.43% of 62200) affected shaders: (Pitcairn) MaxWaves: 8078 -> 8057 (-0.26%); split: +0.09%, -0.35% Instrs: 977182 -> 951746 (-2.60%); split: -2.62%, +0.02% CodeSize: 4951468 -> 4758192 (-3.90%); split: -3.92%, +0.01% SGPRs: 76704 -> 76696 (-0.01%) VGPRs: 81092 -> 81068 (-0.03%); split: -0.34%, +0.31% Latency: 11663237 -> 11526070 (-1.18%); split: -1.19%, +0.01% InvThroughput: 6198904 -> 6114851 (-1.36%); split: -1.43%, +0.07% VClause: 26656 -> 26655 (-0.00%); split: -0.05%, +0.05% SClause: 22304 -> 22307 (+0.01%); split: -0.03%, +0.04% Copies: 107503 -> 109564 (+1.92%); split: -0.23%, +2.15% Branches: 22917 -> 22918 (+0.00%) PreSGPRs: 42246 -> 42242 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 64561 -> 64761 (+0.31%); split: -0.01%, +0.32% VALU: 600285 -> 601139 (+0.14%); split: -0.26%, +0.40% SALU: 130622 -> 130851 (+0.18%); split: -0.16%, +0.33% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>	2025-11-11 17:12:17 +00:00
Daniel Schürmann	9abbcbc00e	nir/opt_load_store_vectorize: don't add negative offsets to load/store_shared2_amd By hoisting the low address instead, we can make use of these instructions on GFX6. Totals from 3 (0.00% of 79839) affected shaders: (Navi48) Instrs: 3768 -> 3776 (+0.21%); split: -0.03%, +0.24% CodeSize: 20024 -> 20048 (+0.12%); split: -0.04%, +0.16% Latency: 16093 -> 16198 (+0.65%) InvThroughput: 3868 -> 3864 (-0.10%) VClause: 97 -> 93 (-4.12%) VALU: 2333 -> 2331 (-0.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>	2025-11-11 17:12:15 +00:00
Christian Gmeiner	688718be8b	mesa: OES_texture_stencil8 requires OpenGL ES 3.1 Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38360>	2025-11-11 15:59:06 +00:00
Tapani Pälli	12b2476b40	anv: throw anv_finishme warnings only on debug builds Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14259 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38369>	2025-11-11 12:51:32 +00:00
Samuel Pitoiset	0d9d45db4e	radv: add vk_wsi_disable_unordered_submits and enable for GTK GTK is missing a semaphore between QueueSubmit() and QueuePresent() causing the WSI submit to be "unordered" and to immediately signal the semaphores (because it's missing a wait semaphore in QueuePresent()). The workaround is to disable unordered WSI submits until GTK fixes it properly. Cc: "25.3" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14087 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38351>	2025-11-11 12:13:41 +00:00
Daniel Schürmann	668259ef0b	aco/scheduler: move clauses through RAR dependencies For simplicity, we limit this feature to only one RAR-dependency per clause. This allows to quickly correct the register demand changes that occur by switching the kill flags. Totals from 5861 (7.34% of 79839) affected shaders: (Navi48) Instrs: 4891340 -> 4883789 (-0.15%); split: -0.21%, +0.06% CodeSize: 25556612 -> 25527244 (-0.11%); split: -0.16%, +0.05% VGPRs: 347044 -> 347140 (+0.03%); split: -0.13%, +0.16% Latency: 32697095 -> 32642428 (-0.17%); split: -0.25%, +0.08% InvThroughput: 4975909 -> 4975086 (-0.02%); split: -0.06%, +0.05% VClause: 102152 -> 93852 (-8.13%); split: -8.22%, +0.10% SClause: 101232 -> 101205 (-0.03%); split: -0.03%, +0.00% Copies: 305189 -> 305651 (+0.15%); split: -0.56%, +0.71% Branches: 87032 -> 87045 (+0.01%); split: -0.00%, +0.02% VALU: 2776634 -> 2777097 (+0.02%); split: -0.06%, +0.08% SALU: 662066 -> 660379 (-0.25%); split: -0.26%, +0.01% VOPD: 4801 -> 4800 (-0.02%); split: +1.21%, -1.23% Totals from 5680 (7.12% of 79825) affected shaders: (Vangogh) MaxWaves: 111282 -> 111290 (+0.01%) Instrs: 4955907 -> 4950709 (-0.10%); split: -0.15%, +0.04% CodeSize: 26026264 -> 26014272 (-0.05%); split: -0.10%, +0.05% VGPRs: 320784 -> 320776 (-0.00%); split: -0.03%, +0.03% Latency: 35645457 -> 35584438 (-0.17%); split: -0.32%, +0.15% InvThroughput: 8233912 -> 8236524 (+0.03%); split: -0.10%, +0.13% VClause: 107017 -> 96804 (-9.54%); split: -9.69%, +0.15% SClause: 98633 -> 98592 (-0.04%); split: -0.05%, +0.01% Copies: 394041 -> 393584 (-0.12%); split: -0.52%, +0.40% Branches: 120235 -> 120231 (-0.00%); split: -0.02%, +0.01% VALU: 3183571 -> 3183114 (-0.01%); split: -0.06%, +0.05% SALU: 735546 -> 734143 (-0.19%); split: -0.20%, +0.01% Totals from 2507 (3.96% of 63370) affected shaders: (Vega10) MaxWaves: 13643 -> 13637 (-0.04%) 
Instrs: 1496453 -> 1496135 (-0.02%); split: -0.11%, +0.09% CodeSize: 7777880 -> 7776608 (-0.02%); split: -0.09%, +0.07% VGPRs: 134164 -> 134104 (-0.04%); split: -0.11%, +0.07% Latency: 17465181 -> 17483075 (+0.10%); split: -0.36%, +0.47% InvThroughput: 8830470 -> 8851751 (+0.24%); split: -0.09%, +0.33% VClause: 42012 -> 38825 (-7.59%); split: -8.00%, +0.42% SClause: 34586 -> 34549 (-0.11%); split: -0.12%, +0.01% Copies: 137896 -> 137668 (-0.17%); split: -0.86%, +0.69% VALU: 1092468 -> 1092240 (-0.02%); split: -0.11%, +0.09% SALU: 132956 -> 132569 (-0.29%); split: -0.34%, +0.05% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:52 +00:00
Daniel Schürmann	65ba8a0e8b	aco/scheduler: refactor downwards dependency check We can also ignore killed operands when checking for RAR dependencies as these cannot appear later anymore. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:52 +00:00
Daniel Schürmann	ce3cc03153	aco/scheduler: use hashmap for RAR_dependencies Store information about the (relative) position of the RAR dependency. This will allow to correct for register-demand changes when scheduling across. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:52 +00:00
Daniel Schürmann	6c0dd8164f	aco/scheduler: remove MoveState::RAR_dependencies_clause Since moving clauses as batch, this can easily be derived from RAR_dependencies. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:52 +00:00
Daniel Schürmann	5ef47ba231	aco/scheduler: assert that the register demand stays within pre-determined bounds Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:52 +00:00
Daniel Schürmann	82ba730994	aco/scheduler: remove unused include Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>	2025-11-11 11:31:51 +00:00
Kenneth Graunke	9ffae42975	brw: Store brw_urb_inst::offset in bytes on Xe2 Xe2 uses byte offsets rather than OWord offsets. We've been storing the per-slot offsets in bytes on Xe2 for a while, but kept the global offset immediate in OWords for some reason, choosing to lower it during logical send lowering. This patch makes both offsets (global immediate, per-slot) in the same units, so they could be added together if necessary without scaling. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>	2025-11-11 10:55:44 +00:00
Kenneth Graunke	cde3a34a43	brw: Use nir_intrinsic_[set_]base rather than poking at const_index[0] Much clearer, especially since we're dealing with at least four different kinds of intrinsics. These helpers were introduced years ago, but probably didn't exist when we first wrote this code. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>	2025-11-11 10:55:43 +00:00
Kenneth Graunke	439c156831	brw: Add an assertion that writemasks can be fully ignored I noticed that our backend was completely ignoring writemasks, despite them appearing on many of the intrinsics we're implementing. Rhys Perry pointed out that nir_lower_mem_access_bitsizes is removing all non-trivial writemasking today, so ssbo/global/shared/scratch/etc. stores should only ever see all components enabled. Which means what we're doing is legitimate, if non-obvious. Add an assert to make it obvious. Thanks a lot to Rhys for helping me rediscover what made this work. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>	2025-11-11 10:55:42 +00:00
Kenneth Graunke	6151eb4372	nir: Drop writemask from all Intel memory store intrinsics The backend has been fully ignoring all writemasks for a long time, so it really doesn't make sense to have them on our custom intrinsics. I'm not sure they even make sense for some of the block intrinsics. Also, the store_ssbo -> store_ssbo_intel pass was not setting writemask at all, leaving it at the default value of 0 (aka write nothing, if it had been respected...) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>	2025-11-11 10:55:41 +00:00
Roland Scheidegger	d6fd8b4201	llvmpipe: do bounds checking for shared memory Just compare against the size that was declared. This is probably overkill. I couldn't figure out what vulkan says wrt OOB access of shared memory. D3D however (which is very strict about these things) says that for TGSM writes the entire contents of the TGSM becomes undefined, for reads the result is undefined. Hence, rather than masking out such accesses, to avoid the segfaults it would be enough to just clamp the offsets to valid values. nir doesn't seem easily able to tell us if an access is guaranteed in-bound (unlike for ssbo access), so assume always potentially OOB. v2: fix rusticl - for cl we don't know the shared size at compilation time, this is only provided at launch_grid() time, the nir shader info shared_size might be zero. Hence pass through the size via cs jit context, there already actually was a member in there which looks like it was intended for that (interestingly enough, the cs jit context was actually unused, since resources are passed elsewhere nowadays). Reviewed-by: Brian Paul <brian.paul@broadcom.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38307>	2025-11-11 09:28:30 +00:00
Erik Faye-Lund	4490275332	pvr: rework pds_state array length logic This attempts to avoid needing hwdefs in headers. It's not perfect, but hopefully a step in the right direction. Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:14 +01:00
Erik Faye-Lund	1eab712245	pvr: move static_asserts to source-files This avoids needless dependencies on HW-defs in header files. Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:14 +01:00
Erik Faye-Lund	b2b8ec1a4c	pvr: move non-rogue helpers to pvr_hw_utils.h Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:14 +01:00
Erik Faye-Lund	02b5e78f0d	pvr: rename rogue_get_slc_cache_line_size This isn't really rogue-specific, so let's rename it to not cause any confusion. Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:14 +01:00
Erik Faye-Lund	e7fb4a9948	pvr: factor out pvr_sampler Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:14 +01:00
Erik Faye-Lund	cf08978985	pvr: break out pvr_instance and pvr_physical_device These files shouldn't be per-arch, so break them out to their own modules before we start making things multi-arch. Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:11 +01:00
Erik Faye-Lund	4d0ab70caa	pvr: move queue function to pvr_queue.c Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:11 +01:00
Erik Faye-Lund	5e400e7449	pvr: remove needless include Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:11 +01:00
Erik Faye-Lund	428fadd71f	pvr: remove unused macros Reviewed-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>	2025-11-11 10:13:11 +01:00
Tapani Pälli	2741ddd75a	anv: fix issues found with indirect data stride Use tristate for the aligned setting, otherwise it is always first disabled which contributes to the condition if we set the new stride active. v2: set ByteStride in dword units and take secondary cmdbuf in to account (Lionel) Cc: mesa-stable Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38349>	2025-11-11 05:05:43 +00:00
Alyssa Rosenzweig	997b3ebbdb	poly: fix cull distance More fallout from strict NIR validation but easy to fix. I hit this when attempting to CTS changes for parent_instr. Closes: #14245 Fixes: `2f6b4803ab` ("nir/validate: expand IO intrinsic validation with nir_io_semantics") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38356>	2025-11-11 01:34:24 +00:00
Christian Gmeiner	9c31b9b342	etnaviv: blt: Add Z16_UNORM format translation Passes dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16 Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38362>	2025-11-11 00:30:21 +01:00
Christian Gmeiner	0ca826692a	etnaviv: blt: Add S8_UINT_Z24_UNORM format translation Passes dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38361>	2025-11-10 23:59:18 +01:00
Timothy Arceri	595a2fdbd2	glsl: assign block indices in the order they appear The hash lookup should be negligible. This makes things predictable rather than having hash table modifications causing the order to change, and fixes things for some seemingly buggy games. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13802 Fixes: `be5a15f11d` ("util/hash_table: start with 16 entries to reduce reallocations") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38300>	2025-11-10 21:52:25 +00:00
Iván Briano	aa97c23484	brw: shut -Wmaybe-uninitialized up Release builds are noisy about flush_type and scope being used uninitialized, even though they are always set. Initialize them to the final else values to make GCC happy. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38357>	2025-11-10 21:06:50 +00:00
Aitor Camacho	f458825d95	kk: Force vertex attribute rebinding when pipeline changes Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38325>	2025-11-10 20:50:54 +00:00
Sagar Ghuge	16f66ffe55	intel/common: Consider 0 threads while setting TG In ray tracing dispatch, we have dispatch.threads set to 0 since we calculate the local_size_x/y/z based on the launch sizes. This change takes 0 threads into account and returns the TG size 8 in such scenarios. Before this change, we were setting TG size to 2. Fixes: `0c4e1c9efc` ("intel/common: Add helper for compute thread group dispatch size") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38229>	2025-11-10 12:09:30 -08:00
Samuel Pitoiset	6929333b0f	ac/surface: ban 256KB swizzle modes for non-MSAA images on GFX11+ This seems to hurt more than it helps and AMD drivers also disable 256 KB for non-MSAA. While we are at it, remove a useless check about GFX12 APUs because they don't exist. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14237 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38219>	2025-11-10 19:29:22 +00:00
Georg Lehmann	9ef0c96f26	nir/opt_algebraic: optimize open coded pack_32_2x16 Foz-DB Navi48: Totals from 4 (0.00% of 80287) affected shaders: Instrs: 6231 -> 6101 (-2.09%) CodeSize: 35916 -> 35156 (-2.12%) Latency: 72190 -> 71317 (-1.21%) InvThroughput: 20817 -> 19962 (-4.11%) VALU: 3145 -> 3029 (-3.69%) VOPD: 310 -> 312 (+0.65%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37937>	2025-11-10 19:04:32 +00:00
Ian Romanick	d9bed33c11	nir/opt_if: Both parts of logic-joined conditions can be evaluated For cases like 'if (X && Y)', both X and Y must be true in the then branch. Their values are unknown in the else branch. Similarly, 'if (X || Y)' must have both X and Y false in the else branch. The shader-db results are pretty bad, especially on Skylake. Ouch. The fossil-db results are good enough that they make up for it. v2: s/alu/alu_src/ in nir_src_parent_instr(use_src) != &alu_src->instr. Noticed by Rhys. shader-db: Lunar Lake total instructions in shared programs: 17203905 -> 17196251 (-0.04%) instructions in affected programs: 668828 -> 661174 (-1.14%) helped: 352 / HURT: 2 total cycles in shared programs: 879896264 -> 888462774 (0.97%) cycles in affected programs: 330523984 -> 339090494 (2.59%) helped: 187 / HURT: 167 total spills in shared programs: 3318 -> 3329 (0.33%) spills in affected programs: 4 -> 15 (275.00%) helped: 0 / HURT: 4 total fills in shared programs: 1903 -> 1917 (0.74%) fills in affected programs: 7 -> 21 (200.00%) helped: 0 / HURT: 4 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19969129 -> 19961439 (-0.04%) instructions in affected programs: 665860 -> 658170 (-1.15%) helped: 354 / HURT: 0 total cycles in shared programs: 884509249 -> 887353784 (0.32%) cycles in affected programs: 323242817 -> 326087352 (0.88%) helped: 208 / HURT: 146 total spills in shared programs: 4801 -> 4808 (0.15%) spills in affected programs: 14 -> 21 (50.00%) helped: 0 / HURT: 6 total fills in shared programs: 4454 -> 4467 (0.29%) fills in affected programs: 17 -> 30 (76.47%) helped: 0 / HURT: 6 Tiger Lake and Ice Lake had similar results. 
(Tiger Lake shown) total instructions in shared programs: 19913774 -> 19906147 (-0.04%) instructions in affected programs: 667348 -> 659721 (-1.14%) helped: 351 / HURT: 3 total cycles in shared programs: 861253468 -> 864535803 (0.38%) cycles in affected programs: 325577148 -> 328859483 (1.01%) helped: 180 / HURT: 174 total spills in shared programs: 3440 -> 3455 (0.44%) spills in affected programs: 18 -> 33 (83.33%) helped: 0 / HURT: 8 total fills in shared programs: 1946 -> 1961 (0.77%) fills in affected programs: 18 -> 33 (83.33%) helped: 0 / HURT: 8 Skylake total instructions in shared programs: 19031768 -> 19023604 (-0.04%) instructions in affected programs: 671633 -> 663469 (-1.22%) helped: 347 / HURT: 7 total cycles in shared programs: 868474831 -> 868132073 (-0.04%) cycles in affected programs: 320499758 -> 320157000 (-0.11%) helped: 246 / HURT: 108 total spills in shared programs: 4024 -> 4063 (0.97%) spills in affected programs: 28 -> 67 (139.29%) helped: 0 / HURT: 18 total fills in shared programs: 3722 -> 3746 (0.64%) fills in affected programs: 34 -> 58 (70.59%) helped: 0 / HURT: 18 fossil-db: Lunar Lake Totals: Instrs: 928574038 -> 928568364 (-0.00%); split: -0.00%, +0.00% Subgroup size: 40916656 -> 40916672 (+0.00%) Send messages: 41467974 -> 41467909 (-0.00%); split: -0.00%, +0.00% Loop count: 970202 -> 970191 (-0.00%) Cycle count: 106297789925 -> 106301305901 (+0.00%); split: -0.00%, +0.01% Spill count: 3424464 -> 3424452 (-0.00%); split: -0.00%, +0.00% Fill count: 6525458 -> 6525119 (-0.01%); split: -0.01%, +0.00% Max live registers: 193525368 -> 193524886 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 232027347 -> 232026610 (-0.00%); split: -0.00%, +0.00% Totals from 1130 (0.06% of 2018793) affected shaders: Instrs: 2662692 -> 2657018 (-0.21%); split: -0.27%, +0.06% Subgroup size: 16 -> 32 (+100.00%) Send messages: 112689 -> 112624 (-0.06%); split: -0.07%, +0.01% Loop count: 5723 -> 5712 (-0.19%) Cycle count: 1176696438 -> 1180212414 
(+0.30%); split: -0.33%, +0.63% Spill count: 9895 -> 9883 (-0.12%); split: -0.13%, +0.01% Fill count: 26892 -> 26553 (-1.26%); split: -1.26%, +0.00% Max live registers: 215462 -> 214980 (-0.22%); split: -0.30%, +0.08% Non SSA regs after NIR: 398940 -> 398203 (-0.18%); split: -0.21%, +0.03% Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown) Totals: Instrs: 1000318839 -> 1000314218 (-0.00%); split: -0.00%, +0.00% Send messages: 45548952 -> 45548887 (-0.00%); split: -0.00%, +0.00% Loop count: 1026441 -> 1026430 (-0.00%) Cycle count: 92411461807 -> 92395024225 (-0.02%); split: -0.02%, +0.00% Spill count: 3665265 -> 3665221 (-0.00%); split: -0.00%, +0.00% Fill count: 6504830 -> 6504801 (-0.00%); split: -0.00%, +0.00% Max live registers: 121790079 -> 121789811 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 38062488 -> 38062648 (+0.00%) Non SSA regs after NIR: 256900770 -> 256900038 (-0.00%); split: -0.00%, +0.00% Totals from 1124 (0.05% of 2284852) affected shaders: Instrs: 2724110 -> 2719489 (-0.17%); split: -0.24%, +0.07% Send messages: 112096 -> 112031 (-0.06%); split: -0.07%, +0.01% Loop count: 5697 -> 5686 (-0.19%) Cycle count: 960659254 -> 944221672 (-1.71%); split: -1.91%, +0.20% Spill count: 13791 -> 13747 (-0.32%); split: -0.40%, +0.08% Fill count: 43216 -> 43187 (-0.07%); split: -0.14%, +0.08% Max live registers: 114877 -> 114609 (-0.23%); split: -0.31%, +0.07% Max dispatch width: 12768 -> 12928 (+1.25%) Non SSA regs after NIR: 412320 -> 411588 (-0.18%); split: -0.20%, +0.03% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:42 +00:00
Ian Romanick	3e0c9ad316	nir/opt_if: Conditionally do not propagate constants through bcsel In some cases propagating through a bcsel may be harmful. If the bcsel uses are unlikely to be eliminated in both branch of an if statement, propagating through it may result in extra moves for phi instructions and extended live ranges. v2: Fix missing parameter in call. Noticed by Rhys. I fixed this on the test machine, but I must have forgotten to propagate the change back to my dev machine. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:41 +00:00
Ian Romanick	a3b6d05a3b	nir/opt_if: Specify which branches are valid for evaluate_if_condition Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:41 +00:00
Marek Olšák	0216f09e45	nir/lower_interpolation: check IO location correctly Vangogh timed out. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38337>	2025-11-10 16:44:36 +00:00
Ahmed Hesham	6901bb0c6c	panfrost/lima/panvk: Define a common vendor ID Rusticl reports `CL_DEVICE_VENDOR_ID` using the `vendor_id` property defined in Panfrost. The value is not set so a `0` is reported instead. Initialise the value to `0x13B5`, which is Arm's PCI vendor ID. Add the definition in `lib/pan_props.h` so it can be shared with Gallium Lima, Panfrost and PanVK. Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38283>	2025-11-10 14:01:40 +00:00
Valentine Burley	e91832739b	venus/ci: Add missing Collabora farm rules to ANV jobs Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38348>	2025-11-10 12:47:26 +00:00
Patrick Lerda	ae049f6fea	r600: limit pre-evergreen predicate ready size With the current stack configuration the rv770 seems to be unable to go beyond three with the "vs-output-array-float-index-wr-before-gs.shader_test" test. Anyway, the value four seems to be sufficient for the other tests. This issue was triggered on rv770, for instance, with: "piglit/bin/shader_runner tests/spec/glsl-1.50/execution/variable-indexing/gs-output-array-float-index-wr.shader_test -auto -fbo" "piglit/bin/shader_runner tests/spec/glsl-1.50/execution/variable-indexing/vs-output-array-float-index-wr-before-gs.shader_test -auto -fbo" Fixes: `713edb5998` ("r600/sfn: handle the IF predicate in the scheduler") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38213>	2025-11-10 12:25:38 +00:00
Karol Herbst	92a4ae0ab2	rusticl/spirv: preserve signed zeroes by default Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38327>	2025-11-10 10:52:56 +00:00
Karol Herbst	df344f12cc	rusticl/kernel: take no kernel_info reference inside the launch closure Otherwise patterns like this wouldn't work: clCreateKernel(prog) clEnqueueNDRangeKernel clReleaseKernel clBuildProgram(prog) Fixes: `bb2453c649` ("rusticl/kernel: move most of the code in launch inside the closure") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38327>	2025-11-10 10:52:56 +00:00


198927 commits