fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-04 05:28:05 +02:00

Author	SHA1	Message	Date
Georg Lehmann	d4c0318f48	aco: apply DPP with scalar src1 on gfx11.5+ Foz-DB Navi48: Totals from 6261 (7.62% of 82179) affected shaders: MaxWaves: 176284 -> 176236 (-0.03%); split: +0.01%, -0.03% Instrs: 5850185 -> 5828451 (-0.37%); split: -0.41%, +0.04% CodeSize: 31363324 -> 31419904 (+0.18%); split: -0.08%, +0.26% VGPRs: 328284 -> 328200 (-0.03%); split: -0.07%, +0.05% SpillSGPRs: 2268 -> 2256 (-0.53%) Latency: 50235516 -> 50218816 (-0.03%); split: -0.06%, +0.03% InvThroughput: 8256243 -> 8242036 (-0.17%); split: -0.22%, +0.05% VClause: 81000 -> 80975 (-0.03%); split: -0.11%, +0.08% SClause: 136376 -> 136387 (+0.01%); split: -0.11%, +0.11% Copies: 414021 -> 417894 (+0.94%); split: -0.13%, +1.07% Branches: 105301 -> 105298 (-0.00%); split: -0.00%, +0.00% PreSGPRs: 291360 -> 291432 (+0.02%) PreVGPRs: 238593 -> 238729 (+0.06%); split: -0.02%, +0.08% VALU: 3425446 -> 3403463 (-0.64%); split: -0.65%, +0.01% SALU: 815505 -> 819372 (+0.47%); split: -0.02%, +0.50% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:51 +00:00
Georg Lehmann	3fe329b3d0	aco/ra: don't move sgpr into v_fmac_f32_dpp src0 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:50 +00:00
Georg Lehmann	903d940fa9	aco: don't convert VOP3P to VOP3 when applying DPP Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:50 +00:00
Georg Lehmann	8ac7b9fc37	aco: undo operand swap if applying DPP fails Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:50 +00:00
Georg Lehmann	531228159f	aco/validate: allow dpp with scalar src1 on gfx11.5+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:50 +00:00
Georg Lehmann	140ca3bb50	aco: disable DPP for rev integer subs and shifts It is not documented anywhere, but at least on gfx12 and gfx10.3 DPP is applied to src1 instead of src0. This might be useful for shifts, but to be safe just disable DPP completely for now. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14739 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:49 +00:00
Georg Lehmann	510dbbae7f	aco/optimizer: use opcode_supports_dpp Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:49 +00:00
Georg Lehmann	8e99bf5380	aco: add a helper function for non supported DPP opcodes Cc: mesa-stable Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>	2026-01-27 20:42:49 +00:00
Eric Engestrom	d12e3454e6	nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests Fixes: `4c30c44b75` ("nir: Generate unit tests for nir_opt_algebraic") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39550>	2026-01-27 20:03:16 +00:00
Nanley Chery	4512d81559	intel/blorp: Bump pitch when clearing unaligned bottom rows This might be faster if the layer starts at a 64KB offset. No performance benefits found in the performance CI. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:55 +00:00
Nanley Chery	3e331e4fe9	intel/blorp: Optimize non-zero-layer fast-clears Allow surface redescription when fast-clearing a layer > 0. This affects at least five traces in the performance CI, but the CI doesn't report any performance benefit from this. We already had code to handle unaligned rows at the bottom of an image. Now that this handles the misalignment at the top of the image range, we gain some symmetry. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:55 +00:00
Nanley Chery	ba63883692	intel/blorp: Avoid unused surface redescription calc Suggested-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:54 +00:00
Nanley Chery	e42b2a5d70	anv: Don't partial resolve LOD1+ for non-FCV CCS We don't allow fast-clears in this case. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:54 +00:00
Nanley Chery	21d187b7f5	anv: Support fast clears on more layers On Xe2+, support multi-layer and non-zero-layer CCS fast-clears. To do this in a simple manner, drop the code which splits multi-layer clears into fast clears and slow clears. The performance CI reports no regressions nor improvements on BMG. For MCS on all platforms and for CCS on prior platforms, use a new heuristic. Instead of only allowing fast clears on the first slice/layer, do the following: For 3D images, only fast-clear if all slices are cleared. Enables fast-clearing every slice of 3D textures in: * Terminator Resistance - 480x270x128. * Ghostrunner 2 - 320x180x128. For 2D arrays, match the Xe2+ behavior and allow clearing to any layer. This is possible because we only allow fast-clearing if the clear color matches the default value. Enables fast-clearing every layer of 2D array textures in: * Assassin's Creed - 128x128, 6-layers. * Blackops 3 - 1024x1024, 6-layers. * Borderlands 3 - 128x128, 6-layers. * Cyberpunk - 1024x1024, 10-layers. * Unigine Superposition - 4K, 2-layers. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11893 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:54 +00:00
Nanley Chery	b8f6ad9060	anv: Use variable default value for some images using CLEAR A future commit will enable clearing to more than the first layer of 2D array images. To ensure consistency for the clear color, require the ANV_FAST_CLEAR_DEFAULT_VALUE for such images if they make use of ISL_AUX_STATE_CLEAR. Also, use a non-zero default value for some image formats. I tested the majority of workloads in the performance CI. This will cause those which clear to 2D array layers to gain clears on more than just the first layer. At the moment, we still only support clearing the first layer, so there should be no change in performance. Affected games are documented in the code. Acked-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:53 +00:00
Nanley Chery	811c413f98	anv: Don't return the Xe2+ fast-clear type early Don't return early from anv_layout_to_fast_clear_type() for Xe2+. We'll need to make more use of the function for some MCS changes in later commits. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:53 +00:00
Nanley Chery	7bb7b63b96	anv: Line wrap anv_CmdClearColorImage Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:52 +00:00
Nanley Chery	390c9e3fda	anv: Inline the CCS/MCS predicated resolve functions Now we can see the MI writes performed before and after the resolves in transition_color_buffer(). Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:52 +00:00
Nanley Chery	4d8c71ab1f	anv: Delete conversion of CCS_D partial resolve Now that hasvk is the driver for supporting HSW and BDW, we no longer need to convert CCS_D partial resolves to full resolves to avoid an assert-failure in BLORP. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:51 +00:00
Nanley Chery	b1db1179c2	anv: Set compressed bit separately from fast-clear type This will make handling fast-clears on multiple layers simpler by saving us from having to pass more parameters into fast-clear state setting functions. It also allows us to set more complex fast-clear state for FCV_CCS_E without marking the image as compressed. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	c054d4fe2f	anv: Support partial resolves on any level/layer Enables more support for FCV_CCS_E partial resolves if we ever need it. Also enables support for multiple layers being fast cleared and needing resolves. Support for that will arrive in several commits. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	0a8ab13b9d	anv: Reset fast-clear type in transition_color_buffer() Moving the code here will simplify the task of supporting fast-clears on multiple array layers and depth slices. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Nanley Chery	ce196c9de5	anv: Fix the fast clear type for FCV writes We started allowing non-default clear colors with FCV in commit `cd8e120b97`. When rendering to an image with FCV, set the fast-clear type to ANV_FAST_CLEAR_ANY if the image properties allow such fast-clears. Fixes: `cd8e120b97` ("anv: Allow more single subresource fast-clears with FCV") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Nanley Chery	e7854d06a5	anv: Update predicated resolve documentation * Don't mention gfx7-8 due to the hasvk split. * Account for the array of clear colors. Fixes: `0e6b132a75` ("anv: Access more colors in fast_clear_memory_range") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:48 +00:00
Nanley Chery	6c6b2d8f30	iris: Use the CLEAR state on Xe2+ for MCS On Xe2+, HSD 14011946253 and the related documents explain that MCS still only supports a single clear color. Fixes: `df006bba02` ("iris: Update aux state for color fast clears (xe2)") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:48 +00:00
Nanley Chery	3b642f7456	iris: Set missing flags on clear color changes When changing the clear color without a fast clear, use dirty bits to ensure that surfaces with inline clear colors are updated and that partial resolves are done as needed. Remove the flags at the bottom of fast_clear_color() as blorp_fast_clear() already sets them for us. Fixes: `64d861b700` ("iris: Skip some fast-clears even on color changes") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:48 +00:00
Nanley Chery	eb4a581e44	intel/isl: Fix QPitch of arrayed MCS From RENDER_SURFACE_STATE::AuxiliarySurfaceQPitch on BDW+: "This field must be set to an integer multiple of the Surface Vertical Alignment." Accomplish this by aligning the height of each MCS layer to the main surface's vertical alignment. Prevents the following test group from failing on Xe2 when a future commit enables multi-layer fast-clears in anv: dEQP-VK.api.image_clearing.. clear_color_attachment.multiple_layers. _clamp_input_sample_count_* The main test I used to debug this: dEQP-VK.api.image_clearing.core.clear_color_attachment.multiple_layers.a8b8g8r8_unorm_pack32_64x11_clamp_input_sample_count_2 Backport-to: 25.3 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:47 +00:00
Connor Abbott	b95839b9c9	ir3: Fix barrier error case calculation Make waves_per_wg actually be the waves per workgroup, so that the condition below for when we should error out actually is correct. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39463>	2026-01-27 18:15:06 +00:00
Mel Henning	f3c53cf66b	nvk: Disable large pages for now Reviewed-by: Mary Guillemard <mary@mary.zone> Fixes: `cabfdb4404` ("nvk: Enable compression") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39364>	2026-01-27 17:58:40 +00:00
Georg Lehmann	4b1996b1c7	aco: fix demote in header of single iteration loop The control is not divergent before a divergent break in a single iteration loop, but we already pushed the loop mask on the stack. Fixes: `90faadae72` ("aco/insert_exec_mask: don't disable dead quads on demote in divergent CF") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14733 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39528>	2026-01-27 17:39:05 +00:00
Duncan Brawley	b8889f5eaa	pvr: add basic support for shader statistics framework Mesa now has a statistics framework. This adds support for emitting additional statistics about PowerVR shaders for the Rogue architecture. Add support for emitting the following statistics: Code size, scratch size, spill count, temp count, loop count, number of inst groups, number of main inst groups, number of bitwise inst groups and number of control inst groups. Add support for new PCO_DEBUG_PRINT option "stats" to emit shader stats. Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com> Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39523>	2026-01-27 16:58:30 +00:00
Kenneth Graunke	41d7debcfe	brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation This avoids generating some useless math that would need to be cleaned up later, without complicating things too much. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	24c66d3871	brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize This helps cut down URB messages on tessellation and mesh shaders significantly. fossil-db results on Battlemage: Instrs: 505172392 -> 505207187 (+0.01%); split: -0.00%, +0.01% Send messages: 23678197 -> 23656126 (-0.09%); split: -0.09%, +0.00% Cycle count: 63150470088 -> 63147482640 (-0.00%); split: -0.01%, +0.00% Spill count: 576554 -> 576616 (+0.01%) Fill count: 545304 -> 545413 (+0.02%) Max live registers: 141099192 -> 141150675 (+0.04%); split: -0.00%, +0.04% Max dispatch width: 39856192 -> 39856208 (+0.00%) Totals from 4231 (0.27% of 1583648) affected shaders: Instrs: 1620161 -> 1654956 (+2.15%); split: -0.25%, +2.40% Send messages: 128652 -> 106581 (-17.16%); split: -17.18%, +0.03% Cycle count: 24650700 -> 21663252 (-12.12%); split: -12.82%, +0.70% Spill count: 378 -> 440 (+16.40%) Fill count: 1308 -> 1417 (+8.33%) Max live registers: 364676 -> 416159 (+14.12%); split: -0.24%, +14.36% Max dispatch width: 67952 -> 67968 (+0.02%) There are several reasons we didn't go with nir_opt_vectorize_io: 1. nir_opt_vectorize_io appears to work on the slot location level. We want to be able to vectorize based on the URB offsets, especially for cases like point size, layer, and viewport which have different VARYING_SLOT_* values but live in the same vec4 in a URB entry. 2. We want vec8 stores, and nir_opt_vectorize_io only seems to vectorize within a single 32-bit vec4. It does handle 8 components, but that's only for packing 16-bit values into a 32-bit vec4. Improves performance of Sascha Willems' tessellation demo by around 4% on Meteorlake. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	aafe8967fd	brw: Avoid using URB global offset with per-slot offsets on <= Icelake Both the URB Global Offset and Per-Slot Offsets are specified to be unsigned numbers. The URB Global Offset is only 11 bits, and so is limited to be between [0, 2047]. While the per-slot offsets are given as U32 values, it would appear that adding the two offsets does not handle 32-bit overflow/unsigned wrap correctly. This pops up in Piglit's TCS variable-indexing tests, which ends up performing loads from offset (x - 16) and a base of 18, and at an offset (x) with a base of 2. These should be equivalent, but when x <= 15, the per-slot offset calculated in the shader is negative (0xfffffff[0-f]) and adding the base of 18 is not wrapping around correctly to [2, 17]. To work around this, avoid using the global offset when the per-slot offset is present, and just add the two in the shader where unsigned wrap works correctly. Tigerlake and later don't seem to have this issue. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	07ac0e3463	brw: Skip vec8 store_urb_vec4_intel noop writemasks as well We were checking for 0xf which is fine for vec4, but vec8 gets 0xff. Either way, nothing is writemasked, so we can skip sending the mask. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	dbb24ff56b	brw: Assert that urb_vec4_intel stores only have 4/8 components vec1-3, 5-7, and 9+ are not supported. Only vec4 and vec8. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	b844082017	nir: Add a round_up_components callback to load/store vectorization By default, load/store vectorization uses nir_round_up_components() to round up loads and possibly writemasked stores to the next valid NIR vector width. However, some backends may not support load/stores at all sizes. For example, older Intel supports only power-of-two vector widths. Newer Intel also supports vec2 and vec3, but not vec5/6/7. By providing a callback, backends can request promotion to their next supported memory load/store vector width. The existing "should we vectorize?" callback should continue to return false for unsupported vector widths (i.e. beyond the maximum supported). With this new callback, they do not need to say "no" to vectorization that would normally produce an unsupported count (e.g. vec5/6/7) but instead request that the component count be rounded up appropriately. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	e23a83b786	nir: Add load/store vectorizer option for rounding up masked stores This adds a new option, round_up_store_components, which rounds up the number of components for stores that support writemasking to the next valid vector size. For example, vec4+vec2 stores would round up from 6 components (which wouldn't be supported) to a full supportable vec8 store, relying on writemasking to ensure the correct pieces are written. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	37f3c59b2c	nir: Teach opt_load_store_vectorize how to handle Intel URB intrinsics URB intrinsics are simply memory load/stores to a special memory region, so it's pretty reasonable to handle these in the memory vectorizer. We treat emit_vertex_* intrinsics as a barrier for shader outputs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	c2f03ba12f	nir: Add memory modes to URB load intrinsics This makes it easier for NIR passes to distinguish between inputs and outputs without having to reason about which URB handle source was passed to the intrinsic. It probably also makes it a bit easier for humans to read the NIR too. v2: Don't add memory mode to store intrinsics. It's always output. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Tapani Pälli	bb84773c81	blorp: fix asserts hit with msaa blorp blits on xe3 Tested on PTL, fixes various copy_and_blit tests that utilize compute after `ab9d3528dc` that exposed this to them. Fixes: `ab9d3528dc` ("anv: fix queue check in anv_blorp_execute_on_companion on xe3") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39548>	2026-01-27 15:28:55 +00:00
Caleb Callaway	1038ab7b57	anv/driconf: Disable shader LTO for MHW MHW has a long-running shader compile step on first launch that is significantly sped up by disabling Link Time Optimization in the ANV driver. Shader compile times with LTO disabled are 50% of baseline measurements and the benchmark shows no statistically significant change to performance (tested on LNL-M OOB). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39544>	2026-01-27 14:57:21 +00:00
Caleb Callaway	a91a636faf	driconf: LTO disable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39544>	2026-01-27 14:57:20 +00:00
Samuel Pitoiset	5709644f2c	radv: optimize barriers when clearing HiZ on GFX12 HiZ must only be cleared when the full HiZ workaround is enabled. This means that the previous slow clear draw would disable HiZ because it hits the conditions (ie. depth/stencil enable and depth writes enabled). So, the draw and the dispatch can run in parallel by moving the barrier earlier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39433>	2026-01-27 14:37:01 +00:00
Samuel Pitoiset	96829d6c5e	radv/meta: return the flush bits from radv_clear_hiz() Similar to other functions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39433>	2026-01-27 14:37:01 +00:00
Samuel Pitoiset	5911ba5ff5	radv/meta: fix 3D color resolves with compute when base slice isn't zero Needs to consider the base offset, otherwise it's resolving to the first 3D slice. Fixes very recent VKCTS coverage dEQP-VK.pipeline..multisample.m10_resolve.. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39393>	2026-01-27 14:14:19 +00:00
Caterina Shablia	a3ec5ece8b	panvk: fix sparse image non-opaque binds I have no idea how this passed CTS. Fixes: `5326c451` ("panvk/csf: implement sparse image non-opaque binds") Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39546>	2026-01-27 12:35:09 +00:00
Hans-Kristian Arntzen	27c61f3c0c	docs: Add VK_EXT_present_timing to new features. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:53 +00:00
Hans-Kristian Arntzen	a9e261fa14	wsi/common: Allow timestampValidBits < 64 for present timing. In this case, do the wrapping logic on our end and normalize right away to host time domain. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:53 +00:00
Hans-Kristian Arntzen	5e2814c8a4	wsi/display: Implement present timing on KHR_display. Deal with VRR vs FRR as well. Loosely based on earlier work by Keith Packard and Emma Anholt (MR 38472 for reference). Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:53 +00:00
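
The unsigned-wrap arithmetic described in commit aafe8967fd above can be checked with a small sketch. It only models 32-bit unsigned addition as the commit message describes it for shader math; the function names are hypothetical and nothing here is taken from the brw source.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit unsigned integer arithmetic


def add_u32(a: int, b: int) -> int:
    """Add two values with 32-bit unsigned wraparound."""
    return (a + b) & MASK32


def offset_added_in_shader(x: int) -> int:
    """Hypothetical helper: per-slot offset (x - 16) plus a base of 18.

    For x <= 15 the per-slot offset is 0xfffffff0..0xffffffff, and the
    wrapped sum lands back in the range [2, 17].
    """
    per_slot = (x - 16) & MASK32
    return add_u32(per_slot, 18)


# The wrapped sum matches the equivalent access at offset x with base 2.
for x in range(16):
    assert offset_added_in_shader(x) == add_u32(x, 2)
```

This mirrors the workaround only at the level of arithmetic: when both offsets are summed with correct unsigned wrap, (x - 16) + 18 and x + 2 address the same location.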


217825 commits