fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-21 12:28:24 +02:00

Author	SHA1	Message	Date
Georg Lehmann	a3d9712e8e	aco/assembler: chain branches in emit order Blocks aren't emitted in the order of program->blocks anymore, use the real emitted order when chaining branches. This fixes assumptions about block offsets only increasing. Fixes: `102aca9843` ("aco/assembler: emit block_kind_loop_latch before the loop header") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15628 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42246>	2026-06-16 19:50:48 +00:00
Marek Olšák	e98c3e63f4	nir/opt_idiv_const: a / uint_max -> b2i(a == uint_max) uint_max instance divisors are used by Deathloop to get stride==0 behavior. Totals from 36 (0.02% of 202440) affected shaders: Instrs: 116064 -> 116028 (-0.03%) CodeSize: 579120 -> 578880 (-0.04%) Latency: 142081 -> 142054 (-0.02%) InvThroughput: 34268 -> 34178 (-0.26%) PreSGPRs: 2264 -> 2272 (+0.35%) VALU: 66228 -> 66192 (-0.05%) We could replace (InstanceIndex / uint_max) with 0 if we assumed that InstanceIndex == uint_max can't practically occur, and promote the VBO loads to SMEM. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42229>	2026-06-16 19:14:27 +00:00
Matt Turner	136596668e	nir: fix dedup_entry memcmp on structs with padding nir_scalar contains a pointer followed by an unsigned, leaving 4 bytes of compiler-inserted trailing padding. Copying a nir_scalar via struct assignment propagates whatever garbage bytes were in the source temporary's padding, so both XXH32(entry, sizeof(dedup_entry)) and memcmp(a, b, sizeof(dedup_entry)) could produce wrong results for semantically identical entries. Entries are allocated with rzalloc, which zeros all bytes. Preserve that invariant by assigning nir_scalar fields member-by-member instead of via struct assignment, keeping the padding bytes zero throughout the entry's lifetime. XXH32 and memcmp over the full struct are then correct. Add a static_assert on sizeof(dedup_entry) to catch future layout changes that would require auditing the assignments. Fixes: `ca137e545c` ("nir/opt_varyings: rewrite elimination of duplicated outputs") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42235>	2026-06-16 18:37:32 +00:00
Daniel Schürmann	9d558626ef	aco/assembler: pass std::vector to insert_code Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41990>	2026-06-16 18:16:15 +00:00
Daniel Schürmann	f43f2f7f72	aco/assembler: Fix s_inst_prefetch insertion after loop latch rotation If the loop latch is not the header, then the preheader ends in a branch in order to jump over the loop latch on the first iteration. Emit the s_inst_prefetch before the branch. 2415 (1.16% of 208626) affected shaders (Navi31) Fixes: `102aca9843` ('aco/assembler: emit block_kind_loop_latch before the loop header') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41990>	2026-06-16 18:16:15 +00:00
Mike Blumenkrantz	f69fbb7fd7	lavapipe: EXT_multisampled_render_to_swapchain Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42053>	2026-06-16 17:14:08 +00:00
Mike Blumenkrantz	9103cd2873	vulkan/wsi: add VK_IMAGE_CREATE_MULTISAMPLED_RENDER_TO_SINGLE_SAMPLED_BIT_EXT where supported this enables the extension to be used with swapchains Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42053>	2026-06-16 17:14:08 +00:00
Mike Blumenkrantz	f037abac8a	vulkan/wsi: pass VkSurfaceCapabilities2KHR to get_capabilities no functional changes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42053>	2026-06-16 17:14:08 +00:00
Faith Ekstrand	e5b39e03b0	kraid: OpShiftLop is unsigned After doing a bit of hardware testing on v13, it seems that all shift+lop are unsigned when it comes to widening. By switching the opcode itself, we'll enable future optimizations where we fold widens into sources. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	b8298e34ff	kraid/swizzle: Return Option<Swizzle> from AsmSwizzleWiden::to_swizzle() This makes it a bit more useful as it now returns None instead of just asserting on you if you attempt an invalid conversion. This also switches it to be basically exactly the same big switch statement as from_swizzle(), just in the other direction. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	ed1aa3dc33	kraid/hw_tests: Allow the test to specify swizzles and lanes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	a758273a34	kraid/validate: Fix 64-bit destination validation Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	fabffca0bd	kraid/swizzle: Take a src_bytes param in Swizzle::bytes_read() Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	04e3afaac8	kraid/swizzle: Add an is_none() special case in fold_u32() This does nothing since NONE is B0123 but it makes the folding code clearer and probably a bit faster for a common case. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:10 +00:00
Faith Ekstrand	162d7091e9	kraid/swizzle: Add a Swizzle::is_none() helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42274>	2026-06-16 16:57:09 +00:00
Rhys Perry	6ab6bc94d3	radv: do nir_opt_algebraic last in radv_optimize_nir_algebraic_early fossil-db (navi21): Totals from 15 (0.02% of 84369) affected shaders: Instrs: 88362 -> 88310 (-0.06%) CodeSize: 477056 -> 476804 (-0.05%) Latency: 1410938 -> 1402039 (-0.63%) InvThroughput: 704454 -> 700004 (-0.63%) Copies: 5034 -> 5024 (-0.20%); split: -0.40%, +0.20% PreSGPRs: 1014 -> 1010 (-0.39%) VALU: 66264 -> 66221 (-0.06%); split: -0.07%, +0.01% SALU: 10115 -> 10120 (+0.05%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38399>	2026-06-16 16:36:12 +00:00
Rhys Perry	06e2f12f85	radv: repeat loop in radv_optimize_nir_algebraic_early more nir_opt_peephole_select, nir_opt_remove_phis and nir_opt_dead_cf can create work for nir_opt_constant_folding, so they should repeat the loop if they made progress. fossil-db (navi21): Totals from 24 (0.03% of 84369) affected shaders: Instrs: 13007 -> 13028 (+0.16%); split: -0.15%, +0.32% CodeSize: 70024 -> 70032 (+0.01%); split: -0.13%, +0.14% Latency: 65968 -> 65623 (-0.52%); split: -0.57%, +0.04% InvThroughput: 18398 -> 18410 (+0.07%); split: -0.06%, +0.13% SClause: 341 -> 327 (-4.11%) Copies: 960 -> 1031 (+7.40%); split: -0.10%, +7.50% PreSGPRs: 834 -> 846 (+1.44%) PreVGPRs: 1022 -> 1021 (-0.10%) VALU: 8665 -> 8684 (+0.22%); split: -0.06%, +0.28% SALU: 1523 -> 1559 (+2.36%); split: -1.05%, +3.41% SMEM: 594 -> 573 (-3.54%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38399>	2026-06-16 16:36:11 +00:00
Rhys Perry	ca9191a8a7	radv: use nir_opt_uub fossil-db (navi21): Totals from 3894 (4.62% of 84369) affected shaders: MaxWaves: 86615 -> 86637 (+0.03%) Instrs: 7490485 -> 7475241 (-0.20%); split: -0.22%, +0.01% CodeSize: 39623244 -> 39530064 (-0.24%); split: -0.25%, +0.01% VGPRs: 208256 -> 208176 (-0.04%); split: -0.06%, +0.02% SpillSGPRs: 1000 -> 996 (-0.40%) Latency: 70318585 -> 70222574 (-0.14%); split: -0.18%, +0.05% InvThroughput: 18896416 -> 18879399 (-0.09%); split: -0.12%, +0.03% VClause: 148207 -> 148155 (-0.04%); split: -0.07%, +0.04% SClause: 168961 -> 168957 (-0.00%); split: -0.06%, +0.06% Copies: 488812 -> 488065 (-0.15%); split: -0.30%, +0.15% Branches: 206484 -> 205549 (-0.45%); split: -0.48%, +0.03% PreSGPRs: 179557 -> 179512 (-0.03%); split: -0.19%, +0.17% PreVGPRs: 178818 -> 178735 (-0.05%); split: -0.05%, +0.00% VALU: 4587157 -> 4580140 (-0.15%); split: -0.17%, +0.01% SALU: 1646841 -> 1640392 (-0.39%); split: -0.43%, +0.04% VMEM: 245983 -> 245974 (-0.00%) SMEM: 256836 -> 256596 (-0.09%); split: -0.10%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38399>	2026-06-16 16:36:11 +00:00
Michael Cheng	cb36bfccee	intel/brw: allow baking more SBID dependencies into instructions on Xe2+ On Xe2+, the SWSB encoding supports combined RegDist + SBID annotations in more cases than the scoreboard lowering pass was utilizing: - SBID DST wait can combine with RegDist when ordered_pipe is ALL (encoded as mode 0b11) - SBID SRC wait can combine with RegDist when the ordered pipe matches the instruction's inferred execution pipe (encoded as mode 0b10) Previously, DST could only be baked when ordered_pipe matched the inferred pipe exactly, and SRC could never be baked when an ordered dependency was present. This caused unnecessary sync nop insertions for the multi-dependency case. Totals: Instrs: 635590947 -> 635755058 (+0.03%) CodeSize: 8898399200 -> 8749325392 (-1.68%) Cycle count: 74079919465 -> 73911266034 (-0.23%); split: -0.23%, +0.00% Max dispatch width: 28235024 -> 28233232 (-0.01%); split: +0.01%, -0.02% Totals from 842018 (69.11% of 1218461) affected shaders: Instrs: 578157685 -> 578321796 (+0.03%) CodeSize: 8100966624 -> 7951892816 (-1.84%) Cycle count: 73446641513 -> 73277988082 (-0.23%); split: -0.23%, +0.00% Max dispatch width: 20028944 -> 20027152 (-0.01%); split: +0.02%, -0.03% Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42205>	2026-06-16 16:12:54 +00:00
Rhys Perry	472957270a	radv: inline some helpers used in radv_pipeline_get_shader_key These are each very simple and are now only used once. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:39 +00:00
Rhys Perry	546442a751	radv: rework creation of traversal radv_shader_stage_key It doesn't make sense to use a key from a random intersection shader. fossil-db (gfx1201): Totals from 3 (0.00% of 210263) affected shaders: Instrs: 9728 -> 10024 (+3.04%) CodeSize: 60140 -> 60012 (-0.21%) Latency: 95724 -> 95905 (+0.19%) InvThroughput: 15015 -> 15044 (+0.19%) VALU: 2985 -> 2997 (+0.40%) VMEM: 345 -> 429 (+24.35%) VOPD: 307 -> 323 (+5.21%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:39 +00:00
Rhys Perry	fc02ffed24	radv: merge radv_shader_stage_key for combined ahit/isec shaders This should be done if the any-hit enables robustness but the intersection does not. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:39 +00:00
Rhys Perry	ec21d14a30	radv: make raytracing radv_shader_stage_key initialization per-shader The traversal stage key is still a bit nonsense. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:39 +00:00
Rhys Perry	8dabaa50c5	radv: make raytracing radv_shader_stage_key array per-shader It doesn't really make sense to make this per-mesa_shader_stage. Each VkPipelineShaderStageCreateInfo can have different flags. This is just a refactor at the moment. Actually letting them differ within a mesa_shader_stage is for a later commit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:39 +00:00
Rhys Perry	e5dfdfa1e0	radv: parse stats from binary in radv_parse_binary_debug_info This is more appropriate, and can be done now that the function is called in radv_shader_deserialize(). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:38 +00:00
Rhys Perry	64b4cb001c	radv: cache shader IR, asm and spir-v This improves RADV_DEBUG=hang's pipeline.log when shader caching is not disabled. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:38 +00:00
Rhys Perry	53e127e33b	radv: add radv_shader_stage_key::keep_shader_arg_info We're going to be caching args_string soon. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:38 +00:00
Rhys Perry	e2d834fde6	radv: simplify radv_declare_shader_args parameters Instead of passing various fields from stage, just pass the entire object. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:37 +00:00
Rhys Perry	1ab613dc5b	radv: don't create nir_string if dump_shader=true NIR printing is done earlier without nir_string, so I don't know why this was done. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:37 +00:00
Rhys Perry	52e37fba33	radv: move nir_debug_info from debug to key It seems likely that this could affect compilation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:36 +00:00
Rhys Perry	6df83703c9	radv: use radv_shader_stage_key::keep_{statistic,executable}_info more No need to create these again and pass them around as parameters. These functions already have plenty of those. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>	2026-06-16 15:35:36 +00:00
Samuel Pitoiset	9a2d7186eb	vulkan: fix lowering untyped accel struct with descriptor heap It points to the heap variable. This fixes dEQP-VK.binding_model.descriptor_heap.basic.raygen.acceleration_structure_untyped. Fixes: `20d11c59a4` ("vulkan: Add a lowering pass for descriptor heap mappings") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42252>	2026-06-16 15:12:33 +00:00
Samuel Pitoiset	ee5f548678	radv/amdgpu: fix padding by one VM page This isn't intended to be used for sparse BOs and it was incorrect anyways because flags isn't initialized, so it was only clearing the original VA range, not including the padding. Since sparse is still experimental on GFX6-7, let's just apply the workaround to non-sparse BOs. This fixes sparse support on VEGA10, since `addc719ec2` ("radv: workaround has_smem_partial_oob_access_bug"). Fixes: `10a5e5e4f3` ("radv/amdgpu: Add ability to pad BOs with a read-only VM page") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42245>	2026-06-16 14:37:17 +00:00
Samuel Pitoiset	d72d4666ba	radv: fix REPLAYED shader arena blocks not being marked as holes on free Because free_list is always NULL for REPLAYED arenas, freed blocks were never passed to add_hole() and freelist.prev was still NULL. So, adjacent blocks were never merged together and that caused a memleak with unreachable blocks. This fixes a memleak detected by ASAN in dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.s0_l11_check_capture_replay_handles and similar tests. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42012>	2026-06-16 13:16:39 +00:00
Caius-Moldovan-img	44290e1899	nir: Fix trailing comment generation for variable naming Add semantic location check, because multiple variables can share same component location. Fixes: `ea863c0c1c` ("nir/print: Do not access invalid indices of load_uniform") Signed-off-by: Caius Moldovan <caius.moldovan@imgtec.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42114>	2026-06-16 12:20:38 +00:00
Rhys Perry	1845e53865	ac/lower_global_access: combine multiple 32-bit offsets fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*): Totals from 224 (13.80% of 1623) affected shaders: Instrs: 16351 -> 13429 (-17.87%) CodeSize: 100640 -> 84732 (-15.81%) VGPRs: 4428 -> 3636 (-17.89%) Latency: 171321 -> 166654 (-2.72%); split: -2.76%, +0.04% InvThroughput: 13990 -> 11588 (-17.17%) VClause: 453 -> 436 (-3.75%) Copies: 1350 -> 1159 (-14.15%); split: -18.07%, +3.93% PreSGPRs: 2010 -> 2084 (+3.68%) PreVGPRs: 1888 -> 1586 (-16.00%) VALU: 7066 -> 4648 (-34.22%) SALU: 2432 -> 3004 (+23.52%) VOPD: 20 -> 5 (-75.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41968>	2026-06-16 10:59:28 +00:00
Rhys Perry	2292109694	ac/lower_global_access: extract constants after ishl/imul Extract constants from imul(iadd(a, #b), #c) or imul(u2u64(iadd.nuw(a, #b)), #c). nir_opt_algebraic tries to make sure those expressions don't appear, but it's not perfect. fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*): Totals from 308 (18.98% of 1623) affected shaders: Instrs: 38046 -> 23516 (-38.19%) CodeSize: 220700 -> 142704 (-35.34%) VGPRs: 9720 -> 6084 (-37.41%) Latency: 264784 -> 241855 (-8.66%); split: -8.71%, +0.05% InvThroughput: 43237 -> 21806 (-49.57%) VClause: 630 -> 647 (+2.70%) Copies: 1812 -> 1762 (-2.76%) PreVGPRs: 4081 -> 2606 (-36.14%) VALU: 19466 -> 9765 (-49.84%) VOPD: 45 -> 36 (-20.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41968>	2026-06-16 10:59:28 +00:00
Rhys Perry	a197036242	ac/lower_global_access: parse u2u64 even if out_offset!=NULL We might be able to extract constants from this 32-bit value. fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*): Totals from 74 (4.56% of 1623) affected shaders: Instrs: 7412 -> 5184 (-30.06%) CodeSize: 44416 -> 31064 (-30.06%) VGPRs: 1920 -> 1392 (-27.50%) Latency: 55606 -> 55354 (-0.45%); split: -0.56%, +0.10% InvThroughput: 7612 -> 4504 (-40.83%) PreVGPRs: 648 -> 578 (-10.80%) VALU: 3960 -> 2406 (-39.24%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41968>	2026-06-16 10:59:27 +00:00
Rhys Perry	3b4a317439	ac/lower_global_access: rewrite try_extract_additions In preparation for future improvements. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41968>	2026-06-16 10:59:27 +00:00
Rhys Perry	4a906e2b03	ac/lower_global_access: set cursor earlier This can improve CSE. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41968>	2026-06-16 10:59:27 +00:00
Jose Maria Casanova Crespo	cff8dbd452	v3dv: rename format_plane unorm/snorm flags to sw_unorm/sw_snorm Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42176>	2026-06-16 10:46:39 +02:00
Jose Maria Casanova Crespo	4edc659231	v3dv: route blending of UNORM16/SNORM16 RTs through software lowering UNORM16/SNORM16 render targets are backed by 16-bit-integer TLB formats, which V3D HW cannot blend. The compiler already supports software blend lowering in NIR, but V3DV only enabled it for dual-src blending. As a result format_supports_blending refused the BLEND_BIT for these formats and Dawn could not advertise the WebGPU Unorm16TextureFormats feature. Set pipeline->blend.use_software when any color attachment uses a software-normalised format so the existing NIR blend lowering kicks in, and expose VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT for those formats. Assisted-by: Claude Opus 4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42176>	2026-06-16 10:46:38 +02:00
Lionel Landwerlin	051ede709c	anv: fix 3DSTATE_SF line width programming with Bresenham lines Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42249>	2026-06-16 08:21:35 +00:00
Lionel Landwerlin	373ec78bc8	anv: add missing condition to update 3DSTATE_RASTER update_clip_raster() checks the rasterization sample count. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42249>	2026-06-16 08:21:35 +00:00
Lorenzo Rossi	1a7ee324e4	kraid: Add Foldable and initial tests for OpShiftLop Add a Foldable trait similar to what is already used in NAK for software emulation of opcodes, since Mali has many variations like V4I8 that run the same exact operation independently on each component of the vector, this commit also adds a FoldableComp trait that lets the implementor only focus on a single component and automatically implements Foldable. We also add tests on OpShiftLop as an initial subject, we'll add most of the arithmetic opcodes as time goes on to have a tight description of the hardware. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
Lorenzo Rossi	15d6595a62	kraid: Add basic hw_tests Add the generic infrastructure to load/store the test data and compile the shader, along with simple tests that use the hw_runner. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
Lorenzo Rossi	1d171a4174	kraid: Add hw_runner This is a very small driver that just sends compute jobs to the graphics card without any of the Vulkan or OpenGL indirections. For now it only supports v10-v13 since it's what Kraid is targeting. Lots of the low-level code that handles CSF encoding and descriptor handling is in C for simplicity (and because there is no genxml equivalent for Rust yet). device.rs also implements a bare-bones memory-safe Rust abstraction for Mali GPUs, as a treat. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
Lorenzo Rossi	a3d39ec727	kraid/model: Ensure dyn Model is Send + Sync We'll need the extra assurance if we want to share the model across threads. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
Lorenzo Rossi	b4b8604fd0	kraid: Add OpIMul Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
Lorenzo Rossi	83c26dd3a6	kraid/ops: Add a small crate documentation for conventions Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42189>	2026-06-16 07:32:12 +00:00
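
Note on e98c3e63f4 (nir/opt_idiv_const): the rewrite a / uint_max -> b2i(a == uint_max) relies on the fact that an unsigned 32-bit quotient by the maximum value can only be 0 or 1. The following is a minimal standalone C sketch of that identity, not the NIR pass itself; the function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: plain unsigned division by the 32-bit maximum. */
static inline uint32_t udiv_by_uint_max(uint32_t a)
{
   return a / UINT32_MAX;
}

/* Rewritten form: the comparison result (0 or 1) converted to an integer,
 * which is what b2i(a == uint_max) expresses in the commit title. */
static inline uint32_t b2i_eq_uint_max(uint32_t a)
{
   return (uint32_t)(a == UINT32_MAX);
}

/* Spot-check the identity on boundary values (hypothetical helper). */
static inline void check_udiv_uint_max_rewrite(void)
{
   const uint32_t samples[] = {
      0u, 1u, 0x7fffffffu, 0x80000000u, UINT32_MAX - 1u, UINT32_MAX,
   };
   for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
      assert(udiv_by_uint_max(samples[i]) == b2i_eq_uint_max(samples[i]));
}
```

Only a == uint_max yields 1; every other input divides to 0, so the compare replaces the division without changing any result.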

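
Note on 136596668e (nir: fix dedup_entry memcmp on structs with padding): the fix keeps padding bytes zero by allocating zeroed storage and then assigning fields member by member, so that hashing or memcmp over the whole struct is meaningful. The sketch below illustrates that pattern under stated assumptions: `struct scalar` and `struct entry` are hypothetical stand-ins for nir_scalar and dedup_entry, `calloc` stands in for rzalloc, and the claim that scalar member stores leave padding untouched reflects mainstream compiler behavior rather than a guarantee spelled out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for nir_scalar: a pointer followed by an unsigned.
 * On typical 64-bit ABIs this leaves 4 bytes of trailing padding. */
struct scalar {
   void *def;
   unsigned comp;
};

/* Hypothetical stand-in for dedup_entry. */
struct entry {
   struct scalar a;
   struct scalar b;
};

/* Layout guard: adding a field means the assignments below need auditing. */
static_assert(sizeof(struct entry) == 2 * sizeof(struct scalar),
              "entry layout changed, audit member-wise assignments");

/* Allocate zeroed storage, then assign each field individually instead of
 * copying whole structs, so no stale padding bytes are copied in. */
static inline struct entry *entry_create(void *def_a, unsigned comp_a,
                                         void *def_b, unsigned comp_b)
{
   struct entry *e = calloc(1, sizeof(*e));
   if (!e)
      return NULL;
   e->a.def = def_a;
   e->a.comp = comp_a;
   e->b.def = def_b;
   e->b.comp = comp_b;
   return e;
}

/* Byte-wise comparison over the full struct, padding included. */
static inline bool entry_equal(const struct entry *x, const struct entry *y)
{
   return memcmp(x, y, sizeof(*x)) == 0;
}
```

By contrast, `e->a = some_temporary;` may copy whatever bytes the temporary held in its padding, which is the failure mode the commit message describes.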

208043 commits
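
Note on 2292109694 (ac/lower_global_access: extract constants after ishl/imul): extracting a constant from imul(iadd(a, #b), #c) amounts to distributing the multiply, so that #b * #c becomes a separate constant offset. The C sketch below checks that arithmetic only; it is not the Mesa pass, and all function names are hypothetical. It also suggests why the widened pattern in the commit message uses iadd.nuw: once the sum is zero-extended to 64 bits, the two forms agree only if the 32-bit add did not wrap.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit case: imul(iadd(a, b), c), wrapping modulo 2^32. */
static inline uint32_t mul_of_add32(uint32_t a, uint32_t b, uint32_t c)
{
   return (a + b) * c;
}

/* Same value with the constant part b * c split off as an offset. */
static inline uint32_t mul_plus_const32(uint32_t a, uint32_t b, uint32_t c)
{
   return a * c + b * c;
}

/* 64-bit case: imul(u2u64(iadd(a, b)), c). The 32-bit add may wrap
 * before the zero-extension. */
static inline uint64_t mul_of_add64(uint32_t a, uint32_t b, uint64_t c)
{
   return (uint64_t)(uint32_t)(a + b) * c;
}

/* Split form: widen first, then multiply and add the constant part. */
static inline uint64_t mul_plus_const64(uint32_t a, uint32_t b, uint64_t c)
{
   return (uint64_t)a * c + (uint64_t)b * c;
}

/* True when a + b does not wrap in 32 bits (the "no unsigned wrap" case). */
static inline int add_is_nuw(uint32_t a, uint32_t b)
{
   return a <= UINT32_MAX - b;
}
```

In the 32-bit case the split is always valid because both sides wrap the same way. In the widened case it is valid only when `add_is_nuw(a, b)` holds; for example a = 0xffffffff, b = 1 gives 0 on one side and a nonzero product on the other.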