Commit graph

222015 commits

Author SHA1 Message Date
Faith Ekstrand
ddfde51985 pan/nir: Add a pass for lowering texture ops in NIR on Valhall+
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
ffae24bfe2 panvk: Implement bitfield_select
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
58cba7887a nir: Add a new nir_texop_gradient_pan
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
e0fffabda7 nir/builder: Allow backend1/2 in nir_build_tex()
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
337aaa0ab9 pan,nir: Add cube face intrinsics
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:15 +00:00
Faith Ekstrand
c99f97efd3 panfrost: Add and use a new pan_nir_res_handle() helper
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:15 +00:00
Yiwei Zhang
1883f3094f ci: uprev virglrenderer
This uprev:
- brings in vrend fixes with virgl ci expectation updated
- enables new venus extensions support
- drops render-server-worker since process isolation is the default
- updates venus ci expectations

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41331>
2026-05-05 00:41:46 +00:00
Jesse Natalie
758a0e1ad9 d3d12: proactively trim completed pending-free entries
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:44 +00:00
Jesse Natalie
955b2672d3 d3d12: drop peer-batch peeking in resource_is_busy / wait_idle
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:44 +00:00
Jesse Natalie
18012b69ab d3d12: implement pb_fence vtbl for cache/slab reuse
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:44 +00:00
Jesse Natalie
b8f2b968de d3d12: reclaim in-flight BO memory on allocation failure
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:43 +00:00
Jesse Natalie
a1c7f7479d d3d12: transfer batch->bos refs to screen at submit
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:43 +00:00
Jesse Natalie
a518b7f103 d3d12: transfer batch local_bos refs to screen at submit
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:42 +00:00
Jesse Natalie
3e47a65811 d3d12: clear stale per-context BO state at context destroy
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:41 +00:00
Jesse Natalie
381b56389c d3d12: add screen pending-free list plumbing
Assisted-by: Claude Opus 4.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41322>
2026-05-05 00:22:41 +00:00
Konstantin Seurer
af746cc2a6 radv/rt: Use 64-bit keys for gfx11-
This has a bit of sorting overhead, but can significantly increase BVH
quality especially in big BVHs. gfx12 is faster at intersecting, so only
enable for gfx11 and earlier right now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
2026-05-04 20:42:50 +00:00
Konstantin Seurer
c432ffc5ce vulkan: Implement 64-bit morton codes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
2026-05-04 20:42:50 +00:00
Konstantin Seurer
74e21c2c59 vulkan: Rename key_id_pair to key32_id_pair
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
2026-05-04 20:42:49 +00:00
Konstantin Seurer
04463fe91e vulkan: Rename radix_sort to radix_sort_u64
Preparation for optionally building with a 96-bit radix sort.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
2026-05-04 20:42:49 +00:00
Konstantin Seurer
a1c2b96cd1 vulkan/radix_sort: Add support for 96-bit keys
64-bit morton codes are required for decent lbvh tlas builds since the
scene bounds are usually much bigger than the area that is actually
important.

The changes were done without understanding the code but they seem to
work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
2026-05-04 20:42:49 +00:00
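The precision argument in the message above can be illustrated with a minimal sketch. It assumes the common layout of 21 bits per axis for a 64-bit Morton code and 10 bits per axis for a 32-bit one; the actual bit layout used by the Mesa BVH build is not stated in this log, and the helper names (quantize, spread_bits_21, morton64) are hypothetical.

```c
#include <stdint.h>

/* Quantize a coordinate normalized to [0, 1] into `bits` bits. */
static uint32_t quantize(double v, unsigned bits)
{
   const double max = (double)((1u << bits) - 1);
   if (v < 0.0)
      v = 0.0;
   if (v > 1.0)
      v = 1.0;
   return (uint32_t)(v * max);
}

/* Move bit i of v to bit 3*i of the result (assumed: 21 bits per axis). */
static uint64_t spread_bits_21(uint32_t v)
{
   uint64_t out = 0;
   for (unsigned i = 0; i < 21; i++)
      out |= (uint64_t)((v >> i) & 1u) << (3 * i);
   return out;
}

/* Interleave x, y and z into a 63-bit Morton code stored in 64 bits. */
static uint64_t morton64(uint32_t x, uint32_t y, uint32_t z)
{
   return spread_bits_21(x) | (spread_bits_21(y) << 1) |
          (spread_bits_21(z) << 2);
}
```

With 10 bits per axis, two points at 0.5 and 0.50001 of the scene extent quantize to the same cell, while with 21 bits per axis they land in different cells. This is the situation the message describes, where the interesting geometry occupies a small part of much larger scene bounds.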
José Roberto de Souza
a2175b7ec3 iris: Improve and standardize the behavior of madvice in i915
This removes the conversion between iris_madvice and i915 values, replacing it
with a static assert in case these values ever stop matching.

Also adds a warn-once in case DRM_IOCTL_I915_GEM_MADVISE ever fails.

Finally, if DRM_IOCTL_I915_GEM_MADVISE fails, report the bo as no longer
retained, which is the safe behavior.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Suggested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40573>
2026-05-04 20:11:23 +00:00
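The pattern described in the message above (matching enum values checked by a static assert, a one-time warning, and a fail-safe return value) can be sketched as follows. This is not the driver code: the enum member names, the bo_madvise() helper and the callback type are assumptions made for illustration, and the kernel constants are redefined locally so the sketch builds without kernel headers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Local stand-ins for the kernel uAPI values from i915_drm.h. */
#define I915_MADV_WILLNEED 0
#define I915_MADV_DONTNEED 1

/* Assumed driver-side enum whose values mirror the kernel's. */
enum iris_madvice {
   IRIS_MADVICE_WILL_NEED = 0,
   IRIS_MADVICE_DONT_NEED = 1,
};

/* No conversion function: the build breaks if the values ever diverge. */
_Static_assert((int)IRIS_MADVICE_WILL_NEED == I915_MADV_WILLNEED,
               "iris_madvice must match i915 values");
_Static_assert((int)IRIS_MADVICE_DONT_NEED == I915_MADV_DONTNEED,
               "iris_madvice must match i915 values");

/* Hypothetical wrapper around the ioctl: returns 0 on success. */
typedef int (*madvise_ioctl_fn)(int madv, int *retained);

/* Returns true if the BO still has its backing pages. */
static bool bo_madvise(madvise_ioctl_fn do_ioctl, enum iris_madvice state)
{
   static bool warned = false;
   int retained = 1;

   if (do_ioctl((int)state, &retained) != 0) {
      if (!warned) {
         fprintf(stderr, "madvise ioctl failed, treating BO as purged\n");
         warned = true;
      }
      /* Safe default: the caller must not reuse the old contents. */
      return false;
   }
   return retained != 0;
}

/* Fake ioctls for demonstration only. */
static int fake_ioctl_ok(int madv, int *retained)
{
   (void)madv;
   *retained = 1;
   return 0;
}

static int fake_ioctl_fail(int madv, int *retained)
{
   (void)madv;
   (void)retained;
   return -1;
}
```

Reporting "not retained" on failure costs at most a reallocation, whereas reporting "retained" for a purged buffer would let the caller reuse memory that no longer exists.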
José Roberto de Souza
cbc1ec206d intel: Add support for madvise purgeable VMAs in Xe KMD
This uAPI was initially part of the first public version of the Xe KMD uAPI, but
as it did not have any users it was removed in one of the fixes releases of the
Linux version that added Xe KMD, and I missed updating the comment in Mesa.

At that time this uAPI had a restriction, tied to VMs created with
DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE, that did not allow us to use it, but this
flag is now supported, so implement it here.

Link: https://patchwork.freedesktop.org/series/156651/
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40573>
2026-05-04 20:11:23 +00:00
José Roberto de Souza
b2293977e4 intel: Sync xe_drm.h
Sync with:
    commit a6fe8bd56f7344b0c42f287c4b744d4d43e31ebe
    Merge: 0389aa700912 314f6179e370
    Author: Dave Airlie <airlied@redhat.com>
    Date:   Thu Apr 23 16:01:08 2026 +1000

        Merge tag 'drm-intel-next-fixes-2026-04-22' of https://gitlab.freedesktop.org/drm/i915/kernel

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40573>
2026-05-04 20:11:23 +00:00
Caleb Callaway
0d9ae02665 docs: fix Intel tracepoints.py path
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40988>
2026-05-04 20:06:17 +00:00
Gurchetan Singh
b5f91ed589 gfxstream: emit global state wrapped decoding for vkCmdEvent
Helpful for gfxstream-on-lavapipe.

Test: launch_cvd --gpu_mode=gfxstream_guest_angle_host_lavapipe

Reviewed-by: David Gilhooley <djgilhooley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41306>
2026-05-04 19:49:52 +00:00
Serdar Kocdemir
4ac60d221f gfxstream: some cleanup on device extension allow list
Remove duplicated items, use KHR version of vertex attrib divisor
extension, re-enable VK_KHR_16bit_storage.

Test: CI

Reviewed-by: David Gilhooley <djgilhooley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41306>
2026-05-04 19:49:52 +00:00
Mike Blumenkrantz
7a56d8112f vulkan: update spec to 1.4.350
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41317>
2026-05-04 19:11:49 +00:00
Eric Guo
352a8d6beb pan/compiler: Clamp fp16 ldexp exponent range
Fix OpenCL-CTS error in `math_brute_force/test_bruteforce -w ldexp`

Valhall LDEXP.v2f16 takes a 16-bit exponent, while NIR ldexp uses a
32-bit exponent. Truncating large exponents can flip overflow into
underflow or leave huge 16-bit exponents to hardware behavior that does
not match OpenCL's expected signed infinity/zero results.

Clamp the exponent to a range sufficient to overflow or underflow all
fp16 values before lowering to ldexp16_pan.

Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41234>
2026-05-04 17:59:18 +00:00
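The truncation hazard described in the message above can be reproduced on the CPU. The sketch below is illustrative only: the bound of 64 is an assumption (the message does not state the range actually used), chosen because fp16 magnitudes span roughly 2^-24 to 2^16, so scaling by 2^64 or 2^-64 already saturates every finite value. The function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative bound, not taken from the commit. */
#define FP16_LDEXP_EXP_LIMIT 64

/* What plain truncation to 16 bits does to a 32-bit exponent
 * (two's complement wrap-around). */
static int32_t truncate_to_i16(int32_t e)
{
   uint32_t low = (uint32_t)e & 0xffffu;
   return low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Clamp first, so the later 16-bit truncation cannot change the sign
 * or the saturating behavior of the exponent. */
static int32_t clamp_ldexp_exp(int32_t e)
{
   if (e > FP16_LDEXP_EXP_LIMIT)
      return FP16_LDEXP_EXP_LIMIT;
   if (e < -FP16_LDEXP_EXP_LIMIT)
      return -FP16_LDEXP_EXP_LIMIT;
   return e;
}
```

For example, an exponent of 40000 should overflow to infinity, but truncating it to 16 bits yields -25536, which underflows to zero instead. An exponent of 65536 truncates to 0 and leaves the value unchanged. After clamping, both cases keep a large positive exponent.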
Rhys Perry
081feabf9c nir/search: fix nir_algebraic_automaton after constant folding op(bcsel)
Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/jobs/98917704

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f4812dc11d ("nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41343>
2026-05-04 17:27:38 +00:00
Samuel Pitoiset
f47e7b7bd5 radv: bump VkConformanceVersion to 1.4.5.3
This property is unrelated to the CTS conformance process from Khronos,
it just means that the driver passes that CTS version, even if not
"officially" conformant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41258>
2026-05-04 17:05:47 +00:00
Georg Lehmann
0ff1650662 ac/nir/lower_tex_coord: fix moving wqm coordinates
Even if they are in the same block, we might still need to move the
source instructions if they are otherwise after our insert location.
This can happen in the case where we insert strict_wqm_coord before
terminate_if.

Fixes: ac33f82d54 ("ac/nir/lower_tex_coords: move input loads instead of cloning them")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41336>
2026-05-04 15:09:46 +00:00
Patrick Lerda
2c1923458c r600: update memory barrier operations
Note: the atomic host-mem-barrier tests assume that the atomic
buffer could be shared, which is not how r600 operates.

This change was tested on palm and cayman. With the exception
of the "atomic counter" tests, it fixes all the other cases:
spec/arb_shader_image_load_store/host-mem-barrier/.*: fail pass

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41199>
2026-05-04 11:34:57 +00:00
Rhys Perry
6f50dda648 aco/gfx11.7: fix v_pk_min_f16/v_pk_max_f16 opcode numbers
Apparently the opcode numbers in LLVM were wrong:
https://github.com/llvm/llvm-project/pull/195180

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 58debf726c ("aco/gfx11.7: add opcode numbers")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41311>
2026-05-04 11:11:52 +00:00
Danylo Piliaiev
8a146a1be9 tu/perfetto: Add performance warning tracepoints
LRZ and FDM have a few major performance pitfalls. If they are not
clearly surfaced in a perfetto trace, they are easy to miss.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
2026-05-04 10:46:39 +00:00
Danylo Piliaiev
109d98b4cf tu/perfetto: Add a performance warning track to perfetto
The idea is to emit single tracepoints with a warning that
sticks until the relevant render stage ends.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
2026-05-04 10:46:39 +00:00
Connor Abbott
638b10c5e0 tu: Disable LRZ when resuming if the GPU doesn't support tracking
We rely on tu_lrz_flush_valid_at_suspending_rp_boundary() to make sure
that subsequent resuming renderpasses get the correct LRZ state. However
this doesn't work on early a6xx GPUs without tracking support. Disable
LRZ in this case, similar to secondaries.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
2026-05-04 10:46:39 +00:00
Connor Abbott
f590e46b9d tu: Fix LRZ+FDM offset+secondaries
As the comment says, we need to have an image view in order to disable
LRZ so that secondaries know it's disabled. Noticed by inspection.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
2026-05-04 10:46:38 +00:00
Daniel Schürmann
f4812dc11d nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)
for all ALU instructions except fneg instead of using nir_opt_algebraic
for a small subset.

Totals from 17711 (8.49% of 208640) affected shaders: (Navi48)
MaxWaves: 364391 -> 364397 (+0.00%); split: +0.01%, -0.01%
Instrs: 33873994 -> 33780398 (-0.28%); split: -0.31%, +0.03%
CodeSize: 198627596 -> 198259724 (-0.19%); split: -0.23%, +0.05%
VGPRs: 1435516 -> 1435144 (-0.03%); split: -0.04%, +0.02%
SpillSGPRs: 652827 -> 654577 (+0.27%); split: -0.00%, +0.27%
SpillVGPRs: 594840 -> 593598 (-0.21%); split: -0.28%, +0.07%
Scratch: 31791360 -> 31543552 (-0.78%)
Latency: 417824569 -> 415881858 (-0.46%); split: -0.48%, +0.02%
InvThroughput: 80376232 -> 80307996 (-0.08%); split: -0.10%, +0.01%
VClause: 557238 -> 554770 (-0.44%); split: -0.50%, +0.06%
SClause: 688297 -> 688125 (-0.02%); split: -0.04%, +0.02%
Copies: 3571756 -> 3566704 (-0.14%); split: -0.44%, +0.29%
Branches: 628710 -> 628576 (-0.02%); split: -0.07%, +0.05%
PreSGPRs: 1100316 -> 1103478 (+0.29%); split: -0.02%, +0.30%
PreVGPRs: 1132139 -> 1128765 (-0.30%); split: -0.30%, +0.00%
VALU: 18944830 -> 18912030 (-0.17%); split: -0.20%, +0.03%
SALU: 4363054 -> 4342748 (-0.47%); split: -0.57%, +0.10%
VMEM: 1894420 -> 1891754 (-0.14%); split: -0.19%, +0.05%
SMEM: 1073860 -> 1073741 (-0.01%); split: -0.01%, +0.00%
VOPD: 1734659 -> 1735718 (+0.06%); split: +0.20%, -0.14%

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
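The rewrite named in the subject line above, op(bcsel(), #c) -> bcsel(.., #c1, #c2), can be illustrated outside of NIR with a small sketch. It assumes both arms of the select are constants, which is how the #c notation reads here, and uses integer addition as the example operation. The types and function names are hypothetical and are not the NIR API.

```c
#include <stdbool.h>
#include <stdint.h>

/* A binary ALU operation on constants. */
typedef int32_t (*alu_op)(int32_t a, int32_t b);

static int32_t op_iadd(int32_t a, int32_t b)
{
   return a + b;
}

/* A select whose two arms are compile-time constants. */
struct const_bcsel {
   int32_t if_true;
   int32_t if_false;
};

/* Fold op(bcsel(cond, a, b), c) into bcsel(cond, op(a, c), op(b, c)).
 * The condition is untouched; only the constant arms are rewritten. */
static struct const_bcsel fold_op_of_bcsel(alu_op op, struct const_bcsel sel,
                                           int32_t c)
{
   struct const_bcsel folded = { op(sel.if_true, c), op(sel.if_false, c) };
   return folded;
}

/* Evaluate a select for a given condition. */
static int32_t eval_bcsel(bool cond, struct const_bcsel sel)
{
   return cond ? sel.if_true : sel.if_false;
}
```

For instance, iadd(bcsel(cond, 3, 5), 10) becomes bcsel(cond, 13, 15): the addition disappears at compile time and only the select remains.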
Daniel Schürmann
8b1c60add4 nir/opt_constant_folding: create const_value_for_alu() helper
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
Georg Lehmann
52b195b4e8 nir/opt_algebraic: add more fmulz pattern
Totals from 3 (0.00% of 202440) affected shaders: (Navi48)
Instrs: 5684 -> 5641 (-0.76%); split: -0.77%, +0.02%
CodeSize: 30952 -> 30708 (-0.79%); split: -0.80%, +0.01%
Latency: 9236 -> 9199 (-0.40%); split: -0.42%, +0.02%
InvThroughput: 2287 -> 2273 (-0.61%)
VALU: 3900 -> 3884 (-0.41%)
SALU: 305 -> 289 (-5.25%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
Emma Anholt
7372c7c9e2 tu: Add capture/replay for sparse buffers and descriptor buffer.
This matches the behavior of radv for these two.

Fixes:
dEQP-VK.binding_model.descriptor_buffer.traditional_buffer.capture_replay.sparse_buffer_descriptor_data_consistency
dEQP-VK.binding_model.descriptor_buffer.traditional_buffer.capture_replay.sparse_buffer_descriptor_data_consistency_and_usage

Fixes: 8feed47fce ("tu: Initial support for sparse binding")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38148>
2026-05-04 08:09:19 +00:00
Pierre-Eric Pelloux-Prayer
917058a4c5 radeonsi/tests: update expectations
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:33 +02:00
Pierre-Eric Pelloux-Prayer
2267c14803 ac/info: add gfx12.1 identification
Not the full support yet, just the id part so the family/gfx_level
fields are set to the proper values.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:31 +02:00
Pierre-Eric Pelloux-Prayer
20b0349b05 radeonsi: clamp cp prefetch size
Limit the size instead of asserting that the size (which comes
from the shader bo) is smaller.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15184
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:28 +02:00
Pavel Ondračka
c1f1b704d9 dri: add big-endian 8888 entries to driImageFormatToSizedInternalGLFormat
So that dma-buf-imported EGLImages on big-endian hosts resolve to a
sized GL internal format in st_bind_egl_image() instead of falling
back to unsized GL_RGBA/GL_RGB.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:43 +02:00
Pavel Ondračka
8f56c51d51 dri: add big-endian 8888 entries to dri2_format_table
So that dri2_get_mapping_by_fourcc() resolves the byte-reversed fourccs
(DRM_FORMAT_BGRA/BGRX/RGBA/RGBX8888) used for the native 8888 visual
on big-endian hosts.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:31 +02:00
Pavel Ondračka
1e97d3ed94 dri3: add big-endian 8888 fourccs to dri3_cpp_for_fourcc
Otherwise dri3_alloc_render_buffer() fails on big-endian hosts because
BGRA/BGRX/RGBA/RGBX8888 return cpp=0.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:21 +02:00
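The failure mode in the message above (an unlisted fourcc yielding cpp=0) can be sketched with a stand-alone lookup. The fourcc character codes below follow drm_fourcc.h, but the FMT_ macro names and the cpp_for_fourcc() function are stand-ins written for this illustration, not the Mesa function itself.

```c
#include <stdint.h>

/* Same packing as fourcc_code() in drm_fourcc.h. */
#define FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Formats typically used on little-endian hosts. */
#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')
#define FMT_ARGB8888 FOURCC('A', 'R', '2', '4')
/* Byte-reversed variants used for the native 8888 visual on big-endian. */
#define FMT_BGRX8888 FOURCC('B', 'X', '2', '4')
#define FMT_BGRA8888 FOURCC('B', 'A', '2', '4')
#define FMT_RGBX8888 FOURCC('R', 'X', '2', '4')
#define FMT_RGBA8888 FOURCC('R', 'A', '2', '4')

/* Bytes per pixel for a fourcc, or 0 if the format is unknown.
 * A caller that sizes a buffer from this value fails on 0. */
static int cpp_for_fourcc(uint32_t format)
{
   switch (format) {
   case FMT_XRGB8888:
   case FMT_ARGB8888:
   case FMT_BGRX8888:
   case FMT_BGRA8888:
   case FMT_RGBX8888:
   case FMT_RGBA8888:
      return 4;
   default:
      return 0;
   }
}
```

Without the four byte-reversed cases, a big-endian host asking for BGRA8888 would fall through to the default branch and get 0, which is the allocation failure the message refers to.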
Samuel Pitoiset
9361a5b865 docs: describe the contributions workflow for RADV
This workflow has been discussed a lot with the team for the past
few years. Let's just clarify it for real in the documentation.

Co-written-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41239>
2026-05-04 06:35:14 +00:00
Georg Lehmann
38e691fc0a nir/opt_varyings: do no_signed_zero linking even for non removable stores
E.g. position in VS.

Foz-DB Navi48:
Totals from 948 (0.79% of 120695) affected shaders:
MaxWaves: 26816 -> 26828 (+0.04%)
Instrs: 799692 -> 796993 (-0.34%); split: -0.34%, +0.01%
CodeSize: 3855744 -> 3846816 (-0.23%); split: -0.24%, +0.01%
VGPRs: 50256 -> 50220 (-0.07%)
Latency: 2209359 -> 2207667 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 305260 -> 303519 (-0.57%); split: -0.57%, +0.00%
VClause: 11640 -> 11643 (+0.03%); split: -0.01%, +0.03%
SClause: 21152 -> 21149 (-0.01%)
Copies: 51658 -> 51675 (+0.03%); split: -0.11%, +0.14%
Branches: 18656 -> 18655 (-0.01%)
PreVGPRs: 37999 -> 37984 (-0.04%)
VALU: 469752 -> 467406 (-0.50%); split: -0.50%, +0.00%
SALU: 105433 -> 105323 (-0.10%); split: -0.11%, +0.00%

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Georg Lehmann
fac4edbcba nir/opt_varyings: back propagate signed zero information to outputs
Foz-DB Navi48:
Totals from 809 (0.67% of 120695) affected shaders:
MaxWaves: 21804 -> 21808 (+0.02%)
Instrs: 863131 -> 861310 (-0.21%); split: -0.22%, +0.01%
CodeSize: 4535500 -> 4523232 (-0.27%); split: -0.30%, +0.03%
VGPRs: 47304 -> 47280 (-0.05%)
SpillSGPRs: 170 -> 82 (-51.76%)
Latency: 6791484 -> 6786880 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 906281 -> 905301 (-0.11%); split: -0.11%, +0.00%
VClause: 16910 -> 16917 (+0.04%); split: -0.01%, +0.05%
SClause: 21856 -> 21827 (-0.13%); split: -0.14%, +0.01%
Copies: 61890 -> 61436 (-0.73%); split: -0.80%, +0.06%
Branches: 19725 -> 19640 (-0.43%)
PreSGPRs: 38011 -> 37851 (-0.42%)
PreVGPRs: 36482 -> 36454 (-0.08%)
VALU: 465316 -> 464323 (-0.21%); split: -0.22%, +0.00%
SALU: 143757 -> 143395 (-0.25%); split: -0.33%, +0.08%
VMEM: 36827 -> 36806 (-0.06%)
SMEM: 37769 -> 37768 (-0.00%)

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00