Commit graph

217958 commits

Author SHA1 Message Date
Hyunjun Ko
260908cecb anv: Add dummy workload for AV1 decode on affected platforms (Wa_1508208842)
Implement software workaround for AVP decoder corruption on Gen12
platforms. These platforms require a warmup workload before
the actual AV1 decode to prevent output corruption.

- Gen12: Tiger Lake, DG1, Rocket Lake, Alder Lake

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39604>
2026-01-30 04:24:05 +00:00
Matt Arsenault
c431eaad63 ac/llvm: Use new denormal_fpenv attribute for llvm >= 23
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39566>
2026-01-30 04:00:05 +00:00
Matt Arsenault
ec9df376d8 ac/llvm: Remove -promote-alloca workaround
This bug was fixed many years ago.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39566>
2026-01-30 04:00:04 +00:00
Hyunjun Ko
8e9fec8e40 anv/video: Compute AV1 tile positions internally
The pMiColStarts/pMiRowStarts arrays from applications may have
incorrect units. Instead of using them directly, compute the tile
start positions in superblock units internally based on the tile
dimensions.

Cc: mesa-stable
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
2026-01-30 03:28:01 +00:00
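The commit above describes deriving tile start positions from the tile dimensions instead of trusting application-supplied arrays. A minimal sketch of that idea follows; it is not the driver's code. It assumes the tile sizes arrive as "size minus one" counts in superblock units (in the spirit of the `pWidthInSbsMinus1` / `pHeightInSbsMinus1` arrays of the Vulkan Video AV1 headers), and the function name `tile_starts_in_sb` is hypothetical.

```python
def tile_starts_in_sb(sizes_in_sbs_minus1):
    """Return the start position of each tile, in superblock units.

    Assumption: each entry is a tile's width (or height) in superblocks
    minus one, so the real size of a tile is entry + 1.
    """
    starts = []
    position = 0
    for size_minus1 in sizes_in_sbs_minus1:
        # A tile starts where the previous one ended.
        starts.append(position)
        position += size_minus1 + 1
    return starts
```

For example, three tile columns that are 4, 4 and 2 superblocks wide would be passed as `[3, 3, 1]`; the same helper can be applied to the row dimension.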
Hyunjun Ko
8004f46466 anv/video: fix a typo in Vulkan AV1 decoding.
Cc: mesa-stable
Fixes: e510efed05d ("anv: support in-loop super resolution for AV1 decoding")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
2026-01-30 03:28:01 +00:00
Rob Clark
2659262335 freedreno/decode: Add lua handler to filter descriptors
Add a script hook which can be used to decide whether to show a
particular descriptor variant.  For example, the FMT determines
MULTI_PLANE vs SINGLE_PLANE, and the TYPE determines between
BUFFER vs other formats.

Some ambiguity remains.  We could do better in most cases by
extracting info from the enabled shader stages.  But this is
a good start.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:25 +00:00
Rob Clark
ebde70cdce freedreno/decode: Allow direct access to domain bitfield
For parsing packets or descriptors, it is useful to be able to use
pkt.FOO instead of pkt[2].FOO.  This makes it easier when fields move
between dwords in the domain.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:25 +00:00
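The convenience described in the commit above (looking up a field by name without spelling out its dword index) can be illustrated outside of Lua. The sketch below is a rough Python analogue, not the decoder's implementation; the `Packet` class, its `layout` table and the field name `FOO` are all hypothetical.

```python
class Packet:
    """Hypothetical decoded packet: raw dwords plus a field layout table."""

    def __init__(self, dwords, layout):
        # layout maps field name -> (dword index, low bit, bit width)
        self._dwords = dwords
        self._layout = layout

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. field lookups.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            index, low, width = self._layout[name]
        except KeyError:
            raise AttributeError(name) from None
        return (self._dwords[index] >> low) & ((1 << width) - 1)


# FOO lives in bits 16..23 of dword 2 in this made-up layout.
pkt = Packet([0x0, 0x0, 0x00A50000], {"FOO": (2, 16, 8)})
foo_value = pkt.FOO
```

If `FOO` later moves to another dword, only the layout table changes; code that reads `pkt.FOO` stays the same.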
Rob Clark
6d7a056c8b freedreno/decode: Replace/remove __tonumber()
This was never actually implemented by lua.  Remove it.  In the case of
enums, implement the __eq() function instead so enum values can be
compared for equality.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:25 +00:00
Rob Clark
cdb8c6a14c freedreno/decode: Add script support for enum types
Allows using r.$enumtypename.$enumvalname to access enum values.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:24 +00:00
Rob Clark
8ab30de263 freedreno/registers: Descriptor variants
Document which fields apply to which descriptor variants.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:24 +00:00
Rob Clark
4555692e44 freedreno/decode: Decode all descriptor variants
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:23 +00:00
Rob Clark
d9261e6422 freedreno/decode: Extract out helper to set varset
For decoding descriptor variants, we'll need to dynamically set the
desctype varset.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:22 +00:00
Rob Clark
8335a4a0b4 freedreno/decode: Fix gen8 descriptor address
The BASE_LO/HI moved in gen8.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:21 +00:00
Rob Clark
55c8aa8cbd freedreno/registers: Rename A6XX_TEX_MEMOBJ
The gen8 descriptors already used the name "memobj" which better matches
docs.  This just cleans up a6xx/a7xx to match.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:21 +00:00
Rob Clark
74ecf3d0d0 freedreno/registers: Drop a6xx descriptor chip use
We aren't really using this, other than to document the field was added
in a7xx.  And it stands in the way of using a new enum for descriptor
types.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:20 +00:00
Rob Clark
c9d21ff6fd freedreno/decode: Add multi-plane descriptor coverage
Add tests which utilize multi-planar descriptors.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:20 +00:00
Rob Clark
7d373bcdd8 freedreno/decode: Enable --bindless for cffdump tests
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:19 +00:00
Rob Clark
b0b16af6d6 freedreno/decode: Skip bindless dumps on pre-bindless hw
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39573>
2026-01-29 23:52:19 +00:00
Osama Abdelkader
0423488955 vulkan/wsi: Fix realloc error handling in wsi_get_modifiers_for_format
Replace assert() with proper error checking for realloc() failure.
If realloc fails, free any existing modifiers, clean up resources,
and return NULL instead of potentially crashing or leaking memory.

Fixes a potential memory leak when memory allocation fails.

Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39215>
2026-01-29 23:15:37 +00:00
Maaz Mombasawala
50f4a79d98 ci: Update vmware farm admins.
Martin Krastev (@blu) left broadcom a while ago, update the admins of vmware
farm to Maaz Mombasawala (@mombasa) and Neha Bhende (@bhenden).

Signed-off-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39473>
2026-01-29 22:33:06 +00:00
Máté Pinczel
e6ea2bef6b nak: implement uror and urol using shf
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39409>
2026-01-29 20:57:08 +00:00
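The commit title above refers to building rotates out of a funnel shift. As a general illustration of that technique (not NAK's code), a 32-bit rotate is a funnel shift whose high and low inputs are the same value. The helper names below are hypothetical, and the assumption that the shift amount wraps modulo 32 is made for this sketch only and may not match every mode of the hardware instruction.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_right(high, low, shift):
    """Low 32 bits of the 64-bit value high:low shifted right by shift % 32."""
    return (((high << 32) | low) >> (shift % 32)) & MASK32


def funnel_shift_left(high, low, shift):
    """High 32 bits of the 64-bit value high:low shifted left by shift % 32."""
    return ((((high << 32) | low) << (shift % 32)) >> 32) & MASK32


def rotate_right(x, shift):
    # Feeding x as both halves makes the bits shifted out reappear on top.
    return funnel_shift_right(x, x, shift)


def rotate_left(x, shift):
    return funnel_shift_left(x, x, shift)
```

Here `rotate_right(0x12345678, 8)` yields `0x78123456` and `rotate_left(0x12345678, 8)` yields `0x34567812`.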
Ian Forbes
9e7f757f0f svga: Implement GL_ARB_conditional_render_inverted
This was already supported if we have the DX10 SetPredication command.
We are already handling the conditional correctly in svga_render_condition.
The support is indicated by have_set_predication_cmd.

Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39619>
2026-01-29 20:41:44 +00:00
Caio Oliveira
db4bc5407f brw: Print "GRF registers" in INTEL_DEBUG=shaders output
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39601>
2026-01-29 20:16:48 +00:00
Caio Oliveira
0d19fc8256 brw: Fix "GRF registers" stats output
Pick the value from the brw_shader instead of from the prog_data, since
when there are multiple variants, the prog_data one will have the
maximum value.  Picking the wrong value also caused compute shaders
that had a single variant to report 0 GRFs since the prog_data was
being filled after the generate_code() call.

Issue spotted by Felix DeGrood.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39601>
2026-01-29 20:16:48 +00:00
Ian Forbes
7fd43d6e35 svga: Increase max_combined_shader_output_resources and SSBO limit to 16
The gl43 capability indicates we have a DX11.1+ device which supports
64 UAVs shared across all stages. This limit is roughly equivalent to
GL_MAX_COMBINED_SHADER_OUTPUT_RESOURCES which is controlled by
caps.max_combined_shader_output_resources which we currently set to
SVGA_MAX_SHADER_BUFFERS (8) which is probably too low since this limit
is also supposed to include render targets which we also set to 8.

The shader linker will validate that the pipeline does not exceed this
combined limit so we don't have to worry about the sum of the max for all
stages (16*5=80) now exceeding it.

Increasing the combined limit and the number of SSBOs from 8 to 16 allows
Blender to run as it requires 12 SSBOs. In theory we could increase the
combined limit to 56 but these limits are poorly documented and
implemented.

Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39043>
2026-01-29 19:59:17 +00:00
Valentine Burley
a309932429 zink/ci: Enable optimal_keys for zink-tu-a750
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39530>
2026-01-29 17:00:57 +00:00
Valentine Burley
d5c040ca9d zink/ci: Re-enable optimal_keys for zink-tu-a618
This was lost when enabling VVL.

Variables should not be set in rule templates.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39530>
2026-01-29 17:00:57 +00:00
Valentine Burley
275e5e064d zink/ci: Fix a few job timeouts
GitLab job timeouts should be set on individual jobs rather than in
generic rule templates.

Set zink-lavapipe to a 15 minute timeout.

LAVA jobs should have the blanket 1 hour timeout even if the jobs don't
take that long, due to how lava-job-submitter works.

Remove redundant timeouts from .radv-zink-test-valve, as they were always
being overridden.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39530>
2026-01-29 17:00:57 +00:00
Caterina Shablia
237e2d7b32 panvk: implement sparseResidencyImage3D
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:45 +00:00
Caterina Shablia
a93cf2b991 pan/lib: remove deadcode
These bits of code were used when sparse was implemented in terms
of u-interleaved, but are not necessary anymore.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:45 +00:00
Caterina Shablia
f43b8ee5ca panvk: implement sparse in terms of interleaved 64k
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:45 +00:00
Caterina Shablia
69d067fe1c pan/lib: introduce standard_sparse_mapping_granularity
And immediately implement it in terms of
DRM_FORMAT_MOD_ARM_INTERLEAVED_64K.

Also ban DRM_FORMAT_MOD_ARM_INTERLEAVED_64K for WSI in panfrost.
Normally, the modifier's test_props would take care of this, but as
panfrost doesn't use test_props, this has to be handled in
panfrost itself.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:44 +00:00
Caterina Shablia
d6412ebbdf pan/genxml: add interleaved 64k clump ordering and block format
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:43 +00:00
Caterina Shablia
c19efbf606 drm-uapi: update drm_fourcc.h
https://cgit.freedesktop.org/drm-misc/commit/?id=3aecd55af5b83d16d84e3c333d4163999ee8ff51

Adds DRM_FORMAT_MOD_ARM_INTERLEAVED_64K

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:43 +00:00
Caterina Shablia
09c2fadf90 panvk: merge vm_bind ops in some cases
Some apps exhibit bind patterns that can be easily implemented in
terms of fewer vm_bind ops than we currently do.

For now let's only optimize the case when a vm_bind op is
contiguous wrt the previous one on the right, in both VA and
BO (if applicable) ranges. With this optimization alone we already
get a decent reduction in some CTS sparse tests.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:43 +00:00
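The merging rule described in the commit above (extend the previous op when the new one is contiguous with it on the right, in both VA and BO ranges) can be sketched as follows. This is an illustration under assumptions, not panvk's code: the `BindOp` record, its field names, and the use of `None` to model an op without a backing BO are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BindOp:
    va: int                    # start of the virtual address range
    size: int                  # length of the range in bytes
    bo: Optional[int] = None   # backing BO handle; None means no BO
    bo_offset: int = 0         # offset into the BO, ignored when bo is None


def can_extend(prev, op):
    """True if op continues prev on the right in VA and, if present, BO space."""
    if op.va != prev.va + prev.size:
        return False
    if op.bo != prev.bo:
        return False
    if op.bo is None:
        return True
    return op.bo_offset == prev.bo_offset + prev.size


def merge_bind_ops(ops):
    """Fold each op into its predecessor when possible; otherwise keep it."""
    merged = []
    for op in ops:
        if merged and can_extend(merged[-1], op):
            merged[-1].size += op.size
        else:
            merged.append(BindOp(op.va, op.size, op.bo, op.bo_offset))
    return merged
```

Only the immediately preceding op is considered, which mirrors the "for now" scope stated in the commit message; ops that are contiguous in VA but not in BO offset stay separate.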
Caterina Shablia
5279eb7dfc panvk: let the mod handler handle DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED
There are additional conditions that must be met before
DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED can be used. These
conditions are verified by the handler of this modifier, but not
panvk_image_can_use_mod. Let's call the handler of this modifier
so it can finally decide whether this modifier can be used.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
2026-01-29 16:33:42 +00:00
Lionel Landwerlin
8661cb12e2 anv: implement VK_KHR_internally_synchronized_queues
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39534>
2026-01-29 16:03:26 +00:00
Lionel Landwerlin
db5319fbf0 anv/xe: move special WaitIdle optimization to submission path
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39534>
2026-01-29 16:03:26 +00:00
Rhys Perry
3fed41eade radv: improve skipping of creation of NIR for cached rt pipeline libraries
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38263>
2026-01-29 15:41:34 +00:00
Rhys Perry
89eefdcadb radv: fix when incomplete rt pipeline libraries are loaded from cache
It might be that the radv_pipeline_cache_lookup_nir_handle() in
radv_ray_tracing_pipeline_cache_search() fails but we will later need the
NIR. If rt_stages[i].shader was non-NULL, then we would not have created
the NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.2
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38263>
2026-01-29 15:41:34 +00:00
Olivia Lee
d6745b358d hk: fix hk_passthrough_gs_key size computation
The non-dynamic members of xfb_info are already included in
sizeof(hk_passthrough_gs_key), so adding nir_xfb_info_size counts them
twice. Because of this we were including uninitialized memory in the key
in hk_handle_passthrough_gs, which is undefined behavior.

Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39574>
2026-01-29 15:24:32 +00:00
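The double counting described in the commit above can be reproduced with a small size calculation. The sketch below uses `ctypes` structures as stand-ins; the struct layouts, field names and helper names are invented for illustration and do not match the real `hk_passthrough_gs_key` or `nir_xfb_info` definitions.

```python
import ctypes


class XfbOutput(ctypes.Structure):
    """Stand-in for one dynamically sized output entry."""
    _fields_ = [("buffer", ctypes.c_uint8), ("offset", ctypes.c_uint16)]


class XfbInfo(ctypes.Structure):
    """Stand-in for the fixed (non-dynamic) part of the xfb info."""
    _fields_ = [("buffers_written", ctypes.c_uint8),
                ("output_count", ctypes.c_uint16)]


class PassthroughKey(ctypes.Structure):
    """Stand-in key that already embeds the fixed xfb info part."""
    _fields_ = [("prim", ctypes.c_uint32), ("xfb_info", XfbInfo)]


def xfb_info_size(output_count):
    # Fixed part plus the dynamic outputs.
    return ctypes.sizeof(XfbInfo) + output_count * ctypes.sizeof(XfbOutput)


def key_size_double_counted(output_count):
    # Adds the fixed part a second time on top of sizeof(key).
    return ctypes.sizeof(PassthroughKey) + xfb_info_size(output_count)


def key_size(output_count):
    # Only the dynamic outputs are added on top of sizeof(key).
    return ctypes.sizeof(PassthroughKey) + output_count * ctypes.sizeof(XfbOutput)
```

In this model the two results always differ by exactly `sizeof(XfbInfo)`, which corresponds to the extra bytes that would be read as part of the key without ever being initialized.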
Georg Lehmann
70f0e75262 nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
Foz-DB Navi48:
Totals from 177 (0.21% of 82405) affected shaders:
Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00%
CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00%
Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00%
VClause: 3613 -> 3616 (+0.08%)
Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36%
VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00%
SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531>
2026-01-29 14:44:37 +00:00
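The commit above gives no detail beyond its title, so the following is only one plausible reading of why such an optimization is safe: a 32-bit float that was produced by widening a 16-bit float converts back to 16 bits exactly, so the rounding mode of the pack step cannot affect the result. The sketch checks that property with Python's `struct` half-float format (which rounds to nearest, not toward zero; that difference is irrelevant for exactly representable values). All helper names are hypothetical.

```python
import struct


def half_bits_to_float(bits):
    """Widen a 16-bit float, given as raw bits, to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def float_to_half_bits(value):
    """Narrow a float to a 16-bit float and return its raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def is_nan_bits(bits):
    """True for 16-bit patterns with an all-ones exponent and nonzero mantissa."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def pack_half_2x16(x, y):
    """Pack two floats as halves: x in the low 16 bits, y in the high 16 bits."""
    return float_to_half_bits(x) | (float_to_half_bits(y) << 16)


def pack_from_half_bits(x_bits, y_bits):
    """Equivalent result when both inputs started out as 16-bit floats."""
    return x_bits | (y_bits << 16)
```

NaN patterns are excluded from the round-trip check because payload preservation is not guaranteed by this Python model.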
Tapani Pälli
85978ccd28 anv: route clear operations on compute to companion
This fixes a bunch of CTS tests hitting issues when attempting
anv_image_mcs_op with compute.

Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39581>
2026-01-29 14:25:54 +00:00
Zan Dobersek
b6a049ea4b tu: allocate transient attachments used for LRZ
When proceeding with rendering, any transient attachment that will be used
as LRZ buffer should also be allocated. With GMEM rendering, these
attachments otherwise remained unloaded and subsequent LRZ clears produced
GPU faults.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Fixes: #14604
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39535>
2026-01-29 13:59:28 +00:00
Mike Blumenkrantz
999aaac12e ntv: emit ViewIndex with flat for fragment stage
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39606>
2026-01-29 13:34:01 +00:00
Mike Blumenkrantz
3558d1e162 ntv: improve setting Aliased decoration on bo emits
driver_location is too flimsy and doesn't account for binding matches
in different descriptor sets

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39606>
2026-01-29 13:34:01 +00:00
Mike Blumenkrantz
c2e4ec75fc ntv: avoid setting Block decoration repeatedly on bo struct types
this could happen when reusing the bo struct type in vulkan mode

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39606>
2026-01-29 13:34:00 +00:00
Nick Hamilton
9f9788330e pvr: Fix the isp samples per tile calculation
The samples per tile calculation was incorrect for sample count 4 and 8.

Fix:
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.depth.samples_4.*
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.stencil.samples_4.*

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39580>
2026-01-29 12:53:32 +00:00
Georg Lehmann
c3e12429c5 nir/opt_algebraic: improve a < 0.0 ? 0.0 : sqrt(a) pattern
Fix the NaN correctness of the original pattern, and add more variants.

Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
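The commit above mentions NaN correctness for the `a < 0.0 ? 0.0 : sqrt(a)` pattern without spelling out the rewrite. The sketch below only illustrates why NaN needs care with this kind of pattern; the clamped form `sqrt(fmax(a, 0.0))` is an assumed candidate replacement chosen for illustration, not necessarily what the pass emits, and `fmax` here models the "return the non-NaN operand" behavior.

```python
import math


def select_form(a):
    """The source pattern: a < 0.0 ? 0.0 : sqrt(a)."""
    # A NaN input fails the comparison, so it reaches sqrt and stays NaN.
    return 0.0 if a < 0.0 else math.sqrt(a)


def fmax(a, b):
    """fmax-style maximum: if one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def clamped_form(a):
    """Assumed candidate rewrite: sqrt(fmax(a, 0.0))."""
    return math.sqrt(fmax(a, 0.0))
```

The two forms agree for ordinary inputs such as `4.0` and `-1.0`, but for a NaN input the select form returns NaN while the clamped form returns `0.0`, so treating them as interchangeable is only valid when NaN can be ruled out or ignored.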
Georg Lehmann
f872c13707 nir/opt_algebraic: use contract instead of inexact for more patterns
These use more precise operations, so contract is enough.

Foz-DB Navi48:
Totals from 248 (0.30% of 82405) affected shaders:
Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01%
CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01%
Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01%
SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18%
Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07%
VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01%
SALU: 28706 -> 28707 (+0.00%)
VOPD: 1196 -> 1198 (+0.17%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00