Commit graph

214263 commits

Author SHA1 Message Date
Faith Ekstrand
01e56f408b nvk: Flush descriptor tables and heap maps on submit
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
4d04baba7d nvk: Use a coherent map for the event heap
Events are synchronization objects.  They really need to be coherent.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
870d3f1636 nvk/nvkmd: Invalidate maps before dumping pushbufs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
c04dacb42c nvk: Flush pushbufs in EndCommandBuffer()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
fac856112e nvk: Implement Flush/InvalidateMappedMemoryRanges()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
986c2cfed9 nvk/nvkmd: Add map sync to/from GPU helpers
If we have the ability to do cache ops from userspace (true on x86 and
aarch64), that's preferred.  Otherwise, we call into a back-end hook to
trap through to the kernel.
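
As a rough illustration of the dispatch described above, the sketch below prefers a userspace flush and falls back to a back-end hook. All type and function names here are hypothetical and are not the actual nvkmd API; the flush bodies are stand-ins that only count calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical map object for illustration; not the real nvkmd type. */
struct example_map {
   /* Back-end hook that would trap through to the kernel. */
   void (*kernel_sync_to_gpu)(struct example_map *map,
                              size_t offset, size_t range);
   /* Counters so the chosen path is observable. */
   int userspace_flushes;
   int kernel_flushes;
};

/* Stand-in for a userspace flush; a real one would issue CPU cache
 * maintenance instructions over the given range.
 */
static void
example_userspace_flush(struct example_map *map, size_t offset, size_t range)
{
   (void)offset;
   (void)range;
   map->userspace_flushes++;
}

/* Stand-in for the kernel path (assumed to be an ioctl in practice). */
static void
example_kernel_flush(struct example_map *map, size_t offset, size_t range)
{
   (void)offset;
   (void)range;
   map->kernel_flushes++;
}

/* Prefer userspace cache ops when available, else use the back-end hook. */
static void
example_map_sync_to_gpu(struct example_map *map, bool can_flush_from_userspace,
                        size_t offset, size_t range)
{
   if (can_flush_from_userspace)
      example_userspace_flush(map, offset, range);
   else
      map->kernel_sync_to_gpu(map, offset, range);
}
```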

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
fcb6c5c7a6 nvk/nvkmd: Add an NVKMD_MEM_COHERENT flag
All discrete GPU maps are coherent but that's not true on Tegra.  We
need a way to request coherent memory and also to ask for it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
dd53232667 nouveau/winsys: Add a NOUVEAU_WS_BO_COHERENT flag
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33959>
2025-10-15 22:05:53 -04:00
Faith Ekstrand
72c9256d8f turnip: Use the util cache helpers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:45 +00:00
Faith Ekstrand
1fbc73836e intel: Drop intel_mem.c/h
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:45 +00:00
Faith Ekstrand
f4a4c95d0c crocus: Use util_flush_inval_range()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:45 +00:00
Faith Ekstrand
77bea994b4 intel/sanitize-gpu: Use util_flush_inval_range()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:45 +00:00
Faith Ekstrand
7b77906a0c anv: Switch to util/cache_ops.h
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:44 +00:00
Faith Ekstrand
6d67828839 hasvk: Switch to util/cache_ops.h
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:44 +00:00
Faith Ekstrand
a47184e396 util/cache_ops/x86: Call util_get_cpu_caps() less
This also makes all the paths a bit more clear because we only ever
clflushopt on the clflushopt paths and only ever clflush on the clflush
paths.  It's really not much more code or logic duplication.
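
A minimal sketch of that structure, assuming a 64-byte cache line: the capability is checked once per range, and each path then uses a single per-line operation. The names are hypothetical, the per-line operations are counting stand-ins rather than real clflush/clflushopt instructions, and the memory fences a real implementation needs are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_CACHELINE 64 /* assumed cache line size in bytes */

typedef void (*example_line_op)(const void *line);

static unsigned example_clflush_calls;
static unsigned example_clflushopt_calls;

/* Stand-ins for the two instructions; they only count invocations. */
static void
example_clflush_line(const void *line)
{
   (void)line;
   example_clflush_calls++;
}

static void
example_clflushopt_line(const void *line)
{
   (void)line;
   example_clflushopt_calls++;
}

/* Apply op to every cache line overlapping [start, start + size). */
static void
example_flush_lines(const void *start, size_t size, example_line_op op)
{
   if (size == 0)
      return;

   uintptr_t p = (uintptr_t)start & ~(uintptr_t)(EXAMPLE_CACHELINE - 1);
   const uintptr_t end = (uintptr_t)start + size;
   for (; p < end; p += EXAMPLE_CACHELINE)
      op((const void *)p);
}

/* The capability is consulted once; each path uses only its own op. */
static void
example_flush_range(const void *start, size_t size, bool has_clflushopt)
{
   if (has_clflushopt)
      example_flush_lines(start, size, example_clflushopt_line);
   else
      example_flush_lines(start, size, example_clflush_line);
}
```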

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:44 +00:00
Faith Ekstrand
555881e574 util/cache_ops: Add some cache flush helpers
The x86 implementation was shamelessly stolen from intel_mem.c and the
aarch64 implementation was based on the code in Turnip.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
2025-10-16 01:19:44 +00:00
Ian Romanick
1dea86f773 brw: Don't do non-obvious things with BFN parameter ordering
Somehow dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.atan_frag
was able to generate a bitfield_select with a constant first
parameter. That makes the big comment here completely false.

Don't be clever. If the constant is in the wrong place,
commute_immediates during copy propagation will fix it.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
2025-10-16 00:37:30 +00:00
Ian Romanick
85db960e37 brw: Mark src3 of BFN as is_control_source
This prevents lower_regioning from doing bad things when the destination
and all the other sources are UW.

Other solutions considered:

- Have the type of src[3] match the destination type. This also required
  changes in combine_constants to allow the type be UD or UW.
- Make a new subclass brw_bfn_inst, and store the Boolean function
  selector outside the src[] array. This was a lot more code and a lot
  more churn (+47,-27 vs +4).

Fixes: b948e6d503 ("brw: Use BFN to implement nir_opt_bitfield_select")
Suggested-by: Curro
Suggested-by: Ken
Closes: #14095
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
2025-10-16 00:37:30 +00:00
Sergi Blanch Torne
d0af217911 ci: Add missing aiohttp Python dependency
Found a missing dependency for `pipeline_message.py`.

Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37868>
2025-10-15 23:48:53 +00:00
Alyssa Rosenzweig
84d8e6824b treewide: don't check before free
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.

Via Coccinelle patches:

    @@
    expression ptr;
    @@

    -if (ptr) {
    -free(ptr);
    -}
    +free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr) {
    -FREE(ptr);
    -}
    +FREE(ptr);

    @@
    expression ptr;
    @@

    -if (ptr) {
    -ralloc_free(ptr);
    -}
    +ralloc_free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -free(ptr);
    -}
    -
    +free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -FREE(ptr);
    -}
    -
    +FREE(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -ralloc_free(ptr);
    -}
    -
    +ralloc_free(ptr);
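
For context, the C standard defines free(NULL) as a no-op, which is why the unguarded form is safe for plain free(). The sketch below shows the resulting shape on a hypothetical struct that is not taken from the Mesa tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with optional allocations; not from the Mesa tree. */
struct example_ctx {
   int *indices;   /* may be NULL if never allocated */
   float *weights; /* may be NULL if never allocated */
};

/* No NULL checks needed: free(NULL) does nothing. */
static void
example_ctx_finish(struct example_ctx *ctx)
{
   free(ctx->indices);
   free(ctx->weights);

   /* Reset the pointers so calling finish twice stays safe. */
   ctx->indices = NULL;
   ctx->weights = NULL;
}
```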

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892>
2025-10-15 23:01:33 +00:00
Dave Airlie
543c9be87a nir/coopmat: fix non square load/store lowering for flexible dimensions
This shouldn't affect radv, but we should do the calculations correctly for
when non-square matters.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37879>
2025-10-16 07:19:28 +10:00
Tomeu Vizoso
836e1d65f6 teflon: Link to the ethos driver
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:15 +00:00
Tomeu Vizoso
bb72d78b2c pipe-loader: Load the ethos accel driver
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:15 +00:00
Tomeu Vizoso
2581c3ab60 ethos: Initial commit of a driver for the Arm Ethos-U65 NPU.
Supports all models in the test suite. No optimizations implemented yet.

Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:15 +00:00
Tomeu Vizoso
b3262b37ce teflon: Add support for the ResizeNearestNeighbor operation
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:15 +00:00
Tomeu Vizoso
0001dab219 teflon: Add support for the StridedSlice operation
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:14 +00:00
Tomeu Vizoso
83b9eb038f teflon: Add support for the MaxPool operation
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:13 +00:00
Tomeu Vizoso
48983c3198 teflon/tests: Replace YOLOX model with that from TI
The one we are testing currently with doesn't have a properly maintained
upstream repository nor demo.

Use the model from TI's zoo so we benefit from their maintenance:

https://github.com/TexasInstruments/edgeai-yolox

Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699>
2025-10-15 20:10:13 +00:00
Romaric Jodin
d0de915c0c glapi: static_data: do not use __file__ to get gl symbols file
Use an explicit path for libgl-symbols.txt from the build system
instead of reconstructing it from __file__.

The issue is that for Android build system, everything is sandboxed
and that file is not in the same root as the python script. Thus we
need a proper explicit path in meson to be able to translate it to a
legal Android construct that is capable of finding that file.

Update everything using libgl_public_functions to propagate that path.

Ref #14072

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37866>
2025-10-15 19:38:40 +00:00
Ruijing Dong
6e1988e3ed radeonsi/vcn: Correct a typo condition for jpeg decoding
Checking dec->jctx[i] instead of sctx->ctx

Cc: mesa-stable

Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37875>
2025-10-15 19:04:19 +00:00
Danylo Piliaiev
9f85c8897a tu: Synchronize access to copy_timestamp_cs_pool
tu_u_trace_submission_data_finish runs on a different thread than
tu_create_copy_timestamp_cs.

Fixes: 6e5944ec4b ("tu: Cache copy timestamp cs to avoid allocations on submit")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37848>
2025-10-15 18:43:32 +00:00
Timur Kristóf
4982f435f9 radv: Document SWITCH_ON_EOP and WD_SWITCH_ON_EOP
Just add some code comments for the next person trying to
understand these bits. No functional changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
2025-10-15 18:08:50 +00:00
Timur Kristóf
8ea08747b8 radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR
Mitigate a GPU hang in Dota 2 and Rise of the Tomb Raider
by reducing the primitive rate for triangle lists.
This workaround is not documented by AMD and may not be correct.

The problem isn't well understood and needs further investigation
to narrow down what the root cause is. Until then, it's better
to give users something that works, even if not optimal.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
2025-10-15 18:08:50 +00:00
Timur Kristóf
6f499141f5 radv: Disable compute queues when the regalloc bug is present
Compute queues may run compute dispatches in parallel with
the graphics queue, even from other processes/apps.
At the moment we can't make sure that all compute shaders
use a workgroup size of 256 to mitigate the regalloc hang,
so disable compute queues on affected chips.

Can be reverted if a better mitigation is found in the future.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
2025-10-15 18:08:49 +00:00
Timur Kristóf
765a748840 radeonsi: Don't use compute queue with regalloc hang bug
It already didn't use compute queues on GFX6, but some GFX7
chips are also affected by the same bug.

Compute queues may run compute dispatches in parallel with
the graphics queue, even from other processes/apps.
At the moment we don't have a way to restrict all workgroups
to 256 invocations, so instead let's make sure not to use the
compute queue.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
2025-10-15 18:08:49 +00:00
Autumn Ashton
15d375dc6e nvk: Implement VK_NVX_image_view_handle
This is used by DLSS to pass in image view
descriptors via parameter buffers for its
kernel launches.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37889>
2025-10-15 17:53:06 +00:00
Caio Oliveira
f861cd47d6 brw: Add variable for opcode in the brw_set_* high-level helpers
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37896>
2025-10-15 17:22:04 +00:00
Eric Engestrom
12a4d68580 docs: add sha sum for 25.2.5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37895>
2025-10-15 17:17:21 +00:00
Eric Engestrom
d13c1d9ec2 docs: add release notes for 25.2.5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37895>
2025-10-15 17:17:21 +00:00
Eric Engestrom
5ae1d09220 docs: update calendar for 25.2.5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37895>
2025-10-15 17:17:21 +00:00
Mel Henning
ff7f785f09 nvk: Fix maxVariableDescriptorCount with iub
Otherwise, we would report very high values for inline uniform block
since NVK_MAX_DESCRIPTOR_SET_SIZE is a lot larger than
NVK_MAX_INLINE_UNIFORM_BLOCK_SIZE.
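
A hedged sketch of the idea: for inline uniform blocks the descriptor count is a size in bytes, so the reported maximum has to be capped by the inline uniform block limit rather than by the space left in the set. The constant value, the function, and its parameters below are hypothetical and do not reproduce the NVK code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for illustration; not the real NVK constant. */
#define EXAMPLE_MAX_INLINE_UNIFORM_BLOCK_SIZE (1u << 16)

static uint32_t
example_max_variable_descriptor_count(bool is_inline_uniform_block,
                                      uint32_t bytes_available,
                                      uint32_t descriptor_stride)
{
   if (is_inline_uniform_block) {
      /* The count is in bytes here, so cap it by the block size limit. */
      return bytes_available < EXAMPLE_MAX_INLINE_UNIFORM_BLOCK_SIZE ?
             bytes_available : EXAMPLE_MAX_INLINE_UNIFORM_BLOCK_SIZE;
   }

   /* Other descriptor types: as many as fit in the remaining space. */
   return bytes_available / descriptor_stride;
}
```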

Fixes
dEQP-VK.api.maintenance3_check.support_count_inline_uniform_block_nonzero_binding_offset
on vulkan-cts-1.4.4.0

Fixes: 6a74b3e311 ("nvk: Support VkDescriptorSetVariableDescriptorCountLayoutSupport")
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37878>
2025-10-15 17:02:56 +00:00
Mike Blumenkrantz
db9dbcbec0 zink: defer swapchain updates for interval changes if acquired image is active
in the case where an app triggers a swap interval change mid-frame, this handling
previously triggered an immediate swapchain retire and then presented the new swapchain
which had yet to be rendered to

instead, defer swapchain updates to immediately after present when things are
safe to ensure that the right image is always presented

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14104

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37894>
2025-10-15 12:29:05 -04:00
Gert Wollny
673351bbf3 r600: Fix comparison of strides array when emitting vertex buffers
The comparison was only comparing a number of bytes where we actually
have to compare a number of dwords (Thanks QuadShader for digging into this)
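
This class of bug can be sketched as follows, assuming the strides are stored as an array of 32-bit values (an assumption; the actual r600 types may differ). Passing an element count to memcmp() compares only that many bytes, so differences in later elements go unnoticed. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Buggy variant: count is treated as a byte length, so only the first
 * count bytes (count / 4 dwords) are compared.
 */
static bool
example_strides_equal_buggy(const uint32_t *a, const uint32_t *b, size_t count)
{
   return memcmp(a, b, count) == 0;
}

/* Fixed variant: scale the element count by the element size. */
static bool
example_strides_equal(const uint32_t *a, const uint32_t *b, size_t count)
{
   return memcmp(a, b, count * sizeof(a[0])) == 0;
}
```

With two four-element arrays that differ only in the last element, the buggy variant reports them as equal while the fixed variant does not.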

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14067
Fixes: 659b7eb279 ("r600: better tracking for vertex buffer emission")

v2: use element size instead of type size (Vitaly)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37856>
2025-10-15 15:25:04 +00:00
Lionel Landwerlin
49226692e5 brw: fix invalid sparse bitfield offset computation
dest_size is the number of outputs to be provided into the IR, but the
location of the sparse bitfield in the dst temporary SEND destination
might be different (shorter due to masking of unused components
computed above).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14094
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37876>
2025-10-15 14:42:51 +00:00
Anna Maniscalco
a7b7ebf08b freedreno/afuc: Add x1e fw-id
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37886>
2025-10-15 14:18:36 +00:00
José Roberto de Souza
19de4b82f9 intel/brw: Store and set sfid in memory fences
sfid is another field that is not preserved after brw_transform_inst_to_send()
so we need to store it before the transform and restore it afterwards.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
a259f64595 intel/brw: Call lower_hdc_memory_fence_and_interlock() with brw_send_inst
With that we can avoid some as_send() calls.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
5b4deb7d2d intel/brw: Fix LSC fence scope and flush type
Opcodes SHADER_OPCODE_INTERLOCK and SHADER_OPCODE_MEMORY_FENCE are emitted as
brw_send_inst and at nir to brw conversion the desc field is set with scope and
flush type of the instruction.
But when brw_inst is converted to brw_send_inst all special fields of
brw_send_inst are set to 0, causing scope and flush type to always be 0.

So call lower_lsc_memory_fence_and_interlock() with a brw_send_inst
parameter and store the desc before brw_transform_inst_to_send().
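
The save-and-restore pattern can be sketched like this. The struct and both functions are hypothetical stand-ins rather than the brw API; the transform is assumed only to zero the special fields, as the message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a send instruction; not the brw type. */
struct example_send_inst {
   uint32_t desc; /* assumed to carry the fence scope and flush type */
   uint32_t sfid;
};

/* Stand-in for the transform: it resets the special fields to 0. */
static void
example_transform_inst_to_send(struct example_send_inst *inst)
{
   inst->desc = 0;
   inst->sfid = 0;
}

/* Save desc before the transform and put it back afterwards. */
static void
example_lower_memory_fence(struct example_send_inst *inst)
{
   const uint32_t desc = inst->desc;

   example_transform_inst_to_send(inst);

   inst->desc = desc;
}
```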

I still have not figured out why we need to do brw_transform_inst_to_send() even
if it is already a brw_send_inst, but not doing so causes a segfault in
foreach_block_and_inst_safe(block, brw_inst, inst, s.cfg) of
brw_lower_logical_sends(). Other opcodes in that function do something
similar, so I don't think that is wrong.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
Rhys Perry
2985fb0df3 radv: allow WGP mode with task/mesh
In practice, mesh shaders probably won't use WGP mode because they use
LDS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791>
2025-10-15 13:37:48 +01:00
Rhys Perry
dd2f34c777 radv: use CU mode when LDS is used
This improves performance of llama.cpp.

fossil-db (navi21):
Totals from 1598 (2.00% of 79825) affected shaders:
MaxWaves: 30182 -> 29278 (-3.00%); split: +0.04%, -3.03%
Instrs: 1013136 -> 1013065 (-0.01%); split: -0.07%, +0.07%
CodeSize: 5275876 -> 5274948 (-0.02%); split: -0.06%, +0.04%
VGPRs: 86176 -> 88016 (+2.14%); split: -0.22%, +2.36%
SpillVGPRs: 0 -> 11 (+inf%)
Scratch: 0 -> 4096 (+inf%)
Latency: 7954289 -> 7824742 (-1.63%); split: -1.64%, +0.01%
InvThroughput: 1511429 -> 1510912 (-0.03%); split: -0.89%, +0.86%
VClause: 26503 -> 26460 (-0.16%); split: -0.23%, +0.07%
SClause: 19032 -> 19039 (+0.04%); split: -0.01%, +0.05%
Copies: 74577 -> 74329 (-0.33%); split: -0.79%, +0.46%
Branches: 20278 -> 20279 (+0.00%)
VALU: 665079 -> 664831 (-0.04%); split: -0.09%, +0.05%
SALU: 124899 -> 124818 (-0.06%); split: -0.08%, +0.01%
VMEM: 46141 -> 46163 (+0.05%)

fossil-db (navi31):
Totals from 1609 (2.02% of 79825) affected shaders:
MaxWaves: 39724 -> 38880 (-2.12%)
Instrs: 1147767 -> 1147595 (-0.01%); split: -0.04%, +0.03%
CodeSize: 5777072 -> 5776376 (-0.01%); split: -0.03%, +0.02%
VGPRs: 91752 -> 93132 (+1.50%); split: -0.03%, +1.53%
Latency: 7526930 -> 7396201 (-1.74%); split: -1.74%, +0.00%
InvThroughput: 1083131 -> 1088328 (+0.48%); split: -0.45%, +0.93%
VClause: 25864 -> 25789 (-0.29%); split: -0.33%, +0.04%
SClause: 19136 -> 19135 (-0.01%); split: -0.02%, +0.01%
Copies: 80797 -> 80501 (-0.37%); split: -0.42%, +0.05%
VALU: 674455 -> 674160 (-0.04%); split: -0.05%, +0.01%
SALU: 123849 -> 123806 (-0.03%)

fossil-db (gfx1201):
Totals from 1614 (2.02% of 79839) affected shaders:
MaxWaves: 40140 -> 39296 (-2.10%)
Instrs: 1183227 -> 1183102 (-0.01%); split: -0.04%, +0.03%
CodeSize: 6091060 -> 6090636 (-0.01%); split: -0.03%, +0.03%
VGPRs: 90708 -> 92040 (+1.47%); split: -0.01%, +1.48%
Latency: 7588683 -> 7425866 (-2.15%); split: -2.15%, +0.00%
InvThroughput: 1070469 -> 1075700 (+0.49%); split: -0.50%, +0.99%
VClause: 25691 -> 25597 (-0.37%); split: -0.37%, +0.00%
SClause: 19095 -> 19086 (-0.05%); split: -0.05%, +0.01%
Copies: 80753 -> 80452 (-0.37%); split: -0.42%, +0.05%
VALU: 665218 -> 664922 (-0.04%); split: -0.05%, +0.01%
SALU: 144059 -> 144011 (-0.03%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791>
2025-10-15 13:37:48 +01:00