Commit graph

212972 commits

Author SHA1 Message Date
Mel Henning
e7a62d5eff util/macros: Add ATTRIBUTE_COLD
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:45 +00:00
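
A minimal sketch of what a cold-path attribute macro usually looks like, assuming it wraps the GCC/Clang `cold` function attribute and expands to nothing on other compilers. The real definition in util/macros.h may differ; `report_failure` and `checked_div` are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Assumption: maps to the GCC/Clang "cold" attribute, empty elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#define ATTRIBUTE_COLD __attribute__((cold))
#else
#define ATTRIBUTE_COLD
#endif

/* Hypothetical error path, marked as rarely executed so the compiler can
 * optimize it for size and keep it away from hot code. */
ATTRIBUTE_COLD static int
report_failure(const char *what)
{
   fprintf(stderr, "error: %s\n", what);
   return -1;
}

/* Hypothetical caller: the common path stays free of the error handling. */
static int
checked_div(int a, int b, int *out)
{
   if (b == 0)
      return report_failure("division by zero");
   *out = a / b;
   return 0;
}
```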
Mary Guillemard
7790f98487 nouveau/headers: Add Blackwell support to nv_push_dump
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:45 +00:00
Mary Guillemard
9dccedc043 nouveau/headers: Include class headers instead of redefining class ids
Also clean up headers.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:44 +00:00
Mary Guillemard
b1f97c2778 nouveau/headers: Handle more gpfifo classes in vk_push_print
Covers a good chunk of them, excluding AMPERE B (the header has a typo in a
struct that makes it clash with the AMPERE A definition).

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:44 +00:00
Mary Guillemard
fe44e8a7fa nouveau/headers: Handle all 3D classes in vk_push_print
We now handle from BLACKWELL_B down to FERMI_A (FERMI_B is excluded as we are missing the tex header)

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:43 +00:00
Mary Guillemard
4b985017dd nouveau/headers: Handle all DMA classes in vk_push_print
We now handle from BLACKWELL_DMA_COPY_B down to GT212_DMA_COPY.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:43 +00:00
Mary Guillemard
d7f226a3b2 nouveau/headers: Handle all compute classes in vk_push_print
We now handle from BLACKWELL_COMPUTE_B down to FERMI_COMPUTE_A.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:42 +00:00
Mary Guillemard
3a2b53f47f nouveau/headers: Autogenerate push method dumpers
Instead of typing this in nv_push.c, we now generate it based on the
class headers that are generated.

This removes the risk of human error in the checks and lets us support
parsing everything we can.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
2025-09-26 19:40:42 +00:00
Jordan Justen
be61c12f3e anv: Use image view base-layer in can_fast_clear_color_att()
We currently only support fast clearing the first layer of an image.
Attachments use VkImageView which can specify a base-layer of the view
for an image attachment.

Fixes: 44351d67f8 ("anv: Change params of anv_can_fast_clear_color_view")
Ref: https://projects.blender.org/blender/blender/issues/141181
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37562>
2025-09-26 19:15:22 +00:00
Aleksi Sapon
8949473023 nir: Fix nir.h MSVC compilation for C++ source files
This kind of C initializer is not accepted by MSVC in C++ mode.

Fixes: 75292ae7 ("nir: Fix gnu-empty-initializer warning")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37604>
2025-09-26 18:25:22 +00:00
Mel Henning
094804131e nak: Fix divergence test for redux availability
nak's divergence differs slightly from nir's divergence. Fix the test to
match what the backend will use, since we need to allocate a ureg.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13964
Fixes: 295373f29f ("nak: Implement nir_intrinsic_reduce with REDUX")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37585>
2025-09-26 18:07:04 +00:00
Konstantin Seurer
bb3e401cca Revert "lavapipe/ci: Disable stack-use-after-return detection for ASan"
This reverts commit 44d161a7a0.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37416>
2025-09-26 17:27:32 +00:00
Konstantin Seurer
9094b404d5 vulkan/cmd_queue: Handle struct arrays with pNext
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37416>
2025-09-26 17:27:32 +00:00
Konstantin Seurer
c76da351b0 vulkan/cmd_queue: Handle internal structs
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37416>
2025-09-26 17:27:32 +00:00
Konstantin Seurer
b02ef48e9d vulkan/cmd_queue: Remove unused variable
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37416>
2025-09-26 17:27:31 +00:00
Mike Blumenkrantz
b3b2daa28d lavapipe: VK_KHR_copy_memory_indirect
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37589>
2025-09-26 17:05:53 +00:00
Mike Blumenkrantz
010cd37e50 lavapipe: handle aspected depth/stencil memory->image HIC transfers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37589>
2025-09-26 17:05:52 +00:00
Mike Blumenkrantz
daa276b605 lavapipe: move copy_depth_box to lvp_image.c
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37589>
2025-09-26 17:05:52 +00:00
José Roberto de Souza
141a225ca1 intel/brw: Use ASR over SHR for SHADER_OPCODE_ISUB_SAT
src[1]/src0 is signed and Xe2+ SHR doesn't support operations on signed
data types, so let's switch this over to ASR, which supports signed data
types.
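
The difference between the two shifts can be sketched in plain C. This is only an illustration of logical versus arithmetic right shift, not the brw code; `shr32` and `asr32` are hypothetical helper names.

```c
#include <stdint.h>

/* Logical shift right: vacated high bits are filled with zeros.
 * n must be in the range 0..31. */
static uint32_t
shr32(uint32_t x, unsigned n)
{
   return x >> n;
}

/* Arithmetic shift right: vacated high bits are filled with copies of the
 * sign bit. Done on unsigned values so the result does not depend on how
 * the compiler shifts negative signed integers. n must be in 0..31. */
static int32_t
asr32(int32_t x, unsigned n)
{
   uint32_t shifted = (uint32_t)x >> n;
   if (x < 0 && n > 0)
      shifted |= ~(UINT32_MAX >> n);
   return (int32_t)shifted;
}
```

For a negative input such as -8, `asr32` keeps the sign while `shr32` on the same bit pattern yields a large positive value, which is why a signed source needs the arithmetic form.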

Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37557>
2025-09-26 16:44:24 +00:00
José Roberto de Souza
c45f442d5c intel/decode: Add support to new version of Xe KMD devcoredump with canonical addresses
Customers suggested that Xe KMD should change all possible interfaces
visible to users to canonical addresses. With that, we need some changes
to keep the decode of devcoredump working.

An old version of the tool will not be able to decode secondary batch
buffers when parsing a new version of the file but the new version of
this tool will be able to parse both versions of devcoredump file.
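
For reference, a canonical address is one whose upper bits are copies of the highest implemented address bit. The sketch below assumes a 48-bit virtual address space, which the commit message does not state; `ADDR_BITS`, `to_canonical` and `from_canonical` are hypothetical names, not the decoder's API.

```c
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, so bit 47 acts as the sign bit. */
#define ADDR_BITS 48
#define LOW_MASK ((UINT64_C(1) << ADDR_BITS) - 1)
#define SIGN_BIT (UINT64_C(1) << (ADDR_BITS - 1))

/* Sign-extend bit 47 into bits 48..63. */
static uint64_t
to_canonical(uint64_t addr)
{
   addr &= LOW_MASK;
   return (addr & SIGN_BIT) ? (addr | ~LOW_MASK) : addr;
}

/* Drop the sign-extension bits again. */
static uint64_t
from_canonical(uint64_t addr)
{
   return addr & LOW_MASK;
}
```

A tool that must read both file versions could normalize every address with one of these helpers before comparing it against buffer ranges.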

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37570>
2025-09-26 16:15:53 +00:00
Danylo Piliaiev
24235bcac3 tu/perfetto: Use a separate track for VK_EXT_debug_utils labels
Labels set via VK_EXT_debug_utils are in a separate track due to the
following part of the spec:
 "An application may open a debug label region in one command buffer and
  close it in another, or otherwise split debug label regions across
  multiple command buffers or multiple queue submissions."

This means labels can start in one renderpass and end in another command
buffer, which breaks our assumption that stages can be modeled as a stack.
While applications aren't expected to use labels in such extreme ways,
even simpler cases can break our assumptions.

Having annotations in a separate track prevents the main track(s) from
entering an invalid state.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37028>
2025-09-26 15:45:21 +00:00
Georg Lehmann
46a4569c22 nir/opt_undef: prefer 0 over NaN for pack_half_2x16_rtz_split
Using NaN doesn't usually allow any extra optimizations, and 0 is an inline
constant on AMD hw where this opcode is used with undef for fragment shader
exports.

Foz-DB GFX1201:
Totals from 889 (1.11% of 80287) affected shaders:
Instrs: 1676365 -> 1676348 (-0.00%)
CodeSize: 8827040 -> 8821760 (-0.06%)
Latency: 13346728 -> 13346699 (-0.00%)
InvThroughput: 1799283 -> 1799262 (-0.00%)
Copies: 108125 -> 108102 (-0.02%)
VALU: 974875 -> 974852 (-0.00%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>
2025-09-26 15:11:26 +00:00
Georg Lehmann
a7f8c6ed60 radv: call nir_opt_undef late too
Foz-DB GFX1201:
Totals from 2263 (2.82% of 80287) affected shaders:
MaxWaves: 57164 -> 57016 (-0.26%); split: +0.04%, -0.30%
Instrs: 2711595 -> 2678247 (-1.23%); split: -1.23%, +0.00%
CodeSize: 14066656 -> 13929720 (-0.97%); split: -1.01%, +0.03%
VGPRs: 139452 -> 140004 (+0.40%); split: -0.03%, +0.42%
Latency: 15902794 -> 15875935 (-0.17%); split: -0.17%, +0.00%
InvThroughput: 2179122 -> 2165716 (-0.62%); split: -0.62%, +0.00%
SClause: 61416 -> 61477 (+0.10%); split: -0.01%, +0.11%
Copies: 169781 -> 175175 (+3.18%); split: -0.05%, +3.22%
Branches: 53491 -> 53469 (-0.04%)
PreSGPRs: 114087 -> 114086 (-0.00%)
PreVGPRs: 115702 -> 115697 (-0.00%)
VALU: 1555907 -> 1535514 (-1.31%); split: -1.31%, +0.00%
SALU: 362560 -> 353803 (-2.42%)
SMEM: 106263 -> 106259 (-0.00%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>
2025-09-26 15:11:26 +00:00
Georg Lehmann
8343e45467 aco/lower_branches: update branch hints after changing jump targets
Fixes: 13ad3db43f ("aco/lower_branches: implement try_remove_simple_block() in lower_branches()")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>
2025-09-26 15:11:26 +00:00
Yiwei Zhang
2ea551e85a vulkan/util: drop workaround for ANB struct
VkPhysicalDevicePresentationPropertiesANDROID now properly extends
VkPhysicalDeviceProperties2.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37590>
2025-09-26 14:02:53 +00:00
Mike Blumenkrantz
0dc5caec36 vulkan: update spec to 1.4.328
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37590>
2025-09-26 14:02:52 +00:00
Hans-Kristian Arntzen
59278c2236 anti-lag: Do not enable layer by default.
There are too many reports of instability in the wild,
even in scenarios where anti-lag isn't really being actively used,
making it questionable to load the layer by default.

By using enable_environment, a certain environment variable
must be set for the loader to consider loading the layer.
This could be removed once the layer stabilizes.

Cc: mesa-stable

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37509>
2025-09-26 13:29:38 +00:00
Hyunjun Ko
b7129a2085 anv/video: fix to set slice block size correctly for h265 decoding.
Fixes dEQP-VK.video.encode.h265.resolution_change_dpb_layered_src_video_layout

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37412>
2025-09-26 12:27:59 +00:00
Hyunjun Ko
84802cf325 vulkan/video: fix misuse of CLAMP in h265 slice parsing.
Fixes: 7998106355 ("vulkan/video: Fix wrong parsing for H265 decoding")

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37412>
2025-09-26 12:27:59 +00:00
Hyunjun Ko
23c98417ae vulkan/video: fix h265 encoding with LT enabled.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37412>
2025-09-26 12:27:59 +00:00
Hyunjun Ko
896f95a37e vulkan/video: fix h265 decoding with LT enabled.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37412>
2025-09-26 12:27:58 +00:00
Simon McVittie
9d36bf891b vulkan: Compute path to write into JSON manifests once, use it everywhere
This reduces duplication: we only need to distinguish between Windows
and Unix in one place.

The previous code was inconsistent about using either the `platforms`
option, or the `host_machine`. Following the logic described in
commit 94379377 "lavapipe: build "Windows" check should use the host machine, not the `platforms` option.",
I've assumed that checking the host machine is the more-correct version
and used that.

Signed-off-by: Simon McVittie <smcv@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37576>
2025-09-26 10:47:31 +00:00
Simon McVittie
be8cac52d3 vulkan: Consistently form driver library names as prefix + name + suffix
This consistently uses `NAME.dll` on Windows, `libNAME.dylib` on Darwin
derivatives such as macOS, and `libNAME.so` on Linux, *BSD and so on.
It's also consistent about using the local variable name `icd_file_name`
for this name in every Vulkan driver, which was already the case in many
but not all drivers.
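
The naming rule above can be sketched as follows. The actual change lives in the Meson build files; this C version only illustrates the prefix + name + suffix scheme, and `enum host_os`, `icd_file_name()` and the driver name used in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical host classification, mirroring the three cases in the text. */
enum host_os { HOST_WINDOWS, HOST_DARWIN, HOST_OTHER };

/* Writes prefix + name + suffix into buf and returns the length snprintf
 * reports: NAME.dll on Windows, libNAME.dylib on Darwin, libNAME.so
 * everywhere else. */
static int
icd_file_name(enum host_os os, const char *name, char *buf, size_t len)
{
   const char *prefix = (os == HOST_WINDOWS) ? "" : "lib";
   const char *suffix = (os == HOST_WINDOWS) ? ".dll"
                      : (os == HOST_DARWIN)  ? ".dylib"
                                             : ".so";
   return snprintf(buf, len, "%s%s%s", prefix, name, suffix);
}
```

With a made-up driver name such as `vulkan_example`, this produces `vulkan_example.dll`, `libvulkan_example.dylib` and `libvulkan_example.so` for the three host types.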

Some of these drivers probably don't make sense (or don't work) on
Windows and/or macOS, but if this is kept consistent for all drivers,
it should avoid the need for driver-specific commits like
commit 611e9f29e "lavapipe: fix icd generation for windows",
commit 951f3287 "lavapipe: set empty dll prefix",
commit 13e7a39f "lavapipe: fixes for macOS support",
commit 7008e655 "radv: Update JSON generator if Windows" and so on,
each time a driver is found to be relevant on more platforms than
previously believed.

Signed-off-by: Simon McVittie <smcv@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37576>
2025-09-26 10:47:31 +00:00
Christian Gmeiner
2a14b7224b etnaviv: Support ARB_stencil_texturing
The hardware support for stencil texturing is unclear from RE and
the feature databases. Enable this extension on halti5 GPUs as a
conservative starting point.

This also enables ARB_texture_stencil8.

Passes:
 - dEQP-GLES31.functional.stencil_texturing.format.stencil_index8*
 - dEQP-GLES31.functional.stencil_texturing.format.depth24_*
 - arb_texture_stencil8 piglit
 - arb_stencil_texture piglit

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37578>
2025-09-26 10:30:16 +00:00
Christian Gmeiner
06738c4ef6 etnaviv: Update headers from rnndb
Update to rnndb commit 8b4f7a88ce71

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37578>
2025-09-26 10:30:16 +00:00
Tapani Pälli
c8f47d7681 blorp: add missing pipecontrol after 3DSTATE_WM_HZ_OP for Xe2+
Backport-to: 25.2
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37547>
2025-09-26 10:07:18 +00:00
Georg Lehmann
cc08786689 aco: use maximum RT vgpr_limit that doesn't reduce wave count
144 instead of 132 with 5 waves, in practice.
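
One way to arrive at a figure like 144 is to divide the register file by the target wave count and round down to the allocation granule. The sketch below is an assumption, not the ACO code: the inputs 768 (registers available per SIMD) and 12 (allocation granule) are values chosen here because they reproduce 144 for 5 waves, and `round_down_npot()` and `max_vgprs_for_waves()` are hypothetical helpers whose signatures need not match Mesa's.

```c
#include <stdint.h>

/* Hypothetical helper: round v down to a multiple of a, where a does not
 * have to be a power of two. */
static uint32_t
round_down_npot(uint32_t v, uint32_t a)
{
   return v - (v % a);
}

/* Largest per-wave VGPR budget that still lets `waves` waves fit, given the
 * total register count and the allocation granule (both assumed inputs). */
static uint32_t
max_vgprs_for_waves(uint32_t physical_vgprs, uint32_t granule, uint32_t waves)
{
   return round_down_npot(physical_vgprs / waves, granule);
}
```

Under these assumed inputs, any limit between 133 and 144 registers still allows 5 waves, so picking the top of that range costs no occupancy.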

Foz-DB Navi31:
Totals from 33 (0.04% of 80273) affected shaders:
Instrs: 3266241 -> 3261329 (-0.15%)
CodeSize: 16885356 -> 16860088 (-0.15%)
VGPRs: 4356 -> 4752 (+9.09%)
SpillVGPRs: 2504 -> 1535 (-38.70%)
Scratch: 264704 -> 216320 (-18.28%)
Latency: 18445909 -> 18395904 (-0.27%)
InvThroughput: 3689182 -> 3679182 (-0.27%)
VClause: 85171 -> 84595 (-0.68%)
SClause: 59365 -> 59320 (-0.08%); split: -0.08%, +0.01%
Copies: 260528 -> 259113 (-0.54%); split: -0.59%, +0.05%
Branches: 92537 -> 92519 (-0.02%)
VALU: 1937426 -> 1935925 (-0.08%); split: -0.08%, +0.01%
SALU: 393075 -> 393047 (-0.01%); split: -0.01%, +0.01%
VMEM: 147914 -> 146003 (-1.29%)
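
The percentages in these reports appear to be plain relative changes, (after - before) / before; the VGPRs and SpillVGPRs rows above are consistent with that reading. A small sketch, with `percent_change()` as a hypothetical helper:

```c
/* Relative change from `before` to `after`, in percent.
 * `before` must be non-zero. */
static double
percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}
```

For example, `percent_change(4356, 4752)` is roughly +9.09 and `percent_change(2504, 1535)` is roughly -38.70, matching the VGPRs and SpillVGPRs rows.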

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37548>
2025-09-26 08:45:05 +00:00
Georg Lehmann
4b24bc7c70 util: add util_round_down_npot
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37548>
2025-09-26 08:45:05 +00:00
Mauro Rossi
7b50b8966b intel/mda: Fix gnu-empty-initializer warning
This also causes build errors on older Android prebuilt clang.

Fixes: bccc0fa9 ("intel/mda: Add code to produce mesa debug archives")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37586>
2025-09-26 08:30:16 +00:00
Erik Faye-Lund
a09b6551ff pvr: remove stale comment about pvr_pds_upload
The pvr_pds_upload struct has been moved to pvr_common.h, which doesn't
have the same circular dependency issue here. But this change is out of
scope for this MR, so let's just update the comment here instead.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:59 +00:00
Erik Faye-Lund
7ff8b043eb pvr: use pvr_memlayout instead of uint32_t
The circular include dependency has already been resolved, when this was
moved to pvr_common.h instead of pvr_private.h. So let's use the actual
type and delete the comment here.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:59 +00:00
Erik Faye-Lund
a26600c4f4 pvr: include pvr_common.h instead of pvr_private.h
This is the header that's *actually* needed here, pvr_private.h just
pulls it in for us. Since this is a header-file, let's use as narrow
includes as we can to avoid including everything everywhere and terrible
build times.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:59 +00:00
Erik Faye-Lund
ba5afddc90 pvr: remove bogus forward-declaration
This struct doesn't exist any more. The forward declaration is also
unused, so it's not causing any harm, but let's remove it to clean
things up a bit.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:59 +00:00
Erik Faye-Lund
d963cca82f pvr: move event/sampler cast defs to correct header
This struct is defined in pvr_common.h, so we should have the cast
definitions in the same header.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:59 +00:00
Erik Faye-Lund
83a8df1b37 pvr: drop pointless PVR_FROM_HANDLE macro
All it does is call VK_FROM_HANDLE, let's just do that directly instead.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:58 +00:00
Erik Faye-Lund
7fce4e5bdc pvr: remove unused enum
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37554>
2025-09-26 08:15:58 +00:00
Christian Gmeiner
d5606141eb docs/features: Mark GL_EXT_transform_feedback as done for etnaviv/HWTFB
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37579>
2025-09-26 08:11:10 +00:00
Georg Lehmann
8e03505782 aco: don't insert s_sendmsg dealloc_vgprs with little vgprs allocated
Reduces message bus traffic when the benefit is small.

Foz-DB Navi31:
Totals from 3752 (4.67% of 80273) affected shaders:
Instrs: 1999755 -> 1992249 (-0.38%)
CodeSize: 10531824 -> 10501800 (-0.29%)
Latency: 14935247 -> 14935147 (-0.00%)
InvThroughput: 5976053 -> 5975262 (-0.01%)

Foz-DB Navi33:
Totals from 2614 (3.26% of 80273) affected shaders:
Instrs: 969475 -> 964247 (-0.54%)
CodeSize: 5171240 -> 5150328 (-0.40%)
Latency: 7891519 -> 7891434 (-0.00%)
InvThroughput: 4815008 -> 4814287 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
27cc6317f9 aco: dealloc vgprs if there is a pending non scratch store and no pending export
Because s_sendmsg dealloc_vgprs waits for every counter except vs_count,
and the message bus has limited throughput, we should only insert the dealloc
when we know that it's beneficial.

Foz-DB Navi31:
Totals from 5280 (6.58% of 80273) affected shaders:
Instrs: 4186851 -> 4197416 (+0.25%)
CodeSize: 21910004 -> 21952264 (+0.19%)
Latency: 31679067 -> 31679173 (+0.00%)
InvThroughput: 9182625 -> 9183417 (+0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
26e041e821 aco: remove existing dealloc_vgprs use
We didn't consider that s_sendmsg dealloc_vgprs waits for all counters
except vscnt.

Foz-DB Navi31:
Totals from 74090 (92.52% of 80084) affected shaders:
Instrs: 36031071 -> 35853573 (-0.49%)
CodeSize: 189233756 -> 188523764 (-0.38%)
Latency: 222378318 -> 222374890 (-0.00%)
InvThroughput: 33366893 -> 33362457 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00