210530 commits

Author SHA1 Message Date
Samuel Pitoiset
4bd0bf7e19 radv: invalidate push constants for compute<->rt during dispatches
It's similar but a bit cleaner.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36792>
2025-08-18 07:25:32 +00:00
Samuel Pitoiset
104510aeb6 radv: slightly optimize indirect descriptor sets upload size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36792>
2025-08-18 07:25:32 +00:00
Samuel Pitoiset
fd5925868f radv: tidy up radv_flush_descriptors()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36792>
2025-08-18 07:25:31 +00:00
Yiwei Zhang
94fdc5bc47 venus: use VK_USE_PLATFORM_ANDROID_KHR when applicable
To stay consistent with common code gen:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36702

Besides using the spec platform guard, this change also:
- drops the guard for ANB sharedImage
- keeps the gettid and disk cache guards as they are

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36802>
2025-08-18 02:36:12 +00:00
Yiwei Zhang
69de00efe2 meson/android: drop redundant libdisplay-info dep
It's only used by common wsi, but not Android.

Fixes: 2c870bbe20 ("build: Add dependency on libdisplay-info")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36795>
2025-08-17 14:40:36 +00:00
Martin Roukala (né Peres)
81a79234d8 radv/ci: disable hang detection in navi31-vkcts
This has caused at least 2 unrelated MRs to fail a merge, so the
expectation that the GPU would not hang is clearly wrong and needs to
be updated.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36804>
2025-08-17 13:36:33 +03:00
Emma Anholt
a5d514c5f3 tu: Move the BO implicit sync flag handling to a BO allocation flag.
This lets us set NO_IMPLICIT per bo for non-implicit-sync BOs (which gets
checked when we're submitting with an implicit sync BO present).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19309>
2025-08-16 22:26:41 +00:00
Emma Anholt
4dcf32c56e wsi/drm: Don't request implicit sync if we're doing implicit sync ourselves.
This will avoid kernel overhead on tu (implicit syncs every BO) and radv
(implicit syncs the swapchain BOs) for doing implicit synchronization on
non-explicit-sync WSI backends (old X11 and Wayland, KHR_display without
!36591, and headless).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19309>
2025-08-16 22:26:41 +00:00
Emma Anholt
8f67d59725 wsi/drm: Do the dma_buf_semaphore setup at swapchain creation time.
Less work at present time, and will let us make decisions about implicit
sync up front.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19309>
2025-08-16 22:26:41 +00:00
Emma Anholt
a377d32fdc vulkan/wsi: Add a test for kernel 6.0 sync file import/export ioctls.
We'll use this in DRM WSI to decide if we need the implicit_sync flag on
swapchain image creation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19309>
2025-08-16 22:26:40 +00:00
Emma Anholt
61fb238a4d vulkan/wsi: Add comments about the WSI's syncing, and KHR_display stuff.
I have spent so long orienting myself in this code, more than once, that
it's time to leave some clues for next time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19309>
2025-08-16 22:26:40 +00:00
Emma Anholt
071a7e5f8f tu: Disable LRZ writes after most stencil-write operations.
As explained in the comment, stencil can have a similar dependency on
later LRZ writes to how blending does.  Fixes
dEQP-VK.imageless_framebuffer.depth_stencil with TU_DEBUG=gmem,forcebin
(so you get LRZ filled during binning of the single draw call that
happened)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13533
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36660>
2025-08-16 21:56:30 +00:00
Yiwei Zhang
07cee75c39 venus: layer vkQueueSubmit2 over vkQueueSubmit w/o sync2
This helps with common wsi code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36799>
2025-08-16 19:14:17 +00:00
Faith Ekstrand
1b2acf9006 vulkan: Drop implicit sync support
This gets rid of the internal wsi_memory_signal_submit_info structure
used to indicate implicit sync through vkQueueSubmit() as well as the
handling in vk_queue.c and vk_device::create_sync_for_memory.  Nothing
is using any of this anymore.

Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
16520cfdf1 vulkan/wsi: Stop setting wsi_memory_signal_submit_info
There are no longer any drivers implementing the back-end hooks for this
so there's no point in setting it from WSI.

Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
9cf6f14b88 vulkan/wsi: Drop signal_fence/semaphore_with_memory
Intel was the only driver setting this and now it doesn't, so we can get
rid of the flag and the associated code.

Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
334466d907 dozen: Drop dzn_create_sync_for_memory()
It creates a dummy sync so there's no point.

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
7b945df668 hasvk: Dead code anv_bo_sync
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
9944be0be9 hasvk/wsi: Stop requesting signal_*_with_memory
Now that we require the dma-buf sync file import/export path, these
legacy paths should never be invoked so we can stop requesting them.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
9cf4872475 hasvk: Require Linux 6.0 for dma-buf sync file import/export
This also implies all the other syncobj features we care about so those
become dead code.  We'll delete them in following commits.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:47 -04:00
Faith Ekstrand
5f7c6b2810 hasvk: Require HAS_EXEC_TIMELINE_FENCES
i915 has had support for timeline syncobjs for a long time.  We might as
well require it at this point.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
3ec62d3a09 hasvk: Require HAS_EXEC_CAPTURE
This feature is almost as old as the Vulkan driver itself.  We've
required newer kernels for a long time.  There's no point in having this
feature bit kicking around.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
5802d2c090 hasvk: Require HAS_EXEC_ASYNC
This feature is as old as the Vulkan driver itself.  We've required
newer kernels for a long time.  There's no point in having this feature
bit kicking around.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
57aceb96aa anv: Dead code anv_bo_sync
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
7ebe93aa9f anv/wsi: Stop requesting signal_*_with_memory
Now that we require the dma-buf sync file import/export path, these
legacy paths should never be invoked so we can stop requesting them.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
affee04bd9 anv: Require Linux 6.0 for dma-buf sync file import/export
This also implies all the other syncobj features we care about so those
become dead code.  We'll delete them in following commits.

We don't need a check for Xe because the Xe driver was merged into Linux
6.8 while dma-buf sync file import/export landed in 6.0.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
d7416ebc19 intel/gem: Add an intel_gem_supports_dma_buf_sync_file() helper
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
8044f16bd6 anv/i915: Require HAS_EXEC_TIMELINE_FENCES
i915 has had support for timeline syncobjs for a long time.  We might as
well require it at this point.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
cb5a2eafd5 anv/i915: Require HAS_EXEC_CAPTURE
This feature is almost as old as the Vulkan driver itself.  We've
required newer kernels for a long time.  There's no point in having this
feature bit kicking around.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
f28eb1bae6 anv/i915: Require HAS_EXEC_ASYNC
This feature is as old as the Vulkan driver itself.  We've required
newer kernels for a long time.  There's no point in having this feature
bit kicking around.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Faith Ekstrand
94931fd4f4 anv: Set the Shader capability when compiling the FP64 shader
Otherwise the SPIR-V parser prints a warning the first time the driver
is loaded after a fresh compile.

Fixes: 91b62e9868 ("anv: Use spirv_capabilities for the float64 shader")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783>
2025-08-16 00:04:46 -04:00
Yiwei Zhang
403a62a9e5 venus: stop consuming wsi_memory_signal_submit_info
Dropped in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36783

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36800>
2025-08-15 20:32:24 -07:00
Konstantin Seurer
cc0dc4b566 radv: Store parent node IDs inside nodes on GFX12
Saves some space.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36691>
2025-08-15 13:00:32 +00:00
Georg Lehmann
8c20947f69 amd/ci: update checksums for restricted traces
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970 seems to have
caused tiny differences for one pixel in each of the traces.
Kind of unexpected, but not exactly concerning either.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36784>
2025-08-15 11:34:56 +00:00
Konstantin Seurer
0d73aeea27 radv: Add RADV_DEBUG=validatevas for address validation in nir
The option creates a buffer where each bit stores whether the
corresponding 4096 byte memory section has been allocated. The helper
radv_build_is_valid_va allows for querying the validity of addresses
inside a nir shader which can be useful for debugging.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34392>
2025-08-15 10:32:35 +00:00
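
The bitmap described in 0d73aeea27 (one bit per 4096-byte memory section) can be illustrated with a small CPU-side sketch. This is not the RADV implementation: the `va_bitmap` struct and its functions are hypothetical names made up for this example, and the real helper, radv_build_is_valid_va, emits NIR that runs inside a shader rather than host code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* 4096-byte sections, as stated in the commit message. */
#define VA_PAGE_SHIFT 12
#define VA_PAGE_SIZE (1ull << VA_PAGE_SHIFT)

/* Hypothetical tracker: one bit per 4096-byte section of an address range. */
struct va_bitmap {
   uint64_t base;      /* first tracked address, 4096-byte aligned */
   uint64_t num_pages; /* number of tracked sections */
   uint32_t *words;    /* bit N set => section N is allocated */
};

static bool
va_bitmap_init(struct va_bitmap *bm, uint64_t base, uint64_t size)
{
   assert((base & (VA_PAGE_SIZE - 1)) == 0 && size > 0);
   bm->base = base;
   bm->num_pages = (size + VA_PAGE_SIZE - 1) >> VA_PAGE_SHIFT;
   bm->words = calloc((bm->num_pages + 31) / 32, sizeof(uint32_t));
   return bm->words != NULL;
}

/* Mark every section touched by [va, va + size) as allocated or freed. */
static void
va_bitmap_set_range(struct va_bitmap *bm, uint64_t va, uint64_t size, bool valid)
{
   assert(size > 0 && va >= bm->base);
   uint64_t first = (va - bm->base) >> VA_PAGE_SHIFT;
   uint64_t last = (va + size - 1 - bm->base) >> VA_PAGE_SHIFT;
   assert(last < bm->num_pages);

   for (uint64_t page = first; page <= last; page++) {
      uint32_t mask = 1u << (page % 32);
      if (valid)
         bm->words[page / 32] |= mask;
      else
         bm->words[page / 32] &= ~mask;
   }
}

/* Query a single address; anything outside the tracked range is invalid. */
static bool
va_bitmap_is_valid(const struct va_bitmap *bm, uint64_t va)
{
   if (va < bm->base)
      return false;
   uint64_t page = (va - bm->base) >> VA_PAGE_SHIFT;
   if (page >= bm->num_pages)
      return false;
   return ((bm->words[page / 32] >> (page % 32)) & 1u) != 0;
}
```

A lookup costs one shift, one load and one bit test, which is presumably why a per-section bitmap suits a check that has to run from shader code.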
Konstantin Seurer
be4be884e1 radv: Rename radv_printf files to radv_debug_nir
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34392>
2025-08-15 10:32:34 +00:00
Yonggang Luo
fcab92d557 util: Now DETECT_ARCH_X86_64 can be safely used in rounding.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36744>
2025-08-15 09:27:19 +00:00
Yonggang Luo
219905aec7 util: Add DETECT_ARCH_ARM64EC for defined(_M_ARM64EC) equivalent
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36744>
2025-08-15 09:27:19 +00:00
Yonggang Luo
9beb0e90b4 util: Update DETECT_ARCH_X86_64 to exclude _M_ARM64EC
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36744>
2025-08-15 09:27:19 +00:00
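
For context on the three util commits above (9beb0e90b4, 219905aec7, fcab92d557): MSVC defines _M_X64 and _M_AMD64 for ARM64EC targets as well as for real x64 targets, so an x86-64 check has to exclude _M_ARM64EC explicitly. The sketch below shows the idea only. The EXAMPLE_ARCH_* macros and example_arch_name() are illustrative names, not Mesa's, and the actual conditions in Mesa's detection header may differ.

```c
#include <assert.h>
#include <stddef.h>

/* ARM64EC: MSVC sets _M_ARM64EC, but also _M_X64 and _M_AMD64. */
#if defined(_M_ARM64EC)
#define EXAMPLE_ARCH_ARM64EC 1
#else
#define EXAMPLE_ARCH_ARM64EC 0
#endif

/* Only report x86-64 when the target is not ARM64EC. */
#if (defined(__x86_64__) || defined(__amd64__) || defined(_M_X64) || \
     defined(_M_AMD64)) && !defined(_M_ARM64EC)
#define EXAMPLE_ARCH_X86_64 1
#else
#define EXAMPLE_ARCH_X86_64 0
#endif

/* The two macros must never be set at the same time. */
static_assert(!(EXAMPLE_ARCH_X86_64 && EXAMPLE_ARCH_ARM64EC),
              "x86-64 and ARM64EC detection must be mutually exclusive");

/* Code guarded by the x86-64 macro can now assume a genuine x86-64 target. */
static const char *
example_arch_name(void)
{
#if EXAMPLE_ARCH_X86_64
   return "x86_64";
#elif EXAMPLE_ARCH_ARM64EC
   return "arm64ec";
#else
   return "other";
#endif
}
```

With the exclusion in place, code guarded by the x86-64 macro is no longer compiled for ARM64EC, which is what the rounding.h commit title relies on.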
Georg Lehmann
9ed94371f7 amd: stop using custom gl_access_qualifier for access type
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36764>
2025-08-15 08:26:10 +00:00
Georg Lehmann
f17cb6b714 amd: replace ACCESS_TYPE_SMEM with ACCESS_SMEM_AMD
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36764>
2025-08-15 08:26:10 +00:00
Corentin Noël
6da8752758 tgsi: Remove return type from tgsi_instruction_texture
This return type is never used.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36765>
2025-08-15 08:03:04 +00:00
Samuel Pitoiset
eb1a093965 radv: stop using the pipeline layout for uploading push constants with DGC
Pass the push constant size as a parameter instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36777>
2025-08-15 07:45:00 +00:00
Samuel Pitoiset
b527a4f23e radv: split uploading push constants with DGC in two parts
The first part is for copying "normal" push constant values to the
upload space in the preprocess buffer. The second part is only for
updating the push constants set for DGC.

This will allow us to remove using the pipeline layout in the DGC
shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36777>
2025-08-15 07:45:00 +00:00
Samuel Pitoiset
3e0d4a1df1 radv: stop using the pipeline layout for inlined push constants with DGC
This only updates the inlined push constants set for DGC and doesn't
need the pipeline layout if the index is computed differently.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36777>
2025-08-15 07:45:00 +00:00
Samuel Pitoiset
95e387d283 radv: remove useless inline push constant emission with DGC IES
This is actually not needed because the base pipeline/shader is
required to be bound before preprocess()/execute() are called. Also,
the push constant layout must be similar between all pipelines/shaders
in the same IES.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36777>
2025-08-15 07:45:00 +00:00
Georg Lehmann
6ba462bf26 aco/disable_wqm: optimize local mask creation
Foz-DB Navi48:
Totals from 7861 (9.79% of 80287) affected shaders:
Instrs: 13276809 -> 13183483 (-0.70%)
CodeSize: 71221260 -> 70852500 (-0.52%); split: -0.52%, +0.00%
Latency: 124001421 -> 123976480 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 17820119 -> 17817551 (-0.01%); split: -0.01%, +0.00%
SALU: 1736356 -> 1666673 (-4.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:47 +00:00
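
The percentages in the Foz-DB totals above are consistent with a plain relative change, (after - before) / before, rounded to two decimals. A minimal sketch of that arithmetic follows; percent_change() is a hypothetical helper written for this example and is not part of any Mesa or Fossilize tooling.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change from `before` to `after`, in percent.
 * Negative values mean the statistic shrank. */
static double
percent_change(uint64_t before, uint64_t after)
{
   assert(before != 0);
   return ((double)after - (double)before) / (double)before * 100.0;
}
```

For example, feeding in the Instrs line of 6ba462bf26 (13276809 -> 13183483) gives roughly -0.70, and the SALU line (1736356 -> 1666673) gives roughly -4.01, matching the reported figures.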
Georg Lehmann
fc53cf146c aco: disable wqm for sampled buffer loads when not needed
Foz-DB GFX1201:
Totals from 318 (0.40% of 80287) affected shaders:
Instrs: 313039 -> 314064 (+0.33%); split: -0.00%, +0.33%
CodeSize: 1684104 -> 1688212 (+0.24%); split: -0.00%, +0.24%
VGPRs: 15120 -> 15144 (+0.16%)
Latency: 2515023 -> 2518610 (+0.14%); split: -0.06%, +0.20%
InvThroughput: 447468 -> 447615 (+0.03%); split: -0.02%, +0.05%
VClause: 4866 -> 4914 (+0.99%)
SClause: 6564 -> 6559 (-0.08%); split: -0.09%, +0.02%
Copies: 23577 -> 23673 (+0.41%); split: -0.04%, +0.45%
PreSGPRs: 16019 -> 16029 (+0.06%)
VALU: 172157 -> 172143 (-0.01%)
SALU: 52816 -> 53867 (+1.99%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:47 +00:00
Georg Lehmann
883b1ca364 aco: disable wqm for tex loads when not needed
By only executing VMEM loads for lanes where the result is used, we can save
bandwidth.

The NIR pass only handles tex for now, but those are most common anyway.
We can extend it to handle image/ssbo/ubo/global loads in the future.

Foz-DB GFX1201:
Totals from 32633 (40.66% of 80251) affected shaders:
Instrs: 22635910 -> 23193509 (+2.46%); split: -0.00%, +2.46%
CodeSize: 122880044 -> 125093428 (+1.80%); split: -0.00%, +1.81%
VGPRs: 1481868 -> 1481712 (-0.01%)
SpillSGPRs: 3877 -> 4301 (+10.94%); split: -0.52%, +11.45%
Latency: 171480552 -> 171685219 (+0.12%); split: -0.18%, +0.30%
InvThroughput: 24364743 -> 24373441 (+0.04%); split: -0.08%, +0.12%
VClause: 388318 -> 388557 (+0.06%); split: -0.06%, +0.13%
SClause: 774781 -> 776492 (+0.22%); split: -0.29%, +0.51%
Copies: 1416586 -> 1541199 (+8.80%); split: -0.16%, +8.96%
Branches: 419591 -> 419673 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 1330303 -> 1416540 (+6.48%)
PreVGPRs: 964864 -> 964863 (-0.00%)
VALU: 12919601 -> 12920254 (+0.01%); split: -0.01%, +0.01%
SALU: 2685402 -> 3224147 (+20.06%); split: -0.00%, +20.07%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:46 +00:00
Georg Lehmann
7159fd21f8 aco: don't restrict vmem load scheduling by inserting p_end_wqm early
Foz-DB GFX1201:
Totals from 7 (0.01% of 80251) affected shaders:
Instrs: 703 -> 729 (+3.70%)
CodeSize: 4032 -> 4136 (+2.58%)
Latency: 5840 -> 4715 (-19.26%)
InvThroughput: 441 -> 405 (-8.16%)
Copies: 61 -> 67 (+9.84%)
PreSGPRs: 216 -> 218 (+0.93%)
SALU: 93 -> 113 (+21.51%)

When reordered after the next commit:
Foz-DB GFX1201:
Totals from 1609 (2.00% of 80251) affected shaders:
MaxWaves: 47984 -> 47986 (+0.00%)
Instrs: 1326847 -> 1332797 (+0.45%); split: -0.05%, +0.50%
CodeSize: 7248720 -> 7275364 (+0.37%); split: -0.04%, +0.41%
VGPRs: 74968 -> 75148 (+0.24%); split: -0.06%, +0.30%
SpillSGPRs: 182 -> 184 (+1.10%)
Latency: 10370602 -> 10172524 (-1.91%); split: -2.06%, +0.15%
InvThroughput: 1446508 -> 1445920 (-0.04%); split: -0.11%, +0.06%
VClause: 23567 -> 23559 (-0.03%); split: -0.35%, +0.32%
SClause: 43143 -> 43203 (+0.14%); split: -0.52%, +0.66%
Copies: 80948 -> 81622 (+0.83%); split: -0.32%, +1.16%
Branches: 21599 -> 21727 (+0.59%)
PreSGPRs: 69963 -> 70732 (+1.10%)
VALU: 778968 -> 779024 (+0.01%); split: -0.02%, +0.03%
SALU: 159797 -> 165329 (+3.46%); split: -0.01%, +3.47%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:46 +00:00