Commit graph

200045 commits

Author SHA1 Message Date
Georg Lehmann
a1fbf91ff2 radv/nir: fix radv_nir_remap_color_attachment progress
And switch to SPDX header.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Georg Lehmann
da197c3d55 ac/nir/lower_ps_late: remove gfx6 mrtz writemask workaround
This is now done in the backends.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Georg Lehmann
6a7ff2204b ac/llvm/gfx6: move mrtz writemask workaround to ac_build_export
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Georg Lehmann
072815e5cb aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Rhys Perry
b5cf3b1628 ac/nir: fix check for increasing size of non-descriptor loads
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
In the previous version, "end" could have been zero, which would have
allowed an increase of "mul" bytes, when it should not not be increased at all.

For example:
- align_offset=4
- mul=4
- unaligned_new_size=96
- aligned_new_size=128
This would have loaded a dword which was not loaded previously.

fossil-db (gfx1201):
Totals from 115 (0.14% of 79839) affected shaders:
Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30%
CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37%
SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18%
Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11%
InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10%
VClause: 3689 -> 3691 (+0.05%)
SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44%
Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56%
Branches: 7402 -> 7401 (-0.01%)
PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39%
VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04%
SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99%
SMEM: 9556 -> 9697 (+1.48%)

fossil-db (navi31):
Totals from 238 (0.30% of 79825) affected shaders:
Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17%
CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22%
SpillSGPRs: 1064 -> 1059 (-0.47%)
Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13%
InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11%
VClause: 7101 -> 7098 (-0.04%)
SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62%
Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01%
PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26%
VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02%
SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70%
SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07%

fossil-db (navi21):
Totals from 239 (0.30% of 79825) affected shaders:
Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15%
CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19%
Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07%
InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06%
SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64%
Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96%
Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02%
PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08%
VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05%
SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45%
SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>
2025-12-12 13:58:42 +00:00
Rhys Perry
49d923078f ac/nir: fix calculation of aligned_new_size
This should consider nir_round_up_components().

fossil-db (gfx1201):
Totals from 90 (0.11% of 79839) affected shaders:
MaxWaves: 1829 -> 1901 (+3.94%)
Instrs: 410780 -> 411825 (+0.25%); split: -0.02%, +0.27%
CodeSize: 2227956 -> 2234464 (+0.29%); split: -0.02%, +0.31%
VGPRs: 6952 -> 6760 (-2.76%); split: -3.11%, +0.35%
Latency: 3071765 -> 3073960 (+0.07%); split: -0.00%, +0.07%
InvThroughput: 766201 -> 767322 (+0.15%); split: -0.00%, +0.15%
VClause: 7887 -> 7898 (+0.14%); split: -0.08%, +0.22%
Copies: 48189 -> 48324 (+0.28%); split: -0.05%, +0.33%
PreVGPRs: 6605 -> 6595 (-0.15%); split: -0.18%, +0.03%
VALU: 237272 -> 238147 (+0.37%); split: -0.01%, +0.37%
SALU: 48987 -> 49003 (+0.03%)
VMEM: 15542 -> 15560 (+0.12%)
VOPD: 188 -> 200 (+6.38%)

fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
MaxWaves: 1811 -> 1883 (+3.98%)
Instrs: 403695 -> 404691 (+0.25%); split: -0.01%, +0.26%
CodeSize: 2150612 -> 2154860 (+0.20%); split: -0.03%, +0.23%
VGPRs: 6892 -> 6676 (-3.13%)
Latency: 3306107 -> 3310010 (+0.12%); split: -0.01%, +0.13%
InvThroughput: 813092 -> 814382 (+0.16%); split: -0.00%, +0.16%
VClause: 7999 -> 8010 (+0.14%); split: -0.06%, +0.20%
Copies: 50089 -> 50210 (+0.24%); split: -0.05%, +0.29%
PreVGPRs: 6596 -> 6586 (-0.15%); split: -0.18%, +0.03%
VALU: 239617 -> 240392 (+0.32%); split: -0.01%, +0.33%
SALU: 45349 -> 45363 (+0.03%)
VMEM: 15762 -> 15780 (+0.11%)
VOPD: 258 -> 262 (+1.55%)

fossil-db (navi21):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 345634 -> 346426 (+0.23%); split: -0.00%, +0.23%
CodeSize: 1895616 -> 1900156 (+0.24%); split: -0.00%, +0.24%
Latency: 3043334 -> 3046859 (+0.12%); split: -0.01%, +0.13%
InvThroughput: 928236 -> 929626 (+0.15%); split: -0.01%, +0.16%
VClause: 7894 -> 7905 (+0.14%); split: -0.06%, +0.20%
Copies: 48694 -> 48785 (+0.19%); split: -0.03%, +0.22%
PreVGPRs: 6580 -> 6570 (-0.15%); split: -0.18%, +0.03%
VALU: 228323 -> 229072 (+0.33%); split: -0.01%, +0.33%
SALU: 47202 -> 47216 (+0.03%)
VMEM: 16546 -> 16564 (+0.11%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>
2025-12-12 13:58:42 +00:00
Hyunjun Ko
c50474ac6f anv/video: clean up VP9 picture state setup
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38904>
2025-12-12 13:37:44 +00:00
Hyunjun Ko
2fe09217a1 anv/video: fix VP9 chroma subsampling format detection
Fixes: 314de7af ("anv: Initial support for VP9 decoding")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38904>
2025-12-12 13:37:44 +00:00
Boris Brezillon
c0d982751c panvk: Use WB mappings for the global RW and executable memory pools
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This implies relying on all users of these pools to do the flushing
explicitly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
2dd27c647b panvk: Use WB maps for command buffer memory
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
f860c7bdf1 panvk: Use write-back maps for descriptor sets
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
e84f804a6d panvk: Add a write_desc_data() helper
This centralizes things so that we only ever write to the descriptor
buffer in write_desc_data(). get_desc_slot_ptr() now returns a const
void * so we don't write to it.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
3b711d687b panvk: Map our standalone private BOs writeback when it makes sense
We can used CPU cached mappings for our private BOs being updated by
the CPU. We make the printf BO an exception to avoid having to
invalidate it every time we check the queue status.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
5095e125c5 panvk: Add various flush/invalidate helpers for internal BOs
Those will be used as we progressively transition some of our
internal buffers to writeback CPU mappings.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
f60d2aa545 panvk: Force a cacheline alignment when allocating objects from WB shared pools
When allocating individual objects from a shared pool, we don't want
objects to share cachelines, otherwise cache maintenance operations on
individual objects might corrupt other objects.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
1c7793ea0b panvk: Advertise a HOST_CACHED memory type if we have WC maps
If the GPU is IO coherent, we expose one memory type that's both
host-coherent and host-cached. Otherwise we expose one type that's
host-uncached and host-coherent, and one that's host-cached and
host-noncoherent.

By default, we advertise <cached,non-coherent> before
<non-cached,coherent> because that's the combination providing the
best perfs in situations where the user knows how to deal with the
non-coherent nature of the GPU.

Unfortunately, the CTS has a few bugs (missing or incorrect flush/inval
calls) forcing us to re-order things. We might drop the flag at some
point (some fixes have been submitted, others are on their way).

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
2afef24d3f panvk: Base memoryTypeBits on phys_dev->type_count
Stop hard-coding 1 and just advertise everything on the physical device.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
ba293b1e49 panvk: Store the memory heaps/types in the physical device
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
c7ca8950f2 panvk: Sync CPU maps around host image copies
This is a little annoying.  We probably don't want to call into the
kernel once for every Z slice or array layer we touch.  But at the same
time if we can flush from userspace we don't want to flush/invalidate
more than necessary.  So we have two sets of flushes, a more precise one
which we do based for userspace flushing and a coarse-grained one for
kernel flushing.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
a32eb87a5d panvk: Implement Flush/InvalidateMappedMemoryRanges()
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
1e6ea0697a panvk: Flush pending map syncs before submission
Flush deferred CPU sync ops so we can make CPU changes visible to the GPU.
This is currently a NOP because we haven't enabled cached mappings in
panvk yet, but we need to prepare for that before we progressively
switch each relevant buffer to use writeback CPU mappings.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
3ae96f3cfd panvk: Add a debug flag to force CPU map syncs through the kernel
Useful for debugging.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
4bee7f0003 panvk: Add a debug flag to force CPU-uncached mappings
Useful to debug stuff.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
a670956b7a panvk: Mask off BO_FLAG_WB_MMAP in adjust_bo_flags()
This makes it easier to say we want WB maps various places.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
aebd71cc8d panvk: Rely on supported_bo_flags to mask PAN_KMOD_BO_FLAG_GPU_UNCACHED
Now that we have it hooked up at the props level, we can filter
this flag out in panvk_device_adjust_bo_flags() and use this helper
when creating our uncached mempool.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
76bb8e1a39 panvk: Add a panvk_priv_mem_check_alloc() helper and use it
Stop checking allocation success with panvk_priv_mem_{dev,host}_addr().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
c9e94f92a0 panvk: Don't allocate memory for a buffer descriptor in CreateBufferView()
The buffer descriptor is copied to the descriptor set, and there's no
side-band data to allocate in GPU memory.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
b5e47ba894 pan/kmod: Add new helpers to sync BO CPU mappings
pan_kmod_flush_bo_map_syncs() queues CPU-sync operations, and
pan_kmod_flush_bo_map_syncs_locked() ensures all queued
operations are flushed/executed. Those will be used when we start
adding support for CPU-cached mappings.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
af14c37bf1 pan/kmod: Enforce PAN_KMOD_BO_FLAG_NO_MMAP
Fail early in pan_kmod_bo_mmap() if PAN_KMOD_BO_FLAG_NO_MMAP is set.
This saves us a user -> kernel round-trip, but most importantly, it
allows us to enforce NO_MMAP at the userspace level on BOs that the
kernel would otherwise accept to mmap() (mapping of imported BOs
requires extra DMA_BUF_IOCTL_SYNC calls we don't have).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
0f4f556229 pan/kmod: Expose the IO coherency property
Will be used to skip cache maintenance operations when the GPU is IO
coherent.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
cd8b8baf6e pan/kmod: Expose the BO flags supported by a pan_kmod_device
Will be needed to let the frontend know if it can use cached CPU-mappings,
and it allows us to extend the set of supported flags without introducing
a new field if we ever have to.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Faith Ekstrand
7e65322c93 pan/kmod: Add a panfrost_kmod_driver_version_at_least() helper
We have a few hand-rolled instances of this which work well enough but it
gets more complicated as soon as we care about checking a major version
more than 1.  Add a helper to make this more robust.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:41 +01:00
Boris Brezillon
ee172bb769 pan/kmod: Cache the device props at the pan_kmod_dev level
The frontend is going to query the device props anyway, so let's just
query it at device creation time and store it in pan_kmod_dev::props.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:15:30 +01:00
Faith Ekstrand
f43cff3728 util: Move STACK_ARRAY into util
It's useful for more than just Vulkan.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
2025-12-12 10:03:02 +01:00
Georg Lehmann
0fe8250bf4 radv: optimize known front_face_fsign too
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Foz-DB Navi21:
Totals from 1941 (1.99% of 97581) affected shaders:
MaxWaves: 44196 -> 44612 (+0.94%); split: +0.97%, -0.03%
Instrs: 1553182 -> 1548823 (-0.28%); split: -0.36%, +0.08%
CodeSize: 8261308 -> 8261496 (+0.00%); split: -0.17%, +0.18%
VGPRs: 98488 -> 97968 (-0.53%); split: -0.56%, +0.03%
SpillSGPRs: 1288 -> 1347 (+4.58%)
Latency: 19136399 -> 19094748 (-0.22%); split: -0.38%, +0.16%
InvThroughput: 5424693 -> 5409469 (-0.28%); split: -0.32%, +0.04%
VClause: 29941 -> 29943 (+0.01%); split: -0.26%, +0.27%
SClause: 39922 -> 39972 (+0.13%); split: -1.02%, +1.14%
Copies: 109736 -> 109684 (-0.05%); split: -1.45%, +1.40%
Branches: 24523 -> 24499 (-0.10%); split: -0.12%, +0.02%
PreSGPRs: 99206 -> 99191 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 79019 -> 78240 (-0.99%); split: -1.00%, +0.02%
VALU: 1145088 -> 1140731 (-0.38%); split: -0.44%, +0.06%
SALU: 164035 -> 164077 (+0.03%); split: -0.48%, +0.51%
SMEM: 80668 -> 80658 (-0.01%)

We used to call this pass before front_face_fsign is created
but that has changed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38906>
2025-12-12 08:24:38 +00:00
Marek Olšák
9bd2c6dcb2 ac/nir: allow smaller workgroups for GS
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's not good for performance, but it's possible to use for debugging.
Running single-wave GS workgroups could work around any LDS race conditions.

Setting the workgroup size to 64 reliably works around
GLCTS *primitive_counter*line failures, indicating streamout data
corruption with multi-wave GS workgroups.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38328>
2025-12-12 04:27:32 +00:00
stefan11111
ff8df8712e gallium/frontends/dri: Don't force dri cursor buffers to be 64x64
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: stefan11111 <stefan11111@shitposting.expert>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38841>
2025-12-12 02:39:41 +00:00
Qiang Yu
3f37740762 ac/llvm: workaround legacy fma intrinsic crash on gfx12
This is a llvm bug:
  https://github.com/llvm/llvm-project/issues/170437

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14359
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38884>
2025-12-12 02:08:41 +00:00
Rob Clark
cb9d9b8a6e asahi: Set prefer_real_buffer_in_constbuf0
This avoids u_upload_data_ref() when cb0 is bound.  The u_upload_*_ref()
paths are still problematic to mix with uploaders that the front-end
uses with explicitly managed releasebufs, but this at least side-steps
the issue, and is a legit fix on it's own.

Cc: mesa-stable
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
2025-12-12 00:55:55 +00:00
Rob Clark
51605bfac2 gallium: Make upload_cb0 return a releasebuf
pipe_upload_constant_buffer0() was immediately releasing the
u_upload_alloc() releasebuf.  But it is used in various call-
paths where the release needs to be deferred further.

Fixes crashes in firefox for any driver that uses the same
u_upload_mgr instance for pipe->const_uploader and
pipe->stream_uploader.

Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14309
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
2025-12-12 00:55:55 +00:00
Rob Clark
b2daaaec81 gallium/aux: Add debug option to force u_upload rollover
Triggering the rollover where the old upload buffer is released is a
good way to catch bugs with a releasebuf being dropped too soon (ie.
while the frontend still needs a reference).

This makes it easy to reproduce firefox crashes in any driver where
pipe->const_uploader == pipe->stream_uploader.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
2025-12-12 00:55:55 +00:00
Kenneth Graunke
bdacab49c7 brw: Use LSC extended descriptor offsets for Xe2 URB messages
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
URB messages on Xe2 are LSC messages with FLAT addressing.  We can
specify a S19 immediate offset in the extended message descriptor,
which should be more than adequate to hold any offsets we need.

We wrote the original URB code before implementing that, and never
doubled back to take advantage of it.  But doing so can drop ADDs
near every URB access.

fossil-db results on Battlemage:

   Totals:
   Instrs: 232239759 -> 231432254 (-0.35%)
   Cycle count: 34044435848.0 -> 34055507100.0 (+0.03%); split: -0.00%, +0.04%
   Spill count: 520370 -> 520362 (-0.00%); split: -0.00%, +0.00%
   Fill count: 470790 -> 470803 (+0.00%); split: -0.00%, +0.00%
   Max live registers: 72111853 -> 72111369 (-0.00%); split: -0.00%, +0.00%

   Totals from 227920 (28.89% of 788851) affected shaders:
   Instrs: 59841897 -> 59034392 (-1.35%)
   Cycle count: 683385208.0 -> 694456460.0 (+1.62%); split: -0.14%, +1.76%
   Spill count: 17278 -> 17270 (-0.05%); split: -0.10%, +0.06%
   Fill count: 17481 -> 17494 (+0.07%); split: -0.03%, +0.10%
   Max live registers: 23052652 -> 23052168 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
2025-12-12 00:12:03 +00:00
Kenneth Graunke
9482d392a1 brw: Fix outdated comments about urb->offset units
I recently converted urb->offset to be in bytes on Xe2, but neglected to
update these comments that still said OWord.

Fixes: 9ffae42975 ("brw: Store brw_urb_inst::offset in bytes on Xe2")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
2025-12-12 00:12:03 +00:00
Alejandro Piñeiro
bb06af3f9b panfrost/job: avoid shadowing variable name
Without this commit, panfrost_batch_update_access receives a parameter
called "batch", and then it uses the same name while iterating for all
batches on the current context. This can be confusing and error-prone,
so let's rename the latter.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38908>
2025-12-11 23:44:20 +00:00
Lionel Landwerlin
fecb9e0952 anv: switch shader heap placement to beginning of heap by default
It seems placing the shader at the end has a negative performance
impact.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8ba197c9ef ("anv: Switch shaders to dedicated VMA allocator")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38900>
2025-12-11 23:20:06 +00:00
Iván Briano
094f8f041f anv: enable fragmentShadingRateWithShaderSampleMask on Xe2+
Before DG2, the value the HW gives us seems to be backwards, but
since DG2 this is supposed to be supported just fine.
However, due to Wa_22012766191, enable it only for Xe2 and up.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
2025-12-11 22:50:10 +00:00
Iván Briano
df15770785 anv: coarse_pixel doesn't require any InputCoverageMaskState
The UNUSED is to avoid warnings on the gen9 variant.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
2025-12-11 22:50:10 +00:00
Iván Briano
978d4b2a99 anv: maxFragmentShadingRateCoverageSamples is 16 on all platforms
But before ACM, we need to mis-report it to keep the CTS sane, as the
implementation of coarse pixel seems to have all sorts of wrongs in
older HW.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
2025-12-11 22:50:10 +00:00
Iván Briano
a7280ab590 nir: add nir_lower_single_sampled::lower_sample_mask_in option
GLSL defines gl_SampleMaskIn as :
   "a fragment language that indicates the set of samples covered
    by the primitive generating the fragment during multisample
    rasterization"

when variable rate shading is enabled, a single invocation might cover
multiple samples. The lowering done in nir_lower_single_sampled() does
not account for that case, so add an option to selectively disable it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
2025-12-11 22:50:10 +00:00
Iván Briano
ef31f07077 nir: clear SAMPLE_MASK_IN if we lowered it
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
2025-12-11 22:50:10 +00:00