Commit graph

207861 commits

Author SHA1 Message Date
Karol Herbst
b3c245ecf2 clc: add support for cl_ext_image_unorm_int_2_101010
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35469>
2025-06-30 18:04:59 +00:00
Alyssa Rosenzweig
7fd7b18b38 nir: rename AGX geom/tess intrinsics
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
to the new common code name.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
d13b321201 nir/lower_gs_intrinsics: drop stuff added for AGX
AGX now vendors a significantly different version of this pass, so the common
one doesn't need the stuff added for AGX.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
debd619555 hk: optimize point size writes with GS/TS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:07 +00:00
Alyssa Rosenzweig
279b7c028b hk: advertise more GS features
new impl handles this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:07 +00:00
Alyssa Rosenzweig
1f7fe678e7 asahi,hk: significantly rework GS
get rid of the rasterizer discard variants, by pushing XFB into the hardware VS
and letting everything cascade down from there. that then means hardware VS runs
for all streams, which means we get dynamic rasterization stream selection.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:07 +00:00
Alyssa Rosenzweig
03a5b7f25c asahi: flush around XFB
this is required by the spec. fixes
gles-3.0-transform-feedback-uniform-buffer-object.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Backport-to: 25.1
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:07 +00:00
Alyssa Rosenzweig
933b50b806 asahi: clang-format
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig
700ee568e0 libagx: add agx_vdm_barrier
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig
16b53d356a nir: add rasterization_stream sysval
for plumbing transformFeedbackRasterizationStreamSelect (in turn for exercising
more CTS and proving out my design).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig
805ef6cc17 nir: add intrinsics for geometry shader lowering
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:05 +00:00
Alyssa Rosenzweig
4f7cae5e61 nir/opt_algebraic: add trichotomy identity
In https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802 we will
significantly rework geometry shaders & transform feedback. In the new approach,
transform feedback is executed as part of the hardware vertex shader, meaning
the vertex shader needs to write out all the "copies" of the same value into
different parts of the XFB buffer. In the general case of a GS writing triangle
strips, we get 0-3 copies. This is good and lets us parallelize XFB better with
GS.

In the case of a VS alone with XFB, we insert a passthrough GS. In that case
special case, we can only get at most 1 copy, so if we can prove the length of
the output strip is 3 we can delete 2/3 of the shader.

Anyway, the only thing preventing NIR from doing that optimization is failing to
see through some conditionals, fixed by optimizing with the law of trichotomy.

We could add other variants of this pattern (signed vs unsigned, iand vs
ior/ixor) if we expect anything else to hit this other than my boutique use
case.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:04 +00:00
Yiwei Zhang
153857fb64 vulkan/wsi: amend barriers for blit dst buffer going to foreign queue
The prime blit dst buffer can be backed by external memory, no matter
host ptr shm or dma-buf export alloc. Whether the external path is taken
is only decided upon blit ctx creation time, so we have to track whether
external in the wsi_image. When the external path is taken, we have to
explicitly handle queue family ownership transfer from internal to
foreign. To be noted, no explicit foreign to internal ownership transfer
is needed since the blit dst content can be left undefined.

Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35034>
2025-06-30 16:03:26 +00:00
Yiwei Zhang
f6bf5d9a3e vulkan/wsi: amend barriers for blit dst buffer
For non-external blit dst buffer, it's bliting the wsi image to a buffer
with mapping populated by vkMapMemory, and it's shared via xcb_put_image
for x11 or memcpy into a shared wl_buffer backed by shm. So we need
additional host stage and host access bit to ensure proper cache flush.
There's no queue family ownership transfer needed.

Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35034>
2025-06-30 16:03:24 +00:00
Jesse Natalie
7f4ae75903 dzn: Roll up initialization failure in dzn_meta_init
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13416
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35748>
2025-06-30 15:13:54 +00:00
Karol Herbst
4b43780bbd rusticl/queue: fix wrong_self_convention and needless_borrow clippy warnings
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35834>
2025-06-30 14:17:33 +00:00
Mike Blumenkrantz
9482ad212d tc: don't reuse first rp info on batch if there is work pending
this otherwise causes a mismatch if drivers try to access previous info
during a set_framebuffer_state call (e.g., to flush clears)

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35777>
2025-06-30 13:51:42 +00:00
Sergi Blanch-Torne
f0c58518cc ci: reduce the root .gitlab-ci file
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Following the idea to distribute in a tree of files to include and split
between the files with or without hidden job definitions, some jobs in the
root file can be moved to files made specific to describe build or test jobs.

Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:30 +00:00
Valentine Burley
6508c6038f ci: Use placeholder-job for mr-label-maker-test
This is a quick job using the .fdo.ci-fairy image.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:30 +00:00
Sergi Blanch-Torne
396074c59d ci: split hidden job definition for bare-metal and ci-tron
The job definitions for lava-related jobs are encapsulated in a directory,
while the other two farm managers were in the generic test directory. Having a
directory for ci-tron places it side by side with other farm managers. For
bare-metal, it has another advantage, as this encapsulates elements related to
this farm manager in a single place.

To maintain simplicity and consistency in file naming, the gitlab-ci file in
the lava directory is renamed as it has a prefix that corresponds with the
directory hosting it. The other farm equivalent files don't include this
duplication as a prefix, and there isn't such a prefix in any other case of
the CI.

Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Acked-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:30 +00:00
Sergi Blanch-Torne
d5c63dd292 ci: split long containers build yaml
The yaml file for the definitions for container build on different systems can
have a split between systems before split between hidden and build jobs.

Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Acked-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:29 +00:00
Sergi Blanch-Torne
6ba1b61395 ci: separate hidden jobs to -inc yml files
Almost all the gitlab-ci.yml files in Mesa have their hidden jobs defined in
an include file. This may have started with !25238 with the idea to simplify
the re-use of hidden jobs by other projects. But we missed the .gitlab-ci
directory.

Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Acked-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:29 +00:00
Eric Engestrom
ff6925c34b ci/build: use !reference to build scripts instead of yaml anchors
Needed for the next commit

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
2025-06-30 12:32:29 +00:00
Robert Mader
a166d7609f gles: Add support for 10/12/16 bit SW decoder YCbCr formats
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Co-Authored-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>
2025-06-30 11:56:23 +00:00
Robert Mader
76095b2cb0 mesa/formats: Add support for 10 and 12 bit SW decoder YCbCr formats
Matching e.g. I420_10LE in Gstreamer / yuv420p10 in ffmpeg. The formats
are notably used for HDR10 videos by software decoders like dav1d, libav,
libaom and libvpx.

Use-cases include video players and editors that can allocate DMA buffers
- e.g. via udmabuf, dma-heaps, VA-API, V4L2, etc. - allowing them to avoid
unnecessary copies. Testing HDR10 playback on CI might also become easier.

Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>
2025-06-30 11:56:23 +00:00
Robert Mader
df09c3b0e7 drm-uapi: update drm_fourcc.h to latest version
Taken from commit e252e3f3488 of the drm-misc-next kernel tree,
including the new 10/12/16bit software decoder YCbCr formats.

These have also been pulled into the libdrm 2.4.125 release.

Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>
2025-06-30 11:56:23 +00:00
David Rosca
25a2c9116c radeonsi/vcn: Stop forcing OBU frame for first frame on VCN4
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This has effect only for first frame, for every other frame it will
be overwritten by application preference, which we should honor.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35712>
2025-06-30 11:17:55 +00:00
Rhys Perry
7b291a33d4 nir/search: fix dumping of conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Rhys Perry
08859cbe50 nir/lower_bit_size: fix bitz/bitnz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 6585209cdd ("nir/lower_bit_size: mask bitz/bitnz src1 like shifts")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Valentine Burley
9a25d6c00f ci/android: Add link to Android CTS results
The Android CTS results are saved as an HTML page in the CI artifacts.
Add a direct link to make them easier to access.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35792>
2025-06-30 10:15:15 +00:00
Valentine Burley
f56be74be6 ci/android: Add separate log section for Android CTS
The Android CTS logs were previously buried under the `cuttlefish: setup`
section. Add an uncollapsed section to improve visibility.

Also remove the `gathering the results` section, which was empty in the logs.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35792>
2025-06-30 10:15:15 +00:00
Valentine Burley
6203846120 ci/android: Move sourcing setup-test-env.sh before set -uex
This prevents printing the various aliases and exports, and cleans up the
job logs.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35792>
2025-06-30 10:15:15 +00:00
Christoph Pillmayer
a611ab7895 panvk: Propagate occ query state from secondary to primary
When we ExecuteCommands, we can have the following situation:

secondary {
   beginQuery
   endQuery
}

primary {
   beginRendering
   executeCommands(secondary)
   endRendering
}

See dEQP-VK.query_pool.concurrent_queries.secondary_command_buffer.

For the logic in EndRendering to correctly detect that an occlusion
query has ended inside this render pass, the oq state has to be
propagated back to the primary from the secondary.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35738>
2025-06-30 09:51:52 +00:00
Christoph Pillmayer
5cf456a028 panvk: Handle occlusion queries + multiview
If an occlusion query is started inside a renderpass with multiview, we
need to signal as many queries as there are bits set in the view mask.

It is enough to signal them, we don't have to change how we write the
occlusion query. All but the first of the n queries will return 0.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35738>
2025-06-30 09:51:52 +00:00
Alejandro Piñeiro
0f830c5572 v3d/compiler: properly handle the RA debug option
RA is intended to dump the vir when register allocation fails, so it
should be checked when we set c->compilation_result to
FAILED_REGISTER_ALLOCATION.

But it seems that this option was forgotten when on some of the
refactorings around compilation_result, as was let on an old condition
that reported register allocation failures.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35823>
2025-06-30 09:36:04 +00:00
Corentin Noël
c5b42a2ae4 virgl: Add more Gallium formats to the list
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Include the recently added formats.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35741>
2025-06-30 09:04:43 +00:00
Samuel Pitoiset
68708cd4da radv/ci: uprev kernel to 6.15.3
NAVI21/NAVI31 still uses 6.6 for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35765>
2025-06-30 08:10:47 +02:00
Lionel Landwerlin
89f3ee4cb2 brw: remove debug printf
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: fcf4401824 ("brw: handle wa_18019110168 with independent shader compilation")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35815>
2025-06-29 12:39:03 +03:00
Yiwei Zhang
3f1d40d230 venus: wsi workaround for gamescope
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Gamescope relies on legacy scanout support when explicit modifier isn't
available and it chains the mesa wsi hint requesting such. Venus doesn't
support legacy scanout with optimal tiling on its own, so venus disables
legacy scanout in favor of prime buffer blit for optimal performance. As
a workaround here, venus can once again force linear tiling when legacy
scanout is requested outside of common wsi.

Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35811>
2025-06-28 17:52:40 -07:00
Yiwei Zhang
c5d1392a10 venus: share code for AHB image subres query
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35810>
2025-06-28 23:16:58 +00:00
Yiwei Zhang
d45b76c3cc venus: drop tiling_override tracking
It used to serve 2 purposes:
1. aliased swapchain image
2. faked legacy scanout support

...and now can be cleaned up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35810>
2025-06-28 23:16:58 +00:00
Yiwei Zhang
a96d7cdbc0 venus: drop drm_format_modifier tracking
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35810>
2025-06-28 23:16:58 +00:00
Yiwei Zhang
0db7818b87 venus: use wsi_common_create_swapchain_image
We no longer need to do our own since the chain has the info stored for
the common helper.

Test:
- dEQP-VK.wsi.*.swapchain.create.*
- dEQP-VK.wsi.*.maintenance1.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35810>
2025-06-28 23:16:57 +00:00
Calder Young
646977348b anv: Fix typo when checking format's extended usage flag
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: f4c1753c1a ("anv: report color/storage features on YCbCr images with EXTENDED_USAGE")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35703>
2025-06-28 20:39:18 +00:00
Yiwei Zhang
cb54338f65 venus: fix msaa state sample location info sanitization
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The additional reference has corrupted the below D/S state instead of
properly ending the msaa state pnext chain.

Fixes: ff64092ff3 ("venus: support VK_EXT_sample_locations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35808>
2025-06-28 20:25:23 +00:00
Mel Henning
8795006994 nir/opt_uniform_subgroup: Handle vote_feq
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Brings the vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_dvec4_vertex
from 234 to 169 instructions on NAK.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
70fccc59fc nir/opt_uniform_subgroup: Handle vote_ieq
No shader-db changes here, but it does improve some cts shaders, eg. the
vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_i64vec4_vertex
goes from 80 to 56 instructions with NAK

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
ff0a69efd7 nak: Call nir_opt_uniform_subgroup
Totals:
CodeSize: 4534711024 -> 4436632608 (-2.16%); split: -2.16%, +0.00%
Number of GPRs: 10963687 -> 10964035 (+0.00%); split: -0.00%, +0.00%
Static cycle count: 1097514100 -> 1057410000 (-3.65%)
Spills to reg: 242139 -> 242168 (+0.01%); split: -0.01%, +0.02%
Fills from reg: 222030 -> 222061 (+0.01%); split: -0.00%, +0.01%
Max warps/SM: 7245564 -> 7245484 (-0.00%); split: +0.00%, -0.00%

Totals from 52240 (26.58% of 196502) affected shaders:
CodeSize: 1982406064 -> 1884327648 (-4.95%); split: -4.95%, +0.00%
Number of GPRs: 3861735 -> 3862083 (+0.01%); split: -0.00%, +0.01%
Static cycle count: 484309374 -> 444205274 (-8.28%)
Spills to reg: 24327 -> 24356 (+0.12%); split: -0.07%, +0.18%
Fills from reg: 12675 -> 12706 (+0.24%); split: -0.02%, +0.26%
Max warps/SM: 1417372 -> 1417292 (-0.01%); split: +0.00%, -0.01%

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
694523e2b9 nak: Implement nir_intrinsic_vote_ieq with OpMatch
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
10acb44c64 nir: Split lower_vote_eq into int/float versions
Recent nvidia hardware has a native instruction for
nir_intrinsic_vote_ieq but not for nir_intrinsic_vote_feq. So, split
this boolean into two so we can contol the lowering separately for each
instruction.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00