Commit graph

201327 commits

Author SHA1 Message Date
Faith Ekstrand
c95b646e23 vulkan/queue: Use _mem_signal_temp instead of signal_mem_sync
The two checks should be equivalent.  This just lets us use data in
struct vk_queue_submit rather than a local boolean.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25576>
2024-10-03 22:11:39 +00:00
Faith Ekstrand
267b7f1deb vulkan/queue: Move has_binary_permanent_semaphore_wait into the submit struct
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25576>
2024-10-03 22:11:39 +00:00
Faith Ekstrand
9b21dc06c4 vulkan/queue: Don't use vk_semaphore in threaded payload stealing
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25576>
2024-10-03 22:11:39 +00:00
Danylo Piliaiev
e5d3eba096 u_trace: Fix trace_payload_as_extra_func desync between drivers
Buffer with indirect args wasn't passed to the function which
adds extra event args. Since function definition depends on the
common code, the definition is moved to a single place.

Fixes: 0a17035b5c ("u_trace: add support for indirect data")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31090>
2024-10-03 20:25:48 +00:00
Nanley Chery
26692deefc anv: Delete stale comment for BLORP clear color addr
It looks like this comment attempted to describe all the reasons we need
to pass the clear color address to BLORP. This comment actually isn't
exhaustive and some bits are out of date (e.g., BLORP no longer updates
the clear color address for us). Let's just delete it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Nanley Chery
10bcfb63d5 anv: Prevent clear color modifier corruption with views
If a dmabuf is shared with a clear color, the raw clear color channels
generally won't be interpreted correctly during format reinterpretation.
So, prevent Vulkan apps from trying to use such dmabufs as mutable
format render targets. Also, prevent such apps from using such dmabufs
as blorp_copy() destinations if doing so would require format
reinterpretation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Nanley Chery
edfb33efdd intel/blorp: Use original surface format for some copies
In iris, this should avoid some partial resolves when copying between
images. In anv, this will reduce restrictions on dmabufs which have
clear color support in the next patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Nanley Chery
73637dbce4 intel/blorp: Choose some copy formats independently
blorp_copy_get_formats() tries to make the source and destination view
formats match as much as possible. This avoids some casting in the copy
shader, but it makes determining the format that will be used for a
surface impossible without having the ISL surface for both that surface
and a source or destination.

We'd like to enable the Vulkan driver to know as early as possible what
format an image may be reinterpreted as for correctness. So, determine
the copy formats more independently and expose a helper which does so
for drivers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Nanley Chery
6721064939 anv: Use image formats when copying to/from buffers
blorp_copy() will sometimes use a complex shader if the source and
destination surface formats differ. For example, it will do this when
both formats support CCS_E, but have differing numbers of
bits-per-channel.

To reduce the chance of using this complex shader during transfers
between images and buffers, ensure the same format is used. We can't
completely prevent the complex shader because a copy may happen between
surface formats that have a different number of bits-per-pixel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Mike Blumenkrantz
f7b5faa1a2 zink: block srgb with winsys imports
these are already a set format

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Mike Blumenkrantz
f3c206d61e zink: fix external_only reporting for dmabuf formats
this is based on format features

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Mike Blumenkrantz
49950d3b2f zink: clamp out dmabuf exports from optimal tiling images
this is an impossible combo since optimal tiling cannot return stride

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Mike Blumenkrantz
3c44886d9e zink: also init format props when getting modifier props
forgot this in earlier refactor

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Mike Blumenkrantz
efeb65cfe8 zink: assert images aren't created with dmabuf export and optimal tiling
this is illegal

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Mike Blumenkrantz
2fdba5b914 zink: block dmabuf fallback into optimal tiling
when modifiers are specified the frontend needs stride, and optimal
tiling cannot provide a stride

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31467>
2024-10-03 19:06:02 +00:00
Faith Ekstrand
15fb18063b nvk: Fix a comment in SET_VIEWPORT_CLIP_CONTROL
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31514>
2024-10-03 18:34:55 +00:00
Rhys Perry
96e7cd89ea aco: fix is_vector_intact for GFX11 BVH
fossil-db (navi31):
Totals from 44 (0.06% of 79395) affected shaders:
Instrs: 1539111 -> 1539109 (-0.00%); split: -0.00%, +0.00%
CodeSize: 7880452 -> 7880380 (-0.00%); split: -0.00%, +0.00%
Latency: 7578794 -> 7578844 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 1450872 -> 1450876 (+0.00%); split: -0.00%, +0.00%
VClause: 40014 -> 40010 (-0.01%)
Copies: 116005 -> 116001 (-0.00%); split: -0.01%, +0.01%
VALU: 854630 -> 854626 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346>
2024-10-03 17:55:56 +00:00
Rhys Perry
24c60be1ad aco: create vector affinities for phi operands
fossil-db (navi21):
Totals from 2934 (3.70% of 79395) affected shaders:
Instrs: 8368484 -> 8365630 (-0.03%); split: -0.05%, +0.01%
CodeSize: 46032152 -> 45998480 (-0.07%); split: -0.09%, +0.01%
VGPRs: 200360 -> 200280 (-0.04%); split: -0.12%, +0.08%
Latency: 85556147 -> 85562615 (+0.01%); split: -0.09%, +0.10%
InvThroughput: 19066462 -> 19065173 (-0.01%); split: -0.09%, +0.09%
VClause: 209834 -> 209783 (-0.02%); split: -0.14%, +0.12%
SClause: 261811 -> 261826 (+0.01%); split: -0.00%, +0.01%
Copies: 727502 -> 724394 (-0.43%); split: -0.56%, +0.13%
Branches: 291083 -> 291120 (+0.01%); split: -0.01%, +0.03%
VALU: 5564021 -> 5560975 (-0.05%); split: -0.07%, +0.02%
SALU: 1100996 -> 1100942 (-0.00%); split: -0.02%, +0.02%

fossil-db (navi31):
Totals from 34207 (43.08% of 79395) affected shaders:
MaxWaves: 1036893 -> 1036781 (-0.01%); split: +0.01%, -0.02%
Instrs: 21977229 -> 21884600 (-0.42%); split: -0.47%, +0.05%
CodeSize: 112680884 -> 112298404 (-0.34%); split: -0.38%, +0.04%
VGPRs: 1590832 -> 1615912 (+1.58%); split: -0.25%, +1.83%
Latency: 142542601 -> 142670271 (+0.09%); split: -0.12%, +0.21%
InvThroughput: 19481055 -> 19434110 (-0.24%); split: -0.44%, +0.20%
VClause: 462865 -> 462558 (-0.07%); split: -0.20%, +0.13%
SClause: 619822 -> 619685 (-0.02%); split: -0.02%, +0.00%
Copies: 1704870 -> 1610889 (-5.51%); split: -5.89%, +0.38%
Branches: 518238 -> 518241 (+0.00%); split: -0.01%, +0.01%
VALU: 12230157 -> 12136112 (-0.77%); split: -0.82%, +0.05%
SALU: 2444075 -> 2444099 (+0.00%); split: -0.01%, +0.01%
VOPD: 3443 -> 3476 (+0.96%); split: +1.80%, -0.84%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11186
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346>
2024-10-03 17:55:56 +00:00
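The percentages in the fossil-db totals above appear to be plain relative changes, (after - before) / before. A minimal Python sketch, assuming that formula and two-decimal rounding (the helper name `relative_change` is hypothetical and not part of fossil-db), recomputes a few of the navi31 figures from the numbers quoted in the message:

```python
def relative_change(before: int, after: int) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the navi31 totals in the commit message.
navi31 = {
    "Instrs": (21977229, 21884600),
    "VGPRs": (1590832, 1615912),
    "Copies": (1704870, 1610889),
}

for name, (before, after) in navi31.items():
    change = relative_change(before, after)
    print(f"{name}: {before} -> {after} ({change:+.2f}%)")
```

Under these assumptions the recomputed values line up with the reported -0.42% (Instrs), +1.58% (VGPRs) and -5.51% (Copies).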
Rhys Perry
1e60509135 aco: stop using instructions in ra_ctx::vectors
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346>
2024-10-03 17:55:56 +00:00
Eric Engestrom
87c9690d8d docs: add sha sum for 24.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31511>
2024-10-03 17:47:52 +00:00
Eric Engestrom
7791afe7d7 docs: update calendar for 24.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31511>
2024-10-03 17:47:52 +00:00
Eric Engestrom
34d02b9191 docs: add release notes for 24.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31511>
2024-10-03 17:47:52 +00:00
David Rosca
be9b9c5fc4 pipe: Remove video get_*_fence
Replaced now with fence_wait.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
206bd951b4 frontends/va: Use fence_wait instead of get_*_fence
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
a98aa21873 r600/uvd: Implement fence_wait
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
a511dc1dda d3d12: Implement fence_wait
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
9080741d8f radeonsi/vpe: Implement fence_wait
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
02bcdc7648 radeonsi/vcn: Implement fence_wait
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
135f780e57 radeonsi/uvd: Implement fence_wait
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
David Rosca
3cec5a84ca pipe: Add video fence_wait
This will be used to replace get_*_fence functions.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31442>
2024-10-03 17:04:26 +00:00
Tapani Pälli
ac00d97e31 anv: use mi_builder in CmdBeginTransformFeedbackEXT
Patch converts MI_LOAD_REGISTER_MEM, MI_LOAD_REGISTER_IMM to use
mi_builder in CmdBeginTransformFeedbackEXT.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31502>
2024-10-03 16:20:40 +00:00
Friedrich Vock
64c406774f radv/rt: Skip all AABB code when no_skip_aabbs is not set
This avoids having to execute the load_global just to throw the results
away and ignore the node.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31443>
2024-10-03 15:22:08 +00:00
Mike Blumenkrantz
b43a639fe7 zink: block all 2d view creation with sparse
this is illegal (for now)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31489>
2024-10-03 14:45:33 +00:00
Samuel Pitoiset
5ab8caf5e2 zink/ci: update expected list of failures on NAVI31
This one seems fixed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31506>
2024-10-03 14:11:03 +00:00
Erico Nunes
6659b299e6 v3dv: match render and display device for wsi present
Since the last changes for EGL_EXT_device_drm_render_node, the
can_present_on_device callback may now receive the render device.
With that, the v3dv implementation may report that this device cannot
be used for presentation.
In particular, this callback is used for x11 wsi, and when running through
XWayland it now receives the render device. On x11 wsi, this makes the
swapchain operate in blit mode. Blit mode introduces additional
unneeded overhead on wsi and runs through a different path which
currently causes rendering issues (in particular also with Zink).

Allowing both devices to match in the callback makes all wsi operate
in native mode again and fixes the issues above.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31490>
2024-10-03 13:42:23 +00:00
Tatsuyuki Ishi
3b57a35ece radv: Enable descriptorBufferCaptureReplay.
The descriptors should be deterministic as long as the memory address they
are assigned to is the same. Enable it by just advertising the feature and
putting a dummy capture replay data requirement of 1 (0 is not permitted).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19952>
2024-10-03 13:06:07 +00:00
Boris Brezillon
fe6e96d685 panfrost: Move pan_blitter.{c,h} to the gallium driver
Move pan_blitter.{c,h} to the gallium driver and rename it
pan_fb_preload to reflect the fact it's not a generic blitter framework.

While at it, get rid of the remaining generic blitting bits and pick
better names for objects related to the preload stuff in
panfrost_{device,screen}.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:35 +00:00
Boris Brezillon
0bc3502ca3 panvk: Implement a custom FB preload logic
This has several advantages over using pan_blitter for that:

- we can catch allocation failures and flag the command buffer invalid
- we can re-use the vk_meta_device object list to keep track of our
  preload shaders
- we can re-use surface descriptors instead of re-emitting them every
  time a preload is done

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:35 +00:00
Boris Brezillon
607e517a11 panvk: Store attachment image views in the graphics state
Will be needed if we want to re-use pre-emitted texture payloads in the
FB preload path.

With this in place, we no longer need the src_iview in the resolve info
struct.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:35 +00:00
Boris Brezillon
a676d7ffb2 panvk: Emit textures needed for FB preload at image view creation time
Once we've specialized the framebuffer preload logic in panvk, this
will prevent re-emission of texture descriptors in the preload path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
4971538ffc panvk: Keep our copy_desc shader in vk_meta_device
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
d2515347f4 panvk: Keep our blend shaders in vk_meta_device
Now that vk_meta can keep track of VkShaderEXT objects, we can keep
our blend shaders in panvk_device::meta and get rid of our custom
hash-table.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
91c86c31cd panvk: Add a helper to create internal shaders
Blend and framebuffer preload shaders will be created as internal
shaders and added to the vk_meta object list.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
206bf1be09 panvk: Add a debug flag to force image copies through the gfx pipeline
Useful to debug copy-related issues.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
3d7bf07089 vk/meta: Make some helpers public
vk_image_view_type_to_sampler_dim() and vk_image_view_type_is_array()
can be useful to driver-specific meta shaders.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
cd38fd37f7 vk/meta: Allow tracking of driver-specific objects in the meta list
Add VK_META_OBJECT_KEY_DRIVER_OFFSET to define an offset for
driver-specific key types.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Boris Brezillon
7fe4f64c3b vk/meta: Support VkShaderEXT objects to allow tracking internal shaders
PanVK has a few internal shaders that don't fit in the vk_meta
compute/graphics pipeline model. Teaching vk_meta about VkShaderEXT
allows us to keep track of those internal shaders without using yet
another hash table.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31441>
2024-10-03 09:53:34 +00:00
Iago Toral Quiroga
c58bfb355a broadcom/compiler: generate mali opcodes for clamping on Pi5
Models C0 and D0 support these opcodes too.

total instructions in shared programs: 10869461 -> 10856992 (-0.11%)
instructions in affected programs: 1467666 -> 1455197 (-0.85%)
helped: 6012
HURT: 1413
Instructions are helped.

total threads in shared programs: 431014 -> 431010 (<.01%)
threads in affected programs: 8 -> 4 (-50.00%)
helped: 0
HURT: 2

total uniforms in shared programs: 5432771 -> 5430909 (-0.03%)
uniforms in affected programs: 183047 -> 181185 (-1.02%)
helped: 976
HURT: 128
Uniforms are helped.

total max-temps in shared programs: 2235272 -> 2234069 (-0.05%)
max-temps in affected programs: 38163 -> 36960 (-3.15%)
helped: 1262
HURT: 168
Max-temps are helped.

total spills in shared programs: 4331 -> 4363 (0.74%)
spills in affected programs: 964 -> 996 (3.32%)
helped: 6
HURT: 47

total fills in shared programs: 6527 -> 6622 (1.46%)
fills in affected programs: 2047 -> 2142 (4.64%)
helped: 6
HURT: 47

total sfu-stalls in shared programs: 15807 -> 15935 (0.81%)
sfu-stalls in affected programs: 787 -> 915 (16.26%)
helped: 71
HURT: 172
Sfu-stalls are HURT.

total inst-and-stalls in shared programs: 10885268 -> 10872927 (-0.11%)
inst-and-stalls in affected programs: 1469423 -> 1457082 (-0.84%)
helped: 5998
HURT: 1417
Inst-and-stalls are helped.

total nops in shared programs: 184280 -> 185612 (0.72%)
nops in affected programs: 10000 -> 11332 (13.32%)
helped: 311
HURT: 1193
Nops are HURT.

The results show a reduction in register pressure, but an increase in
spills, which looks contradictory. This is because for some reason, this
optimization makes the NIR scheduler produce code for some shaders in Godot
that cause additional spilling, but the problem seems to be exclusive to
Godot shaders and not really related to the optimization itself but to
how the NIR scheduler works. Excluding Godot shaders we actually see a
decrease in spills and a slightly larger improvement in instruction
counts:

total instructions in shared programs: 10720106 -> 10707621 (-0.12%)
instructions in affected programs: 1375316 -> 1362831 (-0.91%)
helped: 5948
HURT: 1364
Instructions are helped.

total threads in shared programs: 428248 -> 428244 (<.01%)
threads in affected programs: 8 -> 4 (-50.00%)
helped: 0
HURT: 2

total spills in shared programs: 3729 -> 3712 (-0.46%)
spills in affected programs: 451 -> 434 (-3.77%)
helped: 6
HURT: 0

total fills in shared programs: 4738 -> 4714 (-0.51%)
fills in affected programs: 564 -> 540 (-4.26%)
helped: 6
HURT: 0

Comparing only shaders from Godot:

total instructions in shared programs: 149355 -> 149371 (0.01%)
instructions in affected programs: 92350 -> 92366 (0.02%)
helped: 64
HURT: 49
Inconclusive result (value mean confidence interval includes 0).

total max-temps in shared programs: 16477 -> 16472 (-0.03%)
max-temps in affected programs: 180 -> 175 (-2.78%)
helped: 5
HURT: 0
Max-temps are helped.

total spills in shared programs: 602 -> 651 (8.14%)
spills in affected programs: 513 -> 562 (9.55%)
helped: 0
HURT: 47

total fills in shared programs: 1789 -> 1908 (6.65%)
fills in affected programs: 1483 -> 1602 (8.02%)
helped: 0
HURT: 47

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
2024-10-03 09:02:08 +00:00
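The three sets of results quoted in the message above (all shaders, excluding Godot, Godot only) can be cross-checked against each other. A minimal Python sketch, assuming the "excluding Godot" and "Godot only" runs cover disjoint shader sets that together make up the full run (the dictionary and function names are hypothetical), adds the two partial totals and compares them with the overall ones:

```python
# (before, after) totals copied from the commit message.
all_shaders = {
    "instructions": (10869461, 10856992),
    "spills": (4331, 4363),
    "fills": (6527, 6622),
}
without_godot = {
    "instructions": (10720106, 10707621),
    "spills": (3729, 3712),
    "fills": (4738, 4714),
}
godot_only = {
    "instructions": (149355, 149371),
    "spills": (602, 651),
    "fills": (1789, 1908),
}


def combine(a, b):
    """Add the before/after totals of two disjoint shader sets, metric by metric."""
    return {m: (a[m][0] + b[m][0], a[m][1] + b[m][1]) for m in a}


recombined = combine(without_godot, godot_only)
for metric, (before, after) in recombined.items():
    verdict = "matches" if (before, after) == all_shaders[metric] else "differs from"
    print(f"{metric}: {before} -> {after} {verdict} the overall totals")
```

For these three metrics the partial totals sum to the overall ones, which fits the message's explanation that the extra spills and fills come from the Godot shaders (+49 spills there against -17 elsewhere, for the overall +32).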
Iago Toral Quiroga
c57be33d96 broadcom/compiler: implement NIR mali opcodes for clamping
These translate directly to new unpack modifiers on V3D 7.x.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
2024-10-03 09:02:08 +00:00
Iago Toral Quiroga
a13bf51a9f broadcom: add helpers to identify availability of new unpack modifiers
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
2024-10-03 09:02:08 +00:00