venus: use seq_cst for ring cs and tail update ordering

To avoid incompatibility between the compiler implementations used by the
driver and the renderer, seq_cst ordering is picked here, which requires a
full mfence instruction. This ensures that the renderer-side acquire is
ordered after the cache flush of ring cs updates.

Perf-wise, there's no regression in headless vkmark runs. In theory, the
overhead introduced here is trivial compared to the ring cs encode/decode
part, so we should go for better robustness.

Test: venus on Windows guest works with renderer on Linux

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14277
(cherry picked from commit 07d059f3e2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38803>
2025-12-24 11:00:11 +01:00 · 2025-11-13 13:07:12 -08:00 · 2025-11-13 13:07:12 -08:00 · d29700fce6
commit d29700fce6
parent d22a20de74
2 changed files with 7 additions and 2 deletions
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -944,7 +944,7 @@
        "description": "venus: use seq_cst for ring cs and tail update ordering",
        "nominated": true,
        "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
        "main_sha": null,
        "because_sha": null,
        "notes": null
--- a/src/virtio/vulkan/vn_ring.c
+++ b/src/virtio/vulkan/vn_ring.c
@@ -98,9 +98,14 @@ vn_ring_store_tail(struct vn_ring *ring)
 {
   /* the renderer is expected to load the tail with memory_order_acquire,
    * forming a release-acquire ordering
+    *
+    * To avoid incompatibility between the compiler implementations used by
+    * the driver and the renderer, seq_cst ordering is picked here, which has
+    * required a full mfence instruction. Then the renderer side acquire is
+    * ensured to be ordered after the cache flush of ring cs updates.
    */
   return atomic_store_explicit(ring->shared.tail, ring->cur,
-                                memory_order_release);
+                                memory_order_seq_cst);
 }

 uint32_t
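
Reviewer note: the pairing that the vn_ring_store_tail comment describes (a producer publishing the tail with a seq_cst store, a consumer reading it with an acquire load) can be sketched in a standalone program. This is a minimal illustration, not the venus code: every `demo_*` name, the ring size, and the use of pthreads within a single process are assumptions made for this example, whereas the real ring lives in memory shared between guest driver and host renderer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sizes for the sketch; not taken from vn_ring.c. */
#define DEMO_RING_SIZE 64u
#define DEMO_COUNT 1000u

struct demo_ring {
   uint32_t buf[DEMO_RING_SIZE]; /* stands in for the ring cs payload */
   _Atomic uint32_t tail;        /* written by the producer only */
   _Atomic uint32_t head;        /* written by the consumer only */
};

/* Producer: write the payload first, then publish the new tail. */
static void *
demo_producer(void *arg)
{
   struct demo_ring *ring = arg;

   for (uint32_t cur = 0; cur < DEMO_COUNT; cur++) {
      /* spin until the consumer has freed a slot */
      while (cur - atomic_load_explicit(&ring->head, memory_order_acquire) >=
             DEMO_RING_SIZE)
         ;

      ring->buf[cur % DEMO_RING_SIZE] = cur + 1;

      /* seq_cst is at least as strong as release, so the payload write
       * above is visible to any thread that acquires this tail value
       */
      atomic_store_explicit(&ring->tail, cur + 1, memory_order_seq_cst);
   }
   return NULL;
}

/* Consumer: acquire the tail, then read everything published so far. */
static uint64_t
demo_consume(struct demo_ring *ring)
{
   uint64_t sum = 0;
   uint32_t head = 0;

   while (head < DEMO_COUNT) {
      uint32_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);

      while (head < tail) {
         sum += ring->buf[head % DEMO_RING_SIZE];
         head++;
      }

      /* release the consumed slots back to the producer */
      atomic_store_explicit(&ring->head, head, memory_order_release);
   }
   return sum;
}

/* Runs one producer thread against a consumer on the calling thread. */
static uint64_t
demo_run(void)
{
   struct demo_ring ring;
   pthread_t thread;

   atomic_init(&ring.tail, 0);
   atomic_init(&ring.head, 0);

   int rc = pthread_create(&thread, NULL, demo_producer, &ring);
   assert(rc == 0);
   (void)rc;

   uint64_t sum = demo_consume(&ring);
   pthread_join(thread, NULL);
   return sum;
}
```

Calling `demo_run()` sums the values 1 through `DEMO_COUNT` as they pass through the ring. Within a single C11 program, a release store would also be sufficient here. The commit's motivation, differing compiler implementations on the two sides of a shared-memory ring, is not something this single-process sketch reproduces.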