iris: Use streaming loads to read from tiled surfaces

Always use the streaming load (since we know we have Broadwell+, all of our target CPUs support SSE4.1) when reading back from the tiled surface to map the resource. This means we hit the fast WC handling paths on Atoms (without LLC), and on big Core (with LLC) the streaming load is no less efficient, as we do not require the tiled buffer to be pulled into the CPU cache.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2026-05-05 22:38:05 +02:00 · 2019-02-22 20:53:41 +00:00 · 2019-02-22 20:53:41 +00:00 · 97ad0efba0
commit 97ad0efba0
parent 797fb6c6ac
2 changed files with 5 additions and 2 deletions
--- a/src/gallium/drivers/iris/iris_bufmgr.c
+++ b/src/gallium/drivers/iris/iris_bufmgr.c
@@ -1186,8 +1186,11 @@ can_map_cpu(struct iris_bo *bo, unsigned flags)
    * most drawing while non-persistent mappings are active, we may still use
    * the GPU for blits or other operations, causing batches to happen at
    * inconvenient times.
+    *
+    * If RAW is set, we expect the caller to be able to handle a WC buffer
+    * more efficiently than the involuntary clflushes.
    */
-   if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC))
+   if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC | MAP_RAW))
      return false;

   return !(flags & MAP_WRITE);
--- a/src/gallium/drivers/iris/iris_resource.c
+++ b/src/gallium/drivers/iris/iris_resource.c
@@ -1143,7 +1143,7 @@ iris_map_tiled_memcpy(struct iris_transfer *map)

         isl_memcpy_tiled_to_linear(x1, x2, y1, y2, ptr, src, xfer->stride,
                                    surf->row_pitch_B, has_swizzling,
-                                    surf->tiling, ISL_MEMCPY);
+                                    surf->tiling, ISL_MEMCPY_STREAMING_LOAD);
         box.z++;
      }
   }
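
For readers unfamiliar with the term, the "streaming load" named in the commit message is the SSE4.1 MOVNTDQA instruction, exposed in C as `_mm_stream_load_si128`. The sketch below only illustrates the idea of a streaming-load copy; it is not the code that `ISL_MEMCPY_STREAMING_LOAD` selects in Mesa. The helper name `copy_streaming_load` is hypothetical, and the sketch assumes an x86 target, a GCC or Clang compiler, and a 16-byte-aligned source pointer.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of Mesa.
 *
 * Copies `bytes` bytes from `src` to `dst` using 16-byte streaming loads.
 * Assumption: `src` is 16-byte aligned (MOVNTDQA requires it); `dst` may
 * have any alignment. On write-combining memory the streaming load can
 * avoid slow uncached reads; on ordinary cached memory it behaves like a
 * normal aligned load, so the result is the same either way.
 */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
static void
copy_streaming_load(void *dst, const void *src, size_t bytes)
{
   uint8_t *d = dst;
   const uint8_t *s = src;

   /* Main loop: one 16-byte streaming load, one unaligned store. */
   while (bytes >= 16) {
      __m128i v = _mm_stream_load_si128((__m128i *)s);
      _mm_storeu_si128((__m128i *)d, v);
      s += 16;
      d += 16;
      bytes -= 16;
   }

   /* Tail shorter than one vector: fall back to a plain copy. */
   if (bytes)
      memcpy(d, s, bytes);
}
```

A caller would pass the CPU pointer of the WC mapping as `src` and an ordinary malloc'ed staging buffer as `dst`. A real tiled readback also has to walk the tiling layout, which this linear sketch leaves out.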