mesa/src
Francisco Jerez 8857da4db5 intel/brw/swsb: Omit redundant read-after-read synchronization for back-to-back DPAS.
Multiple DPAS instructions executed on the same functional unit are
guaranteed to read their source operands in program order, so no
scoreboard synchronization is required between a DPAS read and another
DPAS read of the same register.

To achieve that, track the pipeline (DPAS vs. other) of each
out-of-order dependency in a new field on the dependency struct,
alongside the token ID of that dependency.  When a read dependency is
encountered for a DPAS instruction and its producer also executes on
the DPAS pipeline, strip the SRC synchronization flag so that no
redundant wait is emitted.  The DST synchronization flag is preserved
since write-after-read hazards still require ordering.
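
The rule described above can be sketched as a small standalone model.
This is only an illustration under stated assumptions, not the code
from the commit: the names Pipe, SyncFlags, Dependency and
resolve_for_consumer are hypothetical and do not correspond to the
actual types in the brw scoreboard pass.

```cpp
// Hypothetical, simplified model of the dependency tracking described in
// the commit message.  None of these names are taken from Mesa.

// Pipeline that issued the out-of-order instruction.
enum class Pipe { Other, Dpas };

// Synchronization flags carried by a dependency (bitmask).
enum SyncFlags : unsigned {
    SYNC_NONE = 0,
    SYNC_SRC  = 1u << 0,  // wait for the earlier instruction's source reads
    SYNC_DST  = 1u << 1,  // wait for the earlier instruction's destination write
};

struct Dependency {
    unsigned sync;   // bitmask of SyncFlags
    unsigned token;  // token ID of the out-of-order dependency
    Pipe pipe;       // assumed new field: pipeline of the producer
};

// Return the dependency as seen by an instruction on consumer_pipe.
// If both the producer and the consumer are on the DPAS pipeline, the
// SRC flag is dropped, since source reads happen in program order there.
// The DST flag and the token ID are left untouched.
Dependency resolve_for_consumer(Dependency dep, Pipe consumer_pipe)
{
    if (dep.pipe == Pipe::Dpas && consumer_pipe == Pipe::Dpas)
        dep.sync &= ~static_cast<unsigned>(SYNC_SRC);
    return dep;
}
```

In this model, a dependency carrying both flags that was produced on
the DPAS pipeline keeps only SYNC_DST when the consumer is also DPAS,
and keeps both flags for any other consumer.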

This reduces the number of scoreboard stalls emitted within chains of
DPAS instructions that have overlapping sources (common in matrix
multiplication kernels), improving occupancy of the systolic pipeline.
It avoids performance regressions in XeSS kernels in combination with
the following vectorization optimization, and could also be helpful in
theory with other workloads that utilize the systolic pipeline via
KHR_cooperative_matrix.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814>
2026-06-15 08:10:51 +00:00
amd radv: rename vrs_coarse_shading -> vrs_flat_shading 2026-06-13 19:29:59 +00:00
android_stub android_stub: purge unused log utils 2026-05-01 20:23:23 +00:00
asahi hk: use drirc_gen 2026-06-10 07:17:14 +00:00
broadcom v3dv: increase max push constants size 2026-06-12 12:20:37 +00:00
c11
compiler nir/divergence: Allow local_invocation_id.z to be treated as uniform. 2026-06-15 08:10:51 +00:00
drm-shim drm-shim: Include the hex of the driver ioctl for unimplemented ioctls. 2026-06-04 20:17:34 +00:00
egl Modify x11_xcb_display_supports_xshm to get xshm opcode 2026-06-12 15:16:05 +00:00
etnaviv etnaviv/isa: Fix Meson warning about etnaviv_isa_rs dummy library 2026-06-11 12:05:59 +00:00
freedreno Uprev ANGLE to 8e09325ebad45c7e11630a79754361e965e5fab0 2026-06-12 18:04:00 +00:00
gallium radv,radeonsi: disallow VRS flat shading if SubgroupInvocationID is used 2026-06-13 19:29:59 +00:00
gbm gbm: Replace VER_MIN with common MIN2 2026-04-30 13:00:03 +00:00
getopt
gfxstream gfxstream: kumquat: validate device dmabuf support before use 2026-06-08 21:38:19 +00:00
glx Modify x11_xcb_display_supports_xshm to get xshm opcode 2026-06-12 15:16:05 +00:00
gtest
imagination pco: Fix metadata invalidation 2026-06-12 13:28:07 +00:00
imgui imgui: update copy and port all tools using it 2026-04-30 10:59:45 +00:00
intel intel/brw/swsb: Omit redundant read-after-read synchronization for back-to-back DPAS. 2026-06-15 08:10:51 +00:00
kosmickrisp kk,wsi/metal: Support VK_(KHR/EXT)_swapchain_maintenance1 2026-06-10 17:41:01 +00:00
loader loader: check if the kernel driver is amdgpu 2026-05-27 10:19:50 +00:00
mesa mesa: allow GL_TEXTURE_COMPARE_{MODE,FUN} with EXT_shadow_samplers 2026-06-12 10:56:05 +00:00
microsoft dzn: use drirc_gen 2026-06-10 07:17:14 +00:00
nouveau nouveau/mme: Add a test for MME Shadow RAM behavior 2026-06-13 14:16:39 +02:00
panfrost kraid: Re-materialize constants 2026-06-12 17:10:28 -04:00
poly poly: Fix range used for index unroll bounds checks 2026-05-26 10:39:00 +00:00
tool pps: Re-emit time clock_sync more regularly 2026-05-06 21:37:15 +00:00
util util, llvmpipe: flush subnormals to zero on ARM/AArch64 2026-06-12 14:34:14 +00:00
virtio venus/ci: Revert ADL jobs to stable 6.17 kernel 2026-06-12 17:23:02 +00:00
vulkan Modify x11_xcb_display_supports_xshm to get xshm opcode 2026-06-12 15:16:05 +00:00
x11 Modify x11_xcb_display_supports_xshm to get xshm opcode 2026-06-12 15:16:05 +00:00
.clang-format intel: add Jay 2026-04-10 18:21:21 +00:00
meson.build meson: drop non-existent platforms=xcb check 2026-06-11 18:14:54 +00:00