Mirror of https://gitlab.freedesktop.org/mesa/mesa.git (synced 2026-06-21 12:28:24 +02:00)
Multiple DPAS instructions executed on the same functional unit are guaranteed to read their source operands in program order, so no scoreboard synchronization is required between a DPAS read and another DPAS read of the same register. To take advantage of this, track the pipeline (DPAS vs. other) of each out-of-order dependency in a new field on the dependency struct, alongside the token ID of the out-of-order dependency. When a DPAS instruction encounters a read dependency whose producer also ran on the DPAS pipeline, strip the SRC synchronization flag so that no redundant wait is emitted. The DST synchronization flag is preserved since write-after-read hazards still require ordering. This reduces the number of scoreboard stalls emitted within chains of DPAS instructions that have overlapping sources (common in matrix multiplication kernels), improving occupancy of the systolic pipeline. In combination with the vectorization optimization that follows, this avoids performance regressions in XeSS kernels, and it could in theory also help other workloads that use the systolic pipeline via KHR_cooperative_matrix. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41814> |
||
|---|---|---|
| amd | ||
| android_stub | ||
| asahi | ||
| broadcom | ||
| c11 | ||
| compiler | ||
| drm-shim | ||
| egl | ||
| etnaviv | ||
| freedreno | ||
| gallium | ||
| gbm | ||
| getopt | ||
| gfxstream | ||
| glx | ||
| gtest | ||
| imagination | ||
| imgui | ||
| intel | ||
| kosmickrisp | ||
| loader | ||
| mesa | ||
| microsoft | ||
| nouveau | ||
| panfrost | ||
| poly | ||
| tool | ||
| util | ||
| virtio | ||
| vulkan | ||
| x11 | ||
| .clang-format | ||
| meson.build | ||
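
The dependency handling described in the commit message above can be pictured with a small standalone sketch. This is not the Mesa implementation: the names `sync_flags`, `pipeline`, `dependency`, `resolve_dependency` and `needs_wait` are hypothetical and chosen only for illustration, and the real scoreboard pass tracks considerably more state. The sketch assumes a dependency carries a set of synchronization flags, a token ID, and the newly added pipeline field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical synchronization flags: wait for the producer's source
 * reads (SRC) and/or for its destination write (DST) to complete. */
enum sync_flags {
   SYNC_NONE = 0,
   SYNC_SRC  = 1 << 0,
   SYNC_DST  = 1 << 1,
};

/* Pipeline on which the producing instruction executed. */
enum pipeline {
   PIPE_OTHER,
   PIPE_DPAS,
};

/* Hypothetical out-of-order dependency record. */
struct dependency {
   unsigned flags;      /* combination of sync_flags */
   unsigned token_id;   /* token of the out-of-order producer */
   enum pipeline pipe;  /* the new field: pipeline of the producer */
};

/* Return the dependency a consumer on 'consumer_pipe' actually has to
 * honor.  DPAS instructions read their sources in program order, so a
 * DPAS consumer never needs to wait on the source reads of a DPAS
 * producer.  The DST flag is left untouched. */
static struct dependency
resolve_dependency(struct dependency dep, enum pipeline consumer_pipe)
{
   if (consumer_pipe == PIPE_DPAS && dep.pipe == PIPE_DPAS)
      dep.flags &= ~(unsigned)SYNC_SRC;
   return dep;
}

/* A wait is only emitted if some synchronization flag remains. */
static bool
needs_wait(struct dependency dep)
{
   return dep.flags != SYNC_NONE;
}

/* Small self-check illustrating the three cases. */
static void
example_usage(void)
{
   struct dependency src_only = { SYNC_SRC, 2, PIPE_DPAS };
   struct dependency src_dst  = { SYNC_SRC | SYNC_DST, 3, PIPE_DPAS };
   struct dependency other    = { SYNC_SRC, 4, PIPE_OTHER };

   /* DPAS after DPAS, read-only dependency: no wait at all. */
   assert(!needs_wait(resolve_dependency(src_only, PIPE_DPAS)));
   /* DPAS after DPAS with a DST dependency: DST ordering is kept. */
   assert(resolve_dependency(src_dst, PIPE_DPAS).flags == SYNC_DST);
   /* Producer on another pipeline: nothing is stripped. */
   assert(resolve_dependency(other, PIPE_DPAS).flags == SYNC_SRC);
}
```

In this sketch the token ID passes through unchanged, so a remaining DST dependency still waits on the same token as before; only the redundant SRC wait between two DPAS instructions disappears.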