mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-05-18 15:58:06 +02:00
Typically, we would schedule smooth varyings like this: nop ; nop ; ldvary.r4 nop ; fmul r0, r4, rf0 fadd rf13, r0, r5 ; nop ; ldvary.r1 nop ; fmul r2, r1, rf0 fadd rf12, r2, r5 ; nop ; ldvary.r3 nop ; fmul r4, r3, rf0 fadd rf11, r4, r5 ; nop ; ldvary.r0 where we pair up an ldvary with the fadd of the previous sequence instead of the previous fmul. This is because ldvary has an implicit write to r5 which is read by the fadd of the previous sequence, so our dependency tracking doesn't allow us to move the ldvary before the fadd, however, the r5 write of the ldvary instruction happens in the instruction after it is emitted so we can actually move it to the fmul and the r5 write would still happen in the same instruction as the fadd, which is fine. This patch allows us to pipeline these sequences optimally. For that, after merging an ldvary into a previous instruction in the middle of a pipelineable ldvary sequence, we check if we can manually move it to the last scheduled instruction instead (the one before the instruction we are currently scheduling). If we are successful at moving the ldvary to the previous instruction, then we flag the ldvary as scheduled immediately, which may promote its children (the follow-up fmul instruction for that ldvary) to DAG heads and continue the merge loop so that fmul can be picked and merged into the final fadd of the previous sequence (where we had originally merged the ldvary). This leads to a result that looks like this: nop ; nop ; ldvary.r4 nop ; fmul r0, r4, rf0 ; ldvary.r1 fadd rf13, r0, r5 ; fmul r2, r1, rf0 ; ldvary.r3 fadd rf12, r2, r5 ; fmul r4, r3, rf0 ; ldvary.r0 Shader-db results: total instructions in shared programs: 14071591 -> 13820690 (-1.78%) instructions in affected programs: 7809692 -> 7558791 (-3.21%) helped: 41209 HURT: 4528 Instructions are helped. total max-temps in shared programs: 2335784 -> 2326435 (-0.40%) max-temps in affected programs: 84302 -> 74953 (-11.09%) helped: 4561 HURT: 293 Max-temps are helped. total sfu-stalls in shared programs: 31537 -> 30683 (-2.71%) sfu-stalls in affected programs: 3551 -> 2697 (-24.05%) helped: 1713 HURT: 750 Sfu-stalls are helped. total inst-and-stalls in shared programs: 14103128 -> 13851373 (-1.79%) inst-and-stalls in affected programs: 7820726 -> 7568971 (-3.22%) helped: 41411 HURT: 4535 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9304> |
||
|---|---|---|
| .. | ||
| meson.build | ||
| nir_to_vir.c | ||
| qpu_schedule.c | ||
| qpu_validate.c | ||
| v3d33_tex.c | ||
| v3d33_vpm_setup.c | ||
| v3d40_tex.c | ||
| v3d_compiler.h | ||
| v3d_nir_lower_image_load_store.c | ||
| v3d_nir_lower_io.c | ||
| v3d_nir_lower_line_smooth.c | ||
| v3d_nir_lower_logic_ops.c | ||
| v3d_nir_lower_robust_buffer_access.c | ||
| v3d_nir_lower_scratch.c | ||
| v3d_nir_lower_txf_ms.c | ||
| vir.c | ||
| vir_dump.c | ||
| vir_live_variables.c | ||
| vir_opt_constant_alu.c | ||
| vir_opt_copy_propagate.c | ||
| vir_opt_dead_code.c | ||
| vir_opt_redundant_flags.c | ||
| vir_opt_small_immediates.c | ||
| vir_register_allocate.c | ||
| vir_to_qpu.c | ||