mesa/src/broadcom/compiler
Jose Maria Casanova Crespo 9bf0bdf776 v3d: Avoid scheduling an instruction that stalls waiting for SFU retval
If we detect that a scheduling candidate will stall because having a
register source that is the written by the SFU unit in the previous
instruction we reduce its priority so any non stalling operation would
be chosen.

The latency of SFU operations is defined as 2. So they would be scheduled
earlier if other candidates have the same priority.

Finally we won't merge instructions that stall to a previously chosen one.
As the result of the previous one would be waiting for an extra cycle.

Although shader-db result show that instruction are hurt with an increase
of 0.35% the sum of instructions + stalls is reduced a 0.52%. And
the total of sfu-stalls is reduced a 63.51%. It implies also a small
increase in the max-temps metric because of scheduling earlier SFU
operations.

total instructions in shared programs: 9102719 -> 9117851 (0.17%)
instructions in affected programs: 4324628 -> 4339760 (0.35%)
helped: 4162
HURT: 12128
helped stats (abs) min: 1 max: 10 x̄: 1.28 x̃: 1
helped stats (rel) min: 0.09% max: 4.76% x̄: 0.66% x̃: 0.51%
HURT stats (abs)   min: 1 max: 27 x̄: 1.69 x̃: 1
HURT stats (rel)   min: 0.05% max: 7.69% x̄: 0.87% x̃: 0.68%
95% mean confidence interval for instructions value: 0.90 0.96
95% mean confidence interval for instructions %-change: 0.47% 0.50%
Instructions are HURT.

total max-temps in shared programs: 1327728 -> 1327812 (<.01%)
max-temps in affected programs: 4730 -> 4814 (1.78%)
helped: 61
HURT: 134
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 2.70% max: 13.33% x̄: 4.89% x̃: 4.17%
HURT stats (abs)   min: 1 max: 3 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 1.54% max: 20.00% x̄: 6.10% x̃: 5.26%
95% mean confidence interval for max-temps value: 0.28 0.58
95% mean confidence interval for max-temps %-change: 1.80% 3.52%
Max-temps are HURT.

total sfu-stalls in shared programs: 99551 -> 36324 (-63.51%)
sfu-stalls in affected programs: 95029 -> 31802 (-66.53%)
helped: 25882
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 2
helped stats (rel) min: 5.26% max: 100.00% x̄: 79.86% x̃: 100.00%
95% mean confidence interval for sfu-stalls value: -2.47 -2.42
95% mean confidence interval for sfu-stalls %-change: -80.18% -79.54%
Sfu-stalls are helped.

total inst-and-stalls in shared programs: 9202270 -> 9154175 (-0.52%)
inst-and-stalls in affected programs: 5618516 -> 5570421 (-0.86%)
helped: 22728
HURT: 855
helped stats (abs) min: 1 max: 31 x̄: 2.16 x̃: 1
helped stats (rel) min: 0.07% max: 16.67% x̄: 1.14% x̃: 0.92%
HURT stats (abs)   min: 1 max: 5 x̄: 1.25 x̃: 1
HURT stats (rel)   min: 0.12% max: 5.26% x̄: 1.24% x̃: 0.86%
95% mean confidence interval for inst-and-stalls value: -2.07 -2.01
95% mean confidence interval for inst-and-stalls %-change: -1.07% -1.05%
Inst-and-stalls are helped.

v2: Rename v3d_qpu_generates_sfu_stalls to v3d_qpu_instr_is_sfu (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-22 03:00:50 +02:00
..
meson.build v3d: add lowering for OpenGL logic operations 2019-07-12 09:16:38 +02:00
nir_to_vir.c v3d: handle nir_intrinsic_store_tlb_sample_color_v3d 2019-07-18 08:59:35 +02:00
qpu_schedule.c v3d: Avoid scheduling an instruction that stalls waiting for SFU retval 2019-07-22 03:00:50 +02:00
qpu_validate.c broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements. 2018-04-26 11:30:22 -07:00
v3d33_tex.c v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value 2019-04-07 15:13:36 +02:00
v3d33_vpm_setup.c broadcom/vc5: Move V3D 3.3 VPM write setup to a separate file. 2018-01-12 21:56:24 -08:00
v3d40_tex.c v3d: use inc/dec tmu operation with image atomic sub/add of 1 2019-07-12 11:51:22 +02:00
v3d_compiler.h v3d: add shader-db stat to count SFU stalls 2019-07-22 03:00:50 +02:00
v3d_nir_lower_image_load_store.c v3d: Fix image_load_store clamping of signed integer stores. 2019-01-31 08:39:40 -08:00
v3d_nir_lower_io.c v3d: Move the stores for fixed function VS output reads into NIR. 2019-03-05 10:59:40 -08:00
v3d_nir_lower_logic_ops.c v3d: emit correct lowering for logic operations with MSAA render targets 2019-07-18 08:59:35 +02:00
v3d_nir_lower_scratch.c v3d: Use the new lower_to_scratch implementation for indirects on temps. 2019-04-12 16:16:58 -07:00
v3d_nir_lower_txf_ms.c v3d: Use nir_shader_lower_instructions() for txf_ms lowering. 2019-07-18 11:28:56 -07:00
vir.c v3d: add shader-db stat to count SFU stalls 2019-07-22 03:00:50 +02:00
vir_dump.c v3d: Add missing dumping for the spill offset/size uniforms. 2019-04-12 15:59:31 -07:00
vir_live_variables.c v3d: Stop treating exec masking specially. 2019-03-05 07:36:24 -08:00
vir_opt_copy_propagate.c v3d: Use ldunif instructions for uniforms. 2019-03-05 12:57:39 -08:00
vir_opt_dead_code.c v3d: Drop the V3D 3.x vpm read dead code elimination. 2019-03-05 12:57:39 -08:00
vir_opt_redundant_flags.c v3d: fix checking twice auf flag 2019-06-13 11:45:18 +02:00
vir_opt_small_immediates.c v3d: Use ldunif instructions for uniforms. 2019-03-05 12:57:39 -08:00
vir_register_allocate.c v3d: Fix detection of TMU write sequences in register spilling. 2019-04-26 12:42:30 -07:00
vir_to_qpu.c v3d: the ldtlbu signal reads an implicit uniform 2019-07-12 09:16:38 +02:00