nir: Add a scheduler pass to reduce maximum register pressure.

This is similar to a scheduler I've written for vc4 and i965, but this
time it is written at the NIR level so that hopefully it's reusable.  A
notable new feature is Goodman/Hsu's heuristic of "once we've started
processing the uses of a value, prioritize processing the rest of its
uses", which should help keep the scheduler from making systematically
bad choices about getting texture results consumed.
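
As a rough illustration of that heuristic (not the code in
nir_schedule.c), the sketch below picks among ready instructions and
prefers one whose source value already has some, but not all, of its
users scheduled.  It falls back to the smallest estimated pressure change
otherwise.  The struct layout, the single-source simplification, and the
names value_partially_consumed() and choose_candidate() are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model: each candidate reads at most one value. */
struct value {
   int uses_total;      /* number of instructions that read this value */
   int uses_scheduled;  /* how many of those have been scheduled so far */
};

struct candidate {
   const struct value *src; /* value this instruction consumes, or NULL */
   int pressure_delta;      /* estimated change in live values if chosen */
};

/* True once some, but not all, users of the value have been scheduled. */
static bool
value_partially_consumed(const struct value *v)
{
   return v && v->uses_scheduled > 0 && v->uses_scheduled < v->uses_total;
}

/* Returns the index of the chosen candidate, or -1 if there are none. */
static int
choose_candidate(const struct candidate *c, size_t n)
{
   assert(c != NULL || n == 0);

   int best = -1;
   for (size_t i = 0; i < n; i++) {
      if (best < 0) {
         best = (int)i;
         continue;
      }

      bool i_partial = value_partially_consumed(c[i].src);
      bool best_partial = value_partially_consumed(c[best].src);

      /* First priority: keep consuming a value we already started on. */
      if (i_partial != best_partial) {
         if (i_partial)
            best = (int)i;
         continue;
      }

      /* Tie-break: prefer the smaller estimated register pressure change. */
      if (c[i].pressure_delta < c[best].pressure_delta)
         best = (int)i;
   }
   return best;
}
```

With a texture result that has one of four users scheduled, a candidate
reading it wins over an unrelated candidate even if the latter has a
better pressure estimate, so the texture result gets fully consumed
sooner.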

Results for v3d:

total instructions in shared programs: 6497588 -> 6518242 (0.32%)
total threads in shared programs: 154000 -> 152828 (-0.76%)
total uniforms in shared programs: 2119629 -> 2068681 (-2.40%)
total spills in shared programs: 4984 -> 472 (-90.53%)
total fills in shared programs: 6418 -> 1546 (-75.91%)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2)

v2: Use the DAG data structure, fold in the scheduling-for-parallelism
    patch, include SSA defs in live values so we can switch to bottom-up
    if we want.
v3: Squash in improvements from Alejandro Piñeiro for getting V3D to
    successfully register allocate on GLES3.1 dEQP.  Make sure that
    discards don't move after store_output.  Comment spelling fix.
Eric Anholt 2019-02-19 09:30:52 -08:00
parent 5159db60fc
commit 8afab607ac
5 changed files with 1098 additions and 0 deletions

@@ -960,6 +960,11 @@ uint64_t *v3d_compile(const struct v3d_compiler *compiler,
NIR_PASS_V(c->s, nir_lower_bool_to_int32);
NIR_PASS_V(c->s, nir_convert_from_ssa, true);
/* Schedule for about half our register space, to enable more shaders
 * to hit 4 threads.
 */
NIR_PASS_V(c->s, nir_schedule, 24);
v3d_nir_to_vir(c);
v3d_set_prog_data(c, prog_data);

@@ -326,6 +326,7 @@ NIR_FILES = \
nir/nir_range_analysis.h \
nir/nir_remove_dead_variables.c \
nir/nir_repair_ssa.c \
nir/nir_schedule.c \
nir/nir_search.c \
nir/nir_search.h \
nir/nir_search_helpers.h \

@@ -209,6 +209,7 @@ files_libnir = files(
'nir_range_analysis.h',
'nir_remove_dead_variables.c',
'nir_repair_ssa.c',
'nir_schedule.c',
'nir_search.c',
'nir_search.h',
'nir_search_helpers.h',

@@ -4201,6 +4201,10 @@ typedef bool (*nir_should_vectorize_mem_func)(unsigned align, unsigned bit_size,
bool nir_opt_load_store_vectorize(nir_shader *shader, nir_variable_mode modes,
nir_should_vectorize_mem_func callback);
void nir_schedule(nir_shader *shader, int threshold);
void nir_strip(nir_shader *shader);
void nir_sweep(nir_shader *shader);
void nir_remap_dual_slot_attributes(nir_shader *shader,

File diff suppressed because it is too large