From 74168a601e37afd95abf204a633d01a572af5271 Mon Sep 17 00:00:00 2001
From: Francisco Jerez
Date: Wed, 16 Jul 2025 15:52:22 -0700
Subject: [PATCH] util/ra: Allow driver to override class P value.

This gives the driver the option to provide a custom threshold for
the PQ test performed by the graph coloring algorithm.  A threshold
lower than the physical number of registers is helpful on platforms
where the number of registers used can impose a limit on the thread
parallelism of the program.  On such platforms, even though a passing
PQ test guarantees that the node can be pushed onto the stack and
neglected while coloring the remaining nodes, the order in which this
happens can have a dramatic effect on the register pressure of the
resulting shader and therefore also on the thread parallelism of the
program.

Setting a P value threshold lower than the real P value will cause
nodes with a Q value above the threshold to use the existing
optimistic coloring heuristic, which takes the effort of ordering
nodes in the stack by Q value, in order to do a better job at
minimizing the total register requirement of the program.  Even
though this causes us to hit the optimistic codepaths for trivially
colorable nodes, the interference graph is still guaranteed to be
trivially colorable if it was trivially colorable without the
override.

The use of a threshold lower than the real P value will come at a
compile-time performance cost.  The specific trade-off between
compile-time and run-time can be adjusted by the driver based on the
number of registers available to each thread without causing a hit to
thread parallelism.

Reviewed-by: Lionel Landwerlin
Part-of:
---
 src/util/register_allocate.c | 15 +++++++++++++++
 src/util/register_allocate.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
index e44c509643a..8c3296c342e 100644
--- a/src/util/register_allocate.c
+++ b/src/util/register_allocate.c
@@ -282,6 +282,21 @@ ra_class_add_reg(struct ra_class *class, unsigned int r)
    class->p++;
 }
 
+/**
+ * Overrides the P value of a register class to be below the
+ * automatically calculated one in order to force optimistic
+ * allocation above this threshold even for nodes that are trivially
+ * colorable, which can be used to reduce the register requirement of
+ * programs on platforms where increasing register use imposes limits
+ * on thread parallelism.
+ */
+void
+ra_class_override_p(struct ra_class *class, unsigned int p)
+{
+   assert(p <= class->p);
+   class->p = p;
+}
+
 /**
  * Returns true if the register belongs to the given class.
  */
diff --git a/src/util/register_allocate.h b/src/util/register_allocate.h
index 14f2be4023a..d580fb249fe 100644
--- a/src/util/register_allocate.h
+++ b/src/util/register_allocate.h
@@ -67,6 +67,7 @@ ra_add_transitive_reg_pair_conflict(struct ra_regs *regs,
 void ra_make_reg_conflicts_transitive(struct ra_regs *regs, unsigned int reg);
 void ra_class_add_reg(struct ra_class *c, unsigned int reg);
+void ra_class_override_p(struct ra_class *c, unsigned int p);
 struct ra_class *ra_get_class_from_index(struct ra_regs *regs, unsigned int c);
 void ra_set_finalize(struct ra_regs *regs, unsigned int **conflicts);
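
To illustrate the effect described in the commit message, here is a minimal standalone sketch of how lowering P changes which nodes pass a PQ test. It is a toy model, not the allocator's code: `struct toy_class`, `toy_class_override_p()`, `toy_pq_test()` and `toy_count_trivial()` are hypothetical names, and the test is assumed to take the simplified form "accumulated Q total is strictly below P".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a register class.  Only the P value is
 * modeled here; this is not the real struct ra_class. */
struct toy_class {
   unsigned int p; /* number of registers counted for the class */
};

/* Same shape as the ra_class_override_p() added by the patch: the
 * override may only lower P, never raise it. */
static void
toy_class_override_p(struct toy_class *c, unsigned int p)
{
   assert(p <= c->p);
   c->p = p;
}

/* Assumed simplified PQ test: a node passes when its accumulated Q
 * total is strictly below the P value of its class. */
static bool
toy_pq_test(const struct toy_class *c, unsigned int q_total)
{
   return q_total < c->p;
}

/* Counts the nodes that would pass the PQ test.  The remaining nodes
 * would be left to the optimistic heuristic. */
static size_t
toy_count_trivial(const struct toy_class *c,
                  const unsigned int *q_totals, size_t n)
{
   size_t count = 0;

   for (size_t i = 0; i < n; i++) {
      if (toy_pq_test(c, q_totals[i]))
         count++;
   }

   return count;
}
```

In this model, a class with P = 128 and nodes whose Q totals are 10, 40, 90 and 127 passes all four nodes. After `toy_class_override_p(&c, 64)` only the first two pass, so the nodes at 90 and 127 would take the optimistic path even though the graph remains colorable with the full register file.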