util/ra: Allow driver to override class P value.

This gives the driver the option to provide a custom threshold for
the PQ test performed by the graph coloring algorithm.

A threshold lower than the physical number of registers is helpful on
platforms where the number of registers used can impose a limit on the
thread parallelism of the program.  On such platforms, even though a
passing PQ test guarantees that the node can be pushed onto the stack
and neglected while coloring the remaining nodes, the order in which
this happens can have a dramatic effect on the register pressure of
the resulting shader, and therefore also on the thread parallelism of
the program.

Setting a P value threshold lower than the real P value will cause
nodes with a Q value above the threshold to use the existing
optimistic coloring heuristic, which takes the effort of ordering
nodes in the stack by Q value in order to do a better job at
minimizing the total register requirement of the program.  Even though
this causes us to hit the optimistic codepaths for trivially colorable
nodes, the interference graph is still guaranteed to be trivially
colorable if it was trivially colorable without the override.
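
As a rough illustration of the effect described above, the following
is a minimal sketch of a PQ test against an overridden threshold.  It
assumes the test has the form "q < p"; the toy_* types and functions
are hypothetical stand-ins and are not the structures used by util/ra.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct toy_class {
   unsigned int p;   /* threshold used by the PQ test */
};

struct toy_node {
   unsigned int q;   /* worst-case registers blocked by neighbors */
};

/* Assumed form of the PQ test: the node passes if its Q value is
 * below the P value of its class.
 */
static bool
toy_pq_test(const struct toy_class *c, const struct toy_node *n)
{
   return n->q < c->p;
}

/* Same shape as the override added by this commit: the threshold can
 * only be lowered, never raised.
 */
static void
toy_class_override_p(struct toy_class *c, unsigned int p)
{
   assert(p <= c->p);
   c->p = p;
}
```

With a class of 128 registers, a node with q = 40 passes the test and
would be pushed without ordering.  After lowering the threshold to 32
the same node fails the test and is left to the optimistic path, while
a node with q = 10 still passes.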

The use of a threshold lower than the real P value will come at a
compile-time performance cost.  The specific trade-off between
compile-time and run-time can be adjusted by the driver based on the
number of registers available to each thread without causing a hit to
thread parallelism.
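
One way a driver might pick the threshold is sketched below.  This is
only an assumption about a possible policy, not something this commit
implements: toy_pick_p_threshold() and its regs_at_full_parallelism
parameter are hypothetical, and struct toy_ra_class stands in for the
real struct ra_class.

```c
#include <assert.h>

/* Hypothetical stand-in for struct ra_class, reduced to the one
 * field this sketch needs.
 */
struct toy_ra_class {
   unsigned int p;
};

/* Mirrors the body of ra_class_override_p() from this commit. */
static void
toy_ra_class_override_p(struct toy_ra_class *class, unsigned int p)
{
   assert(p <= class->p);
   class->p = p;
}

/* Hypothetical driver policy: use the number of registers a thread
 * can consume without reducing thread parallelism, clamped to the
 * automatically calculated P value so the override stays valid.
 */
static unsigned int
toy_pick_p_threshold(unsigned int class_p,
                     unsigned int regs_at_full_parallelism)
{
   return regs_at_full_parallelism < class_p ?
          regs_at_full_parallelism : class_p;
}
```

The clamp keeps the argument within the range accepted by the
assertion in the override function, so a class that is already
smaller than the parallelism limit is left unchanged.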

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36618>
Francisco Jerez 2025-07-16 15:52:22 -07:00 committed by Marge Bot
parent 35ac517780
commit 74168a601e
2 changed files with 16 additions and 0 deletions

@@ -282,6 +282,21 @@ ra_class_add_reg(struct ra_class *class, unsigned int r)
   class->p++;
}

/**
 * Overrides the P value of a register class to be below the
 * automatically calculated one in order to force optimistic
 * allocation above this threshold even for nodes that are trivially
 * colorable, which can be used to reduce the register requirement of
 * programs on platforms where increasing register use imposes limits
 * on thread parallelism.
 */
void
ra_class_override_p(struct ra_class *class, unsigned int p)
{
   assert(p <= class->p);
   class->p = p;
}

/**
 * Returns true if the register belongs to the given class.
 */

@@ -67,6 +67,7 @@ ra_add_transitive_reg_pair_conflict(struct ra_regs *regs,
void ra_make_reg_conflicts_transitive(struct ra_regs *regs, unsigned int reg);
void ra_class_add_reg(struct ra_class *c, unsigned int reg);
void ra_class_override_p(struct ra_class *c, unsigned int p);
struct ra_class *ra_get_class_from_index(struct ra_regs *regs, unsigned int c);
void ra_set_finalize(struct ra_regs *regs, unsigned int **conflicts);