So we don't have to do it in the traversal loop. Should 2 and
instructions and a 64-bit shift, so 4/8 cycles per instance node
visit.
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 208460 -> 208292 (-0.08%)
Instrs: 38276 -> 38248 (-0.07%)
Latency: 803181 -> 803142 (-0.00%)
InvThroughput: 165384 -> 165376 (-0.00%)
Copies: 4912 -> 4905 (-0.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>