This has several advantages over the u_
- not hand-written
- no intermediate memcpy of raw pixels
- supports 4 ubytes in addition to floats
- no need to pass a pipe_transfer
It also has (hopefully temporary) limitations:
- no support for YUV
- no support for sRGB