These act as a vector OpCopy, except that copy-prop can't see through
them and the destination of OpPin gets pinned in the register file and
is not allowed to move. Of course, we have to be careful with these
because spilling can't spill them, either. If we have too many live
pinned values at the same time, spilling or RA may fail.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Unlike the pinned set in VecRegAllocator which exists for the duration
of an instruction, registers which are pinned in the main allocator are
pinned until the register is freed. The pinned set in VecRegAllocator
is initialized to a copy of the one in the main register allocator.
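
A rough sketch of the two lifetimes, for illustration only. MainAlloc,
VecAlloc, and their methods are made-up stand-ins and not the real
allocator types; only the copy-on-init and free-unpins behavior is taken
from the description above.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the main allocator's state for one file.
#[derive(Default)]
struct MainAlloc {
    used: HashSet<u32>,
    pinned: HashSet<u32>,
}

impl MainAlloc {
    fn alloc(&mut self, reg: u32, pin: bool) {
        self.used.insert(reg);
        if pin {
            self.pinned.insert(reg);
        }
    }

    /// Freeing a register is the only thing that drops its pin.
    fn free(&mut self, reg: u32) {
        self.used.remove(&reg);
        self.pinned.remove(&reg);
    }

    fn is_used(&self, reg: u32) -> bool {
        self.used.contains(&reg)
    }
}

/// Hypothetical stand-in for the per-instruction vector allocator.
struct VecAlloc {
    pinned: HashSet<u32>,
}

impl VecAlloc {
    /// Start from a copy of the long-lived pins.
    fn new(main: &MainAlloc) -> Self {
        VecAlloc { pinned: main.pinned.clone() }
    }

    /// Pins added here only last as long as this VecAlloc does.
    fn pin(&mut self, reg: u32) {
        self.pinned.insert(reg);
    }

    fn may_move(&self, reg: u32) -> bool {
        !self.pinned.contains(&reg)
    }
}
```

Because the per-instruction set is a copy, pins taken while handling one
instruction never leak back into the main allocator.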
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
The really tricky case here is phis, which may have a uniform def even
though some of the srcs are non-uniform. This happens because of the
restriction elsewhere that requires UGPRs and UPreds to only ever be
written in uniform control-flow.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Because we go in and out of SSA, all the phis get re-created and the new
phis will default to divergent. This little pass attempts to prove as
many of the phis convergent as possible.
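
One way such a pass could be structured is an optimistic fixed point,
sketched below. The IR (Phi, PhiSrc) is hypothetical, and the rule used
here (a phi is uniform only if it sits in uniform control flow and every
source is uniform) is a simplifying assumption for the sketch, not
necessarily the rule the real pass applies.

```rust
/// Where a phi source comes from (hypothetical, simplified IR).
#[derive(Clone, Copy)]
enum PhiSrc {
    /// A non-phi value whose uniformity is already known.
    Known { is_uniform: bool },
    /// The result of another phi, by index.
    Phi(usize),
}

struct Phi {
    /// Whether the phi's block executes in uniform control flow.
    in_uniform_cf: bool,
    srcs: Vec<PhiSrc>,
}

/// Returns, for each phi, whether it could be proven uniform.
fn prove_uniform_phis(phis: &[Phi]) -> Vec<bool> {
    // Optimistic start: only control flow rules a phi out up front.
    let mut uniform: Vec<bool> =
        phis.iter().map(|p| p.in_uniform_cf).collect();

    // Demote until nothing changes.  Each pass can only flip true to
    // false, so the loop terminates.
    let mut progress = true;
    while progress {
        progress = false;
        for (i, phi) in phis.iter().enumerate() {
            if !uniform[i] {
                continue;
            }
            let all_srcs_uniform = phi.srcs.iter().all(|s| match *s {
                PhiSrc::Known { is_uniform } => is_uniform,
                PhiSrc::Phi(j) => uniform[j],
            });
            if !all_srcs_uniform {
                uniform[i] = false;
                progress = true;
            }
        }
    }
    uniform
}
```

Starting optimistic matters for loops: two phis that only feed each
other and otherwise take uniform values stay uniform, whereas a
pessimistic start could never prove either one.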
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
We know this is wrong. In many cases, they're faster than warp
instructions, sometimes with a latency as low as 2. However, there seem
to be a bunch of exceptions we don't understand and it's better to be
more conservative and have correct shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This time we take into account WaR and WaW dependencies and not just RaW
dependencies. The NVIDIA ISA is actually quite dynamic and not
everything is nicely pipelined such that writes always happen at
consistent cycles. There are exact rules, of course, but we don't know
what those are so we need to make some worst-case assumptions.
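
For reference, the three dependency kinds can be told apart from the
register read and write sets of an earlier and a later instruction. The
sketch below uses hypothetical types (Access, Hazard) and plain register
numbers; it only classifies hazards and deliberately says nothing about
how many cycles each one needs, since those rules are not known.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Hazard {
    /// Later instruction reads what the earlier one writes.
    RaW,
    /// Later instruction overwrites what the earlier one still reads.
    WaR,
    /// Both instructions write the same register.
    WaW,
}

/// Registers an instruction reads and writes (hypothetical).
struct Access {
    reads: Vec<u32>,
    writes: Vec<u32>,
}

fn overlaps(a: &[u32], b: &[u32]) -> bool {
    a.iter().any(|r| b.contains(r))
}

/// Every hazard kind that orders `later` after `earlier`.
fn hazards(earlier: &Access, later: &Access) -> Vec<Hazard> {
    let mut out = Vec::new();
    if overlaps(&earlier.writes, &later.reads) {
        out.push(Hazard::RaW);
    }
    if overlaps(&earlier.reads, &later.writes) {
        out.push(Hazard::WaR);
    }
    if overlaps(&earlier.writes, &later.writes) {
        out.push(Hazard::WaW);
    }
    out
}
```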
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
UGPRs in warp instructions are treated more like cbufs than GPRs.
You're only allowed to have one and it has to share space with the
possible cbuf or immediate. This means we need to treat them as a "not
a register" case for warp instructions but as a register for uniform
instructions.
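
As an illustration, the rule can be modeled as a single shared
"non-register" slot. SrcKind and both functions below are hypothetical
names, and the at-most-one limit is read off the description above
rather than from any hardware documentation. The sketch also does not
check whether a GPR is legal in a uniform instruction at all.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SrcKind {
    Gpr,
    Ugpr,
    CBuf,
    Imm,
}

/// Whether a source competes for the shared cbuf/immediate slot.
fn uses_non_reg_slot(src: SrcKind, op_is_uniform: bool) -> bool {
    match src {
        SrcKind::Gpr => false,
        // A UGPR is an ordinary register only for uniform instructions.
        SrcKind::Ugpr => !op_is_uniform,
        SrcKind::CBuf | SrcKind::Imm => true,
    }
}

/// True if at most one source needs the shared slot.
fn srcs_fit(srcs: &[SrcKind], op_is_uniform: bool) -> bool {
    srcs.iter()
        .filter(|&&s| uses_non_reg_slot(s, op_is_uniform))
        .count()
        <= 1
}
```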
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This requires a pretty significant rework of encode_alu_base(). In
particular, we can't know the register file that's going to be used
until we get into encode_alu_base() so ALUSrc::from_src() can't handle
Zero itself. Instead, we defer to a new ALUSrc::with_op_uniformity()
helper which does a postprocess step.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This makes encode_alu() take Option<&Src> and call ALUSrc::from_src()
itself. This is necessary for handling uniform ALU correctly as we
can't actually separate Reg from UReg without knowing what kind of ALU
op we are. While we're here, take an Option<&Dst> instead of
Option<Dst>.
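
The Option<&Dst> part is ordinary Rust borrowing. A tiny sketch with a
made-up Dst type, a made-up NO_DST placeholder value, and a made-up
dst_field() helper (none of these are the real encoder's definitions):

```rust
/// Hypothetical destination type; deliberately not Copy or Clone.
struct Dst {
    reg: u8,
}

/// Made-up placeholder for "no destination".
const NO_DST: u8 = 0xff;

/// Borrowing the destination means callers keep ownership of it.
fn dst_field(dst: Option<&Dst>) -> u8 {
    dst.map_or(NO_DST, |d| d.reg)
}
```

A caller holding an Option<Dst> can pass opt.as_ref() and still use the
value afterwards.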
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Once we start using UGPRs, it's possible to have a vector with a mix of
GPRs and UGPRs. This isn't actually allowed by the hardware but it's
possible as an intermediate state thanks to copy-propagation.
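
Detecting the mixed case is a matter of checking that every component
agrees on a file. A minimal sketch, with a hypothetical two-entry
RegFile enum and a hypothetical common_file() helper:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum RegFile {
    Gpr,
    Ugpr,
}

/// The file shared by all components, or None if the vector is empty
/// or mixes files.
fn common_file(comps: &[RegFile]) -> Option<RegFile> {
    let (&first, rest) = comps.split_first()?;
    rest.iter().all(|&f| f == first).then_some(first)
}
```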
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
We're allocating one register file at a time and our invariants are
per-file so we don't want to check the components assumption until we've
checked that it uses the active file.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>