We know this is wrong. In many cases, they're faster than warp
instructions, sometimes with a latency as low as 2. However, there seem
to be a bunch of exceptions we don't understand and it's better to be
more conservative and have correct shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This time we take into account WaR and WaW dependencies and not just RaW
dependencies. The NVIDIA ISA is actually quite dynamic and not
everything is nicely pipelined such that writes always happen at
consistent cycles. There are exact rules, of course, but we don't know
what those are so we need to make some worst-case assumptions.
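
For illustration only, the three hazard kinds can be sketched as below.
This is not the NAK code; the `reads`/`writes` sets and the `hazards()`
helper are hypothetical names made up for this example.

```python
def hazards(earlier, later):
    """Return the hazard kinds that force `later` to wait on `earlier`.

    Each instruction is a dict with hypothetical "reads" and "writes"
    sets of register names.
    """
    kinds = set()
    if earlier["writes"] & later["reads"]:
        kinds.add("RaW")  # later reads what earlier wrote
    if earlier["reads"] & later["writes"]:
        kinds.add("WaR")  # later overwrites what earlier still reads
    if earlier["writes"] & later["writes"]:
        kinds.add("WaW")  # both write the same register
    return kinds
```

Tracking only RaW would miss the last two cases, which matter when
writes do not land at a fixed cycle.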
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
UGPRs in warp instructions are treated more like cbufs than GPRs.
You're only allowed to have one and it has to share space with the
possible cbuf or immediate. This means we need to treat them as a "not
a register" case for warp instructions but as a register for uniform
instructions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This requires a pretty significant rework of encode_alu_base(). In
particular, we can't know the register file that's going to be used
until we get into encode_alu_base() so ALUSrc::from_src() can't handle
Zero itself. Instead, we defer to a new ALUSrc::with_op_uniformity()
helper which does a postprocess step.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
This makes encode_alu() take Option<&Src> and call ALUSrc::from_src()
itself. This is necessary for handling uniform ALU correctly as we
can't actually separate Reg from UReg without knowing what kind of ALU
op we are encoding. While we're here, take an Option<&Dst> instead of
Option<Dst>.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Once we start using UGPRs, it's possible to have a vector with a mix of
GPRs and UGPRs. This isn't actually allowed by the hardware but it's
possible as an intermediate state thanks to copy-propagation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
We're allocating one register file at a time and our invariants are
per-file so we don't want to check the components assumption until we've
checked that it uses the active file.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Instead, do the same thing we do for float modifiers and use OpIAdd2 or
OpIAdd3. This makes for a little more work in copy-prop but the extra
opcode and lowering pass just isn't worth it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>
Remove a legacy workaround where the presence of modifiers in framebuffer
state results in `needs_present` being set without a good reason.
This prevents hitting an assertion for framebuffers that use DRM
modifiers, e.g. via GBM BO alloc -> EGLImage import -> GL FBO bind.
Co-authored-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29715>
Here we test that the rematerialization of the deref produces valid NIR
when both the deref and the array index value are moved to the else
branch of the first terminator during the merge.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>
We were not restoring an outer loop as the current loop after we had
finished processing a nested loop.
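
A rough sketch of the pattern, not the NIR pass itself: the tree shape,
`state["current_loop"]` and `visit()` are hypothetical names used only
to show the save/restore around a nested loop.

```python
def visit(node, state, log):
    """Walk a toy control-flow tree, recording each instruction's loop."""
    if node["kind"] == "loop":
        outer = state["current_loop"]      # remember the enclosing loop
        state["current_loop"] = node["name"]
        for child in node["body"]:
            visit(child, state, log)
        state["current_loop"] = outer      # restore it once the nested loop is done
    else:
        log.append((node["name"], state["current_loop"]))
```

Without the restore step, anything visited after the inner loop would
still be attributed to the inner loop.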
Fixes: 9995f336e6 ("nir: add merge loop terminators optimisation")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>
When the namespace contains a dash, this method cannot properly
recognize the fields in a URL. It is better to use a regular expression
that explicitly defines the fields. The exception raised when the
pattern is not recognized should also be more helpful to the handler.
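
A minimal sketch of the idea, assuming a pipeline URL of the form
`https://<host>/<namespace>/<project>/-/pipelines/<id>`. The pattern,
the `parse_pipeline_url()` name and the choice of `ValueError` are
assumptions for illustration, not the code in this change.

```python
import re

# Hypothetical pattern: each field is a named group, so a dash inside
# the namespace or project cannot be confused with the "/-/" separator.
PIPELINE_URL = re.compile(
    r"^https?://(?P<host>[^/]+)"
    r"/(?P<namespace>[^/]+)"
    r"/(?P<project>[^/]+)"
    r"/-/pipelines/(?P<pipeline_id>\d+)/?$"
)


def parse_pipeline_url(url):
    """Split a pipeline URL into its fields or raise a descriptive error."""
    match = PIPELINE_URL.match(url)
    if match is None:
        raise ValueError(f"unrecognized pipeline URL: {url!r}")
    fields = match.groupdict()
    fields["pipeline_id"] = int(fields["pipeline_id"])
    return fields
```

A namespace such as `some-user` is then matched as one field, and a
malformed URL fails with a message that names the offending input.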
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29683>