Fixes RA failure for a parallel-rdp shader on pitcairn.
fossil-db (Navi):
Totals from 2 (0.00% of 128733) affected shaders:
CodeSize: 203656 -> 205724 (+1.02%)
Instrs: 32267 -> 32529 (+0.81%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5544>
If on a nested loop
- the outer loop needs WQM but
- the inner loop doesn't need WQM and
- the break condition of the inner loop is computed in the outer loop
then it could happen that we transitioned to Exact before entering the inner loop
which could create an empty exec mask and lead to an infinite loop.
Fixes a GPU hang with RDR2
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5518>
These should all check if the operand has a regclass. The opsel check
should also be skipped post-RA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5504>
If the mode is too small, the operand will be an inline constant and the
literal dword won't be written.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
Some instructions such as v_mad_f16 can do partial writes on GFX9.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
This worked because the optimizer didn't consider that the 16-bit
instruction would interpret the inline constant differently. This will
change in the next commit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
p_create_vector can have sub-dword operands with a v1 definition.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
v_mad_f32 doesn't support denormals but v_fma_f32 does.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
This check was added because I thought v_mad_f32 didn't preserve the
signess of zero, but I can't reproduce that and this isn't mentioned
anywhere in LLVM.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
Use sub-dword definitions so that the RA can use SDWA
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
Shared subdword loads don't need byte alignment as they are split
into multiple loads if necessary.
Fixes: 5cde4989d3 ('aco: remove unnecessary split- and create_vector instructions for subdword loads')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441>
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
I think this case was just missing.
This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
As there are no SDWA instructions, we need to take care not to overwrite
the upper bits of other copy_operation's operands.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>