It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly
be handled as a pure VALU instruction.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
Similar to emit_gfx10_wave64_bpermute, but uses the new
v_permlane64_b32 instruction to swap data between wave halves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>
Dual source blending must be in strict WQM mode.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19643>
The first operand of this new pseudo-instruction is a 64-bit SGPR for
the continue PC, followed by a variable list of fixed VGPRS for the
color exports which are the PS epilog inputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
Instead of SDWA v_mov_b32/v_xor_b32, we can use a combination of
v_add_u16/v_sub_u16 (add/sub swap, similar to xor swap) and v_perm_b32
with a literal.
I don't know yet if GFX11 adds any new instructions which makes this
easier, but this approach should have full functionality.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16595>
Like how they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Makes p_extract more flexible and prepares it to be usable
for other use cases.
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11560>
Instead, we can rely on the fact that subdword definitions
must preserve the unused bits while dword definitions either
pad or sign-extend.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
This commit introduces a new struct SubdwordSel
in order to ease and clean up the usage of SDWA
selections. This includes removing the distinction
between register-allocated and fixed SDWA selections.
Instead, SDWA selections can now also access the high
bits of subdword variables. Alignment and sizes are
validated accordingly. Size, offset and sign_extend
can be evaluated via helper methods.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
Previously there hasn't been any validation for these instructions,
but after shooting myself in the leg with it a few times, I decided
to add the validation now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
Because isVOPn() is true for many VOP3, SDWA and DPP instructions, this
would often not complain.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
These will let us make the SDWA optimizer much simpler than if we were to
recognize combinations of shift/and/bfe.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
By exposing a subgroupSize of 64, reductions with
cluster_size 32 in wave32 might be considered divergent,
and thus, result in a VGPR.
Fixes: dEQP-VK.subgroups.clustered.graphics.subgroupclustered* with wave32
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10769>
This is used when printing the program and to avoid updating register
demand during post-RA liveness analysis.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>
p_extract_vector copy-propagation can create byte sels for v2b operands.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>
It isn't intended to be accurate after RA, so num_waves can become zero,
breaking the sgpr_limit calculation.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>