s_wqm_b64 clobbers SCC.
Found this while working on dual source blending.
Fixes: 6113ee650a ("aco/gfx11: fix FS input loads in quad-divergent control flow")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19747>
In a lot of situations the previous exec value was already copied from the
same registers that exec should be saved to. In that case we don't have to
insert an extra copy to save exec.
This breaks SSA, but this pass is going out of SSA anyway.
Foz-DB Navi21:
Totals from 16129 (11.96% of 134913) affected shaders:
CodeSize: 128184044 -> 128054468 (-0.10%)
Instrs: 23902694 -> 23870325 (-0.14%)
Latency: 387124324 -> 387095955 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 79949118 -> 79945859 (-0.00%); split: -0.01%, +0.00%
Copies: 1620768 -> 1588752 (-1.98%); split: -1.99%, +0.01%
Foz-DB Vega10:
Totals from 15546 (11.51% of 135041) affected shaders:
CodeSize: 120322524 -> 120200568 (-0.10%)
Instrs: 23448344 -> 23417855 (-0.13%)
Latency: 414018749 -> 413639289 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 183819363 -> 183726539 (-0.05%); split: -0.05%, +0.00%
Copies: 2194937 -> 2164448 (-1.39%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18528>
6698753cdb switched our GS output stores to use MUBUF.
The stride doesn't matter for the ESGS descriptor (because idxen=false and
the index stride is 64), but this fixes it anyway.
This also changes ACO to use MUBUF stores, since MTBUF doesn't seem to
work correctly with an invalid data format in the descriptor.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6698753cdb ("ac/llvm: don't use tbuffer_store as a fallback for swizzled stores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18885>
Foz-DB Navi21:
Totals from 2403 (1.78% of 134913) affected shaders:
CodeSize: 25329156 -> 25311244 (-0.07%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19413>
FMask doesn't exist on GFX11. Have txf_ms take the fragment_fetch_amd
path.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19375>
The LLVM backend does this when lowering ordered_xfb_counter_add_amd. I
guess there is some missing dependency checking or something.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345>
The TES rel patch id is <256, so we can use an existing unused LDS
byte instead of an extra dword.
To ease the programming, change the index of repacked_arg_vars
for these variables.
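
A minimal sketch of the packing idea, not the code from this change:
since the value fits in 8 bits, it can live in a spare byte of a dword
that is already stored. Which byte is actually unused is not stated
here, so kRelPatchIdShift and both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration only: the top byte of the dword is free.
constexpr unsigned kRelPatchIdShift = 24;
constexpr uint32_t kByteMask = 0xffu;

// Store an 8-bit rel patch id in the spare byte, keeping the other 24 bits.
constexpr uint32_t pack_rel_patch_id(uint32_t dword, uint32_t rel_patch_id)
{
   return (dword & ~(kByteMask << kRelPatchIdShift)) |
          ((rel_patch_id & kByteMask) << kRelPatchIdShift);
}

// Read the 8-bit rel patch id back out of the spare byte.
constexpr uint32_t unpack_rel_patch_id(uint32_t dword)
{
   return (dword >> kRelPatchIdShift) & kByteMask;
}

// Compile-time check: the payload in the low 24 bits is untouched.
static_assert(pack_rel_patch_id(0x00abcdefu, 200u) == 0xc8abcdefu, "pack");
static_assert(unpack_rel_patch_id(0xc8abcdefu) == 200u, "unpack");
```

The trade-off is one extra shift and mask on each access in exchange
for not spending a whole dword of LDS on the value.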
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18832>
According to LLVM, s_sendmsg_rtn(GET_REALTIME) should be used instead
of s_memrealtime.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267>
Apparently the TLS constructor doesn't work well if RADV
is instantiated multiple times and/or used by a program with
already existing threads.
Fixes: a128d444cb ('aco: use monotonic_buffer_resource for instructions')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19219>
Copy the information from the first predecessor then check whether
it matches other predecessors and modify the data accordingly.
Marked for backporting to stable to make it possible to also
backport fixes based on this.
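
A rough sketch of the merge described above, under simplified
assumptions: per-register information is modeled as a plain int, and
any mismatch between predecessors is collapsed to a single sentinel.
RegInfo, kUnknown, kNumRegs and merge_predecessors are hypothetical
names, not the ones used in the pass.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kUnknown = -1;         // hypothetical "no usable info" marker
constexpr std::size_t kNumRegs = 4;  // tiny register file for illustration
using RegInfo = std::array<int, kNumRegs>;

// Start from the first predecessor's state, then downgrade every register
// whose state differs in any other predecessor.
inline RegInfo merge_predecessors(const std::vector<RegInfo>& preds)
{
   RegInfo result;
   result.fill(kUnknown);
   if (preds.empty())
      return result;

   result = preds[0];
   for (std::size_t p = 1; p < preds.size(); ++p) {
      for (std::size_t r = 0; r < kNumRegs; ++r) {
         if (result[r] != preds[p][r])
            result[r] = kUnknown;
      }
   }
   return result;
}
```

Copying first and comparing afterwards means a block with a single
predecessor pays only for the copy.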
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
According to perf, this roughly halves the impact of the post-RA
optimizer in ACO's compile times.
Measurement was taken using a debug optimized build using
NIR_DEBUG=novalidate RADV_DEBUG=nocache and replaying the Fossil DB
from the Doom Eternal shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
This doesn't make sense: opsel preserves the unselected half of the register,
while p_insert zeros it.
No Foz-DB changes.
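
To illustrate the mismatch, here is a small model of the two behaviours
on a 32-bit register holding two 16-bit halves. It only mirrors the
semantics described above; write_hi_opsel and insert_zeroing are
hypothetical helper names, not ACO functions.

```cpp
#include <cassert>
#include <cstdint>

// opsel-style write to the high half: the low half of dst is preserved.
constexpr uint32_t write_hi_opsel(uint32_t dst, uint16_t value)
{
   return (dst & 0x0000ffffu) | (static_cast<uint32_t>(value) << 16);
}

// p_insert-style write: the value lands in half `index` (0 or 1) and
// every other bit of the result is zero.
constexpr uint32_t insert_zeroing(uint16_t value, unsigned index)
{
   return static_cast<uint32_t>(value) << (16u * index);
}

// Same value, same target half, different results in the other half.
static_assert(write_hi_opsel(0x1234abcdu, 0xbeefu) == 0xbeefabcdu, "opsel");
static_assert(insert_zeroing(0xbeefu, 1) == 0xbeef0000u, "p_insert");
```

The two results differ whenever the unselected half of the destination
is non-zero, so one cannot stand in for the other.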
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 54292e99c7 ("aco: optimize 32-bit extracts and inserts using SDWA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19253>
This reverts fcd53bebe6 ("aco: Define NOMINMAX in Meson build file")
because 852d91edcd ("windows: Always set NOMINMAX to remove min/max macros")
already does the same thing.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19233>
I misread the ISA doc and got the order wrong.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: dae1629778 ("aco: disable sdwa on gfx11")
Fixes: e68e6c75ca ("aco: use v_perm_b32 to copy 0xff00/0x00ff/0xff/0x00")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19223>
Now that we added an index src to the NIR intrinsic, it can
happen that these generate MUBUF instructions which have both
an index and an offset.
Extend this ACO optimization to the case when idxen is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
Also modify all existing uses to pass a zero to this new src.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
Remove unused arguments, clean up allow_combining vs. swizzled etc.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>
Previously, we always treated these as coherent, but now let's make
this configurable. Also set all current users to ACCESS_COHERENT.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> (nir)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17551>