NIR doesn't have atomic loads/stores, so we have to workaround that with
this for dEQP-VK.memory_model.* to pass.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>
Previously, we've set element_size == 16 which causes loads from
packed vec3 arrays to cross the boundary and return wrong data.
This patch sets element_size = 4 and splits loads into single channel.
Fixes all of dEQP-VK.subgroups.ballot_broadcast.*
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5977>
This (combined with a pass to actually set the corresponding NIR flags)
should help fix a lot of the regressions from the SMEM addition combining
change.
fossil-db (Navi):
Totals from 12 (0.01% of 135946) affected shaders:
CodeSize: 12376 -> 12304 (-0.58%)
Instrs: 2436 -> 2422 (-0.57%)
VMEM: 1105 -> 1096 (-0.81%)
SClause: 133 -> 130 (-2.26%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
SMEM does the addition with 64-bits, not 32. So if the original code
relied on wrapping around (for example, for subtraction), it would break.
Apparently swizzled MUBUF accesses also have issues with combining
additions that could overflow. Normal MUBUF accesses seem fine.
fossil-db (Navi):
Totals from 27219 (20.02% of 135946) affected shaders:
CodeSize: 128303256 -> 131062756 (+2.15%); split: -0.00%, +2.15%
Instrs: 24818911 -> 25280558 (+1.86%); split: -0.01%, +1.87%
VMEM: 162311926 -> 177226874 (+9.19%); split: +9.36%, -0.17%
SMEM: 18182559 -> 20218734 (+11.20%); split: +11.53%, -0.34%
VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20%
SClause: 865384 -> 1104986 (+27.69%); split: -0.00%, +27.69%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
bounds_ctrl is set to true by default which works around some game bugs,
but that isn't enough on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>
Otherwise, we might have both VS and TCS code in the same block but float
controls are set per-block.
We also rely on VS code not dominating TCS code for the optimizer to work
correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>
This won't add interferences between spill ids of different types and will
exit early if there's already an interference.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>
I don't think this is much of an optimization in the typical case, but for
very complex shaders this should work much better.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>
fossil-db (Navi):
Totals from 167 (0.12% of 135946) affected shaders:
CodeSize: 484892 -> 482628 (-0.47%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>
This is needed since we will be lowering some 8/16-bit shuffles to
masked_swizzle_amd.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5695>
Even though the boolean can be divergent, the control flow can be (at
least partially) uniform. For example, we don't have to create any
s_andn2_b64/s_and_b64/s_or_b64 instructions with this code:
a = ...
loop {
b = bool_phi a, c
if (uniform)
break
c = ...
}
d = phi c
fossil-db (Navi):
Totals from 5506 (4.05% of 135946) affected shaders:
SGPRs: 605720 -> 604024 (-0.28%)
SpillSGPRs: 52025 -> 51733 (-0.56%)
CodeSize: 65221188 -> 64957808 (-0.40%); split: -0.41%, +0.00%
Instrs: 12637881 -> 12584610 (-0.42%); split: -0.42%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3388>
It looks like the attempt to fix this in 1e791e51a6 was incomplete.
This fixes crashes with Devil May Cry 5 with a debug build.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5828>
fossil-db (Pitcairn):
Totals from 2172 (1.58% of 137414) affected shaders:
CodeSize: 7109080 -> 7100100 (-0.13%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5623>