These should be handled like their non-_block counterparts - there is no i/o
index for them.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40096>
We currently have code to lower quad votes to a ballot. The same idea works for
subgroup votes. Generalize the quad vote code and use it to lower
vote_all/vote_eq for backends setting a new lower_vote option.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40074>
It's pretty much the same mechanism, except it's a different register
location.
With this change we gain indirect loading support.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
Noticed while playing with pixel coord things.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40056>
Pack/unpack should be a lot faster than duplicating the subgroup op.
No fossil-db changes, but multiple people complained about this to me.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40024>
One of the advantages to this new FB load shader, apart from it being
common, is that it's able to properly handle partial tile loads.
Instead of doing the force_preload/clear dance that PanVK is currently
doing, these shaders are clever enough to detect whether or not they're
inside the Vulkan render area and clear the inside while loading the
border pixels.
In order for this to work, there are two new intrinsics which provide
the framebuffer bounding box and the clear values. We need this in
order to handle partial loads correctly.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
Fixes: 6fc1030e4f ("nir: Add some new panfrost fragment shader intrinsics")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
Prefer to be explicit when handling it, like is done for regular
nir_instr_type_call.
Even though functions called by cmat_call have restrictions on them ("no
tangled instructions" for example), which could allow a couple of passes
to treat them differently, there's no tracking of what functions are
used only in such cases, so being conservative here should be safe.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39903>
This is mostly about conversions.
Conversions from float to int don't care about signed zero
and in the case of plain f2u/f2i, nan and inf are always
undefined too.
Conversions for int to float can't create nan, so they don't
need preserve_nan.
b2f only cares about preserve_sz, and nothing else.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
This complements our existing nir_get_io_index_src helper. Most, but annoyingly
not all, stores put their data source in source 0. Having a helper for this lets
us reduce special casing in a bunch of random places.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
Add load_texture_scale to the list of intrinsics whose divergence
depends on their sources. This is needed to support running divergence
analysis on VC4.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
This was used because the exact bit meant something different for
comparisons than it did for the replacement expression, but that isn't the
case anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
fossil-db (navi31):
Totals from 2 (0.00% of 84369) affected shaders:
Instrs: 7738 -> 7740 (+0.03%)
Latency: 333207 -> 333239 (+0.01%)
InvThroughput: 33320 -> 33324 (+0.01%)
VClause: 382 -> 384 (+0.52%)
VMEM: 656 -> 658 (+0.30%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
This would fix both stores 'b' and 'c' from being vectorized:
a = load(0)
loop {
b = load(0)
if (break)
store(0)
}
c = load(0)
fossil-db (navi31):
Totals from 8 (0.01% of 84369) affected shaders:
Instrs: 12035 -> 12066 (+0.26%)
CodeSize: 63016 -> 63208 (+0.30%)
Latency: 176091 -> 177013 (+0.52%)
InvThroughput: 43894 -> 43981 (+0.20%)
SClause: 194 -> 196 (+1.03%)
Copies: 803 -> 812 (+1.12%)
VALU: 7666 -> 7675 (+0.12%)
SALU: 1102 -> 1105 (+0.27%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
Match other SSBO intrinsics and other atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39895>
Surprisingly, this has an effect on GFX1201:
Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)
I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866>