Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
Move lowering to nir_lower_subgroups. At some point Intel
backend might want to skip that and lower at the backend IR
boundary, but for now lowering always applies.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>
This codepath had a bug (always setting `elems[0]`) since it was last
reworked, but there's no subgroup instruction that uses this helper and
support Composites, so it can be replace with an assert.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40356>
We can produce a transposed value sometimes, and we have to make sure
that val->transposed is also updated when that happens.
Noticed by inspection after the previous commit.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40017>
This comes up in lowered load_ubo sequences (observed in OpenCL test
test_api min_max_parameter_size). Hopefully the pack gets coalesced,
it's like nir_op_vec2 on most backends, so it should usually be ok to
sink even though the register pressure heuristic will reject it.
Allowing it to sink allows the UBO load to sink.
Intel's backend scheduler can optimize the relevant sequences locally
but there should still be a win here for global load sinking.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40267>
ministat (nir_analyze_fp_class):
Difference at 95.0% confidence
-201983 +/- 1064.87
-9.31575% +/- 0.0468505%
(Student's t, pooled s = 1257.67)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40346>
ministat (nir_analyze_fp_class):
Difference at 95.0% confidence
-4484.55 +/- 1288.68
-0.205419% +/- 0.0589514%
(Student's t, pooled s = 1521.99)
This should also use less memory.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40346>
Currently the fixed function vertex shader is built as io_lowered
shaders; however the gl_nir_add_point_size() function currently expects
the original shader to be not io_lowered, and this function is called to
lower the fixed function vertex shader.
Add support for adding point size store_output intrinsics for io_lowered
shaders.
This fixes fixed function rendering on Zink with a Vulkan driver w/o
VK_KHR_maintence5.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40373>
This means we don't run into undefined behavior when testing nan/inf inputs.
Also make sure that patterns using is_only_used_as_float are signed zero correct.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>
These can never come from the API but there's a few cases where panvk
wants them.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>
When masking out of bounds image loads, we previously returned a vector
of all zeros. However, for robustImageAccess2, depending on the format,
some components such as the alpha channel in an RGB format
should evaluate to 1.
This corrects the replacement value based on the format swizzle.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39430>
Code-motion should not move back upconversions without any other
instruction, that would only increase memory pressure without any
significant performance benefit (conversions are usually cheap).
This should also help lowering mediump varyings early by not reversing
their work.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40273>
On SM86+, we can use a 16-bit unsigned offset along side the register
for it.
This adds a new base indice that will be used for it, integration with
nir_opt_offsets and a lowering pass to get ride of the base on
unsupported generations.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
Adds a new intrinsic allowing to do raw write in the various ISBE spaces
where attributes are stored.
This also adapt isberd_nv to map to what we have since SM70+.
This will be used to support mesh shaders.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
It's the only driver that uses the pass so it may as well go there.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
Previously, we assumed that the selector for bcsel could be whatever,
regardless of the bit sizes of the data and we'd just fix it in the
back-end. This works okay for scalars but falls over the moment we
vectorize because all our vector handling assumes bit sizes match.
Since matching bit sizes is what the hardware wants anyway, it's better
to do the right thing in NIR and hope copy-propagation can fold in
conversions if needed.
Unfortunately, copy prop isn't that smart yet so this does hurt a bit:
Instrs: 1193679 -> 1198086 (+0.37%); split: -0.06%, +0.43%
CodeSize: 11915136 -> 11950592 (+0.30%); split: -0.05%, +0.34%
Full: 160985 -> 160941 (-0.03%); split: -0.04%, +0.01%
Estimated normalized CVT cycles: 4456.938557000181 -> 4480.876069000186 (+0.54%); split: -0.13%, +0.67%
Estimated normalized SFU cycles: 6350.9375 -> 6392.21875 (+0.65%)
Estimated normalized Load/Store cycles: 205773.0 -> 205795.0 (+0.01%)
Maximum number of threads: 12864 -> 12863 (-0.01%)
Number of spill instructions: 22487 -> 22489 (+0.01%)
Number of fill instructions: 52179 -> 52219 (+0.08%)
Hurt shaders:
google-meet-clvk/BgBlur
google-meet-clvk/Relight
parallel-rdp/small_subgroup
parallel-rdp/small_uber_subgroup
The proper solution here is to teach copy-prop about this stuff so that
it can propagate swizzles into ALU ops when they're supported:
https://gitlab.freedesktop.org/panfrost/mesa/-/issues/265
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14945
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
Use is_scalar to know if we can do transpose loading.
Also enable vectorization if 2 intrinsics share the same source (it
means the only difference is the base).
Fixes: e14d6b535c ("brw/nir: add new intrinsics to load data from the indirect address")
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>
Useful for smaller/larger loads. Also there is no reason to be bitsize
specific here if we use an signed constant.
Foz-DB Navi48:
Totals from 8 (0.01% of 114655) affected shaders:
Instrs: 7629 -> 7612 (-0.22%)
CodeSize: 40772 -> 40692 (-0.20%)
Latency: 54880 -> 54944 (+0.12%)
InvThroughput: 8879 -> 8880 (+0.01%); split: -0.08%, +0.09%
VALU: 4029 -> 4027 (-0.05%); split: -0.15%, +0.10%
SALU: 1260 -> 1249 (-0.87%)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40292>