Using ior here is equivalent to using uadd_sat, but works for every driver
and shouldn't hurt anywhere.
I forgot to fix this up when fixing up some vvl errors with zink.
Fixes crashes with the integer_ctz CL CTS tests in zink.
Fixes: 39ec184db6 ("zink: lower 64 bit find_lsb, ufind_msb and bit_count")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30535>
../src/compiler/nir/nir_lower_int64.c: In function ‘lower_int64_intrinsic’:
../src/compiler/nir/nir_lower_int64.c:1347:1: error: control reaches end of non-void function [-Werror=return-type]
1347 | }
Fixes: bf7a114246
nir/lower_int64: Add lowering for some 64-bit subgroup ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
Appendix A: Vulkan environemtn for SPIR-V says:
Operations described as “correctly rounded” will return the infinitely
precise result, x, rounded so as to be representable in
floating-point. The rounding mode is not specified, unless the entry
point is declared with the RoundingModeRTE or the RoundingModeRTZ
Execution Mode.
Conversion between types are classified as correctly rounded, so let's
do rounding correctly.
v2: check rounding mode for destination bit size (Georg)
Fixes upcoming Vulkan CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_vert
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_uint64_up_frag
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.rounding_rtz_conv_from_int64_up_frag
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25281>
If the high 32 bits were zero, this would be umin(find_lsb(lo), 31). This
evaluates to 31 if lo is also zero, instead of -1.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 9293d8e64b ("nir: Add find_lsb lowering to nir_lower_int64.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25409>
Instead, we replace every use of it with nir_def. Most of this commit
was generated by sed:
sed -i -e 's/dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
A few manual fixups were required in lima and the nir_legacy code.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
Instead, we replace it directly with nir_def. We could replace it with
nir_dest but the next commit gets rid of that so this avoids unnecessary
churn. Most of this commit was generated by sed:
sed -i -e 's/dest.dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
There were a few manual fixups required in the nir_legacy.c and
nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a
similar pattern.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
We could add a nir_def_bit_size() helper but we use ->bit_size about 3x
as often as nir_dest_bit_size() today so that's a major Coccinelle
refactor anyway and this doesn't make it much worse. Most of this
commit was generated byt the following semantic patch:
@@
expression D;
@@
<...
-nir_dest_bit_size(D)
+D.ssa.bit_size
...
Some manual fixup was needed, especially in cpp files where Coccinelle
tends to give up the moment it sees any interesting C++.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
There's plenty of places we can use these new and shiny helpers, so
let's clean up the code a bit.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23460>
We'd like to postpone most int64 lowering until pretty late in the
process, because e.g. turning iadd@64 into (unpack + add-low + add-high
+ compare + b2i32 + repack) sequences makes it difficult for many
optimization passes to detect basic arithmetic patterns. In particular,
nir_opt_load_store_vectorizer becomes unable to handle basic offset math
on 64-bit addresses.
We'd like to do double precision lowering earlier in the process,
however. One snag is that nir_lower_int64's lower_2f and lower_f2 can
produce operations that may need lowering by nir_lower_doubles(), so
it's crucial to run those sets of lowering together.
To handle this, we make a new entrypoint that does nir_lower_int64
but skips everything except float conversions. Note that the newly
produced instructions will still be lowered according to the full set
of int64 lowering options; this shouldn't be a huge deal.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23064>
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:
@@
expression A, B, C, D, E;
@@
-nir_ssa_dest_init(A, B, C, D, E);
+nir_ssa_dest_init(A, B, C, D);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
Starting with !19748 lowered 64 bit shifts were showing wrong results for
shifts with insignificant bits set.
nir shifts are defined to only look at the least significant bits. The
lowering has take this into account.
So there are two things going on:
1. the `ieq` and `uge` further down depend on `y` being masked.
2. the calculation of `reverse_count` actually depends on a masked `y` as
well, due to the `(iabs (iadd y -32))` giving a different result for
shifts > 31;
Fixes: 41f3e9e5f5 ("nir: Implement lowering of 64-bit shift operations")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19995>
Currently float16 to int64 conversions don't work correctly, because
the "div" variable has an infinite value, since 2^32 isn't
representable as a 16-bit float, which causes the result of of rem(x,
div) to be NaN for all inputs, leading to an incorrect result. Since
no values of magnitude greater than 2^32 are representable as a
float16 we don't actually need to do the fdiv/frem operations, the
conversion is equivalent to f2u32 with the result padded to 64 bits.
Rework:
* Jordan: Handle f16 in if/else rather than conditional
Fixes: 936c58c8fc ("nir: Extend nir_lower_int64() to support i2f/f2i lowering")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19391>
This involves computing the significand with a 64-bit precision type,
and implementing the normalization and packing manually instead of
relying on u2f32, since the significand can no longer be represented
as a 32-bit integer. This fixes 64-bit integer to 64-bit float
conversions on devices that support 64-bit float natively but lack
64-bit integer support, like Intel MTL hardware.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19128>
The existing code for this appears to work okay for conversions
involving 64-bit floats, relax the assert and enable the lowering
path. This fixes 64-bit float to 64-bit integer integer conversions
on devices that have native support for 64-bit floats but lack 64-bit
integer support, like Intel MTL hardware.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19128>
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
These are all pretty trivial because we can just split the op into one
subgroup op per half of the value. There's some question as to whether
these belong in lower_int64 or lower_subgroups but, on Intel, they key
decider of whether or not we need the lowering is based on whether or
not we have hardware int64 support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either
lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>
I was doing math manually in a lowering pass for converting a division to
a ushr, and this will let the pass be expressed more naturally.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6378>
The round-to-nearest-even implementation found in lower_2f() is incorrect
for any value having a significand that is not directly representable
and whose non-representable part lies between 1 and half the minimum
representable value. In this case, the significand is rounded up instead
of being rounded down.
Fixes: 936c58c8fc ("nir: Extend nir_lower_int64() to support i2f/f2i lowering")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6290>
That's an attempt at replacing the complex __int64_to_float() and
__float_to_int64() implementations found in float64.glsl by a simpler
native NIR equivalent.
Thanks to that, we can have lower those conversion without having to
compile a GLSL shader, which would be quite annoying for OpenCL
kernels.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
This information is exposed through shader->options->lower_int64_options.
Removing the extra arg forces drivers to initialize this field correctly.
This also allows us to check the int64 lowering options from each int64
lowering helper and decide if we should lower the instructions we
introduce.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
Fixes an issue with Renderdoc's shader debugging with ACO.
If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6320e37d4b ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709>
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.i64.*
with ACO because this backend compiler expects most of the 64-bit
operations to be lowered.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
We have the code to do the lowering, we were just missing the
boilerplate bits to make should_lower_int64_alu_instr return true.
Fixes: 62d55f1281 "nir: Wire up int64 lowering functions"
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
This adds the option to lower 64-bit ufind_msb opcodes.
v2: use split_x/y removes component loops (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
One advantage of this is that we no longer need to run in a loop because
the new framework handles lowering instructions added by lowering.
Reviewed-by: Eric Anholt <eric@anholt.net>
We need this when doing full software 64-bit emulation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>