It is legal to have a cluster size larger than the subgroup/ballot size,
but our lowering would blow up in this case due to the nir_ishl_imm
overflowing in the lowering. Fortunately, this is easy to handle.
Fixes sub_group_clustered_reduce_logical_and()
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40224>
Tessellation evaluation shaders have a single convergent URB handle
(for the common patch data) used by all lanes. Every other stage's
IO handles have separate handles in each lane.
Thanks to Alyssa Rosenzweig for catching this bug.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280>
More fallout from f2a59fdea6.
is_not_zero now always returns whether the result is a floating point zero.
When combined with the fp denorm handling that will be added to
floating point range analysis, this is false for many sensible integer values.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>
This is not NaN correct.
And also make the pattern 32bit only because the constant is hard coded
FLT_MAX.
Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>
The way it is, this optimization is too aggressive and may generate
code way worse than the original. Remove it from here, so drivers
consuming the generated SPIR-V will be able to make their own
more-informed decisions later.
Let's follow the same strategy of nir_load_liblc.c and just set the
limit to 0.
For indirect copies in Anv (not merged yet), block compressed formats
require some expensive divisions, so I put them all inside 'if'
statements that should never run on normal formats. This optimization
made us always run all the divisions all the time, tanking the
performance of the shader on small copies.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40020>
Use INT_MIN instead of INT_MAX for underflow.
Fixes: cc4b50b023 ("nir/opcodes: use u_overflow to fix incorrect checks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252>
With most Vulkan engines doing multithreaded compiles, NIR_DEBUG=print has
been a frustrating racy mess. Take a lock when we're doing per-pass
printing, so that the output is coherent. This unfortunately
single-threads the compiler process itself in that case, but when you're
NIR_DEBUG=printing, that's probably not a big deal.
An assert is introduced to make sure that nobody nests NIR_PASS() in a way
that would break printing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40126>
This address is delivered on Gfx12.5+ in compute/mesh/task shaders
from the command stream instruction.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
Will be useful to retain the base offset added in 0e9453291c ("brw:
improve push constant loading using base offsets") once we move push
constant data loading into NIR.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
This seems to be faster.
ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
-592900 +/- 2302.24
-27.6432% +/- 0.0998961%
(Student's t, pooled s = 2719.05)
ministat (overall):
Difference at 95.0% confidence
-76.8333 +/- 27.2345
-0.632558% +/- 0.223407%
(Student's t, pooled s = 46.867)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Change the TYPE_FLOAT display format from %f to %g with sufficient
significant digits (%.5g for f16, %.9g for f32), so that float
immediates round-trip correctly through disassembly and assembly.
The %f format loses precision for small values: f16 0x0001 (denormal
~5.96e-8) displays as 0.000000, which parses back as 0x0000. The %g
format uses the minimum significant digits per IEEE 754 and strips
trailing zeros, using scientific notation when needed. Whole-number
values use %.1f to keep them unambiguously float (e.g. "1.0").
Update the etnaviv PEST grammar and the freedreno ir3 lexer/parser to
accept the new output formats (scientific notation, stripped zeros).
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40207>
They were already being handled explicitly in vtn_alu, so just handle
them directly for spec constants too -- that has to do special work
for conversions anyway. Remove the bit-size parameters from the function.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
There was an assumption that if the instruction had non-native float
as a source, the first source would have such type. This doesn't
hold for Select, and the code failed in two ways
- The boolean source of Select was being converted to the non-native
float type.
- The loop that resolves the bit-size for unsized operands would
trip at `assert(i == 0)` because Select has more than one source.
Re-organize the code to track the types of the sources independently,
and fix both issues above.
Fixes: 90e1b12890 ("spirv: Add bfloat16 support to SpecConstantOp")
Fixes: 51d3c4c889 ("spirv: support float8 spec constant op")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
Only used by Convert operations, so just pass 0 from callers that
are not Convert and clarify that in the code.
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
fp16 has quite the limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.
This change should have no effect for non saturating conversions.
Fixes "conversions long_sat_*half" CL CTS tests
Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
allowed precision mismatches on uniforms, however if you lower precision on
16-bit consts, then this error triggers instead.
So here we relax the type matching and just make sure we match int vs
float.
Fixes: 0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5337
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40107>