These opcodes where fixed to return an integer instead of a boolean
value some time ago but the documentation for them was not updated
and still talked about a boolean result.
Fixes: b0d4ee520 ('nir/opcodes: Fix up uadd_carry and usub_borrow')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
While we're moving it, reformat a bit to make it match util_sign_extend
better.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17214>
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
nir_ine_imm(b, nir_iand_imm(b, x, mask), 0) and
nir_i2b(b, nir_iand_imm(b, x, mask)) are common
patterns which become quite messy when they are
part of a larger expression. Clang-format does
not improve things either and we can end up with
some rather interesting looking code.
(RADV ray tracing pipeline and query lowering)
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
Use util_sign_extend() to silence the following integer-overflow
error.
src/compiler/nir/nir_serialize.c:1333:40: runtime error: left shift of 1000165000 by 13 places cannot be represented in type 'int'
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17186>
Extend the packed_instr struct to support texops above
nir_texop_fragment_fetch_amd.
Fixes: 603e6ba972 ("nir: add two new texture ops for multisample fragment color/mask fetches")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17186>
packed_instr::tex::dest must be last to match the packed_instr::any::dest
position.
Fixes: 35655865cb ("nir/serialize: pack instructions better")
Cc: stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17212>
Also add radv and radeonsi implementation. Will be used in tess lowering.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
This really, really helps on platforms where fabs() isn't free. A great
many shaders use a * frsq(fabs(fdot(a, a))) to normalize a vector.
Since the result of the fdot must be non-negative, the fabs can be
eliminated by an existing algebraic rule.
shader-db results:
r300 (run on R420 - X800XL)
total instructions in shared programs: 1369807 -> 1368550 (-0.09%)
instructions in affected programs: 59986 -> 58729 (-2.10%)
helped: 609
HURT: 0
total vinst in shared programs: 512899 -> 512861 (<.01%)
vinst in affected programs: 1522 -> 1484 (-2.50%)
helped: 36
HURT: 0
total sinst in shared programs: 260690 -> 260570 (-0.05%)
sinst in affected programs: 1419 -> 1299 (-8.46%)
helped: 120
HURT: 0
total consts in shared programs: 957295 -> 957230 (<.01%)
consts in affected programs: 849 -> 784 (-7.66%)
helped: 65
HURT: 0
LOST: 0
GAINED: 3
The 3 gained shaders are all vertex shaders from XCom: Enemy Unknown.
I'm guessing that game is never going to run on my X800XL. :)
i915
total instructions in shared programs: 791121 -> 780843 (-1.30%)
instructions in affected programs: 220170 -> 209892 (-4.67%)
helped: 2085
HURT: 0
total temps in shared programs: 47765 -> 47766 (<.01%)
temps in affected programs: 9 -> 10 (11.11%)
helped: 0
HURT: 1
total const in shared programs: 93048 -> 92983 (-0.07%)
const in affected programs: 784 -> 719 (-8.29%)
helped: 65
HURT: 0
LOST: 0
GAINED: 36
Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16702250 -> 16697908 (-0.03%)
instructions in affected programs: 119277 -> 114935 (-3.64%)
helped: 1065
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 4.08 x̃: 4
helped stats (rel) min: 0.48% max: 10.17% x̄: 3.66% x̃: 3.94%
95% mean confidence interval for instructions value: -4.26 -3.89
95% mean confidence interval for instructions %-change: -3.76% -3.56%
Instructions are helped.
total cycles in shared programs: 880772068 -> 880734134 (<.01%)
cycles in affected programs: 2134456 -> 2096522 (-1.78%)
helped: 941
HURT: 324
helped stats (abs) min: 2 max: 2180 x̄: 123.06 x̃: 44
helped stats (rel) min: 0.04% max: 49.96% x̄: 7.08% x̃: 3.81%
HURT stats (abs) min: 2 max: 2098 x̄: 240.33 x̃: 35
HURT stats (rel) min: 0.04% max: 77.07% x̄: 12.34% x̃: 3.00%
95% mean confidence interval for cycles value: -47.93 -12.04
95% mean confidence interval for cycles %-change: -2.87% -1.34%
Cycles are helped.
No shader-db changes on any other Intel platform.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17181>
There are several places that should have supported the various sized
versions of bcsel and the various nir_op_[fi]csel_* opcodes. Rather
than enumerate the whole list, add a property.
v2: Make the comment for NIR_OP_IS_SELECTION more descriptive.
Suggested by Jason.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
For example, the proof for this pattern
(('bcsel', ('flt', 'a@32', 0), 'b@32', 'c@32'), ('fcsel_ge', a, c, b)),
would be
bcsel(a < 0, b, c)
bcsel(!(a < 0), c, b)
bcsel(a >= 0, c, b)
fcsel_ge(a, c, b)
However, !(a < 0) => (a >= 0) is well known to produce different
results if `a` is NaN.
Instead of that replacement, use this replacement:
bcsel(a < 0, b, c)
bcsel(-0 < -a, b, c)
bcsel(0 < -a, b, c)
fcsel_gt(-a, b, c)
This is NaN-safe and exact.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: 0f5b3c37c5 ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
point size min/max values are provided through the state vars, so ensure
these are always applied in order to respect ARB_point_parameters
cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
This extension adds built-in variables gl_LastFragDepthARM and gl_LastFragStencilARM
which can be implemented almost the same as gl_LastFragData from color fetch extension.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
nir_to_dxil() uses those types to pick the right operation overload,
and atomic loads/stores are no different from their non-atomic
counterpart apart from the atomicity property, so it makes sense to
pass a type to the deref_{load,store} intrinsic in that case too.
Suggested-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16926>
While MSVC doesn't do __STDC_VERSION__ correctly for C99, it does for
C11, which is what we now require. So we can remove this hack.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16908>
Needed to be able to call nir_opt_gcm on the v3dv driver. This change
is needed as on v3dv we honor vulkan resource index returning a vec2.
See commit 21b0a4c80c for more info.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16986>
Currently uint64_t_to_bitmask(..) is used in combination with
the pattern 'match'. This only works for values smaller then
64 bit. Add support for bigger isa sizes.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16996>
If the type is not an array, glsl_get_length() returns 0 and we don't
update the new_vars[]/flat_vars[] entries.
Fixes: bcd14756ee ("nir/lower_io_to_vector: add flat mode")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16960>
On iris and crocus, this flag is used to set "alt mode" math on the shader
as a whole. Some other drivers have a similar mode for DX9/ARB-program
behavior, so document what it does so we can start using it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16176>
With the following structures :
struct StructA
{
uint64_t value0;
uint8_t value1;
};
struct TopStruct
{
struct StructA a;
uint8_t value3;
};
Currently offsetof(struct TopStruct, value3) = 9. While the same code
on the CPU gives offsetof(struct TopStruct, value3) = 16.
This is impacting OpenCL kernels we're trying to use to build
acceleration structures.
v2: Add comment/link to some description of the alignment/size
computation
Cc: mesa-stable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16940>
Perform address calculation in 32 bits when
dealing with inbounds array derefs.
Closes: #6562
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16729>
Preserving information about inbounds access and
the required bit size for the bounds will help
with avoiding 64-bit operations when lowering io.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16729>
The function was previously a helper for when some drivers still
called the GLSL IR optimisations in a loop. No drivers do that
anymore.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16924>
Since we have now switched all drivers to using NIR and therefore
the NIR based uniform linker this param never needs to be set to
true so remove it.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16924>
Instead of just checking for the variables to match, check that the
entire deref up to the interface type matches.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Instead of having a bunch of mode checks as special cases, assert that
the modes equal and then switch on the mode. This should make the
special cases a bit easier to understand. Handling of `a_var == b_var`
looks redundant now but it won't be in the next patch.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
This will let us use it to compare only the first part of a pair of
deref paths and continue the comparison later.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Instead of incrementing pointers, use an integer index. This makes it
clear that we always increment them together. It'll also make the next
change a bit easier. We use a pointer to an integer because the next
patch is going to let us abort the walk and we want to be able to
continue where we left off.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>