This implements IEEE-754-2019 signed zero semantics for fmin/fmax, as now
required by NIR, for hardware that has busted signed zero behaviour for
fmin/fmax. Ian expressed interest in this for Intel.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
Ensure the following identities hold to match IEEE-754-2019 and upcoming NIR:
min(-0, +0) = -0
min(+0, -0) = -0
max(-0, +0) = +0
max(+0, -0) = +0
NVK uses this lowering. In a simple compute shader using fmin64 on an SSBO with
signed zero preserve required, testing the effect of this patch, the instruction
count goes from 47->52. Obviously I'm not thrilled by that but I also couldn't
find any obvious way of mitigating the issue. (Maybe NVIDIA has special hardware
support here. By instruction count, lowering all the way to int64 is a loss,
though I don't know how to count cycles on NVIDIA.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
Ensure the following identities hold to match IEEE-754-2019 and upcoming NIR:
min(-0, +0) = -0
min(+0, -0) = -0
max(-0, +0) = +0
max(+0, -0) = +0
To implement, we specialize a version of flt64_nonnan. The regular flt64 has
extra logic to handle signed zero, so this version is actually simpler. So in
addition to the bug fix, this is an optimization. Compute shaders from
KHR-GL46.gpu_shader_fp64.builtin.max_dvec4 before and after:
before: 136 inst, 122 alu, 122 fscib, 4 ic, 1006 bytes, 39 regs, 28 uniforms
after: 104 inst, 90 alu, 90 fscib, 4 ic, 766 bytes, 39 regs, 28 uniforms
I will happy take a 24% reduction in instruction count as the cost of standards
conformance ^_^
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
SPIR-V strengthened the semantics around signed zero, requiring fmin(-0, +0) =
-0. Since nir_op_fmin is commutative, we must also require fmin(+0, -0) = -0 to
match, although it's unclear if SPIR-V requires that. We must strengthen NIR's
definitions accordingly.
This strengthening is additionally motivated by the existing nir_opt_algebraic
rule like:
(('fmin', a, ('fneg', a)), ('fneg', ('fabs', a))),
With the strengthened new definition, this transform is clearly exact. With the
weaker definition, the transform could change the sign of zero based on
implementation-defined behaviours which ... while, not exactly unsound, is
undesireable semantically.
...
This is probably technically a bug fix, but I'm not convinced it's worth it's
weight in backporting.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
This is more idiomatic and already #include'd.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
These are helpful now that float_controls2 exists, these are common
patterns worth factoring out into helpers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
like fui/uif but for fp64. will be used for NIR constant folding.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
We need to save the last COMPUTE_TMPRING_SIZE value in si_context because
it's no longer computed when compute scratch isn't used.
Fixes: 3b0bfd254f - radeonsi/gfx11: make flat_scratch changes for compute
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30071>
It removes testing cases that we don't need, and adds testing cases that we
need, such as alignment and the default codepath. The result is much less code.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30173>
as proposed by Amol Surati, this should ensure that the params are always updated
Fixes: 5eb0136a3c ("mesa/st: when creating draw shader variants, use the base nir and skip driver opts")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30126>
These events were silently truncated in get_counters_for_event.
The integer types in this pass are a bit all over the place, maybe we should
consider using typedefs for clarity or a different solution with type safety.
Fixes: 9e9cabd2fa ("aco/waitcnt: support GFX12 in waitcnt pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30163>
KGSL writes the profiling values asynchronously while we read them
immediately after the IOCTL returns which can result in the struct
not being filled in by the time we read it, this results in AGI not
correctly processing any timestamps from larger submits which take
longer to queue. To fix this, we now busy-wait on until the value
has been written out by KGSL.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30147>
This fixes a leak that happens when copying a image using blit, and the
image is a compressed one.
In this case a new image view is created that can be re-interpreted to
perform the copy, but was not properly free.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30161>