Currently float16 to int64 conversions don't work correctly, because
the "div" variable has an infinite value, since 2^32 isn't
representable as a 16-bit float, which causes the result of of rem(x,
div) to be NaN for all inputs, leading to an incorrect result. Since
no values of magnitude greater than 2^32 are representable as a
float16 we don't actually need to do the fdiv/frem operations, the
conversion is equivalent to f2u32 with the result padded to 64 bits.
Rework:
* Jordan: Handle f16 in if/else rather than conditional
Fixes: 936c58c8fc ("nir: Extend nir_lower_int64() to support i2f/f2i lowering")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19391>
RGBX only loads and resolves 3 components, etc.
v2: buf fixes to make AMD_TEST=computeblit pass
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19477>
si_determine_wave_size always returned 32 because shader->info was
uninitialized. Do it after it's initialized.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19477>
This is happened when multi-threading access to util_get_process_name
memory leak point:
Direct leak of 4097 byte(s) in 1 object(s) allocated from:
#0 0x7f42888c0e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7f4288859d18 in __interceptor_realpath ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3608
#2 0x55a9c272e03d in __getProgramName ../src/util/u_process.c:75
#3 0x55a9c272e03d in util_get_process_name ../src/util/u_process.c:197
#4 0x55a9c2746da7 in util_queue_init ../src/util/u_queue.c:416
#5 0x55a9c272c233 in queue_init ../src/util/perf/u_trace.c:403
#6 0x55a9c272c233 in u_trace_context_init ../src/util/perf/u_trace.c:453
#7 0x55a9c262eb54 in test_thread ../src/util/tests/perf/u_trace_test.cpp:14
#8 0x55a9c275228b in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18764>
Even though initialized = true can make sure have no recursion, but that's may leading to
debug_get_option_should_print return false at the second thread, but the first thread
return true. These two threads should return the same value, even though this function is for
debug only, but it's better to getting it to be correct.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18764>
`pred` is a pointer, for sufficiently large numbers these
being cast to int were both > 0 regardless of the order
of `data1` and `data2`.
Fixes: 523a28d3fe ("nir: add an instruction set API")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19539>
GDS is used for NGG queries/streamout (GFX10+ only) and the BOs were
only added to the graphics queue because compute doesn't need them.
Though, the kernel emits a GDS switch when a queue submission doesn't
use GDS. That means that submitting jobs on the compute queue without
GDS can reset the state of the graphics queue and lead to GPU hangs.
The only viable solution for now is to make the GDS BOs resident to
avoid resetting the state between queues. This shouldn't introduce
more syncs between queues because GDS BOs are similar for both.
This fixes a GPU hang with Warhammer Chaosbane during loading time and
possibly some spurious random GPU hangs. Note that this GPU hang was
workarounded on the Steam side with RADV_DEBUG=nongg.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19466>
Now the glapi/glapi_dispatch.c are cleaned up because of this
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19509>
plane order is expected when trying to render yuv surfaces, update it for yuv444p
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19445>
decode of yuv444 yuv400 and yuv422 is supported on JPEG ip version 2.5.0 and 2.6.0.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19445>