Fix OpenCL-CTS error in `math_brute_force/test_bruteforce -w ldexp`
Valhall LDEXP.v2f16 takes a 16-bit exponent, while NIR ldexp uses a
32-bit exponent. Truncating large exponents can flip overflow into
underflow or leave huge 16-bit exponents to hardware behavior that does
not match OpenCL's expected signed infinity/zero results.
Clamp the exponent to a range sufficient to overflow or underflow all
fp16 values before lowering to ldexp16_pan.
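A minimal sketch of why clamping before narrowing matters, written as plain C rather than NIR. The helper names and the bound of 64 are illustrative assumptions, not values taken from the patch; since fp16 magnitudes span roughly 2^-24 to 2^16, any bound comfortably above 40 should saturate every finite nonzero fp16 input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound, chosen for illustration only. */
#define LDEXP16_EXP_LIMIT 64

/* Models keeping only the low 16 bits of a 32-bit exponent and
 * reinterpreting them as a signed 16-bit value. */
static int32_t
truncate_exp_to_16(int32_t exp)
{
   uint32_t low = (uint32_t)exp & 0xffffu;
   return low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Clamp first, so the sign and the saturating magnitude survive narrowing. */
static int32_t
clamp_exp_for_fp16(int32_t exp)
{
   assert(LDEXP16_EXP_LIMIT <= INT16_MAX);
   if (exp > LDEXP16_EXP_LIMIT)
      return LDEXP16_EXP_LIMIT;
   if (exp < -LDEXP16_EXP_LIMIT)
      return -LDEXP16_EXP_LIMIT;
   return exp;
}
```

In this model an exponent of 32768 truncates to -32768 (overflow turns into underflow) and 65536 truncates to 0, whereas clamping yields 64 in both cases.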
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41234>
This property is unrelated to the CTS conformance process from Khronos;
it just means that the driver passes that CTS version, even if it is not
"officially" conformant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41258>
Even if they are in the same block, we might still need to move the
source instructions if they are otherwise after our insert location.
This can happen in the case where we insert strict_wqm_coord before
terminate_if.
Fixes: ac33f82d54 ("ac/nir/lower_tex_coords: move input loads instead of cloning them")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41336>
Note: the atomic host-mem-barrier tests assume that the atomic
buffer can be shared, which is not how r600 operates.
This change was tested on palm and cayman. With the exception
of the "atomic counter" tests, it fixes all the other cases:
spec/arb_shader_image_load_store/host-mem-barrier/.*: fail pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41199>
LRZ and FDM have a few major performance pitfalls. If they are not
clearly surfaced in a perfetto trace, they are easy to miss.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
We rely on tu_lrz_flush_valid_at_suspending_rp_boundary() to make sure
that subsequent resuming renderpasses get the correct LRZ state. However
this doesn't work on early a6xx GPUs without tracking support. Disable
LRZ in this case, similar to secondaries.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40935>
This matches the behavior of radv for these two.
Fixes:
dEQP-VK.binding_model.descriptor_buffer.traditional_buffer.capture_replay.sparse_buffer_descriptor_data_consistency
dEQP-VK.binding_model.descriptor_buffer.traditional_buffer.capture_replay.sparse_buffer_descriptor_data_consistency_and_usage
Fixes: 8feed47fce ("tu: Initial support for sparse binding")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38148>
So that dma-buf-imported EGLImages on big-endian hosts resolve to a
sized GL internal format in st_bind_egl_image() instead of falling
back to unsized GL_RGBA/GL_RGB.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
So that dri2_get_mapping_by_fourcc() resolves the byte-reversed fourccs
(DRM_FORMAT_BGRA/BGRX/RGBA/RGBX8888) used for the native 8888 visual
on big-endian hosts.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
This workflow has been discussed a lot with the team for the past
few years. Let's just clarify it for real in the documentation.
Co-written-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41239>
OpenGL 3.1 is a transitional version in the progression of dropping
legacy features. It does not feature a "Compatibility Profile"; instead,
only the GL_ARB_compatibility extension is defined for it.
Programs that query GL_CONTEXT_PROFILE_MASK at runtime and take the
compatibility codepath when this query doesn't exist or when it
returns GL_CONTEXT_COMPATIBILITY_PROFILE_BIT will work on OpenGL
implementations with a version < 3.1 or a version > 3.1, but not on
implementations targeting OpenGL 3.1 and lacking GL_ARB_compatibility.
As most programmers now have hardware and drivers targeting versions >
3.1 installed, such an error is hard to catch.
So try our best to enable GL_ARB_compatibility on drivers exposing
exactly OpenGL 3.1 to satisfy such programs. It's still possible to use
MESA_GL_VERSION_OVERRIDE=3.1FC to acquire a context without
GL_ARB_compatibility on such drivers.
Fixes the overview functionality of kwin_wayland on panfrost with
Mali-G57 (which exposes OpenGL 3.1 on current Mesa), although the
problematic profile detection code is in Qt instead of KWin.
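The detection pattern described above can be modelled without a GL context. This is a sketch under stated assumptions: `struct ctx_info` and the function names are hypothetical, the profile query is assumed to exist from OpenGL 3.2 onward, and this is not the actual Qt code.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as GL_CONTEXT_COMPATIBILITY_PROFILE_BIT in the GL headers. */
#define COMPAT_PROFILE_BIT 0x2u

/* Hypothetical snapshot of what a context reports. */
struct ctx_info {
   int major, minor;
   unsigned profile_mask;      /* only meaningful for GL >= 3.2 */
   bool has_arb_compatibility; /* only meaningful for GL 3.1 */
};

static bool
has_profile_query(const struct ctx_info *c)
{
   return c->major > 3 || (c->major == 3 && c->minor >= 2);
}

/* The application-side pattern: no query available, or compat bit set. */
static bool
app_takes_compat_path(const struct ctx_info *c)
{
   assert(c);
   return !has_profile_query(c) || (c->profile_mask & COMPAT_PROFILE_BIT);
}

/* Whether legacy features are actually present on the context. */
static bool
legacy_features_available(const struct ctx_info *c)
{
   assert(c);
   if (has_profile_query(c))
      return (c->profile_mask & COMPAT_PROFILE_BIT) != 0;
   if (c->major == 3 && c->minor == 1)
      return c->has_arb_compatibility;
   return true; /* GL <= 3.0 still has the legacy features */
}
```

The two functions disagree only for a 3.1 context without GL_ARB_compatibility, which is the case this change tries to avoid.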
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41298>
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.
v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
Instead of lowering frag_coord 4 times during compilation,
just use this.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
to strengthen and simplify pixel_coord lowering
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
GPU capture misbehaves if heap sizes are not aligned to at least 16K. Ensuring
that they are is not expected to impact memory usage, since it seems the actual
internal memory allocation is already aligned to 16K; the issue is only with
how the heap reports its size versus the allocation size that capture uses.
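For illustration, rounding a requested heap size up to a 16K multiple can be sketched as follows. The macro and function names are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define HEAP_SIZE_ALIGNMENT (16u * 1024u) /* 16K, as described above */

/* Round size up to the next multiple of HEAP_SIZE_ALIGNMENT. */
static uint64_t
align_heap_size(uint64_t size)
{
   const uint64_t a = HEAP_SIZE_ALIGNMENT;
   assert((a & (a - 1)) == 0); /* the mask trick needs a power of two */
   return (size + a - 1) & ~(a - 1);
}
```

Sizes that are already a multiple of 16K are returned unchanged, so the reported size then matches an allocation that is 16K-aligned internally.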
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41218>
samplers can be destroyed whenever, which makes it problematic to store
the pointers into descriptor layouts for embedded samplers. instead,
directly store the descriptor info into the layout, since this is all
constant data which is unaffected by object lifetimes
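A minimal sketch of the by-value approach, assuming a fixed-size descriptor. The struct names and the descriptor size are hypothetical, not the driver's real types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SAMPLER_DESC_WORDS 4 /* hypothetical descriptor size */

struct sampler {
   uint32_t desc[SAMPLER_DESC_WORDS];
};

struct layout_binding {
   /* A copy of the descriptor data, not a pointer to the sampler. */
   uint32_t embedded_desc[SAMPLER_DESC_WORDS];
};

static void
layout_binding_set_embedded(struct layout_binding *b, const struct sampler *s)
{
   assert(b && s);
   memcpy(b->embedded_desc, s->desc, sizeof(b->embedded_desc));
}
```

Once the copy is made, destroying or reusing the sampler object no longer affects the layout.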
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
in a sequence like:
* CmdPushConstants
* CmdBindPipeline (doesn't use push constants)
* CmdDispatch
* CmdBindPipeline (uses push constants)
* CmdDispatch
the previous code would never update pushconsts and the second dispatch
would have no valid data
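The sequence above can be reproduced with a toy state tracker. This is a sketch only: the struct, the function names and the failure mode (the dirty bit being dropped by a dispatch that performs no upload) are assumptions for illustration, since the previous code is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cmd_state {
   uint32_t push_data;      /* last value from CmdPushConstants */
   bool push_dirty;         /* push_data has not been uploaded yet */
   bool pipeline_uses_push; /* property of the bound pipeline */
   uint32_t uploaded;       /* what the shader would read */
   bool uploaded_valid;
};

static void
cmd_push_constants(struct cmd_state *s, uint32_t value)
{
   s->push_data = value;
   s->push_dirty = true;
}

static void
cmd_bind_pipeline(struct cmd_state *s, bool uses_push)
{
   s->pipeline_uses_push = uses_push;
}

/* Returns true if the dispatch had the push constant data it needed. */
static bool
cmd_dispatch(struct cmd_state *s, bool clear_dirty_unconditionally)
{
   assert(s);
   if (s->push_dirty && s->pipeline_uses_push) {
      s->uploaded = s->push_data;
      s->uploaded_valid = true;
      s->push_dirty = false;
   } else if (clear_dirty_unconditionally) {
      /* Assumed failure mode: dirty bit dropped without an upload. */
      s->push_dirty = false;
   }
   return !s->pipeline_uses_push || s->uploaded_valid;
}
```

Keeping the dirty bit set until a pipeline that actually consumes push constants is dispatched lets the second dispatch in the sequence see the data.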
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
We should have never been doing this at bind time. Instead, layout
transitions out of UNDEFINED are in the spec specifically so the
driver has a point where it can do initialization, so do our init there
instead.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>