Previously we determine wave_size of merged shader stages separately,
and ignore the condition which may cause them to be different.
Now we determine the wave_size of the TCS/GS part first, then use the
wave_size for VS/TES part. So that we can condider the previous shader
stage's information when determine the wave_size of TCS/GS, and two
stages in the merged shader can affect each other's wave_size.
This requires si_shader_selector to have two kinds of main part for
wave32 and wave64 when part mode, to be combined with other shader
part with various wave size.
This also enables merged shader stages with different
si_shader_info->has_divergent_loop to use wave32. We'll add another
condition for KHR_shader_subgroup latter.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
ClustedAnd with bool argument and cluster_size==4 will be lowered
to quad_vote_all. So does ALU nir_iand/ior op with bool src.
OpenGL and Vulkan subgroup clustered_and tests with bool argument
fail when using LLVM. It seems LLVM has bug when quad vote bool
is in complex control flow. So stop using it for now.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
The long names were originally meant to map to the HW encoding but
nowadays the actual encoding values depend on gfx version, whether
instruction is 3src, etc.
Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>
Create specific helper for register file encoding and handle it there.
Use ad-hoc structs to let the macro take optional named arguments.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>
In these cases there's a clear bound we can use. In C++ this is a
compiler extension and not compatible with zero initializing a
regular struct -- which will happen in a later change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>
We were actually using PIPE_FORMAT_B5G6R5_UNORM for HAL_PIXEL_FORMAT_RGB_565
since Android support was added to Mesa. This restores the original behavior.
Fixes: 273e54391a ("egl/android: Remove hard-coded color-channel data")
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30827>
We can't exceed c_int::MAX, because the CTS casts to ints in a few places.
We also need to take into account max pixel size when restricting to
max_mem_alloc as this cap is pixel based, not byte based.
Cc: mesa-stable
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30739>
Prints the shader to a string and assigns source locations based on
that.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18903>
Adds a new instruction type that stores metadata that might be useful
for debugging purposes. Passes must ignore these instructions when
making decisions.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18903>
vkcube, vkgears and vkmark are crashing with the following error/segfault:
$ NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1 vkcube
WARNING: NVK is not a conformant Vulkan implementation, testing use only.
Selected GPU 0: GeForce GT 640 (NVK GK107), type: DiscreteGpu
ERROR: couldn't get DataFile for op ldc_nv
Segmentation fault (core dumped)
Handling nir_intrinsic_ldc_nv as per nir_intrinsic_load_ubo in Converter::getFile()
allows to run vkcube, vkgears and vkmark on Nvidia GT640
Fixes: dc99d9b2 ("nvk,nak: Switch to nir_intrinsic_ldc_nv")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30832>
LLVM optimizes umul_hi with a constant to v_mul_hi_i32_i24_e32 which isn't
always what we need here. This causes miscalculations. To prevent LLVM to
apply this optimization, we insert a optimization barrier.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11761
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30810>
Previously, we trusted the caller to set the size to zero for null
descriptors. However, device-generated commands allows specifying a
zero device address with a non-zero size. We may as well handle this
case directly in the MME and keep our command emit shader simple.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30826>