If each SIMD has to get an different number of waves, report the maximum.
One example of a situation is when a single-wave workgroup uses more than
max_lds_per_simd. This change causes radv_get_max_waves() to report a
single wave per SIMD instead of none.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
This fixes a issue in radv_get_max_waves() where it aligned the LDS
allocation to 512 bytes instead of 1024 on GFX10.3.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
Always return the wave32 waves instead of wave64 waves because the wave32
wave count is more precise in the case of wave32.
This also fixes usage of lds_per_wave in wave32.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
num_simd_per_compute_unit was the number of SIMDs per compute unit, but
lds_size_per_workgroup was the bytes of LDS per WGP.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
This new version contains several fixes.
v2:
- Bump lava kernel+rootfs tag (Eric)
v3:
- Drop {x86,arm}_test-base tag bumps (Michel)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8990>
Fixes crash in ppgtt_lookup when the same bo is used twice
with different offsets.
It's possible to hit this with i965 and always_flush_batch=true.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9008>
This function is used as a callback and the other instance
of this callback doesn't add its own new line.
Messages printed by this function already end with a new line.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8988>
Aubinator creates lots of 4k mappings, so for large traces it's
possible to hit system limit on the number of mappings created
by a single process.
Ideally, aubinator should merge those mappings, but that's tricky
and I'm not sure it's worth spending time on.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8988>
Most of it is API-independent, so let's move it out of the gallium
driver so it can be shared with the Vulkan driver.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
We are about to add a pan_blend.h in src/panfrost/lib. Rename the
existing pan_blend.h so we can include both.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
This implies late preparation of the fragment shader RSD, but given the
simplicity of pan_shader_prepare_rsd() (it's basically a 1:1 translation
between shader info and the RSD fields), it's unlikely to make a
difference. If we really want to optimize the time spent preparing the
RSD, we should consider caching a packed version at the batch level and
re-using it when nothing changed.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
While at it, rework the code to avoid copies between intermediate
structures: the pan_shader_info is passed to the compiler context so
the compiler can fill shader information directly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
So we can re-use the panfrost_sysvals definition outside of the
compiler without dragging the sysval_to_id hash table.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
Move panfrost_compile_shader() and panfrost_get_shader_options() to
pan_shader.c and drop the {bifrost,midgard}_compile.h include so backend
compiler internals are not directly exposed to the gallium driver.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
We should use Bifrost NIR options when compiling for Bifrost.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8963>
Previously we were calculating the inverse swizzle instead and doing something
horrible to get 0/1 right, and then "fixing" our table.
Let's do it right an align with the mesa-wide table.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8797>
Since some extensions never got their dedicated feature structs, not all
extensions promoted to core Vulkan have the relevant feature bits in
`VkPhysicalDeviceVulkanXYFeatures`. Those extensions are supported by
the device when the device version is high enough.
For those extensions, set the screen flags directly if the device
version is sufficient, otherwise check for the extension as usual.
Fixes: efe6f00e ("zink/codegen: do not enable extensions that are now core")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9030>
This is a desktop OpenGL 2.1 extension that seems to be required by
glamor to support glyph rendering acceleration with R8 textures.
Implementation borrowed from vc4.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8969>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member m_pos_input is not
initialized in this constructor nor in any functions that it
calls.
Fixes: 374bc76706 ("r600/sfn: Add the position input as varying")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9004>
because amdgpu_bo_wait doesn't wait for active CS jobs using the buffers.
This fixes incorrect buffer reuse of busy buffers.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8895>
There's reuse of values depending on the stage, so a function that
just takes the value might produce invalid results. All the codebase
was already changed to use the gl_varying_slot_name_for_stage()
instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8998>