As the lowering mentioned there got renamed twice:
commit b085016f94
Author: Rob Clark <robclark@freedesktop.org>
Date: Fri Mar 25 13:52:26 2016 -0400
nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries
Since it will gain support to lower inputs, give it a more generic name.
commit 1754507d49
Author: Marek Ol¨ák <maraeo@gmail.com>
Date: Wed Jun 25 19:05:19 2025 -0400
nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
Reviewed-by: Marek Ol¨ák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35855>
This changes legacy GS outputs to use the same logic as NGG GS.
It enables the same optimizations that NGG has such as forwarding
constant GS output components to the GS copy shader at compile time.
ac_nir_gs_output_info is removed.
GS output info is no longer passed to ac_nir_lower_legacy_gs and
ac_nir_create_gs_copy_shader separately.
ac_nir_lower_legacy_gs now gathers ac_nir_prerast_out, generates GSVS ring
stores, and also generates the GS copy shader with GSVS ring loads.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This way we won't have to pass output info between the two functions.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This is a cleanup.
Old gs LDS layout: [es outputs][gs outputs][scratch]
Old nogs LDS layout: [xfb/cull][scratch]
New gs LDS layout: [es outputs][scratch|gs outputs]
New nogs LDS layout: [scratch|xfb/cull]
The LDS scratch is moved to the beginning of the preceding buffer in LDS,
while the addresses in that LDS buffer are offset by the scratch size.
It effectively merges the LDS scratch with the preceding buffer in LDS.
Thanks to that, we no longer need the ngg_scratch ABI and the offset
in a user SGPR.
The lowering passes now return the LDS scratch size, which is used
by the drivers to determine the final LDS size.
The ngg_lds_layout SGPR is now unused without GS in RADV.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
We incorrectly used it to determine whether the shader should cull, which
luckily had no effect because it wasn't used everywhere.
cull_clipdist_mask should be used instead, which also reflects whether
clip planes are enabled in GL.
clip_cull_dist_mask is renamed to export_clipdist_mask to make it clear.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This removes LDS space and loads/stores for constant GS & XFB output
components. Constant output components skip LDS stores, and LDS loads
are replaced with the gathered constants.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This simplifies the code and scalarizes the loads/stores.
Scalar loads/stores will allow forwarding constant output components
from stores to loads easily.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This simplifies the code and scalarizes the loads/stores.
Scalar loads/stores will allow forwarding constant output components
from stores to loads easily.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
This isn't a huge cleanup on its own, but it lets us start assuming
coroutine support, and I would like to unify graphics shader dispatch
to work the same way as compute shader dispatch.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35374>
The program info contains all sorts of flags concerning the compilation state of
the program, and those need to be reset when the program is going to be recompiled
when its source changes.
For example, before this change, after info.flrp_lowered got set following the
first call to glProgramStringARB, flrp lowering would not happen for any subsequent
updates to the program string, leading to compilation failures.
Signed-off-by: Charlotte Pabst <cpabst@codeweavers.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35523>
This looks like a hw bug on GFX10.3-GFX11.5 because RB+ seems to only
work as expected when all channels (RGBA) are written. With that format,
RGB channels must be all set or unset but setting the A channel is
legal so far.
This will reduce rendering performance with that format but it's the
less intrusive solution for now. This might be revisited in the near
future, also with more VKCTS coverage.
This has been tested and verified on GFX10.3 (NAVI21) and GFX11
(NAVI31) and GFX12 (NAVI48), unfortunately I don't have GFX11.5 but
let's assume it's broken there too.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13371
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35631>
A cooperative matrix can only be constructed from a single
scalar value. Print that value, wrapped by a function call that
looks like a type-constructor.
This adds a test case that will otherwise assert out in spirv2nir.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35757>
This changed in 76c6157902f1 ("nil/copy: Base swizzling on the per-plane
pipe_format") but I missed the case for linear render targets.
Fixes: 76c6157902f1 ("nil/copy: Base swizzling on the per-plane pipe_format")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35887>
This was only used to answer the DRI components attrib request, which
was made unused as of the previous commit to move knowledge of the enum
to wayland-drm server-side code.
In theory, this makes the format mapping a little less flexible, however
in practice wayland-drm was neither extensible nor extended, and has
been deprecated in favour of explicit user awareness of format
properties.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35856>
wayland-drm is the bit of EGL which implements
EGL_WL_bind_wayland_display, used as a legacy path by pre-dmabuf Wayland
compositors to handle buffer exchange with clients. This allows the
client (platform_wayland) and server (wayland-drm) sides to negotiate
buffer exchange behind the scenes.
In this model, the server is unaware of what the format means and how it
should be bound to an image, so we have to query how the image should be
set up, bound to textures, and sampled. To do this, they have to query
the EGL implementation to find out.
We had a bunch of generic DRI infrastructure to support this, but now
that the DRI interface is no longer a hard one, and wl_drm is deprecated
in favour of dmabuf anyway, we can get rid of a lot of this
pseudo-generic support code.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35856>
Versions prior to the r1p0 revision of Mali-T760 has a bug that prevents
16x MSAA from working properly. Let's detect this GPU, and disable the
functionality for it.
It's a bit awkward that we can't easily do this as part of the pan_model
infrastructure, but we'd otherwise have to introduce a new quirk that
depends on the GPU revision, which isn't something the current pan_model
infrastructure is prepared for. So let's just do it in get_max_msaa()
instead.
And while we're at it, let's also make the max_4x_msaa quirk do the
MIN2() properly. This can currently only happen on V4 GPUs when the
pan_allow_128bit_rts_v4 driconf is enabled, but it should be fixed in
either case.
Fixes: 423f3fd485 ("panfrost: enable 8x and 16x msaa modes when supported")
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35871>
Otherwise the comparison will always be false for protected content.
Also remove extra setting of the protected bit that was happening later.
Fixes: 8d9cc6aa23 ("anv: properly flag image/imageviews for ISL protection")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35870>
For each dimension, we `threads *= lws`.. which is still zero if threads
is initialized to zero.
Fixes: eca4f0f632 ("rusticl/kernel: check that local size on dispatch doesn't exceed limits")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35864>