Even though radeonsi may not use compute queues, other processes
might run compute jobs in the background, so radeonsi must make
sure not to use workgroups larger than 256 invocations on GPUs that
are affected by the regalloc hang.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
Even though radeonsi may not use compute queues, other processes
might run compute jobs in the background, so radeonsi must make
sure not to use workgroups larger than 256 invocations on GPUs that
are affected by the regalloc hang.
Unfortunately that means that for now RadeonSI won't be able to
support ARB_compute_variable_group_size on these GPUs.
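A minimal sketch of the limit described above. All names and the
non-256 values are hypothetical: the 1024 default and the 512
invocations assumed to be needed for variable group sizes are
illustrative assumptions, not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical limits for illustration; not the driver's actual constants. */
#define EX_DEFAULT_MAX_WORKGROUP_SIZE       1024u
#define EX_REGALLOC_HANG_MAX_WORKGROUP_SIZE 256u
/* Assumed minimum that variable group size support would have to offer. */
#define EX_VARIABLE_GROUP_MIN_INVOCATIONS   512u

/* Cap the workgroup size on GPUs affected by the regalloc hang. */
static unsigned ex_max_workgroup_size(bool has_regalloc_hang)
{
    return has_regalloc_hang ? EX_REGALLOC_HANG_MAX_WORKGROUP_SIZE
                             : EX_DEFAULT_MAX_WORKGROUP_SIZE;
}

/* With the cap in place, the assumed minimum can no longer be met. */
static bool ex_can_expose_variable_group_size(bool has_regalloc_hang)
{
    return ex_max_workgroup_size(has_regalloc_hang) >=
           EX_VARIABLE_GROUP_MIN_INVOCATIONS;
}
```

Under these assumptions, an affected GPU reports a maximum of 256 and
the extension check fails, which matches the consequence stated above.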
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
We don't need to take ETNA_DIRTY_SHADER into consideration for pure
updates of the constant states. When the shader is dirty constants
and code will be uploaded together and the update path will be skipped.
The uniform cache in the context has been removed in ee1ed59458
("etnaviv: prep for UBOs"), so the comment referencing this cache
is confusing and can go as well.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39422>
Constant buffers may be changed without the shader changing.
Check the correct dirty bits when marking constant buffers
as read during the draw to ensure proper synchronization.
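A rough sketch of the difference, assuming the old check looked only
at the shader dirty bit. The message does not spell out the exact bits
used by the fix, so the flag names, bit values and both predicates
below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; the driver's real ETNA_DIRTY_* values differ. */
#define EX_DIRTY_SHADER   (1u << 0)
#define EX_DIRTY_CONSTBUF (1u << 1)

/* Assumed old behavior: constant buffers are only marked as read when
 * the shader changed, so a constbuf-only update is missed. */
static bool ex_mark_constbufs_read_old(uint32_t dirty)
{
    return (dirty & EX_DIRTY_SHADER) != 0;
}

/* Assumed fixed behavior: a constant buffer change alone is enough. */
static bool ex_mark_constbufs_read_new(uint32_t dirty)
{
    return (dirty & (EX_DIRTY_SHADER | EX_DIRTY_CONSTBUF)) != 0;
}
```

With only EX_DIRTY_CONSTBUF set, the old predicate skips the marking
and the new one performs it.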
Fixes: a40a6e551e ("etnaviv: draw: only mark resources as read/written when the state changed")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39422>
These are useful for displaying very high color precision images with
more than 10 bpc color depth, and also more precision than what fp16
can do on a standard dynamic range (SDR) display, where fp16 for values
in the unorm 0.0 - 1.0 range is about equivalent to at most ~11 bpc
linear color depth. This is especially useful for and aimed at scientific
applications, e.g., neuroscience and other bio-medical research cases.
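The ~11 bpc figure can be checked with a small calculation. fp16
(IEEE binary16) has 10 fraction bits, so in [0.5, 1.0) adjacent values
are 2^-11 apart. The helper names below are made up for illustration.

```c
/* Spacing between adjacent binary16 values in [0.5, 1.0):
 * 10 fraction bits at exponent -1 give a step of 2^-11. */
static double ex_fp16_step_below_one(void)
{
    return 1.0 / (double)(1u << 11);
}

/* Step between adjacent codes of an n-bit unorm format (n < 64). */
static double ex_unorm_step(unsigned bits)
{
    return 1.0 / (double)((1ull << bits) - 1ull);
}
```

Near 1.0 the fp16 step (1/2048) is about the same as an 11 bit unorm
step (1/2047), while a 16 bit unorm step (1/65535) is far finer.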
At least current generation AMD GPUs released during the last 10 years
and supported by amdgpu-kms + atomic modesetting do allow for scanout of
such 16 bpc framebuffers and of up to 12 bpc output to suitable HDMI or
DisplayPort high precision displays.
We gate the format behind a new driconf option 'allow_rgb16_configs',
which defaults to true, but allows disabling the formats if any issues
should arise.
Most regular applications won't need the high display precision of
these new 16 bpc 64 bpp formats which have higher memory and bandwidth
requirements, and therefore a potential undesired performance impact
for regular apps. Followup per-platform enablement commits will use
the EGL_EXT_config_select_group extension to put these 16 bpc unorm
formats into a lower priority config select group 1, so they are not
chosen by default by eglChooseConfig(), but must be explicitly
requested by client applications which really need the high color
precision of these 64 bpp formats and are happy to pay the potential
performance impact. Thanks to Adam Jackson for pointing me to the
EGL_EXT_config_select_group extension.
If the format would be put into the default config select group 0, a
simple EGL eglChooseConfig() call would end up choosing these formats,
which is not what such regular apps would want.
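A simplified model of why the select group matters. This is not the
real eglChooseConfig() sort, which has more rules; the struct and
function names are hypothetical. It assumes configs are filtered by a
minimum red size, then ordered by select group first and by color depth
second.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced config description. */
struct ex_config {
    int select_group; /* lower group sorts first */
    int red_bits;
};

/* Nonzero if a should be preferred over b in this simplified model. */
static int ex_preferred(const struct ex_config *a, const struct ex_config *b)
{
    if (a->select_group != b->select_group)
        return a->select_group < b->select_group;
    return a->red_bits > b->red_bits;
}

/* Returns the red size of the first-choice config, or -1 if none match. */
static int ex_choose_red_bits(const struct ex_config *configs, size_t count,
                              int min_red_bits)
{
    const struct ex_config *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (configs[i].red_bits < min_red_bits)
            continue;
        if (!best || ex_preferred(&configs[i], best))
            best = &configs[i];
    }
    return best ? best->red_bits : -1;
}

/* 8, 10 and 16 bpc configs; only the group of the 16 bpc one varies. */
static int ex_demo_choice(int group_of_16bpc, int min_red_bits)
{
    const struct ex_config configs[] = {
        { 0, 8 }, { 0, 10 }, { group_of_16bpc, 16 },
    };
    return ex_choose_red_bits(configs, sizeof(configs) / sizeof(configs[0]),
                              min_red_bits);
}
```

In this model, with the 16 bpc config in group 0 a plain request picks
it, while in group 1 it is only picked when 16 bits are requested.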
Tested to not cause any change on native X11/EGL and X11/GLX, which only
support at most 10 bpc / 32 bpp formats.
Followup commits will enable these formats for the EGL/Wayland backend,
and on the EGL/DRM backend.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588>
PAL always sets WD_SWITCH_ON_EOP on pre-gfx10 GPUs when primitive
restart is enabled, to prevent a GPU hang.
The hang only happens with specific index streams that use primitive
restart. Since we don't know what the exact problem is,
just follow PAL and disable 4x primitive rate when primitive
restart is enabled.
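As a sketch only, the condition described above might look like the
following. The function name and the plain integer generation number
are hypothetical, and the driver may combine this with other
conditions.

```c
#include <stdbool.h>

/* Hypothetical: gfx_major is the graphics IP major version (9 for gfx9).
 * Returns true when WD_SWITCH_ON_EOP would be forced, which gives up
 * the 4x primitive rate. */
static bool ex_force_wd_switch_on_eop(unsigned gfx_major,
                                      bool primitive_restart_enabled)
{
    return gfx_major < 10 && primitive_restart_enabled;
}
```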
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14629
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39292>
Only add the appropriate picture_{h264,h264_enc,vc1,...}.c file when the
corresponding codec is enabled via the -Dvideo-codecs flag.
Add stub functions to va_private.h, so that the code in decode.c and
encode.c remains untouched.
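The stub approach could be sketched roughly as follows. The macro,
struct and function names are made up for this example and are not the
ones used in va_private.h.

```c
/* Hypothetical build-time switch; a build system would define it to 1
 * when the codec is selected. */
#ifndef EX_VIDEO_CODEC_H264DEC
#define EX_VIDEO_CODEC_H264DEC 0
#endif

struct ex_decode_state {
    int h264_params_handled;
};

#if EX_VIDEO_CODEC_H264DEC
/* Real implementation, compiled only when the codec is enabled. */
void ex_handle_picture_params_h264(struct ex_decode_state *state);
#else
/* Stub: lets callers compile unchanged when the codec is disabled. */
static inline void ex_handle_picture_params_h264(struct ex_decode_state *state)
{
    (void)state;
}
#endif

/* Caller code stays the same regardless of the codec selection. */
static int ex_decode_once(void)
{
    struct ex_decode_state state = { 0 };

    ex_handle_picture_params_h264(&state);
    return state.h264_params_handled;
}
```

With the codec disabled, the call resolves to the empty stub and the
state is left untouched.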
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39354>
Add mpeg12dec as a selectable video-codec and add a corresponding check
to vl_codec_supported.
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39354>
Add jpeg as a selectable video-codec and add a corresponding check to
vl_codec_supported.
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39354>
This replaces all full license headers with SPDX identifiers and
generally makes things more consistent. I've also dropped the few
remaining author tags. If someone wants to know who wrote a bit of
code, `git blame` is going to be way more accurate than author tags
anyway.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
Without this, non-dynamically-supported state changes that require a pipeline
change (like blend states without full_ds3) that happen in between drawcalls
get ignored unless another one of the conditions also happened to be true.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39381>
The Vulkan spec requires binding flags to be matched with the binding
at the same index. However, currently the bindings get sorted while the
flags do not, which leads to a mismatch between bindings and flags.
Resolve this by adding optional flags info to the parameters of
vk_create_sorted_bindings(), and refactoring panvk/pvr (which really
pair bindings and flags instead of only iterating flags) to use sorted
flags.
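One way to keep flags paired with their bindings while sorting is to
sort them as a single record. This is a sketch with hypothetical types
and names; it is not the signature of vk_create_sorted_bindings().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a layout binding and its paired flags. */
struct ex_binding {
    uint32_t binding;
};

struct ex_entry {
    struct ex_binding binding;
    uint32_t flags;
};

static int ex_compare_entries(const void *pa, const void *pb)
{
    const struct ex_entry *a = pa;
    const struct ex_entry *b = pb;

    return (a->binding.binding > b->binding.binding) -
           (a->binding.binding < b->binding.binding);
}

/* Sorts bindings by binding number and applies the same permutation to
 * the optional flags array. Returns 0 on success, -1 on allocation failure. */
static int ex_sort_bindings_with_flags(struct ex_binding *bindings,
                                       uint32_t *flags, size_t count)
{
    if (count == 0)
        return 0;

    struct ex_entry *entries = malloc(count * sizeof(*entries));
    if (!entries)
        return -1;

    for (size_t i = 0; i < count; i++) {
        entries[i].binding = bindings[i];
        entries[i].flags = flags ? flags[i] : 0;
    }
    qsort(entries, count, sizeof(*entries), ex_compare_entries);
    for (size_t i = 0; i < count; i++) {
        bindings[i] = entries[i].binding;
        if (flags)
            flags[i] = entries[i].flags;
    }
    free(entries);
    return 0;
}

/* Bindings 2, 0, 1 carry flags 0xA0, 0xB0, 0xC0; returns the flags found
 * at the given index after sorting, or UINT32_MAX on error. */
static uint32_t ex_demo_flags_at(size_t sorted_index)
{
    struct ex_binding bindings[] = { { 2 }, { 0 }, { 1 } };
    uint32_t flags[] = { 0xA0, 0xB0, 0xC0 };

    if (sorted_index >= 3 || ex_sort_bindings_with_flags(bindings, flags, 3) != 0)
        return UINT32_MAX;
    return flags[sorted_index];
}
```

After sorting, index 0 holds binding 0 together with its own flags
(0xB0) rather than the flags that happened to be first in the input.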
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>
aco implements the same logic, and in the future it will make changes to
config->float_mode to avoid unnecessary s_setreg.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
si_sqtt_start / si_sqtt_stop use emit_barrier which clears barrier_flags.
Since these functions are used to build an auxiliary cs which will only
be emitted later (on sqtt enablement/disablement) it shouldn't clear
the global barrier_flags value.
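One possible way to keep the global value intact is to save and restore
it around the auxiliary build. This is only a sketch with hypothetical
names; the actual change may solve it differently.

```c
/* Hypothetical reduced context holding the pending barrier flags. */
struct ex_sqtt_ctx {
    unsigned barrier_flags;
};

/* Stand-in for a barrier emission that consumes the pending flags. */
static void ex_emit_barrier(struct ex_sqtt_ctx *ctx)
{
    ctx->barrier_flags = 0;
}

/* Builds an auxiliary command stream without losing the flags that are
 * still pending for the main command stream. */
static void ex_build_aux_cs(struct ex_sqtt_ctx *ctx, unsigned aux_flags)
{
    const unsigned saved = ctx->barrier_flags;

    ctx->barrier_flags = aux_flags;
    ex_emit_barrier(ctx);
    ctx->barrier_flags = saved;
}

/* Returns the pending flags left after building the auxiliary stream. */
static unsigned ex_demo_flags_after_aux(unsigned pending)
{
    struct ex_sqtt_ctx ctx = { pending };

    ex_build_aux_cs(&ctx, 0x4u);
    return ctx.barrier_flags;
}
```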
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
The pattern:
    sctx->barrier_flags |= ...;
    si_mark_atom_dirty(sctx, &sctx->atoms.s.barrier);
is used a lot, so let's add an inline helper. This prevents
forgetting the call to si_mark_atom_dirty.
si_upload_bindless_descriptors is special because we're
already in the emit phase so we shouldn't dirty barrier
again.
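The shape of such a helper, sketched on a reduced stand-in context.
The struct and function names are hypothetical and do not match the
helper added to radeonsi.

```c
#include <stdbool.h>

/* Hypothetical reduced context: pending flags plus a dirty marker that
 * stands in for the barrier atom. */
struct ex_barrier_ctx {
    unsigned barrier_flags;
    bool barrier_atom_dirty;
};

static inline void ex_mark_barrier_dirty(struct ex_barrier_ctx *ctx)
{
    ctx->barrier_atom_dirty = true;
}

/* Combines both steps so callers cannot forget the dirty marking. */
static inline void ex_set_barrier_flags(struct ex_barrier_ctx *ctx,
                                        unsigned flags)
{
    ctx->barrier_flags |= flags;
    ex_mark_barrier_dirty(ctx);
}

/* Returns true if one helper call set the flags and dirtied the atom. */
static bool ex_demo_helper_sets_both(unsigned flags)
{
    struct ex_barrier_ctx ctx = { 0u, false };

    ex_set_barrier_flags(&ctx, flags);
    return ctx.barrier_flags == flags && ctx.barrier_atom_dirty;
}
```

A caller that is already in the emit phase would keep ORing the flags
directly instead of using the helper, as noted above.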
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it. This simplifies the back-end
and is less code, if you ignore the new copyright header.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
a longstanding issue in zink has been the scenario where a dmabuf is
created for e.g., RGBA8888, then the app tries to do SRGB, but the driver
doesn't support mutable formats with the dmabuf modifier. in this scenario, the app
would either crash or break unpredictably
by reusing the existing transient mechanism (previously only for msrtss emulation),
these dmabufs can instead have a shadow image which handles mutable formats and
then syncs back to the main image when necessary
this should greatly improve the situation on e.g., Intel
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39336>