The shader assembly was only available when not hitting the cache.
Additionally the serialized shader code was also the relocated variant
which meant that it could differ from one run to the next. Instead
serialize the unrelocated code produced by the compiler.
With this change we now decode the copy of the ISA we have on the
host.
NIR dumps are only available for shaders not loaded from the cache
(much like the other drivers).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8f4c2bd566 ("anv: add runtime shader statistic support")
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
This is potentially nicer for some drivers. AMD drivers will use it.
mesa_frag_result_get_color_index will be used often.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>
We used to maximize threads_per_task, but that is ideal when the system
has a single gpu client. When there are multiple gpu clients, we want
smaller threads_per_task such that cores can be more fairly shared among
the clients.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
task_axis selects the dim of the global workgroup, not the dim of the
local workgroup.
v2: fix assert for dEQP-VK.compute.pipeline.basic.empty_workgroup*
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org> (v1)
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
Set compute_ep_limit to max_tasks_per_core on v12+. It is generally a
good idea to queue as many tasks as possible to better utilize the
cores.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
Since v12, RUN_COMPUTE.ep_limit specifies the size of the compute task
queue. RUN_COMPUTE stalls when there are more tasks in the queue than
the specified ep_limit.
Sensible values are 0 (treated as 4), 4, or 16 (max_tasks_per_core).
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
Coverity notices that we read past the end of the array we're pointing
to, which is intentional, we want to copy additional members from the
source struct into the target pointer. As such, cast to a `void *`,
since this will make Coverity happy.
CID: 1649589
Fixes: 314de7af06 ("anv: Initial support for VP9 decoding")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38438>
Instead of manually applying the patch, backport the version that landed
in main, which requires a cmake argument to enable.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
Unlike what the comment said here, V9 does in fact support a single
float-format, so let's allow that.
But also, V10 and later supports FP16 formats, but this incorrect check
made that not work. Enable the FP16 formats also while we're at it. We
don't need any additional checks here, because the 16-bit unorm formats
were also added in V10, so util_format_any_to_unorm() does the right
thing here.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38848>
There's no reason why the S8_UINT check should be written in a different
way than the other checks here; let's make this consistent.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38848>
nir_lower_clip_cull_distance_array_vars was sneakily updating
shader_info::clip/cull_distance_array_size.
This moves the gathering into a new function
nir_gather_clip_cull_distance_sizes_from_vars.
v2: remove assertions that prevented nir_lower_clip_cull_distance_array_vars
from being used with non-compact arrays
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38465>
Metal will prematurely discard fragments with side effects even if those
side effects happen before the discard. Work around this by making said
discards "optional".
Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38741>
Consolidate importing paths by using the new importing
function so that compressed buffer can be imported
correctly.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36825>