This also makes all the paths a bit more clear because we only ever
clflushopt on the clflusopt paths and only ever clflush on the clflush
paths. It's really not much more code or logic duplication.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
The x86 implementation was shamelessly stolen from intel_mem.c and the
aarch64 implementaiton was based on the code in Turnip.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37803>
Somehow dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.atan_frag
was able to generate a bitfield_select with a constant first
parameter. That makes the big comment here completely false.
Don't be clever. If the constant is in the wrong place,
commute_immediates during copy propagation will fix it.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
This prevents lower_regioning from doing bad things when the destination
and all the other sources are UW.
Other solutions considered:
- Have the type of src[3] match the destination type. This also required
changes in combine_constants to allow the type be UD or UW.
- Make a new subclass brw_bfn_inst, and store the Boolean function
selector outside the src[] array. This was a lot more code and a lot
more churn (+47,-27 vs +4).
Fixes: b948e6d503 ("brw: Use BFN to implement nir_opt_bitfield_select")
Suggested-by: Curro
Suggested-by: Ken
Closes: #14095
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.
Via Coccinelle patches:
@@
expression ptr;
@@
-if (ptr) {
-free(ptr);
-}
+free(ptr);
@@
expression ptr;
@@
-if (ptr) {
-FREE(ptr);
-}
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr) {
-ralloc_free(ptr);
-}
+ralloc_free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-free(ptr);
-}
-
+free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-FREE(ptr);
-}
-
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-ralloc_free(ptr);
-}
-
+ralloc_free(ptr);
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892>
Use an explicit path for libgl-symbols.txt from the build system
instead of reconstructing it from __file__.
The issue is that for Android build system, everything is sandboxed
and that file is not in the same root as the python script. Thus we
need a proper explicit path in meson to be able to translate it to a
legal Android construct that is capable of finding that file.
Update everything using libgl_public_functions to propagate that path.
Ref #14072
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37866>
tu_u_trace_submission_data_finish happens on the other thread than
tu_create_copy_timestamp_cs.
Fixes: 6e5944ec4b ("tu: Cache copy timestamp cs to avoid allocations on submit")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37848>
Mitigate a GPU hang in Dota 2 and Rise of the Tomb Raider
by reducing the primitive rate for triangle lists.
This workaround is not documented by AMD and may not be correct.
The problem isn't well understood and needs further investigation
to narrow down what the root cause is. Until then, it's better
to give users something that works, even if not optimal.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
Compute queues may run compute dispatches in parallel with
the graphics queue, even from other processes/apps.
At the moment we can't make sure that all compute shaders
use a workgroup size of 256 to mitigate the regalloc hang,
so disable compute queues on affected chips.
Can be reverted if a better mitigation is found in the future.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
It already didn't use compute queues on GFX6, but some GFX7
chips are also affected by the same bug.
Compute queues may run compute dispatches in parallel with
the graphics queue, even from other processes/apps.
At the moment we don't have a way to restrict all workgroups
to 256 invocations, so instead let's make sure not to use the
compute queue.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37885>
This is used by DLSS to pass in image view
descriptors via parameter buffers for its
kernel launches.
Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37889>