pop_enable_group calls _mesa_set_enable for every state it changes,
so we don't need to do anything else.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8850>
At -O1 with GCC 10.2.1, _nir_visit_dest_indirect (declared ALWAYS_INLINE)
will fail to inline if its caller (nir_foreach_dest) is not inlined,
because _nir_visit_dest_indirect is passed as a function pointer. This
results in a compilation error.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Fixes: 336bcbacd0 ("nir: inline nir_foreach_{src,dest}")
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4353
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9301>
Test a full transformation path (load_uniform -> load_ubo -> load_uniform)
and validate the load_uniform offset.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9305>
The restoring of the actual uniform offset was wrong.
Fixes: 1837135f7c ("etnaviv: nir: add ubo lowering pass")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9305>
This is a little bit contorted because the Z storage for the tile is
either float or int depending on the Z format, so we have to be careful
about types when comparing.
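
A minimal sketch of the type-aware comparison described above. It assumes a
hypothetical tile value that holds either a float or a 32-bit unorm Z; the
names z_value and depth_bounds_pass are illustrative and not taken from the
driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tile storage: the same memory holds float or integer Z,
 * depending on the depth format. */
union z_value {
   float f;
   uint32_t u;
};

/* Returns true if the stored Z lies within [zmin, zmax].
 * Bounds are doubles in [0.0, 1.0]. */
static bool
depth_bounds_pass(union z_value z, bool z_is_float, double zmin, double zmax)
{
   if (z_is_float) {
      /* Float storage: compare in floating point. */
      return z.f >= zmin && z.f <= zmax;
   }

   /* Integer storage (assumed to be Z32_UNORM in this sketch): scale the
    * bounds to the integer range and compare as integers. */
   uint32_t imin = (uint32_t)(zmin * 4294967295.0);
   uint32_t imax = (uint32_t)(zmax * 4294967295.0);
   return z.u >= imin && z.u <= imax;
}
```

The point of the branch is that the raw bits of the tile are never compared
against bounds expressed in the other representation.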
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9287>
... instead of truncating to GLfloat. This seems somewhat silly since
the "clamp" part means only values [0.0, 1.0] are defined, but if the
depth buffer is Z32_UNORM then storing as GLfloat means you lose 8 bits
of depth bounds precision. This happens not to matter, yet, since swrast
classic doesn't support Z32_UNORM for depth, and the software gallium
drivers don't support EXT_depth_bounds_test. But the latter part is
about to change.
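
To illustrate the precision argument: a float has a 24-bit significand, so a
32-bit unorm value cannot in general survive a round trip through float, while
it can through double. The helper names below are hypothetical and only serve
the demonstration.

```c
#include <stdint.h>

/* Round-trip a 32-bit unorm depth value through float storage. */
static uint32_t
unorm32_via_float(uint32_t z)
{
   float f = (float)((double)z / 4294967295.0);
   return (uint32_t)((double)f * 4294967295.0 + 0.5);
}

/* Round-trip the same value through double storage. */
static uint32_t
unorm32_via_double(uint32_t z)
{
   double d = (double)z / 4294967295.0;
   return (uint32_t)(d * 4294967295.0 + 0.5);
}
```

A value such as 0x80000001 comes back unchanged from unorm32_via_double() but
not from unorm32_via_float(), which is the loss of low bits described above.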
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9287>
In release builds, there should be no change, but in debug builds the
assert will help us catch undefined behavior resulting from using
util_cpu_caps before it is initialized.
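
A rough sketch of the pattern, using hypothetical names (cpu_caps,
sketch_cpu_detect, get_cpu_caps) rather than the actual util code: callers go
through an accessor that asserts detection has already run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU capability table. */
struct cpu_caps {
   bool detected; /* set once detection has run */
   int nr_cpus;
};

static struct cpu_caps caps_storage;

/* Hypothetical detection routine; a real one would probe the CPU. */
static void
sketch_cpu_detect(void)
{
   caps_storage.nr_cpus = 1; /* placeholder value */
   caps_storage.detected = true;
}

/* Accessor: in debug builds the assert fires if detection was skipped.
 * With NDEBUG defined it compiles away, so release builds are unchanged. */
static inline const struct cpu_caps *
get_cpu_caps(void)
{
   assert(caps_storage.detected);
   return &caps_storage;
}
```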
With fix for u_half_test for MSVC from Jesse Natalie squashed in.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>
I noticed that we were hitting this before st_create_context() called
util_cpu_detect() and so num_cpu_mask_bits was zero. But there is no
harm in calling util_cpu_detect(), so let's just call it here to be safe.
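
The "no harm in calling it again" idea can be sketched as below, assuming
detection returns early once it has run. All names and the value 64 are
placeholders for illustration, and this simplified version is not thread-safe.

```c
#include <stdbool.h>

static bool detected;              /* hypothetical "already ran" flag */
static unsigned num_cpu_mask_bits; /* zero until detection has run */
static unsigned detect_runs;       /* counts how often real work happened */

/* Hypothetical detection: repeated calls return early, so callers can
 * invoke it defensively. */
static void
detect_cpu_once(void)
{
   if (detected)
      return;
   num_cpu_mask_bits = 64; /* placeholder value */
   detect_runs++;
   detected = true;
}

/* A consumer that cannot assume someone else already ran detection. */
static unsigned
queue_affinity_bits(void)
{
   detect_cpu_once();
   return num_cpu_mask_bits;
}
```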
Fixes: d877451b48 ("util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>
So that RGP reports the memory type and the memory throughput.
Based on AMDVLK.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9303>
Since value-numbering no longer works across loops, we no longer need to
use v_readfirstlane_b32.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>
face
The opcode evaluates the unnormalized coordinates, the length of the
major axis, and the cube face.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>
E.g. on r600 a cube texture lookup uses a specific cube instruction
to evaluate the sample coordinates and the face ID, so that the cube
texture lookup can be lowered to an array texture lookup, thereby sharing
the code with the 2D array texture lookup.
However, for TXD the given gradients still need to be three-component
vectors, so add a flag that tells NIR validation that we are dealing with a
cube texture that was lowered to an array, so it can validate accordingly.
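
A simplified sketch of the validation rule, with a hypothetical reduced set of
sampler dimensions and a hypothetical helper name; it is not the actual NIR
validation code.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of sampler dimensions. */
enum sampler_dim { DIM_2D, DIM_CUBE };

/* Number of components expected in each TXD gradient vector.
 * lowered_cube marks a 2D array lookup that was lowered from a cube lookup. */
static unsigned
expected_grad_components(enum sampler_dim dim, bool is_array, bool lowered_cube)
{
   if (dim == DIM_CUBE)
      return 3;
   if (dim == DIM_2D && is_array && lowered_cube)
      return 3; /* gradients are still three-component vectors */
   return 2;    /* plain 2D or 2D array */
}
```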
v2: Handle new flag in serialization (Marek)
v3: Rebase so that the change does not require the patch to deduce the
number of offset and grad components from sampler type
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>
this ends up being a tradeoff where we waste a little startup time and
an extra ~4k memory for the overall screen object in exchange for never having
to fetch format properties again, which is a surprisingly expensive call
to be making as much as we have to make it
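
the tradeoff can be sketched roughly like this, assuming a hypothetical fixed
format count and a stand-in for the expensive query; the struct is only loosely
modeled on the three feature masks of VkFormatProperties, and none of the names
come from the driver.

```c
#include <stdint.h>

#define NUM_FORMATS 256 /* hypothetical format count */

struct format_props {
   uint32_t linear_features;
   uint32_t optimal_features;
   uint32_t buffer_features;
};

struct screen {
   struct format_props format_props[NUM_FORMATS];
};

static unsigned query_calls; /* counts the expensive queries */

/* Stand-in for the expensive driver query; the values are made up. */
static struct format_props
query_format_props(unsigned format)
{
   query_calls++;
   struct format_props p = { format, format * 2u, format * 3u };
   return p;
}

/* Pay the cost once, at screen creation. */
static void
screen_init_format_props(struct screen *s)
{
   for (unsigned i = 0; i < NUM_FORMATS; i++)
      s->format_props[i] = query_format_props(i);
}

/* Later lookups are plain array reads with no query involved. */
static const struct format_props *
screen_get_format_props(const struct screen *s, unsigned format)
{
   return &s->format_props[format];
}
```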
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9293>
For a while we were doing 3-space indent with 8-space tabs, largely
due to the emacs settings of a couple of contributors. We stopped
using tabs a long time ago, and they're just a nuisance at this point.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9207>
These had the function name baked into the perf_debug message, which
after a bunch of refactoring, had fallen out of sync with the actual code.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9207>
Allow ctx to be NULL in perf_debug_ctx() and make perf_debug() a
shortcut for perf_debug_ctx(NULL, ...) to simplify things slightly
in the next patch.
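
A minimal sketch of the shape of this change. It assumes a hypothetical
debug_ctx struct and writes messages to a buffer instead of a real debug
callback; the actual perf_debug_ctx() in the driver may be implemented quite
differently.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MSG_LEN 128

/* Hypothetical context carrying a per-context message sink. */
struct debug_ctx {
   char last_msg[MSG_LEN];
};

static char global_last_msg[MSG_LEN]; /* used when no context is given */

/* Formats a message into the context's buffer, or into the global
 * buffer when ctx is NULL. */
static void
perf_debug_ctx(struct debug_ctx *ctx, const char *fmt, ...)
{
   char *dst = ctx ? ctx->last_msg : global_last_msg;
   va_list ap;

   va_start(ap, fmt);
   vsnprintf(dst, MSG_LEN, fmt, ap);
   va_end(ap);
}

/* Shortcut for call sites that have no context at hand. */
#define perf_debug(...) perf_debug_ctx(NULL, __VA_ARGS__)
```

With this arrangement a call such as perf_debug("fallback %d", 3) and a call
with an explicit context share one code path.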
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9264>
This was meant to be an || rather than &&, although it didn't matter for
shaderdb because both conditions would be true. But it did matter if
you were trying to force synchronous compile to avoid having nir/ir3
prints interleaved from multiple threads.
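
The difference between the two operators can be sketched as follows. The
parameter names are hypothetical placeholders for the two conditions and do
not correspond to actual driver flags.

```c
#include <stdbool.h>

/* Buggy form: synchronous compile only when both conditions hold. */
static bool
force_sync_buggy(bool shaderdb_mode, bool sync_requested)
{
   return shaderdb_mode && sync_requested;
}

/* Intended form: either condition is enough. */
static bool
force_sync_fixed(bool shaderdb_mode, bool sync_requested)
{
   return shaderdb_mode || sync_requested;
}
```

When both inputs are true the two forms agree, which is why the bug stayed
hidden for shaderdb runs; they only differ when exactly one input is true.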
While at it, add a more specific debug flag to force initial variant
compile to be synchronous, because at some point the 'shaderdb' flag
itself will not force this.
Fixes: 75b0c4b5e1 ("freedreno/ir3: Async shader compile")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9264>