With autotune allocating counters low-to-high, the conflict with
PERFORMANCE_QUERY_KHR will happen if any CP-based counters are
used. This is a temporary workaround which just drops the first
two CP counters from being usable for performance queries.
Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40949>
This is more consistent with the newly established pattern of the
UMD allocating all locally used performance counters low-to-high
instead of the prior high-to-low order.
Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40949>
The UMD will be switching to allocating counters from low-to-high,
so to avoid the chances of conflict with this new policy the PPS
driver now allocates the other way around. Additionally, this will
future proof it for the MSM-DRM uAPI for performance counters which
will similarly allocate from high-to-low.
Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40949>
When preemption optimization is supported then the necessary CP
counters being missing causes a device initialization error which
is unnecessary as support can simply be disabled instead to allow
for a more graceful fail. This also fixes A8XX which doesn't have
performance counters hooked up yet.
Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40949>
Future kernel API for perfcounter management will likely be required for
a8xx and onwards. For a7xx and earlier, cmdstream-based selector and
counter register management is still supported.
Cc: mesa-stable
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40949>
With the pass order shuffling, code like `(x & 0xf) + (x & 0xfffffff0)` gets
optimized to bitfield_select(0xF, x, x). But it would be much better to optimize
simply to x. nir_opt_algebraic would do that for us but we run this pass too
late for algebraic to save us from ourselves, so be smarter.
Observed on dEQP-GLES31.functional.compute.basic.image_atomic_op_local_size_8
with Jay, this saves an instruction there.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40956>
If ac_drm_device_initialize returns -EACCES for the fd passed in.
A render node file description can't have DRM master status, which means
AMDGPU_CTX_PRIORITY_HIGH can't work without CAP_SYS_NICE (which
generally only the root user has).
Fixes: 8f30e90fc1 ("winsys/amdgpu: Prefer render node FD for ac_drm_device_initialize")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40974>
When we unsychronized map a BO, we tell Valgrind that the content is not
initialized yet.
But we forgot to mark it as defined when the map finishes, which leads to
several conditional jump or move depends on uninitialised value(s)
warnings when using Valgrind.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40964>
Applications are required to set NonUniform if the resource is arrayed,
but with VK_DESCRIPTOR_MAPPING_SOURCE_HEAP_WITH_SHADER_RECORD_INDEX_EXT,
the resource is non-arrayed in the shader. So, it's technically not
required to set it. Although, the offset can vary per-lane and
NonUniform is implicit.
Backport-to: 26.1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40953>
"size" is the allocated size of the array, not the number of immediates
actually used. We could wind up returning a too-large constlen, larger
than 512, and since the binning variant uses the non-binning variant's
constlen as it's max_const we could make binning variants use c512.x and
crash when encoding.
Fixes: 86f3c0c4c2 ("ir3: simplify constlen calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40961>
We need to know the immediate count even after lowering, to compute the
overall const size. Previously we were using the capacity field, but
that's unreliable and won't be available once we switch to a real
dynamic array container instead of (poorly) reinventing one.
Fixes: 86f3c0c4c2 ("ir3: simplify constlen calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40961>
For simple copy, we can copy data with uvec4(16bytes) at once.
When we have serialize/deserialize copy mode, we want to copy out the
instance leave address which are 8byte wide, so we need to jump with
8byte stride instead of 16bytes.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40966>
GLES2/3 doesn't expose alpha test, user clip planes, or two-sided color.
These lowers therefore shouldn't disable the one-variant fast path for GLES2/3 contexts.
Keep the lowering itself unchanged and only relax the shader_has_one_variant checks.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40905>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state() for both PBO upload and download
paths. Only stream outputs, render condition, and pause queries
(no atoms) still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state(). Only stream outputs, render
condition, and pause queries (no atoms) still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state(). The conditional DSA/blend
invalidation for stencil writes is preserved. Only stream
outputs (no atom) still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state(). Only stream outputs and pause
queries (no atoms) still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state(). Only stream outputs (no atom)
still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Replace CSO save/restore of atom-backed states with
st_context_invalidate_state(). Only stream outputs (no atom)
still use CSO save/restore.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>
Add ST_INVALIDATE_* flags for all states that meta-operations
(bitmap, clear, drawpixels, etc.) modify behind the state tracker's
back. This enables meta-ops to use st_context_invalidate_state()
instead of CSO save/restore for state re-derivation.
The atoms already know how to re-derive Gallium state from GL context,
so marking states dirty via st_context_invalidate_state() is
sufficient - st_validate_state() will call the atoms before the next
draw.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40592>