The use of pre_chain.rp_trace relied on a number of bad assumptions that,
together with u_trace_move, didn't cause issues until the u_trace
refactoring started. Fixing those assumptions and correctly initializing
and freeing pre_chain.rp_trace also requires fixing u_trace_move at the
same time.
u_trace_move fixes:
- If dst already had trace chunks in it, we may have leaked them.
- The correct list move pattern is "list_replace -> list_inithead",
  not "list_replace -> list_delinit".
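To illustrate why the second step of the move must be list_inithead, here
is a minimal sketch of a circular doubly-linked list. It is modeled on the
helpers in Mesa's util/list.h but is a standalone reimplementation written
for illustration. move_all() and list_length() are hypothetical names, and
the requirement that dst is empty is an assumption standing in for "free
dst's chunks first".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list (illustrative reimplementation). */
struct list_head {
   struct list_head *prev, *next;
};

static void list_inithead(struct list_head *h) { h->prev = h; h->next = h; }

static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_addtail(struct list_head *item, struct list_head *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* Make `to` take over every node linked on `from`. Afterwards `from`
 * still holds stale pointers into the list that now belongs to `to`. */
static void list_replace(struct list_head *from, struct list_head *to)
{
   if (list_is_empty(from)) {
      list_inithead(to);
   } else {
      to->prev = from->prev;
      to->next = from->next;
      from->next->prev = to;
      from->prev->next = to;
   }
}

static size_t list_length(const struct list_head *h)
{
   size_t n = 0;
   for (const struct list_head *it = h->next; it != h; it = it->next)
      n++;
   return n;
}

/* Hypothetical move helper. dst must be empty: anything it still held
 * would become unreachable (leaked) once its links are overwritten. */
static void move_all(struct list_head *dst, struct list_head *src)
{
   assert(list_is_empty(dst));
   list_replace(src, dst);
   /* Reset src rather than unlinking it: an unlink would follow src's
    * stale prev/next pointers and splice dst's head out of its own list. */
   list_inithead(src);
}
```

After move_all(), src is a valid empty list and dst owns all nodes in
their original order.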
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41390>
Format draw and draw_indexed Perfetto events with their vertex count.
For draw_indirect and draw_indexed_indirect, include the draw count
when indirect tracing is enabled (MESA_GPU_TRACES=indirects), otherwise
fall back to the static name.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41374>
Format compute events as compute(x,y,z) using the end-payload group
dimensions. Trailing dimensions that equal 1 are omitted to keep labels
concise — e.g. compute(128,1,1) becomes compute(128).
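The trailing-1 trimming can be sketched as follows. This is an illustrative
helper, not the code from the patch: format_compute_label() is a
hypothetical name, and always printing x (even when it is 1) is an
assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes "compute(x[,y[,z]])" into buf, dropping
 * trailing dimensions that equal 1. x is always printed (assumption).
 * Returns the snprintf() result, i.e. the length of the full label. */
static int format_compute_label(char *buf, size_t size,
                                uint32_t x, uint32_t y, uint32_t z)
{
   assert(buf != NULL && size > 0);

   if (z != 1)
      return snprintf(buf, size,
                      "compute(%" PRIu32 ",%" PRIu32 ",%" PRIu32 ")", x, y, z);
   if (y != 1)
      return snprintf(buf, size, "compute(%" PRIu32 ",%" PRIu32 ")", x, y);
   return snprintf(buf, size, "compute(%" PRIu32 ")", x);
}
```

Note that only trailing dimensions are dropped: with this sketch a
(1,1,4) dispatch keeps all three values, since removing a middle 1 would
make the label ambiguous.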
For compute_indirect, the dispatch dimensions are not known at command
record time since they live in GPU memory as a VkDispatchIndirectCommand.
The u_trace framework reads them back at trace flush time via the
is_indirect mechanism: the GPU address is recorded alongside the
tracepoint, and u_trace copies the pointed-to struct into indirect_data
once the GPU has finished. The same trailing-1 trimming is applied when
indirect tracing is enabled (MESA_GPU_TRACES=indirects); otherwise the
event falls back to the static "compute_indirect" name.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41374>
Add a separate end_event_dyn() that takes a std::string by value for
dynamic event names. The [=] lambda capture deep-copies the string into
the closure, avoiding a dangling pointer when the Trace() continuation
runs after the caller's stack frame is gone.
The existing end_event() with const char* remains for string literals
and long-lived pointers (e.g. payload->str), where no copy is needed.
CREATE_DUAL_EVENT_CALLBACK_DYN formats the event name via snprintf and
passes the result as a std::string to end_event_dyn(). Follow-up patches
will use this macro to label events with runtime dimensions.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41374>
When an uncached memory type is used under emulation, most games suffer a
significant performance penalty because the buffer is accessed atomically.
When this option is set, uncached buffer allocations are overridden to be
cached+coherent if the host supports it. This still allows the atomic
accesses, but without the abysmal performance.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41323>
Instead of leaving timestamp_copy_data half-initialized in
copy_timestamp_cs_pool, always keep it fully initialized and in a valid
state there.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41438>
Add a new helper function pvr_pbe_format_num_sample_components that maps a
pvr_transfer_pbe_pixel_src format to the number of components it actually
uses. Use pvr_pbe_format_num_sample_components in pvr_uscgen_tq_frag_load
to set params.sample_components before calling pco_emit_nir_smp, so the
instruction is emitted with the correct component count. This allows the
generation of a more optimal SMP instruction, avoiding the emission of
unused result components.
Signed-off-by: Caius Moldovan <caius.moldovan@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41393>
We've been reporting in features.txt that we support this extension
unconditionally, but we didn't. Now that we have the bits wired up due
to Vulkan, we can actually enable it on Bifrost and later.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
Enabling clamping in the opcode here doesn't do quite what we need. This
makes the HW clamp to the max LOD specified in the sampler, but we need
to clamp to the maximum available LOD instead, which is the minimum
of the max-lod of the sampler and the max level in the texture itself.
We also need to take the mipmap mode into account when computing the
level of detail. This is not something the TEX_GRADIENT instruction
does, so we need to do this manually.
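As a rough sketch of the manual computation described above (not the
compiler code from this change): every name below is hypothetical, and the
round-half-up rule used for the nearest mipmap mode is an assumption, since
the exact rounding is defined by the API spec.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: clamp an LOD to the maximum available LOD, i.e.
 * the minimum of the sampler's max LOD and the last mip level that
 * exists in the texture. */
static float clamp_to_available_lod(float lod, float sampler_max_lod,
                                    unsigned texture_levels)
{
   assert(texture_levels > 0);

   float max_level = (float)(texture_levels - 1);
   float max_lod = sampler_max_lod < max_level ? sampler_max_lod : max_level;

   return lod > max_lod ? max_lod : lod;
}

/* Hypothetical helper: with a nearest mipmap mode a single level is
 * selected, so the LOD is rounded; with a linear mode it is kept as-is.
 * Assumes lod >= 0 and round-half-up (assumption, see the lead-in). */
static float apply_mipmap_mode(float lod, bool mipmap_nearest)
{
   if (!mipmap_nearest)
      return lod;
   return (float)(int)(lod + 0.5f);
}
```

For a texture with 4 levels and a sampler max LOD of 1000, an LOD of 7
clamps to 3 with this sketch, whereas clamping to the sampler value alone
would leave it at 7.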
Now that we no longer modify the flags in the loop, we can get rid of
the loop altogether and issue only a single TEX_GRADIENT instruction.
While we're at it, clean up some naming to better match the phrasing
from the spec.
This only applies to Valhall for now.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14867
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
This bit is reserved and should be zero on V9, so we should report an
illegal instruction if we ever encounter it while packing.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
If we use the texture coordinates mode for TEX_GRADIENT we need valid
texture coordinates on disabled lanes to compute correct lods across all
pixels on a triangle, otherwise pixels along triangle edges will read
garbage when computing coordinate deltas and produce bogus results.
We previously tried to solve this by setting the force_delta_enable bit,
but that doesn't always work... and worse, this bit isn't supported on
V9, which means we sometimes end up generating illegal instructions.
Fixes Piglit:
shaders/zero-tex-coord texturequerylod
Fixes: 4e58029dc0 ("pan/va: fix base-level for nir_texop_lod")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
Define VPE 2.0.0 version identifiers in amd_family.h.
In ac_gpu_info.c, assign vpe_version only when the detected version is supported.
This ensures userspace only sees a valid VPE version.
Signed-off-by: Peyton Lee <peytolee@amd.com>
GitLab jobs use a default 1-hour timeout, which can allow jobs to run
longer than intended. Override the LAVA timeout for Marge to get a
still-conservative but safer timeout.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41414>
The ambiguity of the Vulkan spec was clarified, and we don't need to
support sparse depth/stencil with exactly the same number of samples
as non-sparse.
If you want to pass CTS, you'll need VK-GL-CTS commit 03976477f521
("Don't require more than VK_SAMPLE_COUNT_1_BIT for non-color sparse
resident images").
This is essentially a revert of d5da6980d3 ("anv/sparse: don't
support depth/stencil with sparse") and 7b337e214d ("anv: remove
dead code").
Thanks to Iván Briano for working with Khronos to get clarification on
the spec and for implementing the VK-GL-CTS fix.
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37423>
While Jay overwrites sparse_tex->op with the newer opcodes that only
return red and the sparse stuff, BRW keeps using the original opcode
of the cloned instruction, so it can't change def->num_components.
This went undetected because sparse has been disabled for depth/stencil
on Anv for a while. A patch to re-enable it was proposed a while ago
(MR !37423) but never merged, and my recent attempt to merge it exposed
this regression.
Let's fix the regression first, then we can finally re-enable sparse
depth/stencil support in Anv, hopefully.
Fixes: 7468261d3d ("intel/nir: Make intel_nir_lower_sparse work for either brw or jay")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37423>
If we're a bit clever with the bits, we can make one fixup helper that
works for all rounding modes. See the giant comment for details.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41295>
This makes the test names a bit more consistent and takes advantage of
the namespacing that gtest already gives us. (There's no reason to put
the whole prefix in the test name again.)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41295>
Smashing bits is super sketchy. However, all the bits do is force the
test down the _slow path, so let's explicitly test that instead.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41295>
This prints the swizzle pattern for all non-XOR tiling modes.
It can be used to determine which GPUs have the same tiling.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41405>