pan_nir_lower_fs_outputs already handles channel masks and vec3 and
smaller outputs. We don't need the extra precursor pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
pan_nir_lower_fs_outputs() already does this so there's no need to have
it in a separate pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
These are already implemented by common code, so there's nothing to be
done here, really.
A few tests fail due to timeouts. But this seems no different than on
other drivers, we just skip less WSI tests than most drivers does. Skip
those for now.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
Enable the extension and feature bits on CSF (v10+). Map the conditional
rendering pipeline stage to all three subqueues for barrier handling.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Meta operations (blit, copy, clear images, fill/update buffer) internally
emit draws or dispatches that must not be affected by conditional
rendering. Save and disable the state in meta_gfx_start/meta_compute_start,
then restore it in the corresponding end functions.
CmdClearAttachments is the exception: it IS affected by conditional
rendering per the Vulkan spec, so its state is restored right after
meta_gfx_start.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
When a secondary command buffer is recorded with
conditionalRenderingEnable, its draws load the conditional rendering
flag from panvk_cs_subqueue_context::cond_render_flag via the
cs_subqueue_ctx_reg register.
In CmdExecuteCommands, evaluate the primary's predicate and write the
normalized result (0=skip, non-zero=render) to the subqueue context.
Since the context is per-queue and cs_subqueue_ctx_reg is never
clobbered across cs_call, this works correctly with
SIMULTANEOUS_USE_BIT and across multiple queues. For nested
inheritance, propagate the caller's own flag value.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Use panvk_cond_render() to wrap RUN_IDVS and RUN_COMPUTE GPU commands.
When active, the predicate is loaded from the buffer and a cs_if
branches over the GPU commands when the condition says to skip.
This per-command approach correctly leaves render pass operations
(BeginRendering, EndRendering, load ops) unaffected as required by the
spec. Only draws, dispatches, and CmdClearAttachments are conditional.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Add panvk_cond_render_state, CmdBeginConditionalRendering2EXT,
CmdEndConditionalRenderingEXT, and the panvk_cond_render() macro
that wraps GPU commands in a cs_if block.
For direct conditional rendering, the macro loads the predicate from
the buffer. For inherited secondaries, it loads a pre-evaluated flag
from panvk_cs_subqueue_context::cond_render_flag. When conditional
rendering is inactive, no GPU instructions are emitted.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
The code was built with the assumption that VS shaders could magically
read FS varying descriptors because we never emitted any VS varying
descriptor. The truth is that VS shaders never read any varying
descriptor at all, lea_buf is never used for varying stores on v9.
This simplifies varying handling by a lot since we can assume different
descriptor types for VS and FS without hacks.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
Previously the driver decided when the backend should use
LD_VAR_BUF[_IMM] instructions based on the total number of varyings
read, falling back to LD_VAR[_IMM] + descriptors when the varying index
could overflow the immediate index in the instructions. That means that
even adding a single varying read could overflow the index and make
everything fall back to LD_VAR.
With this patch the backend decides when to use LD_VAR_BUF for each
varying load, reporting that decision to the driver. This helps with
index overflows because only the instruction that actually overflow the
immediate use the LD_VAR fallback, leaving all other instructions on the
fast path.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
The function was a leftover from the varying ABI rework.
Fixes: 7fc6af99ea ("pan: Remove dead code for sso_abi builder and fixed_varyings")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
Purely a visual change, but aligns with DDK disassembly.
For example:
- FMA.f32 r1, ^r1, u1, ^r4
+ FMA.f32 r1, r1^, u1, r4^
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
A lot of masks and shifts were hard-coded in the disassembler. This
commit tries to move them to shared logic.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Rather than opcode/opcode2 hardcoded, treat the opcode as a list of
one or more subcodes.
This implies modifying the disassembler to hold an arbitrary depth dict
of dicts and recursively build the switch statements used to look up
each level.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
This splits the printing logic from the iteration logic, making it
easier to reason about either.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Opcode2 was a bit all over the place, so utilize the new opcode modifier
to gather opcode2 information in a single place.
This cleans up the implicit va_mods "left", "descriptor_type" and
"memory_width".
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Rather than having the opcode as an attribute and the offset/mask being
implicit, make all of this information explicit in the xml.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
These instructions were not generated as they do not exist.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
This is already the default value, so there's no point in overriding it
to itself.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
Dividing this by itself is nonsensical, and just always gives us one.
That's obviously not what we want here.
But in this case we also know that the extent is divisible by the tile
extent, so there's no need for DIV_ROUND_UP, we can just divide.
Fixes: e6f8cab698 ("pan/layout: Split the logic per modifier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
We can't use the stencil-aspect of a color-attachment. That's going to
fail, so let's use the color-aspect instead. We already have it around
anyway.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
This is controlled by the writeback-mode when using AFRC, not by an YUV
Enable field. This Filed doesn't exist in these, and should according to
the spec be zero.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
In preparation for IO lowering in NIR. The varying size does not change
between variants and we'll need the real store width in NIR if we want
to lower it correctly.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
Valhall removed Bifrost's memory segments and added in its place memory
access. Those were bolted on reserved bits as "pseudo-segments" and the
emitter would catch these and emit the right memory access. This commit
cleans it up a bit by making memory_access available directly and
exposing it to NIR (this will be useful later).
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
This removes the previous hack that searched the psiz write by looking
for 16-bit stores with the correct pseudo segment. We also add a new
intrinsic that mimicks global stores but tags psiz writes, this will be
used later in the series.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
First, rename them to make them a bit more clear. They act on global
memory so they should be _global and they map to ld/st_cvt so so _cvt is
nice and obvious. Second, they don't need IO semantics as they're not
IO. But they do need ACCESS so that we can better control things like
CAN_REORDER. Third, add a src_type to store_global_cvt even though it
won't be used just yet because we'll want it for lowering VS stores.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
Unlike load[_interpolated]_input, which has to deal with all sorts of
ABI nonsense between driver and compiler, these new intrinsics are
dumber than bricks. They're literally just the HW ops as NIR
intrinsics. These will allow us do the lowering in NIR and put the
driver in total control over what goes down what path. Among other
things, a driver could choose to lower some things to ld_var and others
to ld_var_buf.
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
The runtime builds a final pipeline state with pointers to structures
coming from the associated pipelines libraries.
So far it has considered that the viewMask was part of a structure
together with the rest of the renderpass information. This information
can be specified in pre-raster, fragment & color-output state groups
and it was assumed would be consistent for all 3. And the runtime
currently takes the pointer to the structure from the last pipeline
library (color output).
Some coming spec/cts will clarify that the viewMask only needs to be
specified for pre-raster & fragment groups, making the value in the
color-output group untrustworthy.
This change creates a new state structure to hold the viewMask on its
own so it is only gather on pre-raster & fragment groups.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Aitor Camacho <aitor@lunarg.com> (kosmickrisp)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (turnip)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv)
Reviewed-by: Frank Binns <frank.binns@imgtec.com> (powervr)
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (panvk)
Royaled-yes-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39940>
Allow applications to control the virtual address where device memory
is mapped by passing MAP_FIXED to mmap via pan_kmod_bo_mmap().
Support VK_MEMORY_UNMAP_RESERVE_BIT_EXT by replacing the mapping with
a PROT_NONE anonymous mapping to keep the address range reserved.
Only memoryMapPlaced and memoryUnmapReserve are advertised.
memoryMapRangePlaced is not supported because the DRM GEM mmap offset
mechanism requires mapping from offset 0.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40315>
DRM GEM mmap offsets go through drm_vma_offset_exact_lookup_locked()
which requires an exact match on the GEM offset. Passing a non-zero
bo_offset causes EINVAL because the kernel can't find the BO at the
shifted offset. Every caller already passes bo_offset=0 and maps the
full BO size, so drop those parameters and use bo->size directly.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40315>
The correct dependence is cs_flush_caches.cs_defer.signal to
signal cs_sync32_set.cs_defer.wait in occulusion query path.
Fixes: 443ddac ("panvk/csf: merge v10 and v11 paths in
issue_fragment_jobs")
Fixed: many random fail cases in VK-GL-CTS 1.4.4.2, eg.
dEQP-VK.query_pool.occlusion_query.get_results_conservative
_size_64_wait_query_without_availability_draw_points_clear_color
Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40368>
Now that the SSO ABI is never used we can remove it with all of the
plumbing code required to find the fixed varyings bits.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>