The implementation for dumping shader line numbers was broken for anv as of:
1de9f367e8 anv: remove unused gfx/compute pipeline code
Now the implementation is moved to the shader heap upload and mimics the
current implementation in iris.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40248>
Add checks for integer coordinates and array indexing HW features
Features require HW support and the PCO_DEBUG env var to contain the
adv_smp entry
Integer coordinates are supported for images and textures without an LOD
setting
Array indexing is not supported and will trigger an abort
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40139>
Replace the #instruction-alu-no-dst-maybe-src0-src1 intermediate, which
used override expressions and variable SRC USE fields, with two concrete
intermediates following the branch/branch_unary/branch_binary pattern:
- #instruction-alu-no-dst-no-src: no sources, COND bits forced to zero
- #instruction-alu-no-dst-cond-src0-src1: SRC0+SRC1 with fixed USE=1
patterns and a COND field
The texkill instruction is split into texkill (unconditional, no sources)
and texkill_cond (conditional, with sources). This eliminates the "maybe"
anti-pattern and enables full assembler round-trip for conditional texkill.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40205>
Move the COND field (bits 6-10) out of the root #instruction bitset so
only instructions that actually support conditions decode/encode it.
Non-conditional instructions now enforce those bits as zero via a pattern.
This follows the freedreno ir3 precedent where conditional and
non-conditional instruction variants use separate intermediate bitsets.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40205>
The CSF backend's cs_call() is a hardware call/return instruction that
nests naturally. The existing CmdExecuteCommands implementation already
performs caller-callee state merging without checking command buffer
level, so no functional changes are needed.
The hardware call stack has 8 levels. The kernel consumes one for the
ringbuffer CALL, and two are reserved for future internal driver use,
leaving maxCommandBufferNestingLevel at 5.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40120>
This seems to be faster.
ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
-592900 +/- 2302.24
-27.6432% +/- 0.0998961%
(Student's t, pooled s = 2719.05)
ministat (overall):
Difference at 95.0% confidence
-76.8333 +/- 27.2345
-0.632558% +/- 0.223407%
(Student's t, pooled s = 46.867)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
The samples=2 variant also flakes, matching the RPi4 pattern which
covers all sample counts. Broaden the entry to match all variants.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
The previous commit added a v3d_get_rt_format() check to reject
fast TLB blits when the job's RT format differs from the blit
destination. Since each RT format maps to a unique (internal_type,
bpp) pair via get_internal_type_bpp_for_output_format(), the
rt_format equality check is strictly stronger than the previous
internal_type/bpp comparison.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
There's little point in having two unreachable blocks here. Yeah, sure,
in theory we are a little bit safer against forgetting to add a case for
newly introduced enum values here. But the UNREACHABLE macro should
already tell us when we trigger such cases anyway, and the cost here is
really readability.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
The compiler seems to fail to see that all cases are handled here,
producing a warning thinking "val" can be undefined. So let's make
that very obvious, by replacing the _COUNT-case with a default
block.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
panvk_image.c isn't a per-arch file, so the PAN_ARCH macro doesn't exist
here. We need to do a run-time check here instead.
Fixes: 01ba87a7fc ("panvk: Relax ms2ss afbc disablement")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40220>
In case the FS only writes one output.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>