Add checks for integer coordinates and array indexing HW features
Features require HW support and the PCO_DEBUG env var to contain the
adv_smp entry
Integer coordinates are supported for images and textures without an LOD
setting
Array indexing is not supported and will trigger an abort
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40139>
Replace the #instruction-alu-no-dst-maybe-src0-src1 intermediate, which
used override expressions and variable SRC USE fields, with two concrete
intermediates following the branch/branch_unary/branch_binary pattern:
- #instruction-alu-no-dst-no-src: no sources, COND bits forced to zero
- #instruction-alu-no-dst-cond-src0-src1: SRC0+SRC1 with fixed USE=1
patterns and a COND field
The texkill instruction is split into texkill (unconditional, no sources)
and texkill_cond (conditional, with sources). This eliminates the "maybe"
anti-pattern and enables full assembler round-trip for conditional texkill.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40205>
Move the COND field (bits 6-10) out of the root #instruction bitset so
only instructions that actually support conditions decode/encode it.
Non-conditional instructions now enforce those bits as zero via a pattern.
This follows the freedreno ir3 precedent where conditional and
non-conditional instruction variants use separate intermediate bitsets.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40205>
The CSF backend's cs_call() is a hardware call/return instruction that
nests naturally. The existing CmdExecuteCommands implementation already
performs caller-callee state merging without checking command buffer
level, so no functional changes are needed.
The hardware call stack has 8 levels. The kernel consumes one for the
ringbuffer CALL, and two are reserved for future internal driver use,
leaving maxCommandBufferNestingLevel at 5.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40120>
This seems to be faster.
ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
-592900 +/- 2302.24
-27.6432% +/- 0.0998961%
(Student's t, pooled s = 2719.05)
ministat (overall):
Difference at 95.0% confidence
-76.8333 +/- 27.2345
-0.632558% +/- 0.223407%
(Student's t, pooled s = 46.867)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
The samples=2 variant also flakes, matching the RPi4 pattern which
covers all sample counts. Broaden the entry to match all variants.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
The previous commit added a v3d_get_rt_format() check to reject
fast TLB blits when the job's RT format differs from the blit
destination. Since each RT format maps to a unique (internal_type,
bpp) pair via get_internal_type_bpp_for_output_format(), the
rt_format equality check is strictly stronger than the previous
internal_type/bpp comparison.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
There's little point in having two unreachable blocks here. Yeah, sure,
in theory we are a little bit safer against forgetting to add a case for
newly introduced enum values here. But the UNREACHABLE macro should
already tell us when we trigger such cases anyway, and the cost here is
really readability.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
The compiler seems to fail to see that all cases are handled here,
producing a warning thinking "val" can be undefined. So let's make
that very obvious, by replacing the _COUNT-case with a default
block.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
panvk_image.c isn't a per-arch file, so the PAN_ARCH macro doesn't exist
here. We need to do a run-time check here instead.
Fixes: 01ba87a7fc ("panvk: Relax ms2ss afbc disablement")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40220>
In case the FS only writes one output.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>
When this is the case. we shouldn't hang or crash.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>
This should enable fast-clear for color images with the GENERAL layout
on GFX10-10.3. This seems important because DXVK tends to use that
layout more often now.
There are still issues with MSAA images, so it's only enabled for
single-sampled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40145>
This is actually not needed because nobody is using storage with
depth-only formats and compression doesn't work at all anyways.
PAL and native don't allow this either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40214>
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.
Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.
v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
Add a method for determining which MESA_MAP_ACCESS_* flag would be
appropriate for a given OwnedDescriptor, based on both access flags and
write seals (since access mode can be RDWR despite the seals!)
This is useful for virtgpu implementations when mapping incoming buffers
from host software into the guest's address space. Previously Rutabaga
relied on basic heuristics like "SHM is always R/O", but with upcoming
extra protocols to be forwarded over virtgpu channels (like PipeWire)
those assumptions no longer hold true.
Signed-off-by: Val Packett <val@packett.cool>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40132>
Change the TYPE_FLOAT display format from %f to %g with sufficient
significant digits (%.5g for f16, %.9g for f32), so that float
immediates round-trip correctly through disassembly and assembly.
The %f format loses precision for small values: f16 0x0001 (denormal
~5.96e-8) displays as 0.000000, which parses back as 0x0000. The %g
format uses the minimum significant digits per IEEE 754 and strips
trailing zeros, using scientific notation when needed. Whole-number
values use %.1f to keep them unambiguously float (e.g. "1.0").
Update the etnaviv PEST grammar and the freedreno ir3 lexer/parser to
accept the new output formats (scientific notation, stripped zeros).
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40207>
They were already being handled explicitly in vtn_alu, so just handle
them directly for spec constants too -- that has to do special work
for conversions anyway. Remove the bit-size parameters from the function.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
There was an assumption that if the instruction had non-native float
as a source, the first source would have such type. This doesn't
hold for Select, and the code failed in two ways
- The boolean source of Select was being converted to the non-native
float type.
- The loop that resolves the bit-size for unsized operands would
trip at `assert(i == 0)` because Select has more than one source.
Re-organize the code to track the types of the sources independently,
and fix both issues above.
Fixes: 90e1b12890 ("spirv: Add bfloat16 support to SpecConstantOp")
Fixes: 51d3c4c889 ("spirv: support float8 spec constant op")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
Only used by Convert operations, so just pass 0 from callers that
are not Convert and clarify that in the code.
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
Set max outstanding ray queries to 1024. This value can be tuned later
specific to apps.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40182>