Seems I accidentally added two copies of the same test-name in
6661c59981 ("pan/ci: add some more flakes"), with the former one
missing the last character. This didn't cause any harm, because this
doesn't match any tests. But let's clean it up.
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41020>
We've been reporting in features.txt that we support this extension
unconditionally, but we didn't. Now that we have the bits wired up due
to Vulkan, we can actually enable it on Bifrost and later.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
Enabling clamping in the opcode here doesn't do quite what we need. This
makes the HW clamp to the max LOD specified in the sampler, but we need
to clamp to the maximum available LOD instead, which is the minimum
of the max-lod of the sampler and the max level in the texture itself.
We also need to take the mipmap mode into account when computing the
level of detail. This is not something the TEX_GRADIENT instruction
does, so we need to do this manually.
Now that we no longer modify the flags in the loop, we can get rid of
the loop alltogether, and only issue a single TEX_GRADIENT instruction.
While we're at it, clean up some naming to better match the phrasing
from the spec.
This only applies to Valhall for now.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14867
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
This bit is reserved and should be zero on V9, so we should report an
illegal instruction if we ever encounter it while packing.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
If we use the texture coordinates mode for TEX_GRADIENT we need valid
texture coordinates on disabled lanes to compute correct lods across all
pixels on a triangle, otherwise pixels along triangle edges will read
garbage when computing coordinate deltas and produce bogus results.
We previously tried to solve this by setting the force_delta_enable bit,
but that doesn't always work... and worse, this bit isn't supported on
V9, which means we sometimes end up generating illegal instructions.
Fixes Piglit:
shaders/zero-tex-coord texturequerylod
Fixes: 4e58029dc0 ("pan/va: fix base-level for nir_texop_lod")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34339>
The kernel looks at drm_panthor_timestamp_info::flags, so it can't be
uninitialized.
Fixes: 302127fe ("pan/kmod: Add timestamp uapi support")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41418>
There's no point in having these as separate passes that live in the
compiler. We already have lower_res_indices(), which is panfrost's
equivalent to panvk's descriptor lowering. We can just do it there.
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41320>
It's only a couple lines of code since we're already doing this for
UBOs. It doesn't need to be a separate pass.
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41320>
Previously, we only supported one of the index or the offset source and
relied on lower_index_to_offset to ensure we only had one or the other.
However, now that we're doing things in NIR, it's trivial to support the
full index+offset form.
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41320>
Per Vulkan spec, the pipeline sample mask applies to all rasterization
sample counts, including single-sample. Drop the msaa-conditional clamp
that forced the sample mask to UINT16_MAX when rasterizationSamples == 1
and just use vk_dynamic_graphics_state's value directly. The default
when no static pSampleMask is provided is already all-ones, so existing
behaviour is preserved for pipelines that don't set the mask.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40882>
The panfrost compiler is now able to handle these on v9+ and we don't
need to lower them ourselves anymore. We only need the lowering on
Bifrost because we don't have the magic LD_PKA there.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
This new pass, pan_nir_lower_image(), will eventually subsume all image
lowering. For now, though, it only lowers image_size and only on
Valhall.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41352>
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`
There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used
Clarify that the parent here is where the src's def is used, not where
it's defined.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
The rk3399-gru-kevin devices are not reliable enough to host pre-merge
jobs, so move them to nightly.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41379>
`b3210` was not removed in v11, and is causing a failure in
OpenCL-CTS when running `test_basic
vector_swizzle`. `invalid_instruction` assertion was triggered with
the message:
```
Invalid 8-bit widen:
r3 = LSHIFT_OR.v4i8.flow2 u1.b3210, u256, u256.b0
```
Restore `b3210` in the ISA XML file, and handle the case for it in
`va_pack_widen`.
Fixes: c36326d3 ("pan/bi: Remove b3210 from valid swizzle")
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41098>
These are funky enough that they make more sense as intrinsics than
texture opcodes.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
Fix OpenCL-CTS error in `math_brute_force/test_bruteforce -w ldexp`
Valhall LDEXP.v2f16 takes a 16-bit exponent, while NIR ldexp uses a
32-bit exponent. Truncating large exponents can flip overflow into
underflow or leave huge 16-bit exponents to hardware behavior that does
not match OpenCL's expected signed infinity/zero results.
Clamp the exponent to a range sufficient to overflow or underflow all
fp16 values before lowering to ldexp16_pan.
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41234>
In order to assign bind_ops[i].syncs a slice of the sync_ops array,
op_sync_cnt must record the exact number sync operations for that vm_bind
operation, so that &sync_ops[syncop_ptr - op_sync_cnt] will give us the
right start of its slice.
Fixes: 97f6a62f7e ("pan/kmod: Add a backend for panthor")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41274>
It has been available in the Panthor KMD since 1.5
Fixes: 590ad83b98 ("panfrost: Use pan_image_test_modifier_with_format() to do our modifier check")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41274>
Now that lower_noperspective_fs and varying collection are closer
together we can merge nopersp collection in lower_noperspective_fs
without fear of desyncrhonization, making everything also a bit cleaner.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
Now that we removed a lot of upcoming bugs using time-travel, we can
reorders the passes in postprocess to be more in-line with modern
compilers. We also lift a lot of passes from compile_shader_nir into
postprocess.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
The shader will be optimized a few passes later in preprocess, this way
we can have the same pipeline as in Gallium
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
This is needed if we want postprocess to decide IDVS and layout later in
the series
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
Before this commit, all scartch memory was allocated in 16-byte chunks
and indirect references where always lowered into if-else trees. This
patch tries to clean this up a little bit, by using a more compact layout
that is still TLS friendly, allowing indirect accesses and only lowering
them for optimizations and using the newer nir_lower_explicit_io.
The patches should improve performance on some shaders, but lifts a lot
of dust off the compiler uncovering some new bugs. They have been kept
at bay by disabling local memory vectorization.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
Only caused problems when the VS/FS has more TLS than our internal shaders
that doesn't usually happen but will cause bugs when we start to
compress local memory.
Fixes: 005703e5b5 ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
Reorders the preprocess passes to be more in-line with modern compilers.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
This way the pass does not depend on lower_ssbo anymore
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
Using OpenCL size/alignment requirements we might get some types
with a size bigger than their alignment. This breaks the current TLS
load/stores that expect 16-byte alignment for 16-byte load/stores. This
problem probably hasn't surfaced yet because we reassigned OpenCL scratch
in 16-byte slots, but will break if we compact the layout.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
A common memory swap operation might be compiled as:
%v1 = LOAD %a1 # L1
%v2 = LOAD %a2 # L2
STORE %v2, %a1 # S1
STORE %v1, %a2 # S2
The current pressure scheduler just records the last load/store
operation for dependencies, thus the dependency chain becomes L2 -> S1
-> S2. The compiler might thus reorder them as L2, S1, L1, S2, i.e
# L1:
%v2 = LOAD %a2 # L2 |
STORE %v2, %a1 # S1 |
%v1 = LOAD %a1 # L1<-
STORE %v1, %a2 # S2
This is incorrect as S1 depends on L1 too. The fix makes all loads also
depend on each other, restricting load reordering. The proper fix that
NAK has is to track all loads and make each store depend on every load,
building a more correct DAG. This doesn't matter as much in panfrost
since all loads are serialized by the scoreboard. We might still want
to implement it for register pressure in the future.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>