These need a simple two-instruction lowering regardless of the size of float
involved. Fixes integer_ops.integer_divideAssign
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
We don't have an 8-bit CSEL, so this is the best we can do. It's easier to write
the lowering as an algebraic rule since we don't need to do anything clever.
Fixes integer_ops.integer_clamp.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
When we lower swizzles, we move source modifiers (except for the swizzle) after
the swizzle operation. In particular, we change the order of composition for
negates and abs. However, copying the source will copy the modifiers unless we
specifically strip the extra modifiers. That's harmless in practice on Bifrost,
which doesn't check for extraneous modifiers, but is incorrect IR and trips an
assertion in the Valhall packing code.
Fixes test_relations.relational_bitselect.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
We're about to extend this pass to support 8-bit swizzles. That will be a
nontrivial change, so let's get some testing for what's already in the pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
Fixes math_bruteforce.atan2 and contractions tests.
For OpenCL, we want to flush fp32 and preserve fp16, applying to both inputs and
outputs so F16_TO_F32 acts as preserve, which implements CL spec text:
> Denormalized numbers for the half data type which may be generated when
converting a float to a half using vstore_half and converting a half to a float
using vload_half cannot be flushed to zero
Note that our libclc builds flush denorms and rusticl does not advertise denorms
so we're expected to flush to zero. rusticl correctly sets the desired float
controls, we just have to match to the hardware requirements.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
In NIR, txf does not take a sampler. However, in the hardware it does take a
sampler. If there is no sampler bound and we use txf, the hardware will read
back all-0's due to bounds checking. As a workaround, bind a trivial sampler and
use that.
As-is this workaround is Valhall specific, making use of an extra resource
table. I'm punting on generalizing back to Bifrost until I can discuss the issue
in more depth with Jason and Karol and figure out the right fix.
Fixes api.image_properties_query.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>
Indirect dispatch does not actually require any dynamic memory allocation, even
with shared memory. We just need to set wls_instances to some (mostly arbitrary)
value, statically allocate memory based on that, and let the hardware throttle
workgroups to fit if needed.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18661>
If memory allocation fails, we look for a suitable sized BO in the BO cache and
wait until we can use its memory. That usually works, but there's a case when it
can fail despite sufficient memory in the system: BOs in the BO cache
contributing to memory pressure but none of them being of sufficient size. This
case is not just theoretical: it's seen in the OpenCL
test_non_uniform_work_group, which puts the system under considerable memory
pressure with an unusual allocation pattern.
To handle this case, try evicting *everything* from the BO cache and stalling
in order to allocate, if the above attempts failed. Fixes the following error:
DRM_IOCTL_PANFROST_CREATE_BO failed: No space left on device
on the aforementioned OpenCL test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18579>
Fix defects reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable used going out of scope leaks the storage it points to.
leaked_storage: Variable multiple_uses going out of scope leaks the storage it points to.
Fixes: 8fb415fee2 ("pan/bi: Reduce some moves when going out-of-SSA")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18653>
This test times out occasionally. Let's disable it for now.
Reported-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18578>
We need to look at the job header pointers themselves, not the memory objects
that contain them, because there can be (and usually is) multiple jobs per BO.
Fixes: 3da8c9193c ("panfrost: Handle Job VA cycles when decoding a dump file")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18539>
This is a common pattern. Make it more ergonomic. With the help of Coccinelle:
@@
expression E;
expression R;
expression instruction;
@@
-instruction->src[E] = bi_replace_index(instruction->src[E], R);
+bi_replace_src(instruction, E, R);
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Prior to register allocation, destinations must be in SSA form, except for the
"fake SSA" briefly used in the current RA (for which bi_is_ssa returns true
anyway). These asserts duplicate the asserts in the validator. If there's any
coverage lost here, that's just a sign the validator needs to run more often.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Frequently, we want to iterate over the SSA variables read by an instruction,
while skipping over constants and uniforms. Add and use a macro for this
pattern. A few redundant asserts are removed.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Now that variables have flat naming (no more SSA/reg distinction), we can just
use .value directly. This cleans up the RA a bit.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Now that non-SSA nodes are an internal implementation detail of RA, the
entrypoints should be made private to make sure no other passes use the
RA-specific liveness analysis.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
We've already broken SSA irreparably at this point. We can just change what
"normal" means and avoid the disjointed indexing.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
nir_invalidate_divergence was introduced in aa765c54bd ("pan/bi: Add divergent
intrinsic lowering pass"), which needed divergence information for a late NIR
pass but not otherwise in the backend. Unfortunately, nir_convert_from_ssa is
less aggressive about coalescing when divergence information makes it look like
the coalescing would make the code worse -- even though that's not actually an
issue on Mali.
Now that we don't use nir_convert_from_ssa, we don't need the hack.
Likewise for the vec.reg hack, which was introduced because the split/collect
cache relies on SSA form.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Don't call nir_convert_from_ssa, preserve the SSA from NIR. This gets us real
SSA for all frontend passes. The RA becomes a black box that takes SSA in and
outputs register allocated code out. The "broken" SSA form never escapes RA,
which means we can clean up special cases in the rest of the compiler. It also
gets us ready for exploiting SSA form in the RA itself.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>