It seems the hardware behavior for this is as per-spec and we are
supposed to identify as active entire quads. Particularly, there
are some derivative tests with dynamic control flow that use
subgroup ballot and require this.
However, we still need to exclude terminted lanes (OpTerminate). For
that, we keep track of the sample mask at the start of a fagment
shader start and compare it with the current sample mask.
Fixes: ('broadcom/compiler: support subgroup reduction operations from fragment shaders')
Fixes: dEQP-VK.glsl.derivate.dynamic_loop.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27409>
We are slightly over-subscribed right now, with 21 merge jobs (10 vk
+ 10 gl + 1 traces) for 20 RPi4, so let's split the tests between
slightly fewer RPi4 and make each split job run for slightly longer,
because it also means that all the jobs start immediately, reducing the
overall delay in merging any MR that triggers this job.
The typical run time for this job was around 8 min with a 10-split; with
an 8-split it is now 9 min, which is still within the 10 min target and
well below the 15min limit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27391>
We already had the logic implemented, but it was never really tested
(there was a comment about that)
So the advantage of this is that we now test that code (in fact, there
were a small typo on that code).
There aren't too much CTS tests for this feature, but we gets tests
like this working:
dEQP-VK.clipping.clip_volume.depth_clip.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10527
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27386>
In fragment shaders these instructions consider a lane active when
any lane in the same quad is active, which is not what we want, so
we need to include the current sample mask in the condition mask
used with these instructions to limit lane selection to those that
are really active.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
The BARRIER_ID instruction is only available in compute
and tessellation, implement an equivalent barrier that we
can use from other stages.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
If the lane from which the hardware writes the unifa address
is disabled, then we may end up with a bogus address and invalid
memory accesses from follow-up ldunifa.
Instead of always disabling unifa loads in non-uniform control
flow we can try to see if the address is prouced from a nir
register (which is the only case where we do conditional writes
under non-uniform control flow in ntq_store_def), and only
disable it in that case.
When enabling subgroups for graphics pipelines, this fixes a
GMP violation in the simulator with the following test
(which has non-uniform control flow writing unifa with lane 0
disabled, which is the lane from which the unifa takes the
address):
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcastfirst_int
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
We don't rely in any lowerings for these (other than
scalarization). The only noteworthy aspect is that these
instructions, like ballot, use the condition mask to
filter out valid invocations that are inactive because of
control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
This maps to our native shuffle instruction. For shuffle relative
and shuffle xor, we rely on the nir lowering to lower this to
ALU and regular shuffle.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
This adds support in our compiler for the subgroup ballot
feature. To this end we start using the NIR lowering for
subgroups which can lowers some of these intrinsics into
things more amenable to our hardware and takes care of
scalarization.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
These use the sample mask to decide about active lanes, so we need
to make sure we don't move them above a previous setmsf instruction.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
c->execute is 0 (not the block index) for lanes currently active
under non-uniform control flow.
Also this simplifies a bit the instructions we emit for flag
generation, both for uniform and non-uniform control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
If the ELSE block is cheap then we don't emit the branch instruction
but we still want to generate the flags, since these are setting
the flags for the THEN block too.
Fixes: e401add741 ("broadcom/compiler: skip jumps in non-uniform if/then when block cost is small")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
To handle coverity warning:
4. thread2_modifies_field: Thread2 sets cache_size to a new value. Note that this write can be reordered at runtime to occur before instructions that do not access this field within this locked region. After Thread2 leaves the critical section, control is switched back to Thread1.
CID 1559509 (#1 of 1): Check of thread-shared field evades lock acquisition (LOCK_EVASION)6. thread1_overwrites_value_in_field: Thread1 sets cache_size to a new value. Now the two threads have an inconsistent view of cache_size and updates to fields correlated with cache_size may be lost.
521 cache->cache_size += bo->size;
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26951>
Full coverity warning:
CID 1558604: Uninitialized pointer read (UNINIT)12. uninit_use_in_call: Using uninitialized value *results when calling nir_vec.
236 return nir_vec(b, results, DIV_ROUND_UP(num_components, 2));
To fix it we initialize the variables, provide a unreachable on the
switch that sets the results values. As we are here we also move a
comment to make things more clear.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26951>
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.
Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this momemt.
Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.
v2: always lower 64 bit
Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
Previous commit upreved deqp only for the Android
Fixes: 1ff4687e86 ("ci: uprev deqp-runner from 0.16.1 to 0.18.0")
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
[Eric]
- rename the deqp-runner version to DEQP_RUNNER_VERSION instead of DEQP_VERSION
- update image tags
- fix expectations lists
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27062>
As driver supports OpenGL 3.1, run proper tests, besides the OpenGL ES
tests.
Note that including GL3.1 is not required to include previous versions.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27069>
Move ralloc allocations and frees for BOs into the critical sections
protected with mutexes.
This fixes several double-free and use-after-free crashes that happens
sometimes when using the simulator to run Vulkan CTS tests, specially
when these tests involve multithreading, like
`dEQP-VK.api.object_management.multithreaded_per_thread_resources.device_memory_small`.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27070>
Demoting means that we don't execute any writes to memory but
otherwise the invocation continues to execute. Particularly,
subgroup operations and derivatives must work.
Our implementation of discard does exactly this by using
setmsf to prevent writes for the affected invocations, the
only difference for us is that with discard/terminate we
want to be more careful with emitting quad loads for tmu
operations, since the invocations are not supposed to be
running any more and load offsets may not be valid, but with
demote the invocations are not terminated and thus we should
emit memory reads for them to ensure quad operations and
derivatives from invocations that have not been demoted still
work.
Since we use the sample mask to implement demotes we can't tell
whether a particular helper invocation was originally such
(gl_HelperInvocation in GLSL) or was later demoted
(OpIsHelperInvocationEXT added with SPV_EXT_demote_to_helper_invocation),
so we use nir_lower_is_helper_invocation to take care of this.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26949>
v3d_optimize_nir is calling nir_opt_undef twice. As it is inside the
usual "do {..} while (progress);" loop, is not needed to call it
twice.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26928>
Add full Vulkan CTS testing for the new V3D 7.1 driver, used in the
Raspberry Pi 5.
So far we add it to run nightly; in future will be added to pre-merge
CI.
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26705>