When letting the overlapping waves enter their ordered sections, there must
be no memory accesses to resources which need primitive-ordered access that
are still pending, or there would be a race between the current wave and
the overlapping waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
If the wave has set the Primitive Ordered Pixel Shading packer ID hardware
register, it must send MSG_ORDERED_PS_DONE once before the program ends.
It's also safe to send the message if the packer ID register hasn't been
set yet, therefore the message may be sent conservatively. For simplicity,
to ensure that it's sent on all execution paths after setting the packer ID
register, always sending it from a top-level block. This is required for
GFX9-10.3 POPS.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
Implementing the acquire/release semantics of fragment shader interlock
ordered section in Vulkan, and preventing reordering of memory accesses
requiring primitive ordering out of the ordered section.
Also, the ordered section should be as short as possible, so not reordering
the instructions awaiting overlapped waves upwards, and the exit from the
ordered section downwards.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
Waits are needed for early exits from inside a Primitive Ordered Pixel
Shading ordered section, but that code doesn't insert them reliably anyway
because it doesn't obtain the counters for the exact locations of the
jumps, which may be anywhere inside the predecessor blocks.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
A wait for export_ready (if the corresponding bit is not set in the
instruction) is done to enter the Primitive Ordered Pixel Shading ordered
section on GFX11.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
pops_exiting_wave_id is a volatile ALU source operand containing the ID of
the latest wave that hasn't exited yet, for comparing with the newest
overlapped wave ID in overlapping waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
These run nir_validate in debug builds, which will avoid bugs slipping in. It's
not enough that llvmpipe doesn't mind illegal NIR, these passes are well within
their rights to fail spectacularly if the NIR wouldn't validate. So validate so
we catch issues early.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
GLSL booleans (and hence bool derefs) may be translated either as 1-bit or
32-bit NIR registers, depending whether the backend uses nir_lower_bool_to_int32
or not. Add a knob for this and choose the right type for different backends.
Fixes nir_validate failure on
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_bvec3 run under
lavapipe. That test indexes into a bvec3 array, and gallivm first lowers bools
and then lowers derefs to registers, resulting in random 1-bit booleans mixed in
with 32-bit bools.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
If we only lower parameters, that's still progress. Technically.
Fixes: 6a29cb2654 ("nir/lower_bool_to_int32: add support for lowering functions.")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>
Wire up the CL_DEVICE_PROFILING_TIMER_RESOLUTION from the PIPE_CAP.
While here, also set CL_PLATFORM_HOST_TIMER_RESOLUTION to 1;
that's bogus since we're using the same value as for device, but
at this point we don't have a device to ask.
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>
Use the get_timestamp as both the device_timestamp in
get_device_and_host_timer and host_timestamp in that
and get_host_timer.
Having eliminited most other clock sources, discussions
on previous versions have concluded it's best to use the
same timer as the 'host_timestamp' since the main requirements
are that it must be one that's a time seen by the device and
that it's very closely coupled.
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>
uint isn't a standard type, just something we accidentally get from some
other headers.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>
Here, we want explicitly sized types, not just types that happen to be
of the right size.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>
In these cases we actually want uint32_t, because we're doing 32-bit
things to them.
The hwinfo-bit is only being used by i915, and should probably be
moved to i915 instead. But it shoukd *also* be converted, so let's do
that now.
While we're at it, fixup the bit-setting as well.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>
For executing IB on the compute queue (ie. IB2 isn't supported), we
will need to break chaining, this is a first step towards this.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>
This was only necessary for preambles/postambles, let's clarify this
by determining the IB info from the first IB in the array instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>
The current IB size is copied when radv_amdgpu_cs_add_old_ib_buffer()
is called, which might not be the real IB size because we might still
pad the CS with NOP packets after.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>
The explicit stride doesn't have to be defined to aoa and therefore can be
zero in some cases, like in arrays of arrays of uniform blocks.
Resolves crash with spec@arb_gl_spirv@execution@ubo@aoa-2.shader_test piglit test for virgl.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23648>
Applications can do their own caching, but are in any case required to
properly "compiler" the binaries via clBuildProgram or clCompileProgram +
clLinkPrograms.
In any case, there is no point building something if we already have the
result.
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23847>
This function is added for create strong relationship between
nir_function_impl and nir_function.
So that nir_function->impl->function == nir_function is always true when
(nir_function->impl != NULL && nir_function->impl != NIR_SERIALIZE_FUNC_HAS_IMPL)
And indeed this invariant is already done in functions validate_function and validate_function_impl
of nir_validate
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>
Without generating a proper timestamp for the disk cache, we pull old
binaries out of the disk cache, potentially being buggy or simply
outdated.
Once meson 1.2 lands we can easily pull in LLVM functions.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>