This implements support for Decode processing, which allows a
processing operation to be performed on the decoded picture in a single
call, without having to use a separate processing context.
This also implements the same functionality for encoding, which is
useful for converting from RGB to YUV in a single call, and it allows
us to properly support the conversion inside the encoder (e.g. EFC on
AMD).
For Encode processing, the additional output buffer is required, the
same as with Decode processing, but the driver may not use it to perform
the conversion (in cases where the conversion can be done by the encoder
hw). This means the contents of the additional buffer are undefined, and
applications should not rely on the buffer actually containing the
output picture of the conversion.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36755>
D3D10 requires SO buffer stride to be at least 2048 bytes.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36842>
We now have lower_texture_early and lower_texture.
lower_texture_early handles nir_lower_tex and (in the future) could handle
anything backend specific that needs to happen before nir_lower_io.
lower_texture handles actual lowering of backend specific things that
must happen after nir_lower_tex and nir_lower_io.
This finally allows us to avoid running nir_lower_tex twice in panvk.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
Moving it out of there will allow us to shuffle and move API specific parts
out of there.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
As we are going to move texture and IO lowering, this splits the
preprocess functions in two: one handling preprocessing, the other
postprocessing.
The split is done right before lower_io and causes no functional change
for now.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
The full nightly jobs have been failing for a while without much interest
in them.
Reduce Piglit coverage by switching to the `quick_gl` profile, which
is what the pre-merge jobs run.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36608>
../src/gallium/frontends/va/config.c(574): error C2059: syntax error: '}'
MSVC 2019 doesn't support it yet.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843>
this is only to catch the case of a bound descriptor being written to
by some operation other than its draw/dispatch descriptor bind,
so any non-write binds are ignored
previously those non-write binds were required because of how sync
analysis could drop non-write access, so that is fixed as well
also use the vbo bind count instead of the mask because why not
also also ignore non-write GENERAL image deferred sync because that shouldn't
need anything deferred
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>
this avoids a scenario where a non-subdata UNSYNCHRONIZED unmap triggers through
tc at the same time the frontend calls an UNSYNCHRONIZED subdata call
in the main thread, which desynchronizes the cmdbuf and hits an assert
Fixes: 8ee0d6dd71 ("zink: add a third cmdbuf for unsynchronized (not reordered) ops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>
D3D11 requires that subnormals are not flushed to zero
when tessellating primitives. Since we are flushing
subnormals during shader execution, we must temporarily
turn flushing off when calling the tessellator.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36811>
Some ops on 64 bit data don't require the data to reside in neighboring
channels and can be executed as separate 32 bit ops. In these cases we don't
need to pin the registers to a specific channel, but for scheduling it is
better to make sure that the two destination values reside in different
channels, so that they can be scheduled into one ALU group and the
probability of read-port conflicts is reduced when they are used as source
values.
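As an illustration of why such ops can be split, here is a small sketch
in plain C. Bitwise AND is used only as an example of an op whose two
32 bit halves are independent; which ops the change actually covers is
not stated here, and the struct and function names are made up for this
sketch.

```c
#include <stdint.h>

/* A 64 bit value held as two independent 32 bit halves. */
struct u64_pair {
    uint32_t lo;
    uint32_t hi;
};

static struct u64_pair split_u64(uint64_t v)
{
    struct u64_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t join_u64(struct u64_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Bitwise AND never carries between bits, so each half is computed on
 * its own and neither result depends on where the other one lives. */
static struct u64_pair and64_split(struct u64_pair a, struct u64_pair b)
{
    struct u64_pair r = { a.lo & b.lo, a.hi & b.hi };
    return r;
}
```

Because the two halves never interact, the two 32 bit results can be
placed in any two distinct channels; an op such as a 64 bit add, where
a carry crosses the halves, would not split this way.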
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>
More ops than op2_dot_ieee and op2_mul_ieee can be submitted
as multi-slot ops. Make it easy to handle additional opcodes
when splitting an alu op that has only one dst but requires
multiple slots. With that we can emit more multi-slot ops that
use consecutive slots and use a different opcode in the last slot.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>
If we have more than one register that is associated with the same
ssa index, but can be allocated without a specific channel pinning,
then don't add it to the ssa.index/register.index map, so that the
same register index is not re-used.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>
In addition, on Cayman some trans ops can use three or four channels,
and it may be an advantage to use the four channel version if the
result needs to be written to the w channel, to reduce the overall
ALU instruction group count.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>
The scheduler sets these flags when scheduling the ALU
instructions into ALU groups, so there is no need to
set them early, and it was already done inconsistently
anyway. The only exception is the ALU predicate instructions,
because they are not yet handled directly by the scheduler.
Clean up the use of alu_write too.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36790>
Everything is currently using CLOCK_BOOTTIME, which is perfetto's
default, and matches the previous behavior. On some hardware, different
clocks may be better synchronized with the gpu clock.
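A minimal sketch of what a selectable timestamp clock could look like on
Linux. The clock names and both helper functions are hypothetical and
are not taken from this change; only clock_gettime() and the CLOCK_*
ids are real, and CLOCK_BOOTTIME and CLOCK_MONOTONIC_RAW are assumed to
be available (Linux-specific).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes the Linux-specific clock ids */
#endif
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical name-to-clock mapping. Unknown or NULL names fall back
 * to CLOCK_BOOTTIME, matching the default described above. */
static clockid_t select_clock(const char *name)
{
    if (name && strcmp(name, "monotonic") == 0)
        return CLOCK_MONOTONIC;
    if (name && strcmp(name, "monotonic_raw") == 0)
        return CLOCK_MONOTONIC_RAW;
    if (name && strcmp(name, "realtime") == 0)
        return CLOCK_REALTIME;
    return CLOCK_BOOTTIME;
}

/* Read the chosen clock in nanoseconds; returns 0 if it is unavailable. */
static uint64_t clock_now_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Whatever clock is selected has to be used consistently for all
timestamps in a trace, since values from different clocks are not
directly comparable.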
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34390>
To ensure all submitted commands are visible on screen, flush them.
Fixes tri.cpp test with softpipe.
Signed-off-by: Max Ramanouski <max8rr8@gmail.com>
Signed-off-by: Max R <max8rr8@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36769>
This change fixes the gds implementation of
atomic_counter_comp_swap which requires three arguments.
This update is based on 4e3b43f180 "r600/atomic: fix
ATOMCAS instruction." which was the tgsi implementation.
Note: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36554
is required for this change to work properly on cayman.
This change was tested on palm, cypress and barts. Here is the test fixed:
khr-gl4[5-6]/shader_atomic_counter_ops_tests/shaderatomiccounteropsexchangetestcase: fail pass
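For reference, the three-operand semantics of a compare-and-swap on a
counter can be modelled with C11 atomics as below. This is only a model
of the expected behavior, not the r600 GDS code; the function name is
made up for this sketch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Three operands: the counter, the value to compare against, and the
 * new value. The counter is only updated if it equals 'compare'. The
 * value it held before the operation is returned in both cases. */
static uint32_t counter_comp_swap(_Atomic uint32_t *counter,
                                  uint32_t compare, uint32_t data)
{
    uint32_t expected = compare;

    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong(counter, &expected, data);
    return expected;
}
```

Dropping either the compare value or the new value makes the operation
impossible to express, which is why all three arguments have to reach
the instruction.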
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36254>