Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Commit ea5b7de138 broke some piglit tests on radeonsi (Bonaire hardware).
This commit fixes half of the regression by enabling MSAA if the destination
surface has more than 1 sample (instead of hardcoding it to false).
Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This allows using the LCDIF display controllers (with the mxsfb drm
modesetting driver) along with the Etnaviv render-only drivers. LCDIF is
found on i.MX SoCs.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This should save a lot of per-compile time by using the RA the way it's
actually supposed to be used.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
While testing kmscube with mesa master, it turned out that kmscube no
longer works. Bisecting points to commit 5a7688fdec as the culprit. A
short trial-and-error session identified the removed bit of code that
makes kmscube work again.
This patch adds it back.
Fixes: 5a7688fde ("panfrost: Use 64-bit descriptors globally")
v2: Add comment pointing out this is magic. [Alyssa, trivial]
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Rather than anything "early Midgard", limit us specifically to T6XX, as
certain workarounds only apply to genuine T6XX, not T7XX.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Needed for the following st/mesa fix.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We still have some big ticket items left on GLES 3.0, but it's often
helpful to be able to access higher dEQP levels for debugging features
that just don't quite match a particular API.
Plus, this opens up a whole slew of new features to poke at if boredom
overtakes, ahem.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The branch instruction has 6 bits per register operand which allows it
to specify a component in the register.
Fix codegen so that it outputs the right component, otherwise it always
outputs the x component.
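
To illustrate why 6 bits are enough to address a single component, here is
a minimal sketch. It assumes 16 vec4 registers and an operand that packs the
register index in the upper 4 bits and the component in the lower 2 bits.
That layout and the helper names are hypothetical and are not taken from the
lima sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [5:2] select one of 16 vec4 registers,
 * bits [1:0] select the component (0 = x, 1 = y, 2 = z, 3 = w). */
static inline uint8_t encode_branch_operand(unsigned reg, unsigned comp)
{
    assert(reg < 16 && comp < 4);
    return (uint8_t)((reg << 2) | comp);
}

/* Recover the register index from a packed operand. */
static inline unsigned operand_register(uint8_t operand)
{
    return (operand >> 2) & 0xf;
}

/* Recover the component from a packed operand. */
static inline unsigned operand_component(uint8_t operand)
{
    return operand & 0x3;
}
```

Under this assumed layout, always passing component 0 would select x no
matter which component the source actually uses, which is the kind of bug
described above.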
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
The macros already prepend "ppir: ", remove them from the actual strings
so it doesn't appear duplicated.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
The spilling code spills entire vec4 registers regardless of the
components used by the spilled uses.
The inserted store code forces all 4 components, but the loads were
using a variable number of components, causing bugs when loading the
spilled registers.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).
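
As a small illustration of the "dirty" values point, the sketch below
compares a char-based boolean with a C99 _Bool. The typedef name
boolean_char and the helper names are made up for this example; it only
assumes that the old boolean is a typedef for unsigned char.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a char-based boolean typedef (assumption for this sketch). */
typedef unsigned char boolean_char;

/* A char-based boolean keeps the low 8 bits of whatever is stored in it. */
static inline int store_as_char_boolean(int value)
{
    boolean_char b = (boolean_char)value;
    return b;
}

/* A C99 _Bool normalizes every nonzero value to 1. */
static inline int store_as_bool(int value)
{
    bool b = value;
    return b;
}

/* Self-check covering the two ways the types differ. */
static inline void boolean_vs_bool_demo(void)
{
    assert(store_as_char_boolean(2) == 2);     /* "dirty" value survives */
    assert(store_as_bool(2) == 1);             /* normalized to 1 */
    assert(store_as_char_boolean(0x100) == 0); /* truncated to false */
    assert(store_as_bool(0x100) == 1);         /* still true */
}
```

Code that compares against TRUE or stores a masked flag directly can
therefore behave differently after such a conversion.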
This has been build-tested on amd64 with:
Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st: mesa xa xvmc xvmc vdpau va
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
st_extensions.c sets const->MaxImageSamples (GL_MAX_IMAGE_SAMPLES) by
looping over [16, 15, .. 1x] MSAA modes, and RGBA/BGRA/ARGB/ABGR 8888
color formats, calling pipe->is_format_supported() for each, with
the usage set to PIPE_BIND_SHADER_IMAGE. If any are supported, it
selects that number of samples.
We were checking if sample_count <= 1, which meant that we were getting
a value of 1x MSAA, rather than the expected 0x (feature doesn't exist).
This only showed up on Icelake, because Gen11 adds support for typed read
messages for R8G8B8A8_UNORM. On Gen9, the lack of typed read messages for
these formats happened to make the check correctly say no. The 1x value
caused some Icelake conformance failures, because we don't implement this
feature.
Just check for sample_count == 0 instead.
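
The effect of the two checks on the probing loop can be sketched as
follows. This is a simplified model, not the st_extensions.c or driver
code: the callback type and function names are hypothetical, and the
format dimension of the loop is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's format query, reduced to the
 * shader-image case on hardware without multisampled image support. */
typedef bool (*image_format_check)(unsigned sample_count);

/* Old check: also reports support when sample_count is 1. */
static bool check_le_one(unsigned sample_count)
{
    return sample_count <= 1;
}

/* New check: only reports support when sample_count is 0. */
static bool check_eq_zero(unsigned sample_count)
{
    return sample_count == 0;
}

/* Probe 16x down to 1x and return the first supported sample count,
 * or 0 if no sample count is supported. */
static unsigned max_image_samples(image_format_check supported)
{
    assert(supported);
    for (unsigned samples = 16; samples >= 1; samples--) {
        if (supported(samples))
            return samples;
    }
    return 0;
}
```

In this model, max_image_samples(check_le_one) yields 1 while
max_image_samples(check_eq_zero) yields 0.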
Indirect linear writes were not being marked as initialized, causing the
back blit to be dropped, breaking the listed tests.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that we run RA in a loop, before each iteration after a failed
allocation we choose a spill node and spill it to Thread Local Storage
using st_int4/ld_int4 instructions (for spills and fills respectively).
This allows us to compile complex shaders that normally would not fit
within the 16 work register limits, although it comes at a fairly steep
performance penalty.
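
The overall control flow can be sketched with a toy model. It is not the
Panfrost implementation: allocation is reduced to a register-pressure
comparison, and a spill is modelled as removing one live value, which is a
strong simplification. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define WORK_REGISTERS 16

/* Toy model: allocation succeeds when the live values fit. */
static bool try_allocate(unsigned live_values)
{
    return live_values <= WORK_REGISTERS;
}

/* Run allocation in a loop. After each failed attempt, one value is
 * "spilled" before retrying. Returns the number of spills needed. */
static unsigned allocate_with_spilling(unsigned live_values)
{
    unsigned spills = 0;

    while (!try_allocate(live_values)) {
        assert(live_values > 0);
        /* Stands in for choosing a spill node and emitting the
         * store (spill) and load (fill) instructions. */
        live_values--;
        spills++;
    }
    return spills;
}
```

Each extra iteration corresponds to added memory traffic in the compiled
shader, which is where the performance penalty comes from.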
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Since logicop_func 0 is PIPE_LOGICOP_CLEAR, we were triggering lowering
of logic ops on precompiled shaders, which we don't want to do. This also
had the side effect of making shader-db crash: during this lowering we
would try to read the color format swizzle information from the fragment
shader key, which we don't populate for precompiled shaders because right
now we only need it when logic operations are enabled.
Reviewed-by: Eric Anholt <eric@anholt.net>
In virgl_buffer_transfer_extend, when no flush is needed, it tries
to extend a previously queued transfer instead if it can find one.
Unlike virgl_resource_transfer_prepare, it fails to check if
the resource is busy.
The existence of a previously queued transfer normally implies that
the resource is not busy, maybe except for when the transfer is
PIPE_TRANSFER_UNSYNCHRONIZED. Rather than burdening us with a
lengthy comment, and potential concerns over breaking it as the
transfer code evolves, this commit makes the valid_buffer_range
check the only condition to take the fast path.
In the real world, we hit the fast path almost only because of the
valid_buffer_range check. In micro benchmarks, the condition should
always be true, otherwise the benchmarks are not very representative
of meaningful workloads. I think this fix is justified.
The recent change to PIPE_TRANSFER_MAP_DIRECTLY usage disables the
fast path. This commit re-enables it as well.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Do not take a transfer and do the memcpy. Add a _buffer suffix to
the function name to make it clear that it is only for buffers.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Without setting hw_res, virgl_transfer_queue_extend never finds a
match and always returns NULL.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Legacy GS has to use Wave64, so TES before GS has to use Wave64 too.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>