If you don't pass this, the compiler refuses to compile the assembly for
pre-v7 CPUs. This also keeps us from building identical, non-NEON code on
aarch64 and x86.
Fixes: a373f77662 ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.")
v2: Fix Android build by just appending NEON_C_SOURCES when
ARCH_ARM_HAVE_NEON.
Tested-by: Rob Herring <robh@kernel.org>
(cherry picked from commit bd5efbd70b)
[Emil Velikov: address libvc4_la_LIBADD conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/vc4/Makefile.am
I've been trying to get away without these conditionals in vc4's NEON
code, but it meant compiling extra unused code on x86, and build failing
on ARMv6.
v2: Use the _arm/_arm64 flags to simplify detection (suggested by Rob),
but hide the _arm version under ARCH_ARM_HAVE_NEON to keep from trying
to build this stuff for armv5te.
Tested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ba8533b6ea)
We need to link librt for u_thread.h's clock_gettime() call.
Fixes: b822d9dd67 ("gallium/util: move u_queue.{c,h} to src/util")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b94ddc181b)
BLEND_STATE packing was modified to be variable-length in:
9670124e31 genxml: Make BLEND_STATE command support variable length array.
The initial gen10.xml still had the old, fixed-length style
definition for BLEND_STATE. So gen10_upload_blend_state would
overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
of all-zero entries when packing BLEND_STATE. This caused
BLEND_STATE upload to not work at all.
Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit d6539608a4)
LLC platforms are magic in that reads from the CPU are always cache
coherent, or rather GPU writes that bypass LLC do still invalidate the
appropriate cache line.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 49eda75df6)
Fixes KHR-GL45.shader_ballot_tests.ShaderBallotAvailability, which
causes some silly swizzles to appear, triggering this optimization to
get hit.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9c8f017f77)
I'm working on this, but I'm not sure I'll make 17.2 at this stage,
maybe 17.2.1.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 611076a41a)
src0.x is always read for the LOD, irrespective of which outputs are
read.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 934511d1f3)
This affects which inputs are marked as used. In a situation where only
the texture instruction uses an input, it might have been ignored as
unused due to input masks.
Affects subtests of KHR-GL45.texture_cube_map_array.sampling
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 054c54d1be)
We continue in the code to do some more things with the rhs, including
setting a constant initializer. If the type is wrong, this causes some
confusion down the line, leading to assertions. This makes sure that the
rhs processing continues to flow as-if the type was correct to start
with (even though the state has been marked as an error state).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101766
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 978c4c597a)
The legacy test won't work on gfx9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 694d59fbaf)
port the opaque metadata changes from radeonsi for gfx9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e43cc3e3af)
This is also a GFX9 register.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 674ecbfef2)
We set this later in the non-gfx9 path, just remove these
bits from here.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fc600eb98d)
The predication packet changed format on GFX9, update the driver.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5247b311e9)
It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.
Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 1ab7fed707)
intel_miptree_texture_aux_usage() takes an isl_format, but we are
passing a mesa_format. clang warns:
brw_blorp.c:305:52: warning: implicit conversion from enumeration
type 'mesa_format' to different enumeration type
'enum isl_format' [-Wenum-conversion]
intel_miptree_texture_aux_usage(brw, src_mt, src_format);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~
Fixes: fc1639e46d ("i965/blorp: Use texture/render_aux_usage for blits")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f7dfc44c61)
Since we don't iterate to a fixed point, we can end up in situations
where we have a SAT instruction + a long immediate. This is not legal.
However since it's immediately computable, just run unary straight away
to handle the situation.
Fixes: 24a799ad35 ("nv50/ir: fix ConstantFolding with saturation")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 165e18dd21)
This seems like a workaround, but we don't see the bug on CIK/VI.
On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one tests complete, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.
Could also be a kernel difference between SI and CIK.
v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.
Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 82ba384c10)
This ports the workaround from radeonsi, that was missing in radv.
This fixes Talos rendering when MSAA is enabled on my Tahiti card.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8bf3930751)
This just copies the code from the -pro shaders,
and fixes the tests on CIK.
With this CIK passes the same set of conformance
tests as VI.
Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3f389f75b6)
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit acba3a3151)
It justs works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.
Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 15e5a7a683)
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.
Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8286c3a49f)
This fixes corrupted shadows in Unigine Valley.
The corruption disappeared when I stopped setting IMG_DATA_FORMAT_24_8
for depth.
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 27fef5d52d)
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.
v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using
separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using
combined depth-stencil.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5563872dbf)
If we have an invalid display fed into the functions, the display lookup
will return NULL. Thus as we attempt to get the platform type, we'll
deref. it leading to a crash.
Keep in mind that this will not happen if Mesa is built without X11 or
when the legacy eglCreate*Surface codepaths are used.
A similar check was added with earlier commit 5e97b8f5ce ("egl: Fix
crashes in eglCreate*Surface), although it was only applicable when the
surfaceless platform is built.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 26fbb9eacd)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/egl/main/eglapi.c
UE4Editor has this issue.
This commit prevents hangs (release build) or assertion failures (debug
build). It doesn't fix the editor, but catastrophic scenarios are
prevented.
Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4630ede102)
For mul(a, +-1) codegen can generate OP_MOV with a saturation flag
set which is ignored at emission. The same can happen with add(a, 0),
and others.
Adding an assert for detecting more of such issues.
Fixes wrongly rendered water in Hitman Absolution running under wine.
Also a few shaders in Mad Max and Alien Isolation produce such MOVs.
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: generalize the fix for other cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 24a799ad35)
Need to take the sample count into account in the depth decompress and
resummarize pipelines and render pass.
Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e9a13bf22)
Originally, I had moved it to the caller to make some things easier when
adding the CCS modifier. However, this broke DRI2 because
intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
calls intel_miptree_alloc_aux. Also, in hindsight, it should be pretty
easy to make the CCS modifier stuff work even if create_for_bo allocates
the CCS when DISABLE_AUX is not set.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8e5808fc0c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/mesa/drivers/dri/i965/intel_mipmap_tree.c
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices. This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary. For tiled surfaces this is not likely to make a
difference. For linear surfaces, on the other hand, this means we may
require additional memory. In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d27c6095e)
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other. However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds. However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads. There are
two caveats to this:
1) There is some potential for issues with caches here if extra data
ends up in a cache we don't expect due to OOB reads. However,
because we always trash the entire cache whenever we need to move
anything between cache domains, this shouldn't be an issue.
2) There is a potential issue if a surface gets placed at the very top
of the GTT by the kernel. In this case, the hardware could
potentially end up trying to read past the top of the GTT. If it
nicely wraps around at the 48-bit (or 32-bit) boundary, then this
shouldn't be an issue thanks to the scratch page. If it doesn't,
then we need to come up with something to handle it.
Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan. However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c15b92ce11)
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.
v2: get it right, reverse the polarity.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 36a1b61321)
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c961b679fe)
X/GLX can't handle them. This removes almost 500 GLX visuals that were
incorrectly exposed.
This is a less invasive version of Marek's .getCapability series.
Note: the patch is not applicable for master, but only for the 17.2
branch.
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f33d8af7aa "st/dri: add 32-bit RGBX/RGBA formats"
[Emil Velikov: commit message polish]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Copy/paste error was duplicating a gen_knobs.cpp rule.
Fixes: 5079c277b5 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit e4a6ae06cf)
Reported by valgrind at:
glsl_to_tgsi_visitor::visit(ir_expression*) (st_glsl_to_tgsi.cpp:1560)
When compiling the Deus Ex shaders.
Fixes: 28a5e7104 ("st/glsl_to_tgsi: handle precise modifier")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 06237fc9e1)
GLES/gl.h has historically provided some typedefs that are not
used in the API itself. Restore these typedefs that were lost to
avoid breaking applications.
These seem to be the only typedefs removed in the update.
Fixes: 7fd0817 "Update Khronos-supplied headers"
[Eric: added a big warning to revert this patch when pulling the updated header]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 3db05ed1d1)
This reverts commit 3008161d28,
which caused a regression for VMWare.
The initial code had some recursion in it, that I removed by accident
trying to add back the recursion broke lots of things, take the high
road and revert for now.
Fixes: 3008161d (st_glsl_to_tgsi: rewrite rename registers to use array fully.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b8bea9a050)