We were hitting the
unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.
v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b117f59710)
This query is supposed to return the max texture buffer size/width in
texels, not size in bytes. Divide by 16 (the largest format size) to
return texels.
Fixes Piglit arb_texture_buffer_object-max-size test.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by :Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 3b28eaabf6)
Current selection of pixel format does not enforce the request of
stencil or depth buffer if the color depth is not the same as
requested.
For instance, GLUT requests a 32-bit color buffer with an 8-bit
stencil buffer, but because color buffers are only 24-bit, no
priority is given to creating a stencil buffer.
This patch gives more priority to the creation of requested buffers
and less priority to the difference in bit depth.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101703
Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 80c6598cdb)
This patch fixes the total surface size in surface cache
to include array size as well.
Tested with MTT glretrace.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit adead35320)
piglit test ext_texture_array-gen-mipmap is fixed with this patch.
Tested with mtt piglit, glretrace, viewperf and conform. No regression.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 31fe1d10b2)
xfb only applies to the latest stage before the fragment shader, so
there is no need to invoke it in the fragment shader.
Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api
v2: do reset only if shaders provide an explicit stride
v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 860919a3b2)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
The check for the pointer being non-NULL was being done too late.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit a6893a50c8)
The line stipple fallback code for virtual HW version 8 didn't work.
With HW version 8, we were getting zero when querying the max line
widths (AA and non-AA). This means we were setting the draw module's
wide line threshold to zero. This caused the wide line stage to always
get enabled. That caused the line stipple module to fall because the
wide line stage was clobbering the rasterization state with a state
object setting the line stipple pattern to 0xffff.
Now the wide_lines variable in draw's validate_pipeline() will not
be incorrectly set.
Also improve debug output.
BTW, also this fixes several other piglit tests: polygon-mode,
primitive- restart-draw-mode, and line-flat-clip-color since they
all use the draw module fallback.
See VMware bug 1895811.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit c2b92dada0)
Squashed with:
svga: adjust line subpixel position for HWv8
This fixes two regressions on HWv8:
Piglit gl-1.0-ortho-pos
Piglit/glean fbo
This was caused by commit c2b92dada0 "svga: clamp device line width
to at least 1 to fix HWv8 line stippling"
This also fixes two conform tests: Vertex Order and Polygon Face
No Piglit/conform changes with HWv9 or later.
VMware bug 1905053
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 5b8d33acef)
The NIR parameters are ordered "compare, data", matching GLSL, but both
the image and buffer LLVM intrinsics take them the other way around.
This is already handled correctly for SSBO atomics.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
(cherry picked from commit c2a5cb6427)
Despite being a member of the etna_screen struct, 'refcnt' is used by
the winsys-specific logic to track the reference count of the object
managed in a hash table. When the count reaches zero, the pipe screen
is removed from the table and destroyed.
Fix the logic by initializing the refcnt to 1 when screen created.
This initialization is done in etna_screen_create(), to follow the
same logic as in freedreno and virgl.
Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5d8514de14)
Cacheline alignment of SWR_STATS to prevent sharing of cachelines
between threads (performance).
Gets rid of gcc-7.1 warning about using c++17's over-aligned new
feature.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit bab03c06fc)
Define these in terms of setzero for ancient gcc versions which don't
have the undefined intrinsics.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit f0a22956be)
Squashed with:
swr: modifications to allow gcc-4.8 compilation
Code unconditionally used avx2 intrinsics in the avx compilation.
The simd intrinsics library we used has diverged significantly between
branch and master; the non-“undefined intrinsics” portion is specific
to the branch.
This complements:
f0a22956be “swr/rast: _mm*_undefined_* implementations for gcc<4.9”
And makes this branch equivalent with the additional master patch:
d50ef7332c “swr/rast: don't use _mm256_fmsub_ps in AVX code”
Don't assume the header is present on some platforms - use the more
robust CheckHeader() instead.
glibc 2.26 removed xlocale.h.
https://sourceware.org/glibc/wiki/Release/2.26#Removal_of_.27xlocale.h.27
Fix this build error with glibc 2.26.
Compiling src/util/strtod.c ...
src/util/strtod.c:32:10: fatal error: xlocale.h: No such file or directory
#include <xlocale.h>
^~~~~~~~~~~
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101657
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c5d0dc7fa5)
In blit_framebuffer we're already doing a NULL
pointer check for readFb and drawFb so it makes
sense to do it before we actually use the pointers.
CID: 1412569
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit b3b6121115)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
sampler view allocated in vaAssociateSubpicture is not cleared
in vaiDeassociateSubpicture.
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit b1a359b7d8)
The current implementation assumed that these were replaced in GLSL >= 4.10
by gl_Max{Vertex,Fragment}UniformVectors, however this is not true: both
built-ins should be produced from GLSL 4.10 onwards.
This was raised by new CTS tests that are in development.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b70d6a2de1)
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).
Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.
Fixes: d5b355ce5f ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 50a8a7377a)
_mesa_glsl_has_builtin_function is used to determine whether any variant
of a builtin are available, for the purpose of enforcing the GLSL ES
3.00+ rule that overloads or overrides of builtins are disallowed.
However the builtin_builder contains information on all builtins,
irrespective of parse state, or versions, or extension enablement. As a
result we would say that a builtin existed even if it was not actually
available.
To resolve this, first check if at least one signature is available for
a builtin before returning true.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 880f21f55d)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Previously, we were using the type of the variable which is incorrect.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit a10d887ad1)
We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.
This should fix pack/unpackDouble2x32, which seems like a bug since when
we enabled the Float64 capability. It will also fix pack/unpackInt2x32
when we enable the Int64 capability.
Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7168425dd7)
Before, we were just implementing it with a move, which is incorrect
when the source and destination have different bitsizes. To implement
it properly, we need to use the 64-bit pack/unpack opcodes. Since
glslang uses OpBitcast to implement packInt2x32 and unpackInt2x32, this
should fix them on anv (and radv once we enable the int64 capability).
v2: make supporting non-32/64 bit easier (Jason)
v3: add another assert (Jason)
Fixes: b3135c3c ("anv: Advertise shaderInt64 on Broadwell and above")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 196e6b60b1)
The smapi->get_egl_image() call in st_egl_image_get_surface() stores a
reference to the EGLImage's texture in stimg.texture. That reference is
released via pipe_resource_reference(&stimg.texture, NULL) before stimg
goes out of scope at the end of the function, but not in the error path
if !is_format_supported().
Fixes: 83e9de25f3 ("st/mesa: EGLImageTarget* error handling")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7d7bcd65d6)
extra: References 6a7c5257ca but because later f8d69beed4 introduced
a regression and the latter didn't land.
Signed-off-by: Andres Gomez <agomez@igalia.com>
This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold. This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".
This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test. For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.
Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test; as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations. But, this addresses those as well.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32c1a54bd0)
The labels array may change its virtual address on a reallocation, so
it is invalid to cache pointers into the array. Rather than using the
pointer directly, remember the array index.
Fixes miscompilation of shaders in glmark2 ideas, leading to GPU hangs.
Fixes: c9e8b49b (etnaviv: gallium driver for Vivante GPUs)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit ec43605189)
The buffer intrinsics should be used instead of the image ones.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 909184ac9c)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
When using cmpswap on an image, it was being trunctated to
lvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely.
v2: Add stable CC
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6fc41bb4d5)
We set this unconditionally on every other platform. Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 4878ab9bd4)
The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c. Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4. Not exactly "true", but at least
more accurate.
The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used. The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.
At any rate, we should use the more accurate mode.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit b625bcc601)
The CL CTS queries the max allocation size, and then attempts to
allocate buffers of that size. If not enough contiguous RAM/VRAM is
available, this causes errors in the radeon kernel module due to
inability to allocate the required memory.
It's a bit of a hack, but experimentally on my system, I can use ~3/4
of the card's VRAM for a single global/constant buffer allocation given
current GUI/compositor use.
For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
Global memory size 2143076352 (1.996GiB)
Max memory allocation 1500153446 (1.397GiB)
Max constant buffer size 1500153446 (1.397GiB)
To:
Global memory size 2143076352 (1.996GiB)
Max memory allocation 751619276 (716MiB)
Max constant buffer size 751619276 (716MiB)
Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size,
OpenCL CTS test/conformance/api/min_max_constant_buffer_size
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e4d06e4c53)
We shouldn't use the wide line stage if the line width is 1.
This check isn't strictly needed because all drivers are (now)
specifying a line wide threshold of at least 1.0 pixels, but
let's play it safe.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit c8f344ed2d)
KHR/khrplatform.h is required by the EGL, GLES and VG headers, but is
only installed if Mesa3d is compiled with EGL support.
This patch installs this header file unconditionally.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77240
Signed-off-by: Eric Le Bihan <eric.le.bihan.dev@free.fr>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2154defcd6)
sysmacros.h was getting implicitly included in types.h until recently in
AOSP master. Define MAJOR_IN_SYSMACROS to explicitly include sysmacros.h.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
(cherry picked from commit e8f82bfd52)
The header embeds the struct so it needs the header inclusion instead of
the dummy forward declaration.
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Tom Stellard <tstellar@redhat.com>
Fixes: 32206c5e56 ("radeonsi: Add radeon_shader_binary member to struct
si_shader")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 1f958c1337)
Previously the logic would decide that the record is kept, which
translates into keep = false in the caller, which meant that these
passes did not run.
While it's right that keep = false which means that a new record does
not need to be added, we do still have to perform the usual list
maintenance. It's easiest to do this pre-merge rather than post.
The lowering that clip/cull distance passes produce triggers this bug in
TCS (since reading outputs is done differently in other stages), but it
should be possible to achieve it with the right sequence of regular
reads/writes.
Fixes: KHR-GL45.cull_distance.functional
Fixes: generated_tests/spec/arb_tessellation_shader/execution/tes-input/tes-input-gl_ClipDistance.shader_test
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4a79f2be33)
All the BuildUtil helpers just insert the operation into the current BB.
So we have to take care that any fetchSrc() operations happen before the
operation whose setIndirect() it goes into.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c02ee4a8b)
We were exposing 4096, but we can do up to 8192 in Gen4-6 and up to
16384 in gen7+. OpenGL 4.1+ requires at least 16384.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b72b7c541d)
The very last entry in the sid_strings_offsets table ended up missing,
leading to out-of-bounds reads and potential crashes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 67e49a7f65)