Both are explicit 17.2 nominations, since they depend on work which
landed past the 17.1 branchpoint.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
This fixes a bug uncovered by:
2412c4c81e
util: Make CLAMP turn NaN into MIN.
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 433f6f7ac9)
The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a0e6b9a2db)
[Emil Velikov: add a HAVE_LLVM check, as applicable in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeon/r600_pipe_common.c
The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer by the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).
While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.
Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 65fbaab0b7)
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 31f1863ace)
As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batchs.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never now what's the fill rate of the
current batch buffer.
If we move the flushing at the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).
Fixes: adafe4b733 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9f439ae120)
I don't know how I managed to leave this here for so long. Found when
working on a 1:1 overlapping blit extension for X11.
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 93fec49a75)
Fill the entire array instead of just a quarter. This avoids
crashes with large shaders.
(currently this never causes a problem because shaders larger than 2048/4
instructions are not supported by this driver on any hardware, but it will
cause problems in the future)
Fixes: ec43605189 ("etnaviv: fix shader miscompilation with more than 16 labels")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 15a1ceb127)
Seems like the backends have been using pthreads since day one, yet
we've been missing the link.
With later commit we'll fix a typo, hence the libraries will be build
with -Wl,no-undefined, aka failing the build on unresolved symbols.
v2: Split from a larger patch.
Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: c6e67f5a93 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 33d397ada5)
[Emil Velikov: add PTHREAD_LIBS to COMMON_LIBADD]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/swr/Makefile.am
Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.
Fixes unresolved symbol in linux scons builds.
v2: add comment in code about the 1:1 assumption.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit d1e7153228)
cpuid.7 requires cx=0 to select the extended feature leaf.
avx512 detection was using the non-indexed cpuid resulting
in random non-detection of avx512.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 131b9f644c)
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.
Found by assertion failures while enabling VK_ANDROID_native_buffer.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5d69052113)
The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.
v2: Rebase
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
(cherry picked from commit 7791949dad)
[Emil Velikov: drop st/dri hunk, squash correct swrast piece]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
If the source is an indirect register, there is ralloc'd data. Copying
with a direct assignment will copy the pointer, but the data will still
belong to the old instruction's memory context. Since we're lowering
and throwing away instructions, that could free the data by mistake.
Instead, use nir_src_copy, which properly handles this.
This is admittedly not a common case, so I think the bug is real,
but unlikely to be hit.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0320bb2c6c)
[Emil Velikov: drop nir_lower_atomics_to_ssbo.c - not in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/compiler/nir/nir_lower_atomics_to_ssbo.c
Previously we only incremented the guide for a single
dimension/wildcard.
V2: rework logic to avoid code duplication
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3f0fb23b03)
This can happen if, for instance, you have an array of structs and there
are both direct and wildcard references to the same struct and some
members only have direct or only have indirect.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecf91898e0)
From the Vulkan 1.0.53 spec VU for vkCreateImageView:
"image must have been created with a usage value containing at least
one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"
We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c5700ed72e)
We were early returning and never created the NULL surface state.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bd41564746)
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order. Therefore, rearrange the
elements when copying.
v2: Move scan tables to vl_zscan.c.
Fix type in size assertion.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 63dcfed81f)
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.
ANV did so since day one (back in 2015)
Cc: mesa-stable@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4168c162c5)
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.
ANV did so since day one (back in 2015)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 43c188f970)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/intel/vulkan/anv_device.c
khrplatform.h is only used by EGL and GLES; let's only install it when
one of those is enabled.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jussi Kukkonen <jussi.kukkonen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8821ef4be1)
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking if the enabled bits
in inputs_read.
But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.
Thus, use 1ull, which is an unsigned long long value.
This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved
v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 28d0c38d85)
Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.
Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c22e3c5373)
[Emil Velikov: resolve trivial conflicts - drop initial[] hunk]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c
If the application hasn't done any drawing since the last call, we
would reuse the same back buffer which was used for the previous swap,
which may not have completed yet. This could result in various issues
such as tearing or application hangs.
In the normal case, the behaviour is unchanged.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97957
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101683
Cc: mesa-stable@lists.freedesktop.org
[Michel Dänzer: Make Thomas' fix from bugzilla actually work as
intended, write commit log]
(cherry picked from commit 81fb154777)
We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3645268748)
Commit bfe1e7737a changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.
This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).
Fixes issues seen with ETQW apitrace.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit f7e78abdf4)
We were hitting the
unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.
v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b117f59710)
This query is supposed to return the max texture buffer size/width in
texels, not size in bytes. Divide by 16 (the largest format size) to
return texels.
Fixes Piglit arb_texture_buffer_object-max-size test.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by :Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 3b28eaabf6)
Current selection of pixel format does not enforce the request of
stencil or depth buffer if the color depth is not the same as
requested.
For instance, GLUT requests a 32-bit color buffer with an 8-bit
stencil buffer, but because color buffers are only 24-bit, no
priority is given to creating a stencil buffer.
This patch gives more priority to the creation of requested buffers
and less priority to the difference in bit depth.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101703
Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 80c6598cdb)
This patch fixes the total surface size in surface cache
to include array size as well.
Tested with MTT glretrace.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit adead35320)
piglit test ext_texture_array-gen-mipmap is fixed with this patch.
Tested with mtt piglit, glretrace, viewperf and conform. No regression.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 31fe1d10b2)
xfb only applies to the latest stage before the fragment shader, so
there is no need to invoke it in the fragment shader.
Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api
v2: do reset only if shaders provide an explicit stride
v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 860919a3b2)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>