Use alignment to calculate the stride associated with the pointer
types. That stride is used when the pointers are casted to arrays.
Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b;
} will have element an element size of 12 bytes, but the stride needs
to be 16 bytes to respect the 8 byte alignment.
Fixes: 050eb6389a "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 026cfa1099)
We need this when doing full software 64-bit emulation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0ba508d7a3)
Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers
functions, where we call into some AHardwareBuffer functions.
The legacy Android ext did not have us call into any Android function
at all and hence it was not noticed.
Fixes: 755c633b8d "anv: Fix vulkan build in meson."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit d4f0f1a6e2)
anv_render_pass_compile() turns an unused attachment into a NULL
depth_stencil_attachment pointer so check that pointer before
accessing it.
Found with updates to existing CTS tests.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 208be8eafa ("anv: Make subpass::depth_stencil_attachment a pointer")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit c9c8c2f7d7)
Fix this build error on Ubuntu 18.04.
/usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5'
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663
Suggested-by: Eric Engestrom <eric@@engestrom.ch>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 730ceeddb5)
Cherry-picks a0d4d7febf upstream
The TTN path needs access to the screen to make the right decisions about
lowering, but we didn't have pctx->screen set up at fdN_prog_init time.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
When using softpin, the first allocation was not calculating the
padding and offset correctly for the case the first allocation needed
to grow. We were missing initialize the state.end right after
expanding the pool for the first time.
This is not a problem for non-softpin since there we don't use
leftover padding so the ends would re-arrange incrementally.
This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in
SKL -- the test uses a shader larger than the initial size for the
instruction pool.
Fixes: dfc9ab2ccd "anv/allocator: Add padding information."
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 09c4037dda)
Without this the restored program will fail the pipeline validation
checks when we attempt to use an SSO program.
Fixes: c20fd744fe ("mesa: Add Mesa ARB_get_program_binary helper functions")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111010
(cherry picked from commit 3043908ccb)
This is the MOCS setting used for the A64 stateless messages which we
sometimes use for SSBO operations.
Fixes: 48ed2a7bb0 "anv: Implement VK_EXT_buffer_device_address"
Fixes: 79fb0d27f3 "anv: Implement SSBOs bindings with GPU addr..."
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 6a2ff217b8)
None of the current code knows what to do with swizzles. Take the safe
option for now and bail if we see one. This does have a small shader-db
impact but it is at least safe.
Shader-db results on Kaby Lake:
total loops in shared programs: 4364 -> 4388 (0.55%)
loops in affected programs: 5 -> 29 (480.00%)
helped: 5
HURT: 29
Shader-db results on Haswell:
total loops in shared programs: 4373 -> 4370 (-0.07%)
loops in affected programs: 5 -> 2 (-60.00%)
helped: 5
HURT: 2
Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 9a3cb6f5fe)
The current code assumes everything is 32-bit which is very likely true
but not guaranteed by any means. Instead, use nir_eval_const_opcode to
do the calculations in a bit-size-agnostic way. We also use the new
constant constructors to build the correct size constants.
Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 268ad47c11)
One issue was that the original version didn't check that swizzles
matched when comparing ALU instructions so it could end up matching
very different instructions. Using the nir_instrs_equal function from
nir_instr_set.c which we use for CSE should be much more reliable.
Another was that the loop assumes it will only run two iterations which
may not be true. If there's something which guarantees that this case
only happens for phis after ifs, it wasn't documented.
Fixes: 9e6b39e1d5 "nir: detect more induction variables"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 9f7ffe41dd)
This is simple now, but we're going to be adding a few more conditions
to this later.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit a1c737927c)
It is legal to call vkFreeCommandBuffers() on NULL command buffers.
This fix requires eb41ce1b01 ("util/hash_table: Properly handle
the NULL key in hash_table_u64").
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4438188f49 ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a72351cc76)
Set the absolute minimum possible GLSL version. API_OPENGL_CORE can
mean an OpenGL 3.0 forward-compatible context, so that implies a minimum
possible version of 1.30. Otherwise, the minimum possible version 1.20.
Since Mesa unconditionally advertises GL_ARB_shading_language_100 and
GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't
advertise any extensions to enable any shader stages (e.g.,
GL_ARB_vertex_shader).
Converts about 2,500 piglit tests from crash to skip on NV18.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0349bc3ce2)
Each tests has a comment with the expected before and after NIR. The
tests don't actually check this. The tests only check whether or not
the optimization pass reported progress. I couldn't think of a robust,
future-proof way to check the before and after code.
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b08d704051)
Previously, an instruction like
mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F]
would get rewritten as
mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F]
The latter does not produce the correct result. The VF immediate in the
second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F]. This
commit produces the former.
Fixes: 1ee1d8ab46 ("i965/vec4: Reswizzle sources when necessary.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 47c2aa5b48)
It was reported as unsupported previously. It should be importable
and is compatible with itself.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 69cc6272fb ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5824130389)
alignment=0 does weird things with align64.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e46b41b3ae)
Begin/Reset of command buffer both reset the content of the command
buffer. Don't forget to wipe them on Begin.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4438188f49 ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8f0f727fe4)
Fixes: d7e6541cc7 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8a053254b8)
Do not use the view format when filling the surface state.
Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*
Fixes: fb1350c76f ("intel: Add and use helpers for level0 extent")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e06bc0b166)
From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):
For objects in the Uniform, StorageBuffer, or PushConstant storage
classes, the element’s address or location is calculated using a
stride, which will be the Base-type’s Array Stride when the Base
type is decorated with ArrayStride. For all other objects, the
implementation will calculate the element’s address or location.
For non-CL shaders the driver should layout the Workgroup storage
class, so override any explicitly set ArrayStride in the shader. This
currently fixes only the lower_workgroup_access_to_offsets case, which
is used by anv.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 050eb6389a)
Previously, on systems where multiple versions of Python 3 (e.g. 3.6 and 3.7)
are installed, wrong version of Python 3 could have been used.
The proper fix requires availability of path() method in Meson's python
module, which has been added in Meson 0.50:
https://github.com/mesonbuild/meson/pull/4616
Distro Bug: https://bugs.gentoo.org/671308
Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
v2: - Add missing `endif` keyword (Dylan)
(cherry picked from commit b120a02b21)
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 10e8d46601)
The disk cache code tries to allocate a 256 Kbyte buffer on the stack.
Since musl only gives 80 Kbyte of stack space per thread, this causes a
trap.
See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size
(In musl-1.1.21 the default stack size has increased to 128K)
[mattst88]: Original author unknown, but I think this is small enough
that it is not copyrightable.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit fd7b7f14d8)
This is a regression from the old autotools build system.
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 7389bf9761)
Enable the compute shader copositer only when TEX_LZ is supported by the driver.
v2: Also check whether DIV is supported.
https://bugs.freedesktop.org/show_bug.cgi?id=110783
Fixes: 9364d66cb7
gallium/auxiliary/vl: Add video compositor compute shader render
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 75d8b4e795)
Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able
to check this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 843723e2f7)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/gallium/docs/source/screen.rst
src/gallium/include/pipe/p_defines.h
The simulator complains about using byte operands, we also have
documentation telling us.
Note that add operations on bytes seems to work fine on HW (like ADD).
Using dwords operands with CMP & SEL fixes the following tests :
dEQP-VK.spirv_assembly.type.vec*.i8.*
v2: Drop the GLK changes (Matt)
Add validator tests (Matt)
v3: Drop GLK ref (Matt)
Don't mix float/integer in MAD (Matt)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
BSpec: 3017
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5847de6e9a)
This reverts commit 5157a42765.
There is a meson bug that causes llvm to always be statically linked,
which is obviously not what we want. I haven't had time to look into it
yet, but for now let's just revert it.
(cherry picked from commit 97c2c4546c)
SLICE_COMMON_CHICKEN3 is a privileged register not accesible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in linux kernel as the WA description
says it's fixed on latest stepping.
This reverts commit 9c421d6b47.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d96cba7754)
SLICE_COMMON_CHICKEN3 is a privileged register not accesible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in linux kernel as the WA description
says it's fixed on latest stepping.
This reverts commit 2be60e0c73.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 387e43b52f)
SLICE_COMMON_CHICKEN3 is a privileged register not accesible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in linux kernel as the WA description
says it's fixed on latest stepping.
This reverts commit 85ecd14ef6.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7746d4edef)
Left shift was applied twice.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110702
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: <irherder@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81c784a4a)
When a context is destroyed the destroy_tex_sampler_cb makes sure that all the
sampler views created by that context are destroyed.
This is done by walking the ctx->Shared->TexObjects hash table.
In a multiple context environment the texture can be deleted by a different context,
so it will be removed from the TexObjects table and will prevent the above mechanism
to work.
This can result in an assertion in st_save_zombie_sampler_view because the
sampler_view owns a reference to a destroyed context.
This issue occurs in blender 2.80.
This commit fixes this by explicitly releasing sampler_view created by the destroyed
context for all texture attachments.
Fixes: 593e36f956 (st/mesa: implement "zombie" sampler views (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c37f03d464)
dbb4457d98 started using drmDevicesEqual(), which was
introduced in libdrm 2.4.81
We could either copy the function locally, or bump the required version.
Since the function is non-trivial and 2.4.81 is old enough already,
I suggesting the latter.
Fixes: dbb4457d98 ("egl: add EGL_EXT_device_drm support")
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5819bc0e5c)
These two extensions are supported on GFX8 but the throughput
of 16-bit floats/integers is same as 32-bit. Also, shaderInt16
is only enabled on GFX9+ for the same reason, be more consistent.
This fixes a crash with Wolfenstein II because it expects
shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is
exposed. Note that AMDVLK only enables these extensions on GFX9+.
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ef1787dbc9)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Conflicts:
src/amd/vulkan/radv_extensions.py
A while back, we added a new field, but failed to update the copier.
I believe iris is the only current user of the new field, and it hasn't
used the copier, so noone noticed.
Fixes: 8b626a22b2 st/mesa: Record shader access qualifiers for images
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 255c71ec07)