Need to flush before updating the buffer to ensure that the copy is
ordered after previous accesses (assuming the app has performed the
appropriate barriers).
This fixes potential issues due to draws prior to an update reading
the new buffer content, despite having the necessary barriers between
them.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e0cc32b85b)
The flushes could be due to TRANSFER barriers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cce43f6d8c)
Ported from radeonsi, pointed out by Tom.
"This prevents LLVM from using sext instructions for local memory
offsets and allows the backend to fold immediate offsets into the
instruction. This also prevents some incorrect code generation for
ptrtoint and inttoptr instructions."
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b8ee70384a)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/amd/common/ac_nir_to_llvm.c
This should prevent cases when a buffer was incorrectly mapped without
synchronization just because this wasn't done.
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 71a2e4e945)
Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
now that LLVM bug 26775 and its corollary, 25503, are fixed.
Amendment: remove extraneous spaces in macro def & invocations.
We would prefer a runtime check, e.g. via an LLVMQueryString
(analogous to glGetString, eglQueryString) or LLVMGetVersion API,
but no such API exists at this time.
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
[Emil Velikov: remove LLVM_VERSION macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3f1b6ef2aa)
When binding as textures, the alignment can be 16. However when binding
as an image, the address has to be aligned to 256. (Also when binding as
an RT, but that can't happen with GL or current gallium APIs.)
Reported-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 32dd8d59b6)
If i-th thread could not be created it means we have i threads,
not i+1, because we start from 0.
Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7f268cf12b)
Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls
exit()") added a atexit handler which calls
util_queue_killall_and_wait() for each queue to stop the threads.
However the app is also free to use atexit handlers to clean up things,
leading to util_queue_destroy() call which will also call
util_queue_killall_and_wait() for the same queue again, causing threads
being joined twice, and that is undefined. This happens with libglut,
for example. A simple fix is to just set num_threads to 0 as there are
no more valid threads after util_queue_killall_and_wait() returns.
Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9936121935)
They cause regressions on little endian.
Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 66d1cb587a)
This commit improves the message by telling them that they could probably
enable DRI3. More importantly, it includes a little heuristic to check
to see if we're running on AMD or NVIDIA's proprietary X11 drivers and,
if we are, doesn't emit the warning. This way, users with both a discrete
card and Intel graphics don't get the warning when they're just running
on the discrete card.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715
Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Rene Lindsay <rjklindsay@hotmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 3d8feb38e8)
The number of dynamic descriptors is limited by both the number of
descriptors and the total number of dynamic things. Because there isn't
a single "maximum dynamic things" limit, we need to divide by two so
that they can create the maximum of both UBOs and SSBOs.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5e44ef4a76)
No idea what this does, but disabling it fixes a bunch
of failing CTS tests in the lod area, so let's go with that.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d81bd2f754)
The dynamic_offset_offset in the descriptor set binding layout is
relative to the dynamic_offset_start for the set in the pipeline
layout.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 162beb2abb)
This fixes the wrong dynamic buffer descriptors being updated when
firstSet > 0.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0941d1a574)
It has issues and the fix I'm working on is too complicated for stable,
so disable for now.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ab2dd361f)
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.
This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 290d7e892d)
This isn't exposed in -pro, the hw docs say it is deprecated,
so let's not bother with it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cc59e24a6b)
See detailed explanation of why this is needed in commit eb60a89bc3.
This spot was missed/overlooked. Basically as a result of the fact
that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8
to the requested amount, we have to be mindful of that when doing bare
nouveau_pushbuf_space calls.
Reportedly this fixes some crashes when replaying a hitman trace taken
on radeonsi.
Fixes: eb60a89bc3 ("nouveau: take extra push space into account for pushbuf_space calls")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8e6d67685e)
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.
Fixes SIGBUS on Raspberry Pi at high optimization levels.
This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.
v2: Commit message reword/typo fix, and add a bigger explanation in the
code (by anholt)
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cd2b55e536)
Squashed with commit:
ralloc: don't leave out the alignment factor
Experimentation shows that without alignment factor gcc and clang choose
a factor of 16 even on IA-32, which doesn't match what malloc() uses (8).
The problem is it makes gcc assume the pointer is 16 byte aligned, so
with -O3 it starts using aligned SSE instructions that later fault,
so always specify a suitable alignment factor.
Cc: Jonas Pfeil <pfeiljonas@gmx.de>
Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Tested by: Mike Lothian <mike@fireburn.co.uk>
Tested by: Jonas Pfeil <pfeiljonas@gmx.de>
(cherry picked from commit ff494fe999)
Even though compute shaders cannot access the framebuffer, there is a
synchronization issue when a compute dispatch accesses a texture that
was previously bound and drawn to as a framebuffer.
Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of
the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior
results if the texture is still attached to the currently bound
framebuffer. However, the feedback loop is broken when the application
changes the framebuffer binding before a compute dispatch, and the
state tracker needs to let the driver known about this.
Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 40c77bbf83)
exec_node::get_prev() does not guard against going past the beginning
of the list, so we need to add explicit checks here.
Found by ASAN in piglit arb_shader_storage_buffer_object-rendering.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 911391bd70)
This was hiding bugs as it retyped the source to destination's type.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0dddad5b1b)
When generating the MOV INDIRECT instruction, the source type is ignored
and it is set to destination's type. However, this is going to change in a
later patch, so we need to explicitly set the proper source type.
brw_vec8_grf() creates an float type's fs_reg by default, when the
ICP handle is actually unsigned. This patch fixes these cases before
applying the aforementioned patch.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit d8122128bc)
The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.
v2:
- Improve commit log (Curro)
- Fix read_size (Curro)
- Fix DF uniform array detection in assign_constant_locations() when
it is acceded with 32-bit MOV_INDIRECTs in BSW/BXT.
v3:
- Move changes in assign_constant_locations() to other patch.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 56266df7ed)
Previously, if we had accesses with different sizes to the same uniform, we might not
push it aligned with the bigger one. This is a problem in BSW/BXT when we access
an array of DF uniform with both direct and indirect addressing because for the latter
we use 32-bit MOV INDIRECT instructions. However this problem can happen with other
generations and bitsizes.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit a497ab6838)
This bug can make that we don't detect the end of a contiguous area
correctly and push larger areas than the real ones.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 7427425247)
I messed this up when I wrote it, this fixes:
dEQP-VK.memory.pipeline_barrier.*uniform_texel_buffer.*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e66be3d3bb)
This is a DRI3 version of a change made for DRI2
(4d6d4f939e, "egl/dri2: implement query surface hook"),
that fixed failures in dEQP-EGL.functional.resize.surface_size.grow
and dEQP-EGL.functional.resize.surface_size.shrink.
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 884f65e185)
For blitting we need to use the depth or stencil format, never
the combined.
This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800b82ea13)
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d833ae97)
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an unitialized value is
better than a hang.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ea34a98c0)
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb878db7eb)
v2: restore the state
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cc2f92b09f)
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.
However, the same 'var' is referenced later in ast_declarator_list::hir().
This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.
This error was detected by Address Sanitizer.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a73a618933)
This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.
v2: use list_head, switch the call order in destroy
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4aea8fe7e0)
Found by inspection. However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 075ed20614)
Some things may not be floats and intel CPUs are known for mangling bits
when a float type is used for copying integers.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 8c8095c260)
This fixes a some rendering corruption in The Talos Principle
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 40087bcb51)