Previously, if we had accesses with different sizes to the same uniform, we might not
push it aligned with the bigger one. This is a problem in BSW/BXT when we access
an array of DF uniform with both direct and indirect addressing because for the latter
we use 32-bit MOV INDIRECT instructions. However this problem can happen with other
generations and bitsizes.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit a497ab6838)
This bug can make that we don't detect the end of a contiguous area
correctly and push larger areas than the real ones.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 7427425247)
I messed this up when I wrote it, this fixes:
dEQP-VK.memory.pipeline_barrier.*uniform_texel_buffer.*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e66be3d3bb)
This is a DRI3 version of a change made for DRI2
(4d6d4f939e, "egl/dri2: implement query surface hook"),
that fixed failures in dEQP-EGL.functional.resize.surface_size.grow
and dEQP-EGL.functional.resize.surface_size.shrink.
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 884f65e185)
For blitting we need to use the depth or stencil format, never
the combined.
This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800b82ea13)
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d833ae97)
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an unitialized value is
better than a hang.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ea34a98c0)
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb878db7eb)
v2: restore the state
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cc2f92b09f)
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.
However, the same 'var' is referenced later in ast_declarator_list::hir().
This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.
This error was detected by Address Sanitizer.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a73a618933)
This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.
v2: use list_head, switch the call order in destroy
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4aea8fe7e0)
Found by inspection. However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 075ed20614)
Some things may not be floats and intel CPUs are known for mangling bits
when a float type is used for copying integers.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 8c8095c260)
This fixes a some rendering corruption in The Talos Principle
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 40087bcb51)
This is similar to clflush_range except that it puts the mfence on the
other side to ensure caches are flushed prior to reading.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8582ab2d6e)
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.
Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 066a117be7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c
src/gallium/drivers/radeonsi/si_state_shaders.c
Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.
Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.
Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.
v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
have count 0.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6a1d9684f4)
Allocating huge buffers in VRAM is not a problem, but when those buffers
start being migrated, the kernel runs into errors because it cannot split
those buffer up for moving through GTT.
This should fix intermittent failures of
GL45-CTS.texture_buffer.texture_buffer_max_size
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 550125e1e7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
If llvm::sys::getHostCPUName() returns "generic", override
it with "pwr8" (on PPC64LE).
This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
is missing, resulting in (big-endian) "generic" being returned on
little-endian Power8NVL systems. The result is that code that
attempts to load the least significant 32 bits of a 64-bit quantity in
memory loads the wrong half.
This omission should be fixed in the next version of LLVM (4.0),
but this work-around should be left in place in case some
future version of POWER<n> also ends up unrepresented in LLVM's table.
This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion
tests on POWER8NVL processors.
(V4: add similar comment in the code.)
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b934aae364)
It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit c8ef512398)
(cherry picked from commit bc8d047068)
start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.
v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
Tested with the new piglit.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a264fee624)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Conflicts:
src/gallium/drivers/radeonsi/si_state_draw.c
Found while running shader-db under valgrind.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0ac118398)
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
(cherry picked from commit f448701622)
It's broken in a number of ways. In particular, a bunch of the
conditions are backwards so it doesn't actually detect what it's
supposed to detect. Since it's been broken, it hasn't actually been
helping anything so just deleting it isn't a regression.
This (and removing another optimization) were done on master in commit
b073811617.
Cc: "Kenneth Grunke" <kenneth@whitecape.org>
Cc: "Mark Janes" <mark.a.janes@intel.com>
[Emil Velikov: patch is a backport of the below "cherry pick"]
Fixes: a4393bd97f ("i965/fs: Fix the inline nir_op_pack_double optimization")
(cherry picked from commit b073811617)
vkQueuePresentKHR() takes VkPresentInfoKHR pointer and includes a
pResults fields which must holds the results of all the images
requested to be presented. Currently we're not filling this field.
Also as a side effect we probably want to go through all the images
rather than stopping on the first error.
This commit also makes the QueuePresentKHR() implementation return the
first error encountered.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0fcb92c17d)
Even though we supported both coherent and non-coherent memory types, we
effectively forced apps to use the coherent types by accident. Found by
inspection, only compile tested.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6319bfc2a6)
This just fixes this without repeating the code.
Reported-by: Li Qiang
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 69fc7a2c82)
It's trivial to swizzle clear colors on the CPU, easily deals with the
hardware restrictions for render target swizzles, and makes swizzled
clears work on all hardware as opposed to just HSW+.
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e233db6e93)
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.
Having a TCS but no TES may confuse drivers - i965 crashes, for example.
This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.
v2: Add an ES spec citation (suggested by Alejandro)
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 05a56893aa)
Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7e75f0913a)
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."
In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.
With this patch Mesa reports a compilation error as glslang does with
the following shader:
buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}
Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5bc222ebaf)
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c95f821cb4)
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.
Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1acdd62847)
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.
Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 59ca352fc5)
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."
Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute
This allows some fragment shaders in Dirt Rally to compile
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21efe2528c)
(Patch co-authored by Jason and Ken.)
We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.
This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.
At best, getting this wrong would reduce the guardband's effectiveness
in some cases. At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.
v2: drop clamping of positive maximums.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f3c068c5c8)
The next patch will make the guardband calculation dependent on the
transformation matrix. Instead of computing it in both atoms, just
combine them into a single atom.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 89ad7f1be6)
As was done for dcc and cmask.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 90ac2285f0)
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
(cherry picked from commit 47ca0f537d)
CMASK alignment can be greater than image data alignment, so pass
it to the app so that it knows what alignment to backing memory
should have.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eb01b20cc4)
This fixes the vulkan samples deferredmultisampling test.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a864ef7f48)