Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewied-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 18.2: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 743e11c10b)
Gen7.5 has a BLEND_STATE of size 0 which includes a variable length
group. We did not deal with that very well, leading to an endless
loop.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 440a988bd1)
Instead of printing addresses like everyone else, we were accidentally
printing the offset from state base address. Also, state_map is a void
pointer so we were incrementing in bytes instead of dwords and every
state other than the first was wrong.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit cbd4bc1346)
First of all, setting iter->name in advance_field is unnecessary because
it gets set by gen_decode_field which gets called immediately after
gen_decode_field in the one call-site. Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next so the first field of a struct would get printed
wrong if it doesn't start on the first bit. This is fixed by adding a
iter_start_field helper which sets the field and also sets up the other
bits we need. This fixes decoding of 3DSTATE_SBE_SWIZ.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 2abd7ae189)
The batch decoder looks for a field with a particular name to decide
whether an MI_BB_START leads into a second batch buffer level. Because
the names are different between Gen7.5/8 and the newer generation we
fail that test and keep on reading (invalid) instructions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f430a37fa7)
>From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:
"Only the input variables that are actually read need to be written
by the previous stage; it is allowed to have superfluous
declarations of input variables."
Fixes:
* interstage-multiple-shader-objects.shader_test
v2:
Update comment in ir.h since the usage of "used" field
has been extended.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 4a8444d5bc)
Just like the rest of the tree - these should be run either as part of
the build system check target, or at the very least with an explicitly
versioned python executable.
Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python script")
Fixes: 97c28cb082 ("glsl/tests: Convert optimization-test.sh to pure python")
Fixes: 3b52d29227 ("glsl/tests: reimplement warnings-test in python")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 48820ed8da)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/compiler/glsl/tests/optimization_test.py
This is a patch from me and a patch from Mathieu Bridon squashed
together.
Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
(cherry picked from commit e15686567c)
[Andres Gomez: applied cleanly but backported as suggested by Emil]
Signed-off-by: Andres Gomez <agomez@igalia.com>
The requirement was bumped a while back, but we forgot to update the
docs.
Fixes: ed871af91c ("configure.ac: raise Mako required version to
0.8.0")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit e39b916d0c)
Haven't tested this, but we do include loader.h
in platform_android.c
Fixes: c5ec155685 ("meson: wire up egl/android")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit c731508b98)
The extension was exposed but not the functions.
This fixes:
dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels
dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv
dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 37eee90df7)
Thanks to reproducible builds, binary file timestamps may be identical
for both 32bit and 64bit packages when built from the same source.
This means radv will use the same cache for both 32 and 64 bit
processes, which leads to crashes.
Conveniently there is a spare byte in cache_uuid, let's place the
pointer size there.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107601
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105904
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 356f6673d6)
Fix rendering issues on BDW and SKL.
Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")
Fixes the following regressions seen
exclusively on SKL:
* KHR-GL46.texture_barrier_ARB.disjoint-texels
* KHR-GL46.texture_barrier_ARB.overlapping-texels
* KHR-GL46.texture_barrier.disjoint-texels
* KHR-GL46.texture_barrier.overlapping-texels
and both on BDW and SKL:
* GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners
* GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners
v2: Note the fixed tests (Andres).
Don't cause failures with multisampled buffers (Andres).
Don't hamper SKL GT4 (Ken).
v3: Fix the Fixes tag (Dylan).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6d80b0b4ba)
Check the destination's row pitch against the BLT engine's row pitch
limitation as well.
Fixes: 0288fe8d04
("i965/miptree: Use the correct BLT pitch")
v2: Fix the Fixes tag (Dylan).
Check the destination row pitch (Chris).
Reported-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b041fc0649)
This struct contains all the data of interest. can_blit_slice() will use
it in the next patch to calculate the correct pitch.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 030b6efcfd)
This fixes a crash in build_tex_intrinsic() when trying to
launch the Basemark GPU benchmark on GFX8. It looks like
there is still something wrong because some frames are black.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 4c43ec461d)
Otherwise, nir_lower_clip_cull_distance_arrays might report
wrong number of output clips/culls because it relies on
shader output variables and some of them might be dead.
This fixes a rendering issue with Dolphin and Super Mario
Sunshine.
Fixes: b0c643d8f5 ("spirv: Use NIR per-member splitting")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107610
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 24ee53231d)
it causes corruption on several different GPU generations.
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a4a104fc81)
With the current code, we didn't do the space checks prior
to atomic counter setup emission, but we also didn't add
atomic counters to the space check so we could get a flush
later as well.
These flushes would be bad, and lead to problems with
parallel tests. We have to ensure the atomic counter copy in,
draw emits and counter copy out are kept in the same command
submission unit.
This reworks the code to drop some useless masks, make the
counting separate to the emits, and make the space checker
handle atomic counter space.
[airlied: want this in 18.2]
Fixes: 06993e4ee (r600: add support for hw atomic counters. (v3))
(cherry picked from commit 32529e6084)
We use floating-points for viewport bounds so VIEWPORT_SUBPIXEL_BITS
should reflect this.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105975
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 25ec806eb2)
This fixes a regression with some Unity demos. Not sure
what the root cause of the problem is, especially because
the driver doesn't perform any fast color clears. So, it
shouldn't be needed to decompress DCC. RadeonSI says that
the decompression is relatively cheap if the surface has
been decompressed already.
One possible improvement is to two use predicates, one for
DCC and one for FCE that could be cleared when DCC, FMASK
or CMASK are performed by the driver. That might skip some
unnecessary decompression passes (not DCC though).
Fixes: ff7daadca1 ("radv: enable/disable predication for the DCC decompression pass")
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0aacb5eab6)
At the moment, depending on pipe transfer flags, the dumb
buffer map address can end up at either kms_sw_dt->ro_mapped
or kms_sw_dt->mapped.
When it's time to unmap the dumb buffer, both locations get unmapped,
even though one is probably initialized to 0.
That leads to the code segment getting unmapped at runtime and
crashes when trying to call into unrelated code.
This commit addresses the problem by using MAP_FAILED instead of
NULL for ro_mapped and mapped when the dumb buffer is unmapped,
and only unmapping mapped addresses at unmap time.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107098
Signed-off-by: Ray Strode <rstrode@redhat.com>
Fixes: d891f28df9 ("gallium/winsys/kms: Fix possible leak in map/unmap.")
Cc: Lepton Wu <lepton@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9baff597ce)
Because lower_ycbcr gets called before apply_pipeline_layout, the
indices are all logical and the binding layout HW size is actually too
big for the bounds check. We should just use the regular logical array
size instead.
Fixes: f3e91e78a3 "anv: add nir lowering pass for ycbcr textures"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 320dacb0a0)
When the number of unique BO is 0, we optimize the list creation
by copying all buffers of the current CS directly into it. But
this is only valid if the CS doesn't have virtual buffers,
otherwise they are not added and hw might report VM faults.
This fixes VM faults with:
dEQP-VK.sparse_resources.image_sparse_binding.2d.rgba8ui.1024_128_1
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d27e1584ce)
We have to do a fast-clear eliminate when clearing DCC
metadata with 0x20202020. I don't know if that fixes anything
but that seems correct to me.
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f9e8456c39)
This was missing when VK_EXT_conditional_rendering has been
implemented. The predication type should be -1 to avoid
restoring previous state when performing a decompression pass
with DCC enabled.
Note that we don't have to handle secondary command buffers
because we don't support this feature currently.
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f3a78a9da0)
Seems like DXVK depends on that and it might get reverted
upstream. Since apps are not supposed to use 0 in v2 anyway,
we should be safe implementing the old behavior there.
Fixes: 66e12451ac "radv: Update to new VK_EXT_vertex_attribute_divisor to version 2."
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 011a811652)
Accessing scalar constant as an array in function call or
initializer list triggered assert in get_array_element.
Examples:
func(0[0]);
vec2 t = { 0[0], 0 };
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107550
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 479a849ad6)
Commit 4434591bf5 caused substantially more URB messages in
geometry and tessellation shaders. Before we can really enable this
sort of optimization, We either need some way of combining them back
together into vectors or we need to do cross-stage vector element
elimination without splitting everything into scalars.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107510
Fixes: 4434591bf5 "intel/nir: Call nir_lower_io_to_scalar_early"
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 10f44da775)
Kernel (for ppgtt) requires memory address to be
aligned to page size (4096).
-v2: added marking that also fixes initial commit 01058a5522.
-v3: numbers replaced by PAGE_SIZE; buffer-object size is aligned
instead of alignment of offsets (Chris Wilson).
-v4: changes related to PAGE_SIZE moved to separate commit
-v5: restored alignment to page-size for 0-size.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106997
Fixes: a363bb2cd0 (i965: Allocate VMA in userspace for full-PPGTT systems.)
Fixes: 01058a5522 (i965: Add virtual memory allocator infrastructure to brw_bufmgr.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 24839663a4)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f0a8accb0d)
This option allows us to remove additional s_waitcnt instructions
because s_barrier internally does s_waitcnt 0.
Though, apparently there is a problem with LDS accesses that
causes rendering issues with FFXV and DXVK. Disable this
optimization for now (RadeonSI still uses it).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 71d5b2fbf8)
This patch fixes a regression in mesa 18.2 and mesa-dev branches
for HAVE_DRM_GRALLOC code path which is causing black screen on Android
and prevents boot due to SIGSEGV MAPERR crash related to unproper handling
of drm_gralloc drm FD in new droid_open_device() path.
Problem is due to c7bb82136b ("egl/android: Add DRM node probing and filtering")
To avoid the crash the former existing working droid_open_device() is restored,
renamed droid_open_device_drm_gralloc() and kept within HAVE_DRM_GRALLOC braces.
Tested with mesa-dev and mesa 18.2 branch and oreo-x86 bootanimation
and Androdi GUI booting is fixed with i965, nouveau, radeon.
The changes are compatible with gbm_gralloc, I've tested build with hwc too.
(v2) remove indentation from HAVE_DRM_GRALLOC pre-processor directive
NOTE: Definition of enum{} for GRALLOC_MODULE_PERFORM_GET_DRM_FD
is not necessary and it's actually causing a redefinition building error,
because in HAVE_DRM_GRALLOC path gralloc_drm.h is already exported
by libgralloc_drm which is currently still a dependency.
Fixes: c7bb82136b ("egl/android: Add DRM node probing and filtering")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
(cherry picked from commit 73b342c7a5)
Follow radeonsi.
Fixes: 3665f66ef2 "radv: Add support for ETC2 textures."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4bb6c49375)
Behavior wrt firstInstance got changed, and a divisor of 0 has been
disallowed.
The new version of the ext got published in specification 1.1.81.
Sending to stable since the only known user is DXVK, which needs
this for correctness.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 66e12451ac)
One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS
destinations were broken was that an earlier layer was swapping it
out for B8G8R8A8_UNORM. That made Z24X8 -> Z24X8 blits work.
However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken.
The old code only considered one format at a time, without thinking
that format conversion may need to occur.
This patch moves the translation out to a place where it can consider
both formats. If both are Z24X8, we continue using B8G8R8A8_UNORM to
avoid having to do shader math workarounds. If we have a Z24X8
destination, but a non-matching source, we use our shader hacks to
actually render to it properly.
Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit de57926dc9)
The hardware doesn't support rendering to R24_UNORM_X8_TYPELESS, so
Jason decided to fake it with a bit of shader math and R32_UNORM RTs.
The only problem is that R32_UNORM isn't renderable either...so we've
just traded one bad format for another.
This patch makes us use R32_UINT instead.
Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8a29086285)
The Vulkan 1.1.82 spec flipped the order to better match D3D.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit a9f7bcfdf9)
The check for ETC2 compatibility was not updated when the fallback
format was changed.
Fixes: 71867a0a61
st/mesa: Fall back to R8G8B8A8_SRGB for ETC2
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e94095ec30)
This is basically copied from the DRI2 destroy path. Without this,
Raspberry Pi would quickly run out of CMA during the EGL tests in the CTS
due to all the pixmaps laying around.
Fixes: f35198bade ("egl/x11: Implement dri3 support with loader's dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit b618d7ea59)
When the SIMD16 Gen4-5 fragment shader payload contains source depth
(g2-3), destination stencil (g4), and destination depth (g5-6), the
single register of stencil makes the destination depth unaligned.
We were generating this instruction in the RT write payload setup:
mov(16) m14<1>F g5<8,8,1>F { align1 compr };
which is illegal, instructions with a source region spanning more than
one register need to be aligned to even registers. This is because the
hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
compressed instruction into two mov(8)'s.
I believe this would cause the hardware to load g5 twice, replicating
subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2
artifact blocks in both TIS-100 and Reicast.
Normally, we rely on the register allocator to even-align our virtual
GRFs. But we don't control the payload, so we need to lower SIMD widths
to make it work. To fix this, we teach lower_simd_width about the
restriction, and then call it again after lower_load_payload (which is
what generates the offending MOV).
Fixes: 8aee87fe4c (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Diego Viola <diego.viola@gmail.com>
(cherry picked from commit 08a5c395ab)
This extension is not defined for indirect contexts. Marking it as
"client only", as the old code did here, would make the extension
available in indirect contexts, even though the server would certainly
not have it in its extension list.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 63a6b719d9)