Fixes some dEQP-VK.renderpass2.* flakes. Valgrind:
Test case 'dEQP-VK.renderpass2.dedicated_allocation.attachment.8.724'..
==754520== Conditional jump or move depends on uninitialised value(s)
==754520== at 0x575B21C: radv_layout_is_htile_compressed (radv_image.c:1690)
==754520== by 0x572F470: radv_handle_depth_image_transition (radv_cmd_buffer.c:5855)
==754520== by 0x572F2F2: radv_handle_image_transition (radv_cmd_buffer.c:6123)
==754520== by 0x572EEC6: radv_handle_subpass_image_transition (radv_cmd_buffer.c:3385)
==754520== by 0x572A104: radv_cmd_buffer_begin_subpass (radv_cmd_buffer.c:4843)
==754520== by 0x572A007: radv_CmdBeginRenderPass (radv_cmd_buffer.c:4913)
==754520== by 0x572A197: radv_CmdBeginRenderPass2 (radv_cmd_buffer.c:4921)
Why false?
A renderloop happens when the same attachment is both used as input
attachment and output (color, ds) attachment in a subpass. Of course
this doesn't happen outside of a renderpass and hence we can initialize
it to false at the start of the renderpass.
Fixes: 66131ceb8b "radv: Pass through render loop detection to internal layout decisions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3074
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6068>
(cherry picked from commit 18fe130ec9)
This avoids some performance regressions on Gen12 platforms caused by
SIMD32 fragment shaders reported in titles like Dota2, TF2, Xonotic,
and GFXBench5 Car Chase and Aztec Ruins.
The most obvious pattern in the regressing shaders I identified among
these workloads is that they all had non-uniform discard statements,
which are handled rather optimistically by the current IR analysis
pass: No penalty is currently applied to the SIMD32 variant of the
shader in the form of differing branching weights like we do for other
control flow instructions in order to account for the greater
likelihood of divergence of a SIMD32 shader.
Simply changing that by giving the same treatment to discard
statements as we give to other branching instructions seemed to hurt
more than it helped on platforms earlier than Gen12, since it reversed
most of the improvement obtained from SIMD32 fragment shaders in
Manhattan for no measurable benefit in other workloads (Manhattan has
a handful of shaders with statically non-uniform discard statements
which actually perform better in SIMD32 mode due to their approximate
dynamic uniformity). For that reason this change is applied to Gen12+
platforms only.
I've been running a number of tests trying to understand the
difference in behavior between Gen12 and earlier platforms, and most
of the evidence I've gathered seems to point at EU fusion being the
culprit: Unlike previous generations, on Gen12 EUs are arranged in
pairs which execute instructions in lockstep, giving an effective warp
size of 64 threads in SIMD32 mode, which seems to increase the
likelihood for control flow divergence in some of the affected shaders
significantly.
Fixes: 188a3659ae "intel/ir: Import shader performance analysis pass."
Reported-by: Caleb Callaway <caleb.callaway@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5910>
(cherry picked from commit 4d73988f6f)
Copied from PAL. Higher values break tessellation, which I was only able
to reproduce with register shadowing enabled.
Fixes: 0bf3e6fae7 "radeonsi/gfx10: double the number of tessellation offchip buffers per SE"
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5798>
(cherry picked from commit 1c6eca23fd)
Previously, we've set element_size == 16 which causes loads from
packed vec3 arrays to cross the boundary and return wrong data.
This patch sets element_size = 4 and splits loads into single channel.
Fixes all of dEQP-VK.subgroups.ballot_broadcast.*
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5977>
(cherry picked from commit 7015d2c249)
Consider the following case:
if ssa_1 {
block block_2:
/* succs: block_4 */
} else {
block block_3:
...
break
/* succs: block_5 */
}
block block_4:
vec1 32 ssa_100 = phi block_2: ssa_2
After block_3 extraction and reinsertion, phi->pred becomes invalid
and isn't updated by reinsertion since it is unreachable from block_3.
Call nir_opt_remove_phis_block before moving block to eliminate single
source phis after the if.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3282
Fixes: e3e929f8c3
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5945>
(cherry picked from commit 6f94b3da11)
When creating a sampler-type, we need to pass the correct vaclue for
the "is_shadow"-parameter to glsl_sampler_type(), otherwise the compiler
backend will have no clue about this being a shadow-sampler.
Fixes: 1c0f92d8a8 ("nir: Create sampler variables in prog_to_nir.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5986>
(cherry picked from commit c33e8d7d52)
This change mostly touches error handling code paths, where a
bug was found when the DRI driver failed to bind a new DRI
context. Specifically, the reason for it to fail was the window
system unable (for whatever reason) to provide the DRI drawable
with a buffer. In this instance, Mesa un-does the EGL bindings,
but doesn't restore the old DRI context, hence remaining in a
funny state. It's worth mentioning that despite trying, there
is no guarantee that the old DRI context can be restored,
depending on the runtime.
Before this change, if bindContext() failed then
dri2_make_current() would rebind the old EGL context and
surfaces and return EGL_BAD_MATCH. However, it wouldn't rebind
the DRI context and surfaces, thus leaving it in an
inconsistent and unrecoverable state.
After this change, dri2_make_current() tries to bind the old
DRI context and surfaces when bindContext() failed. If unable
to do so, it leaves EGL and the DRI driver in a consistent
state, it reports an error and returns EGL_BAD_MATCH.
Fixes: 4e8f95f64d ("egl_dri2: Always unbind old contexts")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5707>
(cherry picked from commit 2907faee7a)
dri2_make_current() has become hard to follow, address this by
splitting the semantic of needing a call to bindContext() and
its failure.
Cc: mesa-stable
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5707>
(cherry picked from commit 8b0b6f907d)
dri2_make_current() has become long and convoluted. Address
this by folding together multiple if blocks checking for the
same variable.
Cc: mesa-stable
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5707>
(cherry picked from commit 6b12999ef7)
Fixes an issue with Renderdoc's shader debugging with ACO.
If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6320e37d4b ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709>
(cherry picked from commit 0868638aed)
Otherwise we might get VM_L2_PROTECTION_FAULT_STATUS errors.
Fixes: 8275dc1ed5 ("ac/surface: fix epitch when modifying surf_pitch")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>
(cherry picked from commit 87ecfdfbf0)
It's invalid and the temporary syncobj was never actually destroyed.
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5921>
(cherry picked from commit 8aa9d0acb8)
This reverts commit 4fee7b30c0, which was
intended to be a temporary workaround for a leak introduced in
a65e29ccb2 ("gallium: simplify throttle implementation"). However, that
leak was then fixed in 023282a4f6 and we
forgot to revert this hack.
Closes: #2108
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5858>
(cherry picked from commit 40b99bb79e)
No clue how this worked before.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 82f18b713a ("panfrost: Keep track of active BOs")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
(cherry picked from commit 37d89e0f93)
When overwriting the writer, we need to release the old reference.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 2dad9fde50 ("panfrost: Start tracking inter-batch dependencies")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
(cherry picked from commit 20dd37024b)
Whenever a struct type is decorated Block or BufferBlock we turn that
into a GLSL_TYPE_INTERFACE. Since these decorations can end up random
places, we should allow them for constants.
Closes: #3252
Fixes: 9d0ae777dd "spirv: Use interface type for block and buffer..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5855>
(cherry picked from commit 351b5137d7)
EXT_shader_image_load_store says:
The format of the image unit must be in the "1x32" equivalence class
otherwise the atomic operation is invalid.
ARB_shader_image_load_store says:
We will only support 32-bit atomic operations on images
Fixes: fc0a2e5d01 ("glsl: add EXT_shader_image_load_store new image functions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5688>
(cherry picked from commit 1e3aeda528)
Two problems:
* Multiply has higher priority than shift
* rsc->layout.format isn't initialized for a2xx
Fixes: 5a8718f01b ("freedreno: Make the slice pitch be bytes, not pixels.")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5796>
(cherry picked from commit 344e764b01)
Coincidentally got a bugreport of a game that is broken without the import
fix below, but it turns out I made a copy-paste error as well ..
In good news it is clearly tested now.
Fixes: ad15149958 "radv: Set handle types in Android semaphore/fence import."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5783>
(cherry picked from commit ffb8020f6e)
ntq_setup_fs_inputs and ntq_setup_gs_inputs sort the inputs according to
the driver location. This input array is then used to calculate the VPM
offset for the outputs in the previous stage. However, it wasn’t taking
into account variables that are packed into a single varying slot. In
that case they would have the same driver_location and are
distinguished by location_frac.
This patch makes it additionally sort by location_frac when the driver
locations are equal. This can happen when the compiler packs varyings
that are sized less than vec4. Without this fix, when the VPM is used to
transmit data free-form between the stages (such as VS->GS) then it
would end up writing to inconsistent locations.
Fixes dEQP tests such as:
dEQP-GLES31.functional.primitive_bounding_box.lines.global_state.
vertex_geometry_fragment.default_framebuffer_bbox_equal
Fixes: 5d578c27ce ("v3d: add initial compiler plumbing for geometry shaders")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5787>
(cherry picked from commit deefebc55b)
Fixes various artifacts with Detroit: Become Human. This assumes other
Vulkan games using the same engine could have the same issues.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5710>
(cherry picked from commit e4654a35b0)
If we clear depth only texture via glClearTex(Sub)Image it may cause:
../src/intel/blorp/blorp_genX_exec.h:1554: blorp_emit_surface_states: Assertion `params->depth.enabled || params->stencil.enabled' failed.
due to clear_depth_stencil calling blorp_clear_depth_stencil when
depth is already fast-cleared and there is no stencil.
Fixes piglit test: arb_clear_texture-depth
Fixes: 51638cf18a
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5770>
(cherry picked from commit 77844690be)