During its Ambient Occlusion calculations the game ends up calculating
sin/cos of some pretty big values, for which HW produces completely bogus
results (e.g. cos(3929491.25) ~= -0.011, while correct would be ~0.923).
Limit the arguments to the reasonable (-2*Pi; 2*Pi) range with the
limit_trig_input_range WA.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8292
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21396>
(cherry picked from commit 42dba8ebc5)
When establishing a texture transfer map, if there is any pending changes on the
texture, instead of trying direct map with DONTBLOCK first, just
use the upload buffer path.
Fixes piglit tests gen-teximages, arb_copy_images-formats
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 1b9b060f0e)
Add typeless format to the compatible format lists for shareable surfaces.
Fixes webgl benchmark crash in eglCreateImage running from firefox on Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 3a359385cb)
When an EGLImage is created from a 2D texture and used for texture sharing,
the texture surface might not have been created with the SHARED bind flag.
To allow these surfaces for sharing, this patch sets the USAGE SHARED bit
for surfaces that can be potentially used for sharing even when the SHARED
bind flag is not originally set. Instead of unconditionally enabling the
SHARED bind flag for all surfaces and unnecessarily bypass the surface cache
optimization, this patch only enables the USAGE SHARED bit for surfaces
that also have the RENDER TARGET bind flag.
When the surface handle is inquired and if the surface is currently
marked as cachable, we will need to unset the cachable bit so
the surface handle will not be recycled again.
This patch fixes an assertion in svga_resource_get_handle() when the
EGL_MESA_image_dma_buf_export extension is used in webgl benchamrk running
from firefox in Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 75b7296fc3)
AMDGPU pstate is per context but if there is multiple AMDGPU contexts
in flight, the kernel can return -EBUSY.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
(cherry picked from commit 6d73841d34)
This change is wrong for two reasons:
- it hangs most of the time maybe, because changing PSTATE when the
application is running is broken somehow
- it increases the time between triggering and generating the capture
considerably, because there is a delay for changing PSTATE
This restores previous logic where PSTATE is set to profile_peak at
logical device creation. Though, it also re-introduces an issue with
multiple logical devices (kernel returns -EBUSY) but this will be
fixed in the next commit.
This fixes GPU hangs when trying to record RGP captures on my NAVI21.
Note that profile_peak is only required for some RDNA2 chips (including
VanGogh).
Cc: mesa-stable
This reverts commit 923a864d94.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
(cherry picked from commit 663877e894)
resource_from_handle implementations create an additional reference to
the scanout resource, which caused lima to leak those resources after
commit ad4d7ca833.
Do as the other drivers do and import the bo directly while creating
the scanount resource.
Cc: 22.3 mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8198
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21330>
(cherry picked from commit c426e5677f)
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty. In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break. This was true until c7fc44f9eb ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.
This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 4e09d37f3b)
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source. In particular, consider the
following parallel copy: A -> B, C -> A, A -> C. In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C. This allows us to
resolve the swap cycle between A anc C without allocating a temporary
register because we know B is also a copy of A.
When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable. We could,
but it would require a read_first_invocation which would get messy. In
In c7fc44f9eb ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.
The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination. (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.) While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling. For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.
This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination. This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.
No shader-db changes on SKL or DG2
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 5afba073c6)
upcoming xserver releases will emit PresentConfigureNotify with this
flag set when a window is destroyed, ensuring drivers
don't poll infinitely and deadlock
fixes#6685
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>
(cherry picked from commit 819cbf329a)
base_mip_width is used in si_compute_copy_image when the
SI_IMAGE_ACCESS_BLOCK_FORMAT_AS_UINT flag is used.
width = tex->surface.u.gfx9.base_mip_width;
This will be incorrect if we don't adjust it. For instance,
with a 260x256 image, surf_pitch and base_mip_width are
320 before surf_pitch is updated to be 192.
Both need to match, or computing the width from base_mip_width
leads to incorrect result.
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21253>
(cherry picked from commit affa8a9fb2)
This is to avoid out of bound register accesses (potentially leading
to hangs) when the dispatch size is smaller than when is reported in
the NIR subgroup_size.
v2: Implement bounding with a mask (since workgroup sizes are powers of 2) (Faith)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 530de844ef ("intel,anv,iris,crocus: Drop subgroup size from the shader key")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21282>
(cherry picked from commit 9ac192d79d)
It seems broken but can't really figure out why and DCC levels aren't
interleaved on GFX11. Skipping DCC initialization for levels seems to
also fix it but seems safer to disable completely, as a hotfix.
Fixes DCC issues with Hi-Fi Rush, Sonic Frontiers, Hogwarts Legacy
and probably more.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8230
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21186>
(cherry picked from commit 5d41d8258a)
Cannot be used for SSBO, so ignore SCACHE invalidation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 8df17163c7 ("radv: implement vkCmdWaitEvents2KHR")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21271>
(cherry picked from commit 7efabfbbe4)
For sync2 bits, overflow can happen.
Use BITFIELD64_BIT to align with ANV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 8df17163c7 ("radv: implement vkCmdWaitEvents2KHR")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21271>
(cherry picked from commit 97aa8d9547)
Pointed out by GCC:
In function ‘load_text_file’,
inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/compiler/glsl/standalone.cpp:358:17: error: ‘free’ called on pointer ‘block_195’ with nonzero offset 48 [-Werror=free-nonheap-object]
358 | free(text);
| ^
In function ‘ralloc_size’,
inlined from ‘load_text_file’ at ../src/compiler/glsl/standalone.cpp:352:31,
inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/util/ralloc.c:117:18: note: returned from ‘malloc’
117 | void *block = malloc(align64(size + sizeof(ralloc_header),
| ^
Fixes: a9696e79fb ("main: Close memory leak of shader string from load_text_file.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit 49a6bdde8e)
Drop the unused ctx parameter, to match the main Mesa code.
Fixes ODR violation flagged by -Wodr with LTO enabled:
../src/mesa/main/shaderobj.h:74:1: error: ‘_mesa_reference_shader_program_data’ violates the C++ One Definition Rule [-Werror=odr]
74 | _mesa_reference_shader_program_data(struct gl_shader_program_data **ptr,
| ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: type mismatch in parameter 1
76 | _mesa_reference_shader_program_data(struct gl_context *ctx,
| ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: ‘_mesa_reference_shader_program_data’ was previously declared here
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
Fixes: 717a720e9c ("mesa: drop unused context parameter to shader program data reference.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit bf67f32d4b)
A stackless (or at least using allocated memory for stack) version
might be nice but for now this works around some games compiling
large shaders and hitting stack overflows.
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21231>
(cherry picked from commit 0a17c3afc5)