Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.
Fixes tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers
Fixes: 71595a189a
("tu: Fix feedback loops in sysmem mode")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
(cherry picked from commit dab34bd5c8)
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.
This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.
References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>
(cherry picked from commit d10fd5b7c9)
Commit 38800b38 changed nir_opcodes.py, but that doesn't seem to have
triggered nir_opt_algebraic.py. The change in 75ef5991 depends on
opt_algebraic lowering 16-bit versions of slt, but if opt_algebraic is
not rebuilt, this may not happen. This resulted in some people seeing
assertion failures in, for example,
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_3.step,
due to the backend seeing nir_op_slt that it didn't know how to handle.
v2: Add nir_opcodes.py to nir_algebraic_py so that all the per-driver
algebraic passes pick up the dependency too. Rename it to
nir_algebraic_depends. Suggested by Emma.
Closes: #6047
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15050>
(cherry picked from commit a01b262990)
Conflicts:
src/gallium/drivers/r300/meson.build
- Delete code from r300, which doesn't exist in the 22.0 branch
The important thing isn't the number of words pushed, it's that there are no
UBOs required for us to upload. Check that instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
(cherry picked from commit 3c1021cd1e)
This is a very obscure encoding restriction in the Bifrost ISA. Unknown if any
real apps or tests hit this, but we still need to get it right sadly.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>
(cherry picked from commit 8e0eb592d5)
glXMakeCurrent* may miss release pbuffer if pbuffer is created
with refcount=0. This won't happen when pbuffer had different
GLX id and X pixmap id.
cc: mesa-stable
Fixes: bc8a51a79a ("glx: no need to create extra pixmap for pbuffer")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14926>
(cherry picked from commit bf09c08e31)
This causes the flushed_depth_texture is allocated without
multi sample. So the blit will cause VM fault.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14990>
(cherry picked from commit 80974a5f1e)
Bit is flipped compared to all the other packets.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 705395344d ("intel/fs: Add support for compiling bindless shaders with resume shaders")
Fixes: c3ac9afca3 ("anv: Create and return ray-tracing pipeline SBT handles")
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15078>
(cherry picked from commit 2763a8af5a)
memcpy is divided into chunks that are vec4 sized max. The problem
here happens with a structure of 24 bytes :
struct {
float3 a;
float3 b;
}
If you memcpy that struct, the lowering will emit 2 load/store, one of
sized 8, next one sized 16. But both end up located at offset 0, so we
effectively drop 2 floats.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a3177cca99 ("nir: Add a lowering pass to lower memcpy")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15049>
(cherry picked from commit 768930a73a)
If a secondary command buffer is used and the client provides a
framebuffer and that framebuffer has a stencil-only attchment, we would
try to get the aux usage for the depth component of that attachment and
crash. Check the aspects of the image before looking at aux usage.
This fixes at least the following SkQP tests on my Tigerlake:
- vk_circular-clips
- vk_filterfastbounds
- vk_innershapes_bw
- vk_lineclosepath
- vk_multipicturedraw_rrectclip_simple
- vk_pathinvfill
- vk_quadclosepath
- vk_rrect_clip_bw
- vk_windowrectangles
Fixes: 0d8b9c529c ("anv: Allow PMA optimization to be enabled in secondary command buffers")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15048>
(cherry picked from commit df0e2a1565)
We're moving towards a path where all contexts share the same virtual
memory - because this will make implementing vm_bind much easier - ,
and to achieve that we need to rework the binder memzone. As it is,
different contexts will choose overlapping addresses. So in this patch
we adjust the Binder to be 1GB - per Ken's suggestion - and use a real
vma_heap for it. As a bonus the code gets simpler since it just reuses
the same pattern we already have for the other memzones.
Credits to Kenneth Granunke for helping me with this change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 70dcffde4e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15036>
With the recent addition of the shortcuts aiming to avoid atomic
operations, the reference count on resources can become unbalanced
in the Tegra driver since they are wrapped and then proxied to the
Nouveau driver.
Fix this by keeping a private reference count.
Fixes: 7688b8ae98 ("st/mesa: eliminate all atomic ops when setting vertex buffers")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 108e6eaa83)
With the recent addition of the shortcuts aiming to avoid atomic
operations, the reference count on sampler views can become unbalanced
in the Tegra driver since they are wrapped and then proxied to the
Nouveau driver.
Fix this by keeping a private reference count.
Fixes: ef5d427413 ("st/mesa: add a mechanism to bypass atomics when binding sampler views")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit e8ce0a3357)
The "IB2" indirect buffer command is not supported on compute queues
according to PAL, and it indeed causes GPU hangs when task shaders are
used together with vkCmdExecuteCommands.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15006>
(cherry picked from commit da719792ad)
Once we simplified a phi node, we never updated the definition it points
to, which meant that it could become out of date if that definition were
also simplified, and we didn't check that when rewriting sources. That
could happen when there are multiple nested loops with phi nodes at the
header.
Fix it by updating the phi's pointer. Since we always update sources
after visiting the definition it points to, when we go to rewrite a
source, if that source points to a simplified phi, the phi's pointer
can't be pointing to a simplified phi because we already visited the phi
earlier in the pass and updated it, or else it's been simplified in the
meantime and this isn't the last pass. This way we don't need to
keep recursing when rewriting sources.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15035>
(cherry picked from commit 3ef858a6f6)
This fixes artifacts seen in games when using ASTC transcoding,
we need to use DXT5 for proper alpha channel support.
Number of components is a block specific property, there is no easy
way to see if we will require >1bit alpha support or not, so simply
use DXT5 to have support in place.
Fixes: 91cbe8d855 ("gallium: Add a transcode_astc driconf option")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15029>
(cherry picked from commit d3b4202b63)
STL/LDL have 13 bits to store imm offset.
Fixes crash in CS compilation in Monster Hunter World.
Fixes: b024102d7c
("freedreno/ir3: Use nir_opt_offset for removing constant adds for shared vars.")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14968>
(cherry picked from commit 0b2da9d795)
We really need offsets to be in dwords, not in vec4s.
The bug manifests as random failure of func.mesh.clipdistance.5 crucible
test, where stores to gl_MeshVerticesNV[x].gl_ClipDistance[4+n] actually write to
gl_MeshVerticesNV[x].gl_ClipDistance[1+n].
Fixes: 1f438eb033 ("intel/compiler: Implement Mesh Output")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14997>
(cherry picked from commit b6557b80a5)
This code does not work as expected when built with clang and
-fstrict-aliasing.
Redefine it in unsigned char operations so that it does not
violate strict aliasing rules.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
(cherry picked from commit 0f9756f480)
Some functions in ppir iterate the ppir_op_info slots arrays looking
for the PPIR_INSTR_SLOT_END token. The dummy/undef internal ops may
appear in the scheduling code and their slots arrays did not contain
that token, which could result in invalid array reads.
Reported by gcc -fsanitize=address.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
(cherry picked from commit 7297f931f0)
Reported by gcc -fsanitize=address, sometimes gpir regalloc attempts to
handle an uninitialized node->value_reg (containing the value -1), which
results in an invalid array access.
Avoid it for now to prevent crashes, but more investigation may be
required later on.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14894>
(cherry picked from commit 5b15849366)
Since the winsys uses refcount, options like RADV_DEBUG_ZERO_VRAM might
have not been initialized if the first instance wasn't created with
application info.
This fixes missing zerovram for vkd3d-proton.
Cc: 21.3 22.0 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14978>
(cherry picked from commit aa3405e812)
This wasn't much of a problem before because vk_command_buffer_finish()
doesn't do much on an empty command buffer. However, it's about to be
responsible for managing the pool's list of command buffers so it will
be critical to get this right.
Fixes: c9189f4813 ("anv: Use a common vk_command_buffer structure")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
(cherry picked from commit 7b0e306854)
Properly handling NaN adversely affects several hundred shaders in
shader-db (lots of Skia and a few others from various synthetic
benchmarks) and fossil-db (mostly Talos and some Doom 2016). Only apply
the NaN handling work-around when the shader demands it.
v2: Add comment explaining the 1.0*y_over_x. Suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 2098ae16c8 ("nir/builder: Move nir_atan and nir_atan2 from SPIR-V translator")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
(cherry picked from commit 1cb3d1a6ae)
frexp_sig of ±0, ±Inf, or NaN should just return the input unmodified.
frexp_exp of ±Inf or NaN is undefined, and frexp_exp of ±0 should return
the input unmodified. This seems to already work.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 23d30f4099 ("spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
(cherry picked from commit 7d0d9b9fbc)