I thought this was a bug in CTS but the Vulkan spec says:
"VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT specifies write access
to a color, resolve, or depth/stencil resolve attachment during
a render pass or via certain subpass load and store operations."
So, VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT is used to synchronize
depth/stencil resolve attachments. Yes, it's counterintuitive.
This can't actually be fixed properly for now because RADV performs
the end subpass barrier *before* resolve attachments instead of after.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8138>
(cherry picked from commit 7880faccc5)
In case one operand was renamed and another operand came
from an incomplete phi, it could happen that the original
name was not restored.
This has no impact on the code, but ensures correct SSA
is maintained during RA.
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8109>
(cherry picked from commit b50d3e5760)
EGL_EXT_protected_surface introduces EGL_PROTECTED_CONTENT_EXT,
while EGL_EXT_protected_content is about protected context.
When I implemented EGL_EXT_protected_surface I mixed up the 2
names, so this commit fixes it.
Fixes: bd182777c8 ("egl: implement EGL_EXT_protected_surface support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8122>
(cherry picked from commit 663e06faa6)
The problem was that the shader constants were based on the framebuffer
sample count and ignored the multisample enable state and the line/polygon
smoothing state, which uses MSAA rasterization that only sets SampleMaskIn
to get the coverage for alpha-blended smoothing (the PS epilog computes
the alpha channel from SampleMaskIn and blending generates the AA results).
- This is a complete rework that adds a new state for NGG cull constants.
- It fixes the same thing for the prim discard compute shader.
- It documents how VS_STATE.SMALL_PRIM_PRECISION is encoded.
It fixes blue corruption in Unigine Heaven with MSAA and Medium details
or better.
Fixes: 7648060dc0 - radeonsi: enable NGG culling by default on gfx10.3 dGPUs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8134>
The small DCE of the spiller only removes the original instructions
of rematerialized variables in case they are unused. If a variable
has been renamed, it cannot match any original instruction anymore.
Thus, the lookup is then unnecessary and can be omitted.
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8055>
(cherry picked from commit ef4101d6d7)
(some) drivers need to have the swizzle set prior to create_sampler_view
being called in order to actually apply it
Fixes: d11fefa961 ("st/mesa: optimize 4-component ubyte glDrawPixels")
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8107>
(cherry picked from commit a709d99bfd)
index_size is specified in bytes, not bits.
Fixes: f4583b4086 ("zink: move 8bit index handling out of u_primconvert path")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8081>
(cherry picked from commit ba74e1be22)
Conflicts:
src/gallium/drivers/zink/zink_draw.c
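As a minimal sketch of the index_size fix above: index_size counts bytes, so an 8-bit index has index_size == 1, and a comparison against 8 would never match. The helper names below are hypothetical and are not taken from zink.

```cpp
#include <cassert>

// Hypothetical helper (not zink's actual code): index_size is a byte count,
// so 8-bit indices are identified by index_size == 1, not index_size == 8.
static bool uses_8bit_indices(unsigned index_size_bytes)
{
    return index_size_bytes == 1;
}

// Hypothetical helper: derive the width in bits explicitly from the byte count.
static unsigned index_size_bits(unsigned index_size_bytes)
{
    // Assumption: only 1-, 2- and 4-byte indices are valid here.
    assert(index_size_bytes == 1 || index_size_bytes == 2 || index_size_bytes == 4);
    return index_size_bytes * 8;
}
```

Keeping the unit in the parameter name makes it harder to mix up bytes and bits at the call site.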
The DRM_RDWR flag is needed for mmap with PROT_WRITE to work.
Cc: mesa-stable
Signed-off-by: Robin Ole Heinemann <robin.ole.heinemann@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8075>
(cherry picked from commit df76963a5c)
If the image tiling is set to VK_IMAGE_TILING_LINEAR,
buffer_set_metadata will read an uninitialized radeon_bo_metadata.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: d5fd8cd46e ("radv: Allow non-dedicated linear images and buffer.")
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7898>
(cherry picked from commit ad19b0714a)
The code here tries to be too smart and only uses a geometry shader if there
are actually multiple layers being uploaded, but the fragment shader also
unconditionally reads gl_Layer as long as the pipe cap for gs is set. This means
that when the gs is dynamically disabled due to uploading a single-layer
surface, the fs has no input to read for gl_Layer and everything breaks.
Always using a gs isn't ideal, but it's considerably more work to manage
multiple fs variants based on layer usage.
Fixes: c99f2fe70e ("st/mesa: implement PBO upload for multiple layers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8067>
(cherry picked from commit 614c77772a)
I got confused and:
* used the vkformat instead of the pipe format for getting format description
* incorrectly calculated bpp
but this time it's definitely 100% fixed I promise
Fixes: 456b57802e ("zink: fix direct image mapping offset")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8074>
(cherry picked from commit dfd0f042e0)
With the block's end_ip accidentally being the ip of the next instruction,
contrary to the comment, you would end up doing end-of-block freeing early
and have the value missing when it came time to emit the next instruction.
Just expand the ips to have separate ones for start and end of block --
while it means that nir_instr->index is no longer incremented by 1 per
instruction, it makes sense for use in liveness because a backend is
likely to need to do other things at block boundaries (like emit the if
statement's code), and having an ip to identify that stuff is useful.
Fixes: a206b58157 ("nir: Add a block start/end ip to live instr index metadata.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7658>
(cherry picked from commit d3d28f6c2d)
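The separate start and end ips described in the commit above can be sketched roughly as follows. Block and index_blocks are hypothetical stand-ins, not NIR types or functions; the sketch only shows that each block boundary gets an ip distinct from those of its instructions and of the next block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block; not a NIR type.
struct Block {
    std::size_t num_instrs = 0;
    unsigned start_ip = 0;
    unsigned end_ip = 0;
    std::vector<unsigned> instr_ips;
};

// Number blocks so that block start, every instruction, and block end
// each receive their own ip. Instruction ips are therefore no longer
// consecutive across a block boundary.
static void index_blocks(std::vector<Block> &blocks)
{
    unsigned ip = 0;
    for (Block &b : blocks) {
        b.start_ip = ip++;
        b.instr_ips.clear();
        for (std::size_t i = 0; i < b.num_instrs; i++)
            b.instr_ips.push_back(ip++);
        b.end_ip = ip++;
        assert(b.end_ip > b.start_ip);
    }
}
```

With this numbering, values that die at the end of a block can be freed at end_ip without colliding with the first instruction of the following block.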
the x and y offsets here were improperly calculated without taking into account:
* layer/level offset
* x/y coord bpp
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8058>
(cherry picked from commit 456b57802e)
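To illustrate the two terms the commit above says were missing, here is a minimal sketch of a byte offset into a linearly laid out image. The Layout struct and its field names are assumptions made for illustration, not zink's data structures, and the sketch assumes an uncompressed format with a whole number of bytes per pixel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a linear image layout; field names are
// illustrative only.
struct Layout {
    uint64_t level_offset;    // byte offset of the mip level within the image
    uint64_t layer_stride;    // bytes between consecutive layers
    uint64_t row_stride;      // bytes between consecutive rows
    uint32_t bytes_per_pixel; // size of one texel in bytes
};

// Byte offset of texel (x, y) in the given layer: the level and layer
// offsets are added, and x is scaled by the texel size.
static uint64_t map_offset(const Layout &l, uint32_t layer, uint32_t x, uint32_t y)
{
    assert(l.bytes_per_pixel > 0);
    return l.level_offset +
           static_cast<uint64_t>(layer) * l.layer_stride +
           static_cast<uint64_t>(y) * l.row_stride +
           static_cast<uint64_t>(x) * l.bytes_per_pixel;
}
```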
The scheduler doesn't take SGPR use into account, which can be
a limiting factor on older GPUs. This patch fixes a CTS test crash
on GFX6.
CC: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8040>
(cherry picked from commit 731f8fc9dd)
Commit 848e7b94 moved u_debug_stack_android.cpp from
src/gallium/auxiliary/util to src/util, but Android.mk was not
updated accordingly.
Fixes: 848e7b94 ("Move stack debug functions to src/util")
Signed-off-by: cheyang <cheyang@bytedance.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7851>
(cherry picked from commit 83d1e2efd0)
Ported from RadeonSI.
The restriction was applied too late.
Cc: 20.2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c5e8f6700b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8014>
Ported from RadeonSI.
To get optimal LDS usage since the previous change.
Cc: 20.2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f777d00a75)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8014>
Using more blend targets than specified by maxFragmentDualSrcAttachments
is invalid per the Vulkan spec.
I'm usually not a fan of working around game bugs inside the driver, but
it's really easy for us to ignore MRT1+ in the driver and that
prevents wrong behaviour.
Cc: 20.2, 20.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit bc7f442d8e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8014>
The Vulkan spec got a clarification about the pSizes parameter.
Fixes a set of new tests:
dEQP-VK.pipeline.extended_dynamic_state*with_offset*
v2: move offset subtract to be part of size calculation (Jason)
CC: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3871
Fixes: b9a05447a1 ("anv: dynamic vertex input binding stride and size support")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7439>
(cherry picked from commit 5998a6543a)
Alpha channel is always linear (oops).
Fixes: ddac5933f8 ("turnip: call packing functions directly for pack_gmem_clear_value")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7899>
(cherry picked from commit d7ea266e6f)
Conflicts:
.gitlab-ci/deqp-freedreno-a630-fails.txt
We need to pick 1u vs 1.0f based on the type of the texture, just like for
normal samples. Move the decision up to the create_sampler_view, and use
that value from both sampler paths.
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8012>
(cherry picked from commit 4ba884b814)
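The "1u vs 1.0f" choice in the commit above can be sketched as picking the raw 32-bit pattern for a constant one based on whether the texture format is an integer format. The function name and the boolean parameter are hypothetical, and the sketch assumes 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the 32-bit pattern that represents "one"
// for a texture. Integer formats need the integer 1, all other formats
// need the bit pattern of the float 1.0f.
static uint32_t one_for_format(bool is_integer_format)
{
    if (is_integer_format)
        return 1u;

    const float one = 1.0f;
    uint32_t bits;
    assert(sizeof bits == sizeof one);
    std::memcpy(&bits, &one, sizeof bits); // well-defined type pun
    return bits;
}
```

Computing the value once, at the point where the view is created, lets every later consumer reuse the same result.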
Commit 64d6f56ad2 ("panfrost: Allocate syncobjs in panfrost_flush")
aimed at optimizing the fencing logic but it looks like it also broke the
fence-based synchronization in subtle ways.
Indeed, now that the fence only waits on a single syncobj, we're not
guaranteed that all jobs queued in panfrost_flush_all_batches() will
be done when the fence is signaled, because jobs at the top level
(those stored in the batches hashmap) have no inter-dependencies.
Commit 9e397956b0 ("panfrost: signal syncobj if nothing is going to
be flushed") made this even more apparent by signaling the fence right
away if nothing was left to be drawn in the current context, thus
ignoring any of the batches left to be flushed in the ->batches map.
If we want to keep relying on the existing kernel APIs, there's clearly no
ideal solution here. We can either go back to the original fencing
mechanism where each fence contained an array of syncobjs to be tested
or serialize jobs that have no explicit dependencies so we know the last
submitted job will also be the last one to return. The original approach
has proven to add quite a significant overhead (caused by the amount of
ioctls and the time spent in kernel space to gather dma fences attached
to those syncobjs and test them). So let's go for the simple solution
where we have a single syncobj bound to the context which we update to
point to the last job out_sync every time we submit a top-level job.
This approach implies reworking the way we create fences since we
need to capture the syncobj state at the time the fence is created.
Unfortunately, there's no SYNCOBJ_CLONE ioctl, which forces us to
export/create/import a fence so we have a new object that's not
subject to changes done to the context syncobj.
If we want to further optimize the logic, we should probably explore
some of these options:
1/ Adding array based SYNCOBJ ioctls (SYNCOBJ_{CREATE,DESTROY,CLONE}_ARRAY)
so we can mitigate the cost of ioctls when we need to manipulate
arrays of syncobjs
2/ Support synchronization jobs. That is, jobs that have a NULL job chain
but an array of sync_in and a sync_out to allow creating
synchronization points
3/ Add syncobj aggregators so we only have to wait on one syncobj from
userspace. The syncobj aggregator would wait for all sub syncobjs to
be signaled before signaling the top-level one.
Fixes: 64d6f56ad2 ("panfrost: Allocate syncobjs in panfrost_flush")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7831>
(cherry picked from commit 29f938a0ec)
We shouldn't reset the ->writer field when a reader comes in because we
want subsequent readers to have a dependency on the writer too. Let's
add a new field encoding the last access type and use it to replace the
writer != NULL test.
Reported-by: Roman Elshin
Fixes: c6ebff3ecd ("panfrost: Remove panfrost_bo_access type")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7831>
(cherry picked from commit 387221e4f2)
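A rough sketch of the access tracking described in the commit above: the writer is kept when a reader arrives, and a separate field records the last access type. All type and function names here are hypothetical and are not panfrost's.

```cpp
#include <cassert>
#include <vector>

enum class Access { None, Read, Write };

struct Batch { int id; }; // hypothetical stand-in for a batch of jobs

// Hypothetical per-BO tracking state.
struct BoAccess {
    Batch *writer = nullptr;
    std::vector<Batch *> readers;
    Access last_access = Access::None;
};

// Record an access by 'batch' and return the batches it must depend on.
static std::vector<Batch *> add_access(BoAccess &bo, Batch *batch, bool write)
{
    assert(batch != nullptr);
    std::vector<Batch *> deps;

    // Every access, read or write, depends on the last writer.
    if (bo.writer && bo.writer != batch)
        deps.push_back(bo.writer);

    if (write) {
        // A writer must also wait for all outstanding readers.
        for (Batch *r : bo.readers)
            if (r != batch)
                deps.push_back(r);
        bo.readers.clear();
        bo.writer = batch;
        bo.last_access = Access::Write;
    } else {
        // The writer is deliberately NOT reset here, so that later
        // readers still pick up a dependency on it.
        bo.readers.push_back(batch);
        bo.last_access = Access::Read;
    }
    return deps;
}
```

Code that previously tested writer != NULL to learn whether the last access was a write can test last_access instead.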
Among all Android gen rules, '::' was used only here to declare dependencies.
Both the mesa development and stable branches are worth receiving the fix.
Fixes the following build errors with Android 7:
obj/STATIC_LIBRARIES/libmesa_nir_intermediates/spirv/gl_spirv.P:184: *** target file
gen/STATIC_LIBRARIES/libmesa_nir_intermediates/spirv/vtn_generator_ids.h' has both : and :: entries. Stop.
Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Fixes: 1070bba19e ("android: fix SPIR-V -> NIR build")
Reported-by: youling257 <youling257@gmail.com>
(cherry picked from commit 185df8ef07)
Apparently LRZ will be read/written regardless of depth being enabled or
not, so we have to make sure these registers are zero.
Fixes: 1d83f5ae84 ("turnip: disable LRZ on vkCmdClearattachments() 3D fallback path")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7899>
(cherry picked from commit fa16e66a3f)
There is an early return if cmd->state.predication_active is true, so do
the LRZ invalidate before that.
Fixes: 2f79e00664 ("turnip: disable LRZ on vkCmdClearAttachments()")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7899>
(cherry picked from commit f24358e002)
The packet size is constant and assumes all states, except for the 2 input
attachment states. (this means we get an invalid packet if DIRTY_LRZ isn't
set when DIRTY_DRAW_STATE is set).
Fixes: 3c07a14998 ("turnip: enable LRZ")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7899>
(cherry picked from commit af6e74bca8)
In the Chrome WebGL Aquarium stress test, 20 instances of Chrome run
Aquarium simultaneously for 20+ hours, which causes Chrome to crash.
During the stress test, glBeginQueryIndexed is called frequently.
1. Each query only uses 32 bytes from query_buffer_uploader. Once the offset
exceeds 4096, a new buffer is allocated for query_buffer_uploader->buffer
and the old buffer is released.
2. However, iris_begin_query calls u_upload_alloc when the offset changes, and
every call to u_upload_alloc increases
query_buffer_uploader->buffer->reference.count.
3. So when u_upload_release_buffer tries to release
query_buffer_uploader->buffer, its reference.count is already equal to 129.
pipe_reference_described only decreases the reference count to 128, so
old_dst->screen->resource_destroy is never called.
4. The old resource bo is never freed, and Chrome calls mmap every time
a new resource bo is allocated.
5. The Chrome process maps too many vmas. Its map count exceeds
sysctl_max_map_count, which is defined as 65530 in the kernel.
6. When iris_begin_query wants to allocate a new resource bo, it gets a NULL
pointer because mmap failed. Chrome finally crashes when it accesses this NULL
resource bo.
The fix is to decrease the reference count in iris_destroy_query.
The patch was verified with the Chrome WebGL Aquarium test case for more than 72 hours.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yang Shi <yang.a.shi@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7890>
(cherry picked from commit 3aaac40b12)
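The leak described in the commit above can be modelled with a toy reference counter. This is only a sketch of the bookkeeping: Resource, Query and the functions below are hypothetical and do not reproduce the iris or u_upload_mgr code. With 32 bytes per query and a 4096-byte buffer, 128 queries each hold one reference in addition to the uploader's own.

```cpp
#include <cassert>

// Toy refcounted buffer; the creator (the uploader) holds the first reference.
struct Resource {
    int refcount = 1;
    bool destroyed = false;
};

static void resource_ref(Resource &r) { r.refcount++; }

static void resource_unref(Resource &r)
{
    assert(r.refcount > 0);
    if (--r.refcount == 0)
        r.destroyed = true; // stands in for resource_destroy
}

// Hypothetical query object that keeps a reference to its backing buffer.
struct Query {
    Resource *res = nullptr;
};

static void begin_query(Query &q, Resource &buf)
{
    resource_ref(buf); // each allocation takes a reference
    q.res = &buf;
}

// The essence of the fix: destroying the query drops its reference.
static void destroy_query(Query &q)
{
    if (q.res) {
        resource_unref(*q.res);
        q.res = nullptr;
    }
}
```

Without the unref in destroy_query, the count stays at 128 after the uploader releases its reference and the buffer is never destroyed.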
avoids errors seen when building on OpenBSD/amd64
../src/amd/compiler/aco_instruction_selection.cpp:1677:62: error: ambiguous conversion for functional-style cast from 'unsigned long' to 'aco::Operand'
bld.vop3(aco_opcode::v_mul_f64, Definition(dst), Operand(0x3FF0000000000000lu), tmp);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
glibc uses unsigned long for uint64_t on LP64 archs and unsigned long long for
uint64_t on ILP32 archs. On OpenBSD unsigned long long is used for uint64_t
on all archs.
The Operand constructors take uint8_t, uint16_t, uint32_t and uint64_t.
Use UINT64_C so that the lu or llu suffix is used as needed.
Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7944>
(cherry picked from commit ebfb9e1817)
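A minimal sketch of why UINT64_C resolves the ambiguity in the commit above. The Operand struct below is a hypothetical stand-in with the same four constructor widths, not aco's class. A literal with a hard-coded lu suffix has type unsigned long, which is not uint64_t on platforms where uint64_t is unsigned long long, so no constructor matches exactly and the call is ambiguous.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class with one constructor per fixed-width
// unsigned type; 'bits' records which overload was selected.
struct Operand {
    unsigned bits;
    explicit Operand(uint8_t)  : bits(8) {}
    explicit Operand(uint16_t) : bits(16) {}
    explicit Operand(uint32_t) : bits(32) {}
    explicit Operand(uint64_t) : bits(64) {}
};

// UINT64_C appends whichever suffix the platform needs, so the literal
// matches the 64-bit constructor on the usual platforms where
// uint_least64_t and uint64_t are the same type.
static Operand one_as_f64_bits()
{
    Operand op(UINT64_C(0x3FF0000000000000)); // bit pattern of the double 1.0
    assert(op.bits == 64);
    return op;
}
```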