We use a different llvm type for f16 values depending on whether the cpu
supports f16c instructions.
The reasoning behind this was that we really couldn't do anything with f16
values and had to cast them to some int type anyway, plus IIRC originally
this actually predates llvm even supporting a half type in the first place
(or if it did, at the very least it was not able to do anything useful with
it).
There are now bugs with lavapipe when the cpu doesn't support f16c. While we
don't expose f16 capabilities in this case, we can still hit f16 conversion
functions for the likes of unpack2x16float and quantizeToF16, and there we
just call fpext/fptrunc directly without touching our own half conversion
code. (I believe our own code might still be faster, as llvm de-vectorizes
the conversion if the cpu doesn't support it, but don't quote me on that; it
could depend on the llvm version. For trunc the rounding is also different,
since our own functions implement rounding according to d3d10 requirements,
mostly used for f16 render targets.)
This only seems to be a problem for vulkan, not GL, since glsl has its own
lowering pass if the half float packing instructions aren't supported by the
driver.
Ideally we'd fix this by always using the llvm half type for f16, but not
all llvm backends can handle it yet.
So instead do some hacky bitcasts around the fpext/fptrunc calls with f16,
which works on x86 even when f16c is not supported. Other llvm backends not
really supporting halfs will still crash there as before (albeit it should
be a "cleaner" crash as the IR is now correct...), but at least this keeps
them running for more ordinary things such as f16 texture sampling / render
targets (which they wouldn't if we used the llvm half type everywhere).
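For reference, here is a scalar C sketch of what widening a half to a float
has to compute when the f16 value is carried around as a 16-bit integer. It
is not the gallivm code and not what llvm emits for fpext; the function name
is made up for illustration, and only the widening direction is covered
(which is exact, so no rounding mode is involved).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert an IEEE 754 binary16 bit pattern to float. */
static float sketch_half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                      /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit bit appears. */
            uint32_t shift = 0;
            while (!(mant & 0x400u)) {
                mant <<= 1;
                shift++;
            }
            mant &= 0x3ffu;
            bits = sign | ((127 - 15 + 1 - shift) << 23) | (mant << 13);
        }
    } else if (exp == 0x1f) {
        bits = sign | 0x7f800000u | (mant << 13); /* inf or NaN */
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);              /* bitcast i32 -> float */
    return f;
}
```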
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13807
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13865
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37344>
radv_cmd_buffer_upload_alloc_aligned is used with alignment=0, which
guarantees that the alignment is at least 4.
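As an illustration only, one way such a guarantee could look is a minimum
clamp applied before aligning the upload offset. The helpers below are
hypothetical and are not the radv implementation.

```c
#include <stdint.h>

/* Hypothetical: treat any requested alignment below 4 (including 0) as 4. */
static uint32_t sketch_effective_alignment(uint32_t alignment)
{
    return alignment > 4 ? alignment : 4;
}

/* Hypothetical: round an upload offset up to the effective alignment. */
static uint32_t sketch_align_upload_offset(uint32_t offset, uint32_t alignment)
{
    uint32_t a = sketch_effective_alignment(alignment);
    return (offset + a - 1) / a * a;
}
```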
Fixes: 9e16ed7a13 - ac/nir: switch nir_load_smem_amd uses to ac_nir_load_smem wrapper
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37345>
previously this only checked to see if dst was bound, but that is not
the only condition in which clears may be flushed, and triggering a clear
flush while blitting will not set image layouts, which means that a renderpass
could be illegally triggered on an UNDEFINED image (even though it wouldn't be used)
instead, do a much more thorough check to determine whether clears can actually be
stored with the expectation that they will otherwise be flushed
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37467>
this can also be caused by winsys and api being out of sync, e.g.,
if the window resize is lagging behind the framebuffer resize
in this case, just use level 0 and assume things will be okay
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37467>
Ensures max_subgroup_size is set to the subgroupSize physical device
property on drivers that don't support VK_EXT_shader_object,
VK_EXT_subgroup_size_control, or Vulkan 1.3.
Fixes: d807f5a351 ("vulkan: set nir subgroup size shader info")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37477>
Bind queue is a queue of sparse resource memory bind operations.
It binds memory to sparse resources. It doesn't map to any
particular kernel object. The queue is equipped with an internal
syncobj to implement PANVK_DEBUG=sync.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
Sparse resources (buffers and images) own their address ranges, so
when creating a sparse resource we allocate an address range large
enough to fit the resource. The address range has to start and end
at page boundaries, as that's the memory mapping granularity.
At destruction time, we unmap all memory mapped within the range,
as Vulkan doesn't require the user to unmap memory themselves and
we don't want to retain references to BOs for longer than
necessary. Finally, we free the address range that was previously
owned by the resource.
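A minimal sketch of the page rounding described above, assuming a 4 KiB
page size and a power-of-two granularity. The names and the page size are
assumptions for illustration, not the panvk code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ull /* assumed mapping granularity */

/* Round v up to a multiple of a; a must be a power of two. */
static uint64_t sketch_align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Size of the address range reserved for a sparse resource, so that the
 * range ends on a page boundary when it starts on one. */
static uint64_t sketch_sparse_range_size(uint64_t resource_size)
{
    return sketch_align_up(resource_size, SKETCH_PAGE_SIZE);
}
```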
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
This flag is intended to be used by sparse resource creation. When
set, sparse resources will always start off mapped to blackhole,
regardless of whether they were created with the SPARSE_RESIDENCY flag
set or not.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
We want to be able to survive accesses to regions of sparse resources that
are unmapped from Vulkan's perspective. For that we allocate a single
page-sized bo, which we'll use to implement sparse unmapping by mapping the
address range to this bo.
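The idea can be sketched with a toy page table in which "unmapping" a range
points every page in it at the same blackhole bo. Everything here (the
struct, the handle value, the page size) is an assumption for illustration
and does not reflect the actual panvk or kernel interfaces.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE    4096ull /* assumed page size */
#define SKETCH_NUM_PAGES    16
#define SKETCH_BLACKHOLE_BO 0u      /* hypothetical handle of the dummy bo */

/* Toy VM: one bo handle per page of the resource's address range. */
struct sketch_vm {
    uint32_t page_to_bo[SKETCH_NUM_PAGES];
};

/* "Unmap" [offset, offset + size) by aliasing every page to the single
 * blackhole bo. offset and size are assumed to be page aligned and to lie
 * within the toy range. */
static void sketch_map_to_blackhole(struct sketch_vm *vm,
                                    uint64_t offset, uint64_t size)
{
    uint64_t first = offset / SKETCH_PAGE_SIZE;
    uint64_t end = (offset + size) / SKETCH_PAGE_SIZE;

    for (uint64_t page = first; page < end; page++)
        vm->page_to_bo[page] = SKETCH_BLACKHOLE_BO;
}
```

Since every unmapped page aliases the same bo, accesses land in valid
memory whose contents are simply not meaningful.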
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
Since the kmsro option was removed, it is now just built together with a
list of gallium OpenGL drivers that require it.
On a Vulkan-only build with zink for OpenGL, kmsro is still required for
some wsi paths on those platforms, but it is no longer possible to
explicitly enable it without a gallium OpenGL driver to pull it in.
This enables kmsro when zink is enabled to allow the Vulkan-only use
case on those platforms.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37459>
The spec requires these to be decorated as FLAT,
but some apps forgot to set that,
e.g. old DXVK before d12a8e09a855.
Let's unconditionally decorate these FS inputs as FLAT
in spirv_to_nir, we can do that for free and prevent those
apps from crashing RADV.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33205>
This commit keeps vkcts as a nightly job, but it puts us within shooting
distance of what we've been working towards for the past 2.5 years!
We will flip the switch to making this job part of the merge pipeline
after a week of stress testing to make sure reliability issues,
especially around USB, don't come back to haunt my days and nights.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37367>
Usually wave64 performs better for fragment shaders,
because LDS sharing for interpolation is better.
But the rt traversal loop divergence is likely high enough to make
wave32 better on GFX10.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37360>
If the application really thinks it needs pswave32, let it use it.
Fragment shaders also have no concept of full subgroups, so the existing
code that chooses the subgroup size will work already.
For pre-raster stages, we cannot allow this because of potential mismatches
in merged stages.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37360>
Allow users to set an environment variable to influence JM context slot
priorities.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>
A Panfrost JM context can be created by leveraging the new Panfrost 1.5
KM IOCTLs and translating Mesa pipe resource priority levels into
Panfrost-specific ones.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>
Priority values will be used later on when creating job contexts.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>
When writing to a depth texture, the driver first does a decompress blit
to a staging resource. On the one hand, this blit can be skipped if
PIPE_MAP_DISCARD_WHOLE_RESOURCE is set; on the other hand, we need to clear
the PIPE_MAP_UNSYNCHRONIZED flag if a partial write is done, because we
have to wait until the blit is finished.
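The decision can be sketched as follows. The flag values and names below
are stand-ins defined locally for illustration (the real PIPE_MAP_* values
live in gallium and differ), and treating "no DISCARD_WHOLE_RESOURCE" as a
partial write is an assumption of this sketch, not the actual r600 code.

```c
#include <stdbool.h>

/* Stand-in flags for illustration only; not the real gallium values. */
enum {
    SKETCH_MAP_WRITE                  = 1u << 0,
    SKETCH_MAP_UNSYNCHRONIZED         = 1u << 1,
    SKETCH_MAP_DISCARD_WHOLE_RESOURCE = 1u << 2,
};

struct sketch_map_plan {
    unsigned usage;            /* possibly adjusted map flags */
    bool need_decompress_blit; /* blit into the staging resource first? */
};

static struct sketch_map_plan sketch_plan_depth_map(unsigned usage)
{
    struct sketch_map_plan plan = { usage, true };

    if (usage & SKETCH_MAP_DISCARD_WHOLE_RESOURCE) {
        /* Old contents are irrelevant, so nothing needs decompressing. */
        plan.need_decompress_blit = false;
    } else {
        /* Partial write: the blit must complete before the CPU writes,
         * so the map can no longer be unsynchronized. */
        plan.usage &= ~(unsigned)SKETCH_MAP_UNSYNCHRONIZED;
    }
    return plan;
}
```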
v2: Update the patch with a more targeted approach.
Fixes: 25b97a3a96 ("mesa/st: mark internal texture map calls as UNSYNCHRONIZED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13916
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37457>
With that we can easily add a restriction to the not + flt -> fge
optimization to handle NaNs like it was done before.
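Why NaNs matter here can be shown in plain C: every ordered comparison with
a NaN operand is false, so not(a < b) and (a >= b) disagree as soon as one
operand is NaN. The helper names are made up for this demonstration.

```c
#include <math.h>
#include <stdbool.h>

/* The unoptimized form: negate an ordered less-than. */
static bool sketch_not_flt(float a, float b)
{
    return !(a < b);
}

/* The candidate replacement: ordered greater-or-equal. */
static bool sketch_fge(float a, float b)
{
    return a >= b;
}
```

For ordinary numbers both forms agree, but with a NaN operand
sketch_not_flt returns true while sketch_fge returns false, so the rewrite
is only valid when NaNs can be ruled out.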
Fixes: 51d8ca2dff ("r600/sfn: optimize comparison results")
v2: use SPDX license identifier (austriancoder)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37450>