For example, the FS may write gl_SampleMask while color writes are
masked out and there is no depth attachment.
Note that the proprietary driver still considers more state when
disabling the FS, such as the depth test being disabled, and thus
disables the FS in cases where we do not. However, I think that is
too much of a stretch unless we find some real workload needing it.
This change also allows disabling an FS that has discard.
This requires being careful around occlusion queries, since when one
is enabled, we cannot disable an FS that can discard.
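
A minimal sketch of the decision described above. The commit message does not list the driver's full set of conditions, so every field and function name here is hypothetical, and the checks only cover the cases mentioned in the text:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FsState:
    # All field names are assumptions made for this sketch.
    has_color_writes: bool              # some color write mask is non-zero
    has_depth_stencil_attachment: bool  # a depth/stencil attachment is bound
    has_side_effects: bool              # e.g. image/buffer stores or atomics
    can_discard: bool                   # the FS may discard fragments
    occlusion_query_active: bool        # an occlusion query is counting samples


def can_disable_fs(state: FsState) -> bool:
    """Return True when running the FS cannot change any observable result."""
    if state.has_color_writes or state.has_depth_stencil_attachment:
        return False
    if state.has_side_effects:
        return False
    # A discarding FS changes the sample count seen by an occlusion query,
    # so it must keep running while a query is active.
    if state.can_discard and state.occlusion_query_active:
        return False
    return True
```

With color writes masked out and no depth attachment, a discarding FS is only kept alive in this sketch while an occlusion query is active.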
Found via gpu-ratemeter bench: vk.pix.noaa.output.color+z+samplemask.colormask=0
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41857>
The destriding lowering hard-coded a special case for weight_width == 5
with a fallback "+1" branch that was only correct for 3x3 kernels.
Replace it with formulas derived from TFLite's SAME-padding rule for
stride 2:
The half-resolution expansion applied to the reshuffle output and to
the strided_to_normal() input is:
    weight_width / 2
which gives 1 for 3x3, 2 for 5x5, and 3 for 7x7 kernels.
The reshuffle window start offset is:
    (weight_width + input_width % 2 - 2) / 2
This folds the previous odd-input fixup into the same expression and
preserves the existing 3x3 and 5x5 behavior while extending the
lowering to wider odd kernels such as 7x7.
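
The two formulas can be checked with a short sketch. The function names are hypothetical; `//` is used because the C expressions rely on integer division, which matches floor division for these non-negative operands:

```python
def half_res_expansion(weight_width: int) -> int:
    """Expansion applied to the reshuffle output and strided_to_normal() input."""
    return weight_width // 2


def reshuffle_window_start(weight_width: int, input_width: int) -> int:
    """Start offset of the reshuffle window; odd inputs shift it by one."""
    return (weight_width + input_width % 2 - 2) // 2


# Print the values for the kernel widths named in the commit message,
# with one even and one odd example input width.
for k in (3, 5, 7):
    print(k, half_res_expansion(k),
          reshuffle_window_start(k, 224), reshuffle_window_start(k, 225))
```

For an even input width the start offset is 0, 1, 2 for 3x3, 5x5, 7x7 kernels, and an odd input width adds one in each case.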
Fixes Models.Op/inception_000, which uses Inception V1's Conv2d_1a_7x7,
in the Teflon test suite.
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41774>
This creates the BO with AMDGPU_GEM_CREATE_NO_CPU_ACCESS for buffers
that we don't map.
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41850>
Add a buffer create function that takes PIPE_RESOURCE_FLAG_* flags.
Disable suballocation for all buffers on UVD/VCE without VM support.
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41850>
VkBindMemoryStatus is using a pointer to VkResult but the value cannot
be correctly encoded and decoded with the current code generator. Until
the issues are fixed, the extension should not be used as it'll cause
CTS failures and invalid behavior.
Test: dEQP-VK.memory.binding.maintenance6.*
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41893>
Ordering of the extensions was affecting the codegen and some structures
were missing due to errors during codegen. One example is the custom
border color structure for the samplers, due to the reference from new
vkRegisterCustomBorderColorEXT function that's introduced with a
different extension VK_EXT_descriptor_heap. This CL adds a sorting
mechanism to generate code for supported extensions first to ensure
deepcopy and transform functions are created correctly.
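
One way such an ordering could look, as a sketch only: the function name and the example inputs are assumptions, not the code generator's actual interface. It relies on Python's stable sort so that the original relative order is kept within each group:

```python
def order_for_codegen(extensions, supported):
    """Return extensions with supported ones first, otherwise in original order."""
    # The key is False (sorts first) for supported extensions and True for
    # the rest; sorted() is stable, so ties keep their input order.
    return sorted(extensions, key=lambda name: name not in supported)


ordered = order_for_codegen(
    ["VK_EXT_descriptor_heap", "VK_EXT_custom_border_color", "VK_KHR_swapchain"],
    supported={"VK_EXT_custom_border_color", "VK_KHR_swapchain"},
)
print(ordered)
```

Here the unsupported extension is processed last, so structures it references are generated from the supported extensions first.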
Test: dEQP-VK.pipeline.*
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41893>
Fix
"dEQP-VK.api.copy_and_blit.*.image_to_image.all_formats.color.2d_to_1d.*.e5b9g9r9_ufloat_pack32.*"
on HK.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 5f5f4474f6 ("nir: Add a format unpack helper and tests")
Reviewed-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41929>
Unlike BitSet, which is backed by a Vec<u32>, this is backed by a
fixed-length array and is therefore Copy. It's also mostly const so it can
be constructed and used from const contexts. Because of the const
rules, it's a bit more rigid and can only really accept keys which are
unsigned integer types.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915>
Checking completion alone disregards submit_count, which is used to
determine the validity of any existing usage pointer. This could lead to
large numbers of bos with stale usage and infinite memory ballooning.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41936>
Given the relative cost of the extra syscall and kmalloc for the name
versus actually allocating pages, we can just always do this and give a
better debugging experience by default. We expect infrequent memory
allocation on Vulkan, anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41878>
To use the common function, this gives up the warning about the
memory being too small to meet the Vulkan spec for low end
devices.
Note: the common helper exposes 25% for devices with <=1GiB, but
to adhere to the Vulkan spec, the value is clamped to 1GiB.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
The budget calculation has changed slightly as the budget scaling
is applied prior to adding the used up heap memory.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
This also changes `video_memory` to use the heuristic instead of
the 10%, consistent with `max_mem_alloc_size`.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
The budget calculation has changed slightly as the budget scaling
is applied prior to adding the used up heap memory.
This also introduces a new tier since the common helper exposes
25% of memory as heap on devices with <=1GiB memory. Previously
50% was being used.
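
A sketch of the tiering, for illustration. Only the 25% tier for devices with <=1GiB and the previous 50% come from this message; the 4GiB threshold and the 75% tier above it are assumptions based on the heuristic commonly used by Mesa Vulkan drivers, and the function name is hypothetical:

```python
GIB = 1 << 30


def heap_size_sketch(total_ram: int) -> int:
    """Pick the heap size exposed to applications from total system RAM."""
    if total_ram <= 1 * GIB:
        return total_ram // 4      # new tier: 25% (previously 50%)
    if total_ram <= 4 * GIB:
        return total_ram // 2      # assumed tier: 50%
    return total_ram * 3 // 4      # assumed tier: 75%
```

Under these assumptions a 1GiB device goes from a 512MiB heap to a 256MiB heap, while larger devices are unaffected.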
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
Also remove the clamping to va_size in the budget calculation
since the heap_size is already clamped to va_size and the budget
is clamped to heap_size.
This also introduces a new tier since the common helper exposes
25% of memory as heap on devices with <=1GiB memory. Previously
50% was being used.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
This also introduces a new tier since the common helper exposes
25% of memory as heap on devices with <=1GiB memory. Previously
50% was being used.
This also fixes `device->heap_used` not being read atomically.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
The budget calculation has changed slightly as the budget scaling
is applied prior to adding the used up heap memory.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
Some drivers scale the available memory proportionally to the
advertised heap memory. The `heap_memory_percent` driconf option
allows tweaking the percentage of system memory exposed as heap
memory, so drivers supporting this also need to scale their
budgets accordingly. So add `vk_gpu_heap_budget_from_system()`.
Some drivers just clamp the available memory to the heap size. This
is accounted for by having the `scale_with_heap` parameter.
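
The two behaviours can be sketched as follows. This is not the signature of `vk_gpu_heap_budget_from_system()`; the function and parameter names below are hypothetical, and the sketch assumes the scaling factor is heap_size / total system memory and that the final budget is clamped to the heap size:

```python
def heap_budget_sketch(heap_size: int, heap_used: int,
                       sys_total: int, sys_available: int,
                       scale_with_heap: bool) -> int:
    """Estimate a heap budget from system memory figures (all in bytes)."""
    if scale_with_heap:
        # Scale available system memory by the fraction of system memory
        # that is advertised as heap.
        available = sys_available * heap_size // sys_total
    else:
        # Drivers that do not scale only clamp to the heap size.
        available = min(sys_available, heap_size)
    # The memory already used by this heap is added after scaling,
    # and the result never exceeds the heap size.
    return min(available + heap_used, heap_size)
```

Adding `heap_used` after scaling means memory the application already holds is counted in full rather than being scaled down with the rest.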
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
Also adds a helper function to be used by drivers.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41242>
When the image view creation code was moved into the init_sampler_view()
function, the check for Z/S aspect bits was accidentally left out
(originally it was a big if gating a lot of code).
When the driver doesn't have needs_zs_shader_swizzle set, this is not
problematic, because the condition for creating Z/S view is to have only
Z aspect; however the needs_zs_shader_swizzle case now fails because Z/S
views are now created for color images.
Fix the issue by re-adding the Z/S aspect check before checking the
needs_zs_shader_swizzle flag.
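
A sketch of the check order described above. The function name is hypothetical and the logic is reconstructed from this message rather than taken from the driver; the aspect values match VkImageAspectFlagBits:

```python
ASPECT_COLOR = 0x1    # VK_IMAGE_ASPECT_COLOR_BIT
ASPECT_DEPTH = 0x2    # VK_IMAGE_ASPECT_DEPTH_BIT
ASPECT_STENCIL = 0x4  # VK_IMAGE_ASPECT_STENCIL_BIT


def wants_zs_view(aspect_mask: int, needs_zs_shader_swizzle: bool) -> bool:
    """Decide whether a separate Z/S view should be created for a sampler view."""
    # Re-added guard: images without any depth/stencil aspect never get one.
    if not aspect_mask & (ASPECT_DEPTH | ASPECT_STENCIL):
        return False
    if needs_zs_shader_swizzle:
        return True
    # Without the flag, only depth-only views qualify.
    return aspect_mask == ASPECT_DEPTH
```

Without the guard, the `needs_zs_shader_swizzle` branch would return True for color images, which is the failure described above.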
Fixes: cafa22142b ("zink: create views for samplers lazily")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41923>