This verifies that the requested allocation doesn't exceed the maximum
in cases where the size passed to `clSVMAlloc()` isn't a multiple of the
provided alignment. It also clamps the maximum allocation to `i32::MAX`,
which prevents overflowing `pipe_box`'s `width` field.
Both of these changes prevent possible undefined behavior on 32-bit
systems due to violation of `Layout` prerequisites.
v2: use safe layout creation for maintainability, add a few comments
v3: use Layout utils for aligned size calc, split out max alloc changes
v4: use `checked_compare()` for alloc/size comparison
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34166>
This uses the helpers from the previous patch to calculate how many
attachments and MRT buffers we have space for.
In the case where we can support more MSAA samples for smaller formats,
we also add support for that.
The flaking test seems to be due to a CTS issue, see this ticket for
details:
https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5651
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
In order to enable higher MSAA modes, we're going to have to perform
some calculations on how to budget the (sometimes) limited tile-buffer
space.
Due to limited tilebuffer space, we need to prioritize a bit here.
First, we reserve space for 4x MSAA for all formats. Then we try to fit
8 color attachments into the tile-buffer. And then finally, we calculate
how many extra multi-sample buffers we can fit into the rest.
The reason we reserve 4x MSAA first, is that this is required by all
Vulkan versions. It also prevents us from regressing existing features.
Then we try to pick 8 color attachments next, because that's required by
Vulkan 1.4 as well as Vulkan Roadmap 2024 and D3D12. Vulkan Roadmap 2022
requires 7 as well.
This adds helpers that implements this, which can be used by both the
Gallium and the Vulkan driver. It's really benefitial if both of these
drivers prioritize the same way here.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
On v5, as well as v7 onwards, we can disable pipelining in order to fit
more data into the tile-memory. This is important in order to support
multiple, large color buffers with high MSAA sample counts.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
We also have a budget for the tile size for depth-buffers. It's
currently hard to trigger issues with this than for color-buffers,
but this becomes important when we support larger MSAA counts.
We also need to take a bit of care for stencil-only attachments, because
they also count against a limit here. We really only care about the
sample counts here, because the stencil buffer budget is always a
quarter of the depth-buffer budget, and always uses a single byte per
sample.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
There's two limitations we have to cater to:
1. The HW needs at least one render-target. We can disable write-back for
it, but it needs to allocate tile-buffer space for it.
2. The HW can't have "holes" in the render-targets.
In both of those cases, we already set up dummy RGBA8 UNORM as the format,
and disable write-back. But we forgot to take this into account when
calculating the tile buffer allocation.
This makes what we program the HW to do consistent, meaning we don't end
up smashing the tile-buffer space. We might be able to do something
better by adjusting how we program these buffers, but let's leave that
for later.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
These seems to crash on CI, not timeout. And the stencil.samples_1
variant is already present in the fails file, so let's remove the
duplicate.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
These GPUs had their tilebuffer sizes listed at twice their actual
values. While that still works, it ends up disabling pipelining in some
cases. This gives a significant performance hit, compared to using the
correct values.
But, it turns out to be hard or impossible to trigger at the moment, due
to the limited number of MSAA samples we support. Once that changes,
this is a lot easier to trigger, so let's fix it up.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
This is an n-queen pattern, where no two values should be on the same
row or column. But this and the second to last element has the same y
component, and neither has the negative one.
Let's fix this up by setting the first value to the negative value. This
matches the D3D 16x sample pattern.
Fixes: a61fb62966 ("panfrost: Upload sample positions on device init")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
The tests skipped in x11-skips.txt and gbm-skips.txt were identical,
so consolidate them into the common all-skips.txt file.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
This extension requires vulkan 1.1. Fixes
dEQP-VK.api.info.extension_core_versions.extension_core_versions on
bifrost.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 22fa3e88dd ("panvk: advertise VK_KHR_float_controls2")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34463>
8-bit loads are already supported by bi_emit_load_ubo and
bi_emit_load_push_constant, so the only necessary changes were fixing
swizzle lowering issues uncovered by these CTS tests.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
For most swizzled instructions that are different between valhall and
bifrost, valhall allows more values than bifrost does, so we can avoid
some unnecessary lowering.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
Fixes two known issues:
- We did not lower invalid swizzles for IADD.v4s8, triggered in the CTS by
enabling uniformAndStorageBuffer8BitAccess and storageBuffer8BitAccess in
panvk.
- We did not lower invalid swizzles for IMUL.v4i8, triggered by
dEQP-VK.spirv_assembly.instruction.compute.mul_extended.(un)signed_8bit
on bifrost.
The old logic was missing several other instructions, so there may be
additional bugs that we don't know about.
There are no cases where the new behavior will keep swizzles that would
have been lowered previously, so this change should not introduce any
new bugs with valhall.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
Previously we were using 'swizzle', with special handling in va_pack.
This does not work if we want to use va_src_info to determine allowed
swizzles in bi_lower_swizzle. The allowed set of swizzle values for
'lane' is correct for this instruction.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
This makes codegen using bifrost/ISA.xml swizzle values simpler because
we don't need to special-case the values that we don't emit.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
Primary reason to do this is to make codegen using the swizzle names in
bifrost/ISA.xml simpler. A secondary benefit is that dependent code can
now use the swizzle name that matches the context, making things a
little more readable.
We may want to consider giving widens separate values later, so that
va_lower_constants and bi_opt_constant_fold can fold them correctly, but
I don't know of current bugs caused by this.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
This is all supported by the common nir code, no changes needed on our
end.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
Valhall supports all combinations of ftz/preserve denorm behavior
between FP16 and FP32 except FP16=ftz, FP32=preserve. Because of this,
we can't advertise independent denorm behavior.
Even with INDEPENDENCE_NONE, it is still possible for shaders to set
denorm behavior for one size and leave the other size unspecified.
Previously we were defaulting to preserve for any unspecified size, but
with FP16=ftz, we need to default unspecified FP32 to preserve.
When advertising INDEPENDENCE_NONE, the CTS checks that the
shaderDenormFlushToZeroFloat* and shaderDenormPreserveFloat* features
are equal for all sizes, so we need to advertise the same supported
denorm behavior for FP64 even though we don't support FP64 at all.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
On bifrost independent float controls are implementable, just
potentially expensive because it requires scheduling FP16 and FP32
instructions in separate clauses.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
This allows more efficient scheduling by putting a 16-bit int
instruction in the same clause as a 32-bit float instruction even when
the 16-bit and 32-bit float controls are different.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
The current behavior is identical, but we can express that some
instructions may be packed in either FTZ and no-FTZ clauses in the
future.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
Using 'FADD.f32 x, +0' for f32->f16 conversions strips signed zero,
which we can't do if we advertise shaderSignedZeroInfNanPreserveFloat16.
Adding -0 instead preserves the original sign.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: b63ef74e73 ("pan/bi: Stop using V2F32_TO_V2F16 on Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
Many float instructions do not have a rounding mode modifier, but all of
the operations that are listed as requiring correct rounding in the
vulkan spec are supported in hardware.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
These are needed to implement VK_KHR_shader_float_controls rounding
mode.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>
With vkCmdPipelineBarrier, it's possible to specify a barrier with
pipeline stages but without any memory barriers. These might not be
practical, but are legal Vulkan code.
Barriers like this are currently ignored in mesa, as we only convert
barriers with passed memory barriers into vkCmdPipelineBarrier2.
This commit adds handling of execution only barriers by converting them
into a memory barrier without access masks.
Fixes: 97f0a4494b ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34187>
Fix the nginx cache snippets - I'd missed the file nesting somehow.
Tested on a debian:bookworm image with nginx-full installed, checked
that we could pull an arbitrary external site, as well as S3, as well
as GitLab artifacts.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34341>
Get flush_id once per command buffer in the submit and use it for all
subqueues instead of getting a new flush_id for every subqueue.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34448>
Fixes some upcoming CTS tests for texture clears.
* some drivers will attempt to issue clears with zero range
and hit asserts/crashes (spec clarification for negative
values)
* fix error thrown with negative values to match spec
* fix cases for clearing generic compressed formats
* fix negative case of using color format while having
depth/stencil internalformat and vice versa
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>