It used the TGSI MAD opcode to implement the old ffma, but this one lets
the driver choose to implement it as unfused or fused, so ffma_weak is a
better fit than ffma or fmad.
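
For context, a minimal sketch of why fused and unfused multiply-add are
not interchangeable. The helper names are hypothetical and this is not
NIR or driver code. The "fused" helper evaluates in double, which gives
the singly-rounded result for the inputs used here; fmaf() from <math.h>
would be the standard way to request a real fused operation.

```c
#include <assert.h>

/* Unfused: the product is rounded to float before the add. The volatile
 * store keeps the compiler from contracting the expression into an FMA. */
static float mad_unfused(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

/* Fused reference (illustration only): a float*float product is exact in
 * double, so for the inputs below only the final conversion rounds. */
static float mad_fused_reference(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}

/* a*a = 1 + 2^-11 + 2^-24 exactly. Rounded to float that is 1 + 2^-11,
 * so the unfused form cancels to zero while the fused form keeps 2^-24. */
static void check_mad_example(void)
{
    const float a = 0x1.001p0f;   /* 1 + 2^-12 */
    const float c = -0x1.002p0f;  /* -(1 + 2^-11) */

    assert(mad_unfused(a, a, c) == 0.0f);
    assert(mad_fused_reference(a, a, c) == 0x1p-24f);
}
```

An opcode that permits either result, as described above for ffma_weak,
leaves that choice to the driver.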
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
The buffer copy path needs to account for depth/stencil aspect copies
to compatible color formats. It also must account for differing image
subresource ranges; for example the depth/stencil <-> color tests copy
from level 0 to level 3.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
Re-uses the unroll logic to create a padded index buffer. We only need to
handle it when the bound range is a subset of the buffer.
This could be optimized, particularly for the indirect case, by using
a separate shader program which creates an indirect command buffer
to issue the draw command. However, Metal 4 provides a command for
indexed draws that takes a GPU address and length, and is documented
to provide the robustness guarantees we need, so this solution is
only needed short-term, until we switch to Metal 4 encoders.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
`vk_image_init` sanitizes the create info parameters for us, so we should
use the sanitized values instead.
For example, `maintenance5` blit-with-remaining-layers tests pass the layer
count in `depth` and later pull it out inside the test to set the `arrayLayers`.
They do not reset `depth` to 1, but `vk_image_create` sanitizes it for us.
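
A minimal sketch of the kind of extent sanitization meant here. The
types and the function name are hypothetical stand-ins, not the actual
Vulkan runtime code: dimensions that do not apply to the image type are
forced to 1, so a stale `depth` on a 2D image is harmless.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkImageType and VkExtent3D. */
enum image_type { IMAGE_TYPE_1D, IMAGE_TYPE_2D, IMAGE_TYPE_3D };
struct extent3d { uint32_t width, height, depth; };

/* Force the dimensions that are meaningless for this image type to 1. */
static struct extent3d sanitize_extent(enum image_type type, struct extent3d e)
{
    switch (type) {
    case IMAGE_TYPE_1D:
        e.height = 1;
        e.depth = 1;
        break;
    case IMAGE_TYPE_2D:
        e.depth = 1;
        break;
    case IMAGE_TYPE_3D:
        break;
    }
    return e;
}

/* A 2D image created with the layer count left in depth. */
static void check_sanitize_example(void)
{
    struct extent3d raw = { 64, 64, 6 };
    struct extent3d clean = sanitize_extent(IMAGE_TYPE_2D, raw);

    assert(clean.width == 64 && clean.height == 64 && clean.depth == 1);
}
```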
While we're here, also remove the extra stencil plane inherited from the
original NVK code, as we don't use it.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
Setting layer_offset does not cause any issues, but the GPU ignores it.
Remove it for correctness.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41645>
The Android extension enables the driver to blit from single-sampled
color attachments.
Adding this image usage expresses that functionality and causes anv to
generate the ISL_FORMAT_RAW-formatted clear color during fast-clears.
This fixes an assert failure when anv tries to override the clear color
format used for a blorp_blit() call to ISL_FORMAT_RAW.
There are other ways to handle this, but this solution is consistent
with our handling of multisample images (which may be resolved as well).
Fixes: 465c186fc5 ("anv: Prepare for format width changes in blorp_copy()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15463
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41650>
This is now dead code. This code is not used by RADV.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
We always use sample_id to load the sample position from memory.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
No change in behavior other than getting pixel_coord from the beginning
unconditionally.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
FYI, ac_nir_lower_ps_early is only used by radeonsi.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
The scalarBlockLayout feature was already exposed via the Vulkan 1.2
feature struct, but Vulkan 1.1 clients (e.g. Dawn) need the EXT to
discover it.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41673>
B10G11R11_UFLOAT_PACK32 maps to V3D_INTERNAL_TYPE_16F on the TLB,
which canonicalizes NaN bit patterns when arbitrary 32 bits are
reinterpreted as that format. The same canonicalization happens in
the blit shader when sampling a B10G11R11 source. Both break the
bit-exactness that vkCmdCopyImage, vkCmdCopyImageToBuffer and
vkCmdCopyBufferToImage require, since the spec defines them as raw
byte copies for any pair of texel-size compatible formats.
Fix it by aliasing the format to R32_UINT whenever B10G11R11 is
involved.
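
A minimal sketch of the aliasing idea. The enum and function names are
hypothetical and this is not the v3dv code: when either side of a copy
is B10G11R11, the copy is done through a 32-bit integer view, and an
integer copy moves the bits unchanged, NaN payloads included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format enum, only for this illustration. */
enum copy_format {
    FMT_B10G11R11_UFLOAT_PACK32,
    FMT_R8G8B8A8_UNORM,
    FMT_R32_UINT,
};

/* Pick the format used for the copy: alias to R32_UINT whenever
 * B10G11R11 is involved, otherwise keep the destination format. */
static enum copy_format copy_view_format(enum copy_format src,
                                         enum copy_format dst)
{
    if (src == FMT_B10G11R11_UFLOAT_PACK32 ||
        dst == FMT_B10G11R11_UFLOAT_PACK32)
        return FMT_R32_UINT;
    return dst;
}

/* Integer texel copy: no float interpretation, so no canonicalization. */
static void copy_texels_raw(uint32_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}

static void check_copy_example(void)
{
    /* 0x7C1: R channel (bits 0..10) with exponent all ones and a
     * non-zero mantissa, i.e. a NaN with a payload. */
    const uint32_t src[2] = { 0x000007C1u, 0xFFFFFFFFu };
    uint32_t dst[2] = { 0, 0 };

    assert(copy_view_format(FMT_B10G11R11_UFLOAT_PACK32,
                            FMT_R8G8B8A8_UNORM) == FMT_R32_UINT);
    copy_texels_raw(dst, src, 2);
    assert(dst[0] == src[0] && dst[1] == src[1]);
}
```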
This fixes dEQP-VK.api.copy_and_blit.*b10g11r11*,
dEQP-VK.image.subresource_layout.*b10g11r11* and
dEQP-VK.api.image_clearing.*b10g11r11* failures on V3D 7.1.7 (rpi5)
and V3D 4.2 (rpi4).
Assisted-by: Claude Opus 4.7
Cc: mesa-stable
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41599>
The HW can do up to UINT32_MAX but we're using that value to signal
indirect dispatch arguments.
A game like Resident Evil Requiem will use more than 64k on the X
dimension.
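
A minimal sketch of the constraint, with hypothetical names and not the
anv code: because UINT32_MAX is reserved to mean "indirect", the largest
group count a direct dispatch can carry is one below it. The exact limit
advertised by this change is not stated here, so the value below is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: group count value reserved to flag an indirect
 * dispatch. */
#define INDIRECT_DISPATCH_SENTINEL UINT32_MAX

/* Largest count that cannot be confused with the sentinel (assumed). */
static uint32_t max_direct_group_count(void)
{
    return INDIRECT_DISPATCH_SENTINEL - 1;
}

/* A direct dispatch count is usable only if it is not the sentinel. */
static bool is_valid_direct_group_count(uint32_t count)
{
    return count != INDIRECT_DISPATCH_SENTINEL;
}

static void check_dispatch_example(void)
{
    assert(is_valid_direct_group_count(100000));  /* above 64k is fine */
    assert(is_valid_direct_group_count(max_direct_group_count()));
    assert(!is_valid_direct_group_count(UINT32_MAX));
}
```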
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41592>