ac already handles advertising fmad and ffma support, so simply decide
when nir should fuse, because radeonsi, unlike radv, doesn't let aco do
all the fusing itself.
Also unset splitting for force_use_fma32 handling.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
This advertises both ffma and fmad on a couple of chipsets to support
VK_KHR_shader_fma and to improve OpenCL fma performance, where fma is not
optional and the emulation is more expensive than slow fma.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
It used the TGSI MAD opcode to implement the old ffma, but that opcode lets
the driver choose between an unfused and a fused implementation, so
ffma_weak is a better fit than ffma or fmad.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
The buffer copy path needs to account for depth/stencil aspect copies
to compatible color formats. It also must account for differing image
subresource ranges; for example the depth/stencil <-> color tests copy
from level 0 to level 3.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
Re-use the unroll logic to create a padded index buffer. We only need to
handle it when the bound range is a subset of the buffer.
This could be optimized, particularly for the indirect case, by using
a separate shader program which creates an indirect command buffer
to issue the draw command. However, Metal 4 provides a command for
indexed draws that takes a GPU address and length, and is documented
to provide the robustness guarantees we need, so this solution is
only really needed short-term until Metal 4 encoders.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
`vk_image_init` sanitizes create info parameters for us, which we should
use instead.
For example, `maintenance5` blit-with-remaining-layers tests pass the layer
count in `depth` and later pull it out inside the test to set the `arrayLayers`.
They do not reset `depth` to 1, but `vk_image_create` sanitizes it for us.
While we're here, also remove the extra stencil plane carried over from the
original NVK code, as we don't use it.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41663>
Setting layer_offset would not cause any issues, but it's ignored by the
GPU. Remove it for correctness.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41645>
The Android extension enables the driver to blit from single-sampled
color attachments.
Adding this image usage expresses that functionality and causes anv to
generate the ISL_FORMAT_RAW-formatted clear color during fast-clears.
This fixes an assert failure when anv tries to override the clear color
format used for a blorp_blit() call to ISL_FORMAT_RAW.
There are other ways to handle this, but this solution is consistent
with our handling of multisample images (which may be resolved as well).
Fixes: 465c186fc5 ("anv: Prepare for format width changes in blorp_copy()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15463
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41650>
This is now dead code. This code is not used by RADV.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>