nvk: enable VK_KHR_shader_fma

This allows rusticl to use the native fma instructions, giving us
better OpenCL performance.

e.g. ProjectPhysX_OpenCL-Benchmark on my GA102:

FP32 0.610 -> 11.474 TFLOPs/s

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41720>
Karol Herbst 2026-05-21 11:02:41 +02:00 committed by Marge Bot
parent 22f61a4eb5
commit a9da8ec49b
3 changed files with 8 additions and 1 deletions

@@ -581,7 +581,7 @@ Khronos extensions that are not part of any Vulkan version:
 VK_KHR_shader_bfloat16 DONE (anv/gfx12.5+, radv/gfx12+, vn)
 VK_KHR_shader_clock DONE (anv, hasvk, lvp, nvk, panvk, radv, tu, vn)
 VK_KHR_shader_constant_data DONE (anv, radv)
-VK_KHR_shader_fma DONE (kk, radv, vn)
+VK_KHR_shader_fma DONE (kk, nvk, radv, vn)
 VK_KHR_shader_maximal_reconvergence DONE (anv, hk, kk, lvp, nvk, panvk/v10+, radv, vn)
 VK_KHR_shader_quad_control DONE (anv, hk, lvp, nvk, panvk/v10+, radv, vn)
 VK_KHR_shader_relaxed_extended_instruction DONE (anv, hasvk, hk, kk, lvp, nvk, panvk, pvr, radv, tu, v3dv, vn)

@@ -16,3 +16,4 @@ OpenCL 3.1 support for rusticl on asahi, iris, radeonsi, llvmpipe and zink
 VK_KHR_workgroup_memory_explicit_layout on pvr
 VK_KHR_maintenance5 on pvr
 VK_KHR_shader_fma on RADV
+VK_KHR_shader_fma on nvk

@@ -184,6 +184,7 @@ nvk_get_device_extensions(const struct nvk_instance *instance,
       .KHR_shader_float_controls = true,
       .KHR_shader_float_controls2 = true,
       .KHR_shader_float16_int8 = true,
+      .KHR_shader_fma = true,
       .KHR_shader_integer_dot_product = true,
       .KHR_shader_maximal_reconvergence = true,
       .KHR_shader_non_semantic_info = true,
@@ -772,6 +773,11 @@ nvk_get_device_features(const struct nv_device_info *info,
       .presentAtRelativeTime = true,
       .presentAtAbsoluteTime = true,
 #endif
+      /* VK_KHR_shader_fma */
+      .shaderFmaFloat16 = info->sm >= 70,
+      .shaderFmaFloat32 = true,
+      .shaderFmaFloat64 = true,
    };
 }
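
The feature hunk above advertises FP32 and FP64 fma unconditionally and gates FP16 on the SM version. The standalone sketch below mirrors only that gating so it can be tried outside Mesa. `demo_device_info`, `demo_fma_features` and `demo_fma_features_for` are hypothetical stand-ins, not the driver's real types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the one field of nv_device_info that
 * the hunk reads. */
struct demo_device_info {
   uint16_t sm; /* SM version, e.g. 70 for Volta, 86 for GA102 */
};

/* Hypothetical stand-in for the three feature bits set by the hunk. */
struct demo_fma_features {
   bool fp16;
   bool fp32;
   bool fp64;
};

/* Same conditions as the diff: FP16 fma needs SM 7.0 or newer,
 * FP32 and FP64 are always reported. */
static struct demo_fma_features
demo_fma_features_for(const struct demo_device_info *info)
{
   return (struct demo_fma_features) {
      .fp16 = info->sm >= 70,
      .fp32 = true,
      .fp64 = true,
   };
}

/* Small self-check covering one GPU on each side of the threshold. */
static void demo_self_check(void)
{
   const struct demo_device_info pascal = { .sm = 61 };
   const struct demo_device_info ga102 = { .sm = 86 };

   assert(!demo_fma_features_for(&pascal).fp16);
   assert(demo_fma_features_for(&ga102).fp16);
}
```

Under this reading, the GA102 from the commit message would report all three bits, while a pre-Volta part would report only FP32 and FP64.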