Given that the intrinsic will be CSEed at the NIR level, we don't need to
preemptively set it up at the top of the shader. No change in HSW shader-db.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
Unless I've seriously missed something, we have the Z in the payload
(which we can always request if we need access to it and it's not already
passed to us due other WM IZ settings).
total instructions in shared programs: 4408303 -> 4408186 (<.01%)
instructions in affected programs: 1164 -> 1047 (-10.05%)
total cycles in shared programs: 142485036 -> 142484566 (<.01%)
cycles in affected programs: 26820 -> 26350 (-1.75%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
NIR catches that if you're just doing something like adding two smooth
inputs, we can do the multiply once on the result instead of on each
input. BRW shader-db results:
total instructions in shared programs: 4409146 -> 4408303 (-0.02%)
instructions in affected programs: 800761 -> 799918 (-0.11%)
total cycles in shared programs: 143203198 -> 142485036 (-0.50%)
cycles in affected programs: 79081682 -> 78363520 (-0.91%)
total sends in shared programs: 363044 -> 363042 (<.01%)
sends in affected programs: 33 -> 31 (-6.06%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This moves some conversions to NIR that may get eliminated, and also
distinguishes gl_FragCoord.z/w loads at the shader info level so we don't
need to flag uses_src_depth/uses_src_w when only gl_FragCoord.xy get used
(as is typical). This reduces thread payload setup on many shaders.
Also, interestingly, blorp shaders stop reserving space for z/w despite
not putting them in the payload (since PS_EXTRA isn't filled out for z/w).
HSW shader-db is noise:
total instructions in shared programs: 9942649 -> 9942997 (<.01%)
instructions in affected programs: 143167 -> 143515 (0.24%)
total cycles in shared programs: 314768862 -> 314299112 (-0.15%)
cycles in affected programs: 62951452 -> 62481702 (-0.75%)
LOST: 44
GAINED: 26
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This will be used for representing gl_FragCoord in NIR and reducing
payload registers pushed.
HSW results:
total instructions in shared programs: 9940636 -> 9948574 (0.08%)
instructions in affected programs: 852560 -> 860498 (0.93%)
total cycles in shared programs: 314804525 -> 314900080 (0.03%)
cycles in affected programs: 39786599 -> 39882154 (0.24%)
LOST: 5
GAINED: 11
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This was leftover dead code from 4bb6e6817e ("intel: Use a system value
for gl_FragCoord") -- the sysval doesn't do any interpolation and doesn't
have sources that could use a barycentric.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This generates one extra instruction to set the rounding mode to RTE due
to f2f16_rtne in the lowering. This changes the result for
fquantize2f16(65505.0) from 65536 to 65504, which fixes SPIR-V
conformance for this value:
If Value is positive with a magnitude too large to represent as a
16-bit floating-point value, the result is positive infinity. If Value
is negative with a magnitude too large to represent as a 16-bit
floating-point value, the result is negative infinity.
SPIR-V doesn't specify whether this overflow check is before or after
rounding, but IEEE specifies rounding first, which is what produces our
65504.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25552>
There's no need for a per driver HMI implementation since the
vk_icdGetInstanceProcAddr implementation can well populate the required
entrypoints for Android icd.
Changes have to be done in this single commit for simplicity. Otherwise,
I would have to create a separate android shared library in the runtime
like how vk_instance is handled today, so that the target is able to
check per driver enablement def. However, after all drivers have
migrated over within this MR, we still have to clean those up. So I
decided to just do those in a single commit instead.
v2: avoid preloading u_gralloc in vulkan hal open
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>
Per <hardware/hwvulkan.h>, the hw_device_t::close() function is called
upon driver unloading. The behavior has been like this since Android 10.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>
u_gralloc will be initialized upon the initial vk_android_get_ugralloc.
v2: drop explicit gralloc init
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>
Per <hardware/hwvulkan.h>, the hw_device_t::close() function is called
upon driver unloading. The behavior has been like this since Android 10.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>
We are reading accel header parameter those are updated by CS, so we
need to apply flushes to make L3 coherent with CS.
This fixes ray query tests on MTL:
- dEQP-VK.ray_query.*.serialization.*
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35590>
The intel-adl-cl job was previously running on two DUTs, but the
runtime reported by deqp-runner was only about 3 minutes.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35574>
Introduce a new, pre-merge vkd3d-proton job on Alder Lake, and move the
VK_DRIVER variables to the .anv-test template.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35574>
This matches up with the native gl drivers as well as the media stack.
- VK_SAMPLER_YCBCR_RANGE_ITU_NARROW <=> EGL_YUV_NARROW_RANGE_EXT
Cc: mesa-stable
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35597>
This matches up with the native gl drivers as well as the media stack.
- VK_SAMPLER_YCBCR_RANGE_ITU_NARROW <=> EGL_YUV_NARROW_RANGE_EXT
Cc: mesa-stable
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35597>
We only want the atomic bit to be conditional to non sparse.
Also take the opportunity to fix buffer features and report the same
supported atomic formats as images.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ed77f67e44 ("anv: add emulated 64bit integer storage support")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35358>
PresentWait2 should be possible on any physical device, as it adds a
surface capability query that depends on common wsi code.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35414>
Helped from: Stéphane Cerveau <scerveau@igalia.com>
- Fix crash when segmentation is unavailable
- Set 8x8 to minCodedExtent
- Fix typo for GOLDEN and ALTREF scale factor
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35485>
Don't report compressed memory type in the case of Xe2 modifiers
as the Vulkan spec requires identical memory types behind the
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT.
Instead, we require dedicated allocation to get the right
compressed memory in allocation stage. The BMG modifier also
requires scanout flag to set. Refer to comments.
Thanks for the help from:
Nanley Chery <nanley.g.chery@intel.com>
Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke <kenneth@whitecape.org>
and other people not listed.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34567>
When aux has to be disabled (ISL_SURF_USAGE_DISABLE_AUX_BIT)
for some reasons like VK_SHARING_MODE_CONCURRENT, we simply
cannot implicitly choose any modifier with compression.
Otherwise, we run into a situation that an image is created
with a modifier but without the aux support that modifier
requires. It will fail a CTS test once Xe2 modifiers are
enabled:
dEQP-VK.wsi.wayland.swapchain.private_data.image_sharing_mode
MESA: warning: ../src/intel/vulkan/anv_image.c:1198: image with
modifier unexpectedly has wrong aux usage (VK_ERROR_UNKNOWN)
GFX12.x (MTL) does not show this failure because only one queue
family is present. But they will face the same issue when aux is
disabled for any other reasons:
NotSupported (Only 1 queue families available for
VK_SHARING_MODE_CONCURRENT at vktWsiSwapchainTests.cpp:715)
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34567>
As a part of the effort to unify the displayable attribute
on dmabuf sharing across drivers, we set scanout flag on
imported bos on Xe2+.
Refer to the comment in the change.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34567>
Narrow down the definition of two compression flags suggested by
Nanley Chery <nanley.g.chery@intel.com> so that we can address
the unified compression support of Xe2 modifiers and don't have
to set media compression flag thats result more update in the
stack.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34567>