We're going to start changing the surface format during blorp_copy().
Changing the surface format could lead to incorrect image alignment
parameters, so return a fixed halign and valign for images with a single
subresource. That's all that will be needed for the upcoming
blorp_copy() changes.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
Aux-tt alignment only applies to the beginning of the resource. Drop it
if we're pointing to an image that is not in the first tile of the
image. Likewise for the alignment we add for sequential multi-engine
access.
We allow sparse on 1D images. When getting an image from such a surface,
the result likely won't be aligned to 64KB. So, in this case, remove
the flag to avoid the alignment expectation.
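
For illustration only, the "drop the alignment expectation when the view
no longer starts on the boundary" idea can be sketched as below. The flag
bits, names and helper are hypothetical and are not the real ISL flags or
functions; the sketch just assumes the expectation is carried as a usage
bit and that the view's byte offset into the parent is known.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical usage bits; these are NOT the real ISL flags. */
#define SKETCH_USAGE_SPARSE      (1u << 0)
#define SKETCH_USAGE_ALIGN_64KB  (1u << 1)

/* Clear align_flag from usage when a view that starts offset_B bytes into
 * the parent surface no longer sits on an align_B boundary. */
static uint32_t
sketch_drop_unmet_alignment(uint32_t usage, uint32_t align_flag,
                            uint64_t align_B, uint64_t offset_B)
{
   /* The alignment must be a non-zero power of two. */
   assert(align_B != 0 && (align_B & (align_B - 1)) == 0);

   if ((offset_B & (align_B - 1)) != 0)
      usage &= ~align_flag;

   return usage;
}
```

A view at offset 0 keeps the flag, while a view 4096 bytes into the
surface loses it and keeps the remaining usage bits.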
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
Increase the scope of Yf/Ys miptail workarounds to drop the dependency
on format type (compressed or uncompressed) and make this information
more publicly accessible. If I recall correctly, the affected tests
only performed blorp_copy() uploads and downloads and never accessed
images with compressed formats. So, we likely should be increasing the
scope.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
Fixes the following test case on ICL:
$ INTEL_DEBUG=noccs ./deqp-vk -n \
  dEQP-VK.api.image_clearing.core.clear_color_image.3d.optimal.single_layer.r32g32b32a32_uint
Fixes: 78e24605db ("intel/isl: Reduce scope of Yf-disabling workaround")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
We're going to be changing the surface format of images but need to
maintain a consistent render compression format to properly
encode/decode. Generalize and use the field that was previously specific
to ISL_AUX_USAGE_MC.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
The WA states that the maximum number of stack IDs per DSS in
RT_DISPATCH_GLOBALS must be set to 2048.
We can still throttle/control CFE_STATE::StackID to stay within the
range specified by that field.
Having CFE_STATE::StackID capped to 2K by default does impact
performance: the more outstanding ray queries there are, the larger the
working set and the greater the impact on the cache hit rate.
This affects performance on Xe2 and later:
* Boundary Benchmark: 36.2%
* Solar Bay extreme: 9.8%
* Hitman world of assassination: 3.9%
Fixes: c1a44e8d43 ("anv: force StackIDControl value for Wa_14021821874")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40310>
This special-cases the pData for template updating since it's a weird
one-off case where all the data needs to be copied.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
It's the only driver that uses the pass so it may as well go there.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
Previously, we assumed that the selector for bcsel could be whatever,
regardless of the bit sizes of the data and we'd just fix it in the
back-end. This works okay for scalars but falls over the moment we
vectorize because all our vector handling assumes bit sizes match.
Since matching bit sizes is what the hardware wants anyway, it's better
to do the right thing in NIR and hope copy-propagation can fold in
conversions if needed.
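
A minimal sketch of what "matching bit sizes" means, written as plain C
rather than NIR. It assumes booleans are modelled as all-zeros/all-ones
masks; the helper names are made up for illustration and do not
correspond to real NIR or back-end functions.

```c
#include <assert.h>
#include <stdint.h>

/* Explicit conversion of a 32-bit boolean mask to a 16-bit one. This is
 * the kind of conversion we would like copy-prop to fold away. */
static uint16_t
sketch_b32_to_b16(uint32_t b32)
{
   assert(b32 == 0 || b32 == UINT32_MAX);
   return b32 ? UINT16_MAX : 0;
}

/* 16-bit select whose selector has the same bit size as the data, so the
 * same code works per-lane once the operation is vectorized. */
static uint16_t
sketch_bcsel16(uint16_t sel, uint16_t a, uint16_t b)
{
   return (uint16_t)((sel & a) | (~sel & b));
}
```

With a 32-bit condition and 16-bit data, the condition goes through
sketch_b32_to_b16() first, and the select itself only ever sees 16-bit
operands.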
Unfortunately, copy prop isn't that smart yet so this does hurt a bit:
Instrs: 1193679 -> 1198086 (+0.37%); split: -0.06%, +0.43%
CodeSize: 11915136 -> 11950592 (+0.30%); split: -0.05%, +0.34%
Full: 160985 -> 160941 (-0.03%); split: -0.04%, +0.01%
Estimated normalized CVT cycles: 4456.938557000181 -> 4480.876069000186 (+0.54%); split: -0.13%, +0.67%
Estimated normalized SFU cycles: 6350.9375 -> 6392.21875 (+0.65%)
Estimated normalized Load/Store cycles: 205773.0 -> 205795.0 (+0.01%)
Maximum number of threads: 12864 -> 12863 (-0.01%)
Number of spill instructions: 22487 -> 22489 (+0.01%)
Number of fill instructions: 52179 -> 52219 (+0.08%)
Hurt shaders:
google-meet-clvk/BgBlur
google-meet-clvk/Relight
parallel-rdp/small_subgroup
parallel-rdp/small_uber_subgroup
The proper solution here is to teach copy-prop about this stuff so that
it can propagate swizzles into ALU ops when they're supported:
https://gitlab.freedesktop.org/panfrost/mesa/-/issues/265
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14945
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
It calls both for some reason but never handles any other booleans than
32-bit. This was probably a mistake.
Fixes: e63a7882a0 ("etnaviv: call nir_lower_bool_to_bitsize")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
This should be doing an OR, not an assignment.
This fixes issues on NVK with mesh stages on DGC.
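
For illustration, the difference between the two forms looks like this.
The stage bits and helper names are hypothetical, and treating the
affected value as a stage mask is an assumption made for the example, not
something taken from the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stage bits, NOT the real VkShaderStageFlagBits values. */
#define SKETCH_STAGE_TASK (1u << 0)
#define SKETCH_STAGE_MESH (1u << 1)
#define SKETCH_STAGE_FRAG (1u << 2)

/* Buggy form: every call overwrites what earlier calls recorded. */
static void
sketch_mark_stage_assign(uint32_t *stages, uint32_t stage)
{
   assert(stages != NULL);
   *stages = stage;
}

/* Fixed form: every call accumulates into the existing mask. */
static void
sketch_mark_stage_or(uint32_t *stages, uint32_t stage)
{
   assert(stages != NULL);
   *stages |= stage;
}
```

Marking TASK and then MESH with the assign form leaves only MESH set,
while the OR form leaves both bits set.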
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40266>
Zink uses the output primitive of the last vertex stage when deciding
the raster primitive. When we generate the gs, its output primitive
depends on the raster primitive.
Not only does the generated gs output primitive have no value in choosing
the raster primitive, it can also get us stuck with the last raster
primitive, which is of course incorrect.
Ignore it for generated shaders.
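
A rough sketch of the resulting selection logic, under the assumption
that the fallback is the primitive derived from the draw itself. The
types and names are hypothetical and much simpler than the real zink
state tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_prim {
   SKETCH_PRIM_POINTS,
   SKETCH_PRIM_LINES,
   SKETCH_PRIM_TRIANGLES,
};

/* Hypothetical stand-in for the last vertex stage shader. */
struct sketch_shader {
   bool is_generated;            /* gs created by the driver itself */
   enum sketch_prim output_prim; /* primitive type the stage emits */
};

/* Pick the raster primitive. Pass NULL when the last vertex stage has no
 * output primitive of its own. A generated gs is ignored because its
 * output primitive was itself derived from the raster primitive. */
static enum sketch_prim
sketch_pick_raster_prim(const struct sketch_shader *last_vertex_stage,
                        enum sketch_prim draw_prim)
{
   assert(draw_prim <= SKETCH_PRIM_TRIANGLES);

   if (last_vertex_stage != NULL && !last_vertex_stage->is_generated)
      return last_vertex_stage->output_prim;

   return draw_prim;
}
```

With a generated gs that still reports LINES from an earlier draw, a
TRIANGLES draw now yields TRIANGLES instead of staying on LINES.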
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32399>
Use is_scalar to know if we can do transpose loading.
Also enable vectorization if 2 intrinsics share the same source (it
means the only difference is the base).
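
As a sketch of the "same source, different base" check: the struct and
helper below are hypothetical stand-ins, not the real NIR intrinsic
types, and the requirement that the two loads be exactly adjacent is an
assumption added for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a load intrinsic. */
struct sketch_load {
   int      src_id;         /* identifies the SSA value used as source */
   uint32_t base;           /* constant byte offset added to the source */
   uint32_t num_components;
   uint32_t bit_size;
};

/* Two loads can be merged when they read from the same source and the
 * second one starts right where the first one ends. */
static bool
sketch_can_vectorize(const struct sketch_load *lo,
                     const struct sketch_load *hi)
{
   assert(lo->bit_size % 8 == 0 && hi->bit_size % 8 == 0);

   if (lo->src_id != hi->src_id || lo->bit_size != hi->bit_size)
      return false;

   uint32_t lo_size_B = lo->num_components * (lo->bit_size / 8);
   return hi->base == lo->base + lo_size_B;
}
```

Two 32-bit scalar loads from the same source with bases 0 and 4 qualify,
while loads from different sources or with a gap between them do not.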
Fixes: e14d6b535c ("brw/nir: add new intrinsics to load data from the indirect address")
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>
This doesn't consider layers/mips because it doesn't seem possible.
That doesn't hurt correctness either; it just means HiZ is disabled.
This fixes dEQP-VK.api.copy_and_blit.core.use_after_copy.*_tq on GFX12.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40304>
This is now done directly in the VOPD scheduler.
Foz-DB GFX1201:
Totals from 600 (0.52% of 114655) affected shaders:
no stats changed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
This optimization was previously done in the post-RA optimizer,
but it is a better fit for the VOPD scheduler.
Doing it here also has the benefit that we don't unnecessarily use
the constant bus when VOPD can't be used.
No Foz-DB changes on GFX12 until the next commit.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
This optimization changed the rendered result for 11 pixels, all with
less than 1% change. Neither the old nor the new is obviously more
correct than the other, and the CTS is fine. So let's assume this change
is unproblematic, and accept the new result.
Fixes: 3d304d5647 ("nir/opt_algebraic: remove is_used_once on outer instruction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40321>
This is required to make sure that conformant_trunc_coord is correctly
enabled/disabled. Otherwise, it might be disabled on GFX11 GPUs with
drm-shim.
Bumping the minor version shouldn't have any other effects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40313>
Add a nightly job running Cuttlefish with Venus on Turnip.
Similar to the existing Venus-on-ANV jobs, this uses Cuttlefish's
'venus_guest_angle' mode to run deqp-vk and deqp-egl with ANGLE and
Venus inside the Android guest, with Turnip on the host.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39993>
Introduce the arm64 counterpart of the debian/x86_64_test-android
container/rootfs.
Building Android arm64 targets is complicated by the fact that Google
only provides the Android NDK for x86_64 hosts. Because of this, the
debian/arm64_test-android setup is split into two parts:
debian/arm64_test-android-tools
Despite the name, this is a native x86_64 container used to build
ANGLE, dEQP, and deqp-runner for Android arm64 targets. The resulting
artifacts are uploaded to S3 and later consumed by the final image.
debian/arm64_test-android
This is the final arm64 container/rootfs. It downloads the previously
built tools and installs the Cuttlefish Debian package.
The Cuttlefish guest image and additional host tools are not included
in this image. It is currently only used in LAVA, where Cuttlefish
artifacts can be deployed separately and kept cached across container
rebuilds.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39993>