If a render area covers an area that is smaller than an attachment's
extent and is not aligned to the CCS block size, we must load the clear
color so that the pixels outside of that area are decompressed with the
right clear color.
Prevents the next patch from causing the following test failure on gfx9:
dEQP-VK.renderpass.suballocation.load_store_op_none.color_load_op_none_store_op_none
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
Store an array of clear values, one for each view format of the image.
Load the clear value based on the view format.
anv_image_msaa_resolve() may override the source or destination with
ISL_FORMAT_UNSUPPORTED, so make anv_image_get_clear_color_addr() handle
that format.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
In later commits, we'll rely on the number of view formats used by an
image to determine the size allocated for an array of clear colors in
the aux-state tracking buffer. Having a single view format for dmabufs
with clear color support allows anv to transparently handle this case.
Restrict the number of view formats by explicitly setting the image
format list to incomplete. Secondly, loosen the non-zero clear color
restriction on clear color supporting dmabufs. Those images can support
any clear color even with an incomplete list because we restrict
problematic accesses for the clear color during the negotiation phase.
Lastly, update add_all_surfaces_explicit_layout() to assert that the
sizing of the imported clear color struct meets expectations.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
The major differences compared to the NV extensions are:
- support for the sequence index as push constants
- support for draw with count tokens (note that DrawID is zero for
normal draws)
- support for raytracing
- support for IES (only compute is supported for now)
- improved preprocessing support with the state command buffer param
The NV DGC extensions were only enabled for vkd3d-proton and it will
maintain both paths for a while, so they can be replaced by the EXT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383>
When no color target is bound PE_COLOR_FORMAT_OVERWRITE must be set to
avoid GPU hangs.
Fixes: 07cd0f2306 ("etnaviv: blend: Add support for MRTs")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31845>
Using view3dAs2dArray changes the tiling and it's slower (-7.5% in
Silent Hill 2 Remake) than using 3D tiling. The previous implementation
was the best one regarding performance (it's also what RadeonSI does).
Sadly it seems that sampler2DViewOf3D can't really be supported without
that but nobody really needs it apparently.
Also view3dAs2array is incompatible for 2D views of sparse 3D images
because sparse 3D images requires 3D tiling.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11997
This reverts commit f5805bcb8e.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869>
If blending is disabled or the color write mask is 0, dual-source
blending would be ignored, and this can be simplified a bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31681>
When the FS is unknown, this can happen with fast-link GPL or unlinked
ESO, rely on the number of VS/TES outputs which should be a good
approximation of the number of PS inputs.
This fixes a (huge?) performance regression from May 2023 because
for depth-only rendering, the FS is NULL and NGG culling wasn't
considered at all.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31830>
The algorithm for adding extra physical edges works by finding edges
that jump over reconvergence points. Since predicated branches don't
introduce reconvergence points, we wouldn't add a physical edge from the
true block to the false block. However, this physical edge is still
needed as control flow does fall though here. This patch fixes this by
manually adding the physical edge so that we don't need to insert a
reconvergence point for it.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 39088571f0 ("ir3: add support for predication")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31733>
This has been removed few years ago by mistake but it's important for
performance. This is mostly for addrlib to determine tile_swizzle which
is used to make memory access faster with multiple render targets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31797>
FORCE_KERNEL_TAG allows testing kernel uprevs without rebuilding
containers by supplying an external kernel directly for booting on
hardware devices. Renaming it to EXTERNAL_KERNEL_TAG clarifies its
purpose, distinguishing it from KERNEL_TAG which rebuilds containers.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31795>
This was a VKCTS bug on earlier version of the CTS.
These tests have been actually passing since the VKCTS was uprevved to
1.3.9.0, which landed a bit before ADL testing in CI was turned on.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
There's currently no GL or GLES testing on the iris gallium driver,
and the VKCTS expectations were erroneously listed under iris-*.txt.
Fix the rules set for anv-adl-full, change the GPU_VERSION to anv-adl
and move the expectations around accordingly.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
A narrowing move from const is just emulated CONSTANT_DEMOTION_ENABLE so
we can permit it. But not the inverse.
Cc: mesa-stable
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
With CL we do not know the local wg size at compile time. Assuming the
worst-case (max) size limits the register footprint, leading to excess
spilling. OTOH with CL the app queries the supported local size after
compiling the kernel. So instead pick the largest warp size as the
target.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
If local wg size isn't known at compile time, we need to move some of
the state emit out of the state object and into IB2 cmdstream.
This still doesn't account for the fact that RA currently must assume
the worst case, meaning limiting cl kernels to a miniumum number of regs
and spilling excessively.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
In ir3_shader_descriptor_set() tread MESA_SHADER_KERNEL shaders in the
same way, as PIPE_SHADER_COMPUTE shaders, return 0.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
The InlineData passed to the shader is a fixed size unrelated to the
register size. It happens to match pre-Xe2, but by considering it the
same in Xe2, we ended up reading pushed constants from the wrong place
when they didn't fit in the InlineData.
Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31856>