This was already supported if we have the DX10 SetPredication command.
We are already handling the conditional correctly in svga_render_condition.
The support is indicated by have_set_predication_cmd.
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39619>
The gl43 capability indicates we have a DX11.1+ device which supports
64 UAVs shared across all stages. This limit is roughly equivalent to
GL_MAX_COMBINED_SHADER_OUTPUT_RESOURCES, which is controlled by
caps.max_combined_shader_output_resources. We currently set that cap to
SVGA_MAX_SHADER_BUFFERS (8), which is probably too low since this limit
is also supposed to include render targets, which we also set to 8.
The shader linker will validate that the pipeline does not exceed this
combined limit so we don't have to worry about the sum of the max for all
stages (16*5=80) now exceeding it.
Increasing the combined limit and the number of SSBOs from 8 to 16 allows
Blender to run as it requires 12 SSBOs. In theory we could increase the
combined limit to 56 but these limits are poorly documented and
implemented.
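As a rough illustration of the validation described above, the sketch
below adds up the SSBOs used by each stage plus the render targets and
compares the total against the combined limit. The function name, the
NUM_STAGES constant and the parameters are hypothetical; they are not
taken from the svga driver or from the shader linker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: number of shader stages (the "5" in 16*5=80 above). */
#define NUM_STAGES 5

/* Returns true when the SSBOs used by all stages plus the render
 * targets fit within the combined output resource limit.
 */
bool
fits_combined_limit(const unsigned ssbos_used[NUM_STAGES],
                    unsigned num_render_targets,
                    unsigned combined_limit)
{
   assert(ssbos_used);

   unsigned total = num_render_targets;

   for (unsigned i = 0; i < NUM_STAGES; i++)
      total += ssbos_used[i];

   return total <= combined_limit;
}
```

The check is on what a pipeline actually uses, which is why the sum of
the per-stage maximums can exceed the combined limit without causing a
problem.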
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39043>
GitLab job timeouts should be set on individual jobs rather than in
generic rule templates.
Set zink-lavapipe to a 15 minute timeout.
LAVA jobs should have the blanket 1 hour timeout even if the jobs don't
take that long, due to how lava-job-submitter works.
Remove redundant timeouts from .radv-zink-test-valve, as they were always
being overridden.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39530>
And immediately implement it in terms of
DRM_FORMAT_MOD_ARM_INTERLEAVED_64K.
Also ban DRM_FORMAT_MOD_ARM_INTERLEAVED_64K for WSI in panfrost.
Normally, the modifier's test_props would take care of this, but as
panfrost doesn't use test_props, this has to be handled in
panfrost itself.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38986>
I noticed we disable the prefetch only on Gfx12.5. But surely that
recommendation carries over to later platforms.
It seems other drivers just disable it all the time and only have an
option to force the prefetch, so implement the same thing here.
Blorp path is left untouched.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39424>
Instead of unconditionally emitting the dither table during GPU state
reset, only emit it when alpha_to_coverage is actually enabled in
the blend state. A tracking flag avoids redundant re-emission until the
next GPU state reset.
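A minimal sketch of the tracking-flag pattern described above. The
struct, its fields and both function names are hypothetical and only
stand in for the driver's real context and emit code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context; not the driver's actual structures. */
struct sketch_ctx {
   bool dither_table_emitted; /* tracking flag */
   unsigned emit_count;       /* counts emissions, for illustration only */
};

/* A GPU state reset invalidates the table, so clear the flag. */
void
sketch_reset_gpu_state(struct sketch_ctx *ctx)
{
   assert(ctx);
   ctx->dither_table_emitted = false;
}

/* Called whenever blend state is emitted. */
void
sketch_emit_blend(struct sketch_ctx *ctx, bool alpha_to_coverage)
{
   assert(ctx);

   /* Nothing to do if the table is not needed or is already in place. */
   if (!alpha_to_coverage || ctx->dither_table_emitted)
      return;

   ctx->emit_count++; /* stands in for writing the dither table */
   ctx->dither_table_emitted = true;
}
```

With this shape, contexts that never enable alpha_to_coverage never pay
for the table, and those that do pay once per GPU state reset.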
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39557>
On Xe2+, HSD 14011946253 and the related documents explain that MCS
still only supports a single clear color.
Fixes: df006bba02 ("iris: Update aux state for color fast clears (xe2)")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
When changing the clear color without a fast clear, use dirty bits to
ensure that surfaces with inline clear colors are updated and that
partial resolves are done as needed.
Remove the flags at the bottom of fast_clear_color() as
blorp_fast_clear() already sets them for us.
Fixes: 64d861b700 ("iris: Skip some fast-clears even on color changes")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
Since this does most of the work to determine the right aux usage for
a depth texture, turn it into a helper that returns that aux usage in
order to avoid duplication of logic between it and its callers.
Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>
This appears to be needed to guarantee that a resolved depth surface
has no remaining fast-cleared blocks on DG2 as well as MTL. After
this series this should no longer be hit in practice since we'll be
doing partial resolves in most cases, but it seems sensible to keep
and correct the workaround for our peace of mind to make sure that
full resolves are truly resolving the main surface.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>
v2: Define additional enum BLORP_OP_HIZ_PARTIAL_RESOLVE to track
partial resolves (Nanley).
v3: Add comment regarding fall back to full resolve on Gfx12.0 (Nanley).
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>
This works around graphics corruption seen on MTL and DG2 platforms
when sampling from a HIZ-CCS depth surface that was previously fast
cleared and resolved for sampling. Apparently full resolves no longer
guarantee that the CCS surface ends up in a pass-through state due to
the behavior of the L3 cache in presence of compressible data. In
order to work around the problem this makes sure that we use a
CCS-enabled AUX mode for depth textures if the base surface has a CCS
control surface, even if we are instructed to use ISL_AUX_USAGE_NONE.
This appears to fix the corruption without the need to add extra L3
flushes after resolves (as was done in the Vulkan driver, see
5178ad761c).
v2: Use ISL_AUX_USAGE_HIZ_CCS_WT instead of ISL_AUX_USAGE_HIZ_CCS
usage to represent the requirements of sampling from a depth
surface (Nanley).
v3: Add some comments, remove redundant check, disallow creation of
ISL_AUX_USAGE_NONE surface state for depth sampler views since the
hardware is buggy (Nanley).
v4: Preserve use of ISL_AUX_STATE_CLEAR when fast-clearing a surface
(Nanley).
v5: Set ISL_AUX_STATE_COMPRESSED_NO_CLEAR state after clearing a HiZ
CCS WT resource on xe2+ (Nanley).
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>
The clear color state has to be allocated since we will be sampling
from non-WT HiZ CCS depth surfaces without disabling compression.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>
This commit adds support for masked clear operations in the BLT path,
allowing partial clears of specific color channels and stencil bits.
For color clears, calculate which bits to clear based on the clear_mask
by examining the format's channel layout. The clear_bits field is now
set according to the mask instead of clearing all channels.
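The following sketch shows one way to derive a per-pixel bit mask from a
channel layout and a 4-bit clear_mask. The struct, the function name and
the R/G/B/A bit order of the mask are assumptions made for illustration;
how the hardware clear_bits field is actually encoded is not covered
here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one color channel inside a pixel. */
struct sketch_channel {
   unsigned shift; /* bit offset of the channel within the pixel */
   unsigned size;  /* width in bits, 0 if the channel is absent */
};

/* Build the bit mask covering the channels selected by clear_mask.
 * Assumed mask layout: bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A.
 */
uint64_t
sketch_clear_bits(const struct sketch_channel chan[4], unsigned clear_mask)
{
   uint64_t bits = 0;

   for (unsigned i = 0; i < 4; i++) {
      if (!(clear_mask & (1u << i)) || chan[i].size == 0)
         continue;

      assert(chan[i].size <= 64 && chan[i].shift + chan[i].size <= 64);

      /* Avoid an undefined 64-bit shift for a full-width channel. */
      uint64_t channel = chan[i].size == 64
                            ? UINT64_MAX
                            : (UINT64_C(1) << chan[i].size) - 1;

      bits |= channel << chan[i].shift;
   }

   return bits;
}
```

For an 8-bit-per-channel layout with R at bit 0 and A at bit 24, a
clear_mask of 0x5 (R and B) yields 0x00ff00ff, and 0xf yields
0xffffffff.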
For stencil clears, pass the clear_mask parameter through to mask the
stencil bits in the S8_UINT_Z24_UNORM format path, which was previously
hardcoded to 0xff.
Update etna_blt_will_fastclear() to check that clear_mask is 0xf (all
channels) before allowing fast clear, since masked clears require the
full clear path.
Enable the clear_masked capability when BLT is available and the
BLT_64bpp_MASKED_CLEAR_FIX cap is supported.
Passes the following dEQPs:
- dEQP-GLES2.functional.*_clear.*masked*
- dEQP-GLES3.functional.*_clear.*masked*
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31512>
Add a new PIPE_CAP_CLEAR_MASKED capability that allows drivers to
handle buffer clears with color and stencil masks directly, instead
of falling back to drawing a quad in Mesa.
This patch introduces several changes:
1. Add the new pipe cap PIPE_CAP_CLEAR_MASKED to pipe_defines.h and
document it in the Gallium screen documentation.
2. Add color_clear_mask and stencil_clear_mask parameters to the
pipe_context::clear() hook:
- color_clear_mask (uint32_t): contains 4 color mask bits per draw buffer
(max 8 buffers = 32 bits)
- stencil_clear_mask (uint8_t): contains the stencil write mask (8 bits)
3. Update the state tracker to use the masked clear path when the
driver supports it:
- Pass ctx->Color.ColorMask for color buffer clears
- Pass ctx->Stencil.WriteMask for stencil clears
- Allow both color and stencil clears to avoid the quad path when
masks are present and the driver advertises support
4. Update all existing driver clear() hooks to accept the new
color_clear_mask and stencil_clear_mask parameters.
This optimization allows drivers that can efficiently handle masked
clears in hardware to do so, improving performance for applications
that frequently clear buffers with masks enabled.
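To illustrate the color_clear_mask layout described in item 2, here is a
small sketch that packs and unpacks 4 mask bits per draw buffer. The
helper names are hypothetical, and placing draw buffer 0 in the lowest 4
bits is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_DRAW_BUFFERS 8

/* Pack one 4-bit RGBA write mask per draw buffer into 32 bits. */
uint32_t
sketch_pack_color_masks(const uint8_t masks[SKETCH_MAX_DRAW_BUFFERS])
{
   uint32_t packed = 0;

   for (unsigned buf = 0; buf < SKETCH_MAX_DRAW_BUFFERS; buf++)
      packed |= (uint32_t)(masks[buf] & 0xf) << (4 * buf);

   return packed;
}

/* Extract the 4-bit mask of a single draw buffer. */
unsigned
sketch_buffer_mask(uint32_t color_clear_mask, unsigned buf)
{
   assert(buf < SKETCH_MAX_DRAW_BUFFERS);
   return (color_clear_mask >> (4 * buf)) & 0xf;
}

/* A buffer's clear is unmasked when all four channel bits are set. */
bool
sketch_is_full_clear(uint32_t color_clear_mask, unsigned buf)
{
   return sketch_buffer_mask(color_clear_mask, buf) == 0xf;
}
```

A driver clear() hook could use a check like sketch_is_full_clear() to
choose between its regular clear path and a masked one.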
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31512>
BSpec 46969 (r45602) tells us that we get no fast-clears for 3D:
3D/Volumetric surfaces do not support Fast Clear operation.
For Y-tiled surfaces, we work around this in BLORP with
convert_rt_from_3d_to_2d(). However, that function doesn't support Ys-tiling.
We could modify our surface redescription code paths to support clearing
entire Ys tiles, but we choose to hold off on the added complexity until
we have a use-case.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
We'll use isl_surf_supports_ccs() in a scenario in which we want to
check for CCS support without creating a HIZ or MCS surface beforehand.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Allow them in all cases except for one which prevents
dEQP-GLES31.functional.image_load_store.3d.atomic.xor_r32i_return_value
from hitting the following assertion on TGL:
convert_rt_from_3d_to_2d:
Assertion `!isl_tiling_is_std_y(info->surf.tiling)' failed.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
GL only allows atomics on R32 formats. So, for a shader which does
atomic operations, only decompress the bound R32-formatted images
instead of every image.
Aside from the performance improvement, explicitly limiting the formats
here makes it clear which formats may be resolved on gfx12.0. This helps
us to limit the scope of the Ys + 3D-dim restriction that will be added
in the next patch.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
ISL prevents certain tilings from being used on 3D shader images prior
to gfx12 due to an undocumented dataport issue. We're going to allow
these tilings soon, so increase use of the shader flag to make use of
ISL's workaround.
Test case:
arb_shader_image_load_store-layer
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
The BO may contain a surface that is tiled with a 64K tiling. Without
this change, the following piglit test hits an assertion on ICL:
ext_external_objects-vk-stencil-display -auto -fbo
The assertion is:
isl_gfx11_emit_depth_stencil_hiz_s: Assertion
`info->depth_address % info->depth_surf->alignment_B == 0' failed.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Prevents the following piglit test from failing on DG2 when Tile64 is
force-enabled:
fbo-clear-formats GL_ARB_texture_rg -auto -fbo
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
The previous method to calculate imageSize().z was
incorrect for a cubearray view.
This change was tested on palm and cayman. It fixes the following test:
spec/arb_texture_view/rendering-layers-image/layers rendering of imagecubearray: fail pass
Fixes: 6c1432f0be ("r600/eg: fix cube map array buffer images.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39063>
This extension seems to work.
This change was tested with the current piglit repository:
spec/ext_shader_realtime_clock/execution/clock2x32: skip pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37954>
Adding this mmap mode makes explicit in code that PAT compressed
buffers should not be mmaped.
Although there is no CPU access, the Xe KMD uAPI still requires a
cpu_caching to be set, so set it to WC.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
Similar to the low latency option for encode, this reduces latency
of decoding at the cost of increased power usage.
Can be enabled with AMD_DEBUG=lowlatencydec
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>
Added support for externally provided motion hints by reading the
MFSampleExtension_MoveRegions sample attribute.
The motion hint data is converted into pipe_enc_move_info and passed
down to the driver for use during encoding.
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39515>