When no color target is bound PE_COLOR_FORMAT_OVERWRITE must be set to
avoid GPU hangs.
Fixes: 07cd0f2306 ("etnaviv: blend: Add support for MRTs")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31845>
Using view3dAs2dArray changes the tiling and it's slower (-7.5% in
Silent Hill 2 Remake) than using 3D tiling. The previous implementation
was the best one regarding performance (it's also what RadeonSI does).
Sadly it seems that sampler2DViewOf3D can't really be supported without
that but nobody really needs it apparently.
Also view3dAs2array is incompatible for 2D views of sparse 3D images
because sparse 3D images requires 3D tiling.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11997
This reverts commit f5805bcb8e.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869>
If blending is disabled or the color write mask is 0, dual-source
blending would be ignored, and this can be simplified a bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31681>
When the FS is unknown, this can happen with fast-link GPL or unlinked
ESO, rely on the number of VS/TES outputs which should be a good
approximation of the number of PS inputs.
This fixes a (huge?) performance regression from May 2023 because
for depth-only rendering, the FS is NULL and NGG culling wasn't
considered at all.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31830>
The algorithm for adding extra physical edges works by finding edges
that jump over reconvergence points. Since predicated branches don't
introduce reconvergence points, we wouldn't add a physical edge from the
true block to the false block. However, this physical edge is still
needed as control flow does fall though here. This patch fixes this by
manually adding the physical edge so that we don't need to insert a
reconvergence point for it.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 39088571f0 ("ir3: add support for predication")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31733>
This has been removed few years ago by mistake but it's important for
performance. This is mostly for addrlib to determine tile_swizzle which
is used to make memory access faster with multiple render targets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31797>
FORCE_KERNEL_TAG allows testing kernel uprevs without rebuilding
containers by supplying an external kernel directly for booting on
hardware devices. Renaming it to EXTERNAL_KERNEL_TAG clarifies its
purpose, distinguishing it from KERNEL_TAG which rebuilds containers.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31795>
This was a VKCTS bug on earlier version of the CTS.
These tests have been actually passing since the VKCTS was uprevved to
1.3.9.0, which landed a bit before ADL testing in CI was turned on.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
There's currently no GL or GLES testing on the iris gallium driver,
and the VKCTS expectations were erroneously listed under iris-*.txt.
Fix the rules set for anv-adl-full, change the GPU_VERSION to anv-adl
and move the expectations around accordingly.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
A narrowing move from const is just emulated CONSTANT_DEMOTION_ENABLE so
we can permit it. But not the inverse.
Cc: mesa-stable
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
With CL we do not know the local wg size at compile time. Assuming the
worst-case (max) size limits the register footprint, leading to excess
spilling. OTOH with CL the app queries the supported local size after
compiling the kernel. So instead pick the largest warp size as the
target.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
If local wg size isn't known at compile time, we need to move some of
the state emit out of the state object and into IB2 cmdstream.
This still doesn't account for the fact that RA currently must assume
the worst case, meaning limiting cl kernels to a miniumum number of regs
and spilling excessively.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
In ir3_shader_descriptor_set() tread MESA_SHADER_KERNEL shaders in the
same way, as PIPE_SHADER_COMPUTE shaders, return 0.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30835>
The InlineData passed to the shader is a fixed size unrelated to the
register size. It happens to match pre-Xe2, but by considering it the
same in Xe2, we ended up reading pushed constants from the wrong place
when they didn't fit in the InlineData.
Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31856>
No shader-db or fossil-db changes on any Intel platform.
v2: Simlify the logic for when to try constant folding. Do
commute_immediates before constant folding. Both suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31729>
Prior to Cherryview, align16 3src instruction sources had to have their
subregister number be DWord-aligned. Cherryview added a discontiguous
bit in the encoding to represent bit 1 of the subregister number. This
allows us to use packed HF sources.
Update the ISA encoding helpers to properly handle bit 1. While we're
at it, make them take a full subregister number and adjust accordingly,
rather than making the callers divide or multiply by some alignment.
Note that the destination subregister must still be DWord aligned, so
HF destinations must be strided.
Thanks to Ian Romanick for discovering that we were botching this.
BSpec: 12054, 12081
v2 (idr): Fix ordering of high and low bit parameters to brw_inst_bits.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>
Gfx11+ doesn't support Align16 instructions anymore - only Align1 mode.
Bailing early for Align16 is important so that brw_hw_decode_inst
doesn't try to read Align16 related instruction fields on generations
where they no longer exist (which could trigger assertions).
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>