As we're going to kick frag for suspending render passes to mitigate
frag job inconsistency between suspending render passes and resuming
render passes, deriving render target datasets based on the
geometry_terminate property will be incorrect.
Stop using geometry_terminate to decide whether to remember render
target datasets; use is_suspend directly instead.
In addition, is_resume is now used instead of checking whether
suspended render target datasets are available. This helps when either
the suspending render pass or the resuming render pass has multiple
graphics sub_cmds.
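A minimal sketch of the decision described above. The flag names mirror the ones in this message, but the struct `gfx_sub_cmd_sketch` and both helper functions are hypothetical and do not reproduce the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a graphics sub-command. */
struct gfx_sub_cmd_sketch {
   bool is_suspend;         /* this render pass suspends */
   bool is_resume;          /* this render pass resumes a suspended one */
   bool geometry_terminate; /* deliberately not consulted below */
};

/* Remember render target datasets only when the pass suspends. */
static bool should_remember_rt_datasets(const struct gfx_sub_cmd_sketch *cmd)
{
   assert(cmd != NULL);
   return cmd->is_suspend;
}

/* Reuse remembered datasets only when the pass resumes, rather than
 * probing whether remembered datasets happen to exist. */
static bool should_reuse_rt_datasets(const struct gfx_sub_cmd_sketch *cmd)
{
   assert(cmd != NULL);
   return cmd->is_resume;
}
```

Neither helper reads geometry_terminate, so kicking frag for a suspending pass does not change the outcome.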
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
When executing a secondary command buffer outside a render pass, the
sub_cmds of that secondary command buffer are simply copied into the
primary command buffer. However, the 4 flags outside the type-specific
structures are not copied. Although the owned flag is intentionally set to
false, the other 3 flags should be preserved.
Copy these 3 flags when executing sub_cmds of a secondary command buffer
outside render passes.
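A rough sketch of the copy described above. Only `owned` is named in this message; the struct `sub_cmd_sketch`, the placeholder names `flag_a`, `flag_b` and `flag_c`, and the function `copy_sub_cmds_sketch` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sub-command header: the four flags live outside the
 * type-specific payload, which is reduced to a single int here. */
struct sub_cmd_sketch {
   int payload;
   bool owned;
   bool flag_a; /* placeholder name */
   bool flag_b; /* placeholder name */
   bool flag_c; /* placeholder name */
};

/* Copy secondary sub-commands into the primary command buffer. */
static void copy_sub_cmds_sketch(struct sub_cmd_sketch *dst,
                                 const struct sub_cmd_sketch *src,
                                 size_t count)
{
   assert(count == 0 || (dst != NULL && src != NULL));

   for (size_t i = 0; i < count; i++) {
      dst[i].payload = src[i].payload;
      dst[i].owned = false;          /* intentionally not inherited */
      dst[i].flag_a = src[i].flag_a; /* the other three are preserved */
      dst[i].flag_b = src[i].flag_b;
      dst[i].flag_c = src[i].flag_c;
   }
}
```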
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
The attachments field of the render pass state could be
MESA_VK_RP_ATTACHMENT_INFO_INVALID, which indicates that no attachment
information is valid. If this happens when initializing
the fragment state of a pipeline, neither a render pass nor a
VkPipelineRenderingCreateInfo structure is available -- in this case,
the specification for that structure says colorAttachmentCount is
treated as 0, so the loop iterating over color attachments should simply
not run.
Skip iterating over color attachments if the attachments field of the
render pass state is MESA_VK_RP_ATTACHMENT_INFO_INVALID.
This fixes a regression in the Vulkan CTS testcase
dEQP-VK.pipeline.monolithic.misc.no_rendering introduced by !40870, in
which MESA_VK_RP_ATTACHMENT_INFO_INVALID instead of 0 is set as the value
of the attachments field of the render pass state if neither a render
pass nor the VkPipelineRenderingCreateInfo structure is available.
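The guard described in this message might look roughly like the sketch below. The sentinel value, the struct `rp_state_sketch`, and the helper name are placeholders; the real sentinel is defined in Mesa's Vulkan runtime headers and its value is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder sentinel standing in for the "no attachment info is
 * valid" marker. The value is an assumption for this sketch. */
#define SKETCH_RP_ATTACHMENT_INFO_INVALID 0xffffffffu

/* Hypothetical, reduced render pass state. */
struct rp_state_sketch {
   uint32_t attachments;
   uint32_t color_attachment_count;
};

/* Number of color attachments the fragment state setup should visit. */
static uint32_t color_attachments_to_visit(const struct rp_state_sketch *rp)
{
   assert(rp != NULL);

   /* Without a render pass or rendering create info, the count is
    * treated as 0, so the loop must not run at all. */
   if (rp->attachments == SKETCH_RP_ATTACHMENT_INFO_INVALID)
      return 0;

   return rp->color_attachment_count;
}
```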
Fixes: 1950b6c1a7 ("vulkan: mark RP attachments as invalid when no rendering create info")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41032>
This reverts commit 2ee6b4d96e.
The previous change avoided a 0.25MB (1%) size increase in the driver binary,
but it blocks runtime enablement of some Intel tools, which is critical
to our optimization tasks.
Given the new need for these tools at runtime, that is not a good tradeoff,
so revert this change.
Test: meson setup builddir -Dallow-fallback-for=libdrm -D build-tests=true -Dbuildtype=release --reconfigure && ninja -C builddir && cd builddir && meson test
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: hwandy <hwandy@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41525>
This allows us to use LD_VAR_BUF instead of LD_VAR when the shaders are
linked together.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40761>
This will make it easier to create new default keys in other places
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40761>
pan_varying_layout contains both layout and format. In lower_fs_inputs,
though, the layout refers to the VS layout, and its format might
differ from what the FS layout expects. We cannot use the VS format as
the FS format, otherwise we risk interpolating an integer.
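As an illustration only, the idea is to take the location from the VS layout and the format from the FS side. Every name below (`fmt_sketch`, `varying_slot_sketch`, `resolve_fs_input_sketch`) is hypothetical and unrelated to the real pan_varying_layout definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical formats: one interpolated, one not. */
enum fmt_sketch { FMT_SKETCH_FLOAT32, FMT_SKETCH_UINT32 };

/* Hypothetical per-varying entry holding both location and format. */
struct varying_slot_sketch {
   uint32_t offset;
   enum fmt_sketch format;
};

/* Build the slot the FS should load from: where the VS wrote the data,
 * but interpreted with the format the FS itself expects. */
static struct varying_slot_sketch
resolve_fs_input_sketch(const struct varying_slot_sketch *vs_slot,
                        enum fmt_sketch fs_format)
{
   assert(vs_slot != NULL);

   struct varying_slot_sketch out = {
      .offset = vs_slot->offset, /* location comes from the VS layout */
      .format = fs_format,       /* format never comes from the VS */
   };
   return out;
}
```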
Fixes: 66bee415ad ("pan/compiler: Split lower_varyings_io into fs_inputs and vs_outputs")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40761>
textureGrad() has to be split into two halves on Mali: computing the
gradient/LOD and doing the actual texture operation. On Valhall, we do
this with LOD_MODE_GRDESC, but on Bifrost we use LOD_MODE_EXPLICIT. When
converting to NIR, I missed this.
Fixes: 05a066c921 ("pan/nir: Add bifrost support to pan_nir_lower_tex()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41513>
Trailing zeroes should be harmless, but they seem to cause issues with
the latest ffmpeg (which looks like an ffmpeg bug).
The extra bytes are useless, so we can just skip them, like we already
do on VCN, to work around it.
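One way to skip trailing zero bytes is sketched below. The message does not say how the driver locates them, so the backwards scan and the function name `size_without_trailing_zeroes` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the size of the buffer once trailing zero bytes are dropped.
 * The buffer itself is left untouched. */
static size_t size_without_trailing_zeroes(const uint8_t *data, size_t size)
{
   assert(data != NULL || size == 0);

   while (size > 0 && data[size - 1] == 0)
      size--;

   return size;
}
```

Zero bytes in the middle of the data are kept, since only the tail is scanned.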
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41485>
Because SDMA doesn't support MSAA, RADV falls back to the compute queue
in this case, so it's possible to get there.
Some tests only pass because RDNA2 and older don't support image
stores with depth/stencil and MSAA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41492>
allows deleting piles of moves & pressure.
simd16 results:
Totals:
Instrs: 2759547 -> 2753358 (-0.22%); split: -0.29%, +0.06%
CodeSize: 41141280 -> 41071072 (-0.17%); split: -0.23%, +0.06%
Totals from 332 (12.54% of 2647) affected shaders:
Instrs: 648080 -> 641891 (-0.95%); split: -1.23%, +0.28%
CodeSize: 9782272 -> 9712064 (-0.72%); split: -0.97%, +0.25%
simd32 is a loss because of RA being stupid. again, this is obviously the right
thing to do so we're doing it. stats are just a hint.
Totals:
Instrs: 4683556 -> 4689193 (+0.12%); split: -0.25%, +0.37%
CodeSize: 70072256 -> 70171920 (+0.14%); split: -0.23%, +0.38%
Number of spill instructions: 50320 -> 50316 (-0.01%)
Number of fill instructions: 51530 -> 51526 (-0.01%)
Totals from 351 (13.26% of 2647) affected shaders:
Instrs: 1349954 -> 1355591 (+0.42%); split: -0.86%, +1.28%
CodeSize: 20484224 -> 20583888 (+0.49%); split: -0.80%, +1.29%
Number of spill instructions: 21762 -> 21758 (-0.02%)
Number of fill instructions: 26328 -> 26324 (-0.02%)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
this is both a correctness fix (insufficient MEM registers reserved in some
cases) and a performance fix (unnecessary allocations & zeroing in the RA when
we don't spill).
fixes dEQP-VK.dgc.ext.compute.misc.scratch_space
stats are noise but positive i guess.
Totals from 35 (1.32% of 2647) affected shaders:
Instrs: 396770 -> 396690 (-0.02%)
CodeSize: 6040832 -> 6039600 (-0.02%)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
poking around, it seems branches stall the pipelines so we don't need to do any
dataflow analysis, but we do need to fall through for correctness. just keep
going across block boundaries. this isn't optimal yet but it reduces a
pile of A@1's already.
Totals from 1389 (52.47% of 2647) affected shaders:
CodeSize: 56385376 -> 56325776 (-0.11%); split: -0.13%, +0.03%
--
this also fixes issues where the first instruction of a block is a SEND that has
an unmet register dependency, since the old code was fundamentally broken. oops.
lol. fixes
dEQP-VK.compute.pipeline.workgroup_memory_explicit_layout.zero.uint8_t_array_to_uint_array_1
among many others.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
Lets us use more accumulators, I think this is well motivated. Saw this in a
test shader.
Totals from 242 (9.14% of 2647) affected shaders:
Instrs: 1365060 -> 1365035 (-0.00%); split: -0.00%, +0.00%
CodeSize: 20678592 -> 20680096 (+0.01%); split: -0.01%, +0.02%
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
Uses an approach based on HoneyKrisp. In the vertex shader, an
extra output writes 1 if the cull distance is >= 0, otherwise it
writes 0. In the fragment shader, if the extra outputs from the
vertex shader interpolate zero, all cull distances are < 0, so
the primitive is culled by discarding fragments.
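A CPU-side sketch of the approach above for a single cull distance on one triangle. The function names and the barycentric interpolation are assumptions for illustration, not the driver's lowering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Vertex shader side: the extra output is 1 when the cull distance is
 * non-negative and 0 otherwise. */
static float vs_cull_flag(float cull_distance)
{
   return cull_distance >= 0.0f ? 1.0f : 0.0f;
}

/* Fragment shader side: interpolate the three per-vertex flags with
 * barycentric weights. A result of zero means every vertex had a
 * negative cull distance, so the fragment is discarded. */
static bool fs_should_discard(const float flags[3], const float bary[3])
{
   assert(flags != NULL && bary != NULL);

   float v = flags[0] * bary[0] + flags[1] * bary[1] + flags[2] * bary[2];
   return v == 0.0f;
}
```

With positive weights, one vertex with a non-negative cull distance is enough to make the interpolated value non-zero, so the primitive survives.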
Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41463>
This fixes CTS flakes in several tests, for example:
dEQP-VK.synchronization.signal_order.shared_binary_semaphore.write_copy_buffer_read_ssbo_compute_indirect.buffer_262144_opaque_fd
Fixes: 1c77a6f049 ("nvk: Don't emit MME FIFO config on Blackwell+")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41010>
We need to remove ddx/ddy before doing the cube lowering,
otherwise we insert instructions that break dominance.
Affects Sable.
Fixes: 7d552d71e9 ("ac/nir: optimize txd(coord, ddx/ddy(coord))")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41489>
When a VRS view is used with a depth/stencil view, the driver is
expected to copy the VRS rates to the HTILE buffer of the depth/stencil
view. However, if the image uses mipmaps and the base level can't support
HTILE, there is no way to copy the rates. The workaround is to force VRS
to 1x1, which is valid in Vulkan.
This fixes old VKCTS failures on RAPHAEL, which show up only there
because it supports fragmentShadingRateWithShaderDepthStencilWrites,
unlike the other GPUs in CI (NAVI21/VANGOGH).
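The workaround described in this message can be sketched as follows. The struct `vrs_rate_sketch`, the two boolean inputs, and the function name are hypothetical, and the real driver decides this from its own image and surface state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shading rate expressed as a fragment size in pixels. */
struct vrs_rate_sketch {
   uint32_t width;
   uint32_t height;
};

/* Pick the rate to apply: fall back to 1x1 when the rates cannot be
 * copied into HTILE, otherwise keep the requested rate. */
static struct vrs_rate_sketch
effective_vrs_rate(struct vrs_rate_sketch requested,
                   bool has_mipmaps, bool base_level_has_htile)
{
   assert(requested.width >= 1 && requested.height >= 1);

   if (has_mipmaps && !base_level_has_htile) {
      struct vrs_rate_sketch forced = { .width = 1, .height = 1 };
      return forced;
   }

   return requested;
}
```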
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41427>
It's required with VK_KHR_maintenance11. This allows way more transfer
queue related CTS tests to run and all issues I found should already
be fixed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41316>
This changes timestamps so that they are written together with their
available part directly.
This saves a bit of memory and lets a timestamp be written with a
single operation.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41507>