load_helper_invocation can not be reordered past a demote.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 7ece220f96 ("nak/nir: Lower systm values before lowering I/O")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
This was missing when this intrinsic was added.
Fix some issue with FSI lowering and probably more.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e779538ad2 ("nir: add nvidia IO intrinsics")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 0092edfec0 ("nir/dead_cf: Do not remove loops with loads that can't be reordered")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
with the unlowering pass, there is no longer a separate gl_LastFragData variable,
so this workaround just breaks color outputs
fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40437>
Enable the extension and feature bits on CSF (v10+). Map the conditional
rendering pipeline stage to all three subqueues for barrier handling.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Meta operations (blit, copy, clear images, fill/update buffer) internally
emit draws or dispatches that must not be affected by conditional
rendering. Save and disable the state in meta_gfx_start/meta_compute_start,
then restore it in the corresponding end functions.
CmdClearAttachments is the exception: it IS affected by conditional
rendering per the Vulkan spec, so its state is restored right after
meta_gfx_start.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
When a secondary command buffer is recorded with
conditionalRenderingEnable, its draws load the conditional rendering
flag from panvk_cs_subqueue_context::cond_render_flag via the
cs_subqueue_ctx_reg register.
In CmdExecuteCommands, evaluate the primary's predicate and write the
normalized result (0=skip, non-zero=render) to the subqueue context.
Since the context is per-queue and cs_subqueue_ctx_reg is never
clobbered across cs_call, this works correctly with
SIMULTANEOUS_USE_BIT and across multiple queues. For nested
inheritance, propagate the caller's own flag value.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Use panvk_cond_render() to wrap RUN_IDVS and RUN_COMPUTE GPU commands.
When active, the predicate is loaded from the buffer and a cs_if
branches over the GPU commands when the condition says to skip.
This per-command approach correctly leaves render pass operations
(BeginRendering, EndRendering, load ops) unaffected as required by the
spec. Only draws, dispatches, and CmdClearAttachments are conditional.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Add panvk_cond_render_state, CmdBeginConditionalRendering2EXT,
CmdEndConditionalRenderingEXT, and the panvk_cond_render() macro
that wraps GPU commands in a cs_if block.
For direct conditional rendering, the macro loads the predicate from
the buffer. For inherited secondaries, it loads a pre-evaluated flag
from panvk_cs_subqueue_context::cond_render_flag. When conditional
rendering is inactive, no GPU instructions are emitted.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
The code was built with the assumption that VS shaders could magically
read FS varying descriptors because we never emitted any VS varying
descriptor. The truth is that VS shaders never read any varying
descriptor at all, lea_buf is never used for varying stores on v9.
This simplifies varying handling by a lot since we can assume different
descriptor types for VS and FS without hacks.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
Previously the driver decided when the backend should use
LD_VAR_BUF[_IMM] instructions based on the total number of varyings
read, falling back to LD_VAR[_IMM] + descriptors when the varying index
could overflow the immediate index in the instructions. That means that
even adding a single varying read could overflow the index and make
everything fall back to LD_VAR.
With this patch the backend decides when to use LD_VAR_BUF for each
varying load, reporting that decision to the driver. This helps with
index overflows because only the instruction that actually overflow the
immediate use the LD_VAR fallback, leaving all other instructions on the
fast path.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
The function was a leftover from the varying ABI rework.
Fixes: 7fc6af99ea ("pan: Remove dead code for sso_abi builder and fixed_varyings")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with some changing of the constants so that
the sample count is always proportional to alpha for 2xMSAA and 4xMSAA
and the addition of rotating the sample mask based on the pixel.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
Per plane size has been decided outside when imported with explicit
layout. The unconditional alignup was added in the lvp sparse support,
which doesn't interact with format modifier path at all.
For later (in fear of breaking things):
1. We better validate against the must satisfied layout e.g. D/S tile
load/store. For color attachment, unaligned loads are used if the
alignment exceeds 16 bytes.
2. The additional alignment on the required size should have been
applied inside the resource_create_unbacked from llvmpipe backend.
3. The alignment requirement better comes from the backend instead of a
hard-coded value in the frontend.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
The layout plane offset might be out-of-order. Move the plane offset
calculation to lvp_image_init and respect offset from explicit layout,
which has also simplified plane offset handling.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
Now we have many more tests running than before. Rename the job
android-angle-lavapipe-cts to android-angle-lavapipe-cts-full since it's
already a nightly job, and raise the timeout to 45min.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
AHB image layout initialization is deferred until dedicated memory
import time. During vkCreateImage, only a stub is created just like
aliased ANB image since the layout has to be retrieved from the AHB.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
Use pkt_field_{get_set}() macros in tu_desc_{get,set}_addr() functions,
aligning with other such functions.
The tu_desc_{get,set}_buffer_addr() functions are added in order to
correctly retrieve and modify the buffer address for TEX_BUFFER descriptors
on A8XX.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40538>
Current implementation of tu_desc_set_ubwc() for A8XX will override
bitfields in the upper half of A8XX_TEX_MEMOBJ[5], leading to incorrect
behavior.
Fix this by using pkt_field_set() to correctly set the relevant descriptor
fields. Similarly, tu_desc_get_ubwc() is changed to use pkt_field_get() to
retrieve relevant field data from which it forms the returned address.
Fixes CTS tests on a8xx:
dEQP-VK.api.copy_and_blit.*.resolve_image.copy_with_regions_before_resolving.*
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40538>
Its no longer an error for depth and stencil formats to have invalid
accumulator format.
Fixes the following tests:
* dEQP-VK.api.info.image_format_properties.2d.optimal.d16_unorm
* dEQP-VK.api.info.image_format_properties.2d.optimal.d24_unorm_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.x8_d24_unorm_pack3
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
When the first attachment is assigned to a tile buffer, the buffer
alloc mask was not been updated. This means when a second attachment
is added to the same tile buffer it will be assigned the same offset
as the first which will lead to incorrect behaviour.
Fixes for depq-vk:
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.dedicated_allocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.dedicated_allocation.attachment.4.568
Fixes: a7de9dae6b ("pvr: Add routine for filling out usc_mrt_setup from dynamic rendering state")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
Do not assume that the application always provides images for backing
attachments. The app can provide a super set of attachments of which
only some are actually backed with images.
We want to filter-out attachments that aren't meaningful for rendering
or sampling, and create compiler resources only for relevant ones.
Fix assert in CTS:
pvr_arch_mrt.c:215: pvr_rogue_init_usc_mrt_setup: Assertion `att_format != VK_FORMAT_UNDEFINED' failed.
Seen in pipeline monolithic, for instance:
dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_sint_d32_sfloat_s8_uint.random_127
Fixes: d549c1d045 ("pvr: add pipeline handling to use dynamic rendering info")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
Expose the routine in preperation for a later commit.
Backport-to: 26.0
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
Use the valid/input coverage masks for tile buffer store coverage masks
when running single/multi-sampled fragment shaders respectively.
Fixes: 297a0c269a ("pvr, pco: tile buffer support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reported-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
The kernel commit 188db3d7fe66 ("drm/msm/a6xx: Rebase GMU register
offsets") shifted the GMU reg offsets to be relative to the GPU base.
This change landed in the v6.19 kernel.
This commit pulls that change back into mesa. Crashdec is modified to
add the offset if the devcoredump came from a kernel prior to v6.19.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40495>
When destroying H264/5 decode context we check the profile from decoder to
free the H264/5 PPS/SPS objects, but decoder is only created when decoding
first frame so these objects will never get freed in case decoder is NULL.
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40504>