Those are normally always uniform, but for fused-thread handling we
need to check their sources.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ca1533cd03 ("nir/divergence: add a new mode to cover fused threads on Intel HW")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
I didn't test "nvk: Fix maxVariableDescriptorCount with iub" as
thoroughly as I should have and it regressed
dEQP-VK.api.maintenance3_check.descriptor_set because we were then
violating the requirement that maxPerSetDescriptors describes a limit
that's guaranteed to be supported (and reported as supported in
GetDescriptorSetLayoutSupport).
That commit was also based on a misreading of nvk_nir_lower_descriptors.c
where I thought that the end offset of an inline uniform block needed to
be less than the size of a UBO. That is not the case - on closer
inspection that code gracefully falls back to placing IUBs in global memory
if necessary. So, we can afford to be less strict about our IUB sizing
and only require that IUBs follow the existing limit imposed by
maxInlineUniformBlockSize.
Fixes: ff7f785f09 ("nvk: Fix maxVariableDescriptorCount with iub")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37922>
It assumes you don't have dead writes to variables in the last block, and
will copy-propagate consts from the first write it finds.
Without this, the upcoming nir_opt_copy_prop_vars() change to use more
restricted write masks caused nir_opt_dead_write_vars() to remove fewer
writes (since it doesn't trim write masks for dead writes, only removes
fully-dead writes), and then the zero-initialization of variables at the
top of a shader got propagated, rather than the final store of the used
channels of the variable.
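The failure mode can be illustrated outside of NIR with a toy model. This is
a minimal sketch, not the actual pass: `struct store`,
`value_from_first_write` and `value_from_last_write` are hypothetical names,
and a variable is reduced to four channels written by masked constant stores
in a single block. The last-write lookup is shown only for contrast and is
not necessarily how the change itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a constant store to a 4-channel variable. */
struct store {
   uint8_t write_mask;  /* bit i set => channel i is written */
   int32_t value[4];    /* constant stored to each channel */
};

/* Problematic lookup: use the first store that writes the channel.
 * This is only correct if every earlier dead write was removed. */
static bool
value_from_first_write(const struct store *stores, int count, int chan,
                       int32_t *out)
{
   assert(chan >= 0 && chan < 4);
   for (int i = 0; i < count; i++) {
      if (stores[i].write_mask & (1u << chan)) {
         *out = stores[i].value[chan];
         return true;
      }
   }
   return false;
}

/* Contrast: the last store that writes the channel wins, so a partially
 * dead zero-initialization no longer hides the final store. */
static bool
value_from_last_write(const struct store *stores, int count, int chan,
                      int32_t *out)
{
   assert(chan >= 0 && chan < 4);
   for (int i = count - 1; i >= 0; i--) {
      if (stores[i].write_mask & (1u << chan)) {
         *out = stores[i].value[chan];
         return true;
      }
   }
   return false;
}
```

With a zero-initialization that writes all four channels followed by a store
to channels 0 and 1 only, the first lookup returns zero for channel 0 while
the second returns the value of the final store.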
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37313>
Using an implicit input file in valhall_parse_isa makes it very
complicated for tools like ninja-to-soong to generate the equivalent
Android build file.
Instead use an explicit argument.
This also deduplicates the input file name so that it appears only in
'meson.build'.
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37742>
The extension is emulated on top of traditional transient attachments,
with the driver creating extra internal attachments for each subpass
where the MSRTSS sample count doesn't match the color or depth/stencil
attachment's sample count and inserting unresolves/resolves around them,
except for the cases where the original attachment is unused
before/after the subpass respectively. An important case is if the
original attachment would be cleared at the beginning of the subpass, in
which case we rewrite the clear to apply to the driver-internal
multisample attachment. This requires redirecting the clear colors in
CmdBeginRenderPass2.
We create images, image views, and backing memory for these attachments.
They are part of the framebuffer with classic render passes, or allocated
per render pass with dynamic rendering.
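The per-subpass decisions described above can be sketched as follows. This
is an illustration under stated assumptions, not the driver's code:
`struct msrtss_att_use`, `struct msrtss_plan` and `plan_msrtss_attachment`
are hypothetical names, and the "used before/after" information is assumed
to be already known for each attachment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-subpass view of one attachment; not a real driver struct. */
struct msrtss_att_use {
   uint32_t att_samples;     /* sample count of the original attachment */
   uint32_t msrtss_samples;  /* MSRTSS sample count of the subpass */
   bool used_before;         /* original contents are needed at subpass start */
   bool used_after;          /* results are needed after the subpass */
   bool clear_at_start;      /* original would be cleared at subpass start */
};

/* Hypothetical result: what the emulation has to insert for the subpass. */
struct msrtss_plan {
   bool internal_attachment; /* extra driver-internal multisample attachment */
   bool unresolve;           /* replicate original into it before the subpass */
   bool resolve;             /* resolve it back after the subpass */
   bool redirect_clear;      /* apply the clear to the internal attachment */
};

static struct msrtss_plan
plan_msrtss_attachment(const struct msrtss_att_use *use)
{
   assert(use->att_samples > 0 && use->msrtss_samples > 0);

   struct msrtss_plan plan = { false, false, false, false };

   /* Matching sample counts need no emulation at all. */
   plan.internal_attachment = use->att_samples != use->msrtss_samples;
   if (!plan.internal_attachment)
      return plan;

   /* A clear at the start makes the old contents unused, so the clear is
    * redirected and no unresolve is needed. */
   plan.redirect_clear = use->clear_at_start;
   plan.unresolve = use->used_before && !use->clear_at_start;
   plan.resolve = use->used_after;
   return plan;
}
```

For a single-sampled attachment rendered with four MSRTSS samples and cleared
at the start of the subpass, this yields an internal attachment with a
redirected clear and a resolve, but no unresolve.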
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
These ops replicate the single-sampled source attachment to the
multi-sampled destination attachment before the start of a subpass. This
is the new hardware feature for
VK_EXT_multisample_render_to_single_sampled, and the actual
implementation of the extension emulates everything on top of these.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
This is an old leftover from the skeleton stage of the driver, and we
have never needed anything other than the image view. Having this in
the way made it impossible to write generic code that reads the
attachments in the !image_framebuffer and dynamic rendering cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
This will be necessary so that the 3d path can handle "unresolves",
where we select the single-sampled shader for the single-sampled source
but enable multisampling *without* per-sample shading to replicate it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
We need to always use the FMT6_Z24S8_AS_R8G8B8A8 format for GMEM even if
UBWC is disabled, as already done for the 2d store path. Because we
use the pre-baked RB_MRT_BUF_INFO register value, this means we have to
override it.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
This was already present in the 2d paths but not in the 3d path,
probably because the flag was moved there only on a7xx and it was
missed. Prevents page faults from bad flag buffer accesses.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
This can happen if we resolve to a resolve attachment and then use that
resolve attachment as an input attachment in a later subpass. We don't
need to put it in GMEM, but it's still considered "written" because
input attachment reads need a dependency after the resolve.
MSRTSS input attachment tests effectively created such a scenario after
lowering to transient multisample attachments and inserting resolves.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
If the first use of an attachment is as an input attachment, but it has
LOAD_OP_CLEAR, then we have to clear the attachment in GMEM and patch
the input attachment to refer to GMEM. Noticed by inspection.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37919>
Trivial, but still better than the cs_add32 workaround on platforms
supporting MOVE_REG32.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37951>