Unigine Heaven has a TCS that reads pos.xyz and tescoord.w from all
invocations in every invocation. By putting those two in the same vec4,
AMD hw can reduce the amount of shared memory that is allocated for those
inputs from 2 vec4s to 1 vec4.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
Skip the code for other shader stages because it doesn't do anything there.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
Now we have a few extra methods and things are diverging a bit between
Linux and Windows. Add a few unit tests to make sure this works.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
Move the memstream code into common code. Other Rust code interfacing
with FILE pointers will find the memstream abstraction useful.
Most notably, pinning is actually enforced this time with PhantomPinned.
Add a .clang-format from a sibling dir (i.e.: compiler/nir) while we're
at it to keep things tidy.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
Flush support is needed by the Rust code, which will switch from
its own memstream to u_memstream in a future patch.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
With DRLR, we unfortunately don't get notified about when there are
feedback loops. We also can't detect ahead of time the case where an
input attachment is read before written. This means we need to try and
guess in the driver when that happens based on which input attachments
we read.
There are three main things we have to do with feedback loops:
- Forcing flushing between primitives in sysmem to avoid problems with
compressed textures before a7xx.
- Forcing late depth test if it's a depth feedback loop.
- Invalidating UCHE because we may read blit results.
The way this is implemented is currently a bit conservative, following
radv. However the impact of the flushing should be mitigated by
switching to GMEM whenever we can, which we already do.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
In gallium samplers and textures have a 1:1 correspondance, so it was
impossible to tell which is which. But we use non-bindless textures for
subpass loads, which don't have a sampler (so common code lowers to an
index of 0) and therefore the order matters. With dynamic rendering, we
can have more than 16 slots for input attachments, so the s2en wasn't
getting optimized away and suddenly it starts mattering, and it turns
out we got it wrong. By fixing this we also make the order the same as
the bindless order, which was always tested well by Vulkan.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
With VK_KHR_dynamic_rendering_local_read we can have input attachments
with no index, which normally correspond to depth/stencil attachments,
and we have to handle this here when determining whether we need to emit
an unscaled fragcoord for FDM.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
This will let us know when an input attachment doesn't have an
InputAttachmentIndex, which used to be illegal but is now allowed and
meaningful with VK_KHR_dynamic_rendering_local_read.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
The code seems to assume that it's supposed to be a base index for the
binding array, but it's not. It's entirely meaningless and happens to be
set to 0 for all descriptor types except for input attachments, which is
the one case where it's already ignored. Delete the dead code so that
when we change it to a "no index" sentinel value v3dv doesn't blow up.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
Tilers need to know, at the very least, which input attachments are
"real" input attachments which map to a color or depth/stencil
attachment in tile memory and which are "fake" input attachments that
should be treated as regular textures. To do this we need to know
VkRenderingInputAttachmentIndexInfoKHR::colorAttachmentCount, which
wasn't provided by the state tracking infrastructure. When the
VkRenderingInputAttachmentIndexInfoKHR isn't provided, we don't know it,
though, so we have to record a special UKNOWN value. In that case, the
driver will know thanks to
VUID-VkGraphicsPipelineCreateInfo-renderPass-09652 that all input
attachments are "real" input attachments that map directly to color
attachments if they have an InputAttachmentIndex.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
The Vulkan caselist is gigantic - over 400MB - and only growing over
time. Luckily zstd gets that down to about 9MB, with absolutely
negligible runtime overhead.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
Currently, we:
* copy the case list to a temporary file
* have sed read it in, decimate for fraction, write it out again
* have sed read it in, decimate for shard, write it out again
Flatten this out to have a single-pass sed covering both fraction and
shard that reads the source caselist in directly and writes the final
one.
This comes at the cost of an expression that's unreadable even for sed,
but with the Vulkan mustpass over 400MB and growing, it's a good
improvement.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
The validation layers being installed are somehow 206MB in size. Slim
them down to 26 MB by building the dependencies in release mode, and
stripping on install.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
We previously only had two traces (one for Raven, one for lavapipe)
running via Wine. To support this, the Vulkan container had Debian's
full Wine installation, one from DXVK, and then a third from Proton
which is used by the b2c jobs.
At 600MB each, this was pretty unsustainable. Remove everything but the
Proton ones.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
The radv-raven and lavapipe trace jobs were using Wine installed from
Debian. lavapipe had a single post-merge trace running, and raven had
all but one trace disabled due to being flaky.
b2c is using Proton instead, and it makes absolutely no sense to have
two parallel versions of Wine installed. These should be brought back at
some point running on the same version of Wine as the newer b2c jobs.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
One quirk of the shfl instruction is that it only works with dynamically
uniform indices. This commit adds a pass to lower shuffles to the
ir3-specific ones using a loop that iterates all distinct indices one by
one. This is based on the blob's sequence.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31501>
These are like shuffle_{xor,up,down} except they expect a dynamically
uniform index. This is necessary since the ir3 shfl instruction does not
work with a divergent index.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31501>
We have no cases where we intentionally pass a NULL layout when dynamic
offsets, and doing so would cause a null dereference. Le't asd an assert
for that.
CID: 1620447
Fixes: f39cd30f4f ("anv: Track all the descriptor sets")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31638>
Coverity notices that we've insured that index index is < MAX_RTS in one
case, but that it must be greater in one case. Since `color_att_count`
is a uint32_t, it can easily exceed MAX_RTS (8), and would thus create
an out-of-bounds read situation. While the type system would allow this,
the actually implementation shouldn't, so an assert should make Coverity
happy and help us check our assumption.
CID: 1620440
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31640>
Since deqp-runner is already being built in the debian base image for test
jobs, it’s not necessary to build it subsequently in GL and VK debian
test images. Update the image tag instructions for uprevving deqp-runner.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31429>
Since the 'git clone --branch' option only accepts branch names or tags as
arguments, it’s not currently possible to build deqp-runner directly from
a git commit hash.
Revise the deqp-runner build script so that it can build from a commit
even if it doesn’t have a tag or version number.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31429>