Treat R8_G8B8_420_UNORM and NV12 the same, because the dri2 frontend
neither understands nor cares about the difference from the sampler PoV.
Fixes: 1e820ac128 ("freedreno: Rework supported-modifiers handling")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26601>
The blitter engine lacks support for 3-component color formats, so we
fall back to the RCS companion command buffer for the blit operation.
Even though the blitter supports 96-bit formats, it only supports them
with linear tiling. We can support other types of tiling by falling back
to the RCS companion command buffer.
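As a rough illustration of the fallback decision described above, here is a
minimal sketch. All names are hypothetical and not the driver's real
identifiers, and it assumes that 3-component formats can be recognized by a
block size of 24, 48 or 96 bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tiling enum for this sketch; not the driver's real type. */
enum sketch_tiling {
   SKETCH_TILING_LINEAR,
   SKETCH_TILING_TILED,
};

/* Returns true when the blit should be redirected to the RCS companion
 * command buffer instead of the blitter engine.
 *
 * Assumption: 3-component formats are identified by their block size.
 */
bool
sketch_blit_needs_rcs_companion(unsigned bits_per_block,
                                enum sketch_tiling tiling)
{
   assert(bits_per_block > 0 && bits_per_block % 8 == 0);

   /* 96-bit formats work on the blitter, but only with linear tiling. */
   if (bits_per_block == 96)
      return tiling != SKETCH_TILING_LINEAR;

   /* Other 3-component sizes are assumed to be unsupported entirely. */
   return bits_per_block == 24 || bits_per_block == 48;
}
```

With this sketch, a 96-bit linear image stays on the blitter, while a 96-bit
tiled image or any 24-bit image takes the companion path.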
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26300>
There was a subtle bug related to CFG tracking. Namely, some branch
instructions may point *only* to the block after the DO instruction
for the loop. If the MOV instructions are in the DO block, they may not
have liveness properly tracked.
Like in !25132, having the MOV instructions in blocks that might
contain other instructions helps scheduling.
shader-db:
All Broadwell and newer Intel GPUs had similar results (Ice Lake shown)
total cycles in shared programs: 848577248 -> 848557268 (<.01%)
cycles in affected programs: 78256396 -> 78236416 (-0.03%)
helped: 361 / HURT: 18
fossil-db:
All Skylake and newer Intel GPUs had similar results (Ice Lake shown)
Totals:
Cycles: 15021501924 -> 15021372904 (-0.00%); split: -0.00%, +0.00%
Totals from 735 (0.11% of 656080) affected shaders:
Cycles: 676429502 -> 676300482 (-0.02%); split: -0.02%, +0.00%
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
DCC and HTILE are only supported by SDMA on GFX10+ (unless disabled by a workaround).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
radv_init_metadata hits several assert failures when the image is
multi-planar. Make sure we use plane 0.
This change should make no difference in practice. Also, this is done
only to follow radeonsi. Since the opaque metadata is mainly for
validations and DCC, and we don't enable DCC for multi-planar images, we
probably don't need to call radv_query_opaque_metadata at all.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
Do not report DCC modifiers for multi-planar formats. We don't support
DCC for them and drmFormatModifierPlaneCount had incorrect values.
Fix vkGetImageSubresourceLayout for multi-planar images with modifiers.
In that case, memory planes and format planes are equivalent.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
glext.h and glcorearb.h already use 'APIENTRY', so do the same for osmesa.h.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26561>
The commit "nak: implement SHL and SHR on SM50" caused a regression in
KHR-GL45.gpu_shader_fp64.* when using zink.
Fix the regression by setting the wrap fields.
Fixes: 00be041ffc ("nak: implement SHL and SHR on SM50")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26586>
Merging deprecated="" of aliased and real functions isn't completely
predictable. The function (real or aliased) that's defined last overwrites
attributes of its alias defined before it.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
This is used when we want to be able to read the calls of autogenerated
functions, or when we want to use the default structure for our custom
marshal functions.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
Viewperf benefits. This implements glPushMatrix marshalling manually and
looks ahead in the unmarshal function to see what the following calls are.
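The look-ahead idea can be sketched as below. This is purely illustrative:
the command enum and function are hypothetical, and the pattern detected here
(a push immediately followed by a pop, which is skipped) is an assumption
chosen for simplicity. The text above does not say which patterns the actual
change looks for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command stream entries for this sketch. */
enum sketch_cmd {
   SKETCH_CMD_PUSH_MATRIX,
   SKETCH_CMD_POP_MATRIX,
   SKETCH_CMD_DRAW,
};

/* Walks the command stream and returns how many commands would actually
 * be executed. When a push is directly followed by a pop, both are
 * skipped because the pair has no effect (assumed pattern).
 */
size_t
sketch_unmarshal(const enum sketch_cmd *cmds, size_t n)
{
   size_t executed = 0;

   assert(cmds != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      /* Look ahead at the next command before executing the push. */
      if (cmds[i] == SKETCH_CMD_PUSH_MATRIX && i + 1 < n &&
          cmds[i + 1] == SKETCH_CMD_POP_MATRIX) {
         i++; /* also consume the pop */
         continue;
      }
      executed++;
   }

   return executed;
}
```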
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
glMultMatrixf was doing it. glMatrixMultfEXT is the other user of
matrix_mult that needs to do it before we can skip it here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
Summary:
- Add a perf option to force primary ring submission
- Let device own secondary ring(s) for ad-hoc spawn
- Track the threads on which a swapchain or a command pool is created
with TLS to instruct ring dispatch.
- If the pipeline creation or cache retrieval happens on background
threads rather than on the hot paths, force it to be synchronous and
dispatch to the secondary ring after waiting for the primary ring to
become current.
- If the pipeline creation or cache retrieval happens on hot-path
threads, dispatch to the primary ring to avoid being blocked by those
tasks on the secondary ring.
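The dispatch rule in the summary above could look roughly like the following
sketch. The names are hypothetical and do not come from the driver, and the
step of waiting for the primary ring to become current is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring identifiers for this sketch. */
enum sketch_ring {
   SKETCH_RING_PRIMARY,
   SKETCH_RING_SECONDARY,
};

/* Per-thread flag (C11 thread-local storage). It is set on threads that
 * created a swapchain or a command pool, i.e. the assumed hot-path threads.
 */
static _Thread_local bool sketch_thread_is_hot_path = false;

void
sketch_mark_hot_path_thread(void)
{
   sketch_thread_is_hot_path = true;
}

/* Picks the ring for a pipeline creation or cache retrieval task.
 * force_primary models the perf option that forces primary ring submission.
 */
enum sketch_ring
sketch_select_ring(bool force_primary)
{
   if (force_primary || sketch_thread_is_hot_path)
      return SKETCH_RING_PRIMARY;

   /* Background thread: keep the work off the primary ring. */
   return SKETCH_RING_SECONDARY;
}
```

A thread that never calls sketch_mark_hot_path_thread() gets the secondary
ring unless the override is set.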
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
Sync protocol and fix all the interfaces, otherwise we have to generate
two sets of headers with both interfaces to separate protocol sync and
the driver side adaptation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
First of all, there is no behavior change in this CL.
The instance level helper for normal command submission is left to work
with the current venus protocol. Meanwhile, we leave the helper to
submit recorded command buffers inside the instance so it can later
redirect to the primary ring.
We've internalized a few ring helpers that no longer need to be exposed.
Besides, the indirect submission decision is made on a per-ring basis
since the ring buffer can vary later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
This change only moves the fields without changing the accessors. It's
better to let each ring own its upload cs encoder (which is backed by a
shmem array) to avoid lock contention between indirect submissions
across rings.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
Now we are able to break up the original lock to allow shmem alloc to be
outside the ring mutex, as long as the reply shmem set is still coupled
with ring submission.
Add and expose vn_instance_reply_shmem_alloc helper which will be used
by rings separately later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
The encoder must not be empty by then, so switch to an assert. Failing to
get a reply shmem would end up with VK_ERROR_OUT_OF_HOST_MEMORY, so
there's no need to track that either.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
This can be thread-safe only because we have dropped seeking the command
stream offset, which required comparing the pool shmem to decide whether
to conditionally set the stream.
This is to prepare for later sharing reply shmem pool across rings.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
More considerations and details here:
- The seek is a bit lighter than set, since it assumes the renderer-side
resource is immutable. It does affect perf when Venus is still
making verbose synchronous calls at runtime (e.g. descriptor set,
buffer, device memory, etc).
- Seek still requires lock protection as the reply shmem must be
immutable before the seek and the following cmd are committed to the
ring.
- Removing seek without doing set requires a renderer change to always
bump the encoder end position according to what the original request
is instead of being ad-hoc upon what the host driver tells to write.
The overhead and extra complexity there isn't negligible.
- Further, removing seek requires each ring to track the prior reply
pool shmem in the multi-ring scenario. Meanwhile, the additional
host-side resource lookup isn't costly, as the number of resources is
much less than the vk object table.
- The nice thing is that we can make the shmem pool thread-safe so it
can be more easily shared across rings.
So we just drop it.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>