With lavapipe the previous change to only enable the hack when there's
no textures bound doesn't work anymore, since we don't have that
information when using the texture handles.
So add another random state restriction which is sufficient to pass
tests. (This really needs a better long term solution.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24887>
It is much easier to just add a simple late algebraic pass than actually
trying to teach the backend to recognize all the different patterns.
RV530 shader-db:
total instructions in shared programs: 129643 -> 129468 (-0.13%)
instructions in affected programs: 17665 -> 17490 (-0.99%)
helped: 176
HURT: 39
total presub in shared programs: 4912 -> 5411 (10.16%)
presub in affected programs: 1651 -> 2150 (30.22%)
helped: 0
HURT: 287
total temps in shared programs: 16904 -> 16918 (0.08%)
temps in affected programs: 812 -> 826 (1.72%)
helped: 25
HURT: 37
total cycles in shared programs: 194771 -> 194675 (-0.05%)
cycles in affected programs: 28096 -> 28000 (-0.34%)
helped: 146
HURT: 41
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9364
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24830>
When compiling VS/TES and GS separately, we can't know if the GS will
need shader queries, so we have to always declare this argument.
Similar story for the view index.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24862>
Huge thanks to Tapani Pälli for debugging this issue, figuring out
what was going wrong, proposing fixes, and walking me through where
things were going off the rails.
BLORP always disables tessellation and geometry shaders. Our handling
tried to look at ice->shaders.uncompiled[] to determine whether the next
draw needed those shaders. If not, we can leave BLORP's residual state
that disabled those stages in place, and skip looking at it.
Unfortunately, predicting the future is a bit fraught, in part due to
the uncompiled[] and prog[] arrays being slightly out of sync at times.
Consider the following case:
1. Draw with tessellation shaders in place
=> uncompiled[TES] and prog[TES] will both point at valid shaders.
2. Gallium calls pipe->bind_tes_state(NULL).
=> This makes uncompiled[TES] point at NULL, and flags
IRIS_STAGE_DIRTY_UNCOMPILED_TES.
Because iris_update_compiled_shaders() hasn't happened yet,
uncompiled[TES] is NULL but prog[TES] has the stale TES from
the previous draw still.
3. BLORP operations happen
=> BLORP sees uncompiled[TES] == NULL and decides that tessellation
is off for the upcoming draws. So it skips flagging tess state.
4. Gallium calls pipe->bind_tes_state(shader from step #1).
=> uncompiled[TES] points at the original shader.
IRIS_STAGE_DIRTY_UNCOMPILED_TES gets flagged again.
5. Draw again
=> This calls iris_update_compiled_shaders(), which sees that
a TES is bound, and calls iris_update_compiled_tes(). But
because the same shader was bound as before, the program it
comes up with is identical to the one already bound at
ice->shaders.prog[TES]. So, it thinks it doesn't have to
flag any tessellation state dirty because it was already
set up for the last draw.
This random unbind and rebind between draws leads to a situation
where, at step #3, BLORP thinks it can skip flagging tessellation
state (nothing is bound), and at step #5, normal state handling
thinks it can skip flagging tessellation state (nothing changed
since last time). So nobody does, and things break.
This unbind appears to be happening when st_release_variants()
decides it wants to free some shaders. Then a rebind happens to
put back the actual shader for the draw. So, it's not theoretical.
To fix this, we change BLORP to look at ice->shaders.prog[] rather
than uncompiled[]. This is equivalent to thinking about the previous
draw, rather than the next. If the last draw had tessellation off,
then BLORP's disabling was a no-op, and the GPU is still in the same
state as the previous draw. This is more reliable than predicting
the future.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8308
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9678
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24880>
VK_EXT_device_memory_report is implemented in venus driver side, which
has a feature struct. So we must enable it after setting features for
renderer extensions. This change also includes tiny format fixes.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24881>
For unmerged shaders on GFX9+, we would need to jump to the second
shader part which means shared VGPRs can't be enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
For separate VS/TCS compilation on GFX9+, the TCS might be using push
constants but not the VS and we can't know this information when
compiling the VS. Similar logic for the other arguments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
When VS and TCS are compiled separately on GFX9+, we can't know how
many descriptor sets are used for both stages and the function
arguments must match.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
It's hard to implement this because the function arguments must match
when eg. VS or TCS are compiled separately on GFX9+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
The H264 spec defines a complicated way of layering PPS scaling lists
on top of SPS scaling lists. The details can be found in 7.4.2.1
(seq_scaling_matrix_present_flag semantics) and 7.4.2.2
(pic_scaling_matrix_present_flag semantics).
Both ANV and RADV need to derive the final scaling lists sent to HW
using this logic in order to handle H264 scaling lists correctly.
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24572>
Instead of using bit 23 of nvk_cmd_push::range for this, pass it as a
separate bool. This lets us use the actual kernel flag with the new
UAPI.
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Tested-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24840>
Previously rb_tree_search would return the first node encountered, but
that may not be the first node that would be encoutnered by, say,
rb_tree_foreach.
Two test cases are added.
v2: Add more curly braces. Suggested by Faith.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24856>
The msm driver reserves the actual DMABUF size in the memory map, while
TU can request smaller memory chunk to be allocated. This potentially
can lead to a situation when next allocation IOVA will be in the middle
of the address space which is reserved for the DMABUF. Pass the
`real_size' to TU allocator instead, so that kernel and userspace have
the same picture of memory allocations.
Cc: mesa-stable
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24861>