This means we don't have to emit dead code anymore and can only
repack the sysvals that are actually used by the deferred part.
No Fossil DB changes on Navi 21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
Now that the deferred shader part is prepared before emitting
the non-deferred part, we can also gather info about what sysvals
it needs.
No Fossil DB changes on Navi 21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
The previous concept was to emit the non-deferred shader part
first, including the culling code, and then modify the
non-deferred part accordingly.
This caused some issues because it was really impossible to tell
which sysvals the deferred part needs after DCE, so we had to
run an additional cleanup pass afterwards.
The new concept is to prepare the deferred part first by applying
reusable variables (from the non-deferred part) and run DCE.
This opens the possibility to accurately gather info about what
the deferred part needs.
This idea is further expanded in the next commits.
Fossil DB stats on Navi 21:
Totals from 17 (0.02% of 79377) affected shaders:
Instrs: 18063 -> 18064 (+0.01%)
CodeSize: 93368 -> 93372 (+0.00%)
Latency: 49889 -> 49899 (+0.02%); split: -0.01%, +0.03%
SALU: 2416 -> 2417 (+0.04%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
This information will be collected by NIR core better,
no need to do it here. It is also currently unused.
No functional changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073>
Now that the brw_ip_ranges analysis is being used, there's no
need to track start_ip/end_ips in the blocks as they are mutate. And
also no need to call adjust_block_ips at the end of some passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
Calculate the IP ranges of the shader as an analysis pass. This will
later replace the existing tracking of start_ip/end_ip as the blocks are
changed (and the defer/adjust scheme to avoid too much churn when that
happen).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
Stop using the start_ip / end_ip, these are not really important for
those tests. What the test care was the number of instructions in the
block to check for changes and ensure we can peek at them by index.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
It was used by the pass tests to verify output with TEST_DEBUG=1,
replace it with brw_print_instructions().
The output is slightly different (not printing IP, not reordering the
blocks), we can add those features as we need, but given the usage was
already very reduced, don't bother with that until need arises.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
This will cause venus to take the prime blit path if modifiers are not
supported. This has been an outstanding TODO in venus for a while.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34218>
Drivers are expected to ignore unknown structs in pNext chains. Venus
is a bit weird because we advertise features based on the host driver
and so we have code for all sorts of things which may not be supported
by the host driver. When globalPriorityQuery is unsupported, we
shouldn't even attempt to return anything. Currently, we just crash in
this case because vn_physical_device::global_priority_properties is an
uninitialized pointer. While we're here, initialize it to NULL if it's
invalid.
Fixes: e488b5e45e ("venus: support VK_KHR_global_priority")
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34218>
When we're using the PRIME path and using vkCmdCopyImageToBuffer to copy
to a linear image, the buffer memory is what's shared with the window
system. For legacy drivers that depend on memory signaling via
wsi_memory_signal_submit_info, we need to tell the driver to signal the
buffer memory, not the image memory or else the window system may wait
on a driver-internal buffer and not wait for the copy to complete.
Cc: mesa-stable
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34218>
a1b05991 ("radv/rt: Flush L2 after writing internal node offset on GFX12")
did this for radv-internal CP writes - we also need to do this for PLOC
sync data initialization which is done in the common framework.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
The mirror needs to be reversed because the rotation is applied
before the mirroring.
VAAPI docs:
Mirroring of an image can be performed either along the
horizontal or vertical axis. It is assumed that the rotation
operation is always performed before the mirroring operation.
Cc: mesa-stable
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34140>
In order to stop ASAN from complaining.
Fixes: d21aa86b54 ("llvmpipe: Implement EGL_ANDROID_native_fence_sync")
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34258>
This should probably be done different, params should probably be considered immutable,
and this should be moved into the command buffer, also this gets set on decode paths as well
which might not make sense.
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>
Because if FMASK_COMPRESS_1FRAG_ONLY is set, the FMASK decompress
operation actually doesn't occur. Note that DCC_DECOMPRESS implicitly
decompresses FMASK.
This fixes an issue on GFX10-GFX10.3 which is uncovered by enabling
VK_EXT_sample_locations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
FMASK_DECOMPRESS also implies FAST_CLEAR_ELIMINATE, so it can run first.
The only exception is fast-clear for color images that have DCC and
FMASK but without comp-to-single (only GFX10) because FMASK_DECOMPRESS
can't eliminate DCC fast-clears.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
comp-to-single supports MSAA since a while and it's useless to perform
a fast clear eliminate for these fast color clears.
Only GFX10-GFX10.3 are affected because these are the only GPUs that
support DCC with MSAA with FMASK.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
Mesa is built twice in the same debian-android job, once for aarch64 and
once for x86_64 to catch as many build regressions as possible.
However the install dir used for the two builds is the same, and this
results in a mix of aarch64 and x86_64 artifacts ending up in
install.tar, because .gitlab-ci/prepare-artifacts.sh is called at the
end of the second build.
Having two separate jobs for aarch64 and x86_64 build would be cleaner
but it would also use more resources.
Since the aarch64 libraries are not used for anything for now, a cheaper
workaround is to build x86_64 first and just call prepare-artifacts.sh
after first build.
This way the aarch64 build will still be done to catch regressions, but
the artifacts won't end up in install.tar which is also more consistent
with the fact that S3_ARTIFACT_NAME only has x86_64 in the name
(mesa-x86_64-android-${BUILDTYPE}).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34234>