Suggested by @gurchetansingh.
Android's Soong build system treats several compiler warnings as errors
by default: https://android.googlesource.com/platform/build/soong/+/27f57506/cc/config/global.go/#218
To catch these issues in Mesa, introduce `soong_compat_c_args`
and `soong_compat_cpp_args` with the following flags:
-D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS
-Werror=date-time
-Werror=gnu-alignof-expression
-Werror=ignored-qualifiers
-Werror=implicit-fallthrough
-Werror=int-conversion
-Werror=missing-prototypes
-Werror=pragma-pack
-Werror=pragma-pack-suspicious-include
-Werror=sizeof-array-div
-Werror=string-plus-int
-Werror=unreachable-code-loop-increment
These compatibility flags are added to the meson configurations
for ANV, Gfxstream, Lavapipe, PanVK, Turnip, and Venus.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
Based on the approach in e0eea5ea4e.
When a file is too large, -Wmisleading-indentation emits the notes
below, which cannot be suppressed with a #pragma:
../src/freedreno/vulkan/tu_perfetto.cc: In function 'void setup_incremental_state(MesaRenderpassDataSource<TuRenderpassDataSource, TuRenderpassTraits>::TraceContext&, tu_device*)':
../src/freedreno/vulkan/tu_perfetto.cc:162: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
162 | if (!state->was_cleared)
../src/freedreno/vulkan/tu_perfetto.cc:162: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89549 for details.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
The job runs the following modules with ANGLE:
- CtsGraphicsTestCases
- CtsNativeHardwareTestCases
- CtsSkQPTestCases
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41440>
Stops allocating events in chunks. u_trace_event is allocated using a
linear allocator which has minimal overhead. Buffers for timestamps are
allocated using a custom allocator.
As a side effect, it is possible to deduplicate consecutive tracepoints.
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41271>
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
Fixes the following with meson2hermetic:
src/freedreno/registers/adreno/a6xx_perfcntrs.py/genrule.sbox.textproto
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "gen_header.py", line 1220, in <module>
File "gen_header.py", line 1216, in main
File "gen_header.py", line 1177, in dump_py_defines
File "gen_header.py", line 688, in parse
File "gen_header.py", line 680, in do_parse
File "external/python/cpython3/Modules/pyexpat.c", line 471, in StartElement
File "gen_header.py", line 732, in start_element
File "gen_header.py", line 673, in do_parse
FileNotFoundError:
[Errno 2] No such file or directory: './out/src/freedreno/registers/adreno/adreno_common.xml'
Soong/Bazel `genrules` run in a separate sandbox, and require that
all dependencies be explicitly declared. It is necessary for
reproducible, hermetic and distributed builds.
Meson prefers explicit dependency declaration too, but does not
require it. For example, if `adreno_common.xml` is modified, and
it is in `depend_files` for the `adreno_pm4.xml.h` custom_target,
meson knows to re-gen `adreno_pm4.xml.h` during incremental builds.
For freedreno, the custom targets in `src/freedreno/registers/*`
don't declare all XML dependencies that are actually used.
This patch fixes this. The other option is to work around this in
meson2hermetic, but being more explicit is conceptually more correct.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41518>
Fixes:
src/freedreno/vulkan/tu_knl_kgsl.cc:1455:16:
error: 'alignof' applied to an expression is a GNU extension [-Werror,-Wgnu-alignof-expression]
alignof(*objs), VK_SYSTEM_ALLOCATION_SCOPE_COMMAND);
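A minimal sketch of one standard-conforming way to express the same alignment, assuming the fix names a type instead of an expression. `example_obj` and `obj_alignment` are hypothetical names used only for illustration and are not taken from tu_knl_kgsl.cc:

```cpp
#include <cstddef>

// Hypothetical stand-in for the element type that `objs` points to.
struct example_obj {
   alignas(16) unsigned char payload[32];
};

// alignof(*objs) applies alignof to an expression, which is a GNU extension.
// decltype(*objs) names a (reference) type instead, and alignof on a
// reference type yields the alignment of the referenced type.
inline std::size_t
obj_alignment(const example_obj *objs)
{
   return alignof(decltype(*objs));
}
```

Spelling out the element type directly, as in `alignof(example_obj)`, is equivalent and usually easier to read when the type name is short.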
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41518>
Fixes:
src/freedreno/vulkan/tu_cmd_buffer.cc:2162:10:
error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
The other option is the [[fallthrough]] annotation.
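For reference, a small sketch of what the annotation looks like (C++17). The enum and function below are hypothetical and unrelated to the actual switch in tu_cmd_buffer.cc:

```cpp
// Hypothetical operation kinds, for illustration only.
enum class load_op { clear, load, dont_care };

inline int
count_steps(load_op op)
{
   int steps = 0;
   switch (op) {
   case load_op::clear:
      steps += 1;
      // Without this annotation, -Wimplicit-fallthrough flags the
      // transition into the next label.
      [[fallthrough]];
   case load_op::load:
      steps += 1;
      break;
   case load_op::dont_care:
      break;
   }
   return steps;
}
```

The annotation documents that the fall-through is intentional, while an explicit `break` or a restructured switch removes the fall-through entirely.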
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41518>
UBO and sampler descriptors are smaller than texture descriptors, but
the nature of Vulkan descriptor sets means we need to make them just as
big. Zero out the remaining dwords so that we don't get garbage that
trips up asserts in gfxrecon-replay.
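The idea can be sketched as follows, assuming a fixed slot size in dwords. `kSlotDwords` and `write_descriptor` are hypothetical names, and the slot size is an arbitrary value chosen for the example rather than the real hardware descriptor size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical slot size: every descriptor slot is as large as the
// largest descriptor type.
constexpr std::size_t kSlotDwords = 16;

// Copy a (possibly smaller) descriptor into its slot and zero the tail so
// that no uninitialized dwords remain in the descriptor set memory.
inline void
write_descriptor(std::uint32_t *slot, const std::uint32_t *desc,
                 std::size_t desc_dwords)
{
   assert(desc_dwords <= kSlotDwords);
   std::memcpy(slot, desc, desc_dwords * sizeof(std::uint32_t));
   std::memset(slot + desc_dwords, 0,
               (kSlotDwords - desc_dwords) * sizeof(std::uint32_t));
}
```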
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41608>
Apparently we emitted uninitialized values for VSC in tu6_emit_tile_select
when HW binning wasn't used, which makes Adreno 630 hang.
Instead of adding even more conditions, just always init VSC state.
The rare cases where initializing it can be skipped are not worth
the complexity.
Fixes: 49191f46e6 ("tu/a6xx: Emit VSC addresses for each bin to restore after preemption")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41604>
Update the headless Android WSI patch to fix intermittent timeout issues. It
now uses an ImageReader listener to actively drain and instantly release frames
from the buffer queue. This acts as a "null compositor" that prevents buffer
starvation while maintaining stable GPU backpressure.
This fixes dEQP-VK.wsi.android.maintenance1.* in newer VKCTS versions and
resolves the race conditions that caused occasional teardown crashes.
Also rebase build-deqp-gl_Build-Don-t-build-Vulkan-utilities-for-GL-builds.patch
on top of the updated WSI patch.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41541>
This splits the nir_move_to_top_input_loads option into 2 options. The latter
option is mainly for at_offset/at_sample loads. Then it updates most places to
use only the first option.
The rationale is that moving at_sample loads makes Control (game) shaders
worse, as per the code comment.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41167>
Unfortunately we have to disable concurrent binning by default
because it hurts performance in a number of desktop games without
any case where we know it helps.
There are fewer vertex fetch resources available in BV compared to BR,
so when binning runs in BV, there are many vertices, and the vertices are
attribute heavy, BV performs much worse than BR, sometimes more
than 50% worse.
Even the worse performance wouldn't be bad if concurrent binning
actually overlapped with other workloads in those cases, but in desktop
games there is almost never a chance for overlap.
However it's impossible to statically find out if binning on BV would
be much slower than on BR, and we also cannot statically predict if
there is enough overlap (if any) to cover for the performance penalty.
Given the above, I don't see a way out but to make concurrent binning
opt-in via the `tu_allow_concurrent_binning` driconf toggle.
Still allow concurrent binning in CI to catch issues early.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41394>
I've pulled in a pile of changes to reduce the overhead (runtime and
memory) when sharding for deqp-runner, along with a bunch of fixes for
KHR_display testing that we recently enabled, plus a few others that
affect our drivers.
The big new set of failures looks like it's from more complete coverage of
blitting between formats.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41243>
We're regularly hitting 13 minutes of deqp-runner runtime on our jobs,
which is too long. Once we uprev the CTS, one of them gets to 14 minutes
and triggers the existing job timeout.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41243>
Previously, if only non-bindless accesses were present, we would end up
emitting an empty preamble.
Also avoid emitting non-bindless textures.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40309>
Apparently, this is a major footgun since it is not uncommon for apps to
enable all the features exposed by a driver. Having UBWC disabled for
D24S8 can result in a major performance loss, and the reason can be hard
for devs to spot. This footgun is already known to have happened a few
times. Furthermore, disabling UBWC depending on a Vulkan feature being
requested broke D24S8 sharing via external memory when only one device
was created with customBorderColorWithoutFormat.
Fortunately, there is the depthStencilSwizzleOneSupport feature, which
was added after the above hardware deficiency was found and, when false,
forbids the problematic state combination.
To prevent the footgun described above, we now set
depthStencilSwizzleOneSupport to false by default. This allows UBWC to be
enabled for D24S8 in all cases while remaining conformant. We also have
the tu_enable_d24s8_border_color_workaround driconf option, which enables
the previous workaround for apps that don't know about
depthStencilSwizzleOneSupport, which is currently only the ANGLE
translation layer.
One caveat is that we cannot use the fast border color HW feature for
D24S8+USAGE_SAMPLED+VK_FORMAT_UNDEFINED, so a new driconf toggle is
added. enable_fast_border_color_for_undefined_formats is set for DXVK and
vkd3d-proton since they are known not to use border colors with D24S8.
Lacking fast border colors is a much smaller penalty than not having UBWC
for D24S8.
For some context also see: https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/4346
This partially reverts 36916949.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41514>
Caused hangs in the following CTS test with CB enabled:
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.random.seed1_geometry_tessellation_multiview
Fixes: 50cc9c723c ("tu/u_trace: Prevent cloning stale RB_DONE_TS results")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41490>
We create driver param instructions once we encounter their first use
and cache them for further uses. This creates problems when the first
use occurs in a block that doesn't dominate all further uses. This was
hit in practice with a driver param that was used both in the preamble
and in the main shader.
Fix this by simply not caching driver params. Since they are simply movs
from const regs, ir3_cp or ir3_cse should clean up most cases of
multiple uses.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 8b0b81339b ("freedreno/ir3: add NIR compiler")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15418
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41484>
pre_chain.rp_trace usage relied on a bunch of bad assumptions that,
together with u_trace_move, didn't cause issues until the u_trace
refactoring started. Fixing those bad assumptions
and correctly initializing and freeing pre_chain.rp_trace
also requires fixing u_trace_move at the same time.
u_trace_move fixes:
- If dst already had trace chunks in it, we may have leaked them.
- The correct list move pattern is "list_replace -> list_inithead",
not "list_replace -> list_delinit".
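To illustrate the second point, here is a self-contained sketch using simplified stand-ins for the circular list helpers. These definitions only model the behaviour relevant here and are not copied from Mesa's util/list.h. `list_move_all` is a hypothetical name:

```cpp
#include <cassert>

// Simplified circular doubly linked list, for illustration only.
struct list_head {
   list_head *prev;
   list_head *next;
};

inline void list_inithead(list_head *item) { item->prev = item->next = item; }

inline bool list_is_empty(const list_head *list) { return list->next == list; }

inline void list_addtail(list_head *item, list_head *list)
{
   item->next = list;
   item->prev = list->prev;
   list->prev->next = item;
   list->prev = item;
}

// Make `to` the new head of the nodes owned by `from`. Afterwards `from`
// still holds stale prev/next pointers into the ring.
inline void list_replace(const list_head *from, list_head *to)
{
   if (list_is_empty(from)) {
      list_inithead(to);
   } else {
      to->prev = from->prev;
      to->next = from->next;
      from->next->prev = to;
      from->prev->next = to;
   }
}

// Move every node from src to dst. dst must be empty, otherwise its old
// nodes would be leaked.
inline void list_move_all(list_head *src, list_head *dst)
{
   assert(list_is_empty(dst));
   list_replace(src, dst);
   // Reset the stale head. Unlinking src instead would rewrite the
   // neighbours' pointers, which now belong to dst's ring.
   list_inithead(src);
}
```

In this model, a delete-style unlink of `src` after `list_replace` would connect the first and last nodes to each other and cut `dst` out of the ring, which is why re-initializing the source head is the safe second step.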
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41390>
When an uncached memory type is used under emulation, most games suffer a
significant performance penalty due to accessing the buffer atomically.
When this option is set, uncached buffer allocations are overridden
to be cached+coherent if the host supports it. This
allows the atomic accesses to still be done without abysmal
performance.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41323>
Instead of leaving timestamp_copy_data half-initialized in
copy_timestamp_cs_pool, always keep it fully initialized and in a valid
state there.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41438>
This will simplify things for PERFCNTR_CONFIG, where we will be getting
GPU timestamps along with the sampled counter values.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41315>