v_writelane_b32 can take two sgprs but only if one is m0.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 20.2 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6662>
aco_instruction_selection_setup.cpp (previously used as a header) has
been split into a header and an implementation file. The latter "only"
implements init_context and setup_isel_context, but since these files
carry a long trail of helper functions, this cleans up the isel header
a lot.
Reduces library size by 3.1% due to more functions being compiled with
static linkage. Makes aco_instruction_selection.cpp compile 3% faster.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>
Statically known values were encoded using template parameters previously,
causing specializations for each of the 5 sets of template arguments to be
generated. Since emit_load is not performance critical (the inner loop
never runs more often than twice), it's better for build time to use
runtime arguments everywhere.
Reduces build time of this file by 9% (17.3s -> 15.7s on my machine) and
reduces libaco's size by 2.6%.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>
Android building rules are aligned to meson ones
Fixes the following building error:
FAILED: ninja: 'external/mesa/src/amd/registers/amdgfxregs.json',
needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_amd_common_intermediates/common/sid_tables.h',
missing and no known rule to make it
Fixes: b7a6333ee ("amd/registers: switch to new generated register definitions")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6618>
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
Sounds useful to determine if ACO breaks a specific pipeline
because of various optimizations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6487>
The result of 0xf << 28 is a signed integer and hence overflows into the sign
bit. In practice compilers did the right thing here, since the intent of the
code was unsigned arithmetic anyway.
Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>
32-bit shifts were accidentally used before this change despite the intended
output being 64 bits.
This was observed when compiling Dolphin's ubershaders.
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>
The result of 0xf << 28 is a signed integer and hence overflows into the sign
bit. In practice compilers did the right thing here, since the intent of the
code was unsigned arithmetic anyway.
These conditions were observed in:
* dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.1d.format.r4g4b4a4_unorm_pack16.count_8.size.512x1
* dEQP-VK.binding_model.descriptorset_random.sets32.noarray.ubolimitlow.sbolimitlow.sampledimglow.outimgonly.noiub.nouab.frag.ialimithigh.0
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>
This simplifies the optimizer and should make SDWA optimizations easier.
No fossil-db changes on Navi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6293>
This renames it to drop the ptr_as and makes it handle all of the stride
cases. There's a bit of a tricky bit in here around Booleans but we
currently use 32-bit for those always.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Input and output info is gathered from intrinsics. nir_variables are
ignored (and we'll remove them anyway).
This is a prerequisite for ACO, but also makes the IR prettier.
The ac_nir_to_llvm change has to be in this commit.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>
With the kernel timeline sysncobj changes, the kernel submits do
not necessarily happen in global vkQueueSubmit order. Which should
be fine, we added the appropriate waits for that. (See
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT in the winsys)
However, all kernel submissions take a lock on the bo_list mutex,
and since we do the wait in the winsys, we wait while having the
bo_list mutex held. This means that as soon as a wait and a signal
submission are out of order we have a deadlock on the bo_list mutex
and the wait.
Solution is to use a shared reader lock during the kernel submission,
as we only need read access for the submission.
Fixes: 6bc5ce7a91 "radv: Add timeline syncobj for timeline semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3446
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6478>
If decrement == 0 then:
- it isn't safe to access the submission
- even if it is, checking that the result of the atomic_sub is 0
doesn't given an unique owner anymore.
So skip it. The submission always starts out with refcount >= 1,
so first one to decrement to 0 still get dibs on executing it.
Fixes: 4aa75bb3bd "radv: Add wait-before-submit support for timelines."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6478>
Replace div(x) by min(div(x), FLT_MAX)) to avoid getting a NaN result
when x is 0.
A cheaper alternative would be to use legacy mult instructions but they're
not exposed by LLVM.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6259>
Only advertise VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT if CLOCK_MONOTONIC_RAW
is defined. Fixes the build on OpenBSD which has CLOCK_MONOTONIC but not
CLOCK_MONOTONIC_RAW.
Fixes: 67a2c1493c ("vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6517>