When the NM_UNMANAGED_PLATFORM_INIT flag is cleared last in
device_link_changed(), a recheck-assume is scheduled and then the
device goes immediately to UNAVAILABLE. During the state transition,
addresses and routes are removed from the interface. Then,
recheck-assume finds that the device can be assumed but it's too late
since the device was already deconfigured.
This is a problem as the whole point of assuming a device is to
activate a connection while leaving the device untouched.
In the NMCI "dracut_NM_vlan_over_bridge and dracut_NM_vlan_over_bond"
test, NM in real root tries to assume a vlan device that was activated
in initrd. When the interface gets deconfigured in UNAVAILABLE, the
connection to the NFS server breaks and the rootfs becomes
inaccessible.
The fix to this problem is to delay state transitions in
device_link_changed() to a idle handler, so that recheck-assume can
run before.
Fixes-test: @dracut_NM_vlan_over_bridge
Fixes-test: @dracut_NM_vlan_over_bond
https://bugzilla.redhat.com/show_bug.cgi?id=2047302
We have at least static and transient hostnames. Let's be clear which
one we are talking about.
Note that also NM_SETTINGS_HOSTNAME gets renamed to
NM_SETTINGS_STATIC_HOSTNAME, because it seems clearer.
The only purpose of NM_SETTINGS_STATIC_HOSTNAME is to be the backing
property for the "Hostname" D-Bus property for the NMDBusObject glue.
So, while the new name makes more sense to me, it's now also
inconsistent with it's primary use (the D-Bus property). Still...
When the device gets realized, similar to the situation that the device
is unmanaged by platform-init, if the device is still unmanaged by
parent and we clear the assume state. Then, when the device becomes
managed, NM is not able to properly assume the device using the UUID.
Therefore, we should not clear the assume state if the device has only
the NM_UNMANAGED_PLATFORM_INIT or the NM_UNMANAGED_PARENT flag set
in the unmanaged flags.
The previous commit 3c4450aa4d ('core: don't reset assume state too
early') did something similar for NM_UNMANAGED_PLATFORM_INIT flag only.
We use clang-format for automatic formatting of our source files.
Since clang-format is actively maintained software, the actual
formatting depends on the used version of clang-format. That is
unfortunate and painful, but really unavoidable unless clang-format
would be strictly bug-compatible.
So the version that we must use is from the current Fedora release, which
is also tested by our gitlab-ci. Previously, we were using Fedora 34 with
clang-tools-extra-12.0.1-1.fc34.x86_64.
As Fedora 35 comes along, we need to update our formatting as Fedora 35
comes with version "13.0.0~rc1-1.fc35".
An alternative would be to freeze on version 12, but that has different
problems (like, it's cumbersome to rebuild clang 12 on Fedora 35 and it
would be cumbersome for our developers which are on Fedora 35 to use a
clang that they cannot easily install).
The (differently painful) solution is to reformat from time to time, as we
switch to a new Fedora (and thus clang) version.
Usually we would expect that such a reformatting brings minor changes.
But this time, the changes are huge. That is mentioned in the release
notes [1] as
Makes PointerAligment: Right working with AlignConsecutiveDeclarations. (Fixes https://llvm.org/PR27353)
[1] https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html#clang-format
Completely rework IP configuration in the daemon. Use NML3Cfg as layer 3
manager for the IP configuration of an interface. Use NML3ConfigData as
pieces of configuration that the various components collect and
configure. NMDevice is managing most of the IP configuration at a higher
level, that is, it starts DHCP and other IP methods. Rework the state
handling there.
This is a huge rework of how NetworkManager daemon handles IP
configuration. Some fallout is to be expected.
It appears the patch deletes many lines of code. That is not accurate, because
you also have to count the files `src/core/nm-l3*`, which were unused previously.
Co-authored-by: Beniamino Galvani <bgalvani@redhat.com>
The two nm-sudo helper functions are only used by the OVS device plugin,
but they are part of NetworkManager core binary.
This is done commonly, where the NetworkManager binary has symbols that
are used by the plugins. But in this case, NetworkManager itself doesn't
use the symbols. That will case the linker to drop them.
A previous solution for that was commit 684f2acffe ('build: add way to
keep unused symbols when linking NetworkManager'), but that doesn't seem
to work with clang 3.4 (rhel-7).
Instead, actually use the symbol so that it cannot be dropped.
Naming is important, because the name of a thing should give you a good
idea what it does. Also, to find a thing, it needs a good name in the
first place. But naming is also hard.
Historically, some strv helper API was named as nm_utils_strv_*(),
and some API had a leading underscore (as it is internal API).
This was all inconsistent. Do some renaming and try to unify things.
We get rid of the leading underscore if this is just a regular
(internal) helper. But not for example from _nm_strv_find_first(),
because that is the implementation of nm_strv_find_first().
- _nm_utils_strv_cleanup() -> nm_strv_cleanup()
- _nm_utils_strv_cleanup_const() -> nm_strv_cleanup_const()
- _nm_utils_strv_cmp_n() -> _nm_strv_cmp_n()
- _nm_utils_strv_dup() -> _nm_strv_dup()
- _nm_utils_strv_dup_packed() -> _nm_strv_dup_packed()
- _nm_utils_strv_find_first() -> _nm_strv_find_first()
- _nm_utils_strv_sort() -> _nm_strv_sort()
- _nm_utils_strv_to_ptrarray() -> nm_strv_to_ptrarray()
- _nm_utils_strv_to_slist() -> nm_strv_to_gslist()
- nm_utils_strv_cmp_n() -> nm_strv_cmp_n()
- nm_utils_strv_dup() -> nm_strv_dup()
- nm_utils_strv_dup_packed() -> nm_strv_dup_packed()
- nm_utils_strv_dup_shallow_maybe_a() -> nm_strv_dup_shallow_maybe_a()
- nm_utils_strv_equal() -> nm_strv_equal()
- nm_utils_strv_find_binary_search() -> nm_strv_find_binary_search()
- nm_utils_strv_find_first() -> nm_strv_find_first()
- nm_utils_strv_make_deep_copied() -> nm_strv_make_deep_copied()
- nm_utils_strv_make_deep_copied_n() -> nm_strv_make_deep_copied_n()
- nm_utils_strv_make_deep_copied_nonnull() -> nm_strv_make_deep_copied_nonnull()
- nm_utils_strv_sort() -> nm_strv_sort()
Note that no names are swapped and none of the new names existed
previously. That means, all the new names are really new, which
simplifies to find errors due to this larger refactoring. E.g. if
you backport a patch from after this change to an old branch, you'll
get a compiler error and notice that something is missing.
Add a new 'keep-configuration' device option, set to 'yes' by
default. When set to 'no', on startup NetworkManager ignores that the
interface is pre-configured and doesn't try to keep its
configuration. Instead, it activates one of the persistent
connections.
Turns out, we call nm_settings_get_connection_clone() *a lot* with sort order
nm_settings_connection_cmp_autoconnect_priority_p_with_data().
As we cache the (differently sorted) list of connections, also cache
the presorted list. The only complication is that every time we still
need to check whether the list is still sorted, because it would be
more complicated to invalidate the list when an entry changes which
affects the sort order. Still, such a check is usually successful
and requires "only" N-1 comparisons.
Commit 33b9fa3a3c ("manager: Keep volatile/external connections
while referenced by async_op_lst") changed active_connection_find() to
also return active connections that are not yet activating but are
waiting authorization.
This has side effect for other callers of the function. In particular,
_get_activatable_connections_filter() should exclude only ACs that are
really active, not those waiting for authorization.
Otherwise, in ensure_master_active_connection() all the ACs waiting
authorization are missed and we might fail to find the right master
AC.
Add an argument to active_connection_find to select whether include
ACs waiting authorization.
Fixes: 33b9fa3a3c ('manager: Keep volatile/external connections while referenced by async_op_lst')
https://bugzilla.redhat.com/show_bug.cgi?id=1955101
If the device is still unmanaged by platform-init (which means that
udev didn't emit the event for the interface) when the device gets
realized, we currently clear the assume state. Later, when the device
becomes managed, NM is not able to properly assume the device using
the UUID.
This situation arises, for example, when NM already configured the
device in initrd; after NM is restarted in the real root, udev events
can be delayed causing this race condition.
Among all unamanaged flags, platform-init is the only one that can be
delayed externally. We should not clear the assume state if the device
has only platform-init in the unmanaged flags.
D-Bus 1.3.1 (2010) introduced the standard "PropertiesChanged" signal
on "org.freedesktop.DBus.Properties". NetworkManager is old, and predates
this API. From that time, it still had it's own PropertiesChanged signal
that are emitted together with the standard ones. NetworkManager
supports the standard PropertiesChanged signal since it switched to
gdbus library in version 1.2.0 (2016).
These own signals are deprecated for a long time already ([1], 2016), and
are hopefully not used by anybody anymore. libnm-glib was using them and
relied on them, but that library is gone. libnm does not use them and neither
does plasma-nm.
Hopefully no users are left that are affected by this API break.
[1] 6fb917178a
Active-connections in the async_op_lst are not guaranteed to have a
settings-connection. In particular, the settings-connection for an
AddAndActivate() AC is set only after the authorization succeeds. Use
the non-asserting variant of the function to fix the following
failure:
nm_active_connection_get_settings_connection: assertion 'sett_conn' failed
1 _g_log_abort()
2 g_logv()
3 g_log()
4 _nm_g_return_if_fail_warning.constprop.14()
5 nm_active_connection_get_settings_connection()
6 active_connection_find()
7 _get_activatable_connections_filter()
8 nm_settings_get_connections_clone()
9 nm_manager_get_activatable_connections()
10 auto_activate_device_cb()
11 g_idle_dispatch()
12 g_main_context_dispatch()
13 g_main_context_iterate.isra.21()
14 g_main_loop_run()
15 main()
Fixes: 33b9fa3a3c ('manager: Keep volatile/external connections while referenced by async_op_lst')
https://bugzilla.redhat.com/show_bug.cgi?id=1933719https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/834
When the link goes away the manager keeps software devices alive as
unrealized because there is still a connection for them.
If the device is software and has a NM-generated connection, keeping
the device alive means that also the generated connection stays
alive. The result is that both stick around forever even if there is
no longer a kernel link.
Add a check to avoid this situation.
https://bugzilla.redhat.com/show_bug.cgi?id=1945282
Fixes: cd0cf9229d ('veth: add support to configure veth interfaces')
Along with NM_SETTINGS_CONNECTION_UPDATE_REASON_RESET_SYSTEM_SECRETS
and NM_SETTINGS_CONNECTION_UPDATE_REASON_RESET_AGENT_SECRETS, which can
be used in the NMSettingConnection's "updated" handlers to track secrets
updates, add NM_SETTINGS_CONNECTION_UPDATE_REASON_UPDATE_NON_SECRET so
that the handlers can tell when something other than secrets has been
updated in the connection.
It can also potentially be used in _connection_changed_update in
src/core/settings/nm-settings.c to stop emitting the
NetworkManager.Settings.Connection.Updated() dbus signal if only secrets
are being updated (on agent queries etc.) if it is deemed to be correct.
"libnm-core/" is rather complicated. It provides a static library that
is linked into libnm.so and NetworkManager. It also contains public
headers (like "nm-setting.h") which are part of public libnm API.
Then we have helper libraries ("libnm-core/nm-libnm-core-*/") which
only rely on public API of libnm-core, but are themself static
libraries that can be used by anybody who uses libnm-core. And
"libnm-core/nm-libnm-core-intern" is used by libnm-core itself.
Move "libnm-core/" to "src/". But also split it in different
directories so that they have a clearer purpose.
The goal is to have a flat directory hierarchy. The "src/libnm-core*/"
directories correspond to the different modules (static libraries and set
of headers that we have). We have different kinds of such modules because
of how we combine various code together. The directory layout now reflects
this.
Currently "src/" mostly contains the source code of the daemon.
I say mostly, because that is not true, there are also the device,
settings, wwan, ppp plugins, the initrd generator, the pppd and dhcp
helper, and probably more.
Also we have source code under libnm-core/, libnm/, clients/, and
shared/ directories. That is all confusing.
We should have one "src" directory, that contains subdirectories. Those
subdirectories should contain individual parts (libraries or
applications), that possibly have dependencies on other subdirectories.
There should be a flat hierarchy of directories under src/, which
contains individual modules.
As the name "src/" is already taken, that prevents any sensible
restructuring of the code.
As a first step, move "src/" to "src/core/". This gives space to
reorganize the code better by moving individual components into "src/".
For inspiration, look at systemd's "src/" directory.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/743