NMAcdManager is a rather simple instance.
It does not need (thread-safe) ref-counting, in fact, having
it ref-counted makes it slighly ugly that we connect a signal,
but never bother to disconnect it (while the ref-counted instance
could outlife the signal subscriber).
We also don't need GObject signals. They have more overhead
and are less type-safe than a regular function pointers. Signals
would make sense, if there could be multiple independent listeners,
but that just doesn't make sense.
Implementing it as a plain struct is less lines of code, and less
runtime over head.
Also drop the possiblitiy to reset the NMAcdManager instance.
It wasn't needed and I think it was buggy because it wouldn't
reset the n-acd instance.
https://github.com/NetworkManager/NetworkManager/pull/213
Kernel complains with:
platform-linux: link: change 10: token: set IPv6 address generation token to ::bbbb
platform-linux: do-change-link[10]: failure changing link: failure 22 (Invalid argument)
if we try to set an IPv6 token in ip_config_merge_and_apply() and we
haven't set accept_ra=1 yet. Since the flag is set when starting
router discovery, ensure that we don't try to set a token before that.
https://github.com/NetworkManager/NetworkManager/pull/214https://bugzilla.redhat.com/show_bug.cgi?id=1560652
In update update_ext_ip_config() we remove from various internal
configurations those addresses and routes that were removed externally
by users.
When the interface is brought down, the kernel automatically removes
routes associated with it and so we should not consider them as
"removed by users".
Instead, keep them so that they can be restored when the interface
comes up again.
device_link_changed() can't use nm_device_update_from_platform_link()
to update the device private fields because the latter overwrites
priv->iface and priv->up, and so the checks below as:
if (info.name[0] && strcmp (priv->iface, info.name) != 0) {
and:
was_up = priv->up;
priv->up = NM_FLAGS_HAS (info.n_ifi_flags, IFF_UP);
...
if (priv->up && !was_up) {
never succeed.
Fixes: d7f7725ae8
Nothing changes practically, as the NMDevice still starts this with
AF_UNSPEC. That is going to change in the following commit.
The ugly part:
priv->concheck_x[0] in few places. I believe we shouldn't be using union
aliasing here, and instead of indexing the v4/v6 arrays by a boolean it
should be an enum. I'm not fixing it here, but I eventually plan to if
this gets an ACK.
This allows us to use the correct DNS server for the particular interface
independent of what the system resolver is configured to use.
The ugly part:
Unfortunately, it is not all that easy. The libc's libresolv.so API does
not provide means for influencing neither interface nor name servers used
for DNS resolving.
Curl can also be compiled with c-ares resolver backend that does provide
the necessary functionality, but it requires and extra library and the
Linux distributions don't seem to enable it. (Fedora doesn't, which is a
good sign we don't have an option of relying on it.)
systemd-resolved does provide everything we need. If we take care to
keep its congfiguration up to date, we can use it to do the resolving on
a particular interface with that interface's DNS configuration. Great!
There's one more problem: Curl doesn't provide callbacks for resolving
host names. It doesn't, however, allow us to pass in the pre-resolved
hostnames in form of an CURLOPT_RESOLVE(3) option. This means we have to
parse the host name out of the URL ourselves. Fair enough I guess...
Before:
"manager: check_if_startup_complete returns FALSE because of eth0"
Now:
"manager: startup complete is waiting for device 'eth0' (autoactivate)"
Also, the logging line is now more a human readable sentence, but still
follows the same pattern as later
"manager: startup complete"
Meaning: grepping for "startup complete" becomes more helpful because
one first finds the reasons why startup-complete is not yet reached,
followed by the moment when it is reached.
Using these unormalized was wrong all along, but by chance didn't hit
paths that needed normalized connections. This may change if we
actually write in memory connections to /run with the keyfile plugin,
because that one wants them normalized.
This also saves some work, because normalization does boring things for
us, such as adding default ipv4/ipv6/proxy settings everywhere.
On networked boot we need to somehow communicate this to the early boot
machinery. Sadly, no DBus there and we're running in configure-and-quit
mode.
Abusing the state file for this sounds almost reasonable and is
reasonably straightforward thing to do.
libcurl does not allow removing easy-handles from within a curl
callback.
That was already partly avoided for one handle alone. That is, when
a handle completed inside a libcurl callback, it would only invoke the
callback, but not yet delete it. However, that is not enough, because
from within a callback another handle can be cancelled, leading to
the removal of (the other) handle and a crash:
==24572== at 0x40319AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24572== by 0x52DDAE5: Curl_close (url.c:392)
==24572== by 0x52EC02C: curl_easy_cleanup (easy.c:825)
==24572== by 0x5FDCD2: cb_data_free (nm-connectivity.c:215)
==24572== by 0x5FF6DE: nm_connectivity_check_cancel (nm-connectivity.c:585)
==24572== by 0x55F7F9: concheck_handle_complete (nm-device.c:2601)
==24572== by 0x574C12: concheck_cb (nm-device.c:2725)
==24572== by 0x5FD887: cb_data_invoke_callback (nm-connectivity.c:167)
==24572== by 0x5FD959: easy_header_cb (nm-connectivity.c:435)
==24572== by 0x52D73CB: chop_write (sendf.c:612)
==24572== by 0x52D73CB: Curl_client_write (sendf.c:668)
==24572== by 0x52D54ED: Curl_http_readwrite_headers (http.c:3904)
==24572== by 0x52E9EA7: readwrite_data (transfer.c:548)
==24572== by 0x52E9EA7: Curl_readwrite (transfer.c:1161)
==24572== by 0x52F4193: multi_runsingle (multi.c:1915)
==24572== by 0x52F5531: multi_socket (multi.c:2607)
==24572== by 0x52F5804: curl_multi_socket_action (multi.c:2771)
Fix that, by never invoking any callbacks when we are inside a libcurl
callback. Instead, the handle is marked for completion and queued. Later,
we complete all queue handles separately.
While at it, drop the @error argument from NMConnectivityCheckCallback.
It was only used to signal cancellation. Let's instead signal that via
status NM_CONNECTIVITY_CANCELLED.
https://bugzilla.gnome.org/show_bug.cgi?id=797136https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1792745https://bugzilla.opensuse.org/show_bug.cgi?id=1107197https://github.com/NetworkManager/NetworkManager/pull/207
Fixes: d8a31794c8
Later we want to fully support wireguard devices. Also,
possibly activating a generic profile in a wireguard device
would make sense.
Anyway, for the moment, just prevent that from happening
by explicitly marking the device as unmanaged.
(cherry picked from commit e3bd482329)
Currently, NMDeviceWireguard does neither set connection_type_check_compatible
nor implement check_connection_compatible. That means, it appears to be compatible
with every connection profile, which is obviously wrong.
Allow devices not to implement check_connection_compatible() and avoid the issue
by rejecting profiles by default.
(cherry picked from commit baa0008313)
NMConnection is an interface, which is implemented by the types
NMSimpleConnection (libnm-core), NMSettingsConnection (src) and
NMRemoteConnection (libnm).
NMSettingsConnection does a lot of things already:
1) it "is-a" NMDBusObject and exports the API of a connection profile
on D-Bus
2) it interacts with NMSettings and contains functionality
for tracking the profiles.
3) it is the base-class of types like NMSKeyfileConnection and
NMIfcfgConnection. These handle how the profile is persisted
on disk.
4) it implements NMConnection interface, to itself track the
settings of the profile.
3) and 4) would be better implemented via delegation than inheritance.
Address 4) and don't let NMSettingsConnection implemente the NMConnection
interface. Instead, a settings-connection references now a NMSimpleConnection
instance, to which it delegates for keeping the actual profiles.
Advantages:
- by delegating, there is a clearer separation of what
NMSettingsConnection does. For example, in C we often required
casts from NMSettingsConnection to NMConnection. NMConnection
is a very trivial object with very little logic. When we have
a NMConnection instance at hand, it's good to know that it is
*only* that simple instead of also being an entire
NMSettingsConnection instance.
The main purpose of this patch is to simplify the code by separating
the NMConnection from the NMSettingsConnection. We should generally
be aware whether we handle a NMSettingsConnection or a trivial
NMConnection instance. Now, because NMSettingsConnection no longer
"is-a" NMConnection, this distinction is apparent.
- NMConnection is implemented as an interface and we create
NMSimpleConnection instances whenever we need a real instance.
In GLib, interfaces have a performance overhead, that we needlessly
pay all the time. With this change, we no longer require
NMConnection to be an interface. Thus, in the future we could compile
a version of libnm-core for the daemon, where NMConnection is not an
interface but a GObject implementation akin to NMSimpleConnection.
- In the previous implementation, we cannot treat NMConnection immutable
and copy-on-write.
For example, when NMDevice needs a snapshot of the activated
profile as applied-connection, all it can do is clone the entire
NMSettingsConnection as a NMSimpleConnection.
Likewise, when we get a NMConnection instance and want to keep
a reference to it, we cannot do that, because we never know
who also references and modifies the instance.
By separating NMSettingsConnection we could in the future have
NMConnection immutable and copy-on-write, to avoid all unnecessary
clones.
Add a helper function nm_device_parent_find_for_connection() to
unify implementations of setting the parent in update_connection().
There is some change in behavior, in particular for nm-device-vlan.c,
which no longer compares the link information from platform. But
update_connection() is anyway a questionable concept, only used
for external assumed connection (which itself, is questionable). Meaning,
update_connection() is a hack not science, and it's not at all clear
what the correct behavior is.
Also, note how vlan's implementation differs from all others. Why?
Should we always resort to also check the information from platform?
Either way, one of the two approaches should be used consistently and
nm_device_parent_find_for_connection() opts to not consult platform
cache.
These should be logged on DEBUG level:
<warn> platform-linux: do-change-link[2]: failure changing link: failure 97 (Address family not supported by protocol)
<warn> device (wlo1): failed to enable userspace IPv6LL address handling (unspecified)
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/10
For dynamic IP methods (DHCP, IPv4LL, WWAN) the route metric is set at
activation/renewal time using the value from static configuration. To
support runtime change we need to update the dynamic configuration in
place and tell the DHCP client the new value to use for future
renewals.
https://bugzilla.redhat.com/show_bug.cgi?id=1528071
When the IPv4 method is 'auto' and there are static addresses
configured in the connection, start a DAD probe for the static
addresses and apply them immediately on success, without waiting for
DHCP to complete.
Note that if the static address is in the same subnet of the DHCP one,
when we add the DHCP address we want it to be primary and so we will
remove the static address temporarily to achieve the right order of
addresses.
https://bugzilla.redhat.com/show_bug.cgi?id=1369905
Note the special error codes NM_UTILS_ERROR_CONNECTION_AVAILABLE_*.
This will be used to determine, whether the profile is fundamentally
incompatible with the device, or whether just some other properties
mismatch. That information will be importand during a plain `nmcli
connection up`, where NetworkManager searches all devices for a device
to activate. If no device is found (and multiple errors happened),
we want to show the error that is most likely relevant for the user.
Also note, how NMDevice's check_connection_compatible() uses the new
class field "device_class->connection_type_check_compatible" to simplify
checks for compatible profiles.
The error reason is still unused.
We previously kept any acd-manager running if the device was
disconnected. It was possible to trigger a crash by setting a long
dad-timeout and interrupting the activation request:
nmcli con add type ethernet ifname eth0 con-name eth0+ ip4 1.2.3.4/32
nmcli con mod eth0+ ipv4.dad-timeout 10000
nmcli -w 2 con up eth0+
nmcli con down eth0+
After this, the n-acd timer would fire after 10 seconds and try to
disconnect an already disconnected device, throwing the assertion:
NetworkManager:ERROR:src/devices/nm-device.c:9845:
activate_stage5_ip4_config_result: assertion failed: (req)
Fixes: 28f6e8b4d2
We commonly don't use the glib typedefs for char/short/int/long,
but their C types directly.
$ git grep '\<g\(char\|short\|int\|long\|float\|double\)\>' | wc -l
587
$ git grep '\<\(char\|short\|int\|long\|float\|double\)\>' | wc -l
21114
One could argue that using the glib typedefs is preferable in
public API (of our glib based libnm library) or where it clearly
is related to glib, like during
g_object_set (obj, PROPERTY, (gint) value, NULL);
However, that argument does not seem strong, because in practice we don't
follow that argument today, and seldomly use the glib typedefs.
Also, the style guide for this would be hard to formalize, because
"using them where clearly related to a glib" is a very loose suggestion.
Also note that glib typedefs will always just be typedefs of the
underlying C types. There is no danger of glib changing the meaning
of these typedefs (because that would be a major API break of glib).
A simple style guide is instead: don't use these typedefs.
No manual actions, I only ran the bash script:
FILES=($(git ls-files '*.[hc]'))
sed -i \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>\( [^ ]\)/\1\2/g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\> /\1 /g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>/\1/g' \
"${FILES[@]}"
In device_ipx_changed() we only keep track of dad6_failed_addrs
addresses if the device's state is > DISCONNECTED.
For the same reason, we should also do that in queued_ip_config_change().
But it's worse. If the device is in state disconnected, and the user
externally adds IPv6 addresses, we will end up in queued_ip_config_change().
It is easily possible that "need_ipv6ll" ends up being TRUE, which results
in a call to check_and_add_ipv6ll_addr() and later possibly
ip_config_merge_and_apply (self, AF_INET6, TRUE);
This in turn will modify the IP configuration on the device, although
the device may be externally managed and NetworkManager shouldn't touch it.
https://bugzilla.redhat.com/show_bug.cgi?id=1593210
We first iterate over addresses that might have failed IPv6 DAD and
update the state in NMNDisc.
However, while we do that, don't yet invoke the changed signal.
Otherwise, we will invoke it multiple times (in case multiple addresses
failed). Instead, keep track of whether something changed, and handle
it once a bit later.
Whenever we process queued IP changes, we must handle all pending
dad6_failed_addrs. This is, to ensure we don't accumulate more
and more addresses in the list.
Rework the code, by stealing the entire list once at the beginning
dad6_failed_addrs = g_steal_pointer (&priv->dad6_failed_addrs);
and free it at the end:
g_slist_free_full (dad6_failed_addrs, (GDestroyNotify) nmp_object_unref);
This makes it easier to see, that we always process all addresses in
priv->dad6_failed_addrs.