For testing purpose, it's bad to let nm_utils_stable_id_parse()
directly access nm_utils_get_boot_id_str(). Instead, the function
should have no side-effects.
Since the boot-id is anyway cached, accessing it is cheap. Even
if it likely won't be needed.
Previously, whenever we needed /etc/machine-id we would re-load it
from file. The are 3 downsides of that:
- the smallest downside is the runtime overhead of repeatedly
reading the file and parse it.
- as we read it multiple times, it may change anytime. Most
code in NetworkManager does not expect or handle a change of
the machine-id.
Generally, the admin should make sure that the machine-id is properly
initialized before NetworkManager starts, and not change it. As such,
a change of the machine-id should never happen in practice.
But if it would change, we would get odd behaviors. Note for example
how generate_duid_from_machine_id() already cached the generated DUID
and only read it once.
It's better to pick the machine-id once, and rely to use the same
one for the remainder of the program.
If the admin wants to change the machine-id, NetworkManager must be
restarted as well (in case the admin cares).
Also, as we now only load it once, it makes sense to log an error
(once) when we fail to read the machine-id.
- previously, loading the machine-id could fail each time. And we
have to somehow handle that error. It seems, the best thing what we
anyway can do, is to log an error once and continue with a fake
machine-id. Here we add a fake machine-id based on the secret-key
or the boot-id. Now obtaining a machine-id can no longer fail
and error handling is no longer necessary.
Also, ensure that a machine-id of all zeros is not valid.
Technically, a machine-id is not an RFC 4122 UUID. But it's
the same size, so we also use NMUuid data structure for it.
While at it, also refactor caching of the boot-id and the secret
key. In particular, fix the thread-safety of the double-checked
locking implementations.
The need for this is the following:
"ipv4.dhcp-client-id" can be specified via global connection defaults.
In absence of any configuration in NetworkManager, the default depends
on the DHCP client plugin. In case of "dhclient", the default further
depends on /etc/dhcp.
For "internal" plugin, we may very well want to change the default
client-id to "mac" by universally installing a configuration
snippet
[connection-use-mac-client-id]
ipv4.dhcp-client-id=mac
However, if we the user happens to enable "dhclient" plugin, this also
forces the client-id and overrules configuration from /etc/dhcp. The real
problem is, that dhclient can be configured via means outside of NetworkManager,
so our defaults shall not overwrite defaults from /etc/dhcp.
With the new device spec, we can avoid this issue:
[connection-dhcp-client-id]
match-device=except:dhcp-plugin:dhclient
ipv4.dhcp-client-id=mac
This will be part of the solution for rh#1640494. Note that merely
dropping a configuration snippet is not yet enough. More fixes for
DHCP will follow. Also, bug rh#1640494 may have alternative solutions
as well. The nice part of this new feature is that it is generally
useful for configuring connection defaults and not specifically for
the client-id issue.
Note that this match spec is per-device, although the plugin is selected
globally. That makes some sense, because in the future we may or may not
configure the DHCP plugin per-device or per address family.
https://bugzilla.redhat.com/show_bug.cgi?id=1640494
This flag is more granular in whether to consider the connection
available or not. We probably should never check for the combined
flag NM_DEVICE_CHECK_CON_AVAILABLE_FOR_USER_REQUEST directly, but
always explicitly for the relevant parts.
Also, improve the error message, to indicate whether the device is
strictly unmanaged or whether it could be overruled.
The flags NMDeviceCheckConAvailableFlags and NMDeviceCheckDevAvailableFlags
both control whether a device appears available (either, available in
general, or related to a particular profile).
Also, both flag types strictly increase availability. Meaning: more flags,
more available.
There is some overlap between the flags, however they still have
their own distinct parts.
Improve the mapping from NMDeviceCheckConAvailableFlags to
NMDeviceCheckDevAvailableFlags, by picking exactly the flags
that are relevant.
We should not use GAsyncResult. At least, not for internal API.
It's more cumbersome then helpful, in my opinion. It requires
this awkward async_finish() pattern.
Instead, let the caller pass a suitable callback of the right type.
When the client terminates, we really don't care if it exited cleanly,
with an error or killed by a signal. We expect the client to never
exit and so all these situations are equally bad for us. Introduce a
new TERMINATED state instead of reusing existing FAIL or DONE states,
which are set when receiving particular events from the client.
The @was_active flag indicates that we started DHCP on an assumed
connection. The idea is that if DHCP succeeded before, any failure
must be treated like a renewal failure (and so it should start a grace
period) rather than a failure in getting an initial lease (which fails
the IP method).
When we clean up the DHCP instance, the flag must be reset to FALSE,
otherwise it will be potentially considered for other connections.
The effect of a DHCPv6 failure should depend only on current IP state.
This in the analogous of commit bd63d39252 ("dhcp: fix handling of
failure events") for IPv6.
We already have nm_utils_bin2hexstr() and _nm_utils_bin2hexstr_full().
This is confusing.
- nm_utils_bin2hexstr() is public API of libnm. Also, it has
a last argument @final_len to truncate the string at that
length.
It uses no delimiter and lower-case characters.
- _nm_utils_bin2hexstr_full() does not do any truncation, but
it has options to specify a delimiter, the character case,
and to update a given buffer in-place. Also, like
nm_utils_bin2hexstr() and _nm_utils_bin2hexstr() it can
allocate a new buffer on demand.
- _nm_utils_bin2hexstr() would use ':' as delimiter and make
the case configurable. Also, it would always allocate the returned
buffer.
It's too much and confusing. Drop _nm_utils_bin2hexstr() which is internal
API and just a wrapper around _nm_utils_bin2hexstr_full().
NMAcdManager is a rather simple instance.
It does not need (thread-safe) ref-counting, in fact, having
it ref-counted makes it slighly ugly that we connect a signal,
but never bother to disconnect it (while the ref-counted instance
could outlife the signal subscriber).
We also don't need GObject signals. They have more overhead
and are less type-safe than a regular function pointers. Signals
would make sense, if there could be multiple independent listeners,
but that just doesn't make sense.
Implementing it as a plain struct is less lines of code, and less
runtime over head.
Also drop the possiblitiy to reset the NMAcdManager instance.
It wasn't needed and I think it was buggy because it wouldn't
reset the n-acd instance.
https://github.com/NetworkManager/NetworkManager/pull/213
Kernel complains with:
platform-linux: link: change 10: token: set IPv6 address generation token to ::bbbb
platform-linux: do-change-link[10]: failure changing link: failure 22 (Invalid argument)
if we try to set an IPv6 token in ip_config_merge_and_apply() and we
haven't set accept_ra=1 yet. Since the flag is set when starting
router discovery, ensure that we don't try to set a token before that.
https://github.com/NetworkManager/NetworkManager/pull/214https://bugzilla.redhat.com/show_bug.cgi?id=1560652
In update update_ext_ip_config() we remove from various internal
configurations those addresses and routes that were removed externally
by users.
When the interface is brought down, the kernel automatically removes
routes associated with it and so we should not consider them as
"removed by users".
Instead, keep them so that they can be restored when the interface
comes up again.
device_link_changed() can't use nm_device_update_from_platform_link()
to update the device private fields because the latter overwrites
priv->iface and priv->up, and so the checks below as:
if (info.name[0] && strcmp (priv->iface, info.name) != 0) {
and:
was_up = priv->up;
priv->up = NM_FLAGS_HAS (info.n_ifi_flags, IFF_UP);
...
if (priv->up && !was_up) {
never succeed.
Fixes: d7f7725ae8
Nothing changes practically, as the NMDevice still starts this with
AF_UNSPEC. That is going to change in the following commit.
The ugly part:
priv->concheck_x[0] in few places. I believe we shouldn't be using union
aliasing here, and instead of indexing the v4/v6 arrays by a boolean it
should be an enum. I'm not fixing it here, but I eventually plan to if
this gets an ACK.
This allows us to use the correct DNS server for the particular interface
independent of what the system resolver is configured to use.
The ugly part:
Unfortunately, it is not all that easy. The libc's libresolv.so API does
not provide means for influencing neither interface nor name servers used
for DNS resolving.
Curl can also be compiled with c-ares resolver backend that does provide
the necessary functionality, but it requires and extra library and the
Linux distributions don't seem to enable it. (Fedora doesn't, which is a
good sign we don't have an option of relying on it.)
systemd-resolved does provide everything we need. If we take care to
keep its congfiguration up to date, we can use it to do the resolving on
a particular interface with that interface's DNS configuration. Great!
There's one more problem: Curl doesn't provide callbacks for resolving
host names. It doesn't, however, allow us to pass in the pre-resolved
hostnames in form of an CURLOPT_RESOLVE(3) option. This means we have to
parse the host name out of the URL ourselves. Fair enough I guess...
Before:
"manager: check_if_startup_complete returns FALSE because of eth0"
Now:
"manager: startup complete is waiting for device 'eth0' (autoactivate)"
Also, the logging line is now more a human readable sentence, but still
follows the same pattern as later
"manager: startup complete"
Meaning: grepping for "startup complete" becomes more helpful because
one first finds the reasons why startup-complete is not yet reached,
followed by the moment when it is reached.
Using these unormalized was wrong all along, but by chance didn't hit
paths that needed normalized connections. This may change if we
actually write in memory connections to /run with the keyfile plugin,
because that one wants them normalized.
This also saves some work, because normalization does boring things for
us, such as adding default ipv4/ipv6/proxy settings everywhere.
On networked boot we need to somehow communicate this to the early boot
machinery. Sadly, no DBus there and we're running in configure-and-quit
mode.
Abusing the state file for this sounds almost reasonable and is
reasonably straightforward thing to do.
libcurl does not allow removing easy-handles from within a curl
callback.
That was already partly avoided for one handle alone. That is, when
a handle completed inside a libcurl callback, it would only invoke the
callback, but not yet delete it. However, that is not enough, because
from within a callback another handle can be cancelled, leading to
the removal of (the other) handle and a crash:
==24572== at 0x40319AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24572== by 0x52DDAE5: Curl_close (url.c:392)
==24572== by 0x52EC02C: curl_easy_cleanup (easy.c:825)
==24572== by 0x5FDCD2: cb_data_free (nm-connectivity.c:215)
==24572== by 0x5FF6DE: nm_connectivity_check_cancel (nm-connectivity.c:585)
==24572== by 0x55F7F9: concheck_handle_complete (nm-device.c:2601)
==24572== by 0x574C12: concheck_cb (nm-device.c:2725)
==24572== by 0x5FD887: cb_data_invoke_callback (nm-connectivity.c:167)
==24572== by 0x5FD959: easy_header_cb (nm-connectivity.c:435)
==24572== by 0x52D73CB: chop_write (sendf.c:612)
==24572== by 0x52D73CB: Curl_client_write (sendf.c:668)
==24572== by 0x52D54ED: Curl_http_readwrite_headers (http.c:3904)
==24572== by 0x52E9EA7: readwrite_data (transfer.c:548)
==24572== by 0x52E9EA7: Curl_readwrite (transfer.c:1161)
==24572== by 0x52F4193: multi_runsingle (multi.c:1915)
==24572== by 0x52F5531: multi_socket (multi.c:2607)
==24572== by 0x52F5804: curl_multi_socket_action (multi.c:2771)
Fix that, by never invoking any callbacks when we are inside a libcurl
callback. Instead, the handle is marked for completion and queued. Later,
we complete all queue handles separately.
While at it, drop the @error argument from NMConnectivityCheckCallback.
It was only used to signal cancellation. Let's instead signal that via
status NM_CONNECTIVITY_CANCELLED.
https://bugzilla.gnome.org/show_bug.cgi?id=797136https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1792745https://bugzilla.opensuse.org/show_bug.cgi?id=1107197https://github.com/NetworkManager/NetworkManager/pull/207
Fixes: d8a31794c8
Later we want to fully support wireguard devices. Also,
possibly activating a generic profile in a wireguard device
would make sense.
Anyway, for the moment, just prevent that from happening
by explicitly marking the device as unmanaged.
(cherry picked from commit e3bd482329)
Currently, NMDeviceWireguard does neither set connection_type_check_compatible
nor implement check_connection_compatible. That means, it appears to be compatible
with every connection profile, which is obviously wrong.
Allow devices not to implement check_connection_compatible() and avoid the issue
by rejecting profiles by default.
(cherry picked from commit baa0008313)
NMConnection is an interface, which is implemented by the types
NMSimpleConnection (libnm-core), NMSettingsConnection (src) and
NMRemoteConnection (libnm).
NMSettingsConnection does a lot of things already:
1) it "is-a" NMDBusObject and exports the API of a connection profile
on D-Bus
2) it interacts with NMSettings and contains functionality
for tracking the profiles.
3) it is the base-class of types like NMSKeyfileConnection and
NMIfcfgConnection. These handle how the profile is persisted
on disk.
4) it implements NMConnection interface, to itself track the
settings of the profile.
3) and 4) would be better implemented via delegation than inheritance.
Address 4) and don't let NMSettingsConnection implemente the NMConnection
interface. Instead, a settings-connection references now a NMSimpleConnection
instance, to which it delegates for keeping the actual profiles.
Advantages:
- by delegating, there is a clearer separation of what
NMSettingsConnection does. For example, in C we often required
casts from NMSettingsConnection to NMConnection. NMConnection
is a very trivial object with very little logic. When we have
a NMConnection instance at hand, it's good to know that it is
*only* that simple instead of also being an entire
NMSettingsConnection instance.
The main purpose of this patch is to simplify the code by separating
the NMConnection from the NMSettingsConnection. We should generally
be aware whether we handle a NMSettingsConnection or a trivial
NMConnection instance. Now, because NMSettingsConnection no longer
"is-a" NMConnection, this distinction is apparent.
- NMConnection is implemented as an interface and we create
NMSimpleConnection instances whenever we need a real instance.
In GLib, interfaces have a performance overhead, that we needlessly
pay all the time. With this change, we no longer require
NMConnection to be an interface. Thus, in the future we could compile
a version of libnm-core for the daemon, where NMConnection is not an
interface but a GObject implementation akin to NMSimpleConnection.
- In the previous implementation, we cannot treat NMConnection immutable
and copy-on-write.
For example, when NMDevice needs a snapshot of the activated
profile as applied-connection, all it can do is clone the entire
NMSettingsConnection as a NMSimpleConnection.
Likewise, when we get a NMConnection instance and want to keep
a reference to it, we cannot do that, because we never know
who also references and modifies the instance.
By separating NMSettingsConnection we could in the future have
NMConnection immutable and copy-on-write, to avoid all unnecessary
clones.
Add a helper function nm_device_parent_find_for_connection() to
unify implementations of setting the parent in update_connection().
There is some change in behavior, in particular for nm-device-vlan.c,
which no longer compares the link information from platform. But
update_connection() is anyway a questionable concept, only used
for external assumed connection (which itself, is questionable). Meaning,
update_connection() is a hack not science, and it's not at all clear
what the correct behavior is.
Also, note how vlan's implementation differs from all others. Why?
Should we always resort to also check the information from platform?
Either way, one of the two approaches should be used consistently and
nm_device_parent_find_for_connection() opts to not consult platform
cache.
These should be logged on DEBUG level:
<warn> platform-linux: do-change-link[2]: failure changing link: failure 97 (Address family not supported by protocol)
<warn> device (wlo1): failed to enable userspace IPv6LL address handling (unspecified)
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/10