Completely rework IP configuration in the daemon. Use NML3Cfg as layer 3
manager for the IP configuration of an interface. Use NML3ConfigData as
pieces of configuration that the various components collect and
configure. NMDevice is managing most of the IP configuration at a higher
level, that is, it starts DHCP and other IP methods. Rework the state
handling there.
This is a huge rework of how NetworkManager daemon handles IP
configuration. Some fallout is to be expected.
It appears the patch deletes many lines of code. That is not accurate, because
you also have to count the files `src/core/nm-l3*`, which were unused previously.
Co-authored-by: Beniamino Galvani <bgalvani@redhat.com>
Give a consistent name.
A bit odd are now the names nm_g_bytes_hash() and nm_g_bytes_equal()
as they go together with nm_pg_bytes_hash()/nm_pg_bytes_equal().
But here the problem is more with the naming of "nm_p*_{equal,hash}()"
functions, which probably should be renamed to "nm_*_ptr_{equal,hash}()".
While both functions are basically the same, the majority of the time
we use g_snprintf(). There is no strong reason to prefer one or the
other, but let's keep using one variant.
The user of NMDhcpClient is supposed to call "nm_dhcp_client_stop()"
on the instance, also because it is a ref-counted GObject type. So we
wouldn't want to rely on the last unref to stop DHCP.
Still, in case that was not done, the code somehow made the assumption
it made any sense to possibly not stop the DHCP client. For the internal
client, there is of course nothing left after destroying the
NMDhcpClient instance, but what about dhclient? I don't think we should
ever leave dhclient running unsupervised. Even during restart of the
service, we need to stop it first, and restart it afterwards.
When we quit NetworkManager, we may want to leave the interface up and
configured. For that, we may need to take care that "nm_dhcp_client_stop()"
does not destory the IP configuration of the intrerface. But I don't
think that is a problem. What we still want to do, is kill the dhclient
instance.
NMDhcpClient is not supposed to be ever restarted. It starts when an
instance gets created, and it stops with "nm_dhcp_client_stop()". Hence,
the simple "is_stopped" flag is fine to prevent that multiple stop calls
are harmful.
NMDhcpManager was tracking DHCP clients. During start, it would check
whether an instance for the same ifindex is running, and stop it.
That seems unnecessary and wrong. Clearly, we cannot have multiple users
(like two `NMDevice`s) run DHCP on the same interface. But its up to
them to coordinate that. They also cannot configure IP addresses at the
same interface, and if they do, then there is a big problem already.
This comes from commit 1806235049 ('dhcp: convert dhcp backends to
classes'). Maybe back then there was also the idea that NetworkManager
could quit and leave dhclient running. That idea is also flawed. When
NetworkManager stops, it leaves the interface (possibly) up, so that
restart works without disruption. That does not mean that the DHCP
client needs to keep running. What works is to restart NetworkManager in
a timely manner, then NetworkManager will start a new DHCP client after
restart. What does not work is stop NetworkManager, do nothing (like
taking over the interface by running your own manager) and expect that
DHCP keeps working indefinitely. And of course, with the internal client
this cannot possibly work either. Don't stop NetworkManager for good, if
you expect NetworkManager to run DHCP on an interface.
A different things is that when NetworkManager crashes, that after
restart it kills the left over dhclient instances. That may require a
solution, for example systemd killing all processes or checking for
left-over PID files and kill the processes. But what was implemented in
NMDhcpManager was not a solution for that.
As such, it's not clear what conflicting instance we want to kill, or
why NMDhcpManager should even track NMDhcpClient instances.
NMDhcpClient communicates events via GObject signals. GObject signals in
principle could have several subscribers. In practice, a NMDhcpClient
instance has only one subscriber, because it was constructed with
certain parameters, so it's unlikely to be shared.
That one subscriber, always needs to subscribe to all signals
("state-changed" and "prefix-delegated"), Unless the subscriber only
creates a IPv4 client. In which case they won't subscribe to
"prefix-delegated", but that signal is also not invoked for IPv4
clients.
Combine the signals in one, and pass all parameters via a new
NMDhcpClientNotfiyData payload. I feel this is nicer, to pack all
parameters together. I find this more type-aware, where we can
switch (in the callback) based on a notify-type enum, instead
of subscribing multiple signal handlers.
With l3cfg work, DHCP handling will be refactored, where this model of
having one "generic" notify signal makes more sense than here. For the
moment, it is arguably pretty much the same. Also, because NMDhcpClient
subscribes two different handlers for IPv4 and IPv6. In the future,
there will be only one notify handler, and that can cover IPv4 and IPv6
and both "state-changed" and "prefix-delegated" (and other notification
types).
Naming is important, because the name of a thing should give you a good
idea what it does. Also, to find a thing, it needs a good name in the
first place. But naming is also hard.
Historically, some strv helper API was named as nm_utils_strv_*(),
and some API had a leading underscore (as it is internal API).
This was all inconsistent. Do some renaming and try to unify things.
We get rid of the leading underscore if this is just a regular
(internal) helper. But not for example from _nm_strv_find_first(),
because that is the implementation of nm_strv_find_first().
- _nm_utils_strv_cleanup() -> nm_strv_cleanup()
- _nm_utils_strv_cleanup_const() -> nm_strv_cleanup_const()
- _nm_utils_strv_cmp_n() -> _nm_strv_cmp_n()
- _nm_utils_strv_dup() -> _nm_strv_dup()
- _nm_utils_strv_dup_packed() -> _nm_strv_dup_packed()
- _nm_utils_strv_find_first() -> _nm_strv_find_first()
- _nm_utils_strv_sort() -> _nm_strv_sort()
- _nm_utils_strv_to_ptrarray() -> nm_strv_to_ptrarray()
- _nm_utils_strv_to_slist() -> nm_strv_to_gslist()
- nm_utils_strv_cmp_n() -> nm_strv_cmp_n()
- nm_utils_strv_dup() -> nm_strv_dup()
- nm_utils_strv_dup_packed() -> nm_strv_dup_packed()
- nm_utils_strv_dup_shallow_maybe_a() -> nm_strv_dup_shallow_maybe_a()
- nm_utils_strv_equal() -> nm_strv_equal()
- nm_utils_strv_find_binary_search() -> nm_strv_find_binary_search()
- nm_utils_strv_find_first() -> nm_strv_find_first()
- nm_utils_strv_make_deep_copied() -> nm_strv_make_deep_copied()
- nm_utils_strv_make_deep_copied_n() -> nm_strv_make_deep_copied_n()
- nm_utils_strv_make_deep_copied_nonnull() -> nm_strv_make_deep_copied_nonnull()
- nm_utils_strv_sort() -> nm_strv_sort()
Note that no names are swapped and none of the new names existed
previously. That means, all the new names are really new, which
simplifies to find errors due to this larger refactoring. E.g. if
you backport a patch from after this change to an old branch, you'll
get a compiler error and notice that something is missing.
In NetworkManager.conf, we can only configure one "[main].dhcp="
for both address families. Consequently, NMDhcpClientFactory
represents also both address families. However, most plugins
don't support IPv4 and IPv6 together.
Thus, if a plugin does not support an address family, we fallback
to the implementation of the "internal" plugin.
Slightly rework the code how that is done. Instead of having
a "get_type()" and "get_type_per_addr_family()" callback, have
an IPv4 and IPv6 getter.
nettools plugin represents the way how to do it, and other plugins
should mimic that behavior. The nettools implementation adds private
DHCP options as hex, except the options
- 249 (Microsoft Classless Static Route)
- 252 (Web Proxy Auto Discovery Protocol)
Adjust systemd plugin to do the same.
For 252, we now parse the "wpad" option differently. The change in
behavior is that the property is now no longer exposed as hexstring,
but as backslash escaped plain text.
For 249, the option is not implemented. But stop adding the option as
hex-string too.
A user might configure /etc/dhcpcd.conf to contain static fallback addresses.
In that case, the dhcpcd plugin reports the state as "static". Let's treat
that the same way as bound.
Note that this is not an officially supported or endorsed way of
configuring fallback addresses in NetworkManager. Rather, when using
DHCP plugins, the user can hack the system and make unsupported
modifications in /etc/dhcpcd.conf or /etc/dhcp. This change only makes
it a bit easier to do it.
See-also: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/579#note_922758https://bugzilla.gnome.org/show_bug.cgi?id=768362
Based-on-patch-by: gordonb3 <gordon@bosvangennip.nl>
Instead of passing the setting on during ip4_start()/ip6_start(), make
it a property of NMDhcpClient.
This property is currently only set by OLPC devices, and is only
implemented by NMDhcpDhclient. As such, it also does not need to change
or get reset. Hence, and immutable, construct-only property is clearer,
because we don't have to pass parameters to ip[46]_start().
Arguably, the parameter is still there, but being immutable and always
set, make it easier to reason about it.
Probably pid_t is always signed, because kill() documents that
negative values have a special meaning (technically, C would
automatically cast negative signed values to an unsigned pid_t type
too).
Anyway, NMDhcpClient at several places uses -1 as special value for "no
pid". At the same time, it checks for valid PIDs with "pid > 1". That
only works if pid_t is signed.
Add a static assertion for that.
Coverity says:
Error: ALLOC_FREE_MISMATCH (CWE-762):
NetworkManager-1.31.3/src/core/dhcp/nm-dhcp-systemd.c:234: alloc: Allocation of memory which must be freed using "free".
NetworkManager-1.31.3/src/core/dhcp/nm-dhcp-systemd.c:447: free: Calling "_nm_auto_g_free" frees "routes" using "g_free" but it should have been freed using "free".
# 445| }
# 446| NM_SET_OUT(out_options, g_steal_pointer(&options));
# 447|-> return g_steal_pointer(&ip4_config);
# 448| }
# 449|
Fixes: acc0d79224 ('systemd: merge branch 'systemd' into master')
For infiniband, request_broadcast is automatically (and always) enabled.
Otherwise, we usually don't enable it, and (unlike systemd-networkd),
there is currently no configuration option to enable it.
Still honor the new udev property that can indicate to enable the flag
per device.
See-also: https://github.com/systemd/systemd/pull/ ### 19346
The DHCP client has potentially a large number of options,
including boolean options (flags). It is cumbersome to implement
them one by one. Instead, make more prominent use of NMDhcpClientFlags.
NM_DHCP_STATE_DONE is for when the client reports that it is shutting
down. If we manually stop it, we should set the TERMINATED state, so
that NMDevice doesn't start a grace period waiting for a renewal.
This fixes the:
device (enp1s0): DHCPv4: trying to acquire a new lease within 90 seconds
message printed when NM is shutting down.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/802
This file was intended to be used by VPN plugins (by copying it).
However, it was also used internally.
Split the file, and move the internally used part to libnm-glib-aux.
The part that is only there for out of tree users, moves to
"nm-compat.h".
glib requires G_LOG_DOMAIN defined so that log messages are labeled
to belong to NetworkManager or libnm.
However, we don't actually want to use glib logging. Our library libnm
MUST not log anything, because it spams the user's stdout/stderr.
Instead, a library must report notable events via its API. Note that
there is also LIBNM_CLIENT_DEBUG to explicitly enable debug logging,
but that doesn't use glib logging either.
Also, the daemon does not use glib logging instead it logs to syslog.
When run with `--debug`.
Hence, it's not useful for us to define different G_LOG_DOMAIN per
library/application, because none of our libraries/applications should
use glib logging.
It also gets slightly confusing, because we have the static library like
`src/libnm-core-impl`, which is both linked into `libnm` (the library)
and `NetworkManager` (the daemon). Which logging domain should they use?
Set the G_LOG_DOMAIN to "nm" everywhere. But no longer do it via `-D`
arguments to the compiler.
See-also: https://developer.gnome.org/glib/stable/glib-Message-Logging.html#G-LOG-DOMAIN:CAPS
"libnm-core/" is rather complicated. It provides a static library that
is linked into libnm.so and NetworkManager. It also contains public
headers (like "nm-setting.h") which are part of public libnm API.
Then we have helper libraries ("libnm-core/nm-libnm-core-*/") which
only rely on public API of libnm-core, but are themself static
libraries that can be used by anybody who uses libnm-core. And
"libnm-core/nm-libnm-core-intern" is used by libnm-core itself.
Move "libnm-core/" to "src/". But also split it in different
directories so that they have a clearer purpose.
The goal is to have a flat directory hierarchy. The "src/libnm-core*/"
directories correspond to the different modules (static libraries and set
of headers that we have). We have different kinds of such modules because
of how we combine various code together. The directory layout now reflects
this.