The devices generally need to be IFF_UP and wait a little before the
carrier detection is reliable. Some devices, actually need to wait
more than a little -- r8169 needs up to 5 seconds.
For this reason, we delay startup complete while the carrier is down
after we bring the device up. We do this so that we don't reject
activations due to carrier down until we're sure it's really down.
This works well as long as it's us who brought the device up.
If we're restarting the daemon, the device is going to be already up
when we start up the daemon for the second time. There's, however, a
slim chance that the device was brought down and up very shortly before
the restart and therefore the carrier reporting is still not reliable.
As a matter of fact, we bring the devices down and back up on some
occassions, such as when enslaving to a team device.
Therefore, the following events in quick succession cause trouble:
# nmcli con up team-slave-eth0
[20099.205355] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
[20099.365641] nm-team: Port device eth0 added
[20099.370728] r8169 0000:03:00.0 eth0: Link is Down
[20099.436631] nm-team: Port device eth0 removed
[20099.463422] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
[20099.628505] r8169 0000:03:00.0 eth0: Link is Down
[20099.669425] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
[20099.833457] r8169 0000:03:00.0 eth0: Link is Down
[20099.838471] nm-team: Port device eth0 added
The device has been brought down, enslaved and brought up.
"Link is Down" indicates carrier not being detected.
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/7)
# systemctl restart NetworkManager
Now NM sees the device being up, but carrier down.
# nmcli con up testeth0
Error: Connection activation failed: No suitable device found for this connection (...).
Activation failed, because eth0 carrier still appears down.
# [20102.943464] r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Now it's up, but the party is already over. Shiet.
Let's wait whenever the device reaches unavailable state, whether we
bring it up at that point or not.
Fixes-test: @restart_L2_only_lacp
https://bugzilla.redhat.com/show_bug.cgi?id=2092361
This doesn't use NM_UNIQ_T() to create a truly unique name.
Instead, it avoids "_arr" as local variable name, which other
macros also use. By choosing a different name, we can nest such
macro calls, without getting a "-Wshadow" warning.
If you do:
nm_assert_addr_family(NMP_OBJECT_CAST_MPTCP_ADDR(obj)->addr_family));
then there are two nested NM_IN_SET() macro invocations. Once,
NMP_OBJECT_CAST_MPTCP_ADDR() checks that the object type is one of
a few selected (using NM_IN_SET()). Then, that is passed to
nm_assert_addr_family(), which checks NM_IN_SET(addr_family, AF_INET,
AF_INET6).
In general, it's easy to end up in a situation like this.
And it mostly works just fine. The only problem was that NM_IN_SET()
uses an internal, local variable "_x". The compiler will emit a very
verbose failure about the shadowed variable:
./src/libnm-std-aux/nm-std-aux.h:802:14: error: declaration of '_x' shadows a previous local [-Werror=shadow]
802 | type _x = (x); \
NM_UNIQ_T() exists for this purpose. Use it. NM_IN_SET() is
popular enough to warrant a special treatment to avoid this pitfall.
The strength of CList is of course to use it as a stack of queue,
and only append/remove from the front/tail.
However, since this is an intrusive list, it can also be useful to
just use it to track elements, and -- when necessary -- sort them
via c_list_sort().
If we have a sorted list, we might want to insert a new element
honoring the sort order. This function achieves that.
Don't mix <net/ethernet.h> and <linux/if_ether.h>.
Fixes the following build error with musl libc:
In file included from /usr/include/net/ethernet.h:10,
from ../src/libnm-platform/nm-linux-platform.c:17:
/usr/include/netinet/if_ether.h:115:8: error: redefinition of 'struct ethhdr'
115 | struct ethhdr {
| ^~~~~~
In file included from ../src/linux-headers/ethtool.h:19,
from ../src/libnm-std-aux/nm-linux-compat.h:22,
from ../src/libnm-platform/nm-linux-platform.c:10:
/usr/include/linux/if_ether.h:169:8: note: originally defined here
169 | struct ethhdr {
| ^~~~~~
Fixes: dc98ab807c ('platform: include "linux-headers" via "libnm-std-aux/nm-linux-compat.h"')
NetworkManager primarily manages interfaces in an independent fashion.
That means, whenever possible, we want to have a interface specific
view. In many cases, the underlying kernel API also supports that view.
For example, when configuring IP addresses or unicast routes, we do so
per interfaces and don't need a holistic view.
However, that is not always sufficient. For routing rules and certain
route types (blackhole, unreachable, etc), we need a system wide view
of all the objects in the network namespace.
Originally, NMPRulesManager was added to track routing rules. Then, it
was extended to also track certain route types, and the API was renamed to
NMPRouteManager.
This will also be used to track MPTCP addresses.
So rename again, to give it a general name that is suitable for what it
does. Still, the name is not great (suggestion welcome), but it should
cover the purpose of the API well enough. And it's the best I came
up with.
Rename.
For IPv6, kernel does not accept the ifa_scope parameter and always
determines the scope based on the address itself.
For IPv4, it honors whatever scope the user sets via netlink.
NetworkManager does not allow to directly configure the address
scope, but autodetects it.
Use nm_platform_ip4_address_get_scope() for detecting the scopt.
This also fixes the issue that to detect loopback addresses 127.0.0.0/8
and use scope "host".
Try:
$ nmcli device modify "$IFACE" +ipv4.addresses 127.0.0.5/8
We have our own copy of linux kernel headers, and we must never
directly include the corresponding versions from the system.
Avoid that, by only including the clones via "libnm-std-aux/nm-linux-compat.h"
and by including the compat wrapper header before other system headers.
It's similar to nm_ip_addr_cmp(), but it can be used as an argument
to g_qsort_with_data() to sort a list of NMIPAddr (or in_addr_t or
struct in6_addr).
The address family needs to be given as user-data.
By default, wpa_supplicant sets these parameters according to the
802.11 standard:
dot11RSNAConfigPMKLifetime = 43200 seconds (12 hours)
dot11RSNAConfigPMKReauthThreshold = 70%
With these, the supplicant triggers a new EAP authentication every 8
hours and 24 minutes. If the network uses one-time secrets, the
reauthentication fails and the supplicant disconnects. It doesn't seem
desirable that the client starts a reauthentication so early; bump the
lifetime to a week.
Currently, due to a bug, the new value is ignored by wpa_supplicant
when set via D-Bus. This patch needs the fix at [1], not yet merged.
[1] http://lists.infradead.org/pipermail/hostap/2022-July/040664.htmlhttps://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1306
The current behavior of "nmcli networking off" is that it starts
disconnecting the devices, but doesn't wait for them to actually
come down.
That is not too helpful: the user never knows when the network is
actually disconnected.
Some users, notably the NetworkManager-CI test suite, seem to expect the
devices are all disconnected after the command finishes. Even worse,
it immediately proceeds activating the connections:
@ovs_cloned_mac_set_on_iface
...
* Execute "nmcli networking off && nmcli networking on"
This results in pure utter chaos. In particular, the slave connections
sometimes refuse to activate after "nmcli networking on", because the
master connections are still getting disconnected in response to
preceding "nmcli networking off".
Let's make Enable(FALSE) and Sleep(TRUE) block until none of the devices
are expected to go down.
Note that this makes those call also return when Enable(TRUE) and
Sleep(FALSE) is issued in meanwhile. Therefore a return from
Enable(FALSE) doesn't necessarily imply the networking is disabled.
This is a feature, not a bug -- the actual manager state is available in
the "state" property.
Fixes-test: @ovs_cloned_mac_set_on_iface
https://bugzilla.redhat.com/show_bug.cgi?id=2093175https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1292
Usually we want no difference between the upstream project that we fork
via git-subtree, and our copy. However, for the subprojects, we need to
patch them. Do it.
If you know a better way, that allows to overwriting the subprojects
please send a patch.
n-dhcp4 only supports calling ACCEPT during the GRANTED state.
Not during a EXTENDED event. So usually, we would not want
to call accept in that case.
And we didn't. During EXTENDED event, we would usually skip ACD (because
it's either not enabled or we already passed ACD for the current address).
In that case, in _nm_dhcp_client_notify() we hit the line
if (client_event_type == NM_DHCP_CLIENT_EVENT_TYPE_BOUND && priv->l3cd_curr
&& nm_l3_config_data_get_num_addresses(priv->l3cd_curr, priv->config.addr_family) > 0)
priv->l3cfg_notify.wait_dhcp_commit = TRUE;
else
priv->l3cfg_notify.wait_dhcp_commit = FALSE;
and would not set `wait_dhpc_commit`. That means, we never called _dhcp_client_accept().
For nettools, that doesn't really matter because calling ACCEPT during EXTENDED
is invalid anyway. However, for dhclient that is fatal because we wouldn't reply the
D-Bus request from nm-dhcp-helper. The helper times out after 60 seconds and dhclient
would misbehave.
We need to fix that by also calling _dhcp_client_accept() in the case when we don't
need to wait (the EXTENDED case).
However, previously _dhcp_client_accept() was rather peculiar and didn't like to be
called in an unexpected state. Relax that. Now, when calling accept in an unexpected
state, just do nothing and signal success. That frees the caller from the complexity
to understand when they must/must not call accept.
https://bugzilla.redhat.com/show_bug.cgi?id=2109285
Fixes: 156d84217c ('dhcp/dhclient: implement accept/decline (ACD) for dhclient plugin')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1308
`nm_platform_ip_address_sync()` likes to add IFA_F_NOPREFIXROUTE flag
for all addresses, regardless of `a_ifi_flags` property. By setting this
boolean, that automatism can be suppressed, and the noprefixroute flag
does not get added automatically.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1307
Some compiler versions don't like this. Workaround.
src/libnm-platform/nm-linux-platform.c: In function event_seq_check:
src/libnm-platform/nm-linux-platform.c:7254:1: error: label at end of compound statement
out:
^~~
Fixes: 3d4906a3da ('platform: add genl socket support for events and genl family')
That seems common. It's also done by genl-ctrl-list and
iproute2's genl tool.
Also, use avoid the leading zeros (0x1c instead of 0x001c).
iproute2's genl tool does the former, libnl3's genl-ctrl-list
does the latter.
We now cache the family ID for generic netlink protocols. However,
when we for example create a wireguard interface, the kernel module
might just get autoloaded. At this point, we didn't know the family ID
yet.
We already made an effort, that if the family ID is unknown during
nm_platform_genl_get_family_id(), we would try to poll the genl socket
in the hope there is a relevant event there. However, polling the socket
also means to potentially emit all signals for any change that happen.
We don't want that, if we currently are already polling the socket.
Instead, fallback to synchronously get the family ID.
$ sudo rmmod wireguard \
./tools/run-nm-test.sh -m src/core/platform/tests/test-link-linux -p /link/software/detect/wireguard/1/external
Fixes: 3d4906a3da ('platform: add genl socket support for events and genl family')
An NMPObject is hashable, can be compared and printed. That is useful.
Make an NMPObject for MPTCP addresses. It will hold the content of
MPTCP_PM_ATTR_ADDR netlink attribute. But like other NMPObject types it
will also be used to represent the data as NetworkManager tracks it.
An object type that doesn't implement cmd_plobj_id_copy(), either:
- implements cmd_obj_copy(), but then we cannot copy the ID only
to a stack instance, because that cannot track ownership.
This is a bug in the caller. We cannot use stackinit for an
object of a type that is not a plain old data (in C++ terms).
- fallback to plain memcpy(). That is in line with nmp_object_clone().
and nmp_object_copy().
The _nm_alignas() exists to choose a suitable alignment. Since it's
on the stack, it has (almost) no overhead to just use the maximum
alignment.
That's why gint64 was chosen before. But that isn't the largest
alignment. We rely on C11 already, and we also always include <stddef.h>.
So use max_align_t instead.
NM_CMP_DIRECT_MEMCMP() gets called by NM_CMP_FIELD_MEMCMP_LEN().
For example, if you want to compare a NMIPAddr, it seems sensible
to call
NM_CMP_FIELD_MEMCMP_LEN(obj1, obj2, addr, nm_utils_addr_family_to_size(obj1->addr_family));
Granted, nm_utils_addr_family_to_size() asserts that addr_family is
either AF_INET or AF_INET6. However, if the assertion fails, we don't
want yet another undefined behavior here and do the sensible thing
about n zero.
In general, for a low-level function that uses memcmp(), it's non
obvious to ensure that the caller does not accidentally invoke undefined
behavior. nm_memcmp() avoids that.