If an address is removed during pruning and it had the TENTATIVE flag
before, the most likely cause of the removal is that it failed DAD. It
could also be that the user removed it at the same time we needed to
resync the platform cache, but that seems more unlikely.
Cleanup the handling of close().
First of all, closing an invalid (non-negative) file descriptor (EBADF) is
always a serious bug. We want to catch that. Hence, we should use nm_close()
(or nm_close_with_error()) which asserts against such bugs. Don't ever use
close() directly, to get that additional assertion.
Also, our nm_close() handles EINTR internally and correctly. Recent
POSIX defines that on EINTR the close should be retried. On Linux,
that is never correct. After close() returns, the file descriptor is
always closed (or invalid). nm_close() gets this right, and pretends
that EINTR is a success (without retrying).
The majority of our file descriptors are sockets, etc. That means,
often an error from close isn't something that we want to handle. Adjust
nm_close() to return no error and preserve the caller's errno. That is
the appropriate reaction to error (ignoring it) in most of our cases.
And error from close may mean that there was an IO error (except EINTR
and EBADF). In a few cases, we may want to handle that. For those
cases we have nm_close_with_error().
TL;DR: use almost always nm_close(). Unless you want to handle the error
code, then use nm_close_with_error(). Never use close() directly.
There is much reading on the internet about handling errors of close and
in particular EINTR. See the following links:
https://lwn.net/Articles/576478/https://askcodes.net/coding/what-to-do-if-a-posix-close-call-fails-https://www.austingroupbugs.net/view.php?id=529https://sourceware.org/bugzilla/show_bug.cgi?id=14627https://news.ycombinator.com/item?id=3363819https://peps.python.org/pep-0475/
When there are many VFs the default buffer size of 1 memory page is
not enough. Each VF can take up to ~120 bytes and so when the page
size is 4KiB at most ~34 VFs can be added.
Specify the buffer size when allocating the message.
Add a len argument to nlmsg_alloc() and nlmsg_alloc_simple(). After
that, nlmsg_alloc_size() can be dropped. Also, rename
nlmsg_alloc_simple() to nlmsg_alloc_new().
Prefer it over strncmp(), because it seems easier to understand (to me).
Prefer it over g_str_has_prefix(), because it can directly expand
to a plain strncmp() -- instead of first humping to glib, then calling
strlen() before calling strncmp().
Turns out, modern rmnet_* devices doesn't use ARPHRD_ETHER arptype, but
ARPHRD_RAWIP. Also complicating the fact is that ARPHRD_RAWIP is
actually added in v4.14, but devices using kernel before that version
define this value as "530" in an out-of-tree patch [1].
Recognize this case and check explicitly for 3 values of arptype.
[1] 54948008c2https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1392
Later, we should move all such objects. And we should rename
the API to have a unique prefix, like "NMPPlObjIP[4]Address".
This is just a first step that introduces more inconsistencies than it
solves. It will get better afterwards.
"nmp-base.h" really should only contain simple defines like enum types
or #define. As such, it almost does not need a source file.
However, the enum-to-string methods for the enums of "nmp-base.h" need a
place. Add "nmp-base.c" for that.
Our naming in libnm-platform is bad.
We have NMPlatform, which is a cache of objects. Consequently we have
platform methods like nm_platform_get_link().
We also have various other types that share the NMPlatform prefix, like
NMPlatformIP4Address. For those we have nm_platform_ip4_address_to_string().
"methods" of a type should have the same prefix as the type,
and we should not have types that share the same prefix.
Also, "NMPlatformIP4Address" is a long name, and inconsistent with the
strongly related NMPObjectIP4Address.
Add new files to move and rename parts of the platform API.
This changes a few places where we might have looked up the ifname in
NMPlatform to only print the ifindex. Since the ifindex is the real identifier,
and the logfile is already full of lines that associate the ifname with the ifindex,
this is fine.
NMPGlobalTracker allows to track objects for independent users/callers.
That is, callers that are not aware whether another caller tracks the
same/similar object. It thus groups all objects by their nmp_object_id_equal()
(as `TrackObjData` struct), while keeping a list of each individually tracked
object (as `TrackData` struct which honors the object and the user-tag parameter).
When the same caller (based on the user-tag) tracks the same object again, then
NMPGlobalTracker will only track it once and combine the objects. That is done by
also having a dictionary for the `TrackData` entries (`self->by_data`).
This latter dictionary lookup wrongly considered nmp_object_id_equal().
Instead, it needs to consider all minor differences of the objects, and
use nmp_object_equal().
For example, for NMPlatformMptcpAddress, only the "address" is part of
the ID. Other fields, like the MPTCP flags are not. Imagine a profile is
active with MPTCP endpoints configured with flags "subflow". During reapply,
the user can only update the MPTCP flags (e.g. to "signal"). When that happens,
the caller (NML3Cfg) would track a new NMPlatformMptcpAddress instance, that only
differs by MPTCP flags. In this case, we need to track the new address for the
differences that it has according to nmp_object_equal(), and not
nmp_object_id_equal().
Due to this bug, reapply might not work correctly. For other supported types (routing
rules and routes) this bug may have been harder to reproduce, because most attributes
of rules/routes are also part of the ID and because it's uncommon to reapply a minor
change to a rule/route.
https://bugzilla.redhat.com/show_bug.cgi?id=2120471
Fixes: b8398b9e79 ('platform: add NMPRulesManager for syncing routing rules')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1375
These variants provide additional nm_assert() checks, and are thus
preferable.
Note that we cannot just blindly replace &g_array_index() with
&nm_g_array_index(), because the latter would not allow getting a
pointer at index [arr->len]. That might be a valid (though uncommon)
usecase. The correct replacement of &g_array_index() is thus
nm_g_array_index_p().
I checked the code manually and replaced uses of nm_g_array_index_p()
with &nm_g_array_index(), if that was a safe thing to do. The latter
seems preferable, because it is familar to &g_array_index().
On netlink API, the attribute is indeed u32. However, this is an ifindex
which in most other kernel APIs and in NetworkManager code is a signed
integer. Note that of course kernel would only ever assign numbers that
are valid ifindexes, thus in the suitable range.
The IFLA_BOND_ACTIVE_SLAVE and IFLA_BOND_PRIMARY are not the same.
If the primary is not set, then that's it. Don't fallback.
Only NetworkManager API deprecated "active-slave" and uses it as
alias for "primary". That does not mean, kernel/netlink does that.
This optimization seems unnecessary. Just initialize a new route struct
and use it. The advantage is that we can have the variable in the scope
closer to where it's used, and don't need to think about what happens
outside the scope.
Previously, nm_platform_ip_address_sync() would always add the "IFA_F_NOPREFIXROUTE"
flag. Add a way to let the caller control that.
Add a flags argument, with a new flag "with-noprefixroute". By default
(with flags "none"), nm_platform_ip_address_sync() would no longer
add "IFA_F_NOPREFIXROUTE" flag, but the caller can now opt-in to that.
The purpose is that on "lo" interface we will want to let kernel
handle the prefix route. So have a per-ifindex opt-in for controlling
this.
During nm_platform_ip_address_flush() we use "none" flags, because the
function anyway doesn't add any addresses, so it wouldn't matter.
There is no change in behavior.
Co-authored-by: Thomas Haller <thaller@redhat.com>
This flag won't be used. Instead we will pass a flag to
nm_platform_ip_route_sync() to disable addition of the prefix route
flag.
This reverts commit bd84ae4dc5.
- name things related to `in_addr_t`, `struct in6_addr`, `NMIPAddr` as
`nm_ip4_addr_*()`, `nm_ip6_addr_*()`, `nm_ip_addr_*()`, respectively.
- we have a wrapper `nm_inet_ntop()` for `inet_ntop()`. This name
of our wrapper is chosen to be familiar with the libc underlying
function. With this, also name functions that are about string
representations of addresses `nm_inet_*()`, `nm_inet4_*()`,
`nm_inet6_*()`. For example, `nm_inet_parse_str()`,
`nm_inet_is_normalized()`.
<<<<
R() {
git grep -l "$1" | xargs sed -i "s/\<$1\>/$2/g"
}
R NM_CMP_DIRECT_IN4ADDR_SAME_PREFIX NM_CMP_DIRECT_IP4_ADDR_SAME_PREFIX
R NM_CMP_DIRECT_IN6ADDR_SAME_PREFIX NM_CMP_DIRECT_IP6_ADDR_SAME_PREFIX
R NM_UTILS_INET_ADDRSTRLEN NM_INET_ADDRSTRLEN
R _nm_utils_inet4_ntop nm_inet4_ntop
R _nm_utils_inet6_ntop nm_inet6_ntop
R _nm_utils_ip4_get_default_prefix nm_ip4_addr_get_default_prefix
R _nm_utils_ip4_get_default_prefix0 nm_ip4_addr_get_default_prefix0
R _nm_utils_ip4_netmask_to_prefix nm_ip4_addr_netmask_to_prefix
R _nm_utils_ip4_prefix_to_netmask nm_ip4_addr_netmask_from_prefix
R nm_utils_inet4_ntop_dup nm_inet4_ntop_dup
R nm_utils_inet6_ntop_dup nm_inet6_ntop_dup
R nm_utils_inet_ntop nm_inet_ntop
R nm_utils_inet_ntop_dup nm_inet_ntop_dup
R nm_utils_ip4_address_clear_host_address nm_ip4_addr_clear_host_address
R nm_utils_ip4_address_is_link_local nm_ip4_addr_is_link_local
R nm_utils_ip4_address_is_loopback nm_ip4_addr_is_loopback
R nm_utils_ip4_address_is_zeronet nm_ip4_addr_is_zeronet
R nm_utils_ip4_address_same_prefix nm_ip4_addr_same_prefix
R nm_utils_ip4_address_same_prefix_cmp nm_ip4_addr_same_prefix_cmp
R nm_utils_ip6_address_clear_host_address nm_ip6_addr_clear_host_address
R nm_utils_ip6_address_same_prefix nm_ip6_addr_same_prefix
R nm_utils_ip6_address_same_prefix_cmp nm_ip6_addr_same_prefix_cmp
R nm_utils_ip6_is_ula nm_ip6_addr_is_ula
R nm_utils_ip_address_same_prefix nm_ip_addr_same_prefix
R nm_utils_ip_address_same_prefix_cmp nm_ip_addr_same_prefix_cmp
R nm_utils_ip_is_site_local nm_ip_addr_is_site_local
R nm_utils_ipaddr_is_normalized nm_inet_is_normalized
R nm_utils_ipaddr_is_valid nm_inet_is_valid
R nm_utils_ipx_address_clear_host_address nm_ip_addr_clear_host_address
R nm_utils_parse_inaddr nm_inet_parse_str
R nm_utils_parse_inaddr_bin nm_inet_parse_bin
R nm_utils_parse_inaddr_bin_full nm_inet_parse_bin_full
R nm_utils_parse_inaddr_prefix nm_inet_parse_with_prefix_str
R nm_utils_parse_inaddr_prefix_bin nm_inet_parse_with_prefix_bin
R test_nm_utils_ip6_address_same_prefix test_nm_ip_addr_same_prefix
./contrib/scripts/nm-code-format.sh -F
The removed signal did not log the interface name.
That's because _LOG3D() takes the ifindex and looks into the platform
cache to find the interface name. However, if the link is already
removed, it won't find it.
Fix that by explicitly using the right name.
Before:
<debug> [1660070838.2976] platform: signal: link removed: 602: testX6 <DOWN;broadcast,multicast> mtu 1500
Now:
<debug> [1660070838.2976] platform: (testX6) signal: link removed: 602: testX6 <DOWN;broadcast,multicast> mtu 1500
- drop unused "keep_deleted" parameter. It just doesn't make sense.
Even less sense than for rules/routes, where this was taken from.
- fix nmp_global_tracker_sync_mptcp_addrs() to delete addresses
with conflicting flags. We did not correctly delete existing
addresses, that were to be reconfigured with different flags.
Fixes: 5374c403d2 ('platfrom: handle MPTCP addresses with NMPGlobalTracker')
When we configure MPTCP addresses, we usually do so per interface
(ifindex). That is, because each interface (via NMDevice and NML3Cfg)
decides how to configure MPTCP, and then we always add MTCP addresses
for this certain ifindex.
With that, we could have a purely interface-specific view and not a
global sync method. However, there are two problems:
The minor problem is that we don't cache the endpoints (because we don't
get notifications). We can only get a dump of all endpoints. It seems
odd to have a mptcp-addr-sync method that is per-ifindex, when it needs
to dump all addresses.
The much more important reason is that the number of endpoints that we
can configure in kernel is very limited. So we need to make a choice
which endpoints to configure, and for that we need to holistic view that
NMPGlobalTracker has.
Since the generic netlink API does (currently) not support notifications
about changes of the MPTCP addresses, we won't get notifications when
they change, and it seems wrong to put such things in the NMPlatform
cache.
We can just get the list of endpoints by polling, so add a function
nm_platform_mptcp_addrs_dump() for that.
Also, add nm_platform_mptcp_addr_update() which can add/remove/update
MPTCP addresses.
We already have two hash functions for MPTCP addresses:
nmp_object_id_hash*() which compares the identity of objects
and nm_platform_mptcp_addr_hash*(), which compares all fields.
There is also a need to hash only the address. Add it. Will be used
next.
The ID of an object does not entail all properties/attributes.
During sync, if we already have an object with the same ID configured,
it may still differ in other aspects.
Handle those cases, by deleting such routes/rules before adding the
desired object.
Since we don't get netlink notifications when the MPTCP endpoints
change, we don't cache them. And since we don't cache them,
there is less need to mark whether they were received from kernel
or created internally.
It's not very clear what the best identity is.
For example, in kernel you cannot add two MPTCP addresses that only differ by
ifindex. Thus (as far as kernel is concerned), the ifindex is not part of the
identity. Still, as we will have an interface centric view, this will be
useful for us.
On the other hand, to kernel is the "id" a second primary key, along
side "addr:port". However, to us it's not useful to consider that as
part of nmp_object_id_equal(), because usually kernel will pick an "id"
for us, and when we track objects that we are about to add, they don't
have an "id" yet.
So, adjust nmp_object_id_equal(). However -- somewhat unusual -- let it
deviate from kernel's understanding of what defines an MPTCP address.
CentOS 7's headers don't yet contains IFLA_BOND_PEER_NOTIF_DELAY.
Define it ourselves.
Fixes: f900f7bc2c ('platform: add netlink support for bond link')
sysfs is deprecated and kernel people will not add new bond options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
Add parentheses around macro arguments.
Yes, it's not technically necessary when using macro arguments are
surrounded by commas. Still do it, for consistency and for not having
special exceptions to the rule.
Don't mix <net/ethernet.h> and <linux/if_ether.h>.
Fixes the following build error with musl libc:
In file included from /usr/include/net/ethernet.h:10,
from ../src/libnm-platform/nm-linux-platform.c:17:
/usr/include/netinet/if_ether.h:115:8: error: redefinition of 'struct ethhdr'
115 | struct ethhdr {
| ^~~~~~
In file included from ../src/linux-headers/ethtool.h:19,
from ../src/libnm-std-aux/nm-linux-compat.h:22,
from ../src/libnm-platform/nm-linux-platform.c:10:
/usr/include/linux/if_ether.h:169:8: note: originally defined here
169 | struct ethhdr {
| ^~~~~~
Fixes: dc98ab807c ('platform: include "linux-headers" via "libnm-std-aux/nm-linux-compat.h"')