The major point of NMDedupMultiIndex is that it can de-duplicate
the objects. It thus makes sense the everybody is using the same
instance. Make the multi-idx instance of NMPlatform configurable.
This is not used outside of unit tests, because the daemon currently
always creates one platform instance and everybody then re-uses the
instance of the platform.
While this is (currently) only used by tests, and that the performance
optimization of de-duplicating is irrelevant for tests, this is still
useful. The test can then check whether two separate NMPlatform objects
shared the same instance and whether it was de-duplicated.
We don't cache certain routes, for example based on the protocol. This is
a performance optimization to ignore routes that we usually don't care
about.
Still, if the user does `ip route replace` with such a route, then we
need to pass it to nmp_cache_update_netlink_route(), so that we can
properly remove the replaced route.
Knowing which route was replaces might be impossible, as our cache does
not contain all routes. Likely all that nmp_cache_update_netlink_route()
can to is to set "resync_required" for NLM_F_REPLACE. But for that it
should see the object first.
This also means, if we ever write a BPF filter to filter out messages
that contain NLM_F_REPLACE, because that would lead to cache inconsistencies.
The route table is part of the weak-id. You can see that with:
ip route replace unicast 1.2.3.4/32 dev eth0 table 57
ip route replace unicast 1.2.3.4/32 dev eth0 table 58
afterwards, `ip route show table all` will list both routes. The replace
operation is only per-table. Note that NMP_CACHE_ID_TYPE_ROUTES_BY_WEAK_ID
already got this right.
Fixes: 10ac675299 ('platform: add support for routing tables to platform cache')
Older versions of iproute2 don't support the fwmark option. Remove it.
Fixes: 1cf8df2f35 ('platform: support VTI tunnels')
Fixes: b669a3ae46 ('platform: support VTI6 tunnels')
G_TYPE_CHECK_INSTANCE_CAST() can trigger a "-Wcast-align":
src/core/devices/nm-device-macvlan.c: In function 'parent_changed_notify':
/usr/include/glib-2.0/gobject/gtype.h:2421:42: error: cast increases required alignment of target type [-Werror=cast-align]
2421 | # define _G_TYPE_CIC(ip, gt, ct) ((ct*) ip)
| ^
/usr/include/glib-2.0/gobject/gtype.h:501:66: note: in expansion of macro '_G_TYPE_CIC'
501 | #define G_TYPE_CHECK_INSTANCE_CAST(instance, g_type, c_type) (_G_TYPE_CIC ((instance), (g_type), c_type))
| ^~~~~~~~~~~
src/core/devices/nm-device-macvlan.h:13:6: note: in expansion of macro 'G_TYPE_CHECK_INSTANCE_CAST'
13 | (G_TYPE_CHECK_INSTANCE_CAST((obj), NM_TYPE_DEVICE_MACVLAN, NMDeviceMacvlan))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid that by using _NM_G_TYPE_CHECK_INSTANCE_CAST().
This can only be done for our internal usages. The public headers
of libnm are not changed.
GArray.data is a char pointer. Most of the time we track other data in
a GArray. Casting that pointer can trigger "-Wcast-align=strict"
warnings.
Avoid them. Most of the time, instead use the nm_g_array*() helpers,
which also assert that the expected element size is correct.
We put all these structs inside the tagged union NMPObject.
Also, in a sense NMPlatformObject is the base "type" of all
these structs, meaning, it should be able to up and downcast.
Ensure the alignment matches.
This helps to avoid "-Wcast-align" warnings when trying to cast
a (NMPlatformObject*) to another (NMPlatformXXX *) type. Something
we commonly do.
This is the version shipped in Fedora 37. As Fedora 37 is now out, the
core developers switch to it. Our gitlab-ci will also use that as base
image for the check-{patch.tree} tests and to generate the pages. There
is a need that everybody agrees on which clang-format version to use,
and that version should be the one of the currently used Fedora release.
Also update the used Fedora image in "contrib/scripts/nm-code-format-container.sh"
script.
The gitlab-ci still needs update in the following commit. The change
in isolation will break the "check-tree" test.
When adding a new route we need to consider it contains extra nexthops
i.e it is a ECMP route. As we cannot modify the NMPObject once created,
we need to pass the extra nexthops as an argument.
We cannot use the original NMPObject because normalization is happening
during when adding the route.
When reading from netlink an ECMP IPv4 route, we need to parse the
multiple nexthops. In order to do that, we are introducing
NMPlatformIP4RtNextHop struct.
The first nexthop information will be kept at the original
NMPlatformIP4Route and the new property n_nexthops will indicate how
many nexthops we need to consider.
Otherwise, this file would need to be included in POTFILES.in.
This is unnecessary.
Fixes: 06cf1f5e2d ('platform/tests: extend monitor tool to dump the state of NMPlatform')
This is useful for manual testing ("manual", in the sense that you can
write a script that tests the behavior of the platform cache, without
humanly reading the logfile).
Usage:
To write the content of the platform cache once:
./src/core/platform/tests/monitor -P -S './statefile'
To keep monitor running, and update the state file:
./src/core/platform/tests/monitor -S './statefile'
The variable is passed to nmtstp_run_command_check_external(), which accepts
-1 to mean choose randomly. Change the function signature to reflect that.
These variants provide additional nm_assert() checks, and are thus
preferable.
Note that we cannot just blindly replace &g_array_index() with
&nm_g_array_index(), because the latter would not allow getting a
pointer at index [arr->len]. That might be a valid (though uncommon)
usecase. The correct replacement of &g_array_index() is thus
nm_g_array_index_p().
I checked the code manually and replaced uses of nm_g_array_index_p()
with &nm_g_array_index(), if that was a safe thing to do. The latter
seems preferable, because it is familar to &g_array_index().
- name things related to `in_addr_t`, `struct in6_addr`, `NMIPAddr` as
`nm_ip4_addr_*()`, `nm_ip6_addr_*()`, `nm_ip_addr_*()`, respectively.
- we have a wrapper `nm_inet_ntop()` for `inet_ntop()`. This name
of our wrapper is chosen to be familiar with the libc underlying
function. With this, also name functions that are about string
representations of addresses `nm_inet_*()`, `nm_inet4_*()`,
`nm_inet6_*()`. For example, `nm_inet_parse_str()`,
`nm_inet_is_normalized()`.
<<<<
R() {
git grep -l "$1" | xargs sed -i "s/\<$1\>/$2/g"
}
R NM_CMP_DIRECT_IN4ADDR_SAME_PREFIX NM_CMP_DIRECT_IP4_ADDR_SAME_PREFIX
R NM_CMP_DIRECT_IN6ADDR_SAME_PREFIX NM_CMP_DIRECT_IP6_ADDR_SAME_PREFIX
R NM_UTILS_INET_ADDRSTRLEN NM_INET_ADDRSTRLEN
R _nm_utils_inet4_ntop nm_inet4_ntop
R _nm_utils_inet6_ntop nm_inet6_ntop
R _nm_utils_ip4_get_default_prefix nm_ip4_addr_get_default_prefix
R _nm_utils_ip4_get_default_prefix0 nm_ip4_addr_get_default_prefix0
R _nm_utils_ip4_netmask_to_prefix nm_ip4_addr_netmask_to_prefix
R _nm_utils_ip4_prefix_to_netmask nm_ip4_addr_netmask_from_prefix
R nm_utils_inet4_ntop_dup nm_inet4_ntop_dup
R nm_utils_inet6_ntop_dup nm_inet6_ntop_dup
R nm_utils_inet_ntop nm_inet_ntop
R nm_utils_inet_ntop_dup nm_inet_ntop_dup
R nm_utils_ip4_address_clear_host_address nm_ip4_addr_clear_host_address
R nm_utils_ip4_address_is_link_local nm_ip4_addr_is_link_local
R nm_utils_ip4_address_is_loopback nm_ip4_addr_is_loopback
R nm_utils_ip4_address_is_zeronet nm_ip4_addr_is_zeronet
R nm_utils_ip4_address_same_prefix nm_ip4_addr_same_prefix
R nm_utils_ip4_address_same_prefix_cmp nm_ip4_addr_same_prefix_cmp
R nm_utils_ip6_address_clear_host_address nm_ip6_addr_clear_host_address
R nm_utils_ip6_address_same_prefix nm_ip6_addr_same_prefix
R nm_utils_ip6_address_same_prefix_cmp nm_ip6_addr_same_prefix_cmp
R nm_utils_ip6_is_ula nm_ip6_addr_is_ula
R nm_utils_ip_address_same_prefix nm_ip_addr_same_prefix
R nm_utils_ip_address_same_prefix_cmp nm_ip_addr_same_prefix_cmp
R nm_utils_ip_is_site_local nm_ip_addr_is_site_local
R nm_utils_ipaddr_is_normalized nm_inet_is_normalized
R nm_utils_ipaddr_is_valid nm_inet_is_valid
R nm_utils_ipx_address_clear_host_address nm_ip_addr_clear_host_address
R nm_utils_parse_inaddr nm_inet_parse_str
R nm_utils_parse_inaddr_bin nm_inet_parse_bin
R nm_utils_parse_inaddr_bin_full nm_inet_parse_bin_full
R nm_utils_parse_inaddr_prefix nm_inet_parse_with_prefix_str
R nm_utils_parse_inaddr_prefix_bin nm_inet_parse_with_prefix_bin
R test_nm_utils_ip6_address_same_prefix test_nm_ip_addr_same_prefix
./contrib/scripts/nm-code-format.sh -F
- drop unused "keep_deleted" parameter. It just doesn't make sense.
Even less sense than for rules/routes, where this was taken from.
- fix nmp_global_tracker_sync_mptcp_addrs() to delete addresses
with conflicting flags. We did not correctly delete existing
addresses, that were to be reconfigured with different flags.
Fixes: 5374c403d2 ('platfrom: handle MPTCP addresses with NMPGlobalTracker')
When we configure MPTCP addresses, we usually do so per interface
(ifindex). That is, because each interface (via NMDevice and NML3Cfg)
decides how to configure MPTCP, and then we always add MTCP addresses
for this certain ifindex.
With that, we could have a purely interface-specific view and not a
global sync method. However, there are two problems:
The minor problem is that we don't cache the endpoints (because we don't
get notifications). We can only get a dump of all endpoints. It seems
odd to have a mptcp-addr-sync method that is per-ifindex, when it needs
to dump all addresses.
The much more important reason is that the number of endpoints that we
can configure in kernel is very limited. So we need to make a choice
which endpoints to configure, and for that we need to holistic view that
NMPGlobalTracker has.
Since the generic netlink API does (currently) not support notifications
about changes of the MPTCP addresses, we won't get notifications when
they change, and it seems wrong to put such things in the NMPlatform
cache.
We can just get the list of endpoints by polling, so add a function
nm_platform_mptcp_addrs_dump() for that.
Also, add nm_platform_mptcp_addr_update() which can add/remove/update
MPTCP addresses.
sysfs is deprecated and kernel people will not add new bond options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
NetworkManager primarily manages interfaces in an independent fashion.
That means, whenever possible, we want to have a interface specific
view. In many cases, the underlying kernel API also supports that view.
For example, when configuring IP addresses or unicast routes, we do so
per interfaces and don't need a holistic view.
However, that is not always sufficient. For routing rules and certain
route types (blackhole, unreachable, etc), we need a system wide view
of all the objects in the network namespace.
Originally, NMPRulesManager was added to track routing rules. Then, it
was extended to also track certain route types, and the API was renamed to
NMPRouteManager.
This will also be used to track MPTCP addresses.
So rename again, to give it a general name that is suitable for what it
does. Still, the name is not great (suggestion welcome), but it should
cover the purpose of the API well enough. And it's the best I came
up with.
Rename.
The genl types that we care about are well known. Add an enum
for them, so we can do a lookup by index.
To kernel, the corresponding names (like "wireguard") are also well
known. However, the family-id, that we need when using genl are
allocated dynamically. So we need to lookup the family-id, and by having
an enum for the genl type, we can do so generically.
In the past, nmp_lookup_init_object() could both lookup all object for a
certain ifindex, and lookup all objects of a type. That fallback path
already leads to an assertion failure fora while now, so nobody should
be using this function to lookup all objects of a certain type (for
what, we have nmp_lookup_init_obj_type()).
Now, remove the fallback path, and rename the function to what it really
does.
Since kernel 5.18 there is a stricter validation [1][2] on the tos
field of routing rules, that must not include ECN bits.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f55fbb6afb8d701e3185e31e73f5ea9503a66744
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a410a0cf98854a698a519bfbeb604145da384c0e
Fixes the following failure:
>>> src/core/platform/tests/test-route-linux
>>> ...
# NetworkManager-MESSAGE: <warn> [1656321515.6604] platform-linux: do-add-rule: failure 22 (Invalid argument - Invalid dsfield (tos): ECN bits must be 0)
>>> failing... errno=-22, rule=[routing-rule,0x13d6e80,1,+alive,+visible; [6] 0: from all tos 0xff fwmark 0x4/0 suppress_prefixlen -459579276 action-214 protocol 255]
>>> existing rule: * [routing-rule,0x13d71e0,2,+alive,+visible; [6] 0: from all sport 65534 lookup 10009 suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d7280,2,+alive,+visible; [4] 0: from all fwmark 0/0x9a7e9992 ipproto 255 suppress_prefixlen 0 realms 0x00000008 none protocol 71]
>>> existing rule: [routing-rule,0x13d7320,2,+alive,+visible; [6] 598928157: from all suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d73c0,2,+alive,+visible; [4] 0: from 192.192.5.200/8 lookup 254 suppress_prefixlen 0 none protocol 9]
>>> existing rule: [routing-rule,0x13d7460,2,+alive,+visible; [4] 0: from all ipproto 3 suppress_prefixlen 0 realms 0xffffffff none protocol 5]
>>> existing rule: [routing-rule,0x13d7500,2,+alive,+visible; [4] 0: from all fwmark 0x1/0 lookup 254 suppress_prefixlen 0 action-124 protocol 4]
>>> existing rule: [routing-rule,0x13d75a0,2,+alive,+visible; [4] 0: from all suppress_prefixlen 0 action-109]
0: from all fwmark 0/0x9a7e9992 ipproto ipproto-255 realms 8 none proto 71
0: from 192.192.5.200/8 lookup main suppress_prefixlength 0 none proto ra
0: from all ipproto ggp realms 65535/65535 none proto 5
0: from all fwmark 0x1/0 lookup main suppress_prefixlength 0 124 proto static
0: from all 109
0: from all sport 65534 lookup 10009 suppress_prefixlength 0 none
598928157: from all none
Bail out! nm:ERROR:../src/core/platform/tests/test-route.c:1787:test_rule: assertion failed (r == 0): (-22 == 0)
Fixes: 5ae2431b0f ('platform/tests: add tests for handling policy routing rules')
On m68k we get a static assertion, that NMPlatformIP4Address.address
is not at the same offset as NMPlatformIPAddress.address_ptr.
On most architectures, the bitfields fits in a gap between the fields,
but not on m68k, where integers are 2-byte aligned.
Next we are going to assert that the prefix length is valid.
The test needs to have valid prefix lengths too. Adjust.
(cherry picked from commit a850e438a7)
These string functions allow to omit the string buffer. This is for
convenience, to use a global (thread-local) buffer. I think that is
error prone and we should drop that "convenience" feature.
At various places, pass a stack allocated buffer.
(cherry picked from commit b87afac8e8)
We often create the source with default priority, no destroy function and
attach it to the default context (g_main_context_default()). For that
case, we have wrapper functions like nm_g_timeout_add_source()
and nm_g_idle_add_source(). Use those.
There should be no change in behavior.
gcc-12.0.1-0.8.fc36 is annoying with false positives.
It's related to g_error() and its `for(;;) ;`.
For example:
../src/libnm-glib-aux/nm-shared-utils.c: In function 'nm_utils_parse_inaddr_bin_full':
../src/libnm-glib-aux/nm-shared-utils.c:1145:26: error: dangling pointer to 'error' may be used [-Werror=dangling-pointer=]
1145 | error->message);
| ^~
/usr/include/glib-2.0/glib/gmessages.h:343:32: note: in definition of macro 'g_error'
343 | __VA_ARGS__); \
| ^~~~~~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1133:31: note: 'error' declared here
1133 | gs_free_error GError *error = NULL;
| ^~~~~
/usr/include/glib-2.0/glib/gmessages.h:341:25: error: dangling pointer to 'addrbin' may be used [-Werror=dangling-pointer=]
341 | g_log (G_LOG_DOMAIN, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
342 | G_LOG_LEVEL_ERROR, \
| ~~~~~~~~~~~~~~~~~~~~~~~
343 | __VA_ARGS__); \
| ~~~~~~~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1141:13: note: in expansion of macro 'g_error'
1141 | g_error("unexpected assertion failure: could parse \"%s\" as %s, but not accepted by "
| ^~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1112:14: note: 'addrbin' declared here
1112 | NMIPAddr addrbin;
| ^~~~~~~
I think the warning could potentially be useful and prevent real bugs.
So don't disable it altogether, but go through the effort to suppress it
at the places where it currently happens.
Note that NM_PRAGMA_WARNING_DISABLE_DANGLING_POINTER macro only expands
to suppressing the warning with __GNUC__ equal to 12. The purpose is to
only suppress the warning where we know we want to. Hopefully other gcc
versions don't have this problem.
I guess, we could also write a NM_COMPILER_WARNING() check in
"m4/compiler_options.m4", to disable the warning if we detect it. But
that seems too cumbersome.
Routes of type blackhole, unreachable, prohibit don't have an
ifindex/device. They are thus in many ways similar to routing rules,
as they are global. We need a mediator to keep track which routes
to configure.
This will be very similar to what NMPRulesManager already does for
routing rules. Rename the API, so that it also can be used for routes.
Renaming the file will be done next, so that git's rename detection
doesn't get too confused.
So far, certain NMObject types could not have an ifindex of zero. Hence,
nmp_lookup_init_object() took such an ifindex to mean lookup all objects
of that type.
Soon, we will support blackhole/unreachable/prohibit route types, which
have their ifindex set to zero. It is still useful to lookup those routes
types via nmp_lookup_init_object().
Change behaviour how to interpret the ifindex. Note that this also
affects various callers of nmp_lookup_init_object(). If somebody was
relying on the previous behavior, it would need fixing.
We use clang-format for automatic formatting of our source files.
Since clang-format is actively maintained software, the actual
formatting depends on the used version of clang-format. That is
unfortunate and painful, but really unavoidable unless clang-format
would be strictly bug-compatible.
So the version that we must use is from the current Fedora release, which
is also tested by our gitlab-ci. Previously, we were using Fedora 34 with
clang-tools-extra-12.0.1-1.fc34.x86_64.
As Fedora 35 comes along, we need to update our formatting as Fedora 35
comes with version "13.0.0~rc1-1.fc35".
An alternative would be to freeze on version 12, but that has different
problems (like, it's cumbersome to rebuild clang 12 on Fedora 35 and it
would be cumbersome for our developers which are on Fedora 35 to use a
clang that they cannot easily install).
The (differently painful) solution is to reformat from time to time, as we
switch to a new Fedora (and thus clang) version.
Usually we would expect that such a reformatting brings minor changes.
But this time, the changes are huge. That is mentioned in the release
notes [1] as
Makes PointerAligment: Right working with AlignConsecutiveDeclarations. (Fixes https://llvm.org/PR27353)
[1] https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html#clang-format
Completely rework IP configuration in the daemon. Use NML3Cfg as layer 3
manager for the IP configuration of an interface. Use NML3ConfigData as
pieces of configuration that the various components collect and
configure. NMDevice is managing most of the IP configuration at a higher
level, that is, it starts DHCP and other IP methods. Rework the state
handling there.
This is a huge rework of how NetworkManager daemon handles IP
configuration. Some fallout is to be expected.
It appears the patch deletes many lines of code. That is not accurate, because
you also have to count the files `src/core/nm-l3*`, which were unused previously.
Co-authored-by: Beniamino Galvani <bgalvani@redhat.com>