- drop unused "keep_deleted" parameter. It just doesn't make sense.
Even less sense than for rules/routes, where this was taken from.
- fix nmp_global_tracker_sync_mptcp_addrs() to delete addresses
with conflicting flags. We did not correctly delete existing
addresses, that were to be reconfigured with different flags.
Fixes: 5374c403d2 ('platfrom: handle MPTCP addresses with NMPGlobalTracker')
When we configure MPTCP addresses, we usually do so per interface
(ifindex). That is, because each interface (via NMDevice and NML3Cfg)
decides how to configure MPTCP, and then we always add MTCP addresses
for this certain ifindex.
With that, we could have a purely interface-specific view and not a
global sync method. However, there are two problems:
The minor problem is that we don't cache the endpoints (because we don't
get notifications). We can only get a dump of all endpoints. It seems
odd to have a mptcp-addr-sync method that is per-ifindex, when it needs
to dump all addresses.
The much more important reason is that the number of endpoints that we
can configure in kernel is very limited. So we need to make a choice
which endpoints to configure, and for that we need to holistic view that
NMPGlobalTracker has.
Since the generic netlink API does (currently) not support notifications
about changes of the MPTCP addresses, we won't get notifications when
they change, and it seems wrong to put such things in the NMPlatform
cache.
We can just get the list of endpoints by polling, so add a function
nm_platform_mptcp_addrs_dump() for that.
Also, add nm_platform_mptcp_addr_update() which can add/remove/update
MPTCP addresses.
sysfs is deprecated and kernel people will not add new bond options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
NetworkManager primarily manages interfaces in an independent fashion.
That means, whenever possible, we want to have a interface specific
view. In many cases, the underlying kernel API also supports that view.
For example, when configuring IP addresses or unicast routes, we do so
per interfaces and don't need a holistic view.
However, that is not always sufficient. For routing rules and certain
route types (blackhole, unreachable, etc), we need a system wide view
of all the objects in the network namespace.
Originally, NMPRulesManager was added to track routing rules. Then, it
was extended to also track certain route types, and the API was renamed to
NMPRouteManager.
This will also be used to track MPTCP addresses.
So rename again, to give it a general name that is suitable for what it
does. Still, the name is not great (suggestion welcome), but it should
cover the purpose of the API well enough. And it's the best I came
up with.
Rename.
The genl types that we care about are well known. Add an enum
for them, so we can do a lookup by index.
To kernel, the corresponding names (like "wireguard") are also well
known. However, the family-id, that we need when using genl are
allocated dynamically. So we need to lookup the family-id, and by having
an enum for the genl type, we can do so generically.
In the past, nmp_lookup_init_object() could both lookup all object for a
certain ifindex, and lookup all objects of a type. That fallback path
already leads to an assertion failure fora while now, so nobody should
be using this function to lookup all objects of a certain type (for
what, we have nmp_lookup_init_obj_type()).
Now, remove the fallback path, and rename the function to what it really
does.
Since kernel 5.18 there is a stricter validation [1][2] on the tos
field of routing rules, that must not include ECN bits.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f55fbb6afb8d701e3185e31e73f5ea9503a66744
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a410a0cf98854a698a519bfbeb604145da384c0e
Fixes the following failure:
>>> src/core/platform/tests/test-route-linux
>>> ...
# NetworkManager-MESSAGE: <warn> [1656321515.6604] platform-linux: do-add-rule: failure 22 (Invalid argument - Invalid dsfield (tos): ECN bits must be 0)
>>> failing... errno=-22, rule=[routing-rule,0x13d6e80,1,+alive,+visible; [6] 0: from all tos 0xff fwmark 0x4/0 suppress_prefixlen -459579276 action-214 protocol 255]
>>> existing rule: * [routing-rule,0x13d71e0,2,+alive,+visible; [6] 0: from all sport 65534 lookup 10009 suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d7280,2,+alive,+visible; [4] 0: from all fwmark 0/0x9a7e9992 ipproto 255 suppress_prefixlen 0 realms 0x00000008 none protocol 71]
>>> existing rule: [routing-rule,0x13d7320,2,+alive,+visible; [6] 598928157: from all suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d73c0,2,+alive,+visible; [4] 0: from 192.192.5.200/8 lookup 254 suppress_prefixlen 0 none protocol 9]
>>> existing rule: [routing-rule,0x13d7460,2,+alive,+visible; [4] 0: from all ipproto 3 suppress_prefixlen 0 realms 0xffffffff none protocol 5]
>>> existing rule: [routing-rule,0x13d7500,2,+alive,+visible; [4] 0: from all fwmark 0x1/0 lookup 254 suppress_prefixlen 0 action-124 protocol 4]
>>> existing rule: [routing-rule,0x13d75a0,2,+alive,+visible; [4] 0: from all suppress_prefixlen 0 action-109]
0: from all fwmark 0/0x9a7e9992 ipproto ipproto-255 realms 8 none proto 71
0: from 192.192.5.200/8 lookup main suppress_prefixlength 0 none proto ra
0: from all ipproto ggp realms 65535/65535 none proto 5
0: from all fwmark 0x1/0 lookup main suppress_prefixlength 0 124 proto static
0: from all 109
0: from all sport 65534 lookup 10009 suppress_prefixlength 0 none
598928157: from all none
Bail out! nm:ERROR:../src/core/platform/tests/test-route.c:1787:test_rule: assertion failed (r == 0): (-22 == 0)
Fixes: 5ae2431b0f ('platform/tests: add tests for handling policy routing rules')
On m68k we get a static assertion, that NMPlatformIP4Address.address
is not at the same offset as NMPlatformIPAddress.address_ptr.
On most architectures, the bitfields fits in a gap between the fields,
but not on m68k, where integers are 2-byte aligned.
Next we are going to assert that the prefix length is valid.
The test needs to have valid prefix lengths too. Adjust.
(cherry picked from commit a850e438a7)
These string functions allow to omit the string buffer. This is for
convenience, to use a global (thread-local) buffer. I think that is
error prone and we should drop that "convenience" feature.
At various places, pass a stack allocated buffer.
(cherry picked from commit b87afac8e8)
We often create the source with default priority, no destroy function and
attach it to the default context (g_main_context_default()). For that
case, we have wrapper functions like nm_g_timeout_add_source()
and nm_g_idle_add_source(). Use those.
There should be no change in behavior.
gcc-12.0.1-0.8.fc36 is annoying with false positives.
It's related to g_error() and its `for(;;) ;`.
For example:
../src/libnm-glib-aux/nm-shared-utils.c: In function 'nm_utils_parse_inaddr_bin_full':
../src/libnm-glib-aux/nm-shared-utils.c:1145:26: error: dangling pointer to 'error' may be used [-Werror=dangling-pointer=]
1145 | error->message);
| ^~
/usr/include/glib-2.0/glib/gmessages.h:343:32: note: in definition of macro 'g_error'
343 | __VA_ARGS__); \
| ^~~~~~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1133:31: note: 'error' declared here
1133 | gs_free_error GError *error = NULL;
| ^~~~~
/usr/include/glib-2.0/glib/gmessages.h:341:25: error: dangling pointer to 'addrbin' may be used [-Werror=dangling-pointer=]
341 | g_log (G_LOG_DOMAIN, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
342 | G_LOG_LEVEL_ERROR, \
| ~~~~~~~~~~~~~~~~~~~~~~~
343 | __VA_ARGS__); \
| ~~~~~~~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1141:13: note: in expansion of macro 'g_error'
1141 | g_error("unexpected assertion failure: could parse \"%s\" as %s, but not accepted by "
| ^~~~~~~
../src/libnm-glib-aux/nm-shared-utils.c:1112:14: note: 'addrbin' declared here
1112 | NMIPAddr addrbin;
| ^~~~~~~
I think the warning could potentially be useful and prevent real bugs.
So don't disable it altogether, but go through the effort to suppress it
at the places where it currently happens.
Note that NM_PRAGMA_WARNING_DISABLE_DANGLING_POINTER macro only expands
to suppressing the warning with __GNUC__ equal to 12. The purpose is to
only suppress the warning where we know we want to. Hopefully other gcc
versions don't have this problem.
I guess, we could also write a NM_COMPILER_WARNING() check in
"m4/compiler_options.m4", to disable the warning if we detect it. But
that seems too cumbersome.
Routes of type blackhole, unreachable, prohibit don't have an
ifindex/device. They are thus in many ways similar to routing rules,
as they are global. We need a mediator to keep track which routes
to configure.
This will be very similar to what NMPRulesManager already does for
routing rules. Rename the API, so that it also can be used for routes.
Renaming the file will be done next, so that git's rename detection
doesn't get too confused.
So far, certain NMObject types could not have an ifindex of zero. Hence,
nmp_lookup_init_object() took such an ifindex to mean lookup all objects
of that type.
Soon, we will support blackhole/unreachable/prohibit route types, which
have their ifindex set to zero. It is still useful to lookup those routes
types via nmp_lookup_init_object().
Change behaviour how to interpret the ifindex. Note that this also
affects various callers of nmp_lookup_init_object(). If somebody was
relying on the previous behavior, it would need fixing.
We use clang-format for automatic formatting of our source files.
Since clang-format is actively maintained software, the actual
formatting depends on the used version of clang-format. That is
unfortunate and painful, but really unavoidable unless clang-format
would be strictly bug-compatible.
So the version that we must use is from the current Fedora release, which
is also tested by our gitlab-ci. Previously, we were using Fedora 34 with
clang-tools-extra-12.0.1-1.fc34.x86_64.
As Fedora 35 comes along, we need to update our formatting as Fedora 35
comes with version "13.0.0~rc1-1.fc35".
An alternative would be to freeze on version 12, but that has different
problems (like, it's cumbersome to rebuild clang 12 on Fedora 35 and it
would be cumbersome for our developers which are on Fedora 35 to use a
clang that they cannot easily install).
The (differently painful) solution is to reformat from time to time, as we
switch to a new Fedora (and thus clang) version.
Usually we would expect that such a reformatting brings minor changes.
But this time, the changes are huge. That is mentioned in the release
notes [1] as
Makes PointerAligment: Right working with AlignConsecutiveDeclarations. (Fixes https://llvm.org/PR27353)
[1] https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html#clang-format
Completely rework IP configuration in the daemon. Use NML3Cfg as layer 3
manager for the IP configuration of an interface. Use NML3ConfigData as
pieces of configuration that the various components collect and
configure. NMDevice is managing most of the IP configuration at a higher
level, that is, it starts DHCP and other IP methods. Rework the state
handling there.
This is a huge rework of how NetworkManager daemon handles IP
configuration. Some fallout is to be expected.
It appears the patch deletes many lines of code. That is not accurate, because
you also have to count the files `src/core/nm-l3*`, which were unused previously.
Co-authored-by: Beniamino Galvani <bgalvani@redhat.com>
Seems we can get a DOWN event during unit tests. I don't really
understand why, but let's ignore it.
[...]
#4 0x000055e365777786 in _l3_acd_nacd_event (fd=<optimized out>, condition=<optimized out>, user_data=0x55e367566270) at src/core/platform/tests/test-common.c:2703
#5 0x00007f4399c224cf in g_main_dispatch (context=0x55e36755fce0) at ../glib/gmain.c:3337
#6 g_main_context_dispatch (context=0x55e36755fce0) at ../glib/gmain.c:4055
#7 0x00007f4399c764f8 in g_main_context_iterate.constprop.0 (context=context@entry=0x55e36755fce0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
at ../glib/gmain.c:4131
#8 0x00007f4399c1fc03 in g_main_context_iteration (context=0x55e36755fce0, context@entry=0x0, may_block=may_block@entry=1) at ../glib/gmain.c:4196
#9 0x000055e365770719 in test_l3_ipv6ll (test_data=<optimized out>) at src/core/tests/test-l3cfg.c:1024
We no longer use tc objects from the platform cache; disable caching
by default.
The only exception where the cache is needed is in tc tests, as we
look into the platform there to check that objects look as expected.
Introduce a construct-only property for platform objects to enable or
disable the caching of tc objects. When disabled, the netlink socket
doesn't receive netlink events for tc objects, and objects are never
added to the cache. This commit doesn't change behavior yet.
Update nm_platform_qdisc_sync() and nm_platform_tfilter_sync() to
avoid looking into the platform cache, so that we no longer require to
keep tc and qdiscs in the cache.
There is no API in kernel to retrieve tc objects only for a specific
interface, so NM had to receive all tc events, even for unmanaged
interfaces. This could cause high CPU usage in some scenarios with
many objects.
Instead, try to delete root qdiscs and filters and then add the known
ones.
Also, combine the two functions together since they are related. In
particular, removing all qdiscs also removes all attached filters.
The memory layout of the NMPlatformIPAddress structure changed. The unit test
needs to be adjusted.
Fixes: 9ec9a92f17 ('platform: avoid bitfield at end of __NMPlatformIPAddress_COMMON macro')
This is supported since kernel 3.17, dated 5 October, 2014. Drop the backward
compatibility for that.
It's very hard to sensibly support a mode where we set the interface up,
but prevent kernel from enabling IPv6. We would hack around that by disabling
IPv6 altogether.
But these code paths are not tested and likely make no sense. And it's hard
to implement a sensible behavior in this case anyway.
The term "user_ipv6ll" is confusing and not something somebody familiar
with kernel or `ip -d link` would understand.
Also, it maps a boolean to addr-gen-mode "none" or "eui64", although
there are 2 more address generation modes in kernel.
Don't abstract the underlying API, and name things as they are in
kernel.
While both functions are basically the same, the majority of the time
we use g_snprintf(). There is no strong reason to prefer one or the
other, but let's keep using one variant.
Add and call the new `nm_platform_link_get_permanent_address()` to
obtain `l_perm_address` via netlink or lookup via ethtool if kernel
does not expose the `IFLA_PERM_ADDRESS`.
And call the new `nm_platform_link_get_permanent_address()` in the unit
tests.
https://bugzilla.redhat.com/show_bug.cgi?id=1987286
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
Having two functions like link_set_x() and link_set_nox() it is not a
good idea. This patch is introducing nm_platform_link_change_flags().
This allow flag modification directly, so the developer does not need to
define the virtual functions all the time everywhere.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
We need to handle the case that kernel mangles the configured values. We
already do, but there was a left over nm_platform_lnk_bridge_cmp() that
is still wrong.
Related: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/665
Fixes: ce9211500e ('platform/tests: work around rounding errors for bridge values in unit tests')
For certain options, kernel stores the numeric values in jiffies scale,
while the user space value is in USER_HZ (1/100th of a second) scale.
Jiffies scale depends on HZ setting (CONFIG_HZ), and depending on kernel
configuration its 100, 250, 300, or 1000.
That means, the round trip of clock_t_to_jiffies()/jiffies_to_clock_t()
has different rounding errors, depending on CONFIG_HZ and it maybe be
+/- 1 of the requested value.
Since the rounding error depends on CONFIG_HZ, we cannot find "good"
values for testing, that always behave the same. So we need to
workaround that.
Normalize the bridge values, if they look as if the value was mangled
due to rounding.
Related: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/665
In core, NMPlatform is (also) a singleton instance. As we will move platform code
to libnm-platform, this singleton part makes no sense there. Move the code
to NetworkManagerUtils.c.