When a software device is deactivated, normally we schedule a idle
task to unrealize the device (delete_on_deactivate). However, if a new
activation is enqueued on the same device (and that implies that the
new profile is compatible with the device), then the idle task is not
scheduled and the device will normally transition to the different
states (disconnected, prepare, config, etc.).
For ovs-interfaces, we remove the db entry on disconnect and that
makes the link go away; however, we don't clear the hw_addr* fields of
the device struct.
When the new link appears, we try to set the new cloned MAC but the
stale hw_addr field indicates that it's already set. Avoid this
problem by updating the address as soon as the link appears.
https://bugzilla.redhat.com/show_bug.cgi?id=2168477https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1532
(cherry picked from commit d403ac3d40)
- Only consider preferred context of "internet" type. There can be
multiple preferred contexts of multiple types, and we care about
"internet" type only.
- Don't check for "internet+mms" type. It turns out that "internet+mms"
isn't a thing in oFono, and is used to represent "internet" context
with MMSC in the lomiri-system-setting's UI only.
Fixes: 9fc72bf75d ('wwan/ofono: create connections based on available contexts')
Bug-UBports: https://gitlab.com/ubports/development/core/packaging/network-manager/-/issues/3
(cherry picked from commit 08a38ed619)
Trigger a dispatcher event when a connection is reapplied on a NM device.
Some devices such as phones have already a DHCP client running for accepting
connections when they are plugged into USB to transfer data over SSH.
When NetworkManager switches the connection IP method to shared,
it spawns a dnsmasq process to handle DHCP and DNS for that connection.
However, a dispatcher event is needed to disable the external DHCP server
for these USB connections as NetworkManager's dnsmasq handles them now.
Moreover, when the connection method is switched to a different mode,
the external DHCP server needs to be spawned again to make sure that
SSH connections are still possible to the device.
To achieve this, add a new NetworkManager Dispatcher event
'reapply' which is triggered when a connection is reapplied on a NM
device. This way, a dispatcher script can handle the case above by
inspecting the IP method in the dispatcher script.
(cherry picked from commit cef880c66f)
The onlink flag is part of each next hop.
When NetworkManager configures ECMP routes, we won't support that. All
next hops of an ECMP route must share the same onlink flag. That is fine
and fixed by this commit.
What is not fine, is that we don't track the rtnh_flags flags in
NMPlatformIP4RtNextHop, and consequently our nmp_object_id_cmp() is
wrong.
Fixes: 5b5ce42682 ('nm-netns: track ECMP routes')
(cherry picked from commit 6ed966258c)
If the route with a next hop is already onlink, we don't need to add a
direct route to the gateway.
It also wouldn't work previously, because the onlink route to the
gateway that we would add, would have no gateway and the RTNH_F_ONLINK
set. Kernel would reject that with an error. We would have to clear the
RTNH_F_ONLINK flag, if there is no gateway.
(cherry picked from commit 93b46c8906)
The dns-type must be included in the hash because it contributes to
the generated composite configuration. Without this, when the type of
a configuration changes (e.g. from DEFAULT to BEST), the DNS manager
would determine that there was no change and it wouldn't call
update_dns().
https://bugzilla.redhat.com/show_bug.cgi?id=2161957
Fixes: 8995d44a0b ('core: compare the DNS configurations before updating DNS')
(cherry picked from commit 46ccc82a81)
The tests failed in certain cases on gitlab-ci and were temporarily
disabled.
These issues should be fixed now and the test pass. Reenable.
(cherry picked from commit 5c324adc7c)
Kernel enforces that all nexthops must be reachable through a route.
L3Cfg is generating dependent onlink routes to solve this problem but
the IPv4 ECMP commit is happening before that.
To solve this we introduce two boolean fields "is_new" and "is_ready" to
know in which state is the L3Cfg affected. Initially, "is_new" is TRUE
and "is_ready" is FALSE. Here we schedule a commit on idle and we set
"is_new" to FALSE. When revisiting, we set "is_ready" to TRUE and then
we set the ECMP IPv4 routes.
When a reapply kicks in we reset the L3Cfg state by setting "is_new" to
TRUE.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1520
(cherry picked from commit 7a844ecba9)
Kernel enforces that all route nexthop are reachable but it doesn't care
if the drect route to the nexthop is in a different route table.
(cherry picked from commit f187e63fa8)
ECMP IPv4 route nexthops requires an onlink route but we should trust
l3cfg when generating and managing such routes.
This reverts commit 737cb5d424.
(cherry picked from commit cbf70b4dca)
We must trust l3cfg when generating dependent onlink routes for all kind
of routes not default routes only. This was done by
"nm_platform_ip_route_sync()" so there is not change in behaviour at
all.
"nm_platform_ip_route_sync()" could be needed for other situation where
l3cfg cannot add the dependent onlink routes, so we are not removing
that logic.
This reverts commit 6b4123db1c.
(cherry picked from commit 9c492c6fc4)
Certain ip-tunnel modules automatically create network interfaces (for
example, "ip_gre" module creates "gre0" and others).
Btw, that's not the same as `modprobe bonding max_bonds=1`, where
loading the module merely automatically creates a "bond0" interface. In
case of ip tunnel modules, these generated interfaces seem essential to
how the tunnel works, for example they cannot be deleted. I don't
understand the purpose of those interfaces, but they seem not just
regular tunnel interfaces (unlike, "bond0" which is a regular bond
interface, albeit automatically created).
Btw, if at the time when loading the module, an interface with such name
already exists, it will bump the name (for example, adding a "gre1"
interfaces, and so on). That adds to the ugliness of the whole thing,
but for our unit tests, that is no problem. Our unit tests run in a
separate netns, and we don't create conflicting interfaces. That is, an
interface named "gre0" is always the special tunnel interface and we
can/do rely on that.
Note that when the kernel module gets loaded, it adds those interfaces
to all netns. Thus, even if "test-route-linux" does not do anything with
ip tunnels, such an interface can always appear in a netns, simply by
running "test-link-linux" (or any other tool that creates a tunnel) in
parallel or even in another container.
Theoretically, we could just ensure that we load all the conflicting
ip-tunnel modules (with nmtstp_ensure_module()). There there are two
problems. First, there might be other tunnel modules that interfere but
are not covered by nmtstp_ensure_module(). Second, when kernel creates
those interfaces, it does not send correct RTM_NEWLINK notifications (a
bug), so our platform cache will not be correct, and
nmtstp_assert_platform() will fail.
The only solution is to detect and ignore those interfaces. Also,
ignore all interfaces of link-type "unknown". Those might be from other
modules that we don't know about and that exhibit the same problem.
(cherry picked from commit e99433866d)
Ubuntu 18.04 comes with iproute2-4.15.0-2ubuntu1.3. The
"/etc/iproute2/rt_protos" file from that version does not yet support
the "bgp" entry. Also the "babel" entry is only from 2014. Just choose
other entries. The point is that NetworkManager would ignore those, and
that applies to "zebra" and "bird" alike.
(cherry picked from commit 26592ebfe5)
This will make sure that the IP tunnel module is loaded. It does so by
creating (and deleting) a tunnel interface.
That is important, because those modules will create additional interfaces
that show up in `ip link` (like "gre0"), and those interfaces can interfere
with the tests.
Also add nmtstp_link_is_iptunnel_special() to detect whether an
interface is one of those special interfaces.
(cherry picked from commit 451cedf2bf)
Seems this test fails easily under gitlab-ci, if we set NMTST_SEED_RAND
to something else than "0". There is nothing particular special about
"0", except that a randomly different code paths are chosen.
A randomized test that doesn't pass on all systems with all random
paths, is broken. Disable for now. Needs to be fixed.
See-also: https://bugzilla.redhat.com/show_bug.cgi?id=2165141
(cherry picked from commit 14b1a7ba30)
Obviously, it would be nice if our unit tests are fast. However, with
valgrind and a busy machine, some of the tests can take a relatively
long time. In particular those, that are marked as "slow" (if you want
to skip them during development, do so via "NMTST_DEBUG=quick"
environment, or "CFLAGS=-DNMTST_TEST_QUICK=TRUE", see
"nm-test-utils.h").
Anyway. Our tests almost never hit the timeout, and if they do, the most
likely reason is that something was just slower then expected, and the
timeout is a bogus error.
Timeouts only act as last fail safe. It more important to avoid a false
(premature) timeout failure, than to minimize the wait time when the
test really hangs. Because a real hang is a bug anyway, that we will
discover and need to fix.
Increase the default test timeout for meson tests to 3 minutes.
Also, "test-route-linux" is known to take a long time. Increase that
timeout even further.
(cherry picked from commit 9ee42c0979)
Currently, the use of [global-dns] section for setting DNS options is
conditioned on presence of a nameserver in a [global-dns-domain-*] section.
Attempt to use the section for options alone results in an error:
[global-dns]
options=timeout:1
Or via D-Bus API:
# busctl set-property org.freedesktop.NetworkManager \
/org/freedesktop/NetworkManager org.freedesktop.NetworkManager \
GlobalDnsConfiguration 'a{sv}' 2 \
"options" as 1 "timeout:1" \
"domains" a{sv} 0
...
Nov 24 13:15:21 zmok.local NetworkManager[501184]: <debug> [1669292121.3904]
manager: set global DNS failed with error: Global
DNS configuration is missing the default domain
The insistence on existence of [global-dns-domain-*] would make sense if
other [global-dns-domain-...] sections were present.
However, the user might only want to set the options in resolv.conf and
still use connection-provide nameservers for the actual resolving.
Lift the limitation by allowing the [global-dns] to be used alone, while
still insist on [global-dns-domain-*] being there in presence of other
domain-specific options.
https://bugzilla.redhat.com/show_bug.cgi?id=2019306
(cherry picked from commit 1f0d1d78d2)
This effectively makes [*global-dns-domain-*] sections in configuration be
ignored unless [*global-dns] is also present. This happens because
nm_config_keyfile_has_global_dns_config() mixes the group names up and
attempts to loop up [.intern.global-dns-domain-*] in user configuration and
[global-dns-domain-*] in the internal one.
Fixes: da0ded4927 ('config: drop global-dns.enable option in favor of .config.enable')
(cherry picked from commit de1c06daab)
It is possible that an ra leads to two routes having
the same prefix as well as the same prefix length.
One of them, however, refers to the on-link prefix,
and the other one to a route from the route information field.
(Moreover, they might have different route preferences.)
Hence, if both routes differ in the on-link property,
both are added, and the route from the route information
option receives a metric penalty.
Fixed#1163.
(cherry picked from commit 11832e2ba3)
When we register/unregister a commit-type or when we add/remove a
config-data to NML3Cfg, that act only does the registration/addition.
Only on the next commit, are the changes actually done. The purpose
of that is to add/register multiple configurations and commit them later
when ready.
However, it would be wrong to not do the commit a short time after. The
configuration state is dirty with need to be committed, and that should
happen soon.
Worse, when a interface disappears, NMDevice will clear the ifindex and
the NML3Cfg instance, thereby unregistering all config data and commit
type. If we previously commited something, we need to do another follow-up
commit to cleanup that state.
That is for example important with ECMP routes, which are registered in
NMNetns. When NML3Cfg goes down, it always must unregister to properly
cleanup. Failure to do so, causes an assertion failure and crash. This
change fixes that.
Fix that by automatically schedule and idle commit on
register/unregister/add/remove of commit-type/config-data.
It should *always* be permissible to call a AUTO commit from
an idle handler, because various parties cannot use NML3Cfg
independently, and they cannot know when somebody else does a
commit.
Note that NML3Cfg remembers if it presiouvly did a commit
("commit_type_update_sticky"), so even if the last commit-type gets
unregistered, the next commit will still do a sticky update (one more
time).
The only remaining question is what happens during quitting. When
quitting, NetworkManager we may want to leave some interfaces up and
configured. If we were to properly cleanup the NML3Cfg we might need a
mechanism to handle that. However, currently we just leak everything
during quit, so that is not a concern now. It is something that needs
to be addressed in the future.
https://bugzilla.redhat.com/show_bug.cgi?id=2158394https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1505
Routes can be added with `ip route add|change|replace|append|prepend`.
Add a test that randomly tries to add such routes, and checks that
the cache stays consistent.
https://bugzilla.redhat.com/show_bug.cgi?id=2060684
nmtstp_env1_add_test_func() prepares a certain environment (with dummy
interface) that is used by some tests. Extend it, to allow creating more
than one interface (currently up to two).
The major point of NMDedupMultiIndex is that it can de-duplicate
the objects. It thus makes sense the everybody is using the same
instance. Make the multi-idx instance of NMPlatform configurable.
This is not used outside of unit tests, because the daemon currently
always creates one platform instance and everybody then re-uses the
instance of the platform.
While this is (currently) only used by tests, and that the performance
optimization of de-duplicating is irrelevant for tests, this is still
useful. The test can then check whether two separate NMPlatform objects
shared the same instance and whether it was de-duplicated.
We don't cache certain routes, for example based on the protocol. This is
a performance optimization to ignore routes that we usually don't care
about.
Still, if the user does `ip route replace` with such a route, then we
need to pass it to nmp_cache_update_netlink_route(), so that we can
properly remove the replaced route.
Knowing which route was replaces might be impossible, as our cache does
not contain all routes. Likely all that nmp_cache_update_netlink_route()
can to is to set "resync_required" for NLM_F_REPLACE. But for that it
should see the object first.
This also means, if we ever write a BPF filter to filter out messages
that contain NLM_F_REPLACE, because that would lead to cache inconsistencies.
The route table is part of the weak-id. You can see that with:
ip route replace unicast 1.2.3.4/32 dev eth0 table 57
ip route replace unicast 1.2.3.4/32 dev eth0 table 58
afterwards, `ip route show table all` will list both routes. The replace
operation is only per-table. Note that NMP_CACHE_ID_TYPE_ROUTES_BY_WEAK_ID
already got this right.
Fixes: 10ac675299 ('platform: add support for routing tables to platform cache')
CURLOPT_PROTOCOLS [0] was deprecated in libcurl 7.85.0 with
CURLOPT_PROTOCOLS_STR [1] as a replacement.
Well, technically it was only deprecated in 7.87.0, and retroactively
marked as deprecated since 7.85.0 [2]. But CURLOPT_PROTOCOLS_STR exists
since 7.85.0, so that's what we want to use.
This causes compiler warnings and build errors:
../src/core/nm-connectivity.c: In function 'do_curl_request':
../src/core/nm-connectivity.c:770:5: error: 'CURLOPT_PROTOCOLS' is deprecated: since 7.85.0. Use CURLOPT_PROTOCOLS_STR [-Werror=deprecated-declarations]
770 | curl_easy_setopt(ehandle, CURLOPT_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);
| ^~~~~~~~~~~~~~~~
In file included from ../src/core/nm-connectivity.c:13:
/usr/include/curl/curl.h:1749:3: note: declared here
1749 | CURLOPTDEPRECATED(CURLOPT_PROTOCOLS, CURLOPTTYPE_LONG, 181,
| ^~~~~~~~~~~~~~~~~
This patch is largely taken from systemd patch [2].
Based-on-patch-by: Frantisek Sumsal <frantisek@sumsal.cz>
[0] https://curl.se/libcurl/c/CURLOPT_PROTOCOLS.html
[1] https://curl.se/libcurl/c/CURLOPT_PROTOCOLS_STR.html
[2] 6967571bf2
[3] e61a4c0b7c
Fixes: 7a1734926a ('connectivity,cloud-setup: restrict curl protocols to HTTP and HTTPS')
In kernel, the valid range for the weight is 1-256 (on netlink this is
expressed as u8 in rtnh_hops, ranging 0-255).
We need an additional value, to represent
- unset weight, for non-ECMP routes in kernel.
- in libnm API, to express routes that should not be merged as ECMP
routes (the default).
Extend the type in NMPlatformIP4Route.weight to u16, and fix the code
for the special handling of the numeric range.
Also the libnm API needs to change. Modify the type of the attribute on
D-Bus from "b" to "u", to use a 32 bit integer. We use 32 bit, because
we already have common code to handle 32 bit unsigned integers, despite
only requiring 257 values. It seems better to stick to a few data types
(u32) instead of introducing more, only because the range is limited.
Co-Authored-By: Fernando Fernandez Mancera <ffmancera@riseup.net>
Fixes: 1bbdecf5e1 ('platform: manage ECMP routes')
There are two callers of available_connections_add(). One from
cp_connection_added_or_updated() (which is when a connection
gets added/modified) and one from nm_device_recheck_available_connections().
They both call first nm_device_check_connection_available() to see
whether the profile is available on the device. They certainly
need to pass the same check flags, otherwise a profile might
be available in some cases, and not in others.
I didn't actually test this, but I think this could result
in a profile wrongly not being listed as an available-connection.
Moreover, that might mean, that `nmcli connection up $PROFILE`
might work to find the device/profile, but `nmcli device up $DEVICE`
couldn't find the suitable profile (because the latter calls
nm_device_get_best_connection(), which iterates the
available-connections). I didn't test this, because regardless of
that, it seems obvious that the conditions for when we call
available_connections_add() must be the same from both places.
So the only question is what is the right condition, and it would
seem that _NM_DEVICE_CHECK_CON_AVAILABLE_FOR_USER_REQUEST is the right
flag.
Fixes: 02dbe670ca ('device: for available connections check whether they are available for user-request')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1496