Activating a connection clears the autoconnect block;
reapply should do the same.
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit 0486efd358)
(cherry picked from commit 18ce5f43bd)
(cherry picked from commit 2695396939)
(cherry picked from commit 32d2e3c14b)
(cherry picked from commit 387ae9d7ff)
(cherry picked from commit 6f2c7733ce)
(cherry picked from commit 34f7499f3c)
Since kernel 5.18 there is a stricter validation [1][2] on the tos
field of routing rules, that must not include ECN bits.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f55fbb6afb8d701e3185e31e73f5ea9503a66744
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a410a0cf98854a698a519bfbeb604145da384c0e
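The check can be sketched as follows, assuming the ECN bits are the two least significant bits of the tos byte (RFC 3168). The names `ECN_MASK`, `tos_has_ecn_bits()` and `tos_clear_ecn()` are hypothetical and are not taken from NetworkManager or the kernel. The failing rule in the log below uses tos 0xff, which has both ECN bits set:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ECN occupies the two least significant bits of the former TOS byte. */
#define ECN_MASK 0x03u

/* True if the value has ECN bits set and would thus be rejected. */
bool tos_has_ecn_bits(uint8_t tos)
{
    return (tos & ECN_MASK) != 0;
}

/* Keep only the upper six (DSCP) bits, so the value is acceptable. */
uint8_t tos_clear_ecn(uint8_t tos)
{
    uint8_t dscp_only = (uint8_t) (tos & ~ECN_MASK);

    assert(!tos_has_ecn_bits(dscp_only));
    return dscp_only;
}
```

A test that generates random rules could pass its random tos value through such a helper before adding the rule.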
Fixes the following failure:
>>> src/core/platform/tests/test-route-linux
>>> ...
# NetworkManager-MESSAGE: <warn> [1656321515.6604] platform-linux: do-add-rule: failure 22 (Invalid argument - Invalid dsfield (tos): ECN bits must be 0)
>>> failing... errno=-22, rule=[routing-rule,0x13d6e80,1,+alive,+visible; [6] 0: from all tos 0xff fwmark 0x4/0 suppress_prefixlen -459579276 action-214 protocol 255]
>>> existing rule: * [routing-rule,0x13d71e0,2,+alive,+visible; [6] 0: from all sport 65534 lookup 10009 suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d7280,2,+alive,+visible; [4] 0: from all fwmark 0/0x9a7e9992 ipproto 255 suppress_prefixlen 0 realms 0x00000008 none protocol 71]
>>> existing rule: [routing-rule,0x13d7320,2,+alive,+visible; [6] 598928157: from all suppress_prefixlen 0 none]
>>> existing rule: [routing-rule,0x13d73c0,2,+alive,+visible; [4] 0: from 192.192.5.200/8 lookup 254 suppress_prefixlen 0 none protocol 9]
>>> existing rule: [routing-rule,0x13d7460,2,+alive,+visible; [4] 0: from all ipproto 3 suppress_prefixlen 0 realms 0xffffffff none protocol 5]
>>> existing rule: [routing-rule,0x13d7500,2,+alive,+visible; [4] 0: from all fwmark 0x1/0 lookup 254 suppress_prefixlen 0 action-124 protocol 4]
>>> existing rule: [routing-rule,0x13d75a0,2,+alive,+visible; [4] 0: from all suppress_prefixlen 0 action-109]
0: from all fwmark 0/0x9a7e9992 ipproto ipproto-255 realms 8 none proto 71
0: from 192.192.5.200/8 lookup main suppress_prefixlength 0 none proto ra
0: from all ipproto ggp realms 65535/65535 none proto 5
0: from all fwmark 0x1/0 lookup main suppress_prefixlength 0 124 proto static
0: from all 109
0: from all sport 65534 lookup 10009 suppress_prefixlength 0 none
598928157: from all none
Bail out! nm:ERROR:../src/core/platform/tests/test-route.c:1787:test_rule: assertion failed (r == 0): (-22 == 0)
Fixes: 5ae2431b0f ('platform/tests: add tests for handling policy routing rules')
(cherry picked from commit bf9a2babb4)
(cherry picked from commit 09b0014a01)
(cherry picked from commit e1266b3b12)
(cherry picked from commit 8da69be278)
(cherry picked from commit d276884206)
If the MAC changes there is the possibility that the DHCP client will
not be able to renew the address because it uses the old MAC as
CHADDR. Depending on the implementation, the DHCP server might use
CHADDR (so, the old address) as the destination MAC for DHCP replies,
and those packets will be lost.
To avoid this problem, restart the DHCP client when the MAC changes.
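A minimal sketch of the idea, not the actual NetworkManager code: `struct dhcp_client` and `dhcp_client_link_changed()` are hypothetical stand-ins, and the restart itself is only simulated by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for a running DHCP client. */
struct dhcp_client {
    uint8_t  chaddr[MAC_LEN]; /* hardware address sent in requests */
    unsigned restarts;        /* how often the client was restarted */
};

/* Called when the link reports its (possibly new) MAC address.
 * Returns true if the client was restarted with the new address. */
bool dhcp_client_link_changed(struct dhcp_client *client,
                              const uint8_t mac[MAC_LEN])
{
    assert(client && mac);

    if (memcmp(client->chaddr, mac, MAC_LEN) == 0)
        return false; /* same MAC: the lease can be renewed as usual */

    /* A real client would be stopped and started again here, so that
     * CHADDR matches the address the interface now listens on. */
    memcpy(client->chaddr, mac, MAC_LEN);
    client->restarts++;
    return true;
}
```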
https://bugzilla.redhat.com/show_bug.cgi?id=2110000
(cherry picked from commit 905adabdba)
(cherry picked from commit 5a49a2f6b2)
(cherry picked from commit d0fb3fbf8e)
(cherry picked from commit 59a52510f3)
(cherry picked from commit 0766d08db9)
(cherry picked from commit 25abc22ac9)
Add support for IPv6 multipath routes, by treating them as single-hop
routes. Otherwise, we can easily end up with an inconsistent platform
cache.
Background:
-----------
Routes are hard. We have NMPlatform which is a cache of netlink objects.
That means, we have a hash table and we cache objects based on some
identity (nmp_object_id_equal()). So those objects must have some immutable,
indistinguishable properties that determine whether an object is the
same or a different one.
For routes and routing rules, this identifying property is basically a subset
of the attributes (but not all!). That makes it very hard, because tomorrow
kernel could add an attribute that becomes part of the identity, and NetworkManager
wouldn't recognize it, resulting in cache inconsistency by wrongly
thinking two different routes are one and the same. Anyway.
The other point is that we rely on netlink events to maintain the cache.
So when we receive a RTM_NEWROUTE we add the object to the cache, and
delete it upon RTM_DELROUTE. When you do `ip route replace`, kernel
might replace a (different!) route, but only send one RTM_NEWROUTE message.
We handle that by somehow finding the route that was replaced/deleted. It's
ugly. Did I say, that routes are hard?
Also, for IPv4 routes, multipath attributes are just a part of the
route's identity. That is, you add two different routes that only differ
by their multipath list, and then kernel does as you would expect.
NetworkManager does not support IPv4 multihop routes and just ignores
them.
Also, a multipath route can have next hops on different interfaces,
which goes against our current assumption, that an NMPlatformIP4Route
has an interface (or no interface, in case of blackhole routes). That
makes it hard to meaningfully support IPv4 multipath routes. But we probably don't
have to, because we can just pretend that such routes don't exist and
our cache stays consistent (at least, until somebody calls `ip route
replace` *sigh*).
Not so for IPv6. When you add (`ip route append`) an IPv6 route that is
identical to an existing route -- except their multipath attribute -- then it
behaves as if the existing route was modified and the result is the
merged route with more next-hops. Note that in this case kernel will
only send a RTM_NEWROUTE message with the full multipath list. If we
would treat the multipath list as part of the route's identity, this
would be as if kernel deleted one route and created a different one (the
merged one), but only sending one notification. That's a bit similar to
what happens during `ip route replace`, but it would be a nightmare to
find out which route was thereby replaced.
Likewise, when you delete a route, then kernel will "subtract" the
next-hop and send a RTM_DELROUTE notification only about the next-hop that
was deleted. To handle that, you would have to find the full multihop
route, and replace it with the remainder after the subtraction.
NetworkManager so far ignored IPv6 routes with more than one next-hop. This
means you can start with one single-hop route (that NetworkManager sees
and has in the platform cache). Then you create a similar route (only
differing by the next-hop). Kernel will merge the routes, but not notify
NetworkManager that the single-hop route is no longer a single-hop
route. This can easily cause a cache inconsistency and subtle bugs. For
IPv6 we MUST handle multihop routes.
Kernel's behavior makes little sense, if you expect that routes have an
immutable identity and want to get notifications about addition/removal.
We can however make sense of it by pretending that all IPv6 routes are
single-hop! With only the twist that a single RTM_NEWROUTE notification
might notify about multiple routes at the same time. This is what the
patch does.
The Patch
---------
Now one RTM_NEWROUTE message can contain multiple IPv6 routes
(NMPObject). That would mean that nmp_object_new_from_nl() needs to
return a list of objects. But it's not implemented that way. Instead,
we still call nmp_object_new_from_nl(), and the parsing code can
indicate that there is something more, telling the caller to call
nmp_object_new_from_nl() again in a loop to fetch more objects.
In practice, I think all RTM_DELROUTE messages for IPv6 routes are
single-hop. Still, we implement it to handle also multi-hop messages the
same way.
Note that we just parse the netlink message again from scratch. The alternative
would be to parse the first object once, and then clone the object and
only update the next-hop. That would be more efficient, but probably
harder to understand/implement.
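The calling convention described above can be sketched like this. It is a simplified model, not the real nmp_object_new_from_nl(): `struct route_msg`, `struct route`, `route_new_from_msg()` and `routes_from_msg()` are hypothetical, and the message is assumed to be decoded already, with plain integers standing in for prefixes and gateways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NEXT_HOPS 8

/* Hypothetical, already decoded RTM_NEWROUTE message for an IPv6 route. */
struct route_msg {
    int    prefix_id;                 /* stands in for dest/plen/metric */
    size_t n_next_hops;               /* 1 for a plain single-hop route */
    int    gateway_id[MAX_NEXT_HOPS]; /* stands in for each next-hop */
};

/* One single-hop route, as it would be stored in the cache. */
struct route {
    int prefix_id;
    int gateway_id;
};

/* Parse next-hop number "idx" of the message into a single-hop route.
 * Sets *out_more if the caller should call again with idx + 1. */
bool route_new_from_msg(const struct route_msg *msg, size_t idx,
                        struct route *out, bool *out_more)
{
    assert(msg && out && out_more);

    *out_more = false;
    if (msg->n_next_hops > MAX_NEXT_HOPS || idx >= msg->n_next_hops)
        return false;

    out->prefix_id  = msg->prefix_id;
    out->gateway_id = msg->gateway_id[idx];
    *out_more       = (idx + 1 < msg->n_next_hops);
    return true;
}

/* Caller side: loop until the parser reports that nothing is left. */
size_t routes_from_msg(const struct route_msg *msg,
                       struct route *routes, size_t max)
{
    size_t n    = 0;
    bool   more = true;

    while (more && n < max) {
        if (!route_new_from_msg(msg, n, &routes[n], &more))
            break;
        n++;
    }
    return n;
}
```

Each call starts again from the message itself, mirroring the "parse again from scratch" choice rather than cloning the first object.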
https://bugzilla.redhat.com/show_bug.cgi?id=1837254#c20
(cherry picked from commit dac12a8d61)
(cherry picked from commit 698cf1092c)
(cherry picked from commit 87fe255c89)
(cherry picked from commit c8a1d9ca73)
The variable with this purpose is usually called "IS_IPv4".
It's upper case, because usually this is a const variable, and because
it reminds of the NM_IS_IPv4(addr_family) macro. That letter case
is unusual, but it makes sense to me for the special purpose that this
variable has.
Anyway. The naming of this variable is a different point. Let's
use the variable name that is consistent and widely used.
(cherry picked from commit 8085c0121f)
(cherry picked from commit eec32669a9)
(cherry picked from commit 99cd6ed25e)
(cherry picked from commit d64930f7eb)
To parse the RTA_MULTIPATH message, "policy" is not right (which is used
to parse the overall message). Instead, we don't really have a special
policy that we should use.
This was not a severe issue, because the allocated buffer (with
G_N_ELEMENTS(policy) elements) was larger than need be. And apparently,
using the wrong policy also didn't cause us to reject important
messages.
(cherry picked from commit 997d72932d)
(cherry picked from commit 21b1978072)
(cherry picked from commit ef1587bd88)
(cherry picked from commit 6aa3d40199)
The compiler is often adamant to warn about maybe-uninitialized.
(cherry picked from commit 3dd854eb1b)
(cherry picked from commit 471e987add)
(cherry picked from commit 4fa6001c60)
This "fix" was wrong, because at the beginning of the if-else-block
there is already `key_mgmt_conf = g_string_new(key_mgmt);`.
This reverts commit a6e06171ad.
a6e06171ad ('supplicant/config: fix setting "saw" for "wifi-sec.key-mgnt=sae"')
This is a partial backport of commit 5f146b40f3 ('supplicant/config:
Refactor key_mgmt config generation').
Based-on-patch-by: Jonas Dreßler <verdre@v0yd.nl>
Fixes: d17a0a0905 ('supplicant: allow fast transition for WPA-PSK and WPA-EAP')
(cherry picked from commit 5f146b40f3)
If the kernel command-line doesn't contain an explicit ip=$method,
currently the generator creates connections with both IPv4 and IPv6
set to 'auto', and both allowed to fail.
Since NM is run in configure-and-quit mode in the initrd, NM can get
an IPv4 address or an IPv6 one (or both) depending on which address
family is quicker to complete. This unpredictable behavior is not
present in the legacy module, which always does IPv4 only by default.
Set a required-timeout of 20 seconds for IPv4, so that NM will
preferably get an IPv4, or will fall back to IPv6.
See also: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/729
(cherry picked from commit 0a18e97345)
(cherry picked from commit 1b9cf8c513)
Change the logic in check_ip_state() to delay the connection ACTIVATED
state if an address family is pending and its required-timeout has not
expired.
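A simplified sketch of that decision, under stated assumptions: it is not the real check_ip_state(), the types and function names are hypothetical, and the time is passed in as milliseconds elapsed since the IP configuration started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum ip_state { IP_STATE_PENDING, IP_STATE_DONE, IP_STATE_FAILED };

/* Hypothetical per-address-family view of the device. */
struct family_state {
    enum ip_state state;
    bool          may_fail;            /* family is allowed to fail */
    uint32_t      required_timeout_ms; /* 0: no required timeout */
};

/* True if activation has to wait for this family. */
bool family_must_wait(const struct family_state *f, uint32_t elapsed_ms)
{
    assert(f);

    if (f->state != IP_STATE_PENDING)
        return false;
    if (!f->may_fail)
        return true; /* mandatory family: always wait for it */
    return elapsed_ms < f->required_timeout_ms;
}

/* True if the connection may move to ACTIVATED now. */
bool can_become_activated(const struct family_state *ip4,
                          const struct family_state *ip6,
                          uint32_t elapsed_ms)
{
    assert(ip4 && ip6);

    /* At least one family must be configured. */
    if (ip4->state != IP_STATE_DONE && ip6->state != IP_STATE_DONE)
        return false;

    /* A failed mandatory family never allows activation. */
    if ((ip4->state == IP_STATE_FAILED && !ip4->may_fail)
        || (ip6->state == IP_STATE_FAILED && !ip6->may_fail))
        return false;

    return !family_must_wait(ip4, elapsed_ms)
           && !family_must_wait(ip6, elapsed_ms);
}
```

With IPv6 done, IPv4 pending and a required timeout of 20000 ms on IPv4, the sketch keeps the connection from activating until either IPv4 completes or 20 seconds have passed.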
(cherry picked from commit 35cccc41cb)
(cherry picked from commit 51e5df275c)
Add a new property to specify the minimum time interval in
milliseconds for which dynamic IP configuration should be tried before
the connection succeeds.
This property is useful for example if both IPv4 and IPv6 are enabled
and are allowed to fail. Normally the connection succeeds as soon as
one of the two address families completes; by setting a required
timeout for e.g. IPv4, one can ensure that even if IPv6 succeeds
earlier than IPv4, NetworkManager waits some time for IPv4 before the
connection becomes active.
(cherry picked from commit cb5960cef7)
(cherry picked from commit 08ce20481c)
For the umpteenth time: it is not the ifcfg-rh writer's job to decide
which configurations are valid and to persist settings only depending on
some other settings.
If s390-options were only allowed together with subchannels, then
ensuring that would solely be nm_connection_verify()'s task.
Reproduce with
$ nmcli connection add type ethernet autoconnect no con-name zz ethernet.s390-options bridge_role=primary
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1935842
Fixes: 16bccfd672 ('core: handle s390 options more cleanly')
(cherry picked from commit d391f20730)
(cherry picked from commit b425793d90)
If a prefix delegation is needed, currently NM restarts DHCPv6 on the
device with default route, but only if DHCPv6 was already running.
Allow the device to start DHCPv6 for a PD even if it was running
without DHCPv6.
See also: https://github.com/coreos/fedora-coreos-tracker/issues/888
(cherry picked from commit 62869621bd)
(cherry picked from commit 75b8ced29a)
Previously we sent announcements immediately for non-controllers, or
after the first port was attached for controllers.
This has two problems:
- announcements can be sent when there is no carrier and they would
be lost;
- if a controller has a port, the port could be itself a controller;
in that case we start sending ARPs with the fake address of the
port. Later, when a leaf port is added to the second-level
controller, the correct port MAC will be propagated by kernel up to
both controllers.
To solve both problems, send ARP announcements only when the interface
has carrier. This also solves the second issue because controllers
created by NM have carrier only when there is a port with carrier.
Fixes: de1022285a ('device: do ARP announcements only after masters have a slave')
https://bugzilla.redhat.com/show_bug.cgi?id=1956793
(cherry picked from commit 1377f160ed)
(cherry picked from commit 70aeccf605)
When determining the hostname, it is preferable to evaluate devices in
a predictable order to avoid that the hostname changes between
different boots.
The current order is based first on hostname priority, then on the
presence of a best default route, and then on activation order.
The activation order is not a very strong condition, as it is
basically useless for devices that are autoactivated at boot.
As we already prefer IPv4 over IPv6 within the same connection, also
prefer it when 2 connections have the same priority and the same
default route status, to achieve better predictability.
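The resulting order can be sketched as a comparison function. This is an illustration only: `struct candidate` and `candidate_cmp()` are hypothetical, and it is an assumption of the sketch that a lower priority value wins and that an earlier activation sorts first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical candidate for providing the hostname. */
struct candidate {
    int      priority;          /* hostname priority; lower wins (assumed) */
    bool     has_default_route; /* device has the best default route */
    bool     is_ipv4;           /* address family of the candidate */
    unsigned activation_order;  /* lower: activated earlier (assumed) */
};

/* qsort() comparator: priority, default route, IPv4 before IPv6,
 * and only then the (weak) activation order. */
int candidate_cmp(const void *pa, const void *pb)
{
    const struct candidate *a = pa;
    const struct candidate *b = pb;

    assert(a && b);

    if (a->priority != b->priority)
        return a->priority < b->priority ? -1 : 1;
    if (a->has_default_route != b->has_default_route)
        return a->has_default_route ? -1 : 1;
    if (a->is_ipv4 != b->is_ipv4)
        return a->is_ipv4 ? -1 : 1;
    if (a->activation_order != b->activation_order)
        return a->activation_order < b->activation_order ? -1 : 1;
    return 0;
}
```

Because the address family is compared before the activation order, two devices that autoactivate at boot in a varying order still sort the same way.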
https://bugzilla.redhat.com/show_bug.cgi?id=1970335
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/895
(cherry picked from commit 637a45e25b)
(cherry picked from commit 21051dc6d8)
If the TC setting contains no qdiscs and filters, it is lost after a
write-read cycle. Fix this by adding a new property to indicate the
presence of the (empty) setting.
(cherry picked from commit 6a88d4e55c)
NetworkManager supports a very limited set of qdiscs. If users want to
configure an unsupported qdisc, they need to do it outside of
NetworkManager using tc.
The problem is that NM also removes all qdiscs and filters during
activation if the connection doesn't contain a TC setting. Therefore,
setting TC configuration outside of NM is hard because users need to
do it *after* the connection is up (for example through a dispatcher
script).
Let NM consider the presence (or absence) of a TC setting in the
connection to determine whether NM should configure (or not) qdiscs
and filters on the interface. We already do something similar for
SR-IOV configuration.
Since new connections don't have the TC setting, the new behavior
(ignore existing configuration) will be the default. The impact of
this change in different scenarios is:
- the user previously configured TC settings via NM. This continues
to work as before;
- the user didn't set any qdiscs or filters in the connection, and
expected NM to clear them from the interface during activation.
Here there is a change in behavior, but it seems unlikely that
anybody relied on the old one;
- the user didn't care about qdiscs and filters; NM removed all
qdiscs upon activation, and so the default qdisc from kernel was
used. After this change, NM will not touch qdiscs and the default
qdisc will be used, as before;
- the user set a different qdisc via tc and NM cleared it during
activation. Now this will work as expected.
So, the new default behavior seems better than the previous one.
https://bugzilla.redhat.com/show_bug.cgi?id=1928078
(cherry picked from commit a48edd0410)
If the configuration contains dns=none and resolv.conf is updated
through a dispatcher script, currently there is no way to tell NM that
the content of resolv.conf changed, so that it can restart a hostname
resolution.
Use SIGUSR1 (and SIGHUP) for that.
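A generic POSIX sketch of reacting to these signals, not NetworkManager's actual signal handling: the handler only sets a flag, and a main loop (not shown) would poll it and restart the hostname resolution. All function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t resolv_conf_changed = 0;

static void on_reload_signal(int signo)
{
    (void) signo;
    resolv_conf_changed = 1; /* only async-signal-safe work here */
}

/* Install the handler for SIGUSR1 and SIGHUP. Returns 0 on success. */
int install_reload_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_reload_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGHUP, &sa, NULL);
}

/* Returns true once after one or more signals arrived; the caller
 * would then re-read resolv.conf and resolve the hostname again. */
bool take_resolv_conf_changed(void)
{
    if (!resolv_conf_changed)
        return false;
    resolv_conf_changed = 0;
    return true;
}
```

A dispatcher script that rewrites resolv.conf could then notify the daemon by sending it SIGUSR1 with kill.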
(cherry picked from commit fa1f628bce)
Found by valgrind.
Fixes: 4154d9618c ('bluetooth: refactor BlueZ handling and let NMBluezManager cache ObjectManager data')
(cherry picked from commit 6813a4fe75)
(cherry picked from commit a25c577556)
Found by valgrind.
Fixes: b83f07916a ('supplicant: large rework of wpa_supplicant handling')
(cherry picked from commit 01df4a5ad0)
(cherry picked from commit 80a8a5d16d)
"uuid" is returned from nms_keyfile_nmmeta_check_filename(),
and contains "$UUID.nmmeta". We must compare only the first
"uuid_len" bytes.
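A minimal sketch of such a length-limited comparison. `nmmeta_uuid_equal()` is a hypothetical helper, not a function from the source tree, and the short string in the usage note is a placeholder rather than a real UUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* "name" points at "$UUID.nmmeta"; only its first "uuid_len" bytes are
 * the UUID. Compare exactly those bytes against "uuid". */
bool nmmeta_uuid_equal(const char *name, size_t uuid_len, const char *uuid)
{
    assert(name && uuid);

    return strlen(uuid) == uuid_len
           && strncmp(name, uuid, uuid_len) == 0;
}
```

For the name "a1b2.nmmeta" with a uuid_len of 4, a plain strcmp() against "a1b2" reports a mismatch because of the suffix, while the length-limited comparison matches.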
Fixes: 064544cc07 ('settings: support storing "shadowed-storage" to .nmmeta files')
(cherry picked from commit 7e8e6836e0)
Coverity says:
Error: ALLOC_FREE_MISMATCH (CWE-762):
NetworkManager-1.31.3/src/core/tests/test-systemd.c:261: alloc: Allocation of memory which must be freed using "free".
NetworkManager-1.31.3/src/core/tests/test-systemd.c:274: free: Calling "_nm_auto_g_free" frees "exp2_arr" using "g_free" but it should have been freed using "free".
# 272| g_assert_cmpmem(expected_arr, expected_len, exp3_arr, exp3_len);
# 273| }
# 274|-> }
# 275|
# 276| #define _test_unbase64mem(base64, expected_str) \
Error: ALLOC_FREE_MISMATCH (CWE-762):
NetworkManager-1.31.3/src/core/tests/test-systemd.c:270: alloc: Allocation of memory which must be freed using "free".
NetworkManager-1.31.3/src/core/tests/test-systemd.c:274: free: Calling "_nm_auto_g_free" frees "exp3_arr" using "g_free" but it should have been freed using "free".
# 272| g_assert_cmpmem(expected_arr, expected_len, exp3_arr, exp3_len);
# 273| }
# 274|-> }
# 275|
# 276| #define _test_unbase64mem(base64, expected_str) \
Fixes: 0298d54078 ('systemd: expose unbase64mem() as nm_sd_utils_unbase64mem()')
(cherry picked from commit 44abe6d661)
Coverity says:
Error: ALLOC_FREE_MISMATCH (CWE-762):
NetworkManager-1.31.3/src/core/dhcp/nm-dhcp-systemd.c:234: alloc: Allocation of memory which must be freed using "free".
NetworkManager-1.31.3/src/core/dhcp/nm-dhcp-systemd.c:447: free: Calling "_nm_auto_g_free" frees "routes" using "g_free" but it should have been freed using "free".
# 445| }
# 446| NM_SET_OUT(out_options, g_steal_pointer(&options));
# 447|-> return g_steal_pointer(&ip4_config);
# 448| }
# 449|
Fixes: acc0d79224 ('systemd: merge branch 'systemd' into master')
(cherry picked from commit 64985beef8)
Found by Coverity:
Error: RESOURCE_LEAK (CWE-772):
NetworkManager-1.31.3/src/core/nm-config-data.c:450: alloc_fn: Storage is returned from allocation function "nm_config_data_get_value".
NetworkManager-1.31.3/src/core/nm-config-data.c:450: var_assign: Assigning: "str" = storage returned from "nm_config_data_get_value(self, "main", "auth-polkit", (enum [unnamed type of NMConfigGetValueFlags])6)".
NetworkManager-1.31.3/src/core/nm-config-data.c:454: noescape: Resource "str" is not freed or pointed-to in "nm_auth_polkit_mode_from_string".
NetworkManager-1.31.3/src/core/nm-config-data.c:465: leaked_storage: Variable "str" going out of scope leaks the storage it points to.
# 463| NM_SET_OUT(out_invalid_config, FALSE);
# 464|
# 465|-> return auth_polkit_mode;
# 466| }
# 467|
Fixes: 6d7446e52f ('core: add main.auth-polkit option "root-only"')
(cherry picked from commit ceaa1c369f)
nm_act_request_set_shared() already calls nm_utils_share_rules_apply().
Calling it twice is pretty bad because during deactivate we will only
remove one of each duplicate rule.
Fixes: 701654b930 ('core: refactor tracking of shared-rules to use NMUtilsShareRules')
(cherry picked from commit 60744889e2)
Active-connections in the async_op_lst are not guaranteed to have a
settings-connection. In particular, the settings-connection for an
AddAndActivate() AC is set only after the authorization succeeds. Use
the non-asserting variant of the function to fix the following
failure:
nm_active_connection_get_settings_connection: assertion 'sett_conn' failed
1 _g_log_abort()
2 g_logv()
3 g_log()
4 _nm_g_return_if_fail_warning.constprop.14()
5 nm_active_connection_get_settings_connection()
6 active_connection_find()
7 _get_activatable_connections_filter()
8 nm_settings_get_connections_clone()
9 nm_manager_get_activatable_connections()
10 auto_activate_device_cb()
11 g_idle_dispatch()
12 g_main_context_dispatch()
13 g_main_context_iterate.isra.21()
14 g_main_loop_run()
15 main()
Fixes: 33b9fa3a3c ('manager: Keep volatile/external connections while referenced by async_op_lst')
https://bugzilla.redhat.com/show_bug.cgi?id=1933719
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/834
(cherry picked from commit 23cc0bf335)
Commit 33b9fa3a3c ("manager: Keep volatile/external connections
while referenced by async_op_lst") changed active_connection_find() to
also return active connections that are not yet activating but are
waiting for authorization.
This has a side effect for other callers of the function. In particular,
_get_activatable_connections_filter() should exclude only ACs that are
really active, not those waiting for authorization.
Otherwise, in ensure_master_active_connection() all the ACs waiting for
authorization are missed and we might fail to find the right master
AC.
Add an argument to active_connection_find() to select whether to include
ACs waiting for authorization.
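The effect of the new argument can be sketched like this. It is a simplified model, not the real active_connection_find(): `struct active_conn` and `find_active_conn()` are hypothetical, and an integer stands in for the settings-connection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of an active connection (AC). */
struct active_conn {
    int  settings_id;  /* stands in for the settings-connection */
    bool waiting_auth; /* queued, authorization not yet granted */
};

/* Find the AC for "settings_id". ACs that still wait for authorization
 * are only considered if "include_waiting_auth" is set. */
const struct active_conn *
find_active_conn(const struct active_conn *acs, size_t n,
                 int settings_id, bool include_waiting_auth)
{
    assert(acs || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (acs[i].settings_id != settings_id)
            continue;
        if (acs[i].waiting_auth && !include_waiting_auth)
            continue;
        return &acs[i];
    }
    return NULL;
}
```

In this model, a caller that filters activatable connections would pass false, so a profile whose AC is only waiting for authorization stays visible to it.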
Fixes: 33b9fa3a3c ('manager: Keep volatile/external connections while referenced by async_op_lst')
https://bugzilla.redhat.com/show_bug.cgi?id=1955101
(cherry picked from commit e694f2cec1)
If the device is still unmanaged by platform-init (which means that
udev didn't emit the event for the interface) when the device gets
realized, we currently clear the assume state. Later, when the device
becomes managed, NM is not able to properly assume the device using
the UUID.
This situation arises, for example, when NM already configured the
device in initrd; after NM is restarted in the real root, udev events
can be delayed causing this race condition.
Among all unmanaged flags, platform-init is the only one that can be
delayed externally. We should not clear the assume state if the device
has only platform-init in the unmanaged flags.
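The condition can be sketched as a bitmask test. The flag names and values below are hypothetical and only model the idea; they are not NetworkManager's actual unmanaged flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reasons for a device being unmanaged. */
enum unmanaged_flags {
    UNMANAGED_NONE          = 0,
    UNMANAGED_PLATFORM_INIT = 1 << 0, /* udev event not seen yet */
    UNMANAGED_USER_SETTINGS = 1 << 1,
    UNMANAGED_SLEEPING      = 1 << 2,
    UNMANAGED_ALL           = (1 << 3) - 1,
};

/* Clear the assume state only if the device is unmanaged for a reason
 * other than a still pending platform-init. */
bool should_clear_assume_state(unsigned flags)
{
    assert((flags & ~(unsigned) UNMANAGED_ALL) == 0);

    return (flags & ~(unsigned) UNMANAGED_PLATFORM_INIT) != 0;
}
```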
(cherry picked from commit 3c4450aa4d)
_set_state_full() in NMDevice already calls
nm_device_assume_state_reset() when the device reaches state >
DISCONNECTED.
(cherry picked from commit 5dc6d73243)
Probably pid_t is always signed, because kill() documents that
negative values have a special meaning (technically, C would
automatically cast negative signed values to an unsigned pid_t type
too).
Anyway, NMDhcpClient at several places uses -1 as special value for "no
pid". At the same time, it checks for valid PIDs with "pid > 1". That
only works if pid_t is signed.
Add a static assertion for that.
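Such an assertion could look roughly like this in C11. `NO_PID` and `pid_is_valid()` are hypothetical names that mirror the -1 sentinel and the "pid > 1" check mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Fails to compile if pid_t is unsigned: (pid_t) -1 would then be a
 * large positive value instead of a negative one. */
static_assert((pid_t) -1 < 0, "pid_t must be a signed type");

/* Sentinel for "no pid". */
#define NO_PID ((pid_t) -1)

/* A valid child PID is greater than 1; this rejects the sentinel, 0
 * and 1 only because pid_t is signed. */
bool pid_is_valid(pid_t pid)
{
    return pid > 1;
}
```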
(cherry picked from commit 92bfe09724)
When the link goes away the manager keeps software devices alive as
unrealized because there is still a connection for them.
If the device is software and has a NM-generated connection, keeping
the device alive means that also the generated connection stays
alive. The result is that both stick around forever even if there is
no longer a kernel link.
Add a check to avoid this situation.
https://bugzilla.redhat.com/show_bug.cgi?id=1945282
Fixes: cd0cf9229d ('veth: add support to configure veth interfaces')
(cherry picked from commit d19773ecd4)