During disposal we're calling to remove_all_aps that in turns schedules
an auto-activate recheck. As the device is removed, this triggers an
assertion when trying to do the recheck.
Fix that by not scheduling the recheck.
Example of backtrace that this commits fix:
0 __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
1 0xf746e270 in __pthread_kill_implementation (threadid=<optimized out>, signo=6, no_tid=<optimized out>) at pthread_kill.c:43
2 0xf743fbc6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
3 0xf7431614 in __GI_abort () at abort.c:79
4 0xf775afea in g_assertion_message (domain=domain@entry=0x209a9f "nm", file=file@entry=0x1f7d59 "../NetworkManager-1.43.7/src/core/nm-policy.c", line=line@entry=1665,
func=func@entry=0x1f94d9 <__func__.6> "nm_policy_device_recheck_auto_activate_schedule",
message=message@entry=0x1d3e950 "assertion failed: (g_signal_handler_find(device, G_SIGNAL_MATCH_DATA, 0, 0, NULL, NULL, NM_POLICY_GET_PRIVATE(self)) != 0)")
at ../glib-2.72.3/glib/gtestutils.c:3253
5 0xf775b05e in g_assertion_message_expr (domain=0x209a9f "nm", file=0x1f7d59 "../NetworkManager-1.43.7/src/core/nm-policy.c", line=1665,
func=0x1f94d9 <__func__.6> "nm_policy_device_recheck_auto_activate_schedule",
expr=0x1f8afc "g_signal_handler_find(device, G_SIGNAL_MATCH_DATA, 0, 0, NULL, NULL, NM_POLICY_GET_PRIVATE(self)) != 0") at ../glib-2.72.3/glib/gtestutils.c:3279
6 0x0005f27a in nm_policy_device_recheck_auto_activate_schedule (self=0x1d3e950, device=0x209a9f) at ../NetworkManager-1.43.7/src/core/nm-policy.c:1679
7 0x000548ae in nm_manager_device_recheck_auto_activate_schedule (self=<optimized out>, device=<optimized out>) at ../NetworkManager-1.43.7/src/core/nm-manager.c:3113
8 0x00070622 in nm_device_recheck_auto_activate_schedule (self=<optimized out>) at ../NetworkManager-1.43.7/src/core/devices/nm-device.c:9249
9 0xf693aa8c in ap_add_remove (self=self@entry=0x1ceb0b0, is_adding=0, ap=<optimized out>, recheck_available_connections=0)
at ../NetworkManager-1.43.7/src/core/devices/wifi/nm-device-wifi.c:846
10 0xf693bcda in remove_all_aps (self=self@entry=0x1ceb0b0) at ../NetworkManager-1.43.7/src/core/devices/wifi/nm-device-wifi.c:863
11 0xf693f83c in dispose (object=0x1ceb0b0) at ../NetworkManager-1.43.7/src/core/devices/wifi/nm-device-wifi.c:3809
12 0xf7806e72 in g_object_unref (_object=<optimized out>) at ../glib-2.72.3/gobject/gobject.c:3636
13 g_object_unref (_object=0x1ceb0b0) at ../glib-2.72.3/gobject/gobject.c:3553
14 0x000f7fa4 in _nm_dbus_object_clear_and_unexport (location=location@entry=0xffa50644) at ../NetworkManager-1.43.7/src/core/nm-dbus-object.c:203
15 0x000576e4 in remove_device (self=self@entry=0x1c9c900, device=<optimized out>, quitting=quitting@entry=1) at ../NetworkManager-1.43.7/src/core/nm-manager.c:2289
16 0x0005a864 in nm_manager_stop (self=self@entry=0x1c9c900) at ../NetworkManager-1.43.7/src/core/nm-manager.c:7784
17 0x00023438 in main (argc=<optimized out>, argv=<optimized out>) at ../NetworkManager-1.43.7/src/core/main.c:530
Fixes: 96f40dcdcd ('wifi/ap: explicitly unexport AP and refactor add/remove AP')
Fixes: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1791
When a software device becomes deactivated, we check whether it can
be unrealized (= deleted in kernel), by calling function
delete_on_deactivate_check_and_schedule().
The function returns without doing anything if there is a new
activation enqueued on the device (priv->queued_act_request), because
in that case the device will be reused for the next activation.
This commit fixes a problem seen in NMCI test
@ovs_delete_connecting_interface: sometimes the device is not
unrealized after deleting the connection. That happens because if the
queued activation fails, we never try again to unrealize the device.
Fix that by calling delete_on_deactivate_check_and_schedule() when
there is a failure starting the queued activation.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2258
Unrealized software devices are always available for activation,
hardware devices never.
In nm_manager_get_best_device_for_activation() we call
nm_device_is_available() on candidate devices. Without this fix, any
unrealized software device would be not considered ready for
activation, which is wrong.
A software device can override the default implementation of
is_available(). For example NMDeviceOvsInterface does that and only
checks the OVSDB is ready.
Fixes: ba86c208e0 ('Revert "core: prevent the activation of unavailable OVS interfaces only"')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2253
With this configuration:
[Interface]
...
Address = 172.16.110.116/28,172.16.111.21/28
[Peer]
...
AllowedIPs = 172.16.110.112/28
[Peer]
...
AllowedIPs = 172.16.111.16/28
NetworkManager currently creates the following routes
(1) 172.16.110.112/28 dev wg0 proto static scope link metric 50 <-- peer route
(2) 172.16.110.112/28 dev wg0 proto kernel scope link src 172.16.110.116 metric 50 <-- prefix route
(3) 172.16.111.16/28 dev wg0 proto static scope link metric 50 <-- peer route
(4) 172.16.111.16/28 dev wg0 proto kernel scope link src 172.16.111.21 metric 50 <-- prefix route
If we try to reach a host in the second peer subnet, route (4)
matches. Route (4) doesn't specify a source IP and so the kernel will
use the first IP set on the interface (172.16.110.116), which is the
wrong one.
# ip route get 172.16.111.17
172.16.111.17 dev wg0 src 172.16.110.116 uid 0
To fix this problem, if the AllowedIP subnet is already reachable on
the interface via the prefix route of a static IP address, we should
skip adding the peer route.
wg-quick does something similar here:
https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick/linux.bash?h=v1.0.20250521#n177
The condition in wg-quick is a bit different because it checks that no
duplicate route exists on the interface. We can't do exactly the same
because in NMDeviceWireGuard we don't have visibility on all the
platform routes.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1790https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2254
Adds support for reapplying the `sriov.vfs` property. Note this
does not include `num_vfs`, as the configuration needs to be reset
and reconfigured from scratch in that case.
Previously, if an existing VF is modified (e.g. if we change the `trust`
flag), we reset all VF configurations, and started from scratch. But in
some cases, this is unnecessarily disruptive.
Resolves: https://issues.redhat.com/browse/RHEL-95844
After resuming from suspend, devices with wake-on-lan enabled are
temporarily set as unmanaged, and then managed again. At the beginning
of this process, an active device goes from state ACTIVATED to
UNMANAGED and is deconfigured via
"nm_device_cleanup(cleanup_type=CLEANUP_TYPE_DECONFIGURE)".
If the device is attached to a controller, the cleanup doesn't detach
it. Later when the device is managed again, NetworkManager tries to
create an assumed connection. Normally, this would fail because we
detect that the device is not configured. However, if there is a
controller-port relationship, the assumed connection generation
succeeds and the persistent connection doesn't go up.
As this is wrong, prevent the generation of the assumed connection by
detaching the port during a cleanup.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1766https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2242
The "notify::controller" signal must be emitted on the port, not on
the controller.
Fixes: 1f05526ed7 ('core: drop NMDevice master and introduce controller')
After ACD_WAIT_PROBING_EXTRA_TIME_MSEC has elapsed,
_l3_acd_data_timeout_schedule_probing_restart() keeps rescheduling the
timer with a zero interval, resulting in 100% CPU usage. This
continues until the probe is destroyed after
ACD_WAIT_PROBING_EXTRA_TIME2_MSEC.
When computing the interval, we need to use
(ACD_WAIT_PROBING_EXTRA_TIME_MSEC + ACD_WAIT_PROBING_EXTRA_TIME2_MSEC)
as the expiry time.
acd_data->probing_timestamp_msec indicates when the probing
started. It is used in different places to calculate the timeout for
certain operations. In particular, it is used to detect that the probe
creation took too long when handling the ACD_STATE_CHANGE_MODE_TIMEOUT
event.
If we reset this timestamp at every timer event, we'll never hit the
probe creation timeout. Therefore, the l3cfg will keep trying forever
to create the probe.
See: https://lists.freedesktop.org/archives/networkmanager/2025-July/000418.html
Fix this by not updating the timestamp during a timeout event.
Fixes: a09f9cc616 ('l3cfg: ensure the probing timeout is initialized on probe start')
When resolving the system hostname from DNS lookup, we use
nm_utils_validate_hostname() which checks that the result is a valid
hostname. A valid hostname is at most 64 characters on Linux. Anything
longer is discarded.
However, the reverse DNS lookup doesn't return a hostname, it returns
a DNS name. The DNS name can have multiple labels, each limited to 63
characters. The maximum length of the DNS name is 253 characters.
If the result is longer than 64 characters because it has multiple
labels, we should still accept it, provided that it is a valid DNS
name. Then when setting the hostname in the system, only the first
label will be kept.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2243
Resolves: https://issues.redhat.com/browse/RHEL-104357
Fix the following error seen when running the build_clean.sh script
with LTO disabled:
In file included from ../src/libnm-glib-aux/nm-default-glib.h:66,
from ../src/libnm-glib-aux/nm-default-glib-i18n-prog.h:13,
from ../src/core/nm-default-daemon.h:11,
from ../src/core/platform/tests/test-link.c:6:
In function ‘_nm_auto_freev’,
inlined from ‘test_link_get_bridge_fdb’ at ../src/core/platform/tests/test-link.c:2732:33:
../src/libnm-glib-aux/nm-macros-internal.h:166:8: error: ‘addrs’ may be used uninitialized [-Werror=maybe-uninitialized]
166 | if (*p) {
| ^
../src/core/platform/tests/test-link.c: In function ‘test_link_get_bridge_fdb’:
../src/core/platform/tests/test-link.c:2732:33: note: ‘addrs’ was declared here
2732 | nm_auto_freev NMEtherAddr **addrs;
| ^~~~~
cc1: all warnings being treated as errors
Fixes: 16ef33d380 ('bond-slb: fix memory leak')
Commit c5d1e35f99 ('device: support reapplying bridge-port VLANs')
didn't update can_reapply_change() to accept the "bridge-port.vlans"
property during a reapply. So, it was only possible to change the
bridge port VLANs by updating the "bridge.vlan-default-pvid" property
and doing a reapply. Fix that.
Fixes: c5d1e35f99 ('device: support reapplying bridge-port VLANs')
If the bridge default-pvid is zero, it means that the default PVID is
disabled. That is, the bridge PVID is not propagated to ports.
Currently NM tries to merge the existing bridge VLANs on the port with
the default PVID from the bridge, even when the PVID is zero. This
causes an error when setting the new VLAN list in the kernel, because
it rejects VLAN zero.
Skip the merge of the default PVID when zero.
Fixes: c5d1e35f99 ('device: support reapplying bridge-port VLANs')
If sendto() fails, the function returns and the remaining entries are
not deallocated. Use nm_auto_freev instead to free the array and the
pointer it contains.
Add a test to check that nm_auto_freev does the right thing on the
value returned by nm_linux_platform_get_bridge_fdb().
Fixes: 3f2f922dd9 ('bonding: send ARP announcement on bonding-slb link/carrier down')
Rename nm_linux_platform_get_link_fdb_table() to
nm_linux_platform_get_bridge_fdb(). The new name better indicates that
the function returns the bridge FDB entries.
The DHCP search list option (119) can use the "message compression"
algorithm specified in RFC 1035 section 4.1.4 to reduce the size of
the message in presence of subdomains that appear multiple times.
When using the compression a label starts with:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1 1| OFFSET |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
where the offset points to a previous domain.
Previously, the parsing code was taking the lower 6 bits of the first
byte, shifting them left 16 bits, and adding the next byte. Instead,
the shift should be of 8 bits.
The effect of this bug was that when the offset was greater than 255,
it was incorrectly parsed as a number larger than the message size,
and the parsing failed.
Note that while a single DHCP option can be at most 255 bytes, a DHCP
message can contain multiple instances of the same option. The
receiver must concatenate all the occurrences according to RFC 3396
and parse the resulting buffer.
Fixes: 6adade6f21 ('dhcp: add nettools dhcp4 client')
The validation of embedded NUL character was skipped due to the wrong
order of arguments to memchr(). Fix it.
Fixes: 4043f82790 ('lldp: cleanup converting binary LLDP fields to string')
Currently the bug is hidden because the macro is only called with
NM_SETTING_BOND_OPTION_ARP_IP_TARGET.
Fixes: 45c95e9314 ('device/bond: rework setting of arp_ip_target bond options')
Linux UIDs/GIDs are 32-bit unsigned integer, with 4294967295 reserved
as undefined.
Before:
# useradd -u 4294967294 -M testuser
useradd warning: testuser's uid -2 outside of the UID_MIN 1000 and UID_MAX 60000 range.
# nmcli connection add type tun ifname tun1 owner 4294967294 ipv4.method disabled ipv6.method disabled
Error: Failed to add 'tun-tun1' connection: tun.owner: '4294967294': invalid user ID
After:
# useradd -u 4294967294 -M testuser
useradd warning: testuser's uid -2 outside of the UID_MIN 1000 and UID_MAX 60000 range.
# nmcli connection add type tun ifname tun1 owner 4294967294 ipv4.method disabled ipv6.method disabled
Connection 'tun-tun1' (5da24d19-1723-45d5-8e04-c976f7a251d0) successfully added.
# ip -d link show tun1
2421: tun1: <NO-CARRIER,POINTOPOINT,MULTICAST,NOARP,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 500
link/none promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
tun type tun pi off vnet_hdr off persist on user testuser ...
^^^^^^^^^^^^^
Fixes: 1f30147a7a ('libnm-core: add NMSettingTun')
Currently, when a call to Reapply() results in stage3 being re-run, IPv6
ends up messed up. Like this:
$ nmcli device modify eth0 ipv4.address ''
$ nmcli device modify eth0 ipv4.address 172.31.13.37/24
$
NetworkManager[666]: <debug> [1751286095.2070] device[c95ca04a69467d81] (eth0): ip4: reapply...
...
NetworkManager[666]: <debug> [1751286095.2104] device[c95ca04a69467d81] (eth0): ip6: addrgenmode6: set none (already set)
NetworkManager[666]: <debug> [1751286095.2105] device[c95ca04a69467d81] (eth0): ip6: addrgenmode6: toggle disable_ipv6 sysctl after disabling addr-gen-mode
NetworkManager[666]: <debug> [1751286095.2105] platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/eth0/disable_ipv6' to '1' (current value is '0')
NetworkManager[666]: <debug> [1751286095.2106] platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/eth0/disable_ipv6' to '0' (current value is '1')
NetworkManager[666]: <debug> [1751286095.2106] platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/eth0/accept_ra' to '0' (current value is identical)
NetworkManager[666]: <debug> [1751286095.2106] platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/eth0/disable_ipv6' to '0' (current value is identical)
Not only is this unnecessary because addr-gen-mode already has the
desired value (as is logged), but also wipes off all IPv6 configuration.
This is fine on initial configuration, but not on Reapply().
Let's look at the device state first: if we've progressed past ip-config
state, then we can't possibly ever touch the offending sysctls. It's
okay -- we don't need to: addr-gen-mode is going to be set right if we
went through ip-config before.
Resolves: https://issues.redhat.com/browse/NMT-1681https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2232
Add a new capability to indicate that NetworkManager supports the
"sriov.preserve-on-down" connection property. With this, clients can
set the property only when supported, without the risk of creating an
invalid connection.
This commit adds NM_VERSION_INFO_CAPABILITY_IPV4_FORWARDING to the
VersionInfo D-Bus property, allowing clients such as nmstate to check
the NetworkManager's support of configuring per-device IPv4 sysctl
forwarding setting directly via the capabilities bitmask instead of
relying on the NetworkManager version comparisons.
Fix the following:
../src/core/nm-connectivity.c:958:1: warning: ‘check_platform_config’ defined but not used [-Wunused-function]
958 | check_platform_config(NMConnectivity *self,
| ^~~~~~~~~~~~~~~~~~~~~
Fixes: 91d447df19 ('device: don't start connectivity check on unconfigured devices')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2224
It is not clear whether we can actually respect this value. For example,
we should not restore the kernel's default value on deactivation or
device's state change, but it is unclear if we can ensure that we'll
still have the connection's configuration in all possible changes of
state.
Also, it is unclear if it's a desirable value that we want to support.
At this point it is mostly clear that trying to configure NM managed
devices externally always ends being dissapointing, no matter how hard
we try.
Remove this value for now, while we discuss whether it makes sense or
not, so it doesn't become stable in the new 1.54 release.
It is useful when there is an already active device and we want to
bring it down preserving the SR-IOV VFs. For example:
$ nmcli connection add type ethernet ifname eni1np1 sriov.total-vfs 2 ipv4.method disabled ipv6.method disabled
$ nmcli connection up ethernet-eni1np1
$ ip link show eni1np1
342: eni1np1: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 6e:cf:f0:08:74:f4 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, ...
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, ...
$ nmcli device modify eni1np1 sriov.preserve-on-down yes
$ nmcli connection down ethernet-eni1np1
$ ip link show eni1np1
342: eni1np1: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 6e:cf:f0:08:74:f4 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, ...
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, ...
When using the netdev datapath, we wait that the tun link appears, we
call nm_device_set_ip_ifindex() (which also brings the link up) and
then we check that the link is ready, i.e. that udev has announced the
link and the MAC address is correct. After that, we schedule stage3
(ip-config).
In this, there is a race condition that occurs sometimes in NMCI test
ovs_datapath_type_netdev_with_cloned_mac. In rare conditions,
nm_device_set_ip_ifindex() bring the interface up but then ovs-vswitch
changes again the flags of the interface without IFF_UP. The result is
that the interface stays down, breaking communications.
To fix this, we need to always call nm_device_bring_up() after the tun
device is ready. The problem is that we can't do it in
_netdev_tun_link_cb() because that function is already invoked
synchronously from platform code.
Instead, simplify the handling of the netdev datapath. Every
"link-changed" event from platform is handled by
_netdev_tun_link_cb(), which always schedule a delayed function
_netdev_tun_link_cb_in_idle(). This function just assigns the
ip-ifindex to the device if missing, and starts stage3 if the link is
ready. While doing so, it also bring the interface up.
Fixes: 99a6c6eda6 ('ovs, dpdk: fix creating ovs-interface when the ovs-bridge is netdev')
https://issues.redhat.com/browse/RHEL-17358https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2218
When connecting to an AP configured for WPA3 transition mode, the
connection will fail if PMF is disabled on the client due to SAE and
FT-SAE being unconditionally added to the key_mgmt variable's
parameters.
By removing the "!is_ap ||" check, SAE and FT-SAE will no longer be
selected when PMF is disabled, allowing clients to connect via
WPA2/PSK mode as per the original intent of
a0988868ba.
Signed-off-by: Conn O'Griofa <connogriofa@gmail.com>
The Open VSwitch interfaces have corresponding platform links. When an
Open VSwitch interface is created while NetworkManager is running, the
OVS factory usually sees an OVSDB entry appear first, then creates a
NMDevice. After that, when a platform link appears, the device is
already there.
Upon a (re-)start, the link might be seen first, and then things
go south. The OVS factory rejects the device, which results in Generic
device being created instead. Another device, this time of an
appropriate is created for the same link once the OVSDB entry is seen.
Needless to say, with two NMDevices for the same platform link existing,
no end of mayhem ensues (an assertion is tripped).
Resolves: https://issues.redhat.com/browse/NMT-1634https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2207