Commit graph

390 commits

Author SHA1 Message Date
Beniamino Galvani
1673e3f051 core: wait for carrier before resolving hostname via DNS
If there is no carrier on a device, don't try to resolve the hostname
on it. Instead, subscribe to carrier change notifications and retry
again once carrier goes up.

https://bugzilla.redhat.com/show_bug.cgi?id=2118817
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1402
(cherry picked from commit e3cf5083fb)
2022-10-14 11:20:45 +02:00
Thomas Haller
0db1c68a36
device: fix hanging port devices when controller goes down while port is not fully attached
This partly reverts 1fe8166fc9 ('device: only deactivate when the master
we've enslaved to goes away').

If the controller fails while the port is not yet fully attached,
before this patch the following happened:

  <info>  [1664299566.1065] device (bond0): state change: ip-config -> failed (reason 'config-failed', sys-iface-state: 'managed')
  ...
  <warn>  [1664299566.1073] device (bond0): Activation: failed for connection 'bond0'
  <trace> [1664299566.1073] device[6b76ac7314eb0b53] (bond0): master: release one slave a9f10ea824bb1725/eth1 (not enslaved) (configure)
  <debug> [1664299566.1073] device[a9f10ea824bb1725] (eth1): unmanaged: flags set to [!sleeping,!by-type,!platform-init,!user-explicit,!user-settings,!user-conf=0x0/0x179/managed], forget [is-slave=0x800], reason removed)
  ...
  <info>  [1664299566.1080] device (eth1): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')

Note that now eth1 has no controller, but it lingers in "ip-config" state indefinitely.

If we look at a case where the port is already attached we see:

  <info>  [1664299540.9661] device (bond0): state change: secondaries -> failed (reason 'config-failed', sys-iface-state: 'managed')
  ...
  <warn>  [1664299540.9667] device (bond0): Activation: failed for connection 'bond0'
  <trace> [1664299540.9667] device[6b76ac7314eb0b53] (bond0): master: release one slave a9f10ea824bb1725/eth1 (enslaved) (configure)
  <debug> [1664299540.9667] platform: (eth1) link: releasing 10 from master 'bond0' (80)
  ...
  <info>  [1664299540.9740] device (bond0): detached bond port eth1
  ...
  <debug> [1664299540.9749] device[a9f10ea824bb1725] (eth1): Activation: connection 'eth1' master failed
  ...
  <warn>  [1664299540.9749] device (eth1): queue-state[secondaries, reason:none, id:520]: replace previously queued state change
  ...
  <debug> [1664299540.9750] device[a9f10ea824bb1725] (eth1): queue-state[deactivating, reason:dependency-failed, id:533]: queue state change
  <debug> [1664299540.9751] device[a9f10ea824bb1725] (eth1): unmanaged: flags set to [!sleeping,!by-type,!platform-init,!user-explicit,!user-settings,!user-conf=0x0/0x179/managed], forget [is-slave=0x800], reason removed)
  ...
  <debug> [1664299541.0201] device[a9f10ea824bb1725] (eth1): enslaved to unknown device 0 (??)
  ...
  <debug> [1664299541.0227] device[a9f10ea824bb1725] (eth1): queue-state[deactivating, reason:dependency-failed, id:533]: change state
  <info>  [1664299541.0228] device (eth1): state change: ip-check -> deactivating (reason 'dependency-failed', sys-iface-state: 'managed')

Fix that by not ignoring the nm_device_slave_notify_release() call. Now we get:

  <info>  [1664391684.9757] device (bond0): state change: ip-config -> failed (reason 'config-failed', sys-iface-state: 'managed')
  ...
  <debug> [1664391684.9759] active-connection[69c2b12d61f5b171]: set state deactivated (was activating)
  <debug> [1664391684.9760] active-connection[142bb8240f6a696d]: check-master-ready: already signalled (state activating, master 0x56116f1480a0 is in state deactivated)
  ...
  <debug> [1664391684.9762] manager: ActivatingConnection now (none)
  ...
  <warn>  [1664391684.9763] device (bond0): Activation: failed for connection 'bond0'
  <trace> [1664391684.9763] device[142828814dec6e26] (bond0): master: release one slave 720791275fe8a68c/eth1 (not enslaved) (configure)
  <debug> [1664391684.9763] device[720791275fe8a68c] (eth1): Activation: connection 'eth1' master failed
  ...
  <debug> [1664391684.9764] device[720791275fe8a68c] (eth1): queue-state[deactivating, reason:dependency-failed, id:3047]: queue state change
  <debug> [1664391684.9765] device[720791275fe8a68c] (eth1): unmanaged: flags set to [!sleeping,!by-type,!platform-init,!user-explicit,!user-settings,!user-conf=0x0/0x179/managed], forget [is-slave=0x800], reason removed)
  ...
  <debug> [1664391684.9797] device[720791275fe8a68c] (eth1): queue-state[deactivating, reason:dependency-failed, id:3047]: change state
  <info>  [1664391684.9797] device (eth1): state change: config -> deactivating (reason 'dependency-failed', sys-iface-state: 'managed')

Commit 1fe8166fc9 ('device: only deactivate when the master we've
enslaved to goes away') added the "return", but it seems to also add it
in cases where we need to handle this. Restrict the return to cases if
we do "no-config".

https://bugzilla.redhat.com/show_bug.cgi?id=2130287

Fixes: 1fe8166fc9 ('device: only deactivate when the master we've enslaved to goes away')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1406
(cherry picked from commit 2be9c693d9)
2022-10-11 18:10:35 +02:00
Thomas Haller
6af0233a21
device: allow resetting the devip state via nm_device_devip_set_state()
There is no reason to disallow resetting the state.

(cherry picked from commit 607a9544cb)
2022-09-29 15:15:39 +02:00
Thomas Haller
46e3c23d85
bond: use _nm_setting_bond_opt_value_as_intbool() in _platform_lnk_bond_init_from_setting()
Previously, we used _nm_utils_ascii_str_to_bool(). That can accept any
kind of input (like "true"), so one might think that this is better to
use on user-input. However, NMSettingBond already validates the these
options are integers (either "0" or "1"). So a value like "true"
could never be here.

Use _nm_setting_bond_opt_value_as_intbool() because that asserts that
the option if of the expected type (integer).

(cherry picked from commit b1a72d0f21)
2022-09-29 14:14:42 +02:00
Thomas Haller
bb0fb83694
bond: make _platform_lnk_bond_init_from_setting() more readable via a macro
Use macros to make the code shorter and easier to read.

(cherry picked from commit b7c56c3ae1)
2022-09-29 14:14:42 +02:00
Thomas Haller
558bcd5aae
firewall/trivial: rename nm_firewall_config_apply() to nm_firewall_config_apply_sync()
Sync/blocking methods are ugly. Their name should highlight this.
Also, we may have an async variant, so we will need the "good" name
for apply() and apply_finish().

(cherry picked from commit dc66fb7d04)
2022-09-29 13:51:56 +02:00
Thomas Haller
bfb4452f7d
firewall/trivial: rename nm_firewall_config_new() to nm_firewall_config_new_shared()
(cherry picked from commit 7ad3fb1956)
2022-09-29 13:51:55 +02:00
Fernando Fernandez Mancera
65d31a11f8 veth: drop iface peer check during create_and_realize()
When fetching the parent device, if the system is slow, NetworkManager
can hit a race condition where the property is still NULL. In that case,
NetworkManager should create the veth link.

Checking that the peer device exists, it is type NM_DEVICE_TYPE_VETH and
it have a parent device is enough to know that we can skip the link
creation.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1399

https://bugzilla.redhat.com/show_bug.cgi?id=2129829

Fixes: 4655b7c308 ('veth: fix veth activation on booting')
(cherry picked from commit 07e0ab48d1)
2022-09-29 10:35:12 +02:00
Fernando Fernandez Mancera
178ff4e42a bond: fix arp_all_target option when arp_interval is disabled
The bond option arp_all_target can be set even if arp_interval is
disabled.

https://bugzilla.redhat.com/show_bug.cgi?id=2123311

Fixes: e064eb9d13 ('bond: use netlink to set bond options')
(cherry picked from commit 3871c670ab)
2022-09-27 13:54:11 +02:00
Lubomir Rintel
dc2d2da9db
device: don't ignore external slave removals
We've been outright ignoring master-slave checks if the link ended up
without a master since commit 2e22880894 ('device: don't remove the
device from master if its link has no master').

This was done to deal with OpenVSwitch port-interface relationship,
where the interface's platform link lacked an actual master in platform
(what matters there is the OVSDB entry), but the fix was too wide.

Let's limit the special case to devices whose were not enslaved to
masters that lack a platform link, which pretty much happens for
OpenVSwitch only.

Morale: Write better commit messages of future you is going to be upset
Fixes: 2e22880894 ('device: don't remove the device from master if its link has no master')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1358
(cherry picked from commit a1de6810df)
2022-09-13 12:38:19 +02:00
Fernando Fernandez Mancera
1ba916915d bond: fix primary bond option when the link is not present
Bond option netlink support requires primary property to be a ifindex
instead of the interface name. This is a workaround for supporting
specifying a primary that does not exist yet.

```
nmcli con add type bond ifname mybond0 bond.options "mode=active-backup,primary=veth1"
Connection 'bond-mybond0' (38100ef9-11e2-4003-aff9-cb2d152ce34f) successfully added.
nmcli con add type ethernet ifname veth1 master mybond0

cat /sys/class/net/mybond0/bonding/primary
veth1
```

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1362

Fixes: e064eb9d13 ('bond: use netlink to set bond options')
(cherry picked from commit 4fd90fb6cc)
2022-09-13 11:08:01 +02:00
Beniamino Galvani
bdaba47a68 device: don't emit recheck-assume if there is a queued activation request
The @dracut_NM_vlan_over_team_no_boot sometimes fails, among other
things, because it fails to assume an indicated connection after a
restart.

That seems to happen because after the decision to activate the
indicated connection, the device does not move from DISCONNECTED state
quickly enough. Another assumption recheck runs in between and decides
to generate a connection, because the assume state was already reset
in between.

First start, creates and activates b3a61b68-f744-4a4c-a513-61399c154a67
on vlan0017:

  NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
      (asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
  ...
  settings: update[b3a61b68-f744-4a4c-a513-61399c154a67]: adding connection "vlan0017"
      (45113870df0a4cfb/keyfile)

Second start:

  NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
      (after a restart, asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)

Assumption attempt successfully picks the right connection and thus
proceeds to reset the assume state:

  manager: (vlan0017): assume: will attempt to assume matching connection 'vlan0017'
      (b3a61b68-f744-4a4c-a513-61399c154a67) (indicated)
  device[c7c5101cf0b73f5f] (vlan0017): assume-state: set guess-assume=0, connection=(null)

Everything great so far, activation of the right connection is enqueued
and the device moves away from unavailable state. However, the
activation can't proceed immediately:

  device (vlan0017): state change: unmanaged -> unavailable
      (reason 'connection-assumed', sys-iface-state: 'assume')
  device (vlan0017): state change: unavailable -> disconnected
      (reason 'connection-assumed', sys-iface-state: 'assume')
  active-connection[0x55ba1162f1c0]: set device "vlan0017" [0x55ba1163c4f0]
  device[c7c5101cf0b73f5f] (vlan0017): queue activation request waiting for carrier

Now another assumption attempt is done. The original assume state is
gone, so a connection is generated:

  platform-linux: UDEV event: action 'add' subsys 'net' device 'vlan0017' (6); seqnum=1959
  device[c7c5101cf0b73f5f] (vlan0017): queued link change for ifindex 6
  manager: (vlan0017): assume: generated connection 'vlan0017' (57627119-8c20-4f9e-bf4d-4fc427b4a6a9)
  keyfile: commit: 57627119-8c20-4f9e-bf4d-4fc427b4a6a9 (vlan0017) added as
      "/run/NetworkManager/system-connections/vlan0017-57627119-8c20-4f9e-bf4d-4fc427b4a6a9.nmconnection"
      (nm-generated,volatile,external)

I think this shouldn't have happened. We've picked the correct
connection already and it's enqueued for activation!

Change the check in nm_device_emit_recheck_assume() to also consider
any queued activation.

Fixes-test: @dracut_NM_vlan_over_team_no_boot

Co-authored-by: Lubomir Rintel <lkundrak@v3.sk>

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1351
(cherry picked from commit 9eb8cbca76)
2022-09-05 09:20:52 +02:00
Beniamino Galvani
5a49a2f6b2
device: restart DHCP when the MAC changes
If the MAC changes there is the possibility that the DHCP client will
not be able to renew the address because it uses the old MAC as
CHADDR. Depending on the implementation, the DHCP server might use
CHADDR (so, the old address) as the destination MAC for DHCP replies,
and those packets will be lost.

To avoid this problem, restart the DHCP client when the MAC changes.

https://bugzilla.redhat.com/show_bug.cgi?id=2110000
(cherry picked from commit 905adabdba)
2022-08-25 23:24:47 +02:00
Beniamino Galvani
2f8e4e2b06
core: log when dynamic IP configuration is restarted and why
(cherry picked from commit 6cd69fde33)
2022-08-25 23:24:46 +02:00
Lubomir Rintel
9d7e5a3b79
device: wait for carrier on unavailable device even when it gets a connection assumed
The test in question leaves the device with a master set, which caused a
connection to get assumed and therefore the previous fix didn't kick in.

Fixes-test: @restart_L2_only_lacp
Fixes: 5b7f8f3f70 ('device: wait for carrier even if it wasn't us who brought the device IFF_UP')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1348
(cherry picked from commit c183f10f65)
2022-08-25 23:16:13 +02:00
Thomas Haller
56d0d35516
mptcp: rework "connection.mptcp-flags" for enabling MPTCP
1) The "enabled-on-global-iface" flag was odd. Instead, have only
and "enabled" flag and skip (by default) endpoints on interface
that have no default route. With the new flag "also-without-default-route",
this can be overruled. So previous "enabled-on-global-default" now is
the same as "enabled", and "enabled" from before behaves now like
"enabled,also-without-default-route".

2) What was also odd, as that the fallback default value for the flags
depends on "/proc/sys/net/mptcp/enabled". There was not one fixed
fallback default, instead the used fallback value was either
"enabled-on-global-iface,subflow" or "disabled".
Usually that is not a problem (e.g. the default value for
"ipv6.ip6-privacy" also depends on use_tempaddr sysctl). In this case
it is a problem, because the mptcp-flags (for better or worse) encode
different things at the same time.
Consider that the mptcp-flags can also have their default configured in
"NetworkManager.conf", a user who wants to switch the address flags
could previously do:

  [connection.mptcp]
  connection.mptcp-flags=0x32   # enabled-on-global-iface,signal,subflow

but then the global toggle "/proc/sys/net/mptcp/enabled" was no longer
honored. That means, MPTCP handling was always on, even if the sysctl was
disabled. Now, "enabled" means that it's only enabled if the sysctl
is enabled too. Now the user could write to "NetworkManager.conf"

  [connection.mptcp]
  connection.mptcp-flags=0x32   # enabled,signal,subflow

and MPTCP handling would still be disabled unless the sysctl
is enabled.

There is now also a new flag "also-without-sysctl", so if you want
to really enable MPTCP handling regardless of the sysctl, you can.
The point of that might be, that we still can configure endpoints,
even if kernel won't do anything with them. Then you could just flip
the sysctl, and it would start working (as NetworkManager configured
the endpoints already).

Fixes: eb083eece5 ('all: add NMMptcpFlags and connection.mptcp-flags property')
(cherry picked from commit c00873e08f)
2022-08-25 23:12:53 +02:00
Fernando Fernandez Mancera
1b704e2f42
bond: fix missing assignment of lp_interval_has
The variable `lp_interval` was being assigned instead of
`lp_interval_has`. The `lp_interval` bond option was not being set
correctly.

https://bugs.launchpad.net/network-manager/+bug/1987001

Fixes: e064eb9d13 ('bond: use netlink to set bond options')
(cherry picked from commit 7d4307e8df)
2022-08-25 15:40:17 +02:00
Fernando Fernandez Mancera
003eb75eb6 bond: fix parsing of arp_ip_target to platform
nm_setting_bond_get_option_normalized() is returning the arp_ip_target
IPs separated by comma instead of a blank space.

https://bugzilla.redhat.com/show_bug.cgi?id=2117202

Fixes: e064eb9d13 ('bond: use netlink to set bond options')
2022-08-11 12:41:59 +02:00
Thomas Haller
6fb11dbe77
device: allow reapplying changes to "connection.autoconnect-priorty"
Of course, this setting has no effect while being activated. But it
should not prevent reapply.
2022-08-09 14:11:55 +02:00
Thomas Haller
eb083eece5
all: add NMMptcpFlags and connection.mptcp-flags property 2022-08-09 08:02:54 +02:00
Thomas Haller
f4b128c63b
device: fix reapply for lldp/mdns/llmnr/dns-over-tls settings
When only one of those connection.{lldp,mdns,llmnr,dns-over-tls}
settings changes, we still need to do a full restart of the IP
configuration to reapply the changes.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
2022-08-09 08:02:37 +02:00
Lubomir Rintel
5cf96c4db2
bridge: fix reapply of vlan_filtering and default_pvid
Fixes: 8e8fed433f ('bridge: add reapply support')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1327
2022-08-06 21:08:51 +02:00
Thomas Haller
d20343c9d0
glib-aux: rework random number utils
Heavily inspired by systemd ([1]).

We now also have nm_random_get_bytes{,_full}() and
nm_random_get_crypto_bytes(), like systemd's random_bytes()
and crypto_random_bytes(), respectively.

Differences:

- instead of systemd's random_bytes(), our nm_random_get_bytes_full()
  also estimates whether the output is of high quality. The caller
  may find that interesting. Due to that, we will first try to call
  getrandom(GRND_NONBLOCK) before getrandom(GRND_INSECURE). That is
  reversed from systemd's random_bytes(), because we want to find
  out whether we can get good random numbers. In most cases, kernel
  should have entropy already, and it makes no difference.

Otherwise, heavily rework the code. It should be easy to understand
and correct.

There is also a major bugfix here. Previously, if getrandom() failed
with ENOSYS and we fell back to /dev/urandom, we would assume that we
have high quality random numbers. That assumption is not warranted.
Now instead poll on /dev/random to find out.

[1] a268e7f402/src/basic/random-util.c (L81)
2022-08-05 19:29:34 +02:00
Fernando Fernandez Mancera
e064eb9d13 bond: use netlink to set bond options
Use the netlink platform implementation for setting the bond link
options.
2022-08-04 11:18:36 +02:00
Fernando Fernandez Mancera
f900f7bc2c platform: add netlink support for bond link
sysfs is deprecated and kernel people will not add new bond options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
2022-08-04 11:18:36 +02:00
Lubomir Rintel
5b7f8f3f70 device: wait for carrier even if it wasn't us who brought the device IFF_UP
The devices generally need to be IFF_UP and wait a little before the
carrier detection is reliable. Some devices, actually need to wait
more than a little -- r8169 needs up to 5 seconds.

For this reason, we delay startup complete while the carrier is down
after we bring the device up. We do this so that we don't reject
activations due to carrier down until we're sure it's really down.
This works well as long as it's us who brought the device up.

If we're restarting the daemon, the device is going to be already up
when we start up the daemon for the second time. There's, however, a
slim chance that the device was brought down and up very shortly before
the restart and therefore the carrier reporting is still not reliable.
As a matter of fact, we bring the devices down and back up on some
occassions, such as when enslaving to a team device.

Therefore, the following events in quick succession cause trouble:

  # nmcli con up team-slave-eth0
  [20099.205355] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
  [20099.365641] nm-team: Port device eth0 added
  [20099.370728] r8169 0000:03:00.0 eth0: Link is Down
  [20099.436631] nm-team: Port device eth0 removed
  [20099.463422] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
  [20099.628505] r8169 0000:03:00.0 eth0: Link is Down
  [20099.669425] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
  [20099.833457] r8169 0000:03:00.0 eth0: Link is Down
  [20099.838471] nm-team: Port device eth0 added

The device has been brought down, enslaved and brought up.
"Link is Down" indicates carrier not being detected.

  Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/7)
  # systemctl restart NetworkManager

Now NM sees the device being up, but carrier down.

  # nmcli con up testeth0
  Error: Connection activation failed: No suitable device found for this connection (...).

Activation failed, because eth0 carrier still appears down.

  # [20102.943464] r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Now it's up, but the party is already over. Shiet.

Let's wait whenever the device reaches unavailable state, whether we
bring it up at that point or not.

Fixes-test: @restart_L2_only_lacp

https://bugzilla.redhat.com/show_bug.cgi?id=2092361
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1316
2022-08-02 15:06:35 +02:00
Dylan Van Assche
0f3eb6fabb
nm-device-bt: allow Bluetooth NAP type for complete-connection
Bluetooth NAP is besides Bluetooth PAN and DUN also supported by
NetworkManager. Add NAP to the supported Bluetooth types of
nm-device-bt.c

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1058

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1320
2022-08-01 09:37:42 +02:00
Lubomir Rintel
2b4b4193be bridge: fix reapply of non-bridge properties
Return was ommited in a branch that delegates settings check to a parent
class, resulting in a bridge property check applied incorrectly.

Fixes: 8e8fed433f ('bridge: add reapply support')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1318
2022-07-29 12:45:00 +02:00
Beniamino Galvani
0cd3ffa7e9 build: fix compilation
Fixes: dbf29c5450 ('platform: fix build with musl libc')
2022-07-27 19:51:56 +02:00
Beniamino Galvani
dbf29c5450 platform: fix build with musl libc
Don't mix <net/ethernet.h> and <linux/if_ether.h>.

Fixes the following build error with musl libc:

  In file included from /usr/include/net/ethernet.h:10,
                   from ../src/libnm-platform/nm-linux-platform.c:17:
  /usr/include/netinet/if_ether.h:115:8: error: redefinition of 'struct ethhdr'
    115 | struct ethhdr {
        |        ^~~~~~
  In file included from ../src/linux-headers/ethtool.h:19,
                   from ../src/libnm-std-aux/nm-linux-compat.h:22,
                   from ../src/libnm-platform/nm-linux-platform.c:10:
  /usr/include/linux/if_ether.h:169:8: note: originally defined here
    169 | struct ethhdr {
        |        ^~~~~~

Fixes: dc98ab807c ('platform: include "linux-headers" via "libnm-std-aux/nm-linux-compat.h"')
2022-07-27 18:46:01 +02:00
Thomas Haller
d3c9bb4666
platform: rename file "nmp-route-manager.[hc]" to "nmp-global-tracker.[hc]" 2022-07-26 12:45:55 +02:00
Thomas Haller
bf248e0400
platform: rename NMPRouteManager to NMPGlobalTracker
NetworkManager primarily manages interfaces in an independent fashion.
That means, whenever possible, we want to have a interface specific
view. In many cases, the underlying kernel API also supports that view.
For example, when configuring IP addresses or unicast routes, we do so
per interfaces and don't need a holistic view.

However, that is not always sufficient. For routing rules and certain
route types (blackhole, unreachable, etc), we need a system wide view
of all the objects in the network namespace.

Originally, NMPRulesManager was added to track routing rules. Then, it
was extended to also track certain route types, and the API was renamed to
NMPRouteManager.

This will also be used to track MPTCP addresses.

So rename again, to give it a general name that is suitable for what it
does. Still, the name is not great (suggestion welcome), but it should
cover the purpose of the API well enough. And it's the best I came
up with.

Rename.
2022-07-26 12:43:44 +02:00
Beniamino Galvani
2c70fef12e bridge: don't reset vlan filtering parameters on external connections
Fixes: 96fab7b462 ('all: add vlan-filtering and vlan-default-pvid bridge properties')

https://bugzilla.redhat.com/show_bug.cgi?id=2107647
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1305
2022-07-26 09:00:43 +02:00
Lubomir Rintel
8e8fed433f bridge: add reapply support
We're able to reapply all properties in the bridge setting, aside from
"mac-address" which is used for matching the device.

https://bugzilla.redhat.com/show_bug.cgi?id=2092762
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1296
2022-07-25 13:42:50 +02:00
Beniamino Galvani
93372e8100 ovs: fail device only when it's activating
It doesn't make sense to fail a device that is not activating.

Especially, if the device was in state UNMANAGED, it would enter state
FAILED (and then DISCONNECTED) or ACTIVATED (when external or
assumed); both are wrong.

https://bugzilla.redhat.com/show_bug.cgi?id=2077950
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1302
2022-07-19 14:02:24 +02:00
Beniamino Galvani
8c17760f62 ppp,wwan: remove explicit initialization of DNS priority
It's no longer necessary, as modem devices get the priority from the
ipmanual configuration created from the profile.
2022-07-18 07:48:13 +02:00
Beniamino Galvani
0717589972 wwan: enable manual IP configuration
Before 1.36, manual addresses from the profile were assigned to the
interface; restore that behavior.

The manual IP configuration also contains the DNS priority from the
profile; so this change ensures that the merged l3cd has a DNS
priority and that dynamically discovered DNS servers are not ignored
by the DNS manager.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
2022-07-18 07:48:12 +02:00
Beniamino Galvani
2ae8433520 device: add "is_manual" argument to ready_for_ip_config() device method
Some device types might want to run manual ip configuration while
skipping other methods.
2022-07-18 07:48:12 +02:00
Fernando Fernandez Mancera
4655b7c308 veth: fix veth activation on booting
When creating one profile for each veth during activation the creation
of the veth could fail. When the link for the first profile is created
the link for the peer is generated in kernel. Therefore when trying to
activate the second profile it will fail because the link already
exists. NetworkManager must check if the link already exists and
corresponds to the same veth, if so, it should skip the link creation.

https://bugzilla.redhat.com/show_bug.cgi?id=2036023
https://bugzilla.redhat.com/show_bug.cgi?id=2105956
2022-07-12 13:34:18 +02:00
Thomas Haller
d8a4b3bec2
all: reformat with clang-format (clang-tools-extra-14.0.0-1.fc36) and update gitlab-ci to f36 2022-07-06 11:06:53 +02:00
Beniamino Galvani
fb4ac007ba wifi: wait supplicant to settle before renewing DHCP after roam
After roaming to a different AP, if we trigger a DHCP renewal while
the supplicant is still reauthenticating the REQUEST will be lost and
the client will fall back to sending a DISCOVER, potentially getting a
different address.

Wait that the supplicant state settles before renewing.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1024
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1263
2022-07-04 13:21:06 +02:00
Thomas Haller
5245fc6c75
platform: rename nmp_lookup_init_object() to nmp_lookup_init_object_by_ifindex()
In the past, nmp_lookup_init_object() could both lookup all object for a
certain ifindex, and lookup all objects of a type. That fallback path
already leads to an assertion failure fora while now, so nobody should
be using this function to lookup all objects of a certain type (for
what, we have nmp_lookup_init_obj_type()).

Now, remove the fallback path, and rename the function to what it really
does.
2022-06-30 14:08:41 +02:00
Thomas Haller
e6a33c04eb
all: make "ipv6.addr-gen-mode" configurable by global default
It can be useful to choose a different "ipv6.addr-gen-mode". And it can be
useful to override the default for a set of profiles.

For example, in cloud or in a data center, stable-privacy might not be
the best choice. Add a mechanism to override the default via global defaults
in NetworkManager.conf:

  # /etc/NetworkManager/conf.d/90-ipv6-addr-gen-mode-override.conf
  [connection-90-ipv6-addr-gen-mode-override]
  match-device=type:ethernet
  ipv6.addr-gen-mode=0

"ipv6.addr-gen-mode" is a special property, because its default depends on
the component that configures the profile.

- when read from disk (keyfile and ifcfg-rh), a missing addr-gen-mode
  key means to default to "eui64".
- when configured via D-Bus, a missing addr-gen-mode property means to
  default to "stable-privacy".
- libnm's ip6-config::addr-gen-mode property defaults to
  "stable-privacy".
- when some tool creates a profile, they either can explicitly
  set the mode, or they get the default of the underlying mechanisms
  above.

  - nm-initrd-generator explicitly sets "eui64" for profiles it creates.
  - nmcli doesn' explicitly set it, but inherits the default form
    libnm's ip6-config::addr-gen-mode.
  - when NM creates a auto-default-connection for ethernet ("Wired connection 1"),
    it inherits the default from libnm's ip6-config::addr-gen-mode.

Global connection defaults only take effect when the per-profile
value is set to a special default/unset value. To account for the
different cases above, we add two such special values: "default" and
"default-or-eui64". That's something we didn't do before, but it seams
useful and easy to understand.

Also, this neatly expresses the current behaviors we already have. E.g.
if you don't specify the "addr-gen-mode" in a keyfile, "default-or-eui64"
is a pretty clear thing.

Note that usually we cannot change default values, in particular not for
libnm's properties. That is because we don't serialize the default
values to D-Bus/keyfile, so if we change the default, we change
behavior. Here we change from "stable-privacy" to "default" and
from "eui64" to "default-or-eui64". That means, the user only experiences
a change in behavior, if they have a ".conf" file that overrides the default.

https://bugzilla.redhat.com/show_bug.cgi?id=1743161
https://bugzilla.redhat.com/show_bug.cgi?id=2082682

See-also: https://github.com/coreos/fedora-coreos-tracker/issues/907

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1213
2022-06-29 07:38:48 +02:00
Thomas Haller
7f766014c5
team: specify cli-type for teamdctl_connect() to select usock/dbus
teamdctl_connect() has a parameter cli_type. If unspecified, the
library will try usock, dbus (if enabled) and zmq (if enabled).

Trying to use the unix socket if we expect to use D-Bus can be bad. For
example, it might cause SELinux denials.

As we anyway require libteam to use D-Bus, if D-Bus is available,
explicitly select the cli type.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1255
2022-06-27 14:04:40 +02:00
Thomas Haller
863b71a8fe
all: use internal _nm_utils_ip4_netmask_to_prefix()
We have two variants of the function: nm_utils_ip4_netmask_to_prefix()
and _nm_utils_ip4_netmask_to_prefix(). The former only exists because it
is public API in libnm. Internally, only use the latter.
2022-06-27 10:50:24 +02:00
Beniamino Galvani
f8885d0724 core: avoid stale entries in the DNS manager for non-virtual devices
_dev_l3_register_l3cds() schedules a commit, but if the device has
commit type NONE, that doesn't emit a l3cd-changed. Do it manually,
to ensure that entries are removed from the DNS manager.

Related: b86388bef3 ('core: avoid stale entries in the DNS manager')
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/995
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1268
2022-06-24 12:02:45 +02:00
Beniamino Galvani
a216739e09 device: stop ac6 grace time when ip6ll is ready in shared mode
The IPv6 shared mode starts IPv6 autoconf to send router
advertisements. IPv6 autoconf schedules a 30-second timeout waiting
for a link-local address to appear. When the link-local address
appears, we need to cancel the timeout.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1030
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1266
2022-06-22 18:05:55 +02:00
Fernando Fernandez Mancera
87eb61c864 libnm: support wait-activation-delay property
The property wait-activation-delay will delay the activation of an
interface the specified amount of milliseconds. Please notice that it
could be delayed some milliseconds more due to other events in
NetworkManager.

This could be used in multiple scenarios where the user needs to define
an arbitrary delay e.g LACP bond configure where the LACP negotiation
takes a few seconds and traffic is not allowed, so they would like to
use nm-online and a setting configured with this new property to wait
some seconds. Therefore, when nm-online is finished, LACP bond should be
ready to receive traffic.

The delay will happen right before the device is ready to be activated.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1248

https://bugzilla.redhat.com/show_bug.cgi?id=2008337
2022-06-16 02:14:21 +02:00
Lubomir Rintel
1f61f3f239 device: release slaves when an external device is going managed
When we're deactivating an externally created device that has a master
because we're activating a connection on it, actually remove the device
from the master. Otherwise unpleasant things happen:

  active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
  device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
  device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
  device (dummy0): disconnecting for new activation request.
  device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)

Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.
This patch changes that.

  device (br0): bridge port dummy0 was detached
  device (dummy0): released from master device br0
  active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
  device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
  device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')

  platform: (dummy0) emit signal link-changed changed: 102: dummy0
      <NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
       addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746

Now the platform sent us a new link, the "master" property is still set.

  device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
  device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
  device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
  config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
  active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
  active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
  device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
  device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare

  l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
  device (br0): master: add one slave 0a458361f9fed8f5/dummy0

Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above. Sad.

  device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
  active-connection[0x55ed7ba78400]: set state activating (was unknown)
  manager: NetworkManager state is now CONNECTING
  active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
  device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
  device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
  platform: (dummy0) link: releasing 102 from master 'br0' (101)
  device (br0): detached bridge port dummy0

Now things go south. The stage1 cleans the device up, removing it from
the master and the device itself decides it should deactivate itself
because it lots its master regardless of the fact that it should not
have one and it's in fact an unwanted carryover from previous activation.
I believe this is also wrong.

  device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
  device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'
  device[0a458361f9fed8f5] (dummy0): queue-state[deactivating, reason:connection-assumed, id:298]: queue state change
  device[0a458361f9fed8f5] (dummy0): activation-stage: synchronously invoke activate_stage2_device_config
  device (dummy0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')

Now things are really weird. We synchronously go to config, effectively
overriding the queued deactivation. We've really messed up.
2022-06-14 14:21:53 +02:00
Lubomir Rintel
1fe8166fc9 device: only deactivate when the master we've enslaved to goes away
Sometimes weird things happen.

Let dummy0 be an externally created device that has a master. We decide
to activate a connection that has no master on it:

  active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
  device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
  device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
  device (dummy0): disconnecting for new activation request.
  device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)

Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.

  device (br0): bridge port dummy0 was detached
  device (dummy0): released from master device br0
  active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
  device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
  device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')

  platform: (dummy0) emit signal link-changed changed: 102: dummy0
      <NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
       addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746

Now the platform sent us a new link, the "master" property is still set.

  device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
  device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
  device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
  config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
  active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
  active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
  device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
  device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare

  l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
  device (br0): master: add one slave 0a458361f9fed8f5/dummy0

Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above.

  device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
  active-connection[0x55ed7ba78400]: set state activating (was unknown)
  manager: NetworkManager state is now CONNECTING
  active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
  device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
  device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
  platform: (dummy0) link: releasing 102 from master 'br0' (101)
  device (br0): detached bridge port dummy0

Now stage1 cleans the device up, removing it from the master.

  device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
  device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'

We decide to deal with this by enqueuing a deactivation. That is not
great -- we shouldn't even have had this master!

This patch takes the deactivation path only if we were willingly
enslaved to the master in question.
2022-06-14 14:21:53 +02:00