Commit graph

1001 commits

Author SHA1 Message Date
Thomas Haller
3b58404712
platform: add NMPGenlFamilyType enum for generic netlink types
The genl types that we care about are well known. Add an enum
for them, so we can do a lookup by index.

To kernel, the corresponding names (like "wireguard") are also well
known. However, the family-id, that we need when using genl are
allocated dynamically. So we need to lookup the family-id, and by having
an enum for the genl type, we can do so generically.
2022-07-19 12:33:50 +02:00
Thomas Haller
5c84fe0db5
core: support "nm.debug" kernel command line to enable verbose logging
When NetworkManager runs in initrd, it can be cumbersome to enable debug logging.
Granted, when using dracut, the NetworkManager dracut module will honor "rd.debug".
However, a user may use NetworkManager in initrd without dracut. Then,
the only way to enable debug logging would be by changing
"NetworkManager.conf" and rebuild the initrd (or having some script in
place, that allows to more conveniently enable debug logging for
NetworkManager).

To make it easier for debugging, honor "nm.debug" on the kernel command
line.

Note that if "nm.debug" is set on the kernel command line, it always overrides
both the command line arguments and the configuration from NetworkManager.conf.
That is intentional. The only way to override that is by overriding the
kernel command line with a file "/run/NetworkManager/proc-cmdline".

https://bugzilla.redhat.com/show_bug.cgi?id=2102313
2022-07-18 15:00:04 +02:00
Thomas Haller
d4b7934997
core: support "/run/NetworkManager/proc-cmdline" to overwrite /proc/cmdline
We read /proc/cmdline for "match.kernel-command-line". But next we will
also honor "nm.debug" on the kernel command line, to enable debug
logging. For "nm.debug" it makes sense that it overwrites the debug
options from the command line and from "NetworkManager.conf". That
means, if you set "nm.debug", then verbose logging will be enabled. It
can only be turned off again at runtime (via D-Bus), otherwise, it's
hard to avoid.

It still can make sense to overrule this setting once again. Support
that, by honoring a file "/run/NetworkManager/proc-cmdline" to be used
instead of "/proc/cmdline".

This option is mainly for debugging and testing, but it might be useful
in production too, if you had "nm.debug" enabled during boot, but later
want to disable it until next reboot. Then you could do:

  sed 's/ *\<nm\.debug\> */ /g' /proc/cmdline > /run/NetworkManager/proc-cmdline
  nmcli general logging level DEFAULT domains DEFAULT
2022-07-18 14:58:00 +02:00
Beniamino Galvani
8c17760f62 ppp,wwan: remove explicit initialization of DNS priority
It's no longer necessary, as modem devices get the priority from the
ipmanual configuration created from the profile.
2022-07-18 07:48:13 +02:00
Beniamino Galvani
0717589972 wwan: enable manual IP configuration
Before 1.36, manual addresses from the profile were assigned to the
interface; restore that behavior.

The manual IP configuration also contains the DNS priority from the
profile; so this change ensures that the merged l3cd has a DNS
priority and that dynamically discovered DNS servers are not ignored
by the DNS manager.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
2022-07-18 07:48:12 +02:00
Beniamino Galvani
2ae8433520 device: add "is_manual" argument to ready_for_ip_config() device method
Some device types might want to run manual ip configuration while
skipping other methods.
2022-07-18 07:48:12 +02:00
Thomas Haller
a9818692b8
policy: downgrade verbosity of hostname change logging message
This message seems not useful at <info> level. Downgrade logging level.
2022-07-15 09:22:56 +02:00
Fernando Fernandez Mancera
4655b7c308 veth: fix veth activation on booting
When creating one profile for each veth during activation the creation
of the veth could fail. When the link for the first profile is created
the link for the peer is generated in kernel. Therefore when trying to
activate the second profile it will fail because the link already
exists. NetworkManager must check if the link already exists and
corresponds to the same veth, if so, it should skip the link creation.

https://bugzilla.redhat.com/show_bug.cgi?id=2036023
https://bugzilla.redhat.com/show_bug.cgi?id=2105956
2022-07-12 13:34:18 +02:00
Beniamino Galvani
1784fc9fa1 core: update DNS when the device enters IP_CONFIG state
Update DNS information when the device enters the IP_CONFIG state. In
this way, when dispatcher events "dhcp4-change,dhcp6-change" are
emitted resolv.conf already contains the information received from
the DHCP lease.

https://bugzilla.redhat.com/show_bug.cgi?id=2100456
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1283
2022-07-11 15:51:37 +02:00
Thomas Haller
d8a4b3bec2
all: reformat with clang-format (clang-tools-extra-14.0.0-1.fc36) and update gitlab-ci to f36 2022-07-06 11:06:53 +02:00
Slava Monich
8c5356cec6 supplicant: fix a memory leak
==30980== 8 bytes in 1 blocks are definitely lost in loss record 1,117 of 6,137
==30980==    at 0x4841C38: malloc (vg_replace_malloc.c:309)
==30980==    by 0x4A246C7: g_malloc (gmem.c:106)
==30980==    by 0x4A4A4BB: g_variant_get_strv (gvariant.c:1607)
==30980==    by 0x4A4CA73: g_variant_valist_get_nnp (gvariant.c:4901)
==30980==    by 0x4A4CA73: g_variant_valist_get_leaf (gvariant.c:5058)
==30980==    by 0x4A4CA73: g_variant_valist_get (gvariant.c:5239)
==30980==    by 0x4A4D11D: g_variant_get_va (gvariant.c:5502)
==30980==    by 0x4A4D1BD: g_variant_lookup (gvariant.c:989)
==30980==    by 0xE9389: parse_capabilities (nm-supplicant-interface.c:1241)
==30980==    by 0xEBF99: _properties_changed_main (nm-supplicant-interface.c:1941)
==30980==    by 0xEF549: _properties_changed (nm-supplicant-interface.c:2867)
==30980==    by 0xEF7ED: _get_all_main_cb (nm-supplicant-interface.c:2972)
==30980==    by 0x262057: _nm_dbus_connection_call_default_cb (nm-dbus-aux.c:70)
==30980==    by 0x48DB6A3: g_task_return_now (gtask.c:1215)
==30980==    by 0x48DBF43: g_task_return.part.3 (gtask.c:1285)
==30980==    by 0x4918885: g_dbus_connection_call_done (gdbusconnection.c:5765)
==30980==    by 0x48DB6A3: g_task_return_now (gtask.c:1215)
==30980==    by 0x48DB6D7: complete_in_idle_cb (gtask.c:1229)
==30980==    by 0x4A20981: g_main_dispatch (gmain.c:3325)
==30980==    by 0x4A20981: g_main_context_dispatch (gmain.c:4016)
==30980==    by 0x4A20BEF: g_main_context_iterate.isra.23 (gmain.c:4092)
==30980==    by 0x4A20E33: g_main_loop_run (gmain.c:4290)
==30980==    by 0x2C5C9: main (main.c:509)

Fixes: cd1e0193ab ('supplicant: add BIP interface capability')
2022-07-04 15:39:40 +03:00
Beniamino Galvani
fb4ac007ba wifi: wait supplicant to settle before renewing DHCP after roam
After roaming to a different AP, if we trigger a DHCP renewal while
the supplicant is still reauthenticating the REQUEST will be lost and
the client will fall back to sending a DISCOVER, potentially getting a
different address.

Wait that the supplicant state settles before renewing.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1024
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1263
2022-07-04 13:21:06 +02:00
Thomas Haller
5245fc6c75
platform: rename nmp_lookup_init_object() to nmp_lookup_init_object_by_ifindex()
In the past, nmp_lookup_init_object() could both lookup all object for a
certain ifindex, and lookup all objects of a type. That fallback path
already leads to an assertion failure fora while now, so nobody should
be using this function to lookup all objects of a certain type (for
what, we have nmp_lookup_init_obj_type()).

Now, remove the fallback path, and rename the function to what it really
does.
2022-06-30 14:08:41 +02:00
Thomas Haller
e6a33c04eb
all: make "ipv6.addr-gen-mode" configurable by global default
It can be useful to choose a different "ipv6.addr-gen-mode". And it can be
useful to override the default for a set of profiles.

For example, in cloud or in a data center, stable-privacy might not be
the best choice. Add a mechanism to override the default via global defaults
in NetworkManager.conf:

  # /etc/NetworkManager/conf.d/90-ipv6-addr-gen-mode-override.conf
  [connection-90-ipv6-addr-gen-mode-override]
  match-device=type:ethernet
  ipv6.addr-gen-mode=0

"ipv6.addr-gen-mode" is a special property, because its default depends on
the component that configures the profile.

- when read from disk (keyfile and ifcfg-rh), a missing addr-gen-mode
  key means to default to "eui64".
- when configured via D-Bus, a missing addr-gen-mode property means to
  default to "stable-privacy".
- libnm's ip6-config::addr-gen-mode property defaults to
  "stable-privacy".
- when some tool creates a profile, they either can explicitly
  set the mode, or they get the default of the underlying mechanisms
  above.

  - nm-initrd-generator explicitly sets "eui64" for profiles it creates.
  - nmcli doesn' explicitly set it, but inherits the default form
    libnm's ip6-config::addr-gen-mode.
  - when NM creates a auto-default-connection for ethernet ("Wired connection 1"),
    it inherits the default from libnm's ip6-config::addr-gen-mode.

Global connection defaults only take effect when the per-profile
value is set to a special default/unset value. To account for the
different cases above, we add two such special values: "default" and
"default-or-eui64". That's something we didn't do before, but it seams
useful and easy to understand.

Also, this neatly expresses the current behaviors we already have. E.g.
if you don't specify the "addr-gen-mode" in a keyfile, "default-or-eui64"
is a pretty clear thing.

Note that usually we cannot change default values, in particular not for
libnm's properties. That is because we don't serialize the default
values to D-Bus/keyfile, so if we change the default, we change
behavior. Here we change from "stable-privacy" to "default" and
from "eui64" to "default-or-eui64". That means, the user only experiences
a change in behavior, if they have a ".conf" file that overrides the default.

https://bugzilla.redhat.com/show_bug.cgi?id=1743161
https://bugzilla.redhat.com/show_bug.cgi?id=2082682

See-also: https://github.com/coreos/fedora-coreos-tracker/issues/907

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1213
2022-06-29 07:38:48 +02:00
Thomas Haller
7f766014c5
team: specify cli-type for teamdctl_connect() to select usock/dbus
teamdctl_connect() has a parameter cli_type. If unspecified, the
library will try usock, dbus (if enabled) and zmq (if enabled).

Trying to use the unix socket if we expect to use D-Bus can be bad. For
example, it might cause SELinux denials.

As we anyway require libteam to use D-Bus, if D-Bus is available,
explicitly select the cli type.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1255
2022-06-27 14:04:40 +02:00
Beniamino Galvani
bf9a2babb4 platform: fix routing rule test failure
Since kernel 5.18 there is a stricter validation [1][2] on the tos
field of routing rules, that must not include ECN bits.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f55fbb6afb8d701e3185e31e73f5ea9503a66744
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a410a0cf98854a698a519bfbeb604145da384c0e

Fixes the following failure:

  >>> src/core/platform/tests/test-route-linux
  >>> ...
  # NetworkManager-MESSAGE: <warn>  [1656321515.6604] platform-linux: do-add-rule: failure 22 (Invalid argument - Invalid dsfield (tos): ECN bits must be 0)
  >>> failing... errno=-22, rule=[routing-rule,0x13d6e80,1,+alive,+visible; [6] 0: from all tos 0xff fwmark 0x4/0 suppress_prefixlen -459579276 action-214 protocol 255]
  >>> existing rule: * [routing-rule,0x13d71e0,2,+alive,+visible; [6] 0: from all sport 65534 lookup 10009 suppress_prefixlen 0 none]
  >>> existing rule:   [routing-rule,0x13d7280,2,+alive,+visible; [4] 0: from all fwmark 0/0x9a7e9992 ipproto 255 suppress_prefixlen 0 realms 0x00000008 none protocol 71]
  >>> existing rule:   [routing-rule,0x13d7320,2,+alive,+visible; [6] 598928157: from all suppress_prefixlen 0 none]
  >>> existing rule:   [routing-rule,0x13d73c0,2,+alive,+visible; [4] 0: from 192.192.5.200/8 lookup 254 suppress_prefixlen 0 none protocol 9]
  >>> existing rule:   [routing-rule,0x13d7460,2,+alive,+visible; [4] 0: from all ipproto 3 suppress_prefixlen 0 realms 0xffffffff none protocol 5]
  >>> existing rule:   [routing-rule,0x13d7500,2,+alive,+visible; [4] 0: from all fwmark 0x1/0 lookup 254 suppress_prefixlen 0 action-124 protocol 4]
  >>> existing rule:   [routing-rule,0x13d75a0,2,+alive,+visible; [4] 0: from all suppress_prefixlen 0 action-109]
  0:      from all fwmark 0/0x9a7e9992 ipproto ipproto-255 realms 8 none proto 71
  0:      from 192.192.5.200/8 lookup main suppress_prefixlength 0 none proto ra
  0:      from all ipproto ggp realms 65535/65535 none proto 5
  0:      from all fwmark 0x1/0 lookup main suppress_prefixlength 0 124 proto static
  0:      from all 109
  0:      from all sport 65534 lookup 10009 suppress_prefixlength 0 none
  598928157:      from all none
  Bail out! nm:ERROR:../src/core/platform/tests/test-route.c:1787:test_rule: assertion failed (r == 0): (-22 == 0)

Fixes: 5ae2431b0f ('platform/tests: add tests for handling policy routing rules')
2022-06-27 13:31:08 +02:00
Beniamino Galvani
90e7afc2cd libnm,core: add support for {rto_min,quickack,advmss} route attributes 2022-06-27 11:38:43 +02:00
Beniamino Galvani
33f89f5978 ifcfg-rh: support reading boolean route attributes 2022-06-27 11:38:43 +02:00
Thomas Haller
ff694f42a7
dhcp/systemd: pass client instance to lease_to_ip6_config()
This makes it more consistent with nettools' lease_to_ip4_config().
The benefit of having a self pointer, is that it provides the necessary
context for logging. Without it, these functions cannot correctly log.

At this point, it's clearer to get the necessary data directly from the
DHCP client instance, instead of having the caller passing them on
(redundantly).
2022-06-27 10:53:40 +02:00
Thomas Haller
c7c57e8fb9
dhcp/nettools: log message about guessing subnet mask for IPv4 2022-06-27 10:53:40 +02:00
Thomas Haller
f0d132bda9
dhcp: add nm_dhcp_client_create_l3cd() helper 2022-06-27 10:53:39 +02:00
Thomas Haller
c06e6390a4
dhcp/nettools: normalize subnet netmask in nettools client
For an IPv4 subnet mask we expect that all the leading bits are set (no
"holes"). But _nm_utils_ip4_netmask_to_prefix() does not enforce that,
and tries to make the best of it.

In face of a netmask with holes, normalize the mask.
2022-06-27 10:53:39 +02:00
Thomas Haller
57dfa999f7
dhcp/nettools: accept missing "subnet mask" (option 1) in DHCP lease
Do the same as dhclient plugin in nm_dhcp_utils_ip4_config_from_options().

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1037
2022-06-27 10:53:36 +02:00
Thomas Haller
863b71a8fe
all: use internal _nm_utils_ip4_netmask_to_prefix()
We have two variants of the function: nm_utils_ip4_netmask_to_prefix()
and _nm_utils_ip4_netmask_to_prefix(). The former only exists because it
is public API in libnm. Internally, only use the latter.
2022-06-27 10:50:24 +02:00
Thomas Haller
ae6fe90851
ifcfg-rh: fix serializing lock route attributes
The lock attribute is a boolean, it can also be FALSE. We need
to handle that case, and don't add serialize "$NAME lock 0" for them.
2022-06-27 08:29:27 +02:00
Thomas Haller
efca0b8fa6
ifcfg-rh: in get_route_attributes_string() check prefix for "lock" names
In practice, the profile probably validates, so all the
attribute names are well-known. There is thus no attribute
name that has "lock-" in the middle of the string.

Still, fix it. We want to match only at the begin of the
name.
2022-06-27 08:29:26 +02:00
Thomas Haller
3628cf4805
core: log boot-id when NetworkManager starts
In a logfile, the "is starting" message is an interesting point
that indicates when NetworkManager is starting. Include
also the boot-id in the log, so that we can know whether this
was a restart from the same boot.

Also drop the "for the first time" part.

  <info>  [1656057181.8920] NetworkManager (version 1.39.7) is starting... (after a restart, asserts:10000, boot:486b1052-4bf8-48af-8f15-f3e85c3321f6)
2022-06-27 08:28:56 +02:00
Beniamino Galvani
f8885d0724 core: avoid stale entries in the DNS manager for non-virtual devices
_dev_l3_register_l3cds() schedules a commit, but if the device has
commit type NONE, that doesn't emit a l3cd-changed. Do it manually,
to ensure that entries are removed from the DNS manager.

Related: b86388bef3 ('core: avoid stale entries in the DNS manager')
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/995
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1268
2022-06-24 12:02:45 +02:00
Thomas Haller
6b0f67b736
connectivity: skip unexpected address families in system_resolver_resolve_cb()
This actually cannot happen, because GInetAddress is either
IPv4 or IPv6. Still.
2022-06-23 17:11:28 +02:00
Beniamino Galvani
a216739e09 device: stop ac6 grace time when ip6ll is ready in shared mode
The IPv6 shared mode starts IPv6 autoconf to send router
advertisements. IPv6 autoconf schedules a 30-second timeout waiting
for a link-local address to appear. When the link-local address
appears, we need to cancel the timeout.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1030
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1266
2022-06-22 18:05:55 +02:00
David Bauer
482885e6e9
supplicant/config: supplicant: prevent OWE downgrade
Prevent downgrade of Enhanced Open / OWE connection profiles
to unencrypted connections by forcing wpa_supplicant to use OWE.

Signed-off-by: David Bauer <mail@david-bauer.net>
2022-06-17 19:50:40 +02:00
Beniamino Galvani
2807f6a893 dhcp: nettools: save the lease after it gets accepted
Currently the lease gets saved only on the extended (renewal)
event. Also save it after it gets accepted.

Fixes: 52a0fe584c ('dhcp/nettools: better track currently granted lease')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1261
2022-06-17 18:11:30 +02:00
Beniamino Galvani
393bc628ff dhcp: wait DAD completion for DHCPv6 addresses
Wait that addresses received through DHCPv6 complete duplicate address
detection before reporting that the lease can be used.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://bugzilla.redhat.com/show_bug.cgi?id=2096386
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1258
2022-06-16 16:26:14 +02:00
Fernando Fernandez Mancera
3e4d084998 ifcfg-rh: fix wrong type for vint64 variable 2022-06-16 02:14:31 +02:00
Fernando Fernandez Mancera
87eb61c864 libnm: support wait-activation-delay property
The property wait-activation-delay will delay the activation of an
interface the specified amount of milliseconds. Please notice that it
could be delayed some milliseconds more due to other events in
NetworkManager.

This could be used in multiple scenarios where the user needs to define
an arbitrary delay e.g LACP bond configure where the LACP negotiation
takes a few seconds and traffic is not allowed, so they would like to
use nm-online and a setting configured with this new property to wait
some seconds. Therefore, when nm-online is finished, LACP bond should be
ready to receive traffic.

The delay will happen right before the device is ready to be activated.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1248

https://bugzilla.redhat.com/show_bug.cgi?id=2008337
2022-06-16 02:14:21 +02:00
Lubomir Rintel
1f61f3f239 device: release slaves when an external device is going managed
When we're deactivating an externally created device that has a master
because we're activating a connection on it, actually remove the device
from the master. Otherwise unpleasant things happen:

  active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
  device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
  device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
  device (dummy0): disconnecting for new activation request.
  device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)

Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.
This patch changes that.

  device (br0): bridge port dummy0 was detached
  device (dummy0): released from master device br0
  active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
  device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
  device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')

  platform: (dummy0) emit signal link-changed changed: 102: dummy0
      <NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
       addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746

Now the platform sent us a new link, the "master" property is still set.

  device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
  device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
  device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
  config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
  active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
  active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
  device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
  device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare

  l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
  device (br0): master: add one slave 0a458361f9fed8f5/dummy0

Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above. Sad.

  device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
  active-connection[0x55ed7ba78400]: set state activating (was unknown)
  manager: NetworkManager state is now CONNECTING
  active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
  device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
  device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
  platform: (dummy0) link: releasing 102 from master 'br0' (101)
  device (br0): detached bridge port dummy0

Now things go south. The stage1 cleans the device up, removing it from
the master and the device itself decides it should deactivate itself
because it lots its master regardless of the fact that it should not
have one and it's in fact an unwanted carryover from previous activation.
I believe this is also wrong.

  device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
  device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'
  device[0a458361f9fed8f5] (dummy0): queue-state[deactivating, reason:connection-assumed, id:298]: queue state change
  device[0a458361f9fed8f5] (dummy0): activation-stage: synchronously invoke activate_stage2_device_config
  device (dummy0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')

Now things are really weird. We synchronously go to config, effectively
overriding the queued deactivation. We've really messed up.
2022-06-14 14:21:53 +02:00
Lubomir Rintel
1fe8166fc9 device: only deactivate when the master we've enslaved to goes away
Sometimes weird things happen.

Let dummy0 be an externally created device that has a master. We decide
to activate a connection that has no master on it:

  active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
  device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
  device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
  device (dummy0): disconnecting for new activation request.
  device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)

Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.

  device (br0): bridge port dummy0 was detached
  device (dummy0): released from master device br0
  active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
  device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
  device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')

  platform: (dummy0) emit signal link-changed changed: 102: dummy0
      <NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
       addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746

Now the platform sent us a new link, the "master" property is still set.

  device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
  device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
  device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
  config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
  active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
  active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
  device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
  device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare

  l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
  device (br0): master: add one slave 0a458361f9fed8f5/dummy0

Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above.

  device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
  active-connection[0x55ed7ba78400]: set state activating (was unknown)
  manager: NetworkManager state is now CONNECTING
  active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
  device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
  device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
  device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
  platform: (dummy0) link: releasing 102 from master 'br0' (101)
  device (br0): detached bridge port dummy0

Now stage1 cleans the device up, removing it from the master.

  device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
  device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
  device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
  device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'

We decide to deal with this by enqueuing a deactivation. That is not
great -- we shouldn't even have had this master!

This patch takes the deactivation path only if we were willingly
enslaved to the master in question.
2022-06-14 14:21:53 +02:00
Lubomir Rintel
0fa8c5f94c device: stop checking the IP configuration state when cancelling activation
The @bond_mode_8023ad test has been seen failing, with a log like this:

  <debug> [...3.0484] device[...] (eth1): Activation: connection 'bond0.0' master deactivated
  <debug> [...3.0484] device[...] (eth1): add_pending_action (2): 'queued-state-change-deactivating'
  <debug> [...3.0484] device[...] (eth1): queue-state[deactivating, reason:new-activation, id:709]: queue state change

What happened is that eth1 has been activating. It was already enslaved
to a bond and was in an ip-config state when the bond was removed.
A change to "deactivating" state has been enqueued. But then this
happened:

  <trace> [...3.0942] device[...] (eth1): ip4: check-state: state done => done, is_failed=0, is_pending=0,
                      is_started=0 temp_na=0, may-fail-4=1, may-fail-6=1; disabled4; manualip4=done; ignore6 manualip6=done
  <trace> [...3.0942] device[...] (eth1): ip: check-state: (combined) state pending => done
  <debug> [...3.0943] device[...] (eth1): ip: set (combined) state done (was pending, reason: check-ip-state)
  <info>  [...3.0943] device (eth1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
  <debug> [...3.0943] device[...] (eth1): add_pending_action (3): 'in-state-change'
  <debug> [...3.0943] device[...] (eth1): queue-state[deactivating, reason:new-activation, id:709]: clear queued state change

The IP config succeeded and the queued "deactivating" change was
overriden by the IP4 check result, prompting a change to "ip-check".
With the master still missing. Not good.

Let's terminate the appempts to check the IP state when we cancel the
activation, so that it doesn't override the enqueued state change.

Fixes-test: @bond_mode_8023ad

https://bugzilla.redhat.com/show_bug.cgi?id=2080928
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1245
2022-06-14 14:21:53 +02:00
Beniamino Galvani
b41b11d613 ppp: don't remove addresses from interface while IPCP/IPV6CP is running
pppd also tries to configure addresses by itself through some
ioctls. If we remove between those calls an address that was added,
pppd fails and quits.

To avoid this race condition, don't remove addresses while IPCP and
IPV6CP are running. Once pppd sends an IP configuration, it has
finished configuring the interface and we can proceed normally.

https://bugzilla.redhat.com/show_bug.cgi?id=2085382
2022-06-14 12:26:21 +02:00
Beniamino Galvani
e8275d7139 core: add nm_l3cfg_block_obj_pruning()
Add a function prevent the removal of addresses and routes from the
interface for a given address family.
2022-06-14 12:26:21 +02:00
Beniamino Galvani
d6429d3ddb device: ensure DHCP is restarted every time the link goes up
Currently we call nm_device_update_dynamic_ip_setup() in
carrier_changed() every time the carrier goes up again and the device
is activating, to kick a restart of DHCP.

Since we process link events in a idle handler, it can happen that the
handler is called only once for different events; in particular
device_link_changed() might be called once for a link-down/link-up
sequence.

carrier_changed() is "level-triggered" - it cares only about the
current carrier state. nm_device_update_dynamic_ip_setup() should
instead be "edge-triggered" - invoked every time the link goes from
down to up. We have a mechanism for that in device_link_changed(), use
it.

Fixes-test: @ipv4_spurious_leftover_route

https://bugzilla.redhat.com/show_bug.cgi?id=2079406
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1250
2022-06-11 18:24:00 +02:00
Dominique Martinet
4d7b494eb3 ppp-manager: ip6: set interface mtu based on ppp config
impl_ppp_manager_set_ip4_config always has been setting interface mtu
based on ppp configuration: do the same for ip6 in case it matters.
2022-06-09 14:21:10 +00:00
Dominique Martinet
6991333bc0 ppp-manager: ip6: fix dns not being used
ipv6 DNS received on ppp interface were being ignored because their
priority was not set.
Fix this by using default priority in impl_ppp_manager_set_ip6_config(),
as was done for ip4_config in b2e559fab2 ("core: initialize l3cd
dns-priority for ppp and wwan")

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1022
2022-06-09 14:21:10 +00:00
Thomas Haller
431d139d15
dispatcher: log duration of dispatcher call
Yes, we anyway log the timestamps for every log message. So one could
always calculate the offset. However, when you read a logfile, it can be
cumbersome to stop looking at where you currently are to find the
start/end of a call. For convenience, log the duration explicitly.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1251
2022-06-09 13:23:35 +02:00
Beniamino Galvani
f69a1cc874 device: fix memory leak
l3cd instances must be removed from the old l3cfg before calling
_cleanup_ip_pre(). Otherwise, _cleanup_ip_pre() unregisters them from
the device, and later _dev_l3_register_l3cds(self, l3cfg_old, FALSE,
FALSE) does nothing because the device doesn't have any l3cd.

Previously the l3cds would linger in the l3cfg, keeping a reference to
it and causing a memory leak; the leak was not detected by valgrind
because the l3cfg was still referenced by the NMNetns.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
Fixes-test: @stable_mem_consumption2

https://bugzilla.redhat.com/show_bug.cgi?id=2083453

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1252
2022-06-09 09:37:24 +02:00
Thomas Haller
8e86cfb8ab
l3cfg: fix comparing "has-dns-priority" flag in nm_l3_config_data_cmp_full()
Fixes: cb29244552 ('core: support compare flags in nm_l3_config_data_cmp_full()')
2022-06-09 08:53:34 +02:00
Thomas Haller
fb2b35b068
ifcfg: set errno for svGetValueEnum() to detect unset values 2022-06-07 09:55:39 +02:00
Thomas Haller
fe7bdaa7e4
wifi: fix crash in NMDeviceWifi.check_connection_compatible() checking WEP capability
https://bugzilla.redhat.com/show_bug.cgi?id=2092782

Fixes: feee84aac4 ('wifi: mark WEP connections incompatible if supplicant lacks capability')
2022-06-02 13:25:10 +02:00
Thomas Haller
240ec7f891
dhcp: implement ACD (address collision detection) for DHCPv4
This was working for internal plugin in the past, but broken by l3cfg
rework with 1.36. Re-add it. Not it also works with dhclient. For other
plugins, it's not really working, because we can't decline.

Now NMDhcpClient does ACD (using NML3Cfg) and abstracts that from
the caller (NMDevice).

It is complicated. Because there is state involved, meaning, we need
to remember the current state for ACD and react on and handle a
multitude of events. Getting this right, is non-trivial.

What we want is that if ACD fails, we decline the lease (and don't use
it).

https://bugzilla.redhat.com/show_bug.cgi?id=1713380
2022-06-01 10:37:44 +02:00
Thomas Haller
156d84217c
dhcp/dhclient: implement accept/decline (ACD) for dhclient plugin
dhclient itself doesn't do ACD. However, it expects the dhclient-script
to exit with non-zero status, which causes dhclient to send a DECLINE.

`man dhclient-script`:

  BOUND:
     Before actually configuring the address, dhclient-script should
     somehow ARP for it and exit with a nonzero status if it receives a
     reply. In this case, the client will send a DHCPDECLINE  message  to
     the server and acquire a different address.   This may also be done in
     the RENEW, REBIND, or REBOOT states, but is not required, and indeed may
     not be desirable.

See also Fedora's dhclient-script ([1]).

https://gitlab.isc.org/isc-projects/dhcp/-/issues/67#note_97226
33226f2d76/client/dhclient.c (L1652)

[1] a8f6fd046f/f/dhclient-script (_878)

https://bugzilla.redhat.com/show_bug.cgi?id=1713380
2022-05-31 18:32:36 +02:00