If the slave is 'external' we should never touch it, in particular we
should not release the link from its master; we only have to remove it
from master's list.
https://bugzilla.redhat.com/show_bug.cgi?id=1442361
Previously, if a master device had internal state 'managed', we would
promote the slave to 'managed' as well. However,
- if the slave is 'external', it should stay as is because we don't
want to start managing it
- if the slave is 'assumed', it will become managed when the
activation succeeds, so it's not necessary to do it here
Fixes: 850c977953
teamd uses a PID file to guarantee a single instance is running for
each device. If we spawn a new teamd process without waiting for the
termination of the existing one, the new process can fail:
<debug> [1486191713.2530] kill child process 'teamd' (2676): wait for process to terminate after sending SIGTERM (15) (send SIGKILL in 2000 milliseconds)...
...
<debug> [1486191713.2539] device[0x7f737f5d7c40] (team1): running: /usr/bin/teamd -o -n -U -D -N -t team1 -c {"runner": {"name": "activebackup"}} -gg
Using team device "team1".
Using PID file "/var/run/teamd/team1.pid"
This program is not intended to be run as root.
Daemon already running on PID 2676.
Failed: File exists
To avoid this, keep track that a kill is in progress and postpone the
start of teamd.
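As a rough illustration of the postponed start (Python rather than the
daemon's C; TeamdLauncher and its methods are hypothetical names, not the
actual implementation):

```python
class TeamdLauncher:
    """Toy model: never spawn teamd while a kill of the old one is pending."""

    def __init__(self):
        self.kill_pending = False
        self.start_postponed = False
        self.spawned = 0

    def kill(self):
        # SIGTERM was sent; the old process still holds the PID file.
        self.kill_pending = True

    def request_start(self):
        if self.kill_pending:
            # Postpone: the old instance would make the new one fail.
            self.start_postponed = True
            return False
        self.spawned += 1
        return True

    def on_kill_done(self):
        # Called once the old process has terminated.
        self.kill_pending = False
        if self.start_postponed:
            self.start_postponed = False
            self.request_start()
```

The start request is only remembered, not dropped, so teamd is spawned as
soon as the old process is gone.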
https://bugzilla.redhat.com/show_bug.cgi?id=1415641
Change the output of nm_platform_error_to_string() to print the numeric value.
Also, accept a string buffer instead of using an alloca() allocated buffer.
There is still a macro to provide the previous functionality, but it
was ill-suited to call from inside a loop.
Add a utility function for resetting addresses/routes of NMIP6Config
from NMNDisc data. For one, this de-duplicates code in device and
nm-iface-helper.
Also, we no longer first reset (delete) all addresses and add them anew.
Instead, we first mark all entries as dirty for deletion, merge (append)
the new entries, and delete the remaining dirty entries. This saves
extra work in the expected case where NMIP6Config already contains
several of the new entries.
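The mark-dirty/merge/delete sequence can be sketched like this (Python,
with plain lists standing in for NMIP6Config; sync_entries is a
hypothetical name):

```python
def sync_entries(current, new):
    """Update the list `current` in place so it matches `new`."""
    dirty = set(current)              # 1. mark all existing entries dirty
    for entry in new:                 # 2. merge (append) the new entries
        if entry in dirty:
            dirty.discard(entry)      #    already present: clear the mark
        elif entry not in current:
            current.append(entry)
    # 3. delete the entries that are still marked dirty
    current[:] = [e for e in current if e not in dirty]
    return current
```

Entries present both before and after are never removed and re-added.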
Previously, we would add exclusive routes via netlink message flags
NLM_F_CREATE | NLM_F_REPLACE for RTM_NEWROUTE. Similar to `ip route replace`.
Using that form of RTM_NEWROUTE message, we could only add a certain
route with a certain network/plen,metric triple once. That was already
hugely inconvenient, because
- when configuring routes, multiple (managed) interfaces may get
conflicting routes (multihoming). Only one of the routes can be actually
configured using `ip route replace`, so we need to track routes that are
currently shadowed.
- when configuring routes, we might replace externally configured
routes on unmanaged interfaces. We should not interfere with such
routes.
That was worked around by having NMRouteManager (and NMDefaultRouteManager).
NMRouteManager would keep a list of the routes which NetworkManager would like
to configure, even if momentarily being unable to do so due to conflicting routes.
This worked mostly well but was complicated. It involved bumping metrics to
avoid conflicts for device routes, as we might require them for gateway routes.
Drop that now. Instead, use the equivalent of `ip route append` to configure
routes. This allows NetworkManager to configure (almost) all routes that we care about.
Especially, it can configure all routes on a managed interface, without
replacing/interfering with routes on other interfaces. Hence, NMRouteManager
becomes obsolete.
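The difference between the two forms can be shown with a deliberately
simplified toy model (Python; ToyRouteTable is hypothetical and does not
reproduce the kernel's exact matching rules):

```python
class ToyRouteTable:
    def __init__(self):
        self.routes = []  # tuples of (network, plen, metric, ifindex)

    def replace(self, network, plen, metric, ifindex):
        # Like `ip route replace`: at most one route per
        # (network, plen, metric), whatever interface it was on.
        key = (network, plen, metric)
        self.routes = [r for r in self.routes if r[:3] != key]
        self.routes.append((network, plen, metric, ifindex))

    def append(self, network, plen, metric, ifindex):
        # Like `ip route append`: existing routes are left alone.
        self.routes.append((network, plen, metric, ifindex))
```

With replace(), the second interface's route evicts the first one; with
append(), both routes coexist and nothing needs to be tracked as shadowed.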
In practice it is a bit more complicated because:
- when adding an IPv4 address, kernel will automatically create a device route
for the subnet. We should avoid that by using the IFA_F_NOPREFIXROUTE flag for
IPv4 addresses (still to-do). But as kernel may not support that flag for IPv4
addresses yet (and we don't require such a kernel yet), we still need functionality
similar to nm_route_manager_ip4_route_register_device_route_purge_list().
This functionality is now handled via nm_platform_ip4_dev_route_blacklist_set().
- trying to configure an IPv6 route with a source address will be rejected
by kernel as long as the address is tentative (see related bug rh#1457196).
Preferably, NMDevice would keep the list of routes which should be configured,
while kernel would have the list of what actually is configured. There is a
feed-back loop where both affect each other (for example, when externally deleting
a route, NMDevice must forget about it too). Previously, NMRouteManager would have
the task of remembering all routes which we currently want to configure, but cannot
due to conflicting routes.
We get rid of that, because now we configure non-exclusive routes. We however still
will need to remember IPv6 routes with a source address, that currently cannot be
configured yet. Hence, we will need to keep track of routes that
currently cannot be configured, but later may be.
That is still not done yet, as NMRouteManager didn't handle this
correctly either.
Rename to nm_platform_ip_address_flush(), it's more consistent with naming
for other platform functions.
Also, pass an address family argument. Sometimes I feel an option makes it clearer
what the function does. Otherwise, from the name it's not clear which address
families are affected. As an API, it feels more correct to me.
We soon also get a nm_platform_ip_route_flush() function, which will
look similar.
Do a check on parent ifindex before calling "nm_device_supports_vlans";
otherwise, if the parent device is a software device and its ifindex
member has not been updated yet we will trigger the g_return_if_fail
statement in "nmp_cache_lookup_entry_link".
This has been observed in NetworkManager CI test suite, on NetworkManager
boot, during the creation of a vlan on top of a bond interface.
CI test: vlan_update_mac_from_bond
[...]
<info> [1503323670.0229] manager: (bond0): new Bond device (/org/freedesktop/NetworkManager/Devices/23)
<debug> [1503323670.0231] device[0x555555c3e320] (vlan10): constructed (NMDeviceVlan)
<debug> [1503323670.0231] manager: (vlan-vlan10) create virtual device vlan10
<debug> [1503323670.0231] device[0x555555c3e320] (vlan10): unmanaged: flags set to [platform-init,!sleeping=0x10/0x11/unmanaged/unrealized], set-managed [sleeping=0x1])
<trace> [1503323670.0235] exported-object[0x555555c3e320]: export: "/org/freedesktop/NetworkManager/Devices/24"
<trace> [1503323670.0235] properties-changed[0x555555c3e320]: ignoring notification for prop g-object-path on type NMDeviceVlan
<trace> [1503323670.0236] properties-changed[0x555555c3e320]: ignoring notification for prop path on type NMDeviceVlan
<info> [1503323670.0237] manager: (vlan10): new VLAN device (/org/freedesktop/NetworkManager/Devices/24)
<debug> [1503323670.0239] device[0x555555c3e320] (vlan10): create (is nm-owned)
Program received signal SIGTRAP, Trace/breakpoint trap.
g_logv (log_domain=0x5555557c39a9 "NetworkManager", log_level=
G_LOG_LEVEL_CRITICAL, format=<optimized out>,
args=args@entry=0x7fffffffdef0) at gmessages.c:1086
1086 g_private_set (&g_log_depth, GUINT_TO_POINTER (depth));
(gdb) bt
#0 0x00007ffff5ce3643 in g_logv (log_domain=0x5555557c39a9 "NetworkManager", log_level=
G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fffffffdef0) at gmessages.c:1086
#1 0x00007ffff5ce37bf in g_log (log_domain=log_domain@entry=0x5555557c39a9 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, format=format@entry=0x7ffff5d51190 "%s: assertion '%s' failed") at gmessages.c:1119
#2 0x00007ffff5ce37f9 in g_return_if_fail_warning (log_domain=log_domain@entry=0x5555557c39a9 "NetworkManager", pretty_function=pretty_function@entry=0x5555557b2a20 <__func__.32407> "nmp_cache_lookup_entry_link", expression=expression@entry=0x5555557b1037 "ifindex > 0") at gmessages.c:1128
#3 0x000055555566688a in nmp_cache_lookup_entry_link (cache=0x555555a780f0, ifindex=<optimized out>) at src/platform/nmp-object.c:1449
#4 0x00005555556668f9 in nmp_cache_lookup_link (cache=<optimized out>, ifindex=ifindex@entry=0) at src/platform/nmp-object.c:1464
#5 0x00005555556515e9 in nm_platform_link_get_obj (self=self@entry=0x555555a88880 [NMLinuxPlatform], ifindex=ifindex@entry=0, visible_only=visible_only@entry=1) at src/platform/nm-platform.c:618
#6 0x0000555555633e91 in link_supports_vlans (platform=0x555555a88880 [NMLinuxPlatform], ifindex=0) at src/platform/nm-linux-platform.c:4482
#7 0x00005555556d6d41 in create_and_realize (device=0x555555c3e320 [NMDeviceVlan], connection=0x7fffdc007890, parent=0x555555c33560 [NMDeviceBond], out_plink=0x7fffffffe1f8, error=0x7fffffffe358) at src/devices/nm-device-vlan.c:239
#8 0x00005555556b934c in nm_device_create_and_realize (self=self@entry=0x555555c3e320 [NMDeviceVlan], connection=connection@entry=0x7fffdc007890, parent=0x555555c33560 [NMDeviceBond], error=error@entry=0x7fffffffe358)
at src/devices/nm-device.c:2946
#9 0x00005555555b84c7 in connection_changed (connection=0x7fffdc007890, self=0x555555ab1070 [NMManager]) at src/nm-manager.c:1381
#10 0x00005555555b84c7 in connection_changed (self=0x555555ab1070 [NMManager], connection=0x7fffdc007890) at src/nm-manager.c:1431
#11 0x00005555555b9130 in retry_connections_for_parent_device (self=self@entry=0x555555ab1070 [NMManager], device=device@entry=0x555555c33560 [NMDeviceBond])
at src/nm-manager.c:1416
#12 0x00005555555b95c7 in add_device (self=self@entry=0x555555ab1070 [NMManager], device=device@entry=0x555555c33560 [NMDeviceBond], error=error@entry=0x7fffffffe598) at src/nm-manager.c:2238
#13 0x00005555555b83e1 in connection_changed (connection=0x7fffdc007b30, self=0x555555ab1070 [NMManager]) at src/nm-manager.c:1352
#14 0x00005555555b83e1 in connection_changed (self=0x555555ab1070 [NMManager], connection=0x7fffdc007b30) at src/nm-manager.c:1431
#15 0x00005555555be25b in nm_manager_start (self=0x555555ab1070 [NMManager], error=error@entry=0x7fffffffe720) at src/nm-manager.c:5202
#16 0x0000555555586b13 in main (argc=1, argv=0x7fffffffe888) at src/main.c:413
nmp_lookup_init_route_visible() was originally named this way, to only return routes
that are nmp_object_is_visible(). However, all routes are visible (as long as they are
nmp_object_is_alive()). Hence, this is a historic misnomer.
Also, passing @only_default FALSE is identical to the
nmp_lookup_init_addrroute() lookup.
So, rename the function to indicate it is a lookup for default routes
only. Also, get rid of the unsupported ifindex argument for which there
is no index.
Only the D-Bus bits use it, and we wouldn't pass a GVariant array around
in internal code anyway. Also validate the scan options earlier rather
than waiting for the supplicant to tell us they are invalid.
Enable background scanning for most WiFi connections except for
shared/AP and BSSID-locked ones. Make the non-WPA-Enterprise
interval very, very long to effectively disable periodic scanning
while connected.
Related: https://bugzilla.gnome.org/show_bug.cgi?id=766482
Change it to return TRUE when scanning is prohibited so that we
don't have to use g_signal_emitv() and its special handling of
return values. Make the return value only change when we don't
want the default behavior (which would be to allow the scan).
Also add a parameter to the signal indicating whether the scan is
user/dbus-requested or whether it's an internal periodic scan.
The bluetooth device *never* manages NAP connection. Hence, checking for
nm_bt_vtable_network_server in "nm-bluez-manager.c" is wrong.
Especially, because nm_bt_vtable_network_server is only initialized
much later, so during initial start, the bluetooth factory would wrongly
claim to support it. This leads to a crash when having a NAP connection.
Also, the bridge factory requires the bluetooth plugin. It should only
claim to support NAP when the bluetooth plugin is present. That
way, we get a proper "missing plugin" error message, instead of failing
later during activation.
It seems to me, distributing the logic to various match_connection()
functions makes it more complicated, because the implementation is
spread out and interacts in complicated ways. Anyway.
Fixes: 8665cdfeff
Add code to NMPppDevice to activate new-style PPPoE connections. This
is a bit tricky because we can't create the link as usual in
create_and_realize(). Instead, we create a device without ifindex and
start pppd in stage2; when pppd reports a new configuration, we rename
the platform link to the correct name and set the ifindex into the
device.
This mechanism is inherently racy, but there is no way to tell pppd to
create an arbitrary interface name.
Make it possible to register different factories for the same setting
type, and add a match_connection() method to let each factory decide
if it's capable of handling a connection.
This will be used to decide whether a PPPoE connection must be handled
through the legacy Ethernet factory or through the new PPP factory.
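A minimal sketch of that dispatch (Python; the Factory class,
find_factory() and the "new_style" key are assumptions for illustration,
not the real factory API):

```python
class Factory:
    def __init__(self, name, setting_types, match_connection=None):
        self.name = name
        self.setting_types = set(setting_types)
        # By default a factory accepts every connection of its setting types.
        self.match_connection = match_connection or (lambda connection: True)


def find_factory(factories, connection):
    """Return the first registered factory that claims the connection."""
    for factory in factories:
        if (connection["type"] in factory.setting_types
                and factory.match_connection(connection)):
            return factory
    return None


# Two factories registered for the same "pppoe" setting type.
# The "new_style" key is a stand-in for however the real code tells
# legacy and new-style PPPoE connections apart.
ethernet = Factory("ethernet", ["802-3-ethernet", "pppoe"],
                   lambda c: not c.get("new_style", False))
ppp = Factory("ppp", ["pppoe"], lambda c: c.get("new_style", False))
factories = [ethernet, ppp]
```

Each factory decides for itself, so a setting type no longer maps to
exactly one factory.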
The new device type represents a PPP interface, and will implement the
activation of new-style PPPoE connections, i.e. the ones that don't
claim the parent device.
Software devices don't have a permanent hardware address and thus it
doesn't make sense to enforce the 'fake' (generated) permanent one
when cloned-mac-address=permanent. Also, setting the fake permanent
address on bond devices, prevents them from inheriting the first slave
hardware address, so let's just skip the setting of MAC when
cloned-mac-address=permanent and there is no real permanent address.
https://bugzilla.redhat.com/show_bug.cgi?id=1472965
The settings "bridge.mac-address" and "ethernet.cloned-mac-address" have an
overlapping meaning. If the former is unset, fallback to the latter.
Effectively, "bridge.mac-address" is deprecated in favor of
"ethernet.cloned-mac-address", which is more powerful as it supports
various modes like "stable". However, if a connection specifies
"bridge.mac-address", it is used when creating the bridge interface,
while "ethernet.cloned-mac-address" is used shortly after, during
activation.
Reasons:
- it adds an O(1) lookup index for accessing NMIPxConfig's addresses.
Hence, operations like merge/intersect have now runtime O(n) instead
of O(n^2).
Arguably, we expect low numbers of addresses in general. For low
numbers, the O(n^2) doesn't matter and quite likely in those cases
the previous implementation was just fine -- maybe even faster.
But the simple case works fine either way. It's important to scale
well in the exceptional case.
- the tracked objects can be shared between the various NMIP4Config,
NMIP6Config instances with NMPlatform and everybody else.
- the NMPObject can be treated generically, meaning it enables code to
handle both IPv4 and IPv6, or addresses and routes. See for example
_nm_ip_config_add_obj().
- I want core to evolve to somewhere where we don't keep copies of
NMPlatformIP4Address, et al. instances. Instead they shall all be
shared. I hope this will reduce memory consumption (although tracking a
reference consumes some memory too). Also, it shortcuts nmp_object_equal()
when comparing the same object. Calling nmp_object_equal() on the
identical objects would be a common case after the hash function
pre-evaluates equality.
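To illustrate the first point, merge and intersect become linear once
membership tests are O(1). This is only a sketch in Python, with a set
standing in for the lookup index and hypothetical function names:

```python
def merge(dst, src):
    """Append the entries of `src` that `dst` lacks; O(len(dst) + len(src))."""
    index = set(dst)                 # O(1) lookups instead of scanning dst
    for addr in src:
        if addr not in index:
            index.add(addr)
            dst.append(addr)
    return dst


def intersect(dst, src):
    """Keep only the entries of `dst` that also occur in `src`."""
    keep = set(src)
    dst[:] = [addr for addr in dst if addr in keep]
    return dst
```

Without the index, every "is it already there?" check is a scan of the
whole list, which is where the O(n^2) came from.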
Maintaining an index is expensive. Not so much in terms of runtime, but
in terms of memory.
Drop some indexes, and require the caller to use a more broad index (and
filter out unwanted elements).
Dropped:
- can no longer lookup visible default-routes by ifindex.
If you care about default-routes, lookup all and search for the
desired ifindex. The overall number of default-routes is expected
to be small.
We drop NMP_CACHE_ID_TYPE_ROUTES_VISIBLE_BY_IFINDEX_WITH_DEFAULT
entirely.
- no longer have a separate index for non-default routes. We
expect that most routes are non-default routes. So, don't
have an index without default-routes, instead let the caller
just lookup all routes, and reject default-routes itself.
We keep NMP_CACHE_ID_TYPE_ROUTES_VISIBLE_BY_DEFAULT, but it
now no longer tracks non-default routes.
This drops 1 out of 6 route indexes, and modifies another one, so
that we expect that there are almost no entries tracked by it.
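From the caller's side, using the broader index and filtering looks
roughly like this (Python sketch; routes are plain dicts with hypothetical
"ifindex" and "plen" keys, and a default route is one with prefix length 0):

```python
def default_routes_for_ifindex(default_routes, ifindex):
    # No per-ifindex index any more: scan the (small) set of default routes.
    return [r for r in default_routes if r["ifindex"] == ifindex]


def non_default_routes(all_routes):
    # No separate index for non-default routes: look up all routes and
    # reject the default ones.
    return [r for r in all_routes if r["plen"] != 0]
```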
NMIP4Config, NMIP6Config, and NMPlatform shall share one
NMDedupMultiIndex instance.
For that, pass an NMDedupMultiIndex instance to NMPlatform and NMNetns.
NMNetns then passes it on to NMDevice, NMDhcpClient, NMIP4Config and NMIP6Config.
So currently NMNetns is the access point to the shared NMDedupMultiIndex
instance, and it gets it from its NMPlatform instance.
The NMDedupMultiIndex instance is really a singleton, we don't want
multiple instances of it. However, for testing, instead of adding a
singleton instance, pass the instance explicitly around.
The default value for miimon, when missing in the setting, is 0 if
arp_interval is != 0, and 100 otherwise. So, when generating a
connection, let's ignore miimon=0 (which means that miimon is
disabled) and accept any other value. Adding miimon=100 does not cause
any harm to the connection assumption.
While at it, slightly improve the code: ignore_if_zero() is not useful
for 'updelay','downdelay','arp_interval' because zero is their default
value, so introduce a new function that checks if the value is the
default (and specially handles 'miimon').
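The check described above could look like this (Python sketch;
is_default_value() and the table are hypothetical, and option values are
kept as strings):

```python
# Options whose default is zero, as stated above.
ZERO_DEFAULTS = {"updelay": "0", "downdelay": "0", "arp_interval": "0"}


def is_default_value(option, value):
    """Return True if the option can be left out of the generated connection."""
    if option == "miimon":
        # miimon=0 means monitoring is disabled: ignore it. Any other
        # value, including 100, is accepted into the connection.
        return value == "0"
    return ZERO_DEFAULTS.get(option) == value
```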
Reported-by: Taketo Kabe <rkabe@vega.pgw.jp>
https://bugzilla.redhat.com/show_bug.cgi?id=1463077
For master devices, instead of ignoring loss of carrier entirely,
handle it.
First of all, master devices are now by default ignore-carrier=yes.
That means, without explicit user configuration in NetworkManager.conf,
the previous behavior in carrier_changed() does not change.
If the user decides to configure the master device like
[device-with-carrier]
match-device=type:bond,type:bridge,type:team
ignore-carrier=no
then, master device will disconnect on carrier loss like
regular devices.
https://github.com/NetworkManager/NetworkManager/pull/18
Co-authored-by: Thomas Haller <thaller@redhat.com>
Commit 348452f1e0 (device: renew DHCP
lease for active "ignore-carrier" devices on carrier-on (bgo #743368))
added this behavior for non-master devices.
The same reasoning applies here too.
https://github.com/NetworkManager/NetworkManager/pull/18
Based-on-patch-by: Nikolay Martynov <mar.kolya@gmail.com>
Previously, master device types like bridge, bond, and team
would overwrite is_available() and check_connection_available()
and always return TRUE.
The device already expresses via nm_device_is_master() that it
is of a master kind. Refactor the code, so, instead of having these
device types overwrite is_available() and check_connection_available(),
let the parent's implementation react to nm_device_is_master().
There is no change in behavior at all. Instead, the knowledge how to
treat a master device moves from the device implementation to the
parent class.
Currently, device types like Bond hack around ignore-carrier
setting, as they always want to ignore-carrier.
Prepare so that also for such master types, we rely on and honor the
ignore-carrier setting better. In the next commit, bond, bridge and
team devices will get ignore-carrier turned on by default.
For externally managed interfaces, we create an in-memory connection
and keep the device with sys-iface-state=external.
When the user actively modifies the connection, we persist it to
storage. But we also must take over managing the device.
One problem is that nm_device_reapply() errors out if the device
is still activating. It's unclear how to reapply the connection
while the device is in the process of activation. So, if the user
modifies the created connection very quickly, reapplying the settings
will fail.
https://bugzilla.redhat.com/show_bug.cgi?id=1462223
Since commit 2b51d3967 "device: merge branch 'th/device-mtu-bgo777251'",
we always set the MTU for certain device types during activation. Even
if the MTU is neither specified via the connection nor other means, like
DHCP.
Revert that change. On activation, if nothing explicitly configures the
MTU, leave it unchanged. This is like what we do with ethernet's
cloned-mac-address, which has a default value "preserve".
So, as last resort the default value for MTU is now 0 (don't change),
instead of depending on the device type.
Note that you also can override the default value in global
configuration via NetworkManager.conf.
This behavior makes sense, because whenever NM actively resets the MTU,
it remembers the previous value and restores it when deactivating
the connection. That wasn't implemented before 2b51d3967, and the
MTU would depend on which connection was previously active. That
is no longer an issue as the MTU gets reset when deactivating.
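The remember-and-restore behavior amounts to the following (Python
sketch; ToyDevice is a hypothetical stand-in for the device, with 0 as the
"don't change" value described above):

```python
class ToyDevice:
    def __init__(self, mtu):
        self.mtu = mtu
        self.saved_mtu = None     # MTU from before NM touched it

    def activate(self, configured_mtu=0):
        if configured_mtu == 0:
            return                # nothing configures the MTU: leave it
        if self.saved_mtu is None:
            self.saved_mtu = self.mtu
        self.mtu = configured_mtu

    def deactivate(self):
        if self.saved_mtu is not None:
            self.mtu = self.saved_mtu
            self.saved_mtu = None
```

Because the previous value is restored on deactivation, the MTU seen by a
connection that does not set one no longer depends on what was active
before.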
https://bugzilla.redhat.com/show_bug.cgi?id=1460760
It's useless (and in some cases also harmful) to commit the
configuration to update the default route metric when the device has
no default route. Also, don't commit configuration for externally
activated devices.
https://bugzilla.redhat.com/show_bug.cgi?id=1459604
Don't log in a function that basically just inspects state, without
mutating it. Instead, pass the reason why a connection could not be
generated to the caller so that we have one sensible log message.
The device's RECHECK_ASSUME signal has only NMManager as subscriber
and it immediately calls recheck_assume_connection().
With the previous commit, recheck_assume_connection() always logs
a debug message, so we don't need this duplicate message anymore.
Originally 850c977 "device: track system interface state in NMDevice",
intended that a connection can only be assumed initially when seeing
a device for the first time. Assuming a connection later was to be
prevented by setting device's sys-iface-state to MANAGED.
That changed too much in behavior, because we used to assume external
connections also when they are activated later on. So this was attempted
to get fixed by
- acf1067 nm-manager: try assuming connections on managed devices
- b6b7d90 manager: avoid generating in memory connections during startup for managed devices
It's probably just wrong to prevent assuming connections based on the
sys-iface-state. So drop the check for sys-iface-state from
recheck_assume_connection(). Now, we can assume anytime on managed,
disconnected interfaces, like previously.
Btw, note that priv->startup is totally wrong to check there, because
priv->startup has the sole purpose of tracking startup-complete property.
Startup, as far as NMManager is concerned, is platform_query_devices().
However, the problem is that we only assume connections (contrary to
doing external activation) when we have a connection-uuid from the state
file or with guess-assume during startup.
When assuming a master device, it can fail with
(nm-bond): ignoring generated connection (IPv6LL-only and not in master-slave relationship)
thus, for internal reasons the device cannot be assumed yet.
Fix that by attaching the assume-state to the device, so that on multiple
recheck_assume_connection() calls we still try to assume. Whenever we try
to assume the connection and it fails due to external reasons (like, the connection
no longer matching), we clear the assume state, so that we only try as
long as there are internal reasons why assuming fails.
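Sketched in Python (hypothetical names; the "ok"/"internal"/"external"
result strings are an assumption for illustration, as is clearing the
state after a successful assume):

```python
class ToyDevice:
    def __init__(self, assume_uuid):
        # The assume-state lives on the device and survives rechecks.
        self.assume_uuid = assume_uuid


def recheck_assume(device, try_assume):
    """Return True if the connection was assumed on this call."""
    if device.assume_uuid is None:
        return False
    result = try_assume(device.assume_uuid)
    if result == "internal":
        # E.g. not yet in a master-slave relationship: keep the state
        # and try again on the next recheck.
        return False
    # Success or external failure: stop trying.
    device.assume_uuid = None
    return result == "ok"
```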
https://bugzilla.redhat.com/show_bug.cgi?id=1452062
The state file should only be read initially when NM starts, that is:
during NMManager's platform_query_devices().
At all later points, for example when a software device gets destroyed
and re-realized, the state file is clearly no longer relevant.
Hence, pass the set-nm-owned flag from NMManager to realize_start_setup().
This is very much the same as with the NM_UNMANAGED_FLAG_USER_EXPLICT flag,
which we also read from the state-file.
curl must bind to the interface that has IP configuration, not the
underlying device. Without this commit, connectivity check fails on
certain connection types (PPPoE, WWAN).
Fixes: 9d43869e47
Don't crash if the bond mode can't be read from sysfs - for example
when the interface disappears. The generated connection will be bogus,
but at that point it doesn't matter because the in-memory connection
will be destroyed.
Fixes: 056a973a4f
https://bugzilla.redhat.com/show_bug.cgi?id=1459580
After a daemon restart, any software device is considered !nm-owned,
even if it was created by NM. Therefore, a device stays around even if
the connection which created it gets deactivated or deleted.
Fix this by remembering the previous nm-owned state in the device
state file.
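A sketch of persisting the flag across a restart (Python; the "device"
section and "nm-owned" key are assumptions about the state file layout,
and the helper names are hypothetical):

```python
import configparser
import io


def write_device_state(nm_owned):
    """Serialize the device state as keyfile-like text."""
    parser = configparser.ConfigParser()
    parser["device"] = {"nm-owned": "1" if nm_owned else "0"}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def read_nm_owned(text):
    """Read the flag back; a missing entry means the device is not nm-owned."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getboolean("device", "nm-owned", fallback=False)
```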
https://bugzilla.redhat.com/show_bug.cgi?id=1376199