It seems the assert there is too strict. I don't really understand why
it fails, but I also don't see why the assert is supposed to hold.
Just return in case the device is unmanagable at this point.
The activation shall fail later.
Traceback from a test build of commit a7aca2ab08:
#0 0x00007fdb28ffb643 in g_logv (log_domain=0x7fdb2b584cc9 "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fff10630200) at gmessages.c:1086
#1 0x00007fdb28ffb7bf in g_log (log_domain=log_domain@entry=0x7fdb2b584cc9 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, format=format@entry=0x7fdb29069190 "%s: assertion '%s' failed") at gmessages.c:1119
#2 0x00007fdb28ffb7f9 in g_return_if_fail_warning (log_domain=log_domain@entry=0x7fdb2b584cc9 "NetworkManager", pretty_function=pretty_function@entry=0x7fdb2b54fee0 <__func__.38922> "unmanaged_to_disconnected", expression=expression@entry=0x7fdb2b54d450 "nm_device_get_managed (device, FALSE)") at gmessages.c:1128
#3 0x00007fdb2b36e05b in unmanaged_to_disconnected (device=device@entry=0x7fdb2d2384f0 [NMDeviceVlan]) at src/nm-manager.c:3201
#4 0x00007fdb2b37eb3a in _internal_activate_generic (error=0x7fff106303d0, active=0x7fdb2d1d4550 [NMActRequest], self=0x0) at src/nm-manager.c:3430
#5 0x00007fdb2b37eb3a in _internal_activate_generic (self=self@entry=0x7fdb2d02b090 [NMManager], active=active@entry=0x7fdb2d1d4550 [NMActRequest], error=error@entry=0x7fff10630450) at src/nm-manager.c:3458
#6 0x00007fdb2b37fe90 in _activation_auth_done (active=0x7fdb2d1d4550 [NMActRequest], success=1, error_desc=0x0, user_data1=0x7fdb2d02b090, user_data2=0x7fdb0800bec0) at src/nm-manager.c:3866
#7 0x00007fdb2b4cc9d7 in auth_done (chain=0x7fdb2d17de30, error=0x0, unused=<optimized out>, user_data=<optimized out>) at src/nm-active-connection.c:929
#8 0x00007fdb2b4d6884 in auth_chain_finish (user_data=0x7fdb2d17de30) at src/nm-auth-utils.c:92
#9 0x00007fdb28ff4d7a in g_main_context_dispatch (context=0x7fdb2cff2e00) at gmain.c:3152
#10 0x00007fdb28ff4d7a in g_main_context_dispatch (context=context@entry=0x7fdb2cff2e00) at gmain.c:3767
#11 0x00007fdb28ff50b8 in g_main_context_iterate (context=0x7fdb2cff2e00, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3838
#12 0x00007fdb28ff538a in g_main_loop_run (loop=0x7fdb2cff2ec0) at gmain.c:4032
#13 0x00007fdb2b349ed7 in main (argc=1, argv=0x7fff106307a8) at src/main.c:438
https://bugzilla.redhat.com/show_bug.cgi?id=1478911
Add code to NMPppDevice to activate new-style PPPoE connections. This
is a bit tricky because we can't create the link as usual in
create_and_realize(). Instead, we create a device without ifindex and
start pppd in stage2; when pppd reports a new configuration, we rename
the platform link to the correct name and set the ifindex into the
device.
This mechanism is inherently racy, but there is no way to tell pppd to
create an arbitrary interface name.
When a user forces up a connection on a device, mark earlier the
device as managed: this would allow proper clean-up on the device also
when it was previously unmanaged or assumed.
This would avoid skipping IPv6LL address generation when instead it was
needed.
Fixes: adbf383628https://bugzilla.redhat.com/show_bug.cgi?id=1452046
In _internal_activate_device(), we try to find an existing master AC
for the slave AC, and we create a new one in case of failure. The
master AC may already exist, but it may not be detected by
find_master() because it is undergoing authorization.
The result is that we auto-activate the master when there is already a
user activation in place, and the auto-activation will cancel the user
one. This is bad, as user-activation should always have precedence.
To fix this, introduce a last-minute check before activating internal
connections.
https://bugzilla.redhat.com/show_bug.cgi?id=1450219
Don't log in a function that basically just inspects state, without
mutating it. Instead, pass the reason why a connection could not be
generated to the caller so that we have one sensible log message.
Originally 850c977 "device: track system interface state in NMDevice",
intended that a connection can only be assumed initially when seeing
a device for the first time. Assuming a connection later was to be
prevented by setting device's sys-iface-state to MANAGED.
That changed too much in behavior, because we used to assume external
connections also when they are activated later on. So this was attempted
to get fixed by
- acf1067 nm-manager: try assuming connections on managed devices
- b6b7d90 manager: avoid generating in memory connections during startup for managed devices
It's probably just wrong to prevent assuming connections based on the
sys-iface-state. So drop the check for sys-iface-state from
recheck_assume_connection(). Now, we can assume anytime on managed,
disconnected interfaces, like previously.
Btw, note that priv->startup is totally wrong to check there, because
priv->startup has the sole purpose of tracking startup-complete property.
Startup, as far as NMManager is concerned, is platform_query_devices().
However, the problem is that we only assume connections (contrary to
doing external activation) when we have a connection-uuid from the state
file or with guess-assume during startup.
When assuming a master device, it can fail with
(nm-bond): ignoring generated connection (IPv6LL-only and not in master-slave relationship)
thus, for internal reason the device cannot be assumed yet.
Fix that by attatching the assume-state to the device, so that on multiple
recheck_assume_connection() calls we still try to assume. Whenever we try
to assume the connection and it fails due to external reasons (like, the connection
no longer matching), we clear the assume state, so that we only try as
long as there are internal reasons why assuming fails.
https://bugzilla.redhat.com/show_bug.cgi?id=1452062
The state file should only be read initially when NM starts, that is:
during NMManager's platform_query_devices().
At all later points, for example when a software device gets destroyed
and re-realized, the state file is clearly no longer relevant.
Hence, pass the set-nm-owned flag from NMManager to realize_start_setup().
This is very much the same as with the NM_UNMANAGED_FLAG_USER_EXPLICT flag,
which we also read from the state-file.
After a daemon restart, any software device is considered !nm-owned,
even if it was created by NM. Therefore, a device stays around even if
the connection which created it gets deactivated or deleted.
Fix this by remembering the previous nm-owned state in the device
state file.
https://bugzilla.redhat.com/show_bug.cgi?id=1376199
Since commit 74dac5f (nm-manager: try assuming connections on managed devices),
and commit f4226e7 (manager: avoid generating in memory connections
during startup for managed devices), recheck_assume_connection() also
assumes connections on devices that are currently not in sys-iface-state
"external".
That is correct, as also for fully managed devices (which are currently
in disconnected state), we want to assume external connections. However,
when doing that, we must reset the sys-iface-state to external.
https://bugzilla.redhat.com/show_bug.cgi?id=1457242
Otherwise a device which was set as unmanaged (updated to the REMOVED
internal sys-state) will never update its own sys-state if later set
back as managed.
Manage either when setting explictly the device to managed either when
just upping a connection on an unmanaged device.
Commits 39d0559d9a ("platform: sort links by name instead of
ifindex") and 529a0a1a7f ("manager: sort slaves to be autoconnected
by device name") changed the order of activation of slaves. Introduce
a system-wide configuration property to preserve the old behavior.
https://bugzilla.redhat.com/show_bug.cgi?id=1452585
A property preferably only emits a notify-changed signal when
the value actually changes and it caches the value (so that
between property-changed signals the value is guaranteed not to change).
NMSettings and NMManager both already cache the hostname, because
NMHostnameManager didn't guarantee this basic concept.
Implement it and rely on it from NMSettings and NMPolicy.
And remove the copy of the property from NMManager.
Move the call for nm_dispatcher_call_hostname() from NMHostnameManager
to NMManager. Note that NMPolicy also has a call to the dispatcher
when set-transient-hostname returns. This should be cleaned up later.
Hostname management is complicated. At least, how it is implemented currently.
For example, NMPolicy also sets the hostname (NMPolicy calls
nm_settings_set_transient_hostname() to have hostnamed set the hostname,
but then falls back to sethostname() in settings_set_hostname_cb()).
Also, NMManager tracks the hostname in NM_MANAGER_HOSTNAME too, and
NMPolicy listens to changes from there -- instead of changes from
NMSettings.
Eventually, NMHostnameManager should contain the hostname parts from NMSettings
and NMPolicy.
Commit #acf1067a allowed to assume connections on already managed
devices. Anyway, in complex scenario with layered connections, during
the startup of NetworkManager, this could interfere with the connection
assumption based on saved state.
So, avoid to re-assume connections on already managed devices during
startup.
Fixes: acf1067a45
(cherry picked from commit b6b7d909f7)
Commit 850c97795 ("device: track system interface state in NMDevice")
introduced interface states for devices and prevented checking if a
connection should be assumed on already managed devices.
This prevented to properly manage the event of an ip configuration added
externally to NM to a managed but not (yet) activated device.
Fixes: 850c977953
(cherry picked from commit acf1067a45)
NMDeviceGeneric:check_connection_compatible() doesn't check for a
matching interface name. It relies on the parent implementation to
do that.
The parent implementation calls nm_manager_get_connection_iface().
That fails for NM_SETTING_GENERIC_SETTING_NAME, because that one has
no factory. Maybe this imbalance of having no factory for the Generic device
is wrong, but usually factories only match a distinct set of device
types, while the generic factory would handle them all (as last resort).
Without this, activating a generic connection might activate the
wrong interface.
(cherry picked from commit 3876b10a47)
Since commit 2d1b85f (th/assume-vs-unmanaged-bgo746440), we clearly
distinguish between two modes when encountering devices with external
IP configuration:
a) external devices. For those devices we generate a volatile in-memory
connection and pretend it's active. However, the device must not be
touched by NetworkManager in any way.
b) assume, seamless take over. Mostly for restart of NetworkManager,
we activate a connection gracefully without going through an down-up
cycle. After the device reaches activated state, the device is
considered fully managed. For this only an existing, non volatile
connection can be used.
Before 'th/assume-vs-unmanaged-bgo746440', the behaviors were not
clearly separated.
Since then, we only choose to assume a connection (b) when the state
file indicates a matching connection. Now, extend this to also assume
connections when:
- during first-start (not after a restart) when there is no
state file yet.
- and, if we have an existing, non volatile, connection which
matches the device's configuration.
This patch lets NetworkManager assume connection also on first start.
That is for example useful when handing over network configuration from
initrd.
This only applies to existing, permanent, matching(!) connections, so it is a
good guess that the user wants NM to take over this interface. This brings us
closer to the previous behavior before 'th/assume-vs-unmanaged-bgo746440'.
https://bugzilla.redhat.com/show_bug.cgi?id=1439220
(cherry picked from commit 27b2477cb7)
nm_config_device_state_*() always access the file system directly,
they don't cache data in NMConfig. Hence, they don't use the
@self argument.
Maybe those functions don't belong to nm-config.h, anyway. For lack
of a better place they are there.
(cherry picked from commit 1940be410c)
Set the device state as removed when the link disappears, so that in
the call to unrealize() when the device is unmanaged we also perform a
cleanup of it and especially, we terminate any DHCP client instances
running on the device.
If we keep DHCP clients running, we can hit assertions later when we
start another instance on the same interface, because we kill the old
dhclient from the pidfile, and the g_child_watch_add() done by the
first client instance is not able to waitpid() it, complaining with:
GChildWatchSource: Exit status of a child process was requested but
ECHILD was received by waitpid(). Most likely the process is
ignoring SIGCHLD, or some other thread is invoking waitpid() with a
nonpositive first argument; either behavior can break applications
that use g_child_watch_add()/g_spawn_sync() either directly or
indirectly.
https://bugzilla.redhat.com/show_bug.cgi?id=1436602
(cherry picked from commit df537d2eac)
When a VPN connection can't be activated we have to unexport and
dispose it. Commit f2182fbf9b ("core: don't emit double
PropertiesChanged signal for new active connections") removed the call
to nm_exported_object_unexport() in case of failure because the active
connection already gets unreferenced on failure.
However, an exported object can't be disposed until it's explicitly
unexported because GDBus code keeps a reference to it. The result was
that the active connection was kept alive and exported, but without
explicit references to it. As soon as the connection was unexported,
it was also automatically disposed, causing issues like:
(src/nm-exported-object.c:1025):dispose: code should not be reached
#0 _g_log_abort () at /lib64/libglib-2.0.so.0
#1 g_logv () at /lib64/libglib-2.0.so.0
#2 g_log () at /lib64/libglib-2.0.so.0
#3 g_warn_message () at /lib64/libglib-2.0.so.0
#4 dispose (object=0xaaf110) at src/nm-exported-object.c:1025
#5 dispose (object=0xaaf110) at src/nm-active-connection.c:1246
#6 dispose (object=0xaaf110) at src/vpn/nm-vpn-connection.c:2642
#7 g_object_unref () at /lib64/libgobject-2.0.so.0
#8 registration_data_free () at /lib64/libgio-2.0.so.0
#9 g_hash_table_remove_internal () at /lib64/libglib-2.0.so.0
#10 g_dbus_object_manager_server_unexport_unlocked () at /lib64/libgio-2.0.so.0
#11 g_dbus_object_manager_server_unexport () at /lib64/libgio-2.0.so.0
#12 nm_bus_manager_unregister_object (self=0x9069e0, object=object@entry=0xaaf110) at src/nm-bus-manager.c:858
#13 nm_exported_object_unexport (self=0xaaf110) at src/nm-exported-object.c:714
#14 _settings_connection_removed (connection=<optimized out>, user_data=0xaaf110) at src/nm-active-connection.c:184
#15 g_closure_invoke () at /lib64/libgobject-2.0.so.0
#16 signal_emit_unlocked_R () at /lib64/libgobject-2.0.so.0
#17 g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#18 g_signal_emit_by_name () at /lib64/libgobject-2.0.so.0
#19 nm_settings_connection_signal_remove (self=self@entry=0x9e4a80, allow_reuse=allow_reuse@entry=0) at src/settings/nm-settings-connection.c:2085
#20 do_delete (self=0x9e4a80, callback=0x58106a <con_delete_cb>, user_data=0xa84fa0) at src/settings/nm-settings-connection.c:768
#21 do_delete (connection=0x9e4a80, callback=0x58106a <con_delete_cb>, user_data=0xa84fa0) at src/settings/plugins/keyfile/nms-keyfile-connection.c:127
#22 nm_settings_connection_delete (self=self@entry=0x9e4a80, callback=callback@entry=0x58106a <con_delete_cb>, user_data=0xa84fa0) at src/settings/nm-settings-connection.c:694
#23 delete_auth_cb (self=self@entry=0x9e4a80, context=context@entry=0x7fffd80131e0, subject=0x91fb40, error=<optimized out>, data=data@entry=0x0) at src/settings/nm-settings-connection.c:1879
#24 pk_auth_cb (chain=0x7fffd00024a0, chain_error=<optimized out>, context=0x7fffd80131e0, user_data=<optimized out>) at src/settings/nm-settings-connection.c:1351
#25 auth_chain_finish (user_data=0x7fffd00024a0) at src/nm-auth-utils.c:92
#26 g_idle_dispatch () at /lib64/libglib-2.0.so.0
Restore the unexport upon failure to fix this.
Fixes: f2182fbf9bhttps://bugzilla.redhat.com/show_bug.cgi?id=1440077
(cherry picked from commit 69fd96118e)
For example, when starting without Wi-Fi plugin, a generic device
is created. On stop, we should not store the unmanaged state
on the state file, otherwise after restart the device is unmanaged.
Only store explicit user decisions.
https://bugzilla.redhat.com/show_bug.cgi?id=1440171
(cherry picked from commit 142ebb1037)
This makes it possible to retain Internet connectivity when multiple devices
have a default route, but one with the link type of a higher priority can not
reach the Internet.
This moves tracking of connectivity to NMDevice and makes the NMManager
negotiate the best of known connectivity states of devices. The NMConnectivity
singleton handles its own configuration and scheduling of the permission
checks, but otherwise greatly simplifies it.
This will be useful to determine correct metrics for multiple default routes
depending on actual internet connectivity.
The per-device connection checks is not yet exposed on the D-Bus, since they
probably should be per-address-family as well.
Perform the lookup for a matching device earlier, so that in
autoconnect_slaves() we already know which device a connection is
being activated on. This will be needed to sort the returned
connections by interface name.
When slave connections are autoactivated as dependency to master we
don't check if a compatible device is available before trying to
activate them, leading to the following failed assertion:
nm_act_request_new: assertion 'NM_IS_DEVICE (device)' failed
GLib 2.52 added a G_GNUC_PRINTF attribute to
g_dbus_message_new_method_error(). This triggered warning in
NetworkManager when built with -Wformat, which is an error when built
with -Werror=format-security. It seems that gcc isn't smart enough to
see that (foo = "bar") should be treated as a literal.
Fortunately there is a g_dbus_message_new_method_error_literal()
function which does not take printf-style arguments, and we don't need
them, so we can use that.
This patch was originally by Rico Tzschichholz <ricotz@ubuntu.com>, and
was submitted to Launchpad at
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1650972https://bugzilla.gnome.org/show_bug.cgi?id=780444
When remove_device() is called on an already unrealized device, we
should release it from master if necessary and clear its IP
configurations to avoid leaks.
https://bugzilla.redhat.com/show_bug.cgi?id=1433303
It includes a reason code that makes it possible for the clients to be
more reasonable about error messages.
The reason code is essentially copied from the VPN, plus three more
reasons that were useful for non-VPN connections.
When deciding whether to touch a device we sometimes look at whether
the active connection is external/assumed. In many cases however,
there is no active connection around (e.g. while moving the device
from state unmanaged to disconnected before assuming).
So in most cases we instead look at the device-state-reason to decide
whether to touch the interface (see nm_device_state_reason_check()).
Often it's desirable to have no state and passing data as function
arguments. However, the state reason has to be passed along several hops
(e.g. a queued state change). Or a change to a master/slave can affect
the slave/master, where we pass on the state reason. Or an intermediate
event might invalidate a previous state reason. Passing the state
whether to touch a device or not as a state-reason is cumbersome
and limited.
Instead, the device should be aware of whats going on. Add a
sys-iface-state with:
- SYS_IFACE_STATE_EXTERNAL: meaning, NM should not touch it
- SYS_IFACE_STATE_ASSUME: meaning, NM is gracefully taking over
- SYS_IFACE_STATE_MANAGED: meaning, the device is managed by NM
- SYS_IFACE_STATE_REMOVED: the device no longer exists
This replaces most checks of nm_device_state_reason_check() and
nm_active_connection_get_activation_type() by instead looking at
the sys-iface-state of the device.
This patch probably has still issues, but the previous behavior was
not very clear either. We will need to identify those issues in future
tests and tweak the behavior. At least, now there is one flag that
describes how to behave.
Now we only search for a candiate with matching UUID. No need to
first lookup all activatable connections, just find the candidate
by UUID and see if it is activatable.
- rename find_ac_for_connection() to
active_connection_find_first_by_connection().
This function has the unexpected(?) behavior to either
search by the @connection using pointer identity, or
by looking up the UUID (depending on whether @connection
is a NMSettingsConnection).
- Split out a active_connection_find_first() part.
The name "find-first" makes it clear that we walk the list until
a matching active-connection is found. That's why I added the
max_state argument. Otherwise, it would have to be called
"find-first-non-disconnected".
- drop nm_manager_get_connection_device(). It only had two callers.
It also used the ambiguity of the @connection argument, but
only one of the two callers cared about that. Meaning,
_internal_activate_device() now explicitly does a lookup by
the NMSettingsConnection.
There is only one caller of assume_connection().
The tasks there are not clearly separate and it is clearer just to
have one large recheck_assume_connection() function which proceeds
step by step, instead of breaking it into separate parts and move
them apart in the source code. The latter implies that -- unless
we forward-declare assume_connection() -- the order of definition
with assume_connection,recheck_assume_connection is contrary the
flow of the code.
Having a separate function in this case is not a simplification
so merge it.
Before, we would have the concept of assumed connections, which is used
for (1) externally configured device that NetworkManager should not
touch and (2) connections that NetworkManager should gracefully take
over after a restart (seamlessly, non-destructively).
The behavior was unclear and mixed. It wasn't clear whether the device
is in no-touch mode (1) or gracefully take-over (2).
Previous commits already introduce separate activation types EXTERNAL (1)
and ASSUME (2).
Also, previously, we would for both (1) and (2) try to find a matching
connection and use it. That doesn't make sense for either.
In the external case (1), we should not pretend that an existing connection
is active. Let's always create a new in-memory connection for these
cases. Note that this means, external devices now will always generate
a connection, instead of pretending an existing one is active.
For the assume case (2), we shall not use nm_utils_match_connection() to
guess which connection might be active. It can only the one that was
active on a previous run of NetworkManager. So, use the information from
the state file and try to activate it. If that fails, it is not an
assume activation type. Note, that this means we now most of the time
don't do ASSUME anymore. Most of the time we do EXTERNAL activation
That is because the state information is only available after restart
of NetworkManager.