NM fails to activate a slave if the master device already exists
but has not active connection.
One way to reproduce, create a bond master/slave configuration and
ensure that the master device exists (e.g. by activating the bond, and
killing NM without taking down the device, or externally via `ip link add`).
If you try to activate the slave it will fail with the following message
(in nmcli):
"Error: Connection activation failed: The active connection on MASTER is not a valid master for 'SLAVE'"
although MASTER is not active.
This also triggers the following assertion:
#0 0x0000003370c504e9 in g_logv () from /lib64/libglib-2.0.so.0
#1 0x0000003370c5063f in g_log () from /lib64/libglib-2.0.so.0
#2 0x000000000047646a in is_compatible_with_slave (master=0x0, slave=slave@entry=0xc4aa60) at nm-manager.c:2193
#3 0x000000000047e289 in ensure_master_active_connection (self=self@entry=0xc8d150, subject=0x7f23b80059e0, connection=connection@entry=0xc4aa60, device=device@entry=0xcac380, master_connection=master_connection@entry=0x0,
master_device=master_device@entry=0xc9e800, error=error@entry=0x7fffa5cc4958) at nm-manager.c:2395
#4 0x000000000047eb4a in _internal_activate_device (self=self@entry=0xc8d150, active=active@entry=0xcc33b0, error=error@entry=0x7fffa5cc4958) at nm-manager.c:2665
#5 0x000000000047ecf2 in _internal_activate_generic (self=self@entry=0xc8d150, active=active@entry=0xcc33b0, error=error@entry=0x7fffa5cc4958) at nm-manager.c:2712
#6 0x000000000047ef2b in _internal_activation_auth_done (active=0xcc33b0, success=<optimized out>, error_desc=0x0, user_data1=0xc8d150, user_data2=<optimized out>) at nm-manager.c:2848
#7 0x0000000000466fa1 in auth_done (chain=0xcef020, error=0x0, unused=<optimized out>, user_data=<optimized out>) at nm-active-connection.c:603
#8 0x00000000004753da in auth_chain_finish (user_data=0xcef020) at nm-manager-auth.c:88
#9 0x0000003370c492a6 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#10 0x0000003370c49628 in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#11 0x0000003370c49a3a in g_main_loop_run () from /lib64/libglib-2.0.so.0
#12 0x0000000000429e65 in main (argc=1, argv=0x7fffa5cc4e48) at main.c:678
Signed-off-by: Thomas Haller <thaller@redhat.com>
Something changed at some point so that NMManager was now recomputing
its state after a connection was activated, but before NMPolicy had
decided whether to give that connection the default route, meaning
NMManager would set the state to CONNECTED_LOCAL rather than
CONNECTED_GLOBAL.
Fix this by watching the active connection :default and :default6
properties too, so we do the right thing regardless of what order the
AC properties change in.
The NMDevice dispose() function contained some badly-duplicated logic
about when to deactivate a device on its last ref. This logic should
only run when the device is removed by the manager, since the manager
controls the device's life-cycle, and the manager knows best when to
clean up the device. But since it was tied to the device's refcount,
it could have run later than the manager wanted, or not at all.
It gets better. Dispose duplicated logic that was already done in
nm_device_cleanup(), and then *called* nm_device_cleanup() if the
device was still activated and managed. But the manager already
unmanages the device when removing it, which triggers a call to
nm_device_cleanup(), takes the device down, and resets the IPv6
sysctl properties, which dispose() duplicated too. So by the time
dispose() runs, the device should already be unmanaged if the
manager wants to deconfigure it, and most of the dispose() code
should be a no-op.
Clean all that up and remove duplicated functions. Now, the flow
should be like this:
1) manager decides to remove the device and calls remove_device()
2) if the device should be deconfigured, the manager unmanages
the device
3) the NMDevice state change handler tears down the active connection
via nm_device_cleanup() and resets IPv6 sysctl properties
4) when the device's last reference is finally released, only internal
data members are freed in dispose() because the device should
already have been cleaned up by the manager and be unmanaged
5) if the device should be left running because it has an assumable
connection, then the device is not unmanaged, and no cleanup
happens in the state change handler or in dispose()
The following procedure leaves an NMActiveConnection around for a deactivated
device, which causes errors in libnm-glib clients when they cannot create the
GObject for the non-existent device of the AC.
1) allow a device which can assume connections to be activated
2) stop NM, which should leave the device's IP configuration up
3) start NM and allow it to assume the device's existing connection
4) remove the device, either by unplugging it or 'rmmod'
The device is removed by nm-manager.c::remove_device(), but the device object
is not moved to UNMANAGED state, leaving the NMActiveConnection completely
unaware the device has gone away.
The nm-manager.c::remove_device() code did not correctly handle moving a
forcibly removed (eg, by unplugging or 'ip link del' or 'rmmod') device to
the UNMANAGED state when the device was active with an assumed connection.
To fix this, make the conditions when the device should be deactivated
on removal much more explicit.
A device should be deactivated on removal if:
1) it is forcibly removed, eg by the kernel network interface being
removed due to 'ip link del' or hotplugging, or internally by NM due
to a parent WWAN interface taking priority over a WWAN ethernet interface
2) if the device cannot assume connections, in which case NetworkManager
must have activated the device and since we cannot assume the connection
on restart, we should deactivate it
3) if the device is not activated, to ensure that its IPv6 parameters
and other things get reset to the pre-NetworkManager values
https://bugzilla.gnome.org/show_bug.cgi?id=729833
CC nm-manager.lo
nm-manager.c: In function 'assume_connection':
nm-manager.c:1605:345: error: 'return' with no value, in function returning non-void [-Werror=return-type]
g_return_if_fail (nm_device_get_state (device) >= NM_DEVICE_STATE_DISCONNECTED);
Minor error, introduced by commit f229f4e201.
Signed-off-by: Thomas Haller <thaller@redhat.com>
If the initial attempt to assume a connection on a device fails, and
the device remains un-activated, but then something changes its
configuration externally, try to generate a new connection and assume
that.
Instead of having a GObject property and a factory function to get
the plugin's device type, just use the factory function, since it
always has to be around.
Make Wi-Fi support a plugin using the new device factory interface.
Provides a 7% size reduction in the core NM binary.
Before After
NM: 1154104 1071992 (-7%)
Wi-Fi: 0 110464
(all results from stripped files)
Older Intel "ipw" devices (ipw2100, ipw2200, and ipw2915) only gained
kernel rfkill subsystem integration with 2.6.33. Before then their
custom rfkill functionality had to be polled via sysfs. Since we now
require at least a 3.x kernel, remove this old code.
It's only used to keep the DHCPManager up-to-date with hostname changes,
and that can be accomplished in much less code by just having NMManager
set a hostname on the DHCPManager itself.
rfkill handling should only pay attention to actual rfkill, since
rfkill is global but the modem management service state is per-device.
Thus calculating a global state from multiple devices is very
likely to get things wrong.
Remove all of the code that used to handle that sort of thing,
which means removing the 'enable-changed' signal from the Modem
device, since now nothing external to the modem device should
need to care whether it's enabled internally or not.
Poke the kernel's WWAN rfkill state when the user changes our
WWANEnabled property, the same as we do for WiFi. Also restore
saved WWAN state on startup, as we do for WiFi. No good reason
why WWAN should be different here.
Before platform raised 3 signals for each object type. Combine
them into one and add a new parameter @change_type to distinguish
between the change type.
Signed-off-by: Thomas Haller <thaller@redhat.com>
When assuming the connections on restart we want to prefer more-recently-used
connections. That's why we have to sort connections according to timestamps in
descending order. That means connections used more recently (higher timestamp)
go before connections with lower timestamp.
https://bugzilla.redhat.com/show_bug.cgi?id=1067712
Instead of waiting until the device is disposed and dbus-glib does
it for us, remove them when the Manager is done with them. If
something (like pending D-Bus calls) holds a reference to the device
when the Manager removes it, the device would previously still
service method calls until all references are released. When
the device is removed, it's dead, and it shouldn't be exported
anymore.
We'll want to track internal management separately in the future, so split out
user management (eg, whether the device has been explicitly marked unmanaged
by the user).
Instead of tracking unmanaged-ness in a couple variables (and because
I'd like to add one for user-unmanaged later) let's do it in a single
flags variable, and consolidate setting of the unmanaged states in one
place.
If two users had the ability to control networking, and user1 started
a private connection which user2 cannot see, user2 could start their
own connection and disconnect user1's connection. This is not
consistent with device disconnection. A user who cannot see a
connection should not be able to start/stop it, even if they are
allowed to control networking in general.
NMClient's "devices" property was getting out of sync because the
daemon was emitting "notify" before actually changing the property
value. This resulted in problems with re-activating virtual devices
that had previously been deactivated in gnome-control-center and
anaconda. (And probably gnome-shell and nm-applet?)
The AC doesn't get a D-Bus path until it's exported, but that happens after
it's handed to the Device it will be activated on. The Device emits a
PropertyChanged event when it's handed the AC, but it ignores ACs that
aren't exported yet. Thus when activating, the Device doesn't emit the
AC's path at all in the ActiveConnection property because it's NULL.
Fix that by exporting the AC immediately before starting activation
with it.
Second, move the notification of the Device.ActiveConnection property
to be emitted along with the state change to PREPARE instead of long
before it. While we don't guarantee signal ordering in general, this
seems like a more correct ordering.
https://bugzilla.gnome.org/show_bug.cgi?id=723783
If we find multiple plugins for the same type (eg, because the user
previously installed the "atm" and "bt" plugins, and didn't delete
them), log a warning.
Because not all clients set the 'hidden' property in a connection for
hidden/non-SSID-broadcasting networks, they may not show up in
the device's available-connections property. After the
PendingActivation object removal, all activations require the
connection to be in available-connections, and thus hidden SSID
networks could not be activated.
Unfortunately check_connection_available() is used both during
activation and to populate the available-connections array, but we
only want to special-case activation paths, and still ensure that
SSIDs not found in the scan list are not in available-connections.
To make it clear this is a WiFi only hack, and that we should
remove it at some point in the future, create another class method
specifically for hidden WiFi and use that in activation paths to
special-case hidden WiFi connection activation.
Since vxlan is new-ish, and vxlan IPv6 support in particular has only
been in the kernel since 3.11, we include our own copy of the vxlan
netlink constants rather than depending on the installed headers.
If IPv4 configuration did not succeed or the device has no IPv4 addresses
when NM restarts, it will detect the existing device configuration as
'disabled'. This can happen when a bridge has no slaves and thus cannot
perform IPv4 addressing because it has no carrier (since bridge carrier
status depends on slave carriers). When NM starts or restarts, it
sees the bridge has no IPv4 address and assumes the IPv4 method is
'disabled'. This creates a new connection, which blocks any slave
connections from activating if they specify their master via UUID
(since the bridge's active connection is generated).
Fix this by allowing matches from 'disabled' to 'auto' if the device
has no carrier, and there are no other differences between the
original and the candidate connections.
Dependencies may fail before the activation actually starts, like
when a software device gets removed while the activation is
scheduled but before it has started. In these cases, the
activation request should fail.
With some upcoming changes, ActiveConnection objects could change to
DEACTIVATED state during activation, for example if the AC's device
was removed while the AC was being authorized.
To ensure the AC stays alive and is not used after being freed,
keep a reference to the AC across authorization operations.
Make WWAN support a plugin using the new device factory interface.
Provides a 5% size reduction in the core NM binary.
Before After
NM: 1187224 1125208 (-5%)
MM: 0 100576
(all results from stripped files)
Make Bluetooth support a plugin using the new device factory interface.
Provides a 5% size reduction in the core NM binary.
Before After
NM: 1253016 1187224 (-5%)
BT: 0 85752
(all results from stripped files)
Make ADSL support a plugin using the new device factory interface.
Provides a 1% size reduction in the core NM binary.
Before After
NM: 1265336 1253016 (-1%)
ATM: 0 27360
(all results from stripped files)
In preparation for making WWAN and Bluetooth plugins, rework
the device plugin interface to meet those plugins' needs and
port WiMAX over in the process.
Instead of having NMManager listen directly to the ModemManager
for modem removal signals, have the NMDeviceModem and NMDeviceBt
listen for them (since they obviously have a pointer to the backing
NMModem object) and then re-emit any necessary device removal
signals to the manager.
In reality the connection provider (NMSettings) is always the same
object, and some device plugins need access to it. Instead of
cluttering up the device plugin API by passing the provider into
every plugin regardless of whether the plugin needs it, create
a getter function.
The OLPC mesh code did rely on nm_manager_get() referencing the
singleton when returning it, but all other callers of nm_manager_get()
did not. Thus the manager's refcount would always increase and
almost never decrease. Fix the refcounting so that the manager
always has only one ref, and it's lifetime is controlled by
main() and nothing else.
If a device is already activated, queue the new activation to allow
the transition through the DEACTIVATING state.
---
Also remove the "HACK" bits in nm_device_deactivate(). This hack was
added on 2007-09-25 in commit 9c2848d. At the time, with user settings
services, if a client created a connection and requested that NM
activate it, NM may not have read the connection from the client over
D-Bus yet. So NM created a "deferred" activation request which waited
until the connection was read from the client, and then began activation.
The Policy watched for device state changes and other events (like
it does now) and activated a new device if the old one was no longer
valid. It specifically checked for deferred activations and then
did nothing. However, when the client's connection was read, then
nm-device.c cleared the deferred activation bit, leading to a short
period of time where the device was in DISCONNECTED state but there
was no deferred activation, because the device only changes state to
PREPARE from the idle handler for stage1. If other events happened
during this time, the policy would tear down the device that was
about to be activated. This early state transition to PREPARE
worked around that.
We need to remove it now though, because (a) the reason for its
existence is no longer valid, and (b) _device_activate() may now
be called from inside nm_device_state_changed() and thus it cannot
change to a new state inside the function.
If the kernel doesn't tag a modem's ethernet interface with
DEVTYPE=wwan then NetworkManager has no idea that's a modem
(and cannot be used until connected via the control port).
Since DEVTYPE=wwan devices get ignored by NM, so should these
interfaces when NM knows they are modems.
That got broken at some point for ModemManager1, because the
data port isn't read until the modem is connected. NM only
looked for and removed the data-port-as-ethernet-device when
the modem was added, long before the MM1 data port was found.
ModemManager does provide a list of ports owned by the modem
though, which we can use at modem addition time to remove
an ethernet device that is controled by the modem.