We often lookup the private data and retrieve it via NM_DEVICE_GET_PRIVATE(),
which in turn calls G_TYPE_INSTANCE_GET_PRIVATE().
Instead cache the pointer to the private data.
There are up- and downsides:
- requries additional sizeof(gpointer) bytes for each NMDevice.
+ retrieving the private pointer will be slightly faster.
+ easier debugging in gdb as it is currently often a pain to
retrieve the private data.
But most importantly, the allows to change our common pattern
to first cache the private data in a variable @priv. That is
often cumbersome to write, especially for short functions.
This change gives us a choice to use self->priv directly.
Such a change should not be aimed for every class. Instead it makes
mostly sense for NMDevice, where it pays off better due to the
class' size and ubiquitous use.
https://mail.gnome.org/archives/networkmanager-list/2015-December/msg00017.html
Showcase for the new macros NM_UTILS_STRING_LOOKUP_TABLE() and
NM_UTILS_STRING_LOOKUP_TABLE_DEFINE_STATIC().
It changes behavior in case of looking up an invalid/unknown
state reason. Previously it would just have returned "unknown"
-- which was indistinguishable from a regular "unknown" value.
Now it returns the numeric id as a string. The string is allocated
with alloca(), which is desired but one should be aware of the pitfalls:
- prevents the caller from being inlined
- bad idea to do in a loop.
I found the handling of the master-device very confusing because it was
unclear who sets priv->master, and when it should be set.
Now:
- Setting priv->master (in a slave) always goes together with adding
the master to priv->slaves (in the master). Previously, this was
done at separate places, so it was not clear if master and slave
always agree on their relationship -- in fact, they did not.
- There are now three basic functions which do the enslaving/releasing:
(1) nm_device_master_add_slave()
(2) nm_device_master_enslave_slave()
(3) nm_device_master_release_one_slave()
Step 3/release basically undoes the 1/add and 2/enslave steps.
- completing the enslaving/releasing is now done by
(1) nm_device_slave_notify_enslave()
(2) nm_device_slave_notify_release()
These functions also emit signals like NM_DEVICE_MASTER.
- Derived classes no longer emit NM_DEVICE_SLAVES notification. Instead
the notification is emited together with NM_DEVICE_MASTER, whenever a
slaves changes state. Also, NM_DEVICE_SLAVES list now only exposes
slaves that are actually @is_enslaved.
Instead of reimplementing the slave property in bond, bridge
and team, just add the property to the parent class. It's not
that the parent class would be agnostic to the master/slave
implementation, all the slaves are known to the every device
type implementation.
Also, the derived class doesn't know the correct time when
to invoke the notify-changed for the slaves property.
E.g. it should be only invoked after nm_device_slave_notify_enslave()
when other components also consider the slave as enslaved.
Later this will be fixed so that the SLAVES property correspond
to what other master/slave related properties say.
release_slave() should do the right thing and handle errors as
good as it can. There is no value in propagating the error and
it's not clear what the caller should do in face of a failure
during release.
We have the master/slave related functions
- for master device:
- nm_device_master_add_slave()
- nm_device_master_release_slaves()
- nm_device_release_one_slave()
- nm_device_enslave_slave()
- for slave device:
- nm_device_slave_notify_enslave()
- nm_device_slave_notify_release()
Rename the two that didn't match the pattern to
- nm_device_master_release_one_slave()
- nm_device_master_enslave_slave()
But, of course, only one realized device can have the same
interface name at a time.
This commit effectively reverts most of:
1b37cd0340
core: allow ActiveConnections to be created without a device
But it's not easy to do a separate revert of that code due to
interdependencies with nm-manager.c.
Creating devices when they are defined by a connection also makes
makes it possible to require the NMDevice to be present when
activating it, which means we can remove a bunch of code from
NMManager that had to handle software devices not existing yet at
the time of the activation request.
But it also means we must be more careful when finding master
interfaces during slave activation, since we cannot simply match
by interface name alone. Instead we must find the master which
matches both the interface name and can control slaves of the type
which is being activated.
Ensure the platform link with the same interface name as the
NMDevice is actually compatible with it before using the link
for initialization of device properties. If not, remove the
NMDevice and create a new one since there are kernel resources
with a different type.
Unrealized devices aren't backed by kernel resources and so won't know
all of their attributes. That means three things:
1) they must update their attributes when they become realized
2) they must clear those attributes when unrealized
3) they must be looser in checking compatible connections until
they are realized
This requires that the setup() function be split into two parts, start & finish,
because finish must be run after add_device()
Also, we can simplify whether to pay attention to 'recheck-assume', which
is now dependent on priv->is_nm_owned, because the only case where NM should
*not* listen for the 'recheck-assume' signal is when the device is a
software device created by NM itself. That logic was previously spread
across the callers of add_device() but is now consolidated into
nm-manager.c::device_realized() and nm-device.c::nm_device_create_and_realize().
This property is TRUE for devices that exist either as a kernel device
or are backed by some other resource (eg, ModemManager object, Bluez
device, etc). It will eventually be FALSE for software devices that
are not yet instantiated.
There is only one user of the NM_DEVICE_MASTER property (NMActiveConnection).
device_master_changed() uses the property for change notifications,
but returns early if !nm_device_get_master().
We might emit the "notify::master" signal too often, even when nothing
changed (because currently we always emit it when priv->master changes).
If we would fix that to only emit the signal when nm_device_get_master()
actually changes, it would still be correct, because device_master_changed()
doesn't care about notifications when the master is unset.
Hence, the only user doesn't care about the difference. So change it because
it's unexpected that the property and function are not completely equal.
Commit cd3df12c8f reused the
virtual function component_added() to notify the vlan device
about a possibly new parent.
This reuse of the virtual function for another purpose is confusing.
Clean that up by splitting the implementation and add a new
virtual function nm_device_notify_new_device_added() which gets
(only implemented by NMDeviceVlan).
When deleting a master-device either via `nmcli device delete`
or `ip link delete`, the slave-device would hang.
This seems to be broken for a very long time already.
This is due to the following:
#0 0x00005555555f548c in nm_device_slave_notify_release (self=0x555555dc1300, reason=NM_DEVICE_STATE_REASON_NONE) at devices/nm-device.c:2175
#1 0x00005555555d6de2 in nm_device_release_one_slave (self=0x555555de3dd0, slave=0x555555dc1300, configure=0, reason=NM_DEVICE_STATE_REASON_NONE) at devices/nm-device.c:1117
#2 0x00005555555f02b7 in device_link_changed (self=0x555555dc1300) at devices/nm-device.c:1460
Previously, nm_device_slave_notify_release() being called with reason
"NONE" did not actually transition the device-state, thus keeping the
device wrongly in activated state.
There were two callers that passed configure=FALSE to nm_device_release_one_slave(),
(and thus reason=NONE to nm_device_slave_notify_release()):
- (1) device_link_changed():
nm_device_release_one_slave (priv->master, self, FALSE, /*wrong reason NONE*/);
- (2) nm_device_removed():
nm_device_release_one_slave (priv->master, self, FALSE, NM_DEVICE_STATE_REASON_REMOVED);
We should always change the device-state during nm_device_slave_notify_release()
regardless of the reason.
(2) was added by commit c83b40aca7, later
refined by commit 5dd48f7527. In a way
change the second fix to perform some of the configuration (but still
not unenslaving the device).
The new object type represents tunnels over IPv4 and IPv6.
We have a single setting type (NMSettingIPTunnel) for tunnels and it
can't be shared among different device factories. So we define also a
single device type for all tunnels.
This new object will also represent GRE tunnels, which before were
instantiated as NMDeviceGre and had a ".Device.Gre" D-Bus
interface. This commit introduces a change in behavior.
This enum was unused and meaningless because the platform signals
are emitted as a consequence of netlink messages. It is not clear
whether a netlink message was received due to an external event
or an internal action.
The DHCP client from new libsystemd-network requires a link-local IPv6
address to be passed to the library; add a new argument to
nm_dhcp_manager_start_ip6() and related functions.
When deconfiguring a device, we must also explicitly clear the
default-route -- unless the device was assumed.
This can easily reproduced by disconnecting the cable from the
wired connection that has the default rout. Prevously, the
default-route was not cleared and lingered around.
https://bugzilla.gnome.org/show_bug.cgi?id=757587
Device subclasses can call nm_device_recheck_available() at any time,
and the function would change the device's state to UNKNOWN in cases
where the device was available already. For WWAN devices, availability
is rechecked every time the modem state changes, resulting in:
NetworkManager[28919]: <info> (ttyUSB4): modem state changed, 'disabled' --> 'enabling' (reason: user-requested)
NetworkManager[28919]: <debug> [1445538582.116727] [devices/nm-device.c:2769] recheck_available(): [0x23bd710] (ttyUSB4): device is available, will transition to unknown
NetworkManager[28919]: <info> (ttyUSB4): modem state changed, 'enabling' --> 'searching' (reason: user-requested)
NetworkManager[28919]: <debug> [1445538582.776317] [devices/nm-device.c:2769] recheck_available(): [0x23bd710] (ttyUSB4): device is available, will transition to unknown
Previously most objects were implicitly unexported when they were
destroyed, but since refcounts may make the object live longer than
intended, we should explicitly unexport them when they should no
longer be present on the bus.
This means we can assume that objects will always be un-exported
already when they are destroyed, *except* when quitting where most
objects will live until exit because NM leaves interfaces up and
running on quit.
Otherwise we'd hit an assert and rightly so!
Program received signal SIGTRAP, Trace/breakpoint trap.
g_logv (log_domain=0x5555556b2f80 "NetworkManager", log_level=G_LOG_LEVEL_WARNING, format=<optimized out>, args=args@entry=0x7fffffffcd10) at gmessages.c:1046
1046 g_private_set (&g_log_depth, GUINT_TO_POINTER (depth));
(gdb) bt
#0 g_logv (log_domain=0x5555556b2f80 "NetworkManager", log_level=G_LOG_LEVEL_WARNING, format=<optimized out>, args=args@entry=0x7fffffffcd10) at gmessages.c:1046
#1 0x00007ffff4a4ea3f in g_log (log_domain=log_domain@entry=0x5555556b2f80 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_WARNING, format=format@entry=0x7ffff4ac1e4c "%s") at gmessages.c:1079
#2 0x00007ffff4a4ed56 in g_warn_message (domain=domain@entry=0x5555556b2f80 "NetworkManager", file=file@entry=0x5555556aca93 "devices/nm-device.c", line=line@entry=1101,
func=func@entry=0x5555556b22e0 <__FUNCTION__.35443> "nm_device_release_one_slave", warnexpr=warnexpr@entry=0x0) at gmessages.c:1112
#3 0x00005555555ba80a in nm_device_release_one_slave (self=self@entry=0x5555559ec4c0, slave=slave@entry=0x5555559f7800, configure=configure@entry=1, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE)
at devices/nm-device.c:1101
#4 0x00005555555c264b in slave_state_changed (slave=0x5555559f7800, slave_new_state=NM_DEVICE_STATE_FAILED, slave_old_state=NM_DEVICE_STATE_IP_CONFIG, reason=NM_DEVICE_STATE_REASON_NONE, self=0x5555559ec4c0)
at devices/nm-device.c:1700
#5 0x00007ffff339cdac in ffi_call_unix64 () at ../src/x86/unix64.S:76
#6 0x00007ffff339c6d5 in ffi_call (cif=cif@entry=0x7fffffffd1c0, fn=<optimized out>, rvalue=0x7fffffffd130, avalue=avalue@entry=0x7fffffffd0b0) at ../src/x86/ffi64.c:522
#7 0x00007ffff4d45678 in g_cclosure_marshal_generic (closure=0x5555559b0160, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=0x0)
at gclosure.c:1454
#8 0x00007ffff4d44e38 in g_closure_invoke (closure=0x5555559b0160, return_value=return_value@entry=0x0, n_param_values=4, param_values=param_values@entry=0x7fffffffd3c0,
invocation_hint=invocation_hint@entry=0x7fffffffd360) at gclosure.c:768
#9 0x00007ffff4d5675d in signal_emit_unlocked_R (node=node@entry=0x55555598a6f0, detail=detail@entry=0, instance=instance@entry=0x5555559f7800, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fffffffd3c0) at gsignal.c:3553
#10 0x00007ffff4d5e4c1 in g_signal_emit_valist (instance=instance@entry=0x5555559f7800, signal_id=signal_id@entry=72, detail=detail@entry=0, var_args=var_args@entry=0x7fffffffd5f8) at gsignal.c:3309
#11 0x00007ffff4d5ecc8 in g_signal_emit_by_name (instance=instance@entry=0x5555559f7800, detailed_signal=detailed_signal@entry=0x5555556c0405 "state-changed") at gsignal.c:3405
#12 0x00005555555bd0e0 in _set_state_full (self=self@entry=0x5555559f7800, state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE, quitting=quitting@entry=0)
at devices/nm-device.c:8580
#13 0x00005555555be0e7 in nm_device_state_changed (self=self@entry=0x5555559f7800, state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE) at devices/nm-device.c:8741
#14 0x00005555555c0a45 in queued_set_state (user_data=<optimized out>) at devices/nm-device.c:8765
#15 0x00007ffff4a4779a in g_main_dispatch (context=0x5555559433c0) at gmain.c:3109
#16 g_main_context_dispatch (context=context@entry=0x5555559433c0) at gmain.c:3708
#17 0x00007ffff4a47ae8 in g_main_context_iterate (context=0x5555559433c0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3779
#18 0x00007ffff4a47dba in g_main_loop_run (loop=0x555555943480) at gmain.c:3973
#19 0x000055555559713d in main (argc=1, argv=0x7fffffffdb78) at main.c:512
(gdb)
Device activation normally fails during one of the stages and in that
case the activation chain is implicitly interrupted.
But in some cases the device fails for external events (as a failure
of master connection) while the activation sequence is still running
and so we need to ensure that any pending activation source gets
cleared upon entering the failed state.
https://bugzilla.redhat.com/show_bug.cgi?id=1270814
RFC7217 introduces an alternative mechanism for creating addresses during
stateless IPv6 address configuration. It's supposed to create addresses whose
host part stays stable in a particular network but changes when the hosts
enters another network to mitigate possibility of tracking the host movement.
It can be used alongside RFC 4941 privacy extensions (temporary addresses)
and replaces the use of RFC 4862 interface identifiers.
The address creation mode is controlld by ip6.addr_gen_mode property
(ADDR_GEN_MODE in ifcfg-rh), with values of "stable-privacy" and "eui-64",
defaulting to "eui-64" if unspecified.
The host part of an address is computed by hashing a system-specific secret
salted with various stable values that identify the connection with a secure
hash algorithm:
RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
For NetworkManager we use these parameters:
* F()
SHA256 hash function.
* Prefix
This is a network part of the /64 address
* Net_Iface
We use the interface name (e.g. "eth0"). This ensures the address won't
change with the change of interface hardware.
* Network_ID
We use the connection UUID here. This ensures the salt is different for
wireless networks with a different SSID as suggested by RFC7217.
* DAD_Counter
A per-address counter that increases with each DAD failure.
* secret_key
We store the secret key in /var/lib/NetworkManager/secret_key. If it's
shorter than 128 bits then it's rejected. If the file is not present we
initialize it by fetching 256 pseudo-random bits from /dev/urandom on
first use.
Duplicate address detection uses IDGEN_RETRIES = 3 and does not utilize the
IDGEN_DELAY delay (despite it SHOULD). This is for ease of implementation
and may change in future. Neither parameter is currently configurable.
NMDevice detects the DAD failures by watching the removal of tentative
addresses (happens for DAD of addresses with valid lifetime, typically
discovered addresses) or changes to addresses with dadfailed flag (permanent
addresses, typically link-local and manually configured addresses).
It retries creation of link-local addresses itself and lets RDisc know about
the rest so that it can decide if it's rdisc-managed address and retry
with a new address.
Currently NMDevice doesn't do anything useful about link-local address DAD
failures -- it just fails the link-local address addition instead of just
timing out, which happened before. RDisc just logs a warning and removes
the address from the list.
However, with RFC7217 stable privacy addresses the use of a different address
and thus a recovery from DAD failures would be possible.
Instead of using libnl-route-3 library to serialize netlink messages,
construct the netlink messages ourselves.
This has several advantages:
- Creating the netlink message ourself is actually more straight
forward then having an intermediate layer between NM and the kernel.
Now it is immediately clear, how a platform request translates to
a netlink/kernel request.
You can look at the kernel sources how a certain netlink attribute
behaves, and then it's immediately clear how to set that (and vice
versa).
- Older libnl versions might have bugs or missing features for which
we needed to workaround (often by offering a reduced/broken/untested
functionality). Now we can get rid or workaround like _nl_has_capability(),
check_support_libnl_extended_ifa_flags(), HAVE_LIBNL_INET6_TOKEN.
Another example is a libnl bug when setting vlan ingress map which
isn't even yet fixed in libnl upstream.
- We no longer need libnl-route-3 at all and can drop that runtime
requirement, saving some 400k.
Constructing the messages ourselves also gives better performance
because we don't have to create the intermediate libnl object.
- In the future we will add more link-type support which is easier
to support by basing directly on the plain kernel/netlink API,
instead of requiring also libnl3 to expose this functionality.
E.g. adding macvtap support: we already parsed macvtap properties
ourselves because of missing libnl support. To *add* macvtap
support, we also would have to do it ourself (or extend libnl).
The peer-address (IFA_ADDRESS) can also be all-zero (0.0.0.0).
That is distinct from an usual address without explicit peer-address,
which implicitly has the same peer and local address.
Previously, we treated an all-zero peer_address as having peer and
local address equal. This is especially grave, because the peer is part
of the primary key for an IPv4 address. So we not only get a property of
the address wrong, but we wrongly consider two different addresses as
one and the same.
To properly handle these addresses, we always must explicitly set the peer.
Usually, the peer-address is the same as the local address.
In case where it is not, it is the peer-address that determines
the IPv4 device-route. So we must use the peer-address.
Also, don't consider device-routes with the first octet of zero,
just like kernel does.
Also, nm_ip4_config_get_subnet_for_host() is effectively the same
as nm_ip4_config_destination_is_direct(). So drop it.
The peer-address seems less important then the prefix-length.
Also, nm_platform_ip4_address_delete() has the peer-address
argument as last.
Soon ip4_address_get() also receives a peer-address argument,
so get the order right first.