Commit d518278011 changed
the hashing for the APs to use direct-hashing.
That was wrong because get_ap_by_path() needs a full
string-comparison.
Fixes: d518278011
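The difference between the two hashing modes can be sketched as follows. This is a minimal illustration, not NetworkManager code: the two table classes and the variable names are hypothetical, and Python object identity stands in for C pointer comparison.

```python
# Hypothetical model of a table keyed by D-Bus path; not NetworkManager code.

class DirectHashTable:
    """Keys match only if they are the very same object (pointer identity)."""

    def __init__(self):
        self._items = {}

    def insert(self, key, value):
        # Keep a reference to the key so its id() stays valid.
        self._items[id(key)] = (key, value)

    def lookup(self, key):
        entry = self._items.get(id(key))
        return entry[1] if entry else None


class StringHashTable:
    """Keys match if their contents are equal (full string comparison)."""

    def __init__(self):
        self._items = {}

    def insert(self, key, value):
        self._items[key] = value

    def lookup(self, key):
        return self._items.get(key)


stored_path = "/org/freedesktop/NetworkManager/AccessPoint/1"
# A path supplied by a caller: equal content, but a different object.
requested_path = "/".join(
    ["", "org", "freedesktop", "NetworkManager", "AccessPoint", "1"])

direct = DirectHashTable()
direct.insert(stored_path, "ap1")
by_string = StringHashTable()
by_string.insert(stored_path, "ap1")

found_direct = direct.lookup(requested_path)     # misses: not the same object
found_string = by_string.lookup(requested_path)  # hits: same contents
```

A lookup by path arrives with a string that is equal to, but not identical with, the stored key, which is why the direct-hashing table misses it.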
Now that NM follows the supplicant's scan list and CurrentBSS, any AP that isn't
known to the supplicant will be 'fake', and priv->current_ap always tracks
CurrentBSS.
We can then simplify link_timeout_cb() because any AP that would have been
force-removed before will now be marked "fake" if it's unknown to the supplicant,
and will always be removed by set_current_ap(), so we can remove the force
argument. To better fix #733105 we never want to remove an AP known to
the supplicant, even if we failed to connect to it.
https://bugzilla.gnome.org/show_bug.cgi?id=733105
Since commit 7cb323d923,
nm_ap_new_from_properties() will always return an
AP with BSSID set. Restore the assertion during
try_fill_ssid_for_hidden_ap().
This reverts commit e9bc18d2a7.
Previously most objects were implicitly unexported when they were
destroyed, but since refcounts may make the object live longer than
intended, we should explicitly unexport them when they should no
longer be present on the bus.
This means we can assume that objects will always be un-exported
already when they are destroyed, *except* when quitting where most
objects will live until exit because NM leaves interfaces up and
running on quit.
The @aps hash has the D-Bus path of the exported
object as key. It already rightly avoided additionally
copying the string and relied on the path being stable.
When doing that, we can just go one step further and
use direct-hashing instead of string-hashing.
Note that NMExportedObject already promises that
the path will not change as long as the object is
exported. See code comments in the export/unexport
functions.
For future use of ObjectManager, we must explicitly unexport
the AP and no longer depend on having it unexported during
deconstruction (because object manager keeps the instance alive).
Also refactor adding/removal of APs and move the export/unexport
calls to the place where we emit the signal.
First add the new AP, before setting it as current.
Also set the AP *after* thawing the notifications. Otherwise
it is not clear which notification gets raised first as their
order is undefined. But we want the client to first see
the new AP and only later get a notification about having a new
current AP.
Otherwise we'd hit an assert and rightly so!
Program received signal SIGTRAP, Trace/breakpoint trap.
g_logv (log_domain=0x5555556b2f80 "NetworkManager", log_level=G_LOG_LEVEL_WARNING, format=<optimized out>, args=args@entry=0x7fffffffcd10) at gmessages.c:1046
1046 g_private_set (&g_log_depth, GUINT_TO_POINTER (depth));
(gdb) bt
#0 g_logv (log_domain=0x5555556b2f80 "NetworkManager", log_level=G_LOG_LEVEL_WARNING, format=<optimized out>, args=args@entry=0x7fffffffcd10) at gmessages.c:1046
#1 0x00007ffff4a4ea3f in g_log (log_domain=log_domain@entry=0x5555556b2f80 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_WARNING, format=format@entry=0x7ffff4ac1e4c "%s") at gmessages.c:1079
#2 0x00007ffff4a4ed56 in g_warn_message (domain=domain@entry=0x5555556b2f80 "NetworkManager", file=file@entry=0x5555556aca93 "devices/nm-device.c", line=line@entry=1101,
func=func@entry=0x5555556b22e0 <__FUNCTION__.35443> "nm_device_release_one_slave", warnexpr=warnexpr@entry=0x0) at gmessages.c:1112
#3 0x00005555555ba80a in nm_device_release_one_slave (self=self@entry=0x5555559ec4c0, slave=slave@entry=0x5555559f7800, configure=configure@entry=1, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE)
at devices/nm-device.c:1101
#4 0x00005555555c264b in slave_state_changed (slave=0x5555559f7800, slave_new_state=NM_DEVICE_STATE_FAILED, slave_old_state=NM_DEVICE_STATE_IP_CONFIG, reason=NM_DEVICE_STATE_REASON_NONE, self=0x5555559ec4c0)
at devices/nm-device.c:1700
#5 0x00007ffff339cdac in ffi_call_unix64 () at ../src/x86/unix64.S:76
#6 0x00007ffff339c6d5 in ffi_call (cif=cif@entry=0x7fffffffd1c0, fn=<optimized out>, rvalue=0x7fffffffd130, avalue=avalue@entry=0x7fffffffd0b0) at ../src/x86/ffi64.c:522
#7 0x00007ffff4d45678 in g_cclosure_marshal_generic (closure=0x5555559b0160, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=0x0)
at gclosure.c:1454
#8 0x00007ffff4d44e38 in g_closure_invoke (closure=0x5555559b0160, return_value=return_value@entry=0x0, n_param_values=4, param_values=param_values@entry=0x7fffffffd3c0,
invocation_hint=invocation_hint@entry=0x7fffffffd360) at gclosure.c:768
#9 0x00007ffff4d5675d in signal_emit_unlocked_R (node=node@entry=0x55555598a6f0, detail=detail@entry=0, instance=instance@entry=0x5555559f7800, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fffffffd3c0) at gsignal.c:3553
#10 0x00007ffff4d5e4c1 in g_signal_emit_valist (instance=instance@entry=0x5555559f7800, signal_id=signal_id@entry=72, detail=detail@entry=0, var_args=var_args@entry=0x7fffffffd5f8) at gsignal.c:3309
#11 0x00007ffff4d5ecc8 in g_signal_emit_by_name (instance=instance@entry=0x5555559f7800, detailed_signal=detailed_signal@entry=0x5555556c0405 "state-changed") at gsignal.c:3405
#12 0x00005555555bd0e0 in _set_state_full (self=self@entry=0x5555559f7800, state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE, quitting=quitting@entry=0)
at devices/nm-device.c:8580
#13 0x00005555555be0e7 in nm_device_state_changed (self=self@entry=0x5555559f7800, state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_NONE) at devices/nm-device.c:8741
#14 0x00005555555c0a45 in queued_set_state (user_data=<optimized out>) at devices/nm-device.c:8765
#15 0x00007ffff4a4779a in g_main_dispatch (context=0x5555559433c0) at gmain.c:3109
#16 g_main_context_dispatch (context=context@entry=0x5555559433c0) at gmain.c:3708
#17 0x00007ffff4a47ae8 in g_main_context_iterate (context=0x5555559433c0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3779
#18 0x00007ffff4a47dba in g_main_loop_run (loop=0x555555943480) at gmain.c:3973
#19 0x000055555559713d in main (argc=1, argv=0x7fffffffdb78) at main.c:512
(gdb)
Device activation normally fails during one of the stages and in that
case the activation chain is implicitly interrupted.
But in some cases the device fails due to external events (such as a
failure of the master connection) while the activation sequence is still running
and so we need to ensure that any pending activation source gets
cleared upon entering the failed state.
https://bugzilla.redhat.com/show_bug.cgi?id=1270814
RFC7217 introduces an alternative mechanism for creating addresses during
stateless IPv6 address configuration. It's supposed to create addresses whose
host part stays stable in a particular network but changes when the host
enters another network, to mitigate the possibility of tracking the host's movement.
It can be used alongside RFC 4941 privacy extensions (temporary addresses)
and replaces the use of RFC 4862 interface identifiers.
The address creation mode is controlled by the ip6.addr_gen_mode property
(ADDR_GEN_MODE in ifcfg-rh), with values of "stable-privacy" and "eui-64",
defaulting to "eui-64" if unspecified.
The host part of an address is computed by hashing a system-specific secret
salted with various stable values that identify the connection with a secure
hash algorithm:
RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
For NetworkManager we use these parameters:
* F()
SHA256 hash function.
* Prefix
This is the network part of the /64 address.
* Net_Iface
We use the interface name (e.g. "eth0"). This ensures the address won't
change with the change of interface hardware.
* Network_ID
We use the connection UUID here. This ensures the salt is different for
wireless networks with a different SSID as suggested by RFC7217.
* DAD_Counter
A per-address counter that increases with each DAD failure.
* secret_key
We store the secret key in /var/lib/NetworkManager/secret_key. If it's
shorter than 128 bits then it's rejected. If the file is not present we
initialize it by fetching 256 pseudo-random bits from /dev/urandom on
first use.
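The computation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the order and encoding of the hashed fields, and the choice of the first 64 bits of the digest are hypothetical and not taken from the NetworkManager sources.

```python
import hashlib
import ipaddress


def stable_privacy_address(prefix, net_iface, network_id, dad_counter,
                           secret_key):
    """Return an IPv6 address whose host part is derived with SHA-256.

    Field order and encoding are assumptions made for this sketch.
    """
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("expected a /64 prefix")
    if len(secret_key) < 16:
        # Mirrors the rule that keys shorter than 128 bits are rejected.
        raise ValueError("secret key shorter than 128 bits")

    h = hashlib.sha256()
    h.update(network.network_address.packed[:8])   # Prefix (network part)
    h.update(net_iface.encode() + b"\0")           # Net_Iface, e.g. "eth0"
    h.update(network_id.encode() + b"\0")          # Network_ID (connection UUID)
    h.update(dad_counter.to_bytes(1, "big"))       # DAD_Counter
    h.update(secret_key)                           # secret_key

    # Use the first 64 bits of the digest as the host part (assumption).
    host_part = int.from_bytes(h.digest()[:8], "big")
    return ipaddress.IPv6Address(int(network.network_address) | host_part)
```

With the same inputs the result is always the same address; incrementing dad_counter after a DAD failure yields a different address within the same prefix.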
Duplicate address detection uses IDGEN_RETRIES = 3 and does not utilize the
IDGEN_DELAY delay (even though it SHOULD). This is for ease of implementation
and may change in the future. Neither parameter is currently configurable.
NMDevice detects the DAD failures by watching the removal of tentative
addresses (happens for DAD of addresses with valid lifetime, typically
discovered addresses) or changes to addresses with dadfailed flag (permanent
addresses, typically link-local and manually configured addresses).
It retries creation of link-local addresses itself and lets RDisc know about
the rest so that it can decide if it's an rdisc-managed address and retry
with a new address.
Currently NMDevice doesn't do anything useful about link-local address DAD
failures -- it just fails the link-local address addition instead of just
timing out, which happened before. RDisc just logs a warning and removes
the address from the list.
However, with RFC7217 stable privacy addresses the use of a different address
and thus a recovery from DAD failures would be possible.
In update_connection(), pick up the configuration of
the vlan interface from platform and create the proper
NMSettingVlan setting.
And during stage1, configure the flags of the device.
Also, change all the ingress/egress mappings at once
instead of having a netlink request for each mapping.
Also, ensure we *clear* all other mappings so that
only those are set, that were configured (done by
the *gress_reset_all argument).
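The effect of the reset-all argument can be sketched like this. It is a hypothetical model, not the platform code: the function name is made up, mappings are plain dictionaries, and a target of 0 is assumed to mean "clear this mapping".

```python
def build_mapping_request(current, desired, reset_all):
    """Return one list of (from, to) pairs to send in a single request.

    current:   mappings presently set on the device
    desired:   mappings from the configuration
    reset_all: if true, also clear every current mapping that is not desired
    A target of 0 stands for "clear" in this sketch (assumption).
    """
    request = dict(desired)
    if reset_all:
        for key in current:
            # Only add a clearing entry if the key is not being set anyway.
            request.setdefault(key, 0)
    return sorted(request.items())
```

For example, with current mappings {1: 3, 2: 5} and desired mappings {2: 6, 4: 1}, a reset-all request contains (1, 0), (2, 6) and (4, 1), so mapping 1 does not linger on the device.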
Instead of using libnl-route-3 library to serialize netlink messages,
construct the netlink messages ourselves.
This has several advantages:
- Creating the netlink message ourselves is actually more straightforward
than having an intermediate layer between NM and the kernel.
Now it is immediately clear how a platform request translates to
a netlink/kernel request.
You can look at the kernel sources to see how a certain netlink attribute
behaves, and then it's immediately clear how to set that (and vice
versa).
- Older libnl versions might have bugs or missing features for which
we needed workarounds (often by offering reduced/broken/untested
functionality). Now we can get rid of workarounds like _nl_has_capability(),
check_support_libnl_extended_ifa_flags(), HAVE_LIBNL_INET6_TOKEN.
Another example is a libnl bug when setting vlan ingress map which
isn't even yet fixed in libnl upstream.
- We no longer need libnl-route-3 at all and can drop that runtime
requirement, saving some 400k.
Constructing the messages ourselves also gives better performance
because we don't have to create the intermediate libnl object.
- In the future we will add more link-type support which is easier
to support by basing directly on the plain kernel/netlink API,
instead of requiring also libnl3 to expose this functionality.
E.g. adding macvtap support: we already parsed macvtap properties
ourselves because of missing libnl support. To *add* macvtap
support, we also would have to do it ourselves (or extend libnl).
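To illustrate what constructing a netlink message directly involves, here is a minimal sketch that only serializes an RTM_NEWLINK request into bytes. It is not the platform code: the function names are hypothetical, and the layout follows the kernel's struct nlmsghdr, struct ifinfomsg and struct rtattr as commonly documented. Nothing is sent to the kernel.

```python
import struct

# Native byte order, no implicit padding.
NLMSG_HDR = struct.Struct("=IHHII")   # nlmsghdr: len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # ifinfomsg: family, pad, type, index, flags, change
RTATTR_HDR = struct.Struct("=HH")     # rtattr: len, type

RTM_NEWLINK = 16
NLM_F_REQUEST = 0x01
NLM_F_ACK = 0x04
IFLA_IFNAME = 3
IFF_UP = 0x1


def align4(length):
    """Round up to the 4-byte alignment used by netlink."""
    return (length + 3) & ~3


def rtattr(attr_type, payload):
    """Serialize one attribute: header, payload, then alignment padding."""
    length = RTATTR_HDR.size + len(payload)
    padding = b"\0" * (align4(length) - length)
    return RTATTR_HDR.pack(length, attr_type) + payload + padding


def build_link_up_request(ifname, seq):
    """Build the bytes of a request that asks to set IFF_UP on a link by name."""
    body = IFINFOMSG.pack(0, 0, 0, IFF_UP, IFF_UP)  # AF_UNSPEC, index 0
    body += rtattr(IFLA_IFNAME, ifname.encode() + b"\0")
    header = NLMSG_HDR.pack(NLMSG_HDR.size + len(body), RTM_NEWLINK,
                            NLM_F_REQUEST | NLM_F_ACK, seq, 0)
    return header + body
```

Each field maps one-to-one to a kernel structure member or attribute, which is the directness the first bullet above refers to.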
The peer-address (IFA_ADDRESS) can also be all-zero (0.0.0.0).
That is distinct from a usual address without explicit peer-address,
which implicitly has the same peer and local address.
Previously, we treated an all-zero peer_address as having peer and
local address equal. This is especially grave, because the peer is part
of the primary key for an IPv4 address. So we not only get a property of
the address wrong, but we wrongly consider two different addresses as
one and the same.
To properly handle these addresses, we always must explicitly set the peer.
Usually, the peer-address is the same as the local address.
In case where it is not, it is the peer-address that determines
the IPv4 device-route. So we must use the peer-address.
Also, don't consider device-routes with the first octet of zero,
just like the kernel does.
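Both points, the peer being part of the address identity and the device-route being derived from the peer, can be sketched as follows. The function names and the tuple layout of the key are hypothetical and chosen for illustration only.

```python
import ipaddress


def address_key(ifindex, local, plen, peer):
    """Identity of an IPv4 address; the peer is part of the key."""
    return (ifindex, ipaddress.IPv4Address(local), plen,
            ipaddress.IPv4Address(peer))


def device_route(peer, plen):
    """Network of the device-route implied by an address.

    The route is derived from the peer-address, not the local address.
    Returns None for networks whose first octet is zero, which are skipped.
    """
    network = ipaddress.IPv4Network((peer, plen), strict=False)
    if network.network_address.packed[0] == 0:
        return None
    return network
```

With this key, 192.168.1.5/24 with peer 192.168.1.5 and 192.168.1.5/24 with peer 0.0.0.0 are two different addresses, and only the first one yields a device-route (192.168.1.0/24).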
Also, nm_ip4_config_get_subnet_for_host() is effectively the same
as nm_ip4_config_destination_is_direct(). So drop it.
We already have nm_platform_tun_get_properties(). Rename the function
as they both sidestep the platform cache to lookup some link-specific
properties.
For recent kernels, the peer-ifindex of veths is reported as
parent (IFLA_LINK). Prefer that over the ethtool lookup.
For one, this avoids the extra ethtool call which has the
downside of sidestepping the platform cache. Also, looking
up the peer-ifindex in ethtool does not report whether the
peer lives in another netns (NM_PLATFORM_LINK_OTHER_NETNS).
Only use ethtool as fallback for older kernels.
Because Bluez5 dropped DUN support, NM must do that manually which
includes emulating the "connected" property for Bluetooth devices when
DUN is used. It does this by setting priv->connected = TRUE in
nm_bluez_device_connect_finish().
But for PAN, when NM does process the 'connected' property change
notification, priv->connected is already TRUE and
_take_variant_property_connected() does nothing. Hence the
corresponding GObject property notification is not emitted,
nm-device-bt.c::check_connect_continue() will never return success, and
the activation times out.
To fix this, ensure that GObject notifications are emitted when the
device is connected, even if emulated internally.
https://mail.gnome.org/archives/networkmanager-list/2015-October/msg00053.html
https://bugzilla.redhat.com/show_bug.cgi?id=1255284
Add a new 'ignore' option to NMSettingWired.wake-on-lan which disables
management of wake-on-lan by NetworkManager (i.e. the pre-existing
option will not be touched). Also, change the default behavior to be
'ignore' instead of 'disabled'.
https://bugzilla.gnome.org/show_bug.cgi?id=755182
The peer-address seems less important than the prefix-length.
Also, nm_platform_ip4_address_delete() has the peer-address
as its last argument.
Soon ip4_address_get() also receives a peer-address argument,
so get the order right first.
This adds a LldpNeighbors property to the Device D-Bus interface
carrying information about devices discovered through LLDP. The
property is an array of hashes and each hash describes the values of
LLDP TLVs for a specific neighbor.
The unmanaged-flag NM_UNMANAGED_EXTERNAL_DOWN is initially set during
nm_device_finish_init(). But it was only set if the device was down at
that point.
If due to a race the platform device was not yet initialized, a later
initialization in device_link_changed() would clear NM_UNMANAGED_PLATFORM_INIT.
If the device is not external-down (because it was already up during
nm_device_finish_init()), the device will be managed right away with
reason NM_DEVICE_STATE_REASON_NOW_MANAGED.
Together with commit e29ab54335, this
is a race that causes a failure to assume the external-down device.
https://bugzilla.redhat.com/show_bug.cgi?id=1269199
When a VLAN has a bond as parent device the MAC address of the bond
may change when other devices are enslaved and then the VLAN would
have a MAC which is different from the parent's.
Let the VLAN device listen for changes in hw-address property of
parent and update its MAC address accordingly.
https://bugzilla.redhat.com/show_bug.cgi?id=1264322
Executing:
# brctl addbr lbr0
# ip addr add 10.1.1.1/24 dev lbr0
# ip link set lbr0 up
can result in a race so that NetworkManager would manage the device
(and clear the IP addresses).
It happens when NetworkManager first receives platform signals that
the device is already up:
signal: link changed: 11: lbr0 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 bridge* not-init addrgenmode eui64 addr D2:A1:B4:17:18:F2 driver bridge
Note that the device is still unknown via udev (not-init). The
unmanaged-state NM_UNMANAGED_EXTERNAL_DOWN gets cleared, but the
device still stays unmanaged.
Only afterwards the device is known in udev:
signal: link changed: 11: lbr0 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 bridge* init addrgenmode eui64 addr D2:A1:B4:17:18:F2 driver bridge
At this point, we also clear NM_UNMANAGED_PLATFORM_INIT, making
the device managed with reason NM_DEVICE_STATE_REASON_NOW_MANAGED.
That results in managing the external device.
Fix that by only clearing NM_UNMANAGED_EXTERNAL_DOWN after the device
is no longer NM_UNMANAGED_PLATFORM_INIT.
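The ordering constraint of the fix can be modelled with a small sketch. This is a hypothetical model, not NMDevice: the class, method names and reason strings are made up for illustration, and it assumes that the reason attached to becoming managed comes from whichever flag is cleared last.

```python
import enum


class Unmanaged(enum.Flag):
    NONE = 0
    PLATFORM_INIT = 1
    EXTERNAL_DOWN = 2


class Device:
    """Hypothetical model: the device is managed once no flag is left."""

    def __init__(self):
        self.unmanaged = Unmanaged.PLATFORM_INIT | Unmanaged.EXTERNAL_DOWN
        self.managed_reason = None

    @property
    def managed(self):
        return self.unmanaged == Unmanaged.NONE

    def _clear(self, flag, reason):
        if flag not in self.unmanaged:
            return
        self.unmanaged &= ~flag
        if self.managed:
            # The flag cleared last determines the reason (assumption).
            self.managed_reason = reason

    def link_changed(self, is_up, udev_initialized):
        if udev_initialized:
            self._clear(Unmanaged.PLATFORM_INIT, "now-managed")
        # Only clear EXTERNAL_DOWN once PLATFORM_INIT is gone.
        if is_up and Unmanaged.PLATFORM_INIT not in self.unmanaged:
            self._clear(Unmanaged.EXTERNAL_DOWN, "connection-assumed")
```

Replaying the two signals from above, the first one (up, not-init) changes nothing, and the second one (up, init) clears both flags in order, so the device does not become managed with the "now-managed" reason.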
https://bugzilla.redhat.com/show_bug.cgi?id=1269199
schedule_stage3() used to set the firewall before really
scheduling the stage3. In that case, fw_change_zone_cb() would
then directly call:
activation_source_schedule (self, activate_stage3_ip_config_start, AF_INET);
This was different from all other places. Usually, only the
nm_device_schedule_*() functions would directly call to
activation_source_schedule().
Change this to behave similarly to how we wait for master-ready
while scheduling stage2. As such, it is more idiomatic, and
it would still work correctly even if there were multiple conditions
that would block scheduling-stage3.