NM never removes the current AP from the AP list, to prevent NM from
indicating that it's connected, but to nothing. But the supplicant
can remove that AP from its list at any time (out of range, turned off,
etc), leading to a priv->current_ap that is no longer known to the
supplicant but still exists in the NM AP list. Since the supplicant
has forgotten it, NM will never receive a removal signal for it.
To ensure that a supplicant-forgotten priv->current_ap is removed
from the NM AP list when priv->current_ap is cleared or changed, mark
any AP removed by the supplicant as 'fake'. It will then always be
removed in set_current_ap() and not linger in the AP list forever like
a zombie.
Instead of tricky logic to merge APs and age them, just tell the
supplicant what our aging parameters are, and rely on it to handle
removal from the list. Notable behavioral changes are:
* APs will now be removed when they haven't been seen for two
consecutive scans in which they would have been included. This
means that when the scan interval is short, out-of-range APs will
be removed much more quickly than the previous 360 seconds.
* APs now live at most 250 seconds (twice our longest scan interval)
instead of the previous 360 seconds.
* The problem with wpa_supplicant < 2.3 not notifying that a BSS has
been seen in the scan if none of its properties actually changed is
now avoided, because an AP is only removed when the supplicant removes it
In general these changes should make the scan list more responsive, at
the cost of slightly more instability in the list due to the unreliability
of WiFi scanning. But it also removes a layer of complexity and
abstraction from NetworkManager, pushing the scan results list closer
to that which the hardware reports.
Most nm_platform_*() functions operate on the platform
singleton nm_platform_get(). That made sense because the
NMPlatform instance was mainly to hook fake platform for
testing.
While the implicit argument saved some typing, I think explicit is
better. Especially, because NMPlatform could become a more usable
object then just a hook for testing.
With this change, NMPlatform instances can be used individually, not
only as a singleton instance.
Before this change, the constructor of NMLinuxPlatform could not
call any nm_platform_*() functions because the singleton was not
yet initialized. We could only instantiate an incomplete instance,
register it via nm_platform_setup(), and then complete initialization
via singleton->setup().
With this change, we can create and fully initialize NMPlatform instances
before/without setting them up them as singleton.
Also, currently there is no clear distinction between functions
that operate on the NMPlatform instance, and functions that can
be used stand-alone (e.g. nm_platform_ip4_address_to_string()).
The latter can not be mocked for testing. With this change, the
distinction becomes obvious. That is also useful because it becomes
clearer which functions make use of the platform cache and which not.
Inside nm-linux-platform.c, continue the pattern that the
self instance is named @platform. That makes sense because
its type is NMPlatform, and not NMLinuxPlatform what we
would expect from a paramter named @self.
This is a major diff that causes some pain when rebasing. Try
to rebase to the parent commit of this commit as a first step.
Then rebase on top of this commit using merge-strategy "ours".
Some dual-band access points use the same BSSID in both the 5GHz
and 2.4GHz bands, so obviously frequency must be taken into account
when matching. But no AP should ever use the same SSID/BSSID pair
concurrently in the same band on different frequencies.
Some APs also dynamically channel hop when they detect interference,
or when they restart, they do so on a different channel after finding
the channel with the lowest interference. To assure that we don't
end up with stale scan list entries when the AP has switched to
another channel in the same band, fall back to band matching
when merging new scan results.
The only reason to allow lazy matching was to update a fake
current AP when its real scan result comes in. Now that NM
tracks the current AP based on the the supplicant's current
BSS, a fake current AP will be replaced when the real scan
result arrives. So it's pointless to update the fake one
when it'll just be removed soon.
Instead of keeping track of it internally, and attempting to fuzzy-match
access points and stuff, just use the supplicant's information. If the
supplicant doesn't know what AP it's talking to, then we've got more
problems than just NM code.
The theory here is that when starting activation, we'll use the given
access point from the scan list that the supplicant already knows about.
If there isn't one yet (adhoc, hidden, or AP mode) then we'll create
a fake AP and add it to the scan list.
Once the supplicant has associated to the AP, it'll notify about the
current BSS which we then look up in the scan list, and set
priv->current AP to that. If for some reason the given AP isn't yet
in our scan list (which often happens for adhoc/AP) then we'll just
live with the fake AP until we get the scan result, which should happen
very soon.
We don't need to do any fuzzy-matching to find the current AP because
it will either always be one the supplicant knows about, or a fake one
that the supplicant doesn't know about that will be replaced soon anyway.
A few places in the NMSupplicantInterface API and in NMDeviceWifi's
use of it were still using "GHashTable *properties" where they should
have been using "GVariant *properties". (This didn't cause any actual
problems because nothing was looking at those arguments.)
(Also fix a comment typo.)
It was confusing to understand the difference between calling nm_device_connection_is_available()
and check_connection_available(), they behaved similar, but not really
the same. Especially nm_device_connection_is_available() would look
first into @available_connetions, and might call check_connection_available()
itself. Whereas @available_connetions was also populated by testing
check_connection_available(). This interrelation makes it hard to
understand when nm_device_connection_is_available() returned true.
Rename nm_device_connection_is_available() to nm_device_check_connection_available()
and remove all direct calls of check_connection_available() in favor of
the wrapper nm_device_check_connection_available().
Now we only call nm_device_check_connection_available() with different
parameters (@flags and @specific_object). We also have the additional
guarantee that specifying more @flags will widen the result and making
a connection "more" available, while specifying a @specific_object will
restrict it.
This also changes behavior in several cases. For example before
nm_device_connection_is_available() for user-requests would always
declare matching connections available on Wi-Fi devices (only)
regardless of the device state. Now the device state gets consistently
considered.
For default-unmanaged devices it also changes behavior in complicated
ways, because before we would put connections into @available_connetions
for every device-state, but nm_device_connection_is_available() had a
special over-ride only for unmanaged-state.
This also fixes a bug, that user can activate an unavailable Wi-Fi
device:
nmcli radio wifi off
nmcli connection up wlan0
Having logging statements in a simple getter (or is_*()) means
you cannot call these functions without cluttering the log.
Another approach would be to add an @out_reason argument, and
callers who actually care log the reason. For now, just get rid
of the messages.
nm_device_get_hw_address() may return NULL and nm_platform_link_get_type may
return NM_LINK_TYPE_NONE. While it might be a good idea to check for such cases
at the init time it seems easier to just ignore it and prevent blowing up in
subsequent deactivation.
A quick test case:
# while :; do ip link add moo0 type veth peer moo1; ip link del moo0 ; done
Yields:
NetworkManager:ERROR:devices/nm-device-ethernet.c:268:constructor:
assertion failed: (link_type == NM_LINK_TYPE_ETHERNET ||
link_type == NM_LINK_TYPE_VETH)
nm_device_set_hw_addr: assertion 'addr != NULL' failed
https://bugzilla.gnome.org/show_bug.cgi?id=740992
config.h should be included from every .c file, and it should be
included before any other include. Fix that.
(As a side effect of how I did this, this also changes us to
consistently use "config.h" rather than <config.h>. To the extent that
it matters [which is not much], quotes are more correct anyway, since
we're talking about a file in our own build tree, not a system
include.)
Split a base NMSettingIPConfig class out of NMSettingIP4Config and
NMSettingIP6Config, and update things accordingly.
Further simplifications of now-redundant IPv4-vs-IPv6 code are
possible, and should happen in the future.
nm_setting_verify() took a GSList of other NMSettings, but really it
would just be simpler all around to pass the NMConnection instead...
This means that several formerly NMSetting-branded functions that
operated on lists-of-settings now get replaced with
NMConnection-branded functions instead.
Add nm-core-types.h, typedefing all of the GObject types in
libnm-core; this is needed so that nm-setting.h can reference
NMConnection in addition to nm-connection.h referencing NMSetting.
Removing the cross-includes from the various headers causes lots of
fallout elsewhere. (In particular, nm-utils.h used to include
nm-connection.h, which included every setting header, so any file that
included nm-utils.h automatically got most of the rest of libnm-core
without needing to pay attention to specifics.) Fix this up by
including nm-core-internal.h from those files that are now missing
includes.
This can be triggered by stopping the DBUS service. The assertion happens
because when the supplicant stops (due to the name-owner-change, which is triggered
because dbus-daemon quit), the supplicant manager sets all supplicant interfaces
to DOWN state so that they can be cleaned up. That does two things:
1) calls supplicant_interface_acquire() to attempt to re-launch wpa_supplicant
in case wpa_supplicant segfaulted
2) moves the NMDevicWifi to UNAVAILABLE state because the supplicant is gone,
the device is no longer usable and we must terminate the connection and wait
for the supplicant to come back
But #2 also ends up calling supplicant_interface_acquire(), because that's what
we want to do when the NMDeviceWifi is first managed (at startup) and when the
supplicant dies. The code just doesn't differentiate between the two cases.
To fix this, just allow duplicate "waiting for supplicant" pending
actions, which is fine because the operation doesn't care about strict
added/removed sequencing.
#0 0x000000381d0504e9 in g_logv (log_domain=0x59cd5b "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fff8cccc1a0) at gmessages.c:989
#1 0x000000381d05063f in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at gmessages.c:1025
#2 0x000000000044f53b in nm_device_add_pending_action (self=0xa60310, action=0x7febecd1d37d "waiting for supplicant", assert_not_yet_pending=1) at devices/nm-device.c:6466
#3 0x00007febecd0bc56 in supplicant_interface_acquire (self=0xa60310) at nm-device-wifi.c:262
#4 0x00007febecd0b240 in device_state_changed (device=0xa60310, new_state=NM_DEVICE_STATE_UNAVAILABLE, old_state=NM_DEVICE_STATE_ACTIVATED, reason=NM_DEVICE_STATE_REASON_SUPPLICANT_FAILED) at nm-device-wifi.c:3136
#5 0x000000381dc05d8c in ffi_call_unix64 () at ../src/x86/unix64.S:76
#6 0x000000381dc056bc in ffi_call (cif=cif@entry=0x7fff8cccc6c0, fn=0x7febecd0b050 <device_state_changed>, rvalue=0x7fff8cccc630, avalue=avalue@entry=0x7fff8cccc5b0) at ../src/x86/ffi64.c:522
#7 0x000000381e010ad8 in g_cclosure_marshal_generic (closure=0xa30a40, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>,
marshal_data=0x7febecd0b050 <device_state_changed>) at gclosure.c:1454
#8 0x000000381e010298 in g_closure_invoke (closure=closure@entry=0xa30a40, return_value=return_value@entry=0x0, n_param_values=4, param_values=param_values@entry=0x7fff8cccc8c0, invocation_hint=invocation_hint@entry=0x7fff8cccc860)
at gclosure.c:777
#9 0x000000381e02211b in signal_emit_unlocked_R (node=node@entry=0xa322b0, detail=detail@entry=0, instance=instance@entry=0xa60310, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fff8cccc8c0) at gsignal.c:3624
#10 0x000000381e02a0f2 in g_signal_emit_valist (instance=instance@entry=0xa60310, signal_id=signal_id@entry=63, detail=detail@entry=0, var_args=var_args@entry=0x7fff8ccccaf8) at gsignal.c:3330
#11 0x000000381e02a8f8 in g_signal_emit_by_name (instance=0xa60310, detailed_signal=0x59a8d1 "state-changed") at gsignal.c:3426
#12 0x00000000004514a4 in _set_state_full (self=0xa60310, state=NM_DEVICE_STATE_UNAVAILABLE, reason=NM_DEVICE_STATE_REASON_SUPPLICANT_FAILED, quitting=0) at devices/nm-device.c:6820
#13 0x0000000000449ec6 in nm_device_state_changed (self=0xa60310, state=NM_DEVICE_STATE_UNAVAILABLE, reason=NM_DEVICE_STATE_REASON_SUPPLICANT_FAILED) at devices/nm-device.c:6949
#14 0x00007febecd0f247 in supplicant_iface_state_cb (iface=0x9b9290, new_state=13, old_state=12, disconnect_reason=0, user_data=0xa60310) at nm-device-wifi.c:2276
#15 0x000000381dc05d8c in ffi_call_unix64 () at ../src/x86/unix64.S:76
#16 0x000000381dc056bc in ffi_call (cif=cif@entry=0x7fff8cccd230, fn=0x7febecd0eb20 <supplicant_iface_state_cb>, rvalue=0x7fff8cccd160, avalue=avalue@entry=0x7fff8cccd0e0) at ../src/x86/ffi64.c:522
#17 0x000000381e010f35 in g_cclosure_marshal_generic_va (closure=0xa2f490, return_value=0x0, instance=0x9b9290, args_list=<optimized out>, marshal_data=0x0, n_params=3, param_types=0xa422e0) at gclosure.c:1550
#18 0x000000381e0104c7 in _g_closure_invoke_va (closure=closure@entry=0xa2f490, return_value=return_value@entry=0x0, instance=instance@entry=0x9b9290, args=args@entry=0x7fff8cccd470, n_params=3, param_types=0xa422e0) at gclosure.c:840
#19 0x000000381e029749 in g_signal_emit_valist (instance=0x9b9290, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7fff8cccd470) at gsignal.c:3238
#20 0x000000381e02a3af in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at gsignal.c:3386
#21 0x00000000004b0e4b in set_state (self=0x9b9290, new_state=13) at supplicant-manager/nm-supplicant-interface.c:344
#22 0x00000000004b0916 in smgr_avail_cb (smgr=0xa3c890, pspec=0xa3c8d0, user_data=0x9b9290) at supplicant-manager/nm-supplicant-interface.c:930
#23 0x000000381e010298 in g_closure_invoke (closure=0x9a68b0, return_value=return_value@entry=0x0, n_param_values=2, param_values=param_values@entry=0x7fff8cccd770, invocation_hint=invocation_hint@entry=0x7fff8cccd710) at gclosure.c:777
#24 0x000000381e02235d in signal_emit_unlocked_R (node=node@entry=0x990da0, detail=detail@entry=610, instance=instance@entry=0xa3c890, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fff8cccd770) at gsignal.c:3586
#25 0x000000381e02a0f2 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7fff8cccd900) at gsignal.c:3330
#26 0x000000381e02a3af in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at gsignal.c:3386
#27 0x000000381e014945 in g_object_dispatch_properties_changed (object=0xa3c890, n_pspecs=1, pspecs=0x0) at gobject.c:1047
#28 0x000000381e017019 in g_object_notify_by_spec_internal (pspec=<optimized out>, object=0xa3c890) at gobject.c:1141
#29 g_object_notify (object=0xa3c890, property_name=<optimized out>) at gobject.c:1183
#30 0x00000000004b56f1 in set_running (self=0xa3c890, now_running=0) at supplicant-manager/nm-supplicant-manager.c:228
#31 0x00000000004b5002 in name_owner_changed (dbus_mgr=0x99f740, name=0x9ba910 "fi.w1.wpa_supplicant1", old_owner=0xa945a0 ":1.25", new_owner=0xac2ce0 "", user_data=0xa3c890) at supplicant-manager/nm-supplicant-manager.c:294
#32 0x000000381dc05d8c in ffi_call_unix64 () at ../src/x86/unix64.S:76
#33 0x000000381dc056bc in ffi_call (cif=cif@entry=0x7fff8cccdd10, fn=0x4b4d50 <name_owner_changed>, rvalue=0x7fff8cccdc80, avalue=avalue@entry=0x7fff8cccdc00) at ../src/x86/ffi64.c:522
#34 0x000000381e010ad8 in g_cclosure_marshal_generic (closure=0xa530a0, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=0x0) at gclosure.c:1454
#35 0x000000381e010298 in g_closure_invoke (closure=0xa530a0, return_value=return_value@entry=0x0, n_param_values=4, param_values=param_values@entry=0x7fff8cccdf10, invocation_hint=invocation_hint@entry=0x7fff8cccdeb0) at gclosure.c:777
#36 0x000000381e02235d in signal_emit_unlocked_R (node=node@entry=0x99cce0, detail=detail@entry=0, instance=instance@entry=0x99f740, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fff8cccdf10) at gsignal.c:3586
#37 0x000000381e02a0f2 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7fff8ccce0d0) at gsignal.c:3330
#38 0x000000381e02a3af in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at gsignal.c:3386
#39 0x00000000004c4026 in proxy_name_owner_changed (proxy=0x998210, name=0xa3ad50 "fi.w1.wpa_supplicant1", old_owner=0x9cffc0 ":1.25", new_owner=0x99d230 "", user_data=0x99f740) at nm-dbus-manager.c:708
#40 0x000000381dc05d8c in ffi_call_unix64 () at ../src/x86/unix64.S:76
#41 0x000000381dc056bc in ffi_call (cif=cif@entry=0x7fff8ccce410, fn=0x4c3fd0 <proxy_name_owner_changed>, rvalue=0x7fff8ccce380, avalue=avalue@entry=0x7fff8ccce300) at ../src/x86/ffi64.c:522
#42 0x000000381e010ad8 in g_cclosure_marshal_generic (closure=closure@entry=0x9beb80, return_gvalue=return_gvalue@entry=0x0, n_param_values=<optimized out>, param_values=<optimized out>,
invocation_hint=invocation_hint@entry=0x7fff8ccce630, marshal_data=marshal_data@entry=0x0) at gclosure.c:1454
#43 0x0000003829a10864 in marshal_dbus_message_to_g_marshaller (closure=0x9beb80, return_value=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=0x7fff8ccce630, marshal_data=0x0) at dbus-gproxy.c:1736
#44 0x000000381e010298 in g_closure_invoke (closure=0x9beb80, return_value=return_value@entry=0x0, n_param_values=3, param_values=param_values@entry=0x7fff8ccce690, invocation_hint=invocation_hint@entry=0x7fff8ccce630) at gclosure.c:777
#45 0x000000381e02235d in signal_emit_unlocked_R (node=node@entry=0x9be290, detail=detail@entry=347, instance=instance@entry=0x998210, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fff8ccce690) at gsignal.c:3586
#46 0x000000381e02a0f2 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7fff8ccce840) at gsignal.c:3330
#47 0x000000381e02a3af in g_signal_emit (instance=instance@entry=0x998210, signal_id=<optimized out>, detail=<optimized out>) at gsignal.c:3386
#48 0x0000003829a111c0 in dbus_g_proxy_emit_remote_signal (message=0xa6c2b0, proxy=0x998210) at dbus-gproxy.c:1789
#49 dbus_g_proxy_manager_filter (connection=<optimized out>, message=0xa6c2b0, user_data=0x9be520) at dbus-gproxy.c:1356
#50 0x000000382001006e in dbus_connection_dispatch (connection=connection@entry=0x9badb0) at dbus-connection.c:4631
#51 0x0000003829a0ad65 in message_queue_dispatch (source=source@entry=0x9bdcc0, callback=<optimized out>, user_data=<optimized out>) at dbus-gmain.c:90
#52 0x000000381d0492a6 in g_main_dispatch (context=0x99b4b0) at gmain.c:3066
#53 g_main_context_dispatch (context=context@entry=0x99b4b0) at gmain.c:3642
#54 0x000000381d049628 in g_main_context_iterate (context=0x99b4b0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3713
#55 0x000000381d049a3a in g_main_loop_run (loop=0x99b5d0) at gmain.c:3907
#56 0x0000000000443c28 in main (argc=1, argv=0x7fff8cccf268) at main.c:704
Signed-off-by: Thomas Haller <thaller@redhat.com>
Since 3a54d050 the AP address is not a gbyte[], but a char *. The fake AP BSSID
fixup could trigger an assertion failure:
Oct 26 11:14:45 goatlord.localdomain NetworkManager[540]: nm_ethernet_address_is_valid: assertion 'addr != NULL' failed
https://bugzilla.gnome.org/show_bug.cgi?id=739258
Fixes: 3a54d05098
Creating a mode=ap connection (as GNOME control center does) would otherwise
assert in complete_connection despite having a proper SSID set:
NetworkManager-wifi:ERROR:nm-device-wifi.c:1118:complete_connection: assertion failed: (ssid)
Aborted
Most NMDevice types defined their own error domain but then never used
it. A few did use their errors, but some of those errors are redundant
with NMDeviceError, and others can be added to it.