We're hooking the signal on construction, but we only queue a pending
action on reaching UNAVAILABLE state. The signal could fire in between:
<info> [1539282167.9666] manager: (msh0): new 802.11 OLPC Mesh device (/org/freedesktop/NetworkManager/Devices/4)
<info> [1539282168.1440] manager: (wlan0): new 802.11 WiFi device (/org/freedesktop/NetworkManager/Devices/5)
<info> [1539282168.1831] device (msh0): found companion WiFi device wlan0
<warn> [1539282168.2110] device (msh0): remove_pending_action (1): 'waiting-for-companion' not pending
file src/devices/nm-device.c: line 13966 (<dropped>): should not be reached
https://github.com/NetworkManager/NetworkManager/pull/229
Instead of walking through the list all known networks and comparing
name & SSIDs to judge whether a network is an IWD KnownNetwork, look at
the Network.KnownNetwork pre-IWD-0.8 property.
Ensure priv->can_connect is up to date on IWD state changed. If we
exited the function early priv->can_connect would sometimes be wrongly
TRUE and we'd start a new autoconnect too early after IP configuration
had failed for example.
Handle AP mode connections by setting the Mode property on IWD's Device
interface to "ap" (which will make the Station interface go away, the
Powered property -- normally controlled by set_enabled -- to switch to
FALSE and back to TRUE, and then the AccessPoint interface to appear)
and then calling the AccessPoint.Start method. This is all done in the
CONFIG phase in NM. We also attempt to always set Mode back to
"station" and wait for the Station interface to reappear before going to
the NM DISCONNECTED state. All this complicates the code a little.
While making the necessary changes simplify a lot of the checks which
are implied by other things we've checked already, for example
priv->can_scan and priv->can_connect can now only be TRUE when device is
powered up and in station mode (Station interface is present) so we can
skip other checks. Also assume that check_connection_compatible has
been called before other methods are called so we can skip multiple
connection mode checks and checks that a IWD KnownNetwork exsists for
EAP connections.
act_stage1_prepare and act_stage2_config now borrow more code from
nm-device-wifi.c because both backend now handle multiple modes.
Use nm_utils_error_is_cancelled instead of checking
g_error_matches (error, G_IO_ERROR, G_IO_ERROR_CANCELLED), set
consider_is_disposing false. Also use the DBUS_INTERFACE_PROPERTIES
macro.
Make sure we g_variant_unref() the values returned from
g_dbus_proxy_call_finish. In get_ordered_networks_cb also make sure we
don't access the NMDeviceIwd data until after we know the call has not
been cancelled. Switch from _nm_dbus_proxy_call_finish to
g_dbus_proxy_call_finish where we don't care about the variant type
returned.
Only call nm_platform_wifi_get_capabilities and
nm_platform_wifi_get_mode with the wpa_supplicant backend. They're used
to initialize the wireless-capabilities property and to skip creating
NMDevices for interfaces in unknown wifi mode which IWD handles already.
Under valgrind, we cannot create an NAcd instance.
--10916-- WARNING: unhandled amd64-linux syscall: 321
--10916-- You may be able to write your own handler.
--10916-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--10916-- Nevertheless we consider this a bug. Please report
--10916-- it at http://valgrind.org/support/bug_reports.html.
This limitation already poses a problem, because running NetworkManager
under valgrind might fail. However, for tests it doesn't matter and we
can just skip them.
See the logfile at [1], for how NetworkManager first attempts to connect
using WPS (which takes about 30 seconds). However, early on, the user logs
into KDE and a secret agent would register, which possibly could provide
secrets to connect. I think it is problematic to wait for WPS (which is
unlikely to succeed) if a secret agent shows up in the meantime.
A possible fix would be that when
- WPS is pending
- the secret request already failed
- another secret-agent registers
then the activation (and WPS) is aborted and autoconnect may be tried
again, possibly with secrets provided by the new secret-agent.
However, this patch goes a step further: it always cancels activation
when the secret request fails. That means, WPS only works while the
user is also prompted for a secret. That makes sense to me, because
an action from the user is required. However, without secret prompt,
the user wouldn't be aware of that and is unlikely to press the WPS
push botton.
[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1079672#c33https://github.com/NetworkManager/NetworkManager/pull/216
We already have nm_utils_bin2hexstr() and _nm_utils_bin2hexstr_full().
This is confusing.
- nm_utils_bin2hexstr() is public API of libnm. Also, it has
a last argument @final_len to truncate the string at that
length.
It uses no delimiter and lower-case characters.
- _nm_utils_bin2hexstr_full() does not do any truncation, but
it has options to specify a delimiter, the character case,
and to update a given buffer in-place. Also, like
nm_utils_bin2hexstr() and _nm_utils_bin2hexstr() it can
allocate a new buffer on demand.
- _nm_utils_bin2hexstr() would use ':' as delimiter and make
the case configurable. Also, it would always allocate the returned
buffer.
It's too much and confusing. Drop _nm_utils_bin2hexstr() which is internal
API and just a wrapper around _nm_utils_bin2hexstr_full().
NMAcdManager is a rather simple instance.
It does not need (thread-safe) ref-counting, in fact, having
it ref-counted makes it slighly ugly that we connect a signal,
but never bother to disconnect it (while the ref-counted instance
could outlife the signal subscriber).
We also don't need GObject signals. They have more overhead
and are less type-safe than a regular function pointers. Signals
would make sense, if there could be multiple independent listeners,
but that just doesn't make sense.
Implementing it as a plain struct is less lines of code, and less
runtime over head.
Also drop the possiblitiy to reset the NMAcdManager instance.
It wasn't needed and I think it was buggy because it wouldn't
reset the n-acd instance.
https://github.com/NetworkManager/NetworkManager/pull/213
Kernel complains with:
platform-linux: link: change 10: token: set IPv6 address generation token to ::bbbb
platform-linux: do-change-link[10]: failure changing link: failure 22 (Invalid argument)
if we try to set an IPv6 token in ip_config_merge_and_apply() and we
haven't set accept_ra=1 yet. Since the flag is set when starting
router discovery, ensure that we don't try to set a token before that.
https://github.com/NetworkManager/NetworkManager/pull/214https://bugzilla.redhat.com/show_bug.cgi?id=1560652
The net.connman.iwd.Station interface, unlike the Device interface, can
go away and come back when the device changes modes or goes DOWN and UP.
The GDbusProxy for the interface becomes invalid after the interface
disappeared and reappeared. This would make the IWD backend stop
working after rfkill was used or after suspend/resume (provided the
suspend/resume events are detected, without them everything works and is
really fast too).
Redo the handling of the Powered property changes, corresponding to
device UP state, to get a new GDBusProxy when the Station interface
reappears. Simplify some checks knowing that priv->can_scan implies for
example that the Station interface is present, and that priv->enabled
implies the NM device state is >= DISCONNECTED.
https://github.com/NetworkManager/NetworkManager/pull/211
When there are profiles with wifi.hidden=yes, NetworkManager
will actively scan for these SSIDs. This makes the scan request
(and thus the user) recognizable and trackable.
It seems generally a bad idea to use hidden networks, as they
compromise either the privacy or usablity for the clients.
Log a (rate-limited) warning about this.
In update update_ext_ip_config() we remove from various internal
configurations those addresses and routes that were removed externally
by users.
When the interface is brought down, the kernel automatically removes
routes associated with it and so we should not consider them as
"removed by users".
Instead, keep them so that they can be restored when the interface
comes up again.
device_link_changed() can't use nm_device_update_from_platform_link()
to update the device private fields because the latter overwrites
priv->iface and priv->up, and so the checks below as:
if (info.name[0] && strcmp (priv->iface, info.name) != 0) {
and:
was_up = priv->up;
priv->up = NM_FLAGS_HAS (info.n_ifi_flags, IFF_UP);
...
if (priv->up && !was_up) {
never succeed.
Fixes: d7f7725ae8
Nothing changes practically, as the NMDevice still starts this with
AF_UNSPEC. That is going to change in the following commit.
The ugly part:
priv->concheck_x[0] in few places. I believe we shouldn't be using union
aliasing here, and instead of indexing the v4/v6 arrays by a boolean it
should be an enum. I'm not fixing it here, but I eventually plan to if
this gets an ACK.
This allows us to use the correct DNS server for the particular interface
independent of what the system resolver is configured to use.
The ugly part:
Unfortunately, it is not all that easy. The libc's libresolv.so API does
not provide means for influencing neither interface nor name servers used
for DNS resolving.
Curl can also be compiled with c-ares resolver backend that does provide
the necessary functionality, but it requires and extra library and the
Linux distributions don't seem to enable it. (Fedora doesn't, which is a
good sign we don't have an option of relying on it.)
systemd-resolved does provide everything we need. If we take care to
keep its congfiguration up to date, we can use it to do the resolving on
a particular interface with that interface's DNS configuration. Great!
There's one more problem: Curl doesn't provide callbacks for resolving
host names. It doesn't, however, allow us to pass in the pre-resolved
hostnames in form of an CURLOPT_RESOLVE(3) option. This means we have to
parse the host name out of the URL ourselves. Fair enough I guess...
Before:
"manager: check_if_startup_complete returns FALSE because of eth0"
Now:
"manager: startup complete is waiting for device 'eth0' (autoactivate)"
Also, the logging line is now more a human readable sentence, but still
follows the same pattern as later
"manager: startup complete"
Meaning: grepping for "startup complete" becomes more helpful because
one first finds the reasons why startup-complete is not yet reached,
followed by the moment when it is reached.
src/devices/nm-acd-manager.c:419:31: error: variable 'info' is uninitialized
when used here [-Werror,-Wuninitialized]
nm_utils_inet4_ntop (info->address, NULL),
Also make sure the secrets request callback only send a reply to IWD and
the Connect method return callback executes the device state change to
"disconnected".
state_changed (called when IWD signalled device state change) was
supposed to not change NM device state on connect success or failure and
instead wait for the DBus Connect() method callback but it would
actually still call nm_device_emit_recheck_auto_activate on failure so
refactor state_changed and network_connect_cb to make sure
the state change and nm_device_emit_recheck_auto_activate are only
called from network_connect_cb.
This fixes a race where during a switch from one network to another NM
would immediately start a new activation after state_changed and
network_connect_cb would then handle the Connect failure and mark the
new activation as failed.
When creating the mirror 802.1x connections for IWD 802.1x profiles
set the NM_SETTING_SECRET_FLAG_NOT_SAVED flag on the secrets that
may at some point be requested from our agent. The saved secrets could
not be used anyway because of our use of
NM_SECRET_AGENT_GET_SECRETS_FLAG_REQUEST_NEW in
nm_device_iwd_agent_query. But also try to respect whatever secret
caching policy has been configured in the IWD profile for those secrets,
IWD would be responsible for storing them if it was allowed in the
profile.
In two places stop using g_dbus_proxy_new_* to create whole new
interface proxy objects for net.connman.iwd.Network as this will
normally have a huge overhead compared to asking the ObjectManager
client that we already have in NMIwdManager for those proxies.
dbus-monitor shows that for each network path returned by
GetOrderedNetworks () -- and we call it every 10 or 20 seconds and may
get many dozens of networks back -- gdbus would do the following each
time:
org.freedesktop.DBus.AddMatch("member=PropertiesChanged")
org.freedesktop.DBus.StartServiceByName("net.connman.iwd")
org.freedesktop.DBus.GetNameOwner("net.connman.iwd")
org.freedesktop.DBus.Properties.GetAll("net.connman.iwd.Network")
org.freedesktop.DBus.RemoveMatch("member=PropertiesChanged")
Using these unormalized was wrong all along, but by chance didn't hit
paths that needed normalized connections. This may change if we
actually write in memory connections to /run with the keyfile plugin,
because that one wants them normalized.
This also saves some work, because normalization does boring things for
us, such as adding default ipv4/ipv6/proxy settings everywhere.
On networked boot we need to somehow communicate this to the early boot
machinery. Sadly, no DBus there and we're running in configure-and-quit
mode.
Abusing the state file for this sounds almost reasonable and is
reasonably straightforward thing to do.
libcurl does not allow removing easy-handles from within a curl
callback.
That was already partly avoided for one handle alone. That is, when
a handle completed inside a libcurl callback, it would only invoke the
callback, but not yet delete it. However, that is not enough, because
from within a callback another handle can be cancelled, leading to
the removal of (the other) handle and a crash:
==24572== at 0x40319AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24572== by 0x52DDAE5: Curl_close (url.c:392)
==24572== by 0x52EC02C: curl_easy_cleanup (easy.c:825)
==24572== by 0x5FDCD2: cb_data_free (nm-connectivity.c:215)
==24572== by 0x5FF6DE: nm_connectivity_check_cancel (nm-connectivity.c:585)
==24572== by 0x55F7F9: concheck_handle_complete (nm-device.c:2601)
==24572== by 0x574C12: concheck_cb (nm-device.c:2725)
==24572== by 0x5FD887: cb_data_invoke_callback (nm-connectivity.c:167)
==24572== by 0x5FD959: easy_header_cb (nm-connectivity.c:435)
==24572== by 0x52D73CB: chop_write (sendf.c:612)
==24572== by 0x52D73CB: Curl_client_write (sendf.c:668)
==24572== by 0x52D54ED: Curl_http_readwrite_headers (http.c:3904)
==24572== by 0x52E9EA7: readwrite_data (transfer.c:548)
==24572== by 0x52E9EA7: Curl_readwrite (transfer.c:1161)
==24572== by 0x52F4193: multi_runsingle (multi.c:1915)
==24572== by 0x52F5531: multi_socket (multi.c:2607)
==24572== by 0x52F5804: curl_multi_socket_action (multi.c:2771)
Fix that, by never invoking any callbacks when we are inside a libcurl
callback. Instead, the handle is marked for completion and queued. Later,
we complete all queue handles separately.
While at it, drop the @error argument from NMConnectivityCheckCallback.
It was only used to signal cancellation. Let's instead signal that via
status NM_CONNECTIVITY_CANCELLED.
https://bugzilla.gnome.org/show_bug.cgi?id=797136https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1792745https://bugzilla.opensuse.org/show_bug.cgi?id=1107197https://github.com/NetworkManager/NetworkManager/pull/207
Fixes: d8a31794c8
As we accept addr_family %AF_UNSPEC to detect the address family,
we also need to return it. Just returning the binary address without
the address family makes no sense.
Note that NMSettingEthtool and NMSettingMatch don't have such
functions either.
We have API
nm_connection_get_setting (NMConnection *, GType)
nm_connection_get_setting_by_name (NMConnection *, const char *)
which can be used generically, meaning: the requested setting type
is an argument to the function. That is generally more useful and
flexible.
Don't add API which duplicates existing functionality and is (arguably)
inferiour. Drop it now. This is an ABI/API break for the current development
cycle where the 1.14.0 API is still unstable. Indeed it's already after
1.14-rc1, which is ugly. But it's also unlikely that somebody already uses
this API/ABI and is badly impacted by this change.
Note that nm_connection_get_setting() and nm_connection_get_setting_by_name()
are slightly inconvenient in C still, because they usually require a cast.
We should fix that by changing the return type to "void *". Such
a change may be possibly any time without breaking API/ABI (almost, it'd
be an API change when taking a function pointer without casting).
(cherry picked from commit a10156f516)