- use NMUuid type where appropriate.
- no error handling for generate_duid_from_machine_id().
It cannot fail anymore.
- add thread-safety to generate_duid_from_machine_id() with
double-checked locking.
- use unions for converting the sha256 digest to the target
type.
For testing purpose, it's bad to let nm_utils_stable_id_parse()
directly access nm_utils_get_boot_id_str(). Instead, the function
should have no side-effects.
Since the boot-id is anyway cached, accessing it is cheap. Even
if it likely won't be needed.
Previously, whenever we needed /etc/machine-id we would re-load it
from file. The are 3 downsides of that:
- the smallest downside is the runtime overhead of repeatedly
reading the file and parse it.
- as we read it multiple times, it may change anytime. Most
code in NetworkManager does not expect or handle a change of
the machine-id.
Generally, the admin should make sure that the machine-id is properly
initialized before NetworkManager starts, and not change it. As such,
a change of the machine-id should never happen in practice.
But if it would change, we would get odd behaviors. Note for example
how generate_duid_from_machine_id() already cached the generated DUID
and only read it once.
It's better to pick the machine-id once, and rely to use the same
one for the remainder of the program.
If the admin wants to change the machine-id, NetworkManager must be
restarted as well (in case the admin cares).
Also, as we now only load it once, it makes sense to log an error
(once) when we fail to read the machine-id.
- previously, loading the machine-id could fail each time. And we
have to somehow handle that error. It seems, the best thing what we
anyway can do, is to log an error once and continue with a fake
machine-id. Here we add a fake machine-id based on the secret-key
or the boot-id. Now obtaining a machine-id can no longer fail
and error handling is no longer necessary.
Also, ensure that a machine-id of all zeros is not valid.
Technically, a machine-id is not an RFC 4122 UUID. But it's
the same size, so we also use NMUuid data structure for it.
While at it, also refactor caching of the boot-id and the secret
key. In particular, fix the thread-safety of the double-checked
locking implementations.
In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.
That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.
Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).
Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
The need for this is the following:
"ipv4.dhcp-client-id" can be specified via global connection defaults.
In absence of any configuration in NetworkManager, the default depends
on the DHCP client plugin. In case of "dhclient", the default further
depends on /etc/dhcp.
For "internal" plugin, we may very well want to change the default
client-id to "mac" by universally installing a configuration
snippet
[connection-use-mac-client-id]
ipv4.dhcp-client-id=mac
However, if we the user happens to enable "dhclient" plugin, this also
forces the client-id and overrules configuration from /etc/dhcp. The real
problem is, that dhclient can be configured via means outside of NetworkManager,
so our defaults shall not overwrite defaults from /etc/dhcp.
With the new device spec, we can avoid this issue:
[connection-dhcp-client-id]
match-device=except:dhcp-plugin:dhclient
ipv4.dhcp-client-id=mac
This will be part of the solution for rh#1640494. Note that merely
dropping a configuration snippet is not yet enough. More fixes for
DHCP will follow. Also, bug rh#1640494 may have alternative solutions
as well. The nice part of this new feature is that it is generally
useful for configuring connection defaults and not specifically for
the client-id issue.
Note that this match spec is per-device, although the plugin is selected
globally. That makes some sense, because in the future we may or may not
configure the DHCP plugin per-device or per address family.
https://bugzilla.redhat.com/show_bug.cgi?id=1640494
is_available would recently return true after IWD had disconnected
if a connection was active because it would check that
priv->dbus_station_proxy was non-NULL (i.e. that the DBus interface was
still visible, which it wasn't) but that check would be overridden
if the NMDevice state was activated. Now require priv->dbus_obj to be
non-NULL, which would even be enough on its own although I'm leaving the
previous check there too to catch potential IWD states we don't support
in which priv->dbus_station_proxy is NULL without an active connection.
Sanity check networks received from the Station.GetOrderedNetworks()
DBus method. Duplicates shouldn't happen but the code should be safe
against bogus data received over DBus. There was a recent bug in a
library used by IWD causing occasional duplicates to be returned which
would cause invalid memory accesses reported by valgrind in NM because
g_hash_table_insert would free what we passed as the key.
Watch for connection-removed events and delete corresponding IWD network
configs if found. This mainly changes anything for 802.1X networks
where the deleted NM connections might annoyingly re-appear after a
restart. For PSK networks though it'll make IWD forget the password
which, until now, would be remembered by IWD even if it was removed or
changed in the NM profile, which is a bug.
This is still fragile because we don't handle "connection-updated"
events so the data->mirror_connection pointer for a known network record
may after some time point to an NMSettingsConnection with a different
SSID or security type and there are corner cases where the IWD-side
profile will not be forgotten. At least I'm trying to make sure we
don't crash and don't wrongly remove any IWD profile which could also be
annoying for complicated EAP configs.
Something is possibly wrong with the DBus signal handling if a newly
added KnownNetwork interface already has an entry in
priv->known_networks, but since we handle this case add a warning and
update the GDBusProxy pointer for that existing entry.
interface_added expects mirror_8021x_connection() to return the pointer to
the existing connection if one exists, and NULL on error, rather than
NULL if a conneciton exists. While touching that, add logic to return
specifically a connection with EAP method set to 'external' if one
exists even though this should not affect any other logic we have
currently.
Some internal logic causes the secrets in a connection to be
occasionally moved to NMSettingConnection's priv->system_secrets after a
connection attempt so we need to use nm_act_request_get_secrets to get
them added to the device's settings connection and applied connection if
the PSK is missing during an AP or AdHoc mode activation (in
infrastructure mode we already do secret requests though they're cached
by IWD in most cases).
The common steps for the PSK available and unavailable scenarios is moved
from act_stage2_config to act_set_mode.
This simplifies the code by using g_variant_lookup. In this handler
where we parse more than one property this is probably slower although
the number of string comparisons will be the same.
This flag is more granular in whether to consider the connection
available or not. We probably should never check for the combined
flag NM_DEVICE_CHECK_CON_AVAILABLE_FOR_USER_REQUEST directly, but
always explicitly for the relevant parts.
Also, improve the error message, to indicate whether the device is
strictly unmanaged or whether it could be overruled.
The flags NMDeviceCheckConAvailableFlags and NMDeviceCheckDevAvailableFlags
both control whether a device appears available (either, available in
general, or related to a particular profile).
Also, both flag types strictly increase availability. Meaning: more flags,
more available.
There is some overlap between the flags, however they still have
their own distinct parts.
Improve the mapping from NMDeviceCheckConAvailableFlags to
NMDeviceCheckDevAvailableFlags, by picking exactly the flags
that are relevant.
- fix stopping ppp-manager. Previously, we would take a reference
to priv->ppp_manager to cancel it later. However, deactivate_cleanup()
is called first, which already issues nm_ppp_manager_stop().
Thereby, not using a callback and not waiting for the operation
to complete.
- get rid of this "step" state machine. There are litterally two steps
that need to be performed asynchornously. Instead chain the calls.
- it is now obviously visible, that the async callback never completes
synchronously upon being called (provided that all async operations
that it calls themself have this behavior -- which they should).
Previously nm_ppp_manager_stop() would return a handle which
makes it easy to cancel the operation.
However, sometimes, we may want to cancel an operation based on
an GCancellable. So, extend nm_ppp_manager_stop() to hook it
with a cancellable.
Essentially, move the code from nm-modem.c to nm-ppp-manager-call.c,
where it belongs and where the functionality gets available to every
component.
We should not use GAsyncResult. At least, not for internal API.
It's more cumbersome then helpful, in my opinion. It requires
this awkward async_finish() pattern.
Instead, let the caller pass a suitable callback of the right type.
All operations must be cancellable in a timely manner, otherwise, the objects
hang during shutdown.
Also, get rid of the GAsyncResult pattern. It is more cumbersome than
helpful.
There are still several issues, marked by FIXME comments.
When the client terminates, we really don't care if it exited cleanly,
with an error or killed by a signal. We expect the client to never
exit and so all these situations are equally bad for us. Introduce a
new TERMINATED state instead of reusing existing FAIL or DONE states,
which are set when receiving particular events from the client.
The @was_active flag indicates that we started DHCP on an assumed
connection. The idea is that if DHCP succeeded before, any failure
must be treated like a renewal failure (and so it should start a grace
period) rather than a failure in getting an initial lease (which fails
the IP method).
When we clean up the DHCP instance, the flag must be reset to FALSE,
otherwise it will be potentially considered for other connections.
The effect of a DHCPv6 failure should depend only on current IP state.
This in the analogous of commit bd63d39252 ("dhcp: fix handling of
failure events") for IPv6.
We're hooking the signal on construction, but we only queue a pending
action on reaching UNAVAILABLE state. The signal could fire in between:
<info> [1539282167.9666] manager: (msh0): new 802.11 OLPC Mesh device (/org/freedesktop/NetworkManager/Devices/4)
<info> [1539282168.1440] manager: (wlan0): new 802.11 WiFi device (/org/freedesktop/NetworkManager/Devices/5)
<info> [1539282168.1831] device (msh0): found companion WiFi device wlan0
<warn> [1539282168.2110] device (msh0): remove_pending_action (1): 'waiting-for-companion' not pending
file src/devices/nm-device.c: line 13966 (<dropped>): should not be reached
https://github.com/NetworkManager/NetworkManager/pull/229
Instead of walking through the list all known networks and comparing
name & SSIDs to judge whether a network is an IWD KnownNetwork, look at
the Network.KnownNetwork pre-IWD-0.8 property.
Ensure priv->can_connect is up to date on IWD state changed. If we
exited the function early priv->can_connect would sometimes be wrongly
TRUE and we'd start a new autoconnect too early after IP configuration
had failed for example.
Handle AP mode connections by setting the Mode property on IWD's Device
interface to "ap" (which will make the Station interface go away, the
Powered property -- normally controlled by set_enabled -- to switch to
FALSE and back to TRUE, and then the AccessPoint interface to appear)
and then calling the AccessPoint.Start method. This is all done in the
CONFIG phase in NM. We also attempt to always set Mode back to
"station" and wait for the Station interface to reappear before going to
the NM DISCONNECTED state. All this complicates the code a little.
While making the necessary changes simplify a lot of the checks which
are implied by other things we've checked already, for example
priv->can_scan and priv->can_connect can now only be TRUE when device is
powered up and in station mode (Station interface is present) so we can
skip other checks. Also assume that check_connection_compatible has
been called before other methods are called so we can skip multiple
connection mode checks and checks that a IWD KnownNetwork exsists for
EAP connections.
act_stage1_prepare and act_stage2_config now borrow more code from
nm-device-wifi.c because both backend now handle multiple modes.
Use nm_utils_error_is_cancelled instead of checking
g_error_matches (error, G_IO_ERROR, G_IO_ERROR_CANCELLED), set
consider_is_disposing false. Also use the DBUS_INTERFACE_PROPERTIES
macro.
Make sure we g_variant_unref() the values returned from
g_dbus_proxy_call_finish. In get_ordered_networks_cb also make sure we
don't access the NMDeviceIwd data until after we know the call has not
been cancelled. Switch from _nm_dbus_proxy_call_finish to
g_dbus_proxy_call_finish where we don't care about the variant type
returned.
Only call nm_platform_wifi_get_capabilities and
nm_platform_wifi_get_mode with the wpa_supplicant backend. They're used
to initialize the wireless-capabilities property and to skip creating
NMDevices for interfaces in unknown wifi mode which IWD handles already.
Under valgrind, we cannot create an NAcd instance.
--10916-- WARNING: unhandled amd64-linux syscall: 321
--10916-- You may be able to write your own handler.
--10916-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--10916-- Nevertheless we consider this a bug. Please report
--10916-- it at http://valgrind.org/support/bug_reports.html.
This limitation already poses a problem, because running NetworkManager
under valgrind might fail. However, for tests it doesn't matter and we
can just skip them.
See the logfile at [1], for how NetworkManager first attempts to connect
using WPS (which takes about 30 seconds). However, early on, the user logs
into KDE and a secret agent would register, which possibly could provide
secrets to connect. I think it is problematic to wait for WPS (which is
unlikely to succeed) if a secret agent shows up in the meantime.
A possible fix would be that when
- WPS is pending
- the secret request already failed
- another secret-agent registers
then the activation (and WPS) is aborted and autoconnect may be tried
again, possibly with secrets provided by the new secret-agent.
However, this patch goes a step further: it always cancels activation
when the secret request fails. That means, WPS only works while the
user is also prompted for a secret. That makes sense to me, because
an action from the user is required. However, without secret prompt,
the user wouldn't be aware of that and is unlikely to press the WPS
push botton.
[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1079672#c33https://github.com/NetworkManager/NetworkManager/pull/216