While we request a scan, we are not yet actually scanning. That means, the supplicant's
"scanning" property will only change to TRUE a while after we initiate the scan. It may
even never happen.
We thus need to handle that the request is currently pending and react when the
request completes.
Add nm_utils_invoke_on_timeout() beside nm_utils_invoke_on_idle().
They are fundamentally similar, except one schedules an idle handler
and the other a timeout.
Also, use the current g_main_context_get_thread_default() as context
instead of the singleton instance. That is a change in behavior, but
the only caller of nm_utils_invoke_on_idle() is the daemon, which
doesn't use different main contexts. Anyway, to avoid anybody being
tripped up by this also change the order of arguments. It anyway
seems nicer to first pass the cancellable, and the callback and user
data as last arguments. It's more in line with glib's asynchronous
methods.
Also, in the unlikely case that the cancellable is already cancelled
from the start, always schedule an idle action to complete fast.
- avoid g_assert(). Either we want to gracefully assert (g_return_*()) or we
want to use assertions that are disabled in production builds (nm_assert());
- rename variable s_connection to s_con. This is how variables for this
purpose are commonly called.
The profile's "ipv4.gateway" and "ipv6.gateway" has only one real
purpose: to define the next hop of a static default route.
Usually, when specifying a gateway in this way, the default route from
other addressing methods (like DHCPv4 or IPv6 autoconf) gets ignored.
If you have a WireGuard peer with "AllowedIPs=0.0.0.0/0" and
"wireguard.peer-routes" enabled, NetworkManager would automatically add
a route to the peer. Previously, if the user also set a gateway, that
route was suppressed.
That doesn't feel right. Note that configuring a gateway on a WireGuard
profile is likely to be wrong to begin with. At least, unless you take
otherwise care to avoid routing loops. If you take care, setting a
gateway may work, but it would feel clearer to instead just add an
explicit /0 manual route instead.
Also, note that usually you don't need a gateway anyway. WireGuard is a
Layer 3 (IP) tunnel, where the next hop is alway just the other side of
the tunnel. The next hop has little effect on the routes that you
configure on a WireGuard interface. What however matters is whether a
default route is present or not.
Also, an explicit gateway probably works badly with "ipv[46].ip4-auto-default-route",
because in that case the automatism should add a /0 peer-route route in a
separate routing table. The explicit gateway interferes with that too.
Nonetheless, without this patch it's not obvious why the /0 peer
route gets suppressed when a gateway is set. Don't allow for that, and
always add the peer-route.
Probably the profile's gateway setting is still wrong and causes the
profile not to work. But at least, you see all routes configured, and
it's clearer where the (wrong) default route to the gateway comes from.
When the server is restarted the write to unix socket fails with
EPIPE. In such case, don't fail all the calls in queue; instead, after
a sync of the ovsdb state (through a monitor call), start processing
the queue again, including the call that previously failed.
Add a retry counter to avoid that calls are stuck in the queue forever
in a hypothetical scenario in which the write always fails.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/459
This was previously tracked via a signal "scanning-prohibited".
However, I think it was buggy, because the signal didn't specify
a GSignalAccumulator, so when a NMDeviceOlpcMesh registered a handler,
NMDeviceWifi.scanning_prohibited() was ignored.
In theory, a GObject signal decouples the target and source of the
signal and is more abstract. But more abstraction is worse, if there
is exactly one target who cares about this signal: the OLPC mesh.
And that target is well known at compile time. So, don't pretend that
NMDeviceWifi or NMDeviceOlpcMesh aren't aware that they are together in
this.
Another downside of the signal is that you don't know when scanning gets
unblocked. You can only poll and asked whether it is blocked, but there
was no mechanism how NMDeviceWifi would be notified when scanning is
no longer blocked.
Rework this. Instead, the OLPC mesh explicitly registers and unregisters
its blocking state with nm_device_wifi_scanning_prohibited_track().
It feels better to first parse input arguments before authenticating.
One argument for otherwise would be that we shouldn't reveal any
information about the request before authenticating it. Meaning: every
request (even with invalid arguments) should fail with
permission-denied.
However, I prefer this for minor reasons:
- what makes a valid request is no secret. And if somebody makes an
invalid request, it should fail with invalid-arguments first.
- we possibly can short cut the expensive authentication process, where
we ask PolicyKit.
- by extracting the options variant early and only pass on the SSIDs
array, we handle the encoding of the options array earlier and where
it belongs: closer to the D-Bus request that defines the meaning of
the argument.
Also, change the failure reason to return invalid-argument.
This was first introduced by commit 4ed4b491fa ('2005-12-31 Dan
Williams <dcbw@redhat.com>'), a very long time ago.
It got reworked several times, but I don't think this code makes sense
anymore. So, if nm_platform_wifi_get_quality() returns an error, we
would ignore it for three times, until we would set the strength to the
error code (presumably -1). Why? If we cannot read the strength via
nl80211/WEXT, then we should just keep whatever we got from supplicant.
Drop this.
Also, only accept the percentage if it is in a valid range from 0 to
100%. If the driver (or platform code) gives us numbers out of that
range, we have no idea what their meaning is. In that case, the value
must be fixed in the lower layers, that knows how to convert the value
from the actual meaning to the requested percentage.
In NMSupplicantInterface, we determine whether we currently are scanning
both on the "scanning" supplicant state and the "Scanning" property.
Extend that. If we currently are scanning and are about to clear the
scanning state, then pretend to still scan as long as we are still
initializing BSS instances. What otherwise happens is that we declare
that we finished scanning, but the NMWifiAP instances are not yet ready.
The result is, that `nmcli device wifi` will already start printing the
scan list, when we didn't yet fully process all access points.
Now, _notify_maybe_scanning() will delay switching the scanning state to
disabled, as long as we have BSS initializing (bss_initializing_lst_head).
Also, ignore the "ScanDone" signal. It's redundant to the "Scanning"
property anyway.
Also, only set priv->last_scan_msec when we switch the scanning state
off. That is the right (and only) place where the last-scan timestamp
needs updating.
When starting with a locked modem, it may take some time for the user to
enter the PIN code, leading to the secrets request timing out. In that
case, we want the connection activation to be retried automatically once
the modem is unlocked, which can't be achieved if we propagate the error,
as the device will change state to 'failed'.
This patch ignores the 'no-secrets' error, as it means either the
request has timed out, or the user cancelled the request without
notifying NetworkManager. By doing this, we allow the connection to be
re-activated once the modem is unlocked.
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
When the modem is unlocked externally to NetworkManager, it is kept in
the 'need-auth' state. The current connection activation continues as
if nothing had changed, and the secrets request for the PIN code (which
is no longer necessary) eventually times out. The device state is then
changed to 'failed', meaning there won't be a new try at activating the
default connection automatically.
In order to prevent this, and retry activating the default connection
when the modem gets unlocked, we change the device state to
'deactivating' when we identify the modem has been unlocked externally.
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
If we change the the MTU of an ovs interface only through netlink, the
change could be overridden by ovs-vswitchd at any time when other
interfaces change. Set the MTU also in the ovsdb to prevent such
changes.
Note that if the MTU comes from the connection, we already set the
ovsdb MTU at creation time and so this other update becomes
useless. But it is needed when changing the MTU at runtime (reapply)
or when the MTU comes from a different source (e.g. DHCP).
The ovs-vswitchd.conf.db(5) man page says about the the mtu_request
column in the Interface table:
"Requested MTU (Maximum Transmission Unit) for the interface. A
client can fill this column to change the MTU of an
interface [...] If this is not set and if the interface has
internal type, Open vSwitch will change the MTU to match the
minimum of the other interfaces in the bridge."
Therefore, if the connection specifies a MTU, set it early when adding
the interface to the ovsdb so that it will not be changed to the
minimum of other interfaces.
g_clear_pointer() would always cast the destroy notify function
pointer to GDestroyNotify. That means, it lost some type safety, like
GPtrArray *ptr_arr = ...
g_clear_pointer (&ptr_arr, g_array_unref);
Since glib 2.58 ([1]), g_clear_pointer() is also more type safe. But
this is not used by NetworkManager, because we don't set
GLIB_VERSION_MIN_REQUIRED to 2.58.
[1] f9a9902aac
We have nm_clear_pointer() to avoid this issue for a long time (pre
1.12.0). Possibly we should redefine in our source tree g_clear_pointer()
as nm_clear_pointer(). However, I don't like to patch glib functions
with our own variant. Arguably, we do patch g_clear_error() in
such a manner. But there the point is to make the function inlinable.
Also, nm_clear_pointer() returns a boolean that indicates whether
anything was cleared. That is sometimes useful. I think we should
just consistently use nm_clear_pointer() instead, which does always
the preferable thing.
Replace:
sed 's/\<g_clear_pointer *(\([^;]*\), *\([a-z_A-Z0-9]\+\) *)/nm_clear_pointer (\1, \2)/g' $(git grep -l g_clear_pointer) -i
I think it's preferable to use nm_clear_g_free() instead of
g_clear_pointer(, g_free). The reasons are not very strong,
but I think it is overall preferable to have a shorthand for this
frequently used functionality.
sed 's/\<g_clear_pointer *(\([^;]*\), *\(g_free\) *)/nm_clear_g_free (\1)/g' $(git grep -l g_clear_pointer) -i
Configuration stages like act_stage2_config() can postpone progressing
to the next stage. Currently, when the condition that we wait for gets
satisfied, the code schedules the next stage from there.
I think that is wrong, because when we postpone from act_stage2_config(),
follow up steps of stage2 get skipped. Thus, when we are ready to progress,
the class should enter stage 2 again.
This requires that stage2 becomes reentrant and that the code reenters the
same stage.
We usually want to schedule stage2 when we just completed with the previous
stage (or, if we are currently in stage2, and want to re-enter it).
In those cases, the conditions are often right to just proceed right away.
No need to schedule the stage on an idle handler. Allow to invoke stage2
right away.
There was only API to schedule the stage on an idle handler.
Sometimes, we are just in the right situation to schedule the stage
right away. It should be possibly to avoid going through the extra hop.
For now, none of the caller makes use of this. So, there isn't any
actual change in behavior. But by adding this possibility, we may do
use in the future.
Avoid GDBusProxy, instead use GDBusConnection directly. I very much
prefer this because that way we have explicit control over what happens
on D-Bus. With GDBusProxy this is hidden under another layer of complex
code. The hardest part when using a D-Bus interface is to manage the
state via an asynchronous medium. GDBusProxy contains state about the
D-Bus interface and duplicate the state that we track. This makes it hard
to reason about things.
Rework creation of NMSupplicantInterface. Previously, a NMSupplicantInterface
had multiple initialization states. In particular, the first state would not
yet tie the interface to a certain D-Bus object path. Instead, NMSupplicantInterface
would try and retry to create the D-Bus object.
Now, NMSupplicantManager has an asynchronous method to create interface
instances. The manager only creates an interface instance after the D-Bus
path is known. That means, a NMSupplicantInterface instance is now
strongly tied to a name-owner and D-Bus path.
It follows that the state of NMSupplicantInterface can only go from STARTING,
via the supplicant states, to DOWN. Never back. That was already previously
the case that the state from DOWN was final and once the 3 initial
states were passed, the interface's state would never go back to the initial
state. Now this is more strict and more formalized. The 3 initialization states
are combined.
I think the tighter state handling simplifies users of NMSupplicantInterface.
See for example "nm-device-ethernet.c". It's still complicated, because handling
state is fundamentally difficult.
NMSupplicantManager will take care to D-Bus activate wpa_supplicant only
when necessary (poke). Previously, creating the manager instance
would always start suppliant service. Now, it's started on demand.
Add '_nm_setting_bond_get_option_or_default()' and move all the custom
policies applied by NM for bond options in there.
One such example of a custom policy is to set 'miimon' to 0 (instead of its
default value of 100) if 'arp_interval' is explicitly enabled
and 'miimon' is not.
This means removing every piece of logic from
nm_setting_bond_add_option() which used to clear out 'arp_interval' and
'arp_ip_target' if 'miimon' was set or clear out 'miimon' along with
'downdelay', 'updelay' and 'miimon' if 'arp_interval' was set.
This behaviour is a bug since the kernel allow setting any combination
of this options for bonds and NetworkManager should not limit the user
to do so.
Also use 'set_bond_attr_or_default()' instead of 'set_bond_attr()' as
the former calls '_nm_setting_bond_get_option_or_default()' to implement
the right logic to retrieve bond options according to current bond
configuration.