This is to support the S5 case, where usually the NM process is
stopped. If we are stopping and WoWLAN is set for the interface,
we do not deconfigure it and keep the connection alive so we
can receive packets that will potentially wake up the system.
Note that for this to work, wpa_supplicant needs to be modified too
so it does not deconfigure the wireless interface either when
stopped. The needed patches for wpa_supplicant can be found in
http://lists.infradead.org/pipermail/hostap/2018-June/038644.html
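A minimal sketch of the idea, assuming a hypothetical helper to query the device's Wake-on-WLAN setting (the real NetworkManager code paths differ):

    /* Sketch only: on shutdown, skip deconfiguring a Wi-Fi device that
     * has WoWLAN enabled, so the link stays up to receive wake packets.
     * device_get_wowlan_enabled() and deconfigure_device() are
     * hypothetical helpers, not real NM API. */
    static void
    device_cleanup_on_quit (NMDevice *device)
    {
        if (device_get_wowlan_enabled (device)) {
            /* Leave the interface configured. wpa_supplicant must
             * likewise keep the interface up (see patches above). */
            return;
        }
        deconfigure_device (device);
    }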
We can't use g_steal_pointer(&active) in the argument list if another
argument uses @active because the order of evaluation is not defined.
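To illustrate the hazard (complete() and make_op_data() are made-up names for the example):

    /* C leaves the evaluation order of function arguments unspecified.
     * If g_steal_pointer() is evaluated first, the other argument
     * reads a NULL @active: */
    complete (g_steal_pointer (&active),    /* may NULL out @active... */
              make_op_data (self, active)); /* ...before this reads it. */

    /* Safe: consume @active in a separate statement first. */
    data = make_op_data (self, active);
    complete (g_steal_pointer (&active), data);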
This fixes the following bug:
src/nm-manager.c:511:_async_op_complete_ac_auth_cb: assertion failed: (active == async_op_data->ac_auth.active)
Fixes: f4fc62bad8
https://bugzilla.redhat.com/show_bug.cgi?id=1585494
Coccinelle:
@@
expression a, b;
@@
-a ? a : b
+a ?: b
Applied with:
spatch --sp-file ternary.cocci --in-place --smpl-spacing --dir .
With some manual adjustments in spots that Cocci didn't catch, for
reasons unknown.
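For illustration, the transformation does the following (using the GNU C conditional with an omitted middle operand, which also evaluates the first operand only once; get_name() is a made-up example function):

    /* Before: the first operand is written (and evaluated) twice. */
    const char *name = get_name () ? get_name () : "unknown";

    /* After: GNU C "?:" extension, first operand evaluated once. */
    const char *name = get_name () ?: "unknown";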
Thanks to the marvelous effort of the GNU compiler developers we can now
spare a couple of bits that could be used for more important things,
like this commit message. Standards committees have yet to catch up.
In nm_manager_get_best_device_for_connection(), don't only check whether the first
found active-connection has a device. There might be multiple candidates, in which
case iterate over them and figure out which one is the most suitable.
Also, even though the found @ac has the same settings-connection, that does not
mean that the connection is available on the device. Extend the check and
only return compatible and ready devices. It can easily happen that
the settings-connection was modified in the meantime and is no longer
compatible with the device (despite currently being active on the
device, but with the previous settings).
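Roughly, the shape of the change could look like this sketch; the helper names device_is_compatible_and_ready() and device_is_better() are illustrative, not actual NetworkManager API:

    /* Sketch: iterate over all candidate active-connections instead of
     * stopping at the first one, and only accept a device that is both
     * compatible with the (possibly modified) profile and ready. */
    NMDevice *best = NULL;
    guint i;

    for (i = 0; i < n_candidates; i++) {
        NMDevice *device = nm_active_connection_get_device (candidates[i]);

        if (!device)
            continue;
        if (!device_is_compatible_and_ready (device, sett_conn))
            continue;
        if (!best || device_is_better (device, best))
            best = device;
    }
    return best;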
It is possible that there are multiple such conflicting connections.
Maybe not now, but with connection cardinality there soon will be.
Find and disconnect them all.
nm_device_steal_connection() was a bit misleading. It only had one caller,
and what _internal_activate_device() really wants is to deactivate all
other active-connections for the same connection. Hence, it already
performed a lookup for the active-connection that should be disconnected,
only to then look up the device and tell it to steal the connection.
Note that if existing_ac happens to be neither the queued nor the current
active connection, then previously it would have done nothing. It's
unclear when exactly that can happen; however, we can avoid that
question entirely.
Instead of having steal-connection(), have a disconnect-active-connection().
If there is no matching device, it will just set the active-connection's
state to DISCONNECTED. Which in turn does nothing if the state is
already DISCONNECTED.
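As a hedged sketch, the new helper behaves roughly like this (identifiers are illustrative; the actual function and state names may differ):

    /* Sketch: disconnect an active-connection directly. Without a
     * device, just mark it disconnected, which is a no-op if it
     * already is. set_state_disconnected() and
     * device_disconnect_active_connection() are hypothetical names. */
    static void
    active_connection_disconnect (NMActiveConnection *ac)
    {
        NMDevice *device = nm_active_connection_get_device (ac);

        if (!device) {
            set_state_disconnected (ac);  /* no-op if already disconnected */
            return;
        }
        device_disconnect_active_connection (device, ac);
    }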
At various places we check whether we have an active-connection already.
For example, when activating a new connection in _internal_activate_device(),
we might want to "nm_device_steal_connection()" if the same profile is
already active.
However, the max-state argument was not accurate in several cases. For
the purpose of finding concurrent activations, we don't care about
active-connections that are already in state DEACTIVATING. Only those
that are ACTIVATED or before.
active_connection_find_by_connection() (or whatever it was called previously) is
a simple wrapper around active_connection_find(), which accepts an NMConnection
argument that may or may not be an NMSettingsConnection.
It always passed NM_ACTIVE_CONNECTION_STATE_DEACTIVATING as max_state, and I think
that one of the two callers should not do that. A later commit will change that.
So, we also want to pass @max_state as an argument to active_connection_find_by_connection().
At that point, make active_connection_find_by_connection() identical to
active_connection_find(), with the only exception that it accepts an
NMConnection and automatically does the right thing (either lookup by
pointer equality or by uuid).
There are various places where we do an internal activation (with an
internal auth-subject). In several of these places, the
ACTIVATION_REASON is USER_REQUEST.
I think it is wrong to generally abort all internal activations except
AUTOCONNECT_SLAVES ones. Aborting an activation should be
opt-in instead of opt-out.
It seems to me that we want to abort the activation based on the activation
reason. For now, opt in to EXTERNAL, ASSUME and AUTOCONNECT only. If
there are additional cases where we should abort activation, we should
add yet another reason and not use USER_REQUEST.
Until now, we only really cared about whether a connection was activated
with reason NM_ACTIVATION_REASON_AUTOCONNECT_SLAVES or not.
Now however, we will care about whether a connection was activated via
(genuine) autoconnect by NMPolicy, or external/assume by NMManager.
Add a new reason to distinguish between them.
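As a rough sketch, the set of reasons then distinguishes along these lines (value names are illustrative; see NMActivationReason in the source for the real enum):

    /* Illustrative only, not the literal NetworkManager enum. */
    typedef enum {
        ACTIVATION_REASON_EXTERNAL,           /* externally managed */
        ACTIVATION_REASON_ASSUME,             /* assumed by NMManager */
        ACTIVATION_REASON_AUTOCONNECT,        /* genuine autoconnect via NMPolicy */
        ACTIVATION_REASON_AUTOCONNECT_SLAVES, /* slaves of an autoconnecting master */
        ACTIVATION_REASON_USER_REQUEST,       /* explicit user/API request */
    } ActivationReason;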
During shutdown, we will still need to iterate the main loop
to do a coordinated shutdown. Currently we do not, and we just
exit, leaving a lot of objects hanging.
If we are going to fix that, during shutdown we need to tell
NMDBusManager to reject all future operations.
Note that property getters and the "GetManagedObjects" call are not
blocked. They continue to work.
Certainly for some operations, we want to allow them to be called even
during shutdown. However, these have to opt in.
This also fixes an ugliness where nm_dbus_manager_start() would
get the set-property-handler and the @manager as user-data. However,
NMDBusManager will always outlive NMManager; hence, after NMManager
is destroyed, the user-data would be a dangling pointer. Currently
that is not an issue, because
- we always leak NMManager
- we don't run the mainloop during shutdown
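The rejection could look roughly like this inside the method dispatch (priv->shutting_down and the opt-in flag are illustrative names):

    /* Sketch: once shutting down, reject method calls unless the
     * method explicitly opted in to run during shutdown. */
    if (priv->shutting_down && !method_info->allow_during_shutdown) {
        g_dbus_method_invocation_return_error (invocation,
                                               G_DBUS_ERROR,
                                               G_DBUS_ERROR_FAILED,
                                               "NetworkManager is shutting down");
        return;
    }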
nm_settings_add_connection_dbus() has two callers. One of them is NMManager
during AddAndActivate. In this case, the NMActiveConnection already created
an auth-subject. Re-use it.
Note how creating an auth-subject involves reading procfs to determine
whether the process still exists. This is not about the additional
overhead of that, but about the race where the process could drop
off in the meantime. The calling process might be gone now, and we would
fail creating the auth-subject. There is no need for that, because we
already evaluated all information we need. Quite likely, in the case
of this race, PolicyKit will also determine that the process is gone
and fail authorization too. But that's PolicyKit's decision to make,
not nm_settings_add_connection_dbus()'s.
We cannot just fire off asynchronous actions without keeping a handle
to them. Otherwise, it's impossible for NMManager to know which
asynchronous operations are pending, and more importantly: it cannot
cancel them.
One day, I want us to do a clean shutdown, where NetworkManager stops
all pending operations and cleans up everything. That implies that
every operation is cancellable in a timely manner.
Rework pending nm_active_connection_authorize() calls to be tracked in a
list, so that they remain reachable by NMManager. Note that currently
NMManager does not ever try to cancel these operations. However, it
would now be possible to do so.
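In outline, the tracking can be as simple as the following (the list field name is made up for illustration):

    /* Sketch: keep every pending authorization in a list owned by the
     * manager, adding on start and removing on completion.
     * priv->pending_ac_auths is an illustrative field name. */
    priv->pending_ac_auths = g_slist_prepend (priv->pending_ac_auths, auth_data);

    /* ...and later, in the completion (or future cancel) path: */
    priv->pending_ac_auths = g_slist_remove (priv->pending_ac_auths, auth_data);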
Previously, nm_active_connection_authorize() accepted two user-data
pointers for convenience.
nm_active_connection_authorize() has three callers. One only requires
one user-data pointer, one passes two user-data pointers, and one requires
three.
Also, the way the third caller passes the user data (via
g_object_set_qdata_full()) is not great.
Let's only use one user-data pointer. We commonly do that, and it's easy
enough to allocate a buffer to pack multiple pointers together.
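The packing pattern referred to is the usual GLib one; a minimal sketch with invented struct and field names:

    /* Sketch: pack multiple values into a single user-data pointer. */
    typedef struct {
        gpointer  manager;     /* illustrative fields */
        gpointer  invocation;
        char     *specific_object;
    } AuthorizeData;

    static void
    authorize_done_cb (gpointer user_data)
    {
        AuthorizeData *data = user_data;

        /* ... use data->manager, data->invocation, ... */

        g_free (data->specific_object);
        g_slice_free (AuthorizeData, data);
    }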
If we can't generate a connection and maybe_later is TRUE, it means
that the device can generate/assume connections but it failed for the
moment due to missing master/slaves/addresses. In this case, just
assume the connection from the state file.
https://bugzilla.redhat.com/show_bug.cgi?id=1551958
During _new_active_connection() we just create the NMActiveConnection
instance to proceed with authorization. The caller might not even
authorize, so we must not touch the device yet.
Do that only later.
Often, functions perform a series of steps, and when they fail,
they bail out. It's simpler if the code is structured that way,
so you can read it from top to bottom and, whenever something is
wrong, either return directly or goto a cleanup label at the
bottom.
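A generic sketch of that structure (step_one()/step_two() are hypothetical helpers):

    /* Sketch: linear top-to-bottom flow; each step bails out on failure. */
    static gboolean
    do_operation (GError **error)
    {
        gboolean success = FALSE;
        char *buf = NULL;

        if (!step_one (error))
            return FALSE;   /* nothing to clean up yet: return directly */

        buf = g_malloc (16);
        if (!step_two (buf, error))
            goto out;       /* cleanup needed: jump to the bottom */

        success = TRUE;
    out:
        g_free (buf);
        return success;
    }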
On the D-Bus layer, "no specific object" is represented by "/". We
should normalize such values to NULL early on, and not expect or
handle them later (like during _new_active_connection()).
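The normalization itself is a one-liner, sketched here:

    /* Sketch: treat the D-Bus "no specific object" marker as NULL
     * as early as possible. */
    if (g_strcmp0 (specific_object, "/") == 0)
        specific_object = NULL;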
Merge _new_vpn_active_connection() into _new_active_connection(). It was the
only caller, and it is simpler to have all the code visible in one place.
That also shows that the device argument is ignored and not handled.
Ensure that no device is specified for VPN-type activations.
Also, in _add_and_activate_auth_done(), always steal the connection
from active's user-data. Otherwise, the lifetime of the connection
is extended until active gets destroyed. For example, if we were to leak
active, we would also leak the connection that way.
- pass is-vpn to _new_active_connection(). It is confusing that _new_active_connection()
would re-determine whether a connection is a VPN type, although that was already
established previously. The confusing part is: will they come to the
same result? Why? What if not?
Instead, pass it as an argument and assert that we got it right.
- the check for existing connections used to look at whether there is an existing
active connection of type NMVpnConnection. Instead, what matters is:
  - do we have a connection of type VPN (otherwise, don't even bother
    to search for an existing-ac)
  - is the connection already active?
Checking whether the connection is already active, and asking backwards
whether it's of type NMVpnConnection, is odd, maybe even wrong in
some cases.
- there are only two callers of validate_activation_request(). One of them
may already look up the device before calling the validate function.
Save looking it up again. This is not only an optimization; more importantly,
it feels odd to first look up a device, and then later look it up again. Are
we guaranteed to use the same path? Why? Just avoid that question.
- re-order some error checking for missing device, so that it is clearer.
- use a cleanup attribute to handle the return value and drop the "goto error"
(see the sketch below).
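The cleanup-attribute pattern from the last point, sketched with GLib's g_autoptr/g_autofree (NetworkManager's own gs_* macros work similarly; validate_path() is a made-up check):

    /* Sketch: automatic cleanup replaces the "goto error" unwinding. */
    static char *
    build_path (const char *base)
    {
        g_autoptr(GError) error = NULL;
        g_autofree char *tmp = g_strdup_printf ("%s/active", base);

        if (!validate_path (tmp, &error))
            return NULL;                /* @tmp and @error auto-freed */

        return g_steal_pointer (&tmp);  /* transfer ownership on success */
    }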
"NMSettingsConnectionFlags" was an internal enum. Soon, we will add such
a type in libnm. Avoid the naming conflict by renaming. The "Int" stands
for "internal".
NMAuthChain is not really ref-counted. True, we have an internal ref-counter
to ensure that the instance stays alive while the callback is invoked. However,
the user cannot take additional references, as there is no nm_auth_chain_ref().
When the user wants to get rid of the auth-chain, with the current API it
is important that the callback won't be called after that point. From the
name nm_auth_chain_unref(), it sounds as if there could be multiple references
to the auth-chain, and merely unreferencing the object might not guarantee that
the callback is canceled. However, that is luckily not the case, because
there is no real ref-counting involved here.
Just rename the destroy function to make this clearer.
Essentially, nm_connection_get_path() mirrors nm_dbus_object_get_path().
However, when cloning a simple-connection, the path also gets cloned.
I think this field doesn't belong in NMConnection in the first place,
because NMConnection is not a D-Bus object. NMSettingsConnection (in
core) and NMRemoteConnection (in libnm) are.
Don't use the misleading alias; use nm_dbus_object_get_path()
directly.
NMManager very much cares about changes to the connectivity state
of the device and was therefore listening to notify::connectivity
signals. However, property-changed signals can be suppressed by
g_object_freeze_notify(). That is something we even encourage for
NMDBusObject instances, because the D-Bus glue makes use of the
property-changed notifications and encourages combining multiple
changes by freezing the signal.
Using the property-changed notifications of NMDBusObject instances is
ugly. Don't do that; instead add a dedicated signal.
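A hedged sketch of the difference (the dedicated signal name "connectivity-changed" is illustrative):

    /* Problem: notify::connectivity can be suppressed/coalesced via
     * g_object_freeze_notify(), so transitions may be missed. */
    g_signal_connect (device, "notify::connectivity",
                      G_CALLBACK (connectivity_notify_cb), self);

    /* Fix (sketch): a dedicated signal that always fires on change. */
    g_signal_connect (device, "connectivity-changed",
                      G_CALLBACK (connectivity_changed_cb), self);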
It might happen that connectivity is lost only for a moment and
returns soon after. Based on that assumption, when we lose connectivity
we want to have a probe interval where we check for returning
connectivity more frequently.
For that, we track the timeouts per-device.
The interval starts at 1 second, and the interval time doubles until
the full interval is reached. Actually, due to the implementation, it's unlikely
that we already perform the second check 1 second later. That is because commonly
the first check returns before the one-second timeout is reached and bumps the
interval to 2 seconds right away.
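The backoff itself is straightforward; a minimal sketch:

    /* Sketch: after losing connectivity, probe quickly and double the
     * interval until the configured full interval is reached. */
    static guint
    next_check_interval_sec (guint cur_sec, guint full_interval_sec)
    {
        if (cur_sec == 0)
            return 1;  /* start probing after 1 second */
        return MIN (cur_sec * 2u, full_interval_sec);
    }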
Also, we go to extra lengths so that manual connectivity checks
delay the periodic checks. By being smarter about that, we can reduce
the number of connectivity checks while still keeping the promise to
check at least within the requested interval.
The complexity of bookkeeping the timeouts is remarkable. But I think
it is worth the effort and we should try hard to
- have a connectivity state that is as accurate as possible. Clearly,
connectivity checking means that we are probing, so being more intelligent
about timeouts and backoff timers can result in a better connectivity
state. The connectivity state is important because we use it for
the default-route penalty and the GUI indicates bad connectivity.
- be intelligent about avoiding redundant connectivity checks. While
we want to check often to get an accurate connectivity state, we
also want to minimize the number of HTTP requests, in case the
connectivity is established and supposedly stable.
Also, perform connectivity checks in every state of the device.
Even if a device is disconnected, it still might have connectivity,
for example if the user externally adds an IP address on an unmanaged
device.
https://bugzilla.gnome.org/show_bug.cgi?id=792240
An asynchronous request should either be cancellable or not keep
the target object alive. Preferably both.
Otherwise, it is impossible to do a controlled shutdown when terminating
NetworkManager. Currently, when NetworkManager is about to terminate,
it just quits the mainloop and essentially leaks everything. That is a
bug. If we ever want to fix that, every asynchronous request must be
cancellable in a controlled way (or it must not prevent objects from
getting disposed, where disposing the object automatically cancels the
callback).
Rework the asynchronous request for connectivity check to
- return a handle that can be used to cancel the operation.
Cancelling is optional. The caller may choose to ignore the handle
because the asynchronous operation does not keep the target object
alive. That means it is still possible to shut down, by everybody
giving up their reference to the target object; in that case the
callback will be invoked during dispose() of the target object.
- also, the callback will always be invoked exactly once, and never
synchronously from within the asynchronous start call. But during
cancel(), the callback is invoked synchronously from within cancel().
Note that it's only allowed to cancel an action at most once, and
never after the callback is invoked (also not from within the callback
itself).
- also, NMConnectivity already supports a fake handler, in case
connectivity check is disabled via configuration. Hence, reuse
the same code paths also when compiling without --enable-concheck.
That means, instead of having #if WITH_CONCHECK at various callers,
move them into NMConnectivity. The downside is that if you build
without concheck, there is a small overhead compared to before. The
upside is that we reuse the same code paths when compiling with or without
concheck.
- also, the patch synchronizes the connectivity states. For example,
previously `nmcli networking connectivity check` would schedule
requests in parallel and return the accumulated result of the individual
requests.
However, the global connectivity state of the manager might not have
been the same as the answer to the explicit connectivity check,
because while the answer for the manual check is waiting for all
pending checks to complete, the global connectivity state could
already change. That is just wrong. There are not multiple global
connectivity states at the same time; there is just one. A manual
connectivity check should have the meaning of ensuring that the global
state is up to date, but it still should return the global
connectivity state -- not the answers for several connectivity checks
issued in parallel.
This is related to commit b799de281b
(libnm: update property in the manager after connectivity check),
which tries to address a similar problem client side.
Similarly, each device has a connectivity state. While there might
be several connectivity checks per device pending, whenever a check
completes, it can update the per-device state (and return that device
state as result), but the immediate answer of the individual check
might not matter. This is especially the case when a later request
returns earlier and obsoletes all earlier requests. In that case,
earlier requests return with the current device's
connectivity state.
This patch cleans up the internal API and gives better-defined behavior
to the user (thus a simpler API, which simplifies things for the
caller). However, the implementation of getting this API right and properly
handling cancellation and destruction of the target object is more complicated
and complex. But this is not just for the sake of a nicer API. It fixes
actual issues explained above.
Also, get rid of GAsyncResult for tracking information about the pending request.
Instead, allocate our own handle structure, which ends up being nicer
because it's strongly typed and has exactly the properties that are
useful to track the request. It also gets rid of the awkward
_finish() API by passing the relevant arguments to the callback
directly.
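A hedged sketch of the resulting handle-based contract (names modeled on the description above; the actual signatures in NMConnectivity differ in detail):

    /* Sketch of the contract: start returns a handle; the callback
     * fires exactly once and never synchronously from start; cancel()
     * invokes the callback synchronously, may be called at most once,
     * and never after the callback already ran. */
    typedef struct ConCheckHandle ConCheckHandle;

    typedef void (*ConCheckCallback) (ConCheckHandle *handle,
                                      int             state,  /* result passed directly */
                                      GError         *error,
                                      gpointer        user_data);

    ConCheckHandle *concheck_start  (ConCheckCallback callback,
                                     gpointer         user_data);
    void            concheck_cancel (ConCheckHandle  *handle);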
Specify a reason when creating active connections. The reason will be
used in the next commit to tell whether slaves must be reconnected or
not if a master has autoconnect-slaves=yes.
This allows adjusting the timeout of an existing checkpoint.
The main use case of checkpoints is to have a fail-safe when
configuring the network remotely. By allowing the timeout to be reset,
the user can perform a series of actions and keep bumping the
timeout. That way, the entire series is still guarded by the same
checkpoint, but the user can start with a short timeout and
re-adjust it as they go along.
The libnm API only implements the async form (at least for now).
Sync methods are fundamentally wrong with D-Bus, and a sync form is probably
not needed. Also, follow the glib convention where the async form
doesn't have an _async name suffix. Also, accept a D-Bus path
as argument, not an NMCheckpoint instance. The libnm API should
not be more restricted than the underlying D-Bus API. It would
be cumbersome to require the user to look up the NMCheckpoint
instance first, especially since libnm doesn't provide an efficient
or convenient lookup-by-path method. On the other hand, retrieving
the path from an NMCheckpoint instance is always possible.
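For illustration, a libnm call following that shape might look like the sketch below (the function name is assumed from the description; check the libnm headers for the actual signature):

    /* Sketch: async-only, path-based adjustment of a checkpoint's
     * rollback timeout. Signature assumed, not verified. */
    nm_client_checkpoint_adjust_rollback_timeout (client,
                                                  checkpoint_path,  /* D-Bus path */
                                                  add_timeout_sec,
                                                  cancellable,
                                                  adjust_done_cb,
                                                  user_data);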