At various places we check whether we have an active-connection already.
For example, when activating a new connection in _internal_activate_device(),
we might want to "nm_device_steal_connection()" if the same profile is
already active.
However, the max-state argument was not accurate in several cases. For
the purpose of finding concurrent activations, we don't care about
active-connections that are already in state DEACTIVATING. Only those
that are ACTIVATED or before.
active_connection_find_by_connection() (or how it was called previously) is
a simpler wrapper around active_connection_find(), which accepts a NMConnection
argument, that may or may not be a NMSettingsConnection.
It always passed NM_ACTIVE_CONNECTION_STATE_DEACTIVATING as max_state, and I think
that one of the two callers don't should do that. A later commit will change that.
So, we also want to pass @max_state as argument to active_connection_find_by_connection().
At that point, make active_connection_find_by_connection() identical to
active_connection_find(), with the only exception, that it accepts a
NMConnection and does automatically do the right thing (either lookup by
pointer equality or by uuid).
There are various places where we do an internal activation (with an
internal auth-subject). In several of these places, the
ACTIVATION_REASON is USER_REQUEST.
I think it is wrong to generally abort all internal activations, except
AUTOCONNECT_SLAVES ones. I think, aborting an activation should be
opt-in instead of opt-out.
To me it seems, we want to abort the activation based on the activation
reason. For now, opt-in to EXTERNAL, ASSUME and AUTOCONNECT only. If
there are additional cases where we should abort activation, we should
add yet another reason and not use USER_REQUEST.
Until now, we only really cared about whether a connection was activated
with reason NM_ACTIVATION_REASON_AUTOCONNECT_SLAVES or not.
Now however, we will care about whether a connection was activated via
(genuine) autoconnect by NMPolicy, or external/assume by NMManager.
Add a new reason to distinguish between them.
There is no need to perform a lookup by path. NMSettings is a singleton,
it has the connection exactly iff the connection is linked.
Also add an assertion to double-check that the results agree with
the previous implementation.
Instead of passing the interval for the timeout, let concheck_periodic_schedule_do()
figure it out on its own. It only depends on cur-interval and
cur-basetime.
Additionally, pass now_ns timestamp, because we already made
decisions based on this particular timestamp. We don't want to
re-evalutate the current time but ensure to use the same timestamp.
There is no change in behavior, it just seems nicer this way.
When the device-state changes to "activated", force a connectivity check
right away. Something possibly happened that affected connectivity.
Also, reduce the interval time down to CONCHECK_P_PROBE_INTERVAL to
start probing again.
if device *is* a NM_DEVICE_IWD, then make sure to not pass that to
_nm_device_wifi_request_scan (which asserts on anything else than a
NM_DEVICE_WIFI device).
The crash can be triggered by enabling wifi.backend=iwd and clicking
on the 'select network' item in gnome shell for example. The journal
output looks like this:
NetworkManager[1861]: invalid cast from 'NMDeviceIwd' to 'NMDeviceWifi'
NetworkManager[1861]: **
NetworkManager[1861]: NetworkManager:ERROR:src/devices/wifi/nm-device-wifi.c:1127:_nm_device_wifi_request_scan: assertion failed: ((((__extension__ ({ GTypeInstance *__inst = (GTypeInstance*) ((_obj)); GType __t = ((nm_device_wifi_get_type ())); gboolean __r; if (!__inst) __r = (0); else if (__inst->g_class && __inst->g_class->g_type == __t) __r = (!(0)); else __r = g_type_check_instance_is_a (__inst, __t); __r; })))))
systemd[1]: NetworkManager.service: Main process exited, code=dumped, status=6/ABRT
systemd[1]: NetworkManager.service: Failed with result 'core-dump'.
Fixes: 297d4985abhttps://github.com/NetworkManager/NetworkManager/pull/107
Without this, nm_device_get_type_description() would quite likely
return "ethernet" for NMDeviceVeth types. This is wrong and was
broken recently.
Fixes: 0775602574
During shutdown, we will need to still iterate the main loop
to do a coordinated shutdown. Currently we do not, and we just
exit, leaving a lot of objects hanging.
If we are going to fix that, we need during shutdown tell
NMDBusManager to reject all future operations.
Note that property getters and "GetManagerObjects" call is not
blocked. It continues to work.
Certainly for some operations, we want to allow them to be called even
during shutdown. However, these have to opt-in.
This also fixes an uglyness, where nm_dbus_manager_start() would
get the set-property-handler and the @manager as user-data. However,
NMDBusManager will always outlife NMManager, hence, after NMManager
is destroyed, the user-data would be a dangling pointer. Currently
that is not an issue, because
- we always leak NMManager
- we don't run the mainloop during shutdown
nm_settings_add_connection_dbus() has two callers. One of them is NMManager
during AddAndActivate. In this case, the NMActiveConnection already created
an auth-subject. Re-use it.
Note how creating an auth-subject involves reading procfs to determine
whether the process still exists. This is not about the additional
overhead of that, but about the race where the process could drop
of in the meantime. The calling process might be gone now, and we would
fail creating the auth-subject. There is no need for that, because we
already evaluated all information we need. Quite likely, in the case
of this race, PolicyKit will also determine that the process is gone
and fail authorization too. But that's PolicyKit's decision to make,
not nm_settings_add_connection_dbus()'s.
We cannot just fire off asynchronous actions without keeping a handle
to them. Otherwise, it's impossible for NMManager to know which
asynchronous operations are pending, and more importantly: it cannot
cancel them.
One day, I want that we do a clean shutdown, where NetworkManager stops
all pending operations, and cleans up everything. That implies, that
every operation is cancellable in a timely manner.
Rework pending nm_active_connection_authorize() calls to be tracked in a
list, so that they are still reachable to NMManager. Note that currently
NMManager does not yet try to cancel these operations ever. However, it
would now be possible to do so.
The async authorization request also carries user-data and its result
must always be handled. For example, it might carry a GDBusMethodInvocation
context, which must be returned and freed.
Hence, when cancelling the request, we must always invoke the callback.
Also, when the NMActiveConnection progresses to state disconnected,
automatically abort the authorization request.
Previously, nm_active_connection_authorize() accepts two user-data
pointers for convenience.
nm_active_connection_authorize() has three callers. One only requires
one user-data, one passes two user-data pointers, and one requires
three pointer.
Also, the way how the third passes the user data (via
g_object_set_qdata_full()) is not great.
Let's only use one user-data pointer. We commonly do that, and it's easy
enough to allocate a buffer to pack multiple pointers together.
On RHEL, we don't have NetworkManager-config-connectivity-fedora package.
Hence, the spec files for RHEL differ from upstream in this regard.
The aim is that contrib/rpm's spec file can be used almost as-is for
RHEL, Fedora and possibly other distros. Hence, build the subpackage
conditionally to minimize the difference.
We currently start the bus manager only after the creation of a
NMManager because the NMManager is needed to handle set-property bus
calls. However, objects created by NMManager
(e.g. NMDnsSystemdResolved) need a bus connection and so their
initialization currently fail.
To fix this, split nm_dbus_manager_start() in two parts: first only
create the connection and acquire the bus. After this step the
NMManager can be set up. In the second step, set NMManager as the
set-property handler and start exporting objects on the bus.
Fixes: 297d4985ab
Otherwise, the order is undefined and unstable. If you call
GetManagedObjects() on D-Bus multiple times, it's a very nice
property if the diff is small and not full not noise.
...so that its prototype is compatible with GDestroyNotify:
src/devices/nm-acd-manager.c: In function ‘destroy_address_info’:
/usr/include/glib-2.0/glib/gmem.h:120:31: error: cast between incompatible function types from ‘NAcd * (*)(NAcd *)’ {aka ‘struct NAcd * (*)(struct NAcd *)’} to ‘void (*)(void *)’ [-Werror=cast-function-type]
GDestroyNotify _destroy = (GDestroyNotify) (destroy); \
^
src/devices/nm-acd-manager.c:430:2: note: in expansion of macro ‘g_clear_pointer’
g_clear_pointer (&info->acd, n_acd_free);
^~~~~~~~~~~~~~~
The same change was done upstream, so the subsequent subtree pull of n-acd
won't mess this up.
It can easily happen that connectivity checks take a long time to
complete (up to 20 seconds, when they time out).
So, before, during the first 20 seconds no connectivity checks would
return and bump the periodic interval. That meant, for the first 20
seconds we would each second schedule a periodic check.
Then, the checks start timing out, each one second apart as we scheduled
them. Previously, during each completion of the checks, we would bump
the interval every second.
Fix that two ways:
1) when the timer expires, also check whether there are still uncomplete
periodic checks. If there are, already bump the interval at that point.
2) at the same time, when this happens mark the handle so that when
they later complete, that they no longer cause another increase of the
interval (no-bump).
Now the bumping is done either by the timeout, or by the completion of
the request. Whatever happens first.