Note the special error codes NM_UTILS_ERROR_CONNECTION_AVAILABLE_*.
This will be used to determine, whether the profile is fundamentally
incompatible with the device, or whether just some other properties
mismatch. That information will be importand during a plain `nmcli
connection up`, where NetworkManager searches all devices for a device
to activate. If no device is found (and multiple errors happened),
we want to show the error that is most likely relevant for the user.
Also note, how NMDevice's check_connection_compatible() uses the new
class field "device_class->connection_type_check_compatible" to simplify
checks for compatible profiles.
The error reason is still unused.
It seems to me the NM_DEVICE_CLASS_DECLARE_TYPES() macro confuses more
than helping. Let's explicitly initialize the two fields, albeit with
another helper macro NM_DEVICE_DEFINE_LINK_TYPES() to get the list of
link-types right.
For consistency, also leave nop-lines like
device_class->connection_type_supported = NULL;
device_class->link_types = NM_DEVICE_DEFINE_LINK_TYPES ();
because all NMDevice class init methods should have this same
boiler plate code and to make it explicit that this is intended.
And there are only 3 occurences where this actually comes into play.
The majority of device implementations name their parent-class variable
"device_class". That also makes more sense as it is more consistant.
E.g. "parent" sounds like it's the direct parent, but that is not
the crucial point here. The crucial point at this place, is that we
access the NMDeviceClass typed pointer. Rename.
It is safer to enable send-sci by default because, at the cost of
8-byte overhead, it makes MACsec work over bridges (note that kernel
also enables it by default). While at it, also make the option
configurable.
https://bugzilla.redhat.com/show_bug.cgi?id=1588041
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
The change doesn't really make a difference. I thought it would, so I
did it. But turns out (as the code correctly assumes), while the
notifications are frozen, it's OK to leave the property still in an
inconsistent state while emitting the notify signal.
Still, it feels slightly more correct this way, so keep the change.
The number of authentication retires is useful also for passwords aside
802-1x settings. For example, src/devices/wifi/nm-device-wifi.c also has
a retry counter and uses a hard-coded value of 3.
Move the setting, so that it can be used in general. Although it is still
not implemented for other settings.
This is an API and ABI break.
Since commit 4a6fd0e83e (device: honor the
connection.autoconnect-retries for 802.1X) and the related bug bgo#723084,
we reuse the autoconnect-retries setting to control the retry count
for requesting passwords.
I think that is wrong. These are two different settings, we should not
reuse the autoconnect retry counter while the device is still active.
For example, the user might wish to set autoconnect-retries to infinity
(zero). In that case, we would retry indefinitly to request a password.
That could be problematic, if there is a different issue with the
connection, that makes it appear tha the password is wrong.
A full re-activation might succeed, but we would never stop retrying
to authenticate. Instead, we should have two different settings for
retrying to authenticate and to autoconnect.
This is a change in behavior compared to 1.8.
Names like
- nm_settings_connection_get_autoconnect_retries
- nm_settings_connection_set_autoconnect_retries
- nm_settings_connection_reset_autoconnect_retries
are about the same thing, but they are cumbersome to grep
because they share not a common prefix.
Rename them from SUBJECT_VERB_OBJECT to SUBJECT_OBJECT_VERB,
which sounds odd in English, but seems preferred to me.
Now you can grep for "nm_settings_connection_autoconnect_retries_" to
get all accessors of the retry count, or "nm_settings_connection_autoconnect_"
to get all accessors related to autoconnect in general.
- clearify in the manual page that setting retry to 1 means to try
once, without retry.
- log the initially set retry value in nm_settings_connection_get_autoconnect_retries().
- use nm_settings_connection_get_autoconnect_retries() in
nm_settings_connection_can_autoconnect().
At startup the manager tries to create virtual devices without a
specific order and spits warnings when a device can't be realized
because the parent device is not yet created. These failures are not
something the user should worry about because the creation will be
retried when the parent appears.
A better approach is to return an error code from the device's
create_and_realize() telling that it failed because the parent doesn't
exist. In this way, the manager knows that the device isn't ready and
can avoid printing warning messages.
Change the output of nm_platform_error_to_string() to print the numeric value.
Also, accept a string buffer instead of using an alloca() allocated buffer.
There is still a macro to provide the previous functionality, but it
was ill-suited to call from inside a loop.
Reduce the use of NM_PLATFORM_GET / nm_platform_get() to get
the platform singleton instance.
For one, this is a step towards supporting namespaces, where we need
to use different NMNetns/NMPlatform instances depending on in which
namespace the device lives.
Also, we should reduce our use of singletons. They are difficult to
coordinate on shutdown. Instead there should be a clear order of
dependencies, expressed by owning a reference to those singelton
instances. We already own a reference to the platform singelton,
so use it and avoid NM_PLATFORM_GET.
(cherry picked from commit 94d9ee129d)
The state-change of a device has a reason argument, which is mostly for information
only.
There are many places in code that are the source of a state-reason.
Mostly these are calls to:
- nm_device_state_changed()
- nm_device_queue_state()
- nm_device_queue_recheck_available()
- nm_device_set_unmanaged_by_*()
- nm_device_master_release_one_slave()
- nm_device_ip_method_failed()
- nm_modem_emit_prepare_result()
- nm_modem_emit_ppp_failed()
- nm_manager_deactivate_connection()
- NM_SET_OUT (out_failure_reason, NM_DEVICE_STATE_REASON_*);
However, there are a few places in code that look at the reason
to decide how to proceed. I think this is a bad pattern, because
cause and effect are decoupled and it gets hard to understand where
a certain reason is set and what consequences that has.
Add a nop-function nm_device_state_reason_check() to mark all uses
of the device state reason that derive decisions from it. That is,
highlight the "effect" part.
This argument is only relevant when the NMActStageReturn argument
indicates NM_ACT_STAGE_RETURN_FAILURE. In all other cases it is ignored.
Rename the argument to make the meaning clearer. The argument is passed
through several layers of code, it isn't obvious that this argument only
matters for the failure case. Also, the distinct name makes it easier
to distinguish from other uses of the "reason" name.
While at it, do some drive-by cleanup:
- use g_return_*() instead of g_assert() to have a more graceful
assertion.
- functions like dhcp4_start() don't need to return a failure reason.
Most callers don't care, and the caller who does can determine the
proper reason.
- allow omitting the out-argument via NM_SET_OUT().
NMDeviceEthernet and NMDeviceMacsec implement their own retry policy
for connection using 802.1X, and consider the credentials wrong when
the authentication fails for 3 times. In such case, they also disable
autoconnection for the device by setting the state reason NO_SECRETS.
This means that it's not possible at the moment to choose how many
times the authentication will be retried since they don't use the
standard reconnection logic.
Change NMDeviceEthernet and NMDeviceMacsec to use the number of
retries from connection.autoconnect-retries instead of a hardcoded
value to decide how many times the authentication must be restarted.
Use the per-connection authentication timeout for 802.1X Ethernet,
MACsec and Wi-Fi connections. In case the value is not defined, fall
back to the global one.
Instead of having a NM_SUPPLICANT_INTERFACE_CONNECTION_ERROR signal to notify
about failures during AddNetwork/SelectNetwork, accept a callback to report
success/failure.
Thereby, rename nm_supplicant_interface_set_config() to
nm_supplicant_interface_assoc().
The async callback is guaranteed to:
- be invoked exactly once, signalling success or failure
- always being invoked asyncronously.
The pending request can be (synchronously) cancelled via
nm_supplicant_interface_disconnect() or by disposing the
interface instance. In those cases the callback will be invoked
too, with error code cancelled/disposing.
Also change the signature of the NM_SUPPLICANT_INTERFACE_STATE signal,
to have three "int" type arguments. Thereby also fix the subscribers
to this signal that wrongly had type guint32, instead of guint
(which happens to be the same underlying type, so no real problem).
https://mail.gnome.org/archives/networkmanager-list/2017-February/msg00021.html
Add code to nm-device-macsec.c to support the creation of macsec
connection. Most of the code for controlling wpa_supplicant is copied
from nm-device-ethernet.c and probably could be consolidated in some
ways.