NMConnection is an interface, which is implemented by the types
NMSimpleConnection (libnm-core), NMSettingsConnection (src) and
NMRemoteConnection (libnm).
NMSettingsConnection does a lot of things already:
1) it "is-a" NMDBusObject and exports the API of a connection profile
on D-Bus
2) it interacts with NMSettings and contains functionality
for tracking the profiles.
3) it is the base-class of types like NMSKeyfileConnection and
NMIfcfgConnection. These handle how the profile is persisted
on disk.
4) it implements NMConnection interface, to itself track the
settings of the profile.
3) and 4) would be better implemented via delegation than inheritance.
Address 4) and don't let NMSettingsConnection implemente the NMConnection
interface. Instead, a settings-connection references now a NMSimpleConnection
instance, to which it delegates for keeping the actual profiles.
Advantages:
- by delegating, there is a clearer separation of what
NMSettingsConnection does. For example, in C we often required
casts from NMSettingsConnection to NMConnection. NMConnection
is a very trivial object with very little logic. When we have
a NMConnection instance at hand, it's good to know that it is
*only* that simple instead of also being an entire
NMSettingsConnection instance.
The main purpose of this patch is to simplify the code by separating
the NMConnection from the NMSettingsConnection. We should generally
be aware whether we handle a NMSettingsConnection or a trivial
NMConnection instance. Now, because NMSettingsConnection no longer
"is-a" NMConnection, this distinction is apparent.
- NMConnection is implemented as an interface and we create
NMSimpleConnection instances whenever we need a real instance.
In GLib, interfaces have a performance overhead, that we needlessly
pay all the time. With this change, we no longer require
NMConnection to be an interface. Thus, in the future we could compile
a version of libnm-core for the daemon, where NMConnection is not an
interface but a GObject implementation akin to NMSimpleConnection.
- In the previous implementation, we cannot treat NMConnection immutable
and copy-on-write.
For example, when NMDevice needs a snapshot of the activated
profile as applied-connection, all it can do is clone the entire
NMSettingsConnection as a NMSimpleConnection.
Likewise, when we get a NMConnection instance and want to keep
a reference to it, we cannot do that, because we never know
who also references and modifies the instance.
By separating NMSettingsConnection we could in the future have
NMConnection immutable and copy-on-write, to avoid all unnecessary
clones.
Specify a reason when creating active connections. The reason will be
used in the next commit to tell whether slaves must be reconnected or
not if a master has autoconnect-slaves=yes.
- indent with spaces for wrapped line
- drop g_return_val_if_fail(). It really shouldn't happen,
and if it does, we can just continue and will hit g_critical()
warnings below, when using the variables.
- get the NMActRequest instance first, and from there the applied
and settings connection. Not the other way around.
nm_device_get_settings_connection() and nm_device_get_applied_connection()
are just convenience functions to get the fields from
act_request.
This allows to adjust the timeout of an existing checkpoint.
The main usecase of checkpoints, is to have a fail-safe when
configuring the network remotely. By allowing to reset the timeout,
the user can perform a series of actions, and keep bumping the
timeout. That way, the entire series is still guarded by the same
checkpoint, but the user can start with short timeout, and
re-adjust the timeout as he goes along.
The libnm API only implements the async form (at least for now).
Sync methods are fundamentally wrong with D-Bus, and it's probably
not needed. Also, follow glib convenction, where the async form
doesn't have the _async name suffix. Also, accept a D-Bus path
as argument, not a NMCheckpoint instance. The libnm API should
not be more restricted than the underlying D-Bus API. It would
be cumbersome to require the user to lookup the NMCheckpoint
instance first, especially since libnm doesn't provide an efficient
or convenient lookup-by-path method. On the other hand, retrieving
the path from a NMCheckpoint instance is always possible.
Introduce a new flag NM_CHECKPOINT_CREATE_FLAG_ALLOW_OVERLAPPING
that allows the creation of overlapping checkpoints. Before, and
by default, checkpoints that reference a same device conflict,
and creating such a checkpoint failed.
Now, allow this. But during rollback automatically destroy all
overlapping checkpoints that were created after the checkpoint
that is about to rollback.
With this, you can create a series of checkpoints, and rollback them
individually. With the restriction, that if you once rolled back to an
older checkpoint, you no longer can roll"forward" to a younger one.
What this implies and what is new here, is that the checkpoint might be
automatically destroyed by NetworkManager before the timeout expires. When
the user later would try to manually destroy/rollback such a checkpoint, it
would fail because the checkpoint no longer exists.
We already do error checking in nm_checkpoint_manager_create(). No need
to split it in two places. Let all error conditions be handled by
nm_checkpoint_manager_create() first, and then once we decide all is
good, nm_checkpoint_new() can no longer fail.
Instead of scheduling one timeout only, let each checkpoint instance
individually schedule a timeout. This has some overhead, but glib
is supposed to make scheduling many timers efficient. Otherwise,
glib should be fixed.
This simplifies in my opinion the code, because it's up to each
checkpoint to maintain its own timeout.
Later we will also add a AdjustRollbackTimeout operation, which
allow to reschedule the timeout. It also seems slightly simpler,
if scheduling of the timeout is done by the NMCheckpoint instance
itself.
We don't need an external CheckpointItem, just to wrap the
CList instance. Embed it directly in NMCheckpoint.
Sure, that exposes the checkpoints_lst field in the (internal)
header file, hiding the private member less.
I find it slightly nicer and explict. Also, the list elements
are strictly speaking private, we should better not explicitly
use them outside of NMManager/NMDevice. The macro hides this.
Instead of using a GSList for tracking the devices, use a CList.
I think a CList is in most cases the more suitable data structure
then GSList:
- you can find out in O(1) whether the object is linked. That
is nice, for example to assert in NMDevice's destructor that
the object was unlinked, and we will use that later in
nm_manager_get_device_by_path().
- you can unlink the element in O(1) and you can unlink the
element without having access to the link's head
- Contrary to GSList, this does not require an extra slice
allocation for the link node. It quite possibliy consumes
slightly less memory because the CList structure is embedded
in a struct that we already allocate. Even if slice allocation
would be perfect to only consume 2*sizeof(gpointer) for the link
note, it would at most be as-good as CList. Quite possibly,
there is an overhead though.
- CList possibly has better memory locality, because the link
structure and the data are close to each other.
Something which could be seen as disavantage, is that with CList
one device can only be tracked in one NMManager instance at a time.
But that is fine. There exists only one NMManager instance for now,
and even if we would ever introduce multiple managers, we probably
would not associate one NMDevice instance with multiple managers.
The advantages are arguably not huge, but CList is IMHO clearly the
more suited data structure. No need to stick to a suboptimal data
structure for the job. Refactor it.
I dislike the static hash table to cache the integer counter for
numbered paths. Let's instead cache the counter at the class instance
itself -- since the class contains the information how the export
path should be exported.
However, we cannot use a plain integer field inside the class structure,
because the class is copied between derived classes. For example,
NMDeviceEthernet and NMDeviceBridge both get a copy of the NMDeviceClass
instance. Hence, the class doesn't contain the counter directly, but
a pointer to one counter that can be shared between sibling classes.
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
We commonly only allow tabs at the beginning of a line, not
afterwards. The reason for this style is so that the code
looks formated right with tabstop=4 and tabstop=8.
commit involves more then just replacing the setting and writing them
out. What? Dunno. It's complex.
But let's not bypass the commit-changes function. That one is supposed
to get it right.
Using CList, we embed the list element in NMActiveConnection struct
itself. That means for example, that you couldn't track a
NMActiveConnection more then once. But we anyway never want that.
The advantage is, that removing an active connection from the list
is O(1), and we safe additional GSlice allocations for each node
element.
We also do this for libnm, where it causes visible changes
in behavior. But if somebody would rely on the hashing implementation
for hash tables, it would be seriously flawed.
GHashTable optimizes a NULL equality function to use direct pointer
comparison. That saves the overhead of calling g_direct_equal().
This is also documented behavior for g_hash_table_new().
While at it, also don't pass g_direct_hash() but use the default
of %NULL. The behavior is the same, but consistently don't use
g_direct_hash().
During write, it can regularly happen that the connection gets modified.
For example, keyfile never writes blobs as-is, it always writes the
blob to an external file, and replaces the certificate property with
a path.
Other reasons could be just bugs, where the reader and writer are not doing
a proper round trip (these cases should be fixed).
Refactor commit_changes(), to return the re-read connection to
the settings-connection class, and handle replacing the settings
there.
Also, prepare for another change. Sometimes we first call replace_settings()
followed by commit_changes(). It would be better to instead call commit_changes()
first, and only on success proceed with replace_settings(). Hence, commit_changes()
gets a new argument new_connection, that can be used to write another
connection to disk.
Replace the usage of g_str_hash() with our own nm_str_hash().
GLib's g_str_hash() uses djb2 hashing function, just like we
do at the moment. The only difference is, that we use a diffrent
seed value.
Note, that we initialize the hash seed with random data (by calling
getrandom() or reading /dev/urandom). That is a change compared to
before.
This change of the hashing function and accessing the random pool
might be undesired for libnm/libnm-core. Hence, the change is not
done there as it possibly changes behavior for public API. Maybe
we should do that later though.
At this point, there isn't much of a change. This patch becomes
interesting, if we decide to use a different hashing algorithm.
Since commit 0922a17738 ("manager: avoid that auto-activations
preempt user activations") the manager doesn't allow a internal
activation to disconnect the same connection already active on the
device. Thus, during a rollback we must ensure that the device is
deactivated before.
Fixes: 0922a17738
On rollback, before updating the settings-connection check if it
actually changed. Also, only reactivate the connection if it was
deactivated or if the settings/applied connection changed.
https://bugzilla.redhat.com/show_bug.cgi?id=1427187
We call these functions a lot. A GSList is just the wrong tool for the
job. Refactor the code to use instead a sorted array everywhere.
This means, we malloc() one array for all connections instead
slice-allocate a GSList item for each. Also, sorting an array
is faster then sorting a GSList.
Technically, the GSList implementation had the same big-O runtime
complexity, but using an array is still faster. That is, sorting
an array and a GSList is both O(n*log(n)).
Actually, nm_settings_get_connections_sorted() used
g_slist_insert_sorted() instead of g_slist_sort(). That results
in O(n^2). That could have been fixed to have O(n*log(n)), but
instead refactor the code to use an array.
This makes it easier to install the files with proper names.
Also, it makes the makefile rules slightly simpler.
Lastly, the documentation is now generated into docs/api, which makes it
possible to get rid of the awkward relative file names in docbook.
Keep the include paths clean and separate. We use directories to group source
files together. That makes sense (I guess), but then we should use this
grouping also when including files. Thus require to #include files with their
path relative to "src/".
Also, we build various artifacts from the "src/" tree. Instead of having
individual CFLAGS for each artifact in Makefile.am, the CFLAGS should be
unified. Previously, the CFLAGS for each artifact differ and are inconsistent
in which paths they add to the search path. Fix the inconsistency by just
don't add the paths at all.
When a global checkpoint is created (one with empty device list) we
save the status of all devices to restore it later. After the
checkpoint new interfaces and connections may appear and they can
significantly influence the overall networking status, but we don't
consider them at the moment.
Introduce a new flag DELETE_NEW_CONNECTIONS to delete any connection
added after the checkpoint and similarly a DISCONNECT_NEW_DEVICES to
ensure that the connection active on newly appeared devices doesn't
disrupt network connectivity.
https://bugzilla.redhat.com/show_bug.cgi?id=1378393
- use _NM_GET_PRIVATE() and _NM_GET_PRIVATE_PTR() everywhere.
- reorder statements, to have GObject related functions (init, dispose,
constructed) at the bottom of each file and in a consistent order w.r.t.
each other.
- unify whitespaces in signal and properties declarations.
- use NM_GOBJECT_PROPERTIES_DEFINE() and _notify()
- drop unused signal slots in class structures
- drop unused header files for device factories
First, consider all devices and not only realized and managed ones
when an empty list is passed. Also, move the list evaluation to the
checkpoint manager, since the check for device conflicts is done
there.
Fixes: 3e09aed2a0
In order to better restore the previous system state, allow the
inclusion of unmanaged devices in a checkpoint and try to revert to
the old state taking also the realized/managed state into account.