No longer use GDBusObjectManagerClient and gdbus-codegen generated classes
for the NMClient cache. Instead, use GDBusConnection directly and a
custom implementation (NMLDBusObject) for caching D-Bus' ObjectManager
data.
CHANGES
-------
- This is a complete rework. I think the previous implementation was
difficult to understand. There were unfixed bugs and nobody understood
the code well enough to fix them. Maybe somebody out there understood the
code, but I certainly did not. At least nobody provided patches to fix those
issues. I do believe that this implementation is more straightforward and
easier to understand. It removes a lot of layers of code. Whether this claim
of simplicity is true, each reader must decide for himself/herself. Note
that it is still fairly complex.
- There was a lingering performance issue with a large number of D-Bus
objects. The patch tries hard to make the implementation scale well. Of
course, when we cache N objects that have N-to-M references to each other,
we are still fundamentally O(N*M) in runtime and memory consumption (with
M being the number of references between objects). But each part should behave
efficiently and scale well.
- Play well with GMainContext. libnm code (NMClient) is generally not
thread safe. However, it should work to use multiple instances in
parallel, as long as each access to an NMClient happens through the caller's
GMainContext. This follows glib's style and effectively allows using NMClient
in a multi-threaded scenario. It implies sticking to one main context
at construction time and ensuring that callbacks are only invoked while
iterating that context. Also, NMClient itself shall never iterate the
caller's context. This also means libnm must never use g_idle_add() or
g_timeout_add(), as those enqueue sources in the g_main_context_default()
context (a minimal sketch of the alternative follows after this list).
- Get the ordering of messages right. All events are consistently enqueued
in a GMainContext and processed strictly in order. For example,
previously "nm-object.c" tried to combine signals and emit them from an
idle handler. That is wrong: signals must be emitted in the right order
and at the moment they happen. Note that when using GInitable's synchronous initialization
to initialize the NMClient instance, NMClient internally still operates fully
asynchronously. In that case NMClient uses an internal main context.
- NMClient takes over most of the functionality. When using D-Bus'
ObjectManager interface, one needs to handle basically the entire state
of the D-Bus interface. That cannot be separated well into distinct
parts, and even if you try, you just end up having closely related code
in different source files. Spreading related code across files does not make it
easier to understand; on the contrary. That means NMClient is
inherently complex, as it contains most of the logic. I think that is
unavoidable, but it's not as bad as it sounds.
- NMClient processes D-Bus messages and state changes in separate steps.
First NMClient unpacks the message (e.g. _dbus_handle_properties_changed()) and
keeps track of the changed data. Then we update the GObject instances
(_dbus_handle_obj_changed_dbus()) without emitting any signals yet. Finally,
we emit all signals and notifications that were collected
(_dbus_handle_changes_commit()). For example, during the initial
GetManagedObjects() reply, NMClient receives a large amount of state at once,
but we first apply all the changes to our GObject instances before
emitting any signals. The result is that signals are always emitted at a moment
when the cache is consistent. The unavoidable downside is that when you receive
a property-changed signal, possibly many other properties have changed
already and more signals are about to be emitted (a schematic sketch of this
commit pattern follows after this list).
- NMDeviceWifi no longer modifies the content of the cache from client side
during poke_wireless_devices_with_rf_status(). The content of the cache
should be determined by D-Bus alone and follow what NetworkManager
service exposes. Local modifications should be avoided.
- This aims to bring no API/ABI change, though it does of course bring
various subtle changes in behavior. Those should all be for the better, but the
goal is not to break any existing clients. This does change internal
(albeit externally visible) API, like dropping the NM_OBJECT_DBUS_OBJECT_MANAGER
property and NMObject no longer implementing GInitableIface and GAsyncInitableIface.
- Some uses of gdbus-codegen classes remain in NMVpnPluginOld, NMVpnServicePlugin
and NMSecretAgentOld. These are independent of NMClient/NMObject and
should be reworked separately.
- While we no longer use generated classes from gdbus-codegen, we don't
need more glue code than before. Previously we also constructed NMPropertiesInfo and
had a large amount of code to propagate properties from NMDBus* to NMObject.
That got completely reworked, but did not fundamentally change: you still need
about the same effort to create the NMLDBusMetaIface. Not using
generated bindings did not make anything worse (which says something about the
usefulness of generated code, at least in the way it was used).
- NMLDBusMetaIface and other meta data are static and immutable. This
avoids copying them around. Also, macros like NML_DBUS_META_PROPERTY_INIT_U()
have compile time checks to ensure the property types match. It's pretty hard
to misuse them, because misuse won't compile.
- The meta data now explicitly encodes the expected D-Bus types and
makes sure never to accept wrong data. That only matters when the
server (accidentally or intentionally) exposes unexpected types on
D-Bus. I don't think that was previously ensured in all cases.
For example, demarshal_generic() only cared about the GObject property
type; it didn't know the expected D-Bus type (see the demarshalling sketch after this list).
- Previously, GDBusObjectManager would sometimes emit warnings (g_log()). Those
probably indicated real bugs. In any case, it prevented us from running CI
with G_DEBUG=fatal-warnings, because there would be just too many
unrelated crashes. Now we log debug messages that can be enabled with
"LIBNM_CLIENT_DEBUG=trace". Some of these messages can also be turned
into g_warning()/g_critical() by setting LIBNM_CLIENT_DEBUG=warning,error.
Together with G_DEBUG=fatal-warnings, this turns them into assertions.
Note that such "assertion failures" might also happen because of a server
bug (or change). Thus these are not ordinary assertions that indicate a bug
in libnm, and they are not armed unless explicitly requested. In our CI we
should now always run with LIBNM_CLIENT_DEBUG=warning,error and
G_DEBUG=fatal-warnings to catch bugs. Note that currently
NetworkManager has bugs in this regard, so enabling this will result in
assertion failures. That should be fixed first.
- Note that this changes the order in which we emit "notify:devices" and
"device-added" signals. I think it makes the most sense to emit
"device-removed" first, then "notify:devices", and finally "device-added"
signals.
This changes behavior compared to commit 52ae28f6e5 ('libnm: queue
added/removed signals and suppress uninitialized notifications'),
but I don't think that users should actually rely on the order. Still,
the new order makes the most sense to me.
- In NetworkManager, profiles can be made invisible to the user by setting
"connection.permissions". Such profiles are hidden by NMClient: they are
omitted from nm_client_get_connections() and no
"connection-added"/"connection-removed" signals are emitted for them.
Note that NMActiveConnection's nm_active_connection_get_connection()
and NMDevice's nm_device_get_available_connections() still expose such
hidden NMRemoteConnection instances. This behavior was preserved.
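Below are three minimal sketches in C illustrating points from the list above.
They are illustrative only and not the actual libnm code; the type and helper
names are made up for the example.
The first sketch shows the GMainContext rule: instead of g_idle_add(), which
always targets g_main_context_default(), an idle source is created and attached
to the context that the client captured at construction time.
```
#include <glib.h>

typedef struct {
    /* the context captured at construction time, e.g. via
     * g_main_context_ref_thread_default() */
    GMainContext *main_context;
} Client;

static gboolean
do_work_cb (gpointer user_data)
{
    /* runs while the caller iterates client->main_context */
    return G_SOURCE_REMOVE;
}

static void
client_schedule_idle (Client *client)
{
    GSource *source;

    /* g_idle_add() would be wrong here: it always attaches the source
     * to g_main_context_default(). */
    source = g_idle_source_new ();
    g_source_set_callback (source, do_work_cb, client, NULL);
    g_source_attach (source, client->main_context);
    g_source_unref (source);
}
```
The second sketch shows the commit pattern for state changes: all instances are
updated first, and only afterwards are the collected notifications emitted, so
every signal handler sees a consistent cache. The real logic lives in NMClient's
_dbus_handle_*() functions; apply_change() here is a hypothetical stand-in.
```
#include <glib-object.h>

typedef struct {
    GObject    *obj;            /* instance whose property changed */
    const char *property_name;  /* assumed to be a property of obj */
    GVariant   *value;
} PendingChange;

/* hypothetical stand-in for updating the instance's internal state;
 * crucially, it does not emit any signal itself */
static void
apply_change (PendingChange *c)
{
    g_object_set_data_full (c->obj, c->property_name,
                            g_variant_ref (c->value),
                            (GDestroyNotify) g_variant_unref);
}

static void
process_changes (GPtrArray *pending /* of PendingChange* */)
{
    guint i;

    /* phase 2: update all instances, emit nothing yet */
    for (i = 0; i < pending->len; i++)
        apply_change (pending->pdata[i]);

    /* phase 3: only now, with the whole cache consistent, emit the
     * collected notifications */
    for (i = 0; i < pending->len; i++) {
        PendingChange *c = pending->pdata[i];

        g_object_notify (c->obj, c->property_name);
    }
}
```
The third sketch shows strict D-Bus type checking when demarshalling: data whose
D-Bus type does not match the expected type is rejected rather than coerced.
```
#include <glib.h>

/* demarshal a property that is expected to have D-Bus type "u" (guint32),
 * rejecting anything else the server might send */
static gboolean
demarshal_u (GVariant *value, guint32 *out_val)
{
    if (!g_variant_is_of_type (value, G_VARIANT_TYPE_UINT32))
        return FALSE;
    *out_val = g_variant_get_uint32 (value);
    return TRUE;
}
```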
NUMBERS
-------
I compared 3 versions of libnm.
[1] 962297f908, current tip of nm-1-20 branch
[2] 4fad8c7c64, current master, immediate parent of this patch
[3] this patch
All tests were done on Fedora 31, x86_64, gcc 9.2.1-1.fc31.
The libraries were built with
$ ./contrib/fedora/rpm/build_clean.sh -g -w test -W debug
Note that RPM build already stripped the library.
---
N1) File size of libnm.so.0.1.0 in bytes. There currently seems to be an issue
on Fedora 31 generating wrong ELF notes. Usually libnm is smaller, but
in these tests it had large (and bogus) ELF notes. Anyway, the point
is to show the relative sizes, so it doesn't matter.
[1] 4075552 (102.7%)
[2] 3969624 (100.0%)
[3] 3705208 ( 93.3%)
---
N2) `size /usr/lib64/libnm.so.0.1.0`:
text data bss dec hex filename
[1] 1314569 (102.0%) 69980 ( 94.8%) 10632 ( 80.4%) 1395181 (101.4%) 1549ed /usr/lib64/libnm.so.0.1.0
[2] 1288410 (100.0%) 73796 (100.0%) 13224 (100.0%) 1375430 (100.0%) 14fcc6 /usr/lib64/libnm.so.0.1.0
[3] 1229066 ( 95.4%) 65248 ( 88.4%) 13400 (101.3%) 1307714 ( 95.1%) 13f442 /usr/lib64/libnm.so.0.1.0
---
N3) Performance test with test-client.py. With checkout of [2], run
```
prepare_checkout() {
rm -rf /tmp/nm-test && \
git checkout -B test 4fad8c7c64 && \
git clean -fdx && \
./autogen.sh --prefix=/tmp/nm-test && \
make -j 5 install && \
make -j 5 check-local-clients-tests-test-client
}
prepare_test() {
NM_TEST_REGENERATE=1 NM_TEST_CLIENT_BUILDDIR="/data/src/NetworkManager" NM_TEST_CLIENT_NMCLI_PATH=/usr/bin/nmcli python3 ./clients/tests/test-client.py -v
}
do_test() {
for i in {1..10}; do
NM_TEST_CLIENT_BUILDDIR="/data/src/NetworkManager" NM_TEST_CLIENT_NMCLI_PATH=/usr/bin/nmcli python3 ./clients/tests/test-client.py -v || return -1
done
echo "done!"
}
prepare_checkout
prepare_test
time do_test
```
[1] real 2m14.497s (101.3%) user 5m26.651s (100.3%) sys 1m40.453s (101.4%)
[2] real 2m12.800s (100.0%) user 5m25.619s (100.0%) sys 1m39.065s (100.0%)
[3] real 1m54.915s ( 86.5%) user 4m18.585s ( 79.4%) sys 1m32.066s ( 92.9%)
---
N4) Performance. Run NetworkManager from build [2] and set up a large number
of profiles (551 profiles and 515 devices, mostly unrealized). This
setup is already at the edge of what NetworkManager currently can
handle. Of course, that is a different issue. Here we just check how
long plain `nmcli` takes on the system.
```
do_cleanup() {
for UUID in $(nmcli -g NAME,UUID connection show | sed -n 's/^xx-c-.*:\([^:]\+\)$/\1/p'); do
nmcli connection delete uuid "$UUID"
done
for DEVICE in $(nmcli -g DEVICE device status | grep '^xx-i-'); do
nmcli device delete "$DEVICE"
done
}
do_setup() {
do_cleanup
for i in {1..30}; do
nmcli connection add type bond autoconnect no con-name xx-c-bond-$i ifname xx-i-bond-$i ipv4.method disabled ipv6.method ignore
for j in $(seq $i 30); do
nmcli connection add type vlan autoconnect no con-name xx-c-vlan-$i-$j vlan.id $j ifname xx-i-vlan-$i-$j vlan.parent xx-i-bond-$i ipv4.method disabled ipv6.method ignore
done
done
systemctl restart NetworkManager.service
sleep 5
}
do_test() {
perf stat -r 50 -B nmcli 1>/dev/null
}
do_test
```
[1]
Performance counter stats for 'nmcli' (50 runs):
456.33 msec task-clock:u # 1.093 CPUs utilized ( +- 0.44% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
5,900 page-faults:u # 0.013 M/sec ( +- 0.02% )
1,408,675,453 cycles:u # 3.087 GHz ( +- 0.48% )
1,594,741,060 instructions:u # 1.13 insn per cycle ( +- 0.02% )
368,744,018 branches:u # 808.061 M/sec ( +- 0.02% )
4,566,058 branch-misses:u # 1.24% of all branches ( +- 0.76% )
0.41761 +- 0.00282 seconds time elapsed ( +- 0.68% )
[2]
Performance counter stats for 'nmcli' (50 runs):
477.99 msec task-clock:u # 1.088 CPUs utilized ( +- 0.36% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
5,948 page-faults:u # 0.012 M/sec ( +- 0.03% )
1,471,133,482 cycles:u # 3.078 GHz ( +- 0.36% )
1,655,275,369 instructions:u # 1.13 insn per cycle ( +- 0.02% )
382,595,152 branches:u # 800.433 M/sec ( +- 0.02% )
4,746,070 branch-misses:u # 1.24% of all branches ( +- 0.49% )
0.43923 +- 0.00242 seconds time elapsed ( +- 0.55% )
[3]
Performance counter stats for 'nmcli' (50 runs):
352.36 msec task-clock:u # 1.027 CPUs utilized ( +- 0.32% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
4,790 page-faults:u # 0.014 M/sec ( +- 0.26% )
1,092,341,186 cycles:u # 3.100 GHz ( +- 0.26% )
1,209,045,283 instructions:u # 1.11 insn per cycle ( +- 0.02% )
281,708,462 branches:u # 799.499 M/sec ( +- 0.01% )
3,101,031 branch-misses:u # 1.10% of all branches ( +- 0.61% )
0.34296 +- 0.00120 seconds time elapsed ( +- 0.35% )
---
N5) same setup as N4), but run `PAGER= /bin/time -v nmcli`:
[1]
Command being timed: "nmcli"
User time (seconds): 0.42
System time (seconds): 0.04
Percent of CPU this job got: 107%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.43
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 34456
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 6128
Voluntary context switches: 1298
Involuntary context switches: 1106
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
[2]
Command being timed: "nmcli"
User time (seconds): 0.44
System time (seconds): 0.04
Percent of CPU this job got: 108%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.44
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 34452
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 6169
Voluntary context switches: 1849
Involuntary context switches: 142
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
[3]
Command being timed: "nmcli"
User time (seconds): 0.32
System time (seconds): 0.02
Percent of CPU this job got: 102%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.34
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 29196
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 5059
Voluntary context switches: 919
Involuntary context switches: 685
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
N6) same setup as N4), but run `nmcli monitor` and look at `ps aux` for
the RSS size.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
[1] me 1492900 21.0 0.2 461348 33248 pts/10 Sl+ 15:02 0:00 nmcli monitor
[2] me 1490721 5.0 0.2 461496 33548 pts/10 Sl+ 15:00 0:00 nmcli monitor
[3] me 1495801 16.5 0.1 459476 28692 pts/10 Sl+ 15:04 0:00 nmcli monitor
gtkdoc-scan supports regular expressions in the --ignore-decorators
command-line option. Since it is easier to use a regexp than grepping
macros from a source file, revert the ugly solution from commit
2d941dc95a ('build: fix errors when building with gtk-doc 1.32').
gtkdoc-scan 1.32 performs stricter checks on structure definitions
and so it complains about:
/build/networkmanager/src/NetworkManager/libnm/./nm-vpn-plugin-old.h:0: warning: partial declaration (struct) : typedef struct {
NM_DEPRECATED_IN_1_2
GObject parent;
} NMVpnPluginOld NM_DEPRECATED_IN_1_2;
because of the unrecognized token 'NM_DEPRECATED_IN_1_2'.
Pass all allowed macros to gtkdoc-scan through the --ignore-decorators
argument.
https://gitlab.gnome.org/GNOME/gtk-doc/issues/98
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/238
In libnm, we prefer opaque typedefs. gtk-doc needs to be patched to properly
generate documentation. Add a check for that.
Add a test. By default, this does not fail but just prints a warning. The test
can be made failing by setting NMTST_CHECK_GTK_DOC=1.
See-also: https://gitlab.gnome.org/GNOME/gtk-doc/merge_requests/2
(cherry picked from commit 02464c052e)
There are two aspects: the public crypto API that is provided by
"nm-crypto.h" header, and the internal header which crypto backends
need to implement. Split them.
Shared libraries built with sanitizers are a bit inconvenient to use
because they require that any application linking to them is run with
libasan preloaded using LD_PRELOAD. This limitation makes the
sanitizer support less useful because applications will refuse to
start unless there is a special environment variable set.
Let's turn the --enable-address-sanitizer configure flag into
--with-address-sanitizer=yes|no|exec so that it is possible to enable
asan only for executables.
We commonly only allow tabs at the beginning of a line, not
afterwards. The reason for this style is that the code
looks formatted correctly with both tabstop=4 and tabstop=8.
It already defaults to the right value. We only need to define
NM_VERSION_MIN_REQUIRED, so that parts of our internal build
can make use of deprecated API.
We don't need to have two version defines "CUR" and "NEXT".
The main purpose of these macros (if not their only one) is to
make the NM_AVAILABLE_IN_* and NM_DEPRECATED_IN_* macros work.
1) At the precise commit of a release, "CUR" and "NEXT" must be identical,
because whenever the user configures NM_VERSION_MIN_REQUIRED and
NM_VERSION_MAX_ALLOWED, then they both compare against the current
version, at which point "CUR" == "NEXT".
2) Every other commit aside from the release is a development version that leads
up to the next release. But as far as versioning is concerned, such
a development version should be treated like that future release. It's unstable
API and it may or may not be close to the later API of the release. But
we shall treat it as that version. Hence, also in this case we want to
set both "NM_VERSION_CUR_STABLE" and "NEXT" to the future version.
This makes NM_VERSION_NEXT_STABLE redundant.
Previously, the separation between current and next version would for
example allow NM_VERSION_CUR_STABLE to be the previously released
stable API, and NM_VERSION_NEXT_STABLE the version of the next upcoming
stable API. So, we could allow "examples" to make use of development
API, while other(?) internal code would still be restricted to the stable API.
But it's unclear which other code would want to avoid the current development API.
Also, points 1) and 2) were badly understood. Note that for our
previous releases, we usually didn't bump the macros at the stable
release (and if we did, we didn't set them to be the same). While using
two macros might be more powerful, it is hard to grok and easy to
forget to bump the macros at the right time. One macro shall suffice.
All this also means that *immediately* after making a new release, we shall
bump the version number in `configure.ac` and "NM_VERSION_CUR_STABLE".
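For illustration, with the single "NM_VERSION_CUR_STABLE" a libnm consumer pins
the targeted API range as in the following sketch. The concrete version values
are just an example; both macros default to the current stable version when
left undefined, and the NM_AVAILABLE_IN_*/NM_DEPRECATED_IN_* warnings are
derived from them.
```
/* select which libnm API range this program targets; must be defined
 * before including the libnm headers */
#define NM_VERSION_MIN_REQUIRED NM_VERSION_1_2
#define NM_VERSION_MAX_ALLOWED  NM_VERSION_1_2

#include <NetworkManager.h>
```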
Although it has been possible to generate distribution tarballs with meson
since version 0.41 by using the `ninja dist` command, autotools does
things differently and ends up creating a different distributable
file.
The meson build files have been added to the autotools build as
distributed files, so the whole meson port would also be
distributed.
https://mail.gnome.org/archives/networkmanager-list/2018-January/msg00047.html
This speeds up the initial object tree load significantly. Also, it
reduces the object management complexity by shifting the duties to
GDBusObjectManager.
The lifetime of all NMObjects is now managed by the NMClient via the
object manager. The NMClient creates the NMObjects for GDBus objects,
triggers the initialization and serves as an object registry (replaces
the nm-cache).
The ObjectManager uses the o.fd.DBus.ObjectManager API to learn of the
object creation, removal and property changes. It takes care of the
property changes so that we don't have to and lets us always see a
consistent object state. Thus at the time we learn of a new object we
already know its properties.
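As a rough sketch of how this GIO API is consumed (not the exact libnm code),
the client creates a GDBusObjectManagerClient for NetworkManager's object tree
(exported at "/org/freedesktop") and listens for objects appearing:
```
#include <gio/gio.h>

static void
object_added_cb (GDBusObjectManager *manager,
                 GDBusObject        *object,
                 gpointer            user_data)
{
    /* the object's interfaces and properties are already known here */
    g_print ("added: %s\n", g_dbus_object_get_object_path (object));
}

static GDBusObjectManager *
create_object_manager (GCancellable *cancellable, GError **error)
{
    GDBusObjectManager *manager;

    manager = g_dbus_object_manager_client_new_for_bus_sync (
        G_BUS_TYPE_SYSTEM,
        G_DBUS_OBJECT_MANAGER_CLIENT_FLAGS_NONE,
        "org.freedesktop.NetworkManager",
        "/org/freedesktop",
        NULL, NULL, NULL,   /* get_proxy_type func, user data, destroy notify */
        cancellable, error);
    if (manager)
        g_signal_connect (manager, "object-added",
                          G_CALLBACK (object_added_cb), NULL);
    return manager;
}
```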
The NMObject unfortunately can't be made synchronously initializable as
the NMRemoteConnection's settings are not managed with standard
o.fd.DBus Properties and ObjectManager APIs and thus are not known to
the ObjectManager. Thus most of the asynchronous object property
changing code in nm-object.c is preserved. The objects notify the
properties that reference them of their initialization from their
init_finish() methods, thus the asynchronously created objects are not
allowed to fail creation (or the dependees would wait forever). Not a
problem -- if a connection can't get its Settings, it's either invisible
or being removed (presumably we'd learn of the removal from the object
manager soon).
The NMObjects can't be created by the object manager itself, since we
can't determine the resulting object type in proxy_type() yet (we can't
tell from the name and can't access the interface list). Therefore the
GDBusObject is coupled with a NMObject later on.
Lastly, now that all the objects are managed by the object manager, the
NMRemoteSettings and NMManager go away when the daemon is stopped. The
complexity of dealing with calls to NMClient that would require any of
the resources that these objects manage (connection or device lists,
etc.) had to be moved to NMClient. The bright side is that this allows
for removal of all the daemon presence tracking from NMObject.
nm-core-types.h and nm-types.h contain the actual definition of types
and gtk-doc won't generate an "Implemented interfaces" section if they
are not included.
https://bugzilla.gnome.org/show_bug.cgi?id=765983
For internal compilation we want to be able to use deprecated
API without warnings.
Define the version min/max macros to effectively disable deprecation
warnings.
However, don't do it via a CFLAGS option in the makefiles; instead, hack it
into "nm-default.h". After all, *every* source file that is compiled
internally needs to include this header first.
This shall contain type definitions, with similar use
to "nm-core-internal.h".
However, it should contain a minimal set, so that we can include this
header in other headers under "src/", without including the whole
"nm-core-internal.h" in headers.
After copying "nm-vpn-plugin-old.*" to "nm-vpn-service-plugin.*",
rename the class and add it to the Makefile.
This will become the new VPN Service API for libnm 1.2. No changes
done yet except renaming of the classes and functions.
Rename the previous classes NMVpnPlugin(Old) to NMVpnServicePlugin
to have a distinct name from NMVpnEditorPlugin. Both are plugins, but
with different uses.
https://bugzilla.gnome.org/show_bug.cgi?id=749951
Add functions nm_utils_enum_to_str() and nm_utils_enum_from_str()
which can be used to perform conversions between enum values and
strings, passing the GType automatically generated for every enum by
glib-mkenums.
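A short usage sketch follows. It assumes the generated nm_device_state_get_type()
GType as an example; the exact string representation depends on the nicks that
glib-mkenums registers.
```
#include <NetworkManager.h>

static void
example (void)
{
    char *str;
    int   value;
    char *bad_word = NULL;

    /* enum value -> string */
    str = nm_utils_enum_to_str (nm_device_state_get_type (),
                                NM_DEVICE_STATE_ACTIVATED);
    g_print ("%s\n", str);  /* e.g. "activated" */
    g_free (str);

    /* string -> enum value; on failure the unrecognized token is returned */
    if (!nm_utils_enum_from_str (nm_device_state_get_type (), "activated",
                                 &value, &bad_word)) {
        g_print ("unknown token: %s\n", bad_word);
        g_free (bad_word);
    }
}
```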
Update the docs build to include and exclude the correct files.
Fill in some missing documentation, and fix problems in the existing
docs. (In particular, "<" can't appear as a literal in documentation,
so change it to "&lt;". Also, "PKCS#12" has to be written as
"PKCS#<!-- -->12", or gtk-doc will think "#12" is a reference to a
type named "12".)
Create NMIPConfig as the parent of NMIP4Config and NMIP6Config, and
remove the two subclasses from the public API; while it's convenient
to still have both internally, they are now identical to the outside
world.
Port libnm-core/libnm to GDBus.
The NetworkManager daemon continues to use dbus-glib; the
previously-added connection hash/variant conversion methods are now
moved to NetworkManagerUtils (along with a few other utilities that
are now only needed by the daemon code).
In preparation for porting to GDBus, make nm_connection_to_dbus(),
etc, represent connections as GVariants of type 'a{sa{sv}}' rather
than as GHashTables-of-GHashTables-of-GValues.
This means we're constantly converting back and forth internally, but
this is just a stepping stone on the way to the full GDBus port, and
all of that code will go away again later.
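For illustration, a minimal sketch with today's libnm API; NM_CONNECTION_SERIALIZE_ALL
requests a full serialization of the connection.
```
#include <NetworkManager.h>

static GVariant *
serialize_connection (NMConnection *connection)
{
    GVariant *dict;

    /* the returned variant has D-Bus type "a{sa{sv}}":
     * setting name -> (property name -> value) */
    dict = nm_connection_to_dbus (connection, NM_CONNECTION_SERIALIZE_ALL);
    if (dict)
        g_print ("%s\n", g_variant_get_type_string (dict));  /* "a{sa{sv}}" */
    return dict;
}
```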
There's not much point in keeping them separate: all existing
libnm-glib-vpn users also link against libnm-glib, and the amount of
extra code added to libnm by merging in libnm-vpn is negligible.
Additionally, nm-vpn-plugin will later need access to some
libnm-internal APIs.
So, merge them together.
Add a header file to expose private utility functions from libnm-core
that can be used by NetworkManager (core) and libnm.so. The header
is also used to give privileged access to libnm-core. Since NM links
statically, these functions are not exported and not part of public ABI.
This also removes the NM_UTILS_PRIVATE_CALL() macro and libnm.so no
longer exports nm_utils_get_private().
Before, this functionality was partly declared in nm-utils-private.h.
This was wrong because nm-utils-private.h is for functionality
entirely private to libnm-core.
Signed-off-by: Thomas Haller <thaller@redhat.com>