Run:
./contrib/scripts/nm-code-format.sh -i
./contrib/scripts/nm-code-format.sh -i
Yes, it needs to run twice because the first run doesn't yet produce the
final result.
Signed-off-by: Antonio Cardace <acardace@redhat.com>
We always build with PolicyKit support enabled, because it has no
additional dependencies, beside some D-Bus calls.
However, in NetworkManager.conf the user could configure
"main.auth-polkit" to disable PolicyKit. However, previously it would
only allow to disable PolicyKit while granting access to all users.
I think it's useful to have an option that disables PolicyKit and grants
access only to root. I think we should not go too far in implementing
our own authorization mechanisms beside PolicyKit (e.g. you cannot
disable PolicyKit and grant access based on group membership of the
user). However, disabling PolicyKit can be useful sometimes, and it's
simple to implement a "root-only" setup.
Note one change is that when NetworkManager now runs without a D-Bus
connection (in initrd), it would deny all non-root requests. Previously
it would grant access. I think there should be little difference in
practice, because if we have no D-Bus we also don't have any requests to
authenticate.
(cherry picked from commit 6d7446e52f)
We no longer add these. If you use Emacs, configure it yourself.
Also, due to our "smart-tab" usage the editor anyway does a subpar
job handling our tabs. However, on the upside every user can choose
whatever tab-width he/she prefers. If "smart-tabs" are used properly
(like we do), every tab-width will work.
No manual changes, just ran commands:
F=($(git grep -l -e '-\*-'))
sed '1 { /\/\* *-\*- *[mM]ode.*\*\/$/d }' -i "${F[@]}"
sed '1,4 { /^\(#\|--\|dnl\) *-\*- [mM]ode/d }' -i "${F[@]}"
Check remaining lines with:
git grep -e '-\*-'
The ultimate purpose of this is to cleanup our files and eventually use
SPDX license identifiers. For that, first get rid of the boilerplate lines.
- drops nm_dispatcher_init(), which was called early in the main loop
and created a proxy synchronously. Instead, the GDBusConnection is
always ready.
- reuse the GDBusConnection of NMDBusManager. This means, we won't even
try from in "initrd" configure-and-quit mode.
- as before, there is no "manager" instance. Instead, the data is stored
in a global variable. That's ok. What is not OK is how the entire
shutdown is handled (calling dispatcher blockingly, not waiting for
requests to complete). That is fixable, but a lot of work. It is
independent of whether we use a manager object or not.
Aquiring the bus early tells systemd that NetworkManager is started.
Do that even before setting up/creating the singletons for NMPlatform
and NMNetns.
This is a trick so that NetworkManager is considered earlier to be started.
But it's right, because we can and should create the D-Bus socket as early as
possible to let other services (that order After=network.target) can already
start too.
Of course, NetworkManager is not yet fully running and it will take a
while longer until it actually replies on D-Bus. But the requests are
not lost and services that talk to NetworkManager that early can in the
meantime to other startup actions.
First of all, NMDBusManager takes the system D-Bus connection synchronously, so we
should avoid API that is asynchronous and first needs to get glib's G_BUS_TYPE_SYSTEM
instance.
Also, the only reason why NMDBusManager might not have a D-Bus connection is in "initrd"
configure-and-quit mode. In that mode we also don't need polkit.
Note that various components (NMFirewallManager, NMAuthManager,
NMConnectivity, etc.pp) all request their own GDBusConnection from
glib's G_BUS_TYPE_SYSTEM singleton.
In the future, let them instead use the D-Bus connection that
NMDBusManager already has.
- NMDBusManager also uses g_bus_get(G_BUS_TYPE_SYSTEM), so in practice this
is just the same GDBusConnection instance.
- if it would not be the same GDBusConnection instance, it would
be more correct/logical to use the one that NMDBusManager uses.
- NMDBusManager already aquired the GDBusConnection synchronously
and it's ready for use. On the other hand, g_bus_get()/g_bus_get_sync()
has the notion that getting the singleton cannot be done without
waiting/blocking. So at least it involves locking or even dispatching
the async reply on D-Bus.
- in "configure-and-quit=initrd" we really don't have D-Bus available.
NMDBusManager should control whether the other components use D-Bus
or not. For example, NMFirewallManager should not ask glib for a
G_BUS_TYPE_SYSTEM singleton only to later find out that it doesn't work.
So if these components would reuse NMDBusManager's GDBusConnection,
then it must have the connection also in regular "configure-and-quit=true"
mode. In this case, we are in late boot and want do connectivity
checking and talk to firewalld.
Only read the keyfile databases once and cache them for the remainder of
the program.
- this avoids the overhead of opening the file over and over again.
- it also avoids the data changing without us expecting it. The state
files are internal and we don't support changing it outside of
NetworkManager. So in the base case we read the same data over
and over. In the worst case, we read different data but are not
interested in handling the changes.
- only write the file when the content changes or before exiting
(normally).
- better log what is happening.
- our state files tend to grow as we don't garbage collect old entries.
Keeping this all in memory might be problematic. However, the right
solution for this is that we come up with some form of garbage
collection so that the state files are reaonsably small to begin with.
Instead of having two functions nm_logging_set_syslog_identifier()
and nm_logging_set_prefix(), merge them.
They must both be called at earliest point and together. No point
in giving them the appearance that they could be called any time.
Rename nm_manager_write_device_state() to
nm_manager_write_device_state_all(), and split out the code to write a
single device state to a new function.
Delay warning about invalid domains until we setup syslog and nm-logging.
Preferably, we don't log anything by directly printing to stdout/stderr.
(cherry picked from commit 4439b6a35d)
Coccinelle:
@@
expression a, b;
@@
-a ? a : b
+a ?: b
Applied with:
spatch --sp-file ternary.cocci --in-place --smpl-spacing --dir .
With some manual adjustments on spots that Cocci didn't catch for
reasons unknown.
Thanks to the marvelous effort of the GNU compiler developer we can now
spare a couple of bits that could be used for more important things,
like this commit message. Standards commitees yet have to catch up.
During shutdown, we will need to still iterate the main loop
to do a coordinated shutdown. Currently we do not, and we just
exit, leaving a lot of objects hanging.
If we are going to fix that, we need during shutdown tell
NMDBusManager to reject all future operations.
Note that property getters and "GetManagerObjects" call is not
blocked. It continues to work.
Certainly for some operations, we want to allow them to be called even
during shutdown. However, these have to opt-in.
This also fixes an uglyness, where nm_dbus_manager_start() would
get the set-property-handler and the @manager as user-data. However,
NMDBusManager will always outlife NMManager, hence, after NMManager
is destroyed, the user-data would be a dangling pointer. Currently
that is not an issue, because
- we always leak NMManager
- we don't run the mainloop during shutdown
We currently start the bus manager only after the creation of a
NMManager because the NMManager is needed to handle set-property bus
calls. However, objects created by NMManager
(e.g. NMDnsSystemdResolved) need a bus connection and so their
initialization currently fail.
To fix this, split nm_dbus_manager_start() in two parts: first only
create the connection and acquire the bus. After this step the
NMManager can be set up. In the second step, set NMManager as the
set-property handler and start exporting objects on the bus.
Fixes: 297d4985ab
It might happen, that connectivitiy is lost only for a moment and
returns soon after. Based on that assumption, when we loose connectivity
we want to have a probe interval where we check for returning
connectivity more frequently.
For that, we handle tracking of the timeouts per-device.
The intervall shall start with 1 seconds, and double the interval time until
the full interval is reached. Actually, due to the implementation, it's unlikely
that we already perform the second check 1 second later. That is because commonly
the first check returns before the one second timeout is reached and bumps the
interval to 2 seconds right away.
Also, we go through extra lengths so that manual connectivity check
delay the periodic checks. By being more smart about that, we can reduce
the number of connectivity checks, but still keeping the promise to
check at least within the requested interval.
The complexity of book keeping the timeouts is remarkable. But I think
it is worth the effort and we should try hard to
- have a connectivity state as accurate as possible. Clearly,
connectivity checking means that we probing, so being more intelligent
about timeout and backoff timers can result in a better connectivity
state. The connectivity state is important because we use it for
the default-route penaly and the GUI indicates bad connectivity.
- be intelligent about avoiding redundant connectivity checks. While
we want to check often to get an accurate connectivity state, we
also want to minimize the number of HTTP requests, in case the
connectivity is established and suppossedly stable.
Also, perform connectivity checks in every state of the device.
Even if a device is disconnected, it still might have connectivity,
for example if the user externally adds an IP address on an unmanaged
device.
https://bugzilla.gnome.org/show_bug.cgi?id=792240
An asynchronous request should either be cancellable or not keep
the target object alive. Preferably both.
Otherwise, it is impossible to do a controlled shutdown when terminating
NetworkManager. Currently, when NetworkManager is about to terminate,
it just quits the mainloop and essentially leaks everything. That is a
bug. If we ever want to fix that, every asynchronous request must be
cancellable in a controlled way (or it must not prevent objects from
getting disposed, where disposing the object automatically cancels the
callback).
Rework the asynchronous request for connectivity check to
- return a handle that can be used to cancel the operation.
Cancelling is optional. The caller may choose to ignore the handle
because the asynchronous operation does not keep the target object
alive. That means, it is still possible to shutdown, by everybody
giving up their reference to the target object. In which case the
callback will be invoked during dispose() of the target object.
- also, the callback will always be invoked exactly once, and never
synchronously from within the asynchronous start call. But during
cancel(), the callback is invoked synchronously from within cancel().
Note that it's only allowed to cancel an action at most once, and
never after the callback is invoked (also not from within the callback
itself).
- also, NMConnectivity already supports a fake handler, in case
connectivity check is disabled via configuration. Hence, reuse
the same code paths also when compiling without --enable-concheck.
That means, instead of having #if WITH_CONCHECK at various callers,
move them into NMConnectivity. The downside is, that if you build
without concheck, there is a small overhead compared to before. The
upside is, we reuse the same code paths when compiling with or without
concheck.
- also, the patch synchronizes the connecitivty states. For example,
previously `nmcli networking connectivity check` would schedule
requests in parallel, and return the accumulated result of the individual
requests.
However, the global connectivity state of the manager might have have
been the same as the answer to the explicit connecitivity check,
because while the answer for the manual check is waiting for all
pending checks to complete, the global connectivity state could
already change. That is just wrong. There are not multiple global
connectivity states at the same time, there is just one. A manual
connectivity check should have the meaning of ensure that the global
state is up to date, but it still should return the global
connectivity state -- not the answers for several connectivity checks
issued in parallel.
This is related to commit b799de281b
(libnm: update property in the manager after connectivity check),
which tries to address a similar problem client side.
Similarly, each device has a connectivity state. While there might
be several connectivity checks per device pending, whenever a check
completes, it can update the per-device state (and return that device
state as result), but the immediate answer of the individual check
might not matter. This is especially the case, when a later request
returns earlier and obsoletes all earlier requests. In that case,
earlier requests return with the result of the currend devices
connectivity state.
This patch cleans up the internal API and gives a better defined behavior
to the user (thus, the simple API which simplifies implementation for the
caller). However, the implementation of getting this API right and properly
handle cancel and destruction of the target object is more complicated and
complex. But this but is not just for the sake of a nicer API. This fixes
actual issues explained above.
Also, get rid of GAsyncResult to track information about the pending request.
Instead, allocate our own handle structure, which ends up to be nicer
because it's strongly typed and has exactly the properties that are
useful to track the request. Also, it gets rid of the awkward
_finish() API by passing the relevant arguments to the callback
directly.
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
The next commit will completely rework NMBusManager and replace
NMExportedObject by a new type NMDBusObject.
Originally, NMDBusObject was added along NMExportedObject to ease
the rework and have compilable, intermediate stages of refactoring. Now,
I think the new name is better, because NMDBusObject is very strongly related
to the bus manager and the old name NMExportedObject didn't make that
clear.
I also slighly prefer the name NMDBusObject over NMBusObject, hence
for consistancy, also rename NMBusManager to NMDBusManager.
This commit only renames the file for a nicer diff in the next commit.
It does not actually update the type name in sources. That will be done
later.
It has almost no callers, and it is a bit of a strange API. Let's
not cache the last accessed value inside NMConfigData. Instead, free
it right after use. It was not reused anyway, it only hangs around
as convenience for the caller.
With address sanitizer, the call to setrlimit() fails by default,
because the core dump would be huge. That could be overwritten via
ASAN_OPTIONS=disable_core=0
But just don't try to enable core-dumps with asan.
NMPlatform, NMRouteManager and NMDefaultRouteManager are singletons
instances. Users of those are for example NMDevice, which registers
to GObject signals of both NMPlatform and NMRouteManager.
Hence, as NMDevice:dispose() disconnects the signal handlers, it must
ensure that those singleton instances live longer then the NMDevice
instance. That is usually accomplished by having users of singleton
instances own a reference to those instances.
For NMDevice that effectively means that it shall own a reference to
several singletons.
NMPlatform, NMRouteManager, and NMDefaultRouteManager are all
per-namespace. In general it doesn't make sense to have more then
one instances of these per name space. Nnote that currently we don't
support multiple namespaces yet. If we will ever support multiple
namespaces, then a NMDevice would have a reference to all of these
manager instances. Hence, introduce a new class NMNetns which bundles
them together.
This moves tracking of connectivity to NMDevice and makes the NMManager
negotiate the best of known connectivity states of devices. The NMConnectivity
singleton handles its own configuration and scheduling of the permission
checks, but otherwise greatly simplifies it.
This will be useful to determine correct metrics for multiple default routes
depending on actual internet connectivity.
The per-device connection checks is not yet exposed on the D-Bus, since they
probably should be per-address-family as well.
Previously, the daemon would just use syslog with LOG_PERROR when run with
--debug option, even when actually configured to log into the journal.
Let's respect the configuration, but preserve the logging to stderr.
main() should pass the same atomic-section-prefix setting to it's
NMConfig instances. Currently both are NULL, but make it a define
to make this explicit.
Also, make static array @default_values const and sanitize value
when setting PROP_ATOMIC_SECTION_PREFIXES property.
The DNS manager and other singletons have the problem that
they are not properly destroyed on exit, that is, we leak
most of the instances. That should be eventually fixed and
all resources/memory should be released.
Anyway, fix the shutdown procedure by adding an explict command
nm_dns_manager_shutdown(). We should not rely on cleanup actions
to take place when the last reference is dropped, because then
we get complex interactions where we must ensure that everybody
drops the references at the right pointer.
Since the previous shutdown action was effectively never performed,
it is not quite clear what we actually want to do on shutdown.
For now, move the code to nm_dns_manager_stop(). We will see if
that is the desired behavior.