We don't need a separate "GSList *chains" to track the NMAuthChain
requests for the agents. Every agent should only have one auth-chain in
fly at any time. We can attach that NMAuthChain to the secret-agent.
Also, fix a race where:
1) A secret agent registers. We would start an auth-chain check, but not
yet track the secret agent.
2) Then the secret agent unregisters. The unregistration request will fail,
because the secret agent is not yet in the list of fully registered agents.
The same happens if the secret agent disconnects at this point.
agent_disconnect_cb() would not find the secret agent to remove.
3) afterwards, authentication completes and we register the
secret-agent, although we should not.
There is also another race: if we get authority_changed_cb() we would
not restart the authentication for the secret-agent that is still
registering. Hence, we don't know whether the result once it completes
would already contain the latest state.
Don't access the singleton getter here. Pass the agent-manager argument
instead to maybe_remove_agent_on_error().
Also, don't lookup the agent by name. We already know, whether the agent
is still tracked or not. Look at agent->agent_lst.
nm_agent_manager_get_agent_by_user() would only return the first
matching secret agent for the user. This way, we might miss an agent
that has permissions.
Instead, add nm_agent_manager_has_agent_with_permission() and search
all agents.
There was literally only one place where we would make use of
O(1) lookup of secret-agents: during removal.
In all other cases (which are the common cases) we had to iterate the
known agents. CList is more efficient and more convenient to use when
the main mode of operation is iterating.
Also note that handling secret agents inevitably scales linear with
the number of agents. That is, because for every check we will have
to sort the list of agents and send requests to them. It would be
very complicated (and probably less efficient for reasonable numbers
of secret agents) to avoid O(n).
Move it to shared as it's useful for clients as well.
Move and rename nm_dbus_manager_new_auth_subject_from_context() and
nm_dbus_manager_new_auth_subject_from_message() in nm-dbus-manager.c
as they're needed there.
All D-Bus method call implementations use similar error messages when
authenticating requests; add defines for them to ensure the same exact
message is reused.
The secret-agent D-Bus API knows 4 methods: GetSecrets, SaveSecrets,
DeleteSecrets and CancelGetSecrets. When we cancel a GetSecrets
request, we must issue another CancelGetSecrets to tell the agent
that the request was aborted. This is also true during shutdown.
Well, technically, during shutdown we anyway drop off the bus and
it woudn't matter. In practice, I think we should get this right and
always cancel properly.
To better handle shutdown change the following:
- each request now takes a reference on NMSecretAgent. That means,
as long as there are pending requests, the instance stays alive.
The way to get this right during shutdown, is that NMSecretAgent
registers itself via nm_shutdown_wait_obj_register() and
NetworkManager is supposed to keep running as long as requests
are keeping the instance alive.
- now, the 3 regular methods are cancellable (which means: we are
no longer interested in the result). CancelGetSecrets is not
cancellable, but it has a short timeout NM_SHUTDOWN_TIMEOUT_MS
to handle this. We anyway don't really care about the result,
aside logging and to be sure that the request fully completed.
- this means, a request (NMSecretAgentCallId) can now immediately
be cancelled and destroyed, both when the request returns and
when the caller cancels it. The exception is GetSecrets which
keeps the request alive while waiting for CancelGetSecrets. But
this is easily handled by unlinking the call-id and pass it on
to the CancelGetSecrets callback.
Previously, the NMSecretAgentCallId was only destroyed when
the D-Bus call returns, even if it was cancelled earlier. That's
unnecessary complicated.
- previously, D-Bus requests SaveSecrets and DeleteSecrets were not cancellable.
That is a problem. We need to be able to cancel them in order to shutdown in
time.
- use GDBusConnection instead of GDBusProxy. As most of the time, GDBusProxy
provides features we don't use.
- again, don't log direct pointer values, but obfuscate the indentifiers.
We no longer add these. If you use Emacs, configure it yourself.
Also, due to our "smart-tab" usage the editor anyway does a subpar
job handling our tabs. However, on the upside every user can choose
whatever tab-width he/she prefers. If "smart-tabs" are used properly
(like we do), every tab-width will work.
No manual changes, just ran commands:
F=($(git grep -l -e '-\*-'))
sed '1 { /\/\* *-\*- *[mM]ode.*\*\/$/d }' -i "${F[@]}"
sed '1,4 { /^\(#\|--\|dnl\) *-\*- [mM]ode/d }' -i "${F[@]}"
Check remaining lines with:
git grep -e '-\*-'
The ultimate purpose of this is to cleanup our files and eventually use
SPDX license identifiers. For that, first get rid of the boilerplate lines.
Out of the 33 callers of nm_auth_chain_add_call(), the permission
argument is:
- 29 times a C string literal like NM_AUTH_PERMISSION_NETWORK_CONTROL.
- 3 times assign a string that is in fact a static string (it's just
not a string literal)
- only NMManager's device_auth_request_cb() passes a permission of
(possibly) non static origin. But it already duplicates the string
for it's own purposes and attaches it as user-data to the
NMAuthChain.
There really is no need to duplicate the string.
Replace nm_auth_chain_add_call() by a macro that ensures that the
permission string is a C literal.
Rename nm_auth_chain_add_call() to nm_auth_chain_add_call_unsafe() to
indicate that the lifetime of the permission argument is now the
responsibility of the caller.
NMAuthChain usually requests several permissions at once. Hence, an error
argument in the overall callback does not make sense, because you
wouldn't know which request failed.
If at all, it could only mean that the overall request failed (like an
D-Bus failure communicating to D-Bus *for all permisssions*),
but we don't need to handle that specially. In fact, we don't really care
why permission was not granted, whether it's due to an error or legitimate
reasons.
The error in the callback was always set to %NULL. Remove it.
NMAuthChain is not ref-counted.
You may call nm_auth_chain_destroy() once before the callback
gets invoked. This destroys the auth-chain instance right away.
You may call nm_auth_chain_destroy() once from inside the callback.
This basically has no effect but is allowed for convenince.
All this does is remembering that destroy was called and asserts that
destroy gets called at most once.
After the callback returns, the auth-chain will always be destroyed.
That means, generally there is no need to call nm_auth_chain_destroy()
from inside the callback.
Remove that code, and refactor some code to return early (where it makes
sense).
"libnm-core" implements common functionality for "NetworkManager" and
"libnm".
Note that clients like "nmcli" cannot access the internal API provided
by "libnm-core". So, if nmcli wants to do something that is also done by
"libnm-core", , "libnm", or "NetworkManager", the code would have to be
duplicated.
Instead, such code can be in "libnm-libnm-core-{intern|aux}.la".
Note that:
0) "libnm-libnm-core-intern.la" is used by libnm-core itsself.
On the other hand, "libnm-libnm-core-aux.la" is not used by
libnm-core, but provides utilities on top of it.
1) they both extend "libnm-core" with utlities that are not public
API of libnm itself. Maybe part of the code should one day become
public API of libnm. On the other hand, this is code for which
we may not want to commit to a stable interface or which we
don't want to provide as part of the API.
2) "libnm-libnm-core-intern.la" is statically linked by "libnm-core"
and thus directly available to "libnm" and "NetworkManager".
On the other hand, "libnm-libnm-core-aux.la" may be used by "libnm"
and "NetworkManager".
Both libraries may be statically linked by libnm clients (like
nmcli).
3) it must only use glib, libnm-glib-aux.la, and the public API
of libnm-core.
This is important: it must not use "libnm-core/nm-core-internal.h"
nor "libnm-core/nm-utils-private.h" so the static library is usable
by nmcli which couldn't access these.
Note that "shared/nm-meta-setting.c" is an entirely different case,
because it behaves differently depending on whether linking against
"libnm-core" or the client programs. As such, this file must be compiled
twice.
(cherry picked from commit af07ed01c0)
We should no longer use nm_connection_for_each_setting_value() and
nm_setting_for_each_value(). It's fundamentally broken as it does
not work with properties that are not backed by a GObject property
and it cannot be fixed because it is public API.
Add an internal function _nm_connection_aggregate() to replace it.
Compare the implementation of the aggregation functionality inside
libnm with the previous two checks for secret-flags that it replaces:
- previous approach broke abstraction and require detailed knowledge of
secret flags. Meaning, they must special case NMSettingVpn and
GObject-property based secrets.
If we implement a new way for implementing secrets (like we will need
for WireGuard), then this the new way should only affect libnm-core,
not require changes elsewhere.
- it's very inefficient to itereate over all settings. It involves
cloning and sorting the list of settings, and retrieve and clone all
GObject properties. Only to look at secret properties alone.
_nm_connection_aggregate() is supposed to be more flexible then just
the two new aggregate types that perform a "find-any" search. The
@arg argument and boolean return value can suffice to implement
different aggregation types in the future.
Also fixes the check of NMAgentManager for secret flags for VPNs
(NM_CONNECTION_AGGREGATE_ANY_SYSTEM_SECRET_FLAGS). A secret for VPNs
is a property that either has a secret or a secret-flag. The previous
implementation would only look at present secrets and
check their flags. It wouldn't check secret-flags that are
NM_SETTING_SECRET_FLAG_NONE, but have no secret.
This is a mere debugging convenience thing: e.g. if you run, but want to
check whether nm-applet or nmcli agent works fine, it's convenient that
the agent you run later gets a chance to deal with the secrets requests
first.
Is seems to do the job and is simpler that adding some more complicated
policy (e.g. introducing priorities or something).
https://github.com/NetworkManager/NetworkManager/pull/174
We commonly don't use the glib typedefs for char/short/int/long,
but their C types directly.
$ git grep '\<g\(char\|short\|int\|long\|float\|double\)\>' | wc -l
587
$ git grep '\<\(char\|short\|int\|long\|float\|double\)\>' | wc -l
21114
One could argue that using the glib typedefs is preferable in
public API (of our glib based libnm library) or where it clearly
is related to glib, like during
g_object_set (obj, PROPERTY, (gint) value, NULL);
However, that argument does not seem strong, because in practice we don't
follow that argument today, and seldomly use the glib typedefs.
Also, the style guide for this would be hard to formalize, because
"using them where clearly related to a glib" is a very loose suggestion.
Also note that glib typedefs will always just be typedefs of the
underlying C types. There is no danger of glib changing the meaning
of these typedefs (because that would be a major API break of glib).
A simple style guide is instead: don't use these typedefs.
No manual actions, I only ran the bash script:
FILES=($(git ls-files '*.[hc]'))
sed -i \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>\( [^ ]\)/\1\2/g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\> /\1 /g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>/\1/g' \
"${FILES[@]}"
NMAuthChain is not really ref-counted. True, we have an internal ref-counter
to ensure that the instance stays alive while the callback is invoked. However,
the user cannot take additional references as there is no nm_auth_chain_ref().
When the user wants to get rid of the auth-chain, with the current API it
is important that the callback won't be called after that point. From the
name nm_auth_chain_unref(), it sounds like that there could be multiple references
to the auth-chain, and merely unreferencing the object might not guarantee that
the callback is canceled. However, that is luckily not the case, because
there is no real ref-counting involved here.
Just rename the destroy function to make this clearer.
I dislike the static hash table to cache the integer counter for
numbered paths. Let's instead cache the counter at the class instance
itself -- since the class contains the information how the export
path should be exported.
However, we cannot use a plain integer field inside the class structure,
because the class is copied between derived classes. For example,
NMDeviceEthernet and NMDeviceBridge both get a copy of the NMDeviceClass
instance. Hence, the class doesn't contain the counter directly, but
a pointer to one counter that can be shared between sibling classes.
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
The next commit will completely rework NMBusManager and replace
NMExportedObject by a new type NMDBusObject.
Originally, NMDBusObject was added along NMExportedObject to ease
the rework and have compilable, intermediate stages of refactoring. Now,
I think the new name is better, because NMDBusObject is very strongly related
to the bus manager and the old name NMExportedObject didn't make that
clear.
I also slighly prefer the name NMDBusObject over NMBusObject, hence
for consistancy, also rename NMBusManager to NMDBusManager.
This commit only renames the file for a nicer diff in the next commit.
It does not actually update the type name in sources. That will be done
later.
When activation of the connection fails with no-secrets, we block
autoconnect due to that. However, NMPolicy also unblocks such
autoconnect, whenever a new secret-agent registers. The reason
is obviously, that the new secret-agent might be able to provide
the previously missing secrets.
However, there is a race between
- making the secret request, failing activation and blocking autoconnect
- new secret-agent registers
If the secret-agent registers after making the request, but before we
block autoconnect, then autoconnect stays blocked.
[1511468634.5759] device (wlp4s0): state change: config -> need-auth (reason 'none', sys-iface-state: 'managed')
[1511468634.5772] device (wlp4s0): No agents were available for this request.
[1511468638.4082] agent-manager: req[0x55ea7e58a5d0, :1.32/org.kde.plasma.networkmanagement/1000]: agent registered
[1511468638.4082] policy: re-enabling autoconnect for all connections with failed secrets
[1511468664.6280] device (wlp4s0): state change: need-auth -> failed (reason 'no-secrets', sys-iface-state: 'managed')
[1511468664.6287] policy: connection 'tuxmobil' now blocked from autoconnect due to no secrets
Note the long timing between making the secret request and the
activation failure. This race already existed before, but now with
WPS push-button method enabled by default, the duraction of the
activation is much longer and the race is easy to hit.
https://bugzilla.gnome.org/show_bug.cgi?id=790571
Replace the usage of g_str_hash() with our own nm_str_hash().
GLib's g_str_hash() uses djb2 hashing function, just like we
do at the moment. The only difference is, that we use a diffrent
seed value.
Note, that we initialize the hash seed with random data (by calling
getrandom() or reading /dev/urandom). That is a change compared to
before.
This change of the hashing function and accessing the random pool
might be undesired for libnm/libnm-core. Hence, the change is not
done there as it possibly changes behavior for public API. Maybe
we should do that later though.
At this point, there isn't much of a change. This patch becomes
interesting, if we decide to use a different hashing algorithm.
This makes it easier to install the files with proper names.
Also, it makes the makefile rules slightly simpler.
Lastly, the documentation is now generated into docs/api, which makes it
possible to get rid of the awkward relative file names in docbook.
We only needed proper glib enum types for having properties
and signal arguments. These got all converted to plain int,
so no longer generate such an enum type.
Had to rename "nm-enum-types.h" because it works badly with
"libnm/nm-enum-types.h". Maybe I could fix that differently,
but duplicate names is anyway error prone.
Note that "nm-core-enum-types.h" is already taken too, so
"nm-src-enum-types.h" it is.
- use _NM_GET_PRIVATE() and _NM_GET_PRIVATE_PTR() everywhere.
- reorder statements, to have GObject related functions (init, dispose,
constructed) at the bottom of each file and in a consistent order w.r.t.
each other.
- unify whitespaces in signal and properties declarations.
- use NM_GOBJECT_PROPERTIES_DEFINE() and _notify()
- drop unused signal slots in class structures
- drop unused header files for device factories
GError codes are only unique per domain, so logging the code without
also indicating the domain is not helpful. And anyway, if the error
messages are not distinctive enough to tell the whole story then we
should fix the error messages.
Based-on-patch-by: Dan Winship <danw@gnome.org>