Even when the system resolver is configured to something else that
systemd-resolved, it still is a good idea to keep systemd-resolved up to
date. If not anything else, it does a good job at doing per-interface
resolving for connectivity checks.
If for whatever reasons don't want NetworkManager to push the DNS data
it discovers to systemd-resolved, the functionality can be disabled
with:
[main]
systemd-resolved=false
g_string_new_len() allocates the buffer with length
bytes. Maybe it should be obvious (wasn't to me), but
if a init argument is given, that is taken as containing
length bytes.
So,
str = g_string_new_len (init, len);
is more like
str = g_string_new_len (NULL, len);
g_string_append_len (str, init, len);
and not (how I wrongly thought)
str = g_string_new_len (NULL, len);
g_string_append (str, init);
Fixes: 95b006c244
Previously, if "main.rc-manager" was set to "unmanaged"
and "/etc/resolv.conf" was symlink to our internal file
"/var/run/NetworkManager/resolv.conf", NM would not rewrite
the file, in an attempt to honor the requirement of NetworkManager
not changing resolv.conf.
No longer special case this. I think it was wrong and inconsistent.
If the user specifies rc-manager unmanaged, he also should manage
/etc/resolv.conf accordingly. And if the user decided to symlink
it to our internal file, that is fine. It should not stop NM from
updating that file.
Also, this was the only cases, where NM would not write our internal
resolv.conf (errors aside). It was inconsitent, and also not documented
behavior. Instead, it is documented that `man NetworkManager.conf`:
Regardless of this setting, NetworkManager will always write
resolv.conf to its runtime state directory
/var/run/NetworkManager/resolv.conf.
When a DNS plugin is enabled (like "main.dns=dnsmasq" or "main.dns=systemd-resolved"),
the name servers announced to the rc-manager are coerced to be 127.0.0.1
or 127.0.0.53.
Depending on the "main.rc-manager" setting, also "/etc/resolv.conf"
contains only this coerced name server to the local caching service.
The same is true for "/var/run/NetworkManager/resolv.conf" file, which
contains what we would write to "/etc/resolv.conf" (depending on
the "main.rc-manager" configuration).
Write a new file "/var/run/NetworkManager/no-stub-resolv.conf", which contains
the original name servers, uncoerced. Like "/var/run/NetworkManager/resolv.conf",
this file is always written.
The effect is, when one enables "main.dns=systemd-resolved", then there
is still a file "no-stub-resolv.conf" with the same content as with
"main.dns=default".
The no-stub-resolv.conf may be a possible solution, when a user wants
NetworkManager to update systemd-resolved, but still have a regular
/etc/resolv.conf [1]. For that, the user could configure
[main]
dns=systemd-resolved
rc-manager=unmanaged
and symlink "/etc/resolv.conf" to "/var/run/NetworkManager/no-stub-resolv.conf".
This is not necessarily the only solution for the problem and does not preclude
options for updating systemd-resolved in combination with other DNS plugins.
[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/20
Make them just ask for connections from GDBus, as other D-Bus clients
do. GDBus anyway reuses the connection if it has one, but allows us to
deal with errors in a more civilized manner.
The DNS manager reacts to NM_DEVICE_IP4_CONFIG_CHANGED events, which
are generated when there is a relevant change to an IP4 configuration.
Until now, changes to the mdns,llmnr properties were not
considered relevant (and neither minor, this is already a bug).
Promote them to relevant so that the DNS manager is notified and will
rewrite the DNS configuration when one of this properties changes.
Note that the DNS priority should be considered relevant and added
into the checksum as well, but is a problem right now because in the
DNS manager we rely on the fact that an empty configuration (i.e. just
created) has a zero checksum. This is needed to avoid rewriting
resolv.conf when there is no change. The DNS priority initial value
depends on the connection type (VPN or not), so it's a bit difficult
to add it to checksum maintaining the assumption of checksum(empty)=0.
This should be improved in the future.
We commonly don't use the glib typedefs for char/short/int/long,
but their C types directly.
$ git grep '\<g\(char\|short\|int\|long\|float\|double\)\>' | wc -l
587
$ git grep '\<\(char\|short\|int\|long\|float\|double\)\>' | wc -l
21114
One could argue that using the glib typedefs is preferable in
public API (of our glib based libnm library) or where it clearly
is related to glib, like during
g_object_set (obj, PROPERTY, (gint) value, NULL);
However, that argument does not seem strong, because in practice we don't
follow that argument today, and seldomly use the glib typedefs.
Also, the style guide for this would be hard to formalize, because
"using them where clearly related to a glib" is a very loose suggestion.
Also note that glib typedefs will always just be typedefs of the
underlying C types. There is no danger of glib changing the meaning
of these typedefs (because that would be a major API break of glib).
A simple style guide is instead: don't use these typedefs.
No manual actions, I only ran the bash script:
FILES=($(git ls-files '*.[hc]'))
sed -i \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>\( [^ ]\)/\1\2/g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\> /\1 /g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>/\1/g' \
"${FILES[@]}"
With "main.rc-manager=file", if /etc/resolv.conf is a symlink, NetworkManager
would follow the symlink and update the file instead.
However, note that realpath() only returns a target, if the file actually
exists. That means, if /etc/resolv.conf is a dangling symlink, NetworkManager
would replace the symlink with a file.
This was the only case in which NetworkManager would every change a symlink
resolv.conf to a file. I think this is undesired behavior.
This is a change in long established behavior. Although note that there were several
changes regarding rc-manager settings in the past. See for example commit [1] and [2].
Now, first still try using realpath() as before. Only if that fails, try
to resolve /etc/resolv.conf as a symlink with readlink().
Following the dangling symlink is likely not a problem for the user, it
probably is even desired. The part that most likely can cause problems
is if the destination file is not writable. That happens for example, if
the destination's parent directories are missing. In this case, NetworkManager
will now fail to write resolv.conf and log a warning. This has the potential of
breaking existing setups, but it really is a mis-configuration from the user's
side.
This fixes for example the problem, if the user configures
/etc/resolv.conf as symlink to /tmp/my-resolv.conf. At boot, the file
would not exist, and NetworkManager would previously always replace the
link with a plain file. Instead, it should follow the symlink and create
the file.
[1] 718fd22436
[2] 15177a34behttps://github.com/NetworkManager/NetworkManager/pull/127
It's not clear whether this was desired behavior. However, it was
behavior for a long time, so we probably should not change it.
Just document what happens with dangling symlinks.
Do some preprocessing on the DNS configuration sent to plugins:
- add the '~' default routing (lookup) domain to IP configurations
with the default route or, when there is none, to all non-VPN
IP configurations
- use the dns-priority to decide which connection to use in case
multiple connections have the same domain
- consider a negative dns-priority value as a way to 'shadow' all
subdomains from other connections
- compute reverse DNS domains
and add the resulting domain list to NMDnsIPConfigData so that
split-DNS plugins can use that directly instead of reimplementing the
same logic themselves.
- consistently set options, searches, domains fields to %NULL,
if there are no values.
- in nm_global_dns_config_update_checksum(), ensure that we uniquely
hash values. E.g. a config with "searches[a], options=[b]" should
hash differently from "searches=[ab], options=[]".
- in nm_global_dns_config_to_dbus(), reuse the sorted domain list.
We already have it, and it guarantees a consistent ordering of
fields.
- in global_dns_domain_from_dbus(), fix memleaks if D-Bus strdict
contains duplicate entries.
I dislike the static hash table to cache the integer counter for
numbered paths. Let's instead cache the counter at the class instance
itself -- since the class contains the information how the export
path should be exported.
However, we cannot use a plain integer field inside the class structure,
because the class is copied between derived classes. For example,
NMDeviceEthernet and NMDeviceBridge both get a copy of the NMDeviceClass
instance. Hence, the class doesn't contain the counter directly, but
a pointer to one counter that can be shared between sibling classes.
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
The next commit will completely rework NMBusManager and replace
NMExportedObject by a new type NMDBusObject.
Originally, NMDBusObject was added along NMExportedObject to ease
the rework and have compilable, intermediate stages of refactoring. Now,
I think the new name is better, because NMDBusObject is very strongly related
to the bus manager and the old name NMExportedObject didn't make that
clear.
I also slighly prefer the name NMDBusObject over NMBusObject, hence
for consistancy, also rename NMBusManager to NMDBusManager.
This commit only renames the file for a nicer diff in the next commit.
It does not actually update the type name in sources. That will be done
later.
Previously we always updated resolv.conf on quit. When we are using
systemd-resolved the update is not necessary because the resolver on
127.0.0.53 would still be reachable after NM quits. Also, when NM
manages resolv.conf directly there is no need to update the file
again. Let's rewrite resolv.conf only when using dnsmasq.
https://bugzilla.redhat.com/show_bug.cgi?id=1541031
Fixes the following error when building with gcc 4.8.5 and address
sanitizer:
src/dns/nm-dns-dnsmasq.c: In function 'update':
src/dns/nm-dns-dnsmasq.c:506:44: error: 'first_prio' may be used uninitialized in this function [-Werror=maybe-uninitialized]
} else if (first_prio < 0 && first_prio != prio)
^
Similarly to what systemd-resolved does, introduce the concept of
"routing" domain, which is a domain in the search list that is used
only to decide the interface over which a query must be forwarded, but
is not used to complete unqualified host names. Routing domains are
those starting with a tilde ('~') before the actual domain name.
Domains without the initial tilde are used both for completing
unqualified names and for the routing decision.
The "domain" key of the D-Bus configuration dictionary specifies the
domains a configuration applies to. In DNS code we consider domains
and searches as equivalent, so they should be exported via D-Bus using
the same logic used to populate resolv.conf and for plugins.
The connection.mdns setting is a per-connection setting,
so one might expect that one activated device can only have
one MDNS setting at a time.
However, with certain VPN plugins (those that don't have their
own IP interface, like libreswan), the VPN configuration is merged
into the configuration of the device. So, in this case, there
might be multiple settings for one device that must be merged.
We already have a mechanism for that. It's NMIP4Config. Let NMIP4Config
track this piece of information. Although, stricitly speaking this
is not tied to IPv4, the alternative would be to introduce a new
object to track such data, which would be a tremendous effort
and more complicated then this.
Luckily, NMDnsManager and NMDnsPlugin are already equipped to
handle multiple NMIPConfig instances per device (IPv4 vs. IPv6,
and Device vs. VPN).
Also make "connection.mdns" configurable via global defaults in
NetworkManager.conf.
Don't track the per-device configuration in NMDnsManager by
the ifname, but by the ifindex. We should consistently treat
the ifindex as the ID of a link, like kernel does.
At the few places where we actually need the ifname, resolve
it by looking into the platform cache. That is not necessarily
the same as the ifname that is currently tracked by NMDevice,
because netdev interfaces can be renamed, and NMDevice updates
it's link properties delayed. However, the platform cache has
the most recent notion of the correct interface name for an
ifindex, so if we ever hit a race here, we do it now more
correctly.
This also temporarily drops support for mdns. Will be re-added next,
but differently.
We had two separate queues, one for "SetLinkDNS" and one for
"SetLinkDomains". Merge them into one, and track the operation
as part of the new RequestItem structure.
A visible change to before is that we now would make all requests
per-interface first. Prevously, we would first make all SetLinkDNS
requests (for all interfaces) and then all SetLinkDomains requests.
It feels more correct to order the requests this way, not by
type.
The reason to merge is, that we will next get another operation
and in the current scheme we would need 3 GQueue instances.
While at it, refactor the code to use CList. We now anyway would
need a new struct to track the operation, requiring to allocate
and free it. Previously, we would only track the GVariant argument
as data of the GQueue.
Use a GHashTable instead of a GArray to construct the list of
@interfaces. Also, use NMCListElem instead of GList. With this,
the runtime is O(n*log(n)) instead of O(n^2).
I belive, we should take care that all our code has a reasonable
runtime complexity, even in common use-cases the number of elements
is small. This is not about performace, because likely we expect few
entries anyway, and the direct GArray implementation is likely faster
in those cases. It's about using the data structure that best suits the
access pattern.
The log(n) part comes from sorting the keys. I also believe we should
always aim for a stable behavior. When sending the D-Bus request to
resolved, the order of elements should be in ~some~ defined order.
"UNKNOWN" is not a good name. If you don't set the property
in the connection explicitly, it should be "DEFAULT".
Also, make "DEFAULT" -1. For one, that ensures that the enum's
underlying integer type is signed. Otherwise, it's cumbersome
to test "if (mdns >= DEFAULT)" because in case of unsigned types,
the compiler will warn about the check always being true.
Also, it allows for "NO" to be zero. These are no strong reasons,
but I tend to think this is better.
Also, don't make the property of NMSettingConnection a CONSTRUCT property.
Initialize the default manually in the init function.
Also, order the numeric values so that DEFAULT < NO < RESOLVE < YES with
YES being largest because it enables *the most*.
Update nm-policy.c and nm-dns-manager.c so that the connection-specific
settings get propagated to DNS manger. Currently the only such value is
the mDNS status.
Add update_mdns() function to DNS plugin interface. If a DNS plugin
supports mDNS, it can set an interface with a given index to support
mDNS resolving or also register the current hostname.
The mDNS support is currently added only to systemd-resolved DNS plugin.
The compiler warns when we ignore the return value from write().
And assigning it to an unused variable, causes another warning.
Make some use of it, at least to handle EINTR. All other errors
are still ignored.
While at it, rework the write code to first write to a buffer
in memory.