The code never set "iface_get_config->cidr_addr", despite
setting "cidr_prefix" and "has_cidr". As a result, cloud-setup
would think that the subnet is "0.0.0.0/$PLEN", and calculate
the gateway as "0.0.0.1".
As a result it would add a default route to table 30400 via 0.0.0.1,
which is obviously wrong.
How to detect the right gateway? Let's try obtain the subnet also via
the meta data. That seems mostly correct, except that we only access
subnet at index 0. What if there are multiple ones? I don't know.
https://bugzilla.redhat.com/show_bug.cgi?id=1912236
(cherry picked from commit c2629f72b0)
Background
==========
Imagine you run a container on your machine. Then the routing table
might look like:
default via 10.0.10.1 dev eth0 proto dhcp metric 100
10.0.10.0/28 dev eth0 proto kernel scope link src 10.0.10.5 metric 100
[...]
10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink
10.42.1.2 dev cali02ad7e68ce1 scope link
10.42.1.3 dev cali8fcecf5aaff scope link
10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
10.42.3.0/24 via 10.42.3.0 dev flannel.1 onlink
That is, there are another interfaces with subnets and specific routes.
If nm-cloud-setup now configures rules:
0: from all lookup local
30400: from 10.0.10.5 lookup 30400
32766: from all lookup main
32767: from all lookup default
and
default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
10.0.10.1 dev eth0 table 30400 proto static scope link metric 10
then these other subnets will also be reached via the default route.
This container example is just one case where this is a problem. In
general, if you have specific routes on another interface, then the
default route in the 30400+ table will interfere badly.
The idea of nm-cloud-setup is to automatically configure the network for
secondary IP addresses. When the user has special requirements, then
they should disable nm-cloud-setup and configure whatever they want.
But the container use case is popular and important. It is not something
where the user actively configures the network. This case needs to work better,
out of the box. In general, nm-cloud-setup should work better with the
existing network configuration.
Change
======
Add new routing tables 30200+ with the individual subnets of the
interface:
10.0.10.0/24 dev eth0 table 30200 proto static metric 10
[...]
default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
10.0.10.1 dev eth0 table 30400 proto static scope link metric 10
Also add more important routing rules with priority 30200+, which select
these tables based on the source address:
30200: from 10.0.10.5 lookup 30200
These will do source based routing for the subnets on these
interfaces.
Then, add a rule with priority 30350
30350: lookup main suppress_prefixlength 0
which processes the routes from the main table, but ignores the default
routes. 30350 was chosen, because it's in between the rules 30200+ and
30400+, leaving a range for the user to configure their own rules.
Then, as before, the rules 30400+ again look at the corresponding 30400+
table, to find a default route.
Finally, process the main table again, this time honoring the default
route. That is for packets that have a different source address.
This change means that the source based routing is used for the
subnets that are configured on the interface and for the default route.
Whereas, if there are any more specific routes in the main table, they will
be preferred over the default route.
Apparently Amazon Linux solves this differently, by not configuring a
routing table for addresses on interface "eth0". That might be an
alternative, but it's not clear to me what is special about eth0 to
warrant this treatment. It also would imply that we somehow recognize
this primary interface. In practise that would be doable by selecting
the interface with "iface_idx" zero.
Instead choose this approach. This is remotely similar to what WireGuard does
for configuring the default route ([1]), however WireGuard uses fwmark to match
the packets instead of the source address.
[1] https://www.wireguard.com/netns/#improved-rule-based-routing
(cherry picked from commit fe80b2d1ec)
(cherry picked from commit 58e58361bd)
Change the logic in check_ip_state() to delay the connection ACTIVATED
state if an address family is pending and its required-timeout has not
expired.
(cherry picked from commit 35cccc41cb)
(cherry picked from commit 51e5df275c)
If the configuration contains dns=none and resolv.conf is updated
through a dispatcher script, currently there is no way to tell NM that
the content of resolv.conf changed, so that it can restart a hostname
resolution.
Use SIGUSR1 (and SIGHUP) for that.
(cherry picked from commit fa1f628bce)
Add support for `carrier-wait-timeout` setting from kernel cmdline.
This will create a new `15-carrier-timeout.conf` file in
/run/NetworkManager/conf.d with the parameter value as specified.
The setting also inserts `match-device` to `*`, matching all devices.
NB: The parameter on kernel cmdline is specified in seconds. This is
done to be backwards compatible with with network-legacy module. However
the generated setting will automatically multiply specified value by
1000 and store timeout value in ms.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/626https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/730
It's questionable whether the manual page should explain
exactly what it does.
However, it's a good exercise writing this up (to review
what happens). Also, a manual page that simply says "it configures
the network automatically" without going into how exactly, is
not very useful either.
NetworkManager is now able to configure veth interfaces throught the
NMSettingVeth. Veth interfaces only have "peer" property.
In order to support Veth interfaces in NetworkManager the design need
to pass the following requirements:
* Veth setting only has "peer" attribute.
* Ethernet profiles must be applicable to Veth interfaces.
* When creating a veth interface, the peer will be managed by
NetworkManager but will not have a profile.
* Veth connection can reapply only if the peer has not been modified.
* In order to modify the veth peer, NetworkManager must deactivate the
connection and create a new one with peer modified.
In general, it should support the basis of veth interfaces but without
breaking any existing feature or use case. The users that are using veth
interfaces as ethernet should not notice anything changed unless they
specified the veth peer setting.
Creating a Veth interface in NetworkManager is useful even without the
support for namespaces for some use cases, e.g "connecting one side of
the veth to an OVS bridge and the other side to a Linux bridge" this is
done when using OVN kubernetes [1][2]. In addition, it would provide
persistent configuration and rollback support for Veth interfaces.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1885605
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1894139
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
If this setting it true (or missing) we skip most of the D-Bus
Disconnect() calls whoe purpose was to keep IWD's internal autoconnect
mechanism always disabled. We use the IWD's Station.State property
updates, and secrets requets through our IWD agent, to find out when IWD
is trying to connect and create "assumed" activations on the NM side to
mirror the IWD state. This is quite complicated due to the many
possible combinations of NMDevice's state and IWD's state. A lot of
them are "impossible" but we try to be careful to consider all the
different possibilities.
NM has a nice API for "assuming connections" but it's designed for
slightly different use cases than what we have here and for now we
created normal "managed"-type activations when assuming an IWD automatic
connection.
Add a new `main.rc-manager=auto` setting, that favours to use
systemd-resolved (and not touch "/etc/resolv.conf" but configure
it via D-Bus), or falls back to `resolvconf`/`netconfig` binaries
if they are installed and enabled at compile time.
As final fallback use "symlink", like before.
Note that on Fedora there is no "openresolv" package ([1]). Instead, "systemd"
package provides "/usr/sbin/resolvconf" as a wrapper for systemd-resolved's
"resolvectl". On such a system the fallback to resolvconf is always
wrong, because NetworkManager should either talk to systemd-resolved
directly or not but never call "/usr/sbin/resolvconf". So, the special handling
for resolvconf and netconfig is only done if NetworkManager was build with these
applications explicitly enabled.
Note that SUSE builds NetworkManager with
--with-netconfig=yes
--with-config-dns-rc-manager-default=netconfig
and the new option won't be used there either. But of course, netconfig
already does all the right things on SUSE.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=668153
Suggested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fix the following build error with meson:
/usr/bin/python3 /home/bgalvani/work/NetworkManager/tools/generate-docs-nm-settings-docs-merge.py man/nm-settings-docs-nmcli.xml --only-from-first clients/cli/generate-docs-nm-settings-nmcli.xml libnm/nm-propery-infos-nmcli.xml libnm/nm-settings-docs-gir.xml
Traceback (most recent call last):
File "/home/bgalvani/work/NetworkManager/tools/generate-docs-nm-settings-docs-merge.py", line 120, in <module>
xml_roots = list([ET.parse(f).getroot() for f in gl_input_files])
File "/home/bgalvani/work/NetworkManager/tools/generate-docs-nm-settings-docs-merge.py", line 120, in <listcomp>
xml_roots = list([ET.parse(f).getroot() for f in gl_input_files])
File "/usr/lib64/python3.8/xml/etree/ElementTree.py", line 1202, in parse
tree.parse(source, parser)
File "/usr/lib64/python3.8/xml/etree/ElementTree.py", line 584, in parse
source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '--only-from-first'
Fixes: 3c11116c48 ('docs: in "generate-docs-nm-settings-docs-merge.py" only take properties from first setting')
Especially for "nm-settings-docs-nmcli.xml", the first XML to merge is
"clients/cli/generate-docs-nm-settings-nmcli.xml". That file is
generated with the meta data from nmcli, and it contains all the
properties that are supported. Properties from other XML files,
that are passed as additional arguments should not be merged.
In most cases, there is no difference. It only matters for
"ipv6.dad-timeout" and "user.data". For example, "ipv6.dad-timeout"
is supported by GObject (part of "libnm/nm-settings-docs-gir.xml"),
but not by nmcli. Don't include it in the manual.
This also drops the now empty settings "dummy", "user", and "generic".
We have the correct meta-data of supported properties for nmcli. It is
in clients/common. Use that for generating the manual page instead of
the properties that are part of libnm (some properties may be in libnm
but not supported by nmcli, or some properties may not be GObject
properties, and not detected as by GObject introspection).
"nm-settings-docs-nmcli.xml" will be generated by a tool that depends on
"clients/common/". The file should thus not be in libnm directory, otherwise
there is a circular dependency.
Move the file to "man/" directory.
For consistency, also move "nm-settings-docs-dbus.xml". Note that we
cannot move "nm-settings-docs-gir.xml" to "man/", because that one is
needed for building clients.
There is no need that two XML files that essentially hold similar
information are fundamentally different. Make them more alike.
This way, we can use the same tools that operate on either of
these input files.
A significant part of NetworkManager's API are the connection profiles, documented
in `man nm-settings*`. But there are different aspects about profiles, depending
on what you are interested. There is the D-Bus API, nmcli options, keyfile format,
and ifcfg-rh format. Additionally, there is also libnm API.
Add distinct manual pages for the four aspects. Currently the two new manual
pages "nm-settings-dbus" and "nm-settings-nmcli" are still identical to the
former "nm-settings.5" manual. In the future, they will diverge to
account for the differences.
There are the following aspects:
- "dbus"
- "keyfile"
- "ifcfg-rh"
- "nmcli"
For "libnm" we don't generate a separate "nm-settings-libnm" manual
page. That is instead documented via gtk-doc.
Currently the keyfile and ifcfg-rh manual pages only detail settings
which differ. But later I think also these manual pages should contain
all settings that apply.
"nm-settings-docs-dbus.xml" is "nm-settings-docs-gir.xml" merged with
"nm-property-infos-dbus.xml". The name should reflect that, also because
we will get more files with this naming scheme.
The naming was inconsistent. Rename.
- all the property infos of this kind a now consistently called
"libnm/nm-property-infos-$TAG.xml".
- the script to generate files "libnm/nm-property-infos-$TAG.xml" is
now called "libnm/generate-docs-nm-property-infos.pl".
The manual page claimed that for "connectivitiy-change" actions, the dispatcher
scripts would get as first argument (the device name) "none". That was not done,
only for "hostname" actions.
For consistency, maybe that should be adjusted to also pass "none" for connectivity
change events. However, "none" is really an odd value, if there is no device. Passing
an empty word is IMO nicer. So stick to that behavior, despite being inconsistent.
Also fix the documentation about that.
Conceptionally, the MUD URL really depends on the device, and not so
much the connection profile. That is, when you have a specific IoT
device, then this device probably should use the same MUD URL for all
profiles (at least by default).
We already have a mechanism for that: global connection defaults. Use
that. This allows a vendor drop pre-install a file
"/usr/lib/NetworkManager/conf.d/10-mud-url.conf" with
[connection-10-mud-url]
connection.mud-url=https://example.com
Note that we introduce the special "connection.mud-url" value "none", to
indicate not to use a MUD URL (but also not to consult the global connection
default).
Commit b2a0738765 ('man: improve manual page for nm-online') removed
the explanation of how may-fail can be used to wait for a specific
address family during boot. I found that part useful. Add it again,
adapting it to the new behavior introduced by 1e5206414a ('device:
don't delay startup complete for pending-actions "autoconf", "dhcp4"
and "dhcp6"').
https://bugzilla.redhat.com/show_bug.cgi?id=1825666
monitor-connection-files was deprecated and disabled by default for a long
time. In the meantime, it has no effect at all.
Remove references from the manual pages.