Add support for IPv6 multipath routes, by treating them as single-hop
routes. Otherwise, we can easily end up with an inconsistent platform
cache.
Background:
-----------
Routes are hard. We have NMPlatform which is a cache of netlink objects.
That means, we have a hash table and we cache objects based on some
identity (nmp_object_id_equal()). So those objects must have some immutable,
indistinguishable properties that determine whether an object is the
same or a different one.
For routes and routing rules, this identifying property is basically a subset
of the attributes (but not all!). That makes it very hard, because tomorrow
the kernel could add an attribute that becomes part of the identity, and
NetworkManager wouldn't recognize it, resulting in cache inconsistency by
wrongly thinking two different routes are one and the same. Anyway.
The other point is that we rely on netlink events to maintain the cache.
So when we receive a RTM_NEWROUTE we add the object to the cache, and
delete it upon RTM_DELROUTE. When you do `ip route replace`, kernel
might replace a (different!) route, but only send one RTM_NEWROUTE message.
We handle that by somehow finding the route that was replaced/deleted. It's
ugly. Did I say that routes are hard?
Also, for IPv4 routes, multipath attributes are just a part of the
route's identity. That is, you add two different routes that only differ
by their multipath list, and then kernel does as you would expect.
NetworkManager does not support IPv4 multihop routes and just ignores
them.
Also, a multipath route can have next hops on different interfaces,
which goes against our current assumption, that an NMPlatformIP4Route
has an interface (or no interface, in case of blackhole routes). That
makes it hard to meaningfully support IPv4 multipath routes. But we probably don't
have to, because we can just pretend that such routes don't exist and
our cache stays consistent (at least, until somebody calls `ip route
replace` *sigh*).
Not so for IPv6. When you add (`ip route append`) an IPv6 route that is
identical to an existing route -- except their multipath attribute -- then it
behaves as if the existing route was modified and the result is the
merged route with more next-hops. Note that in this case kernel will
only send a RTM_NEWROUTE message with the full multipath list. If we
would treat the multipath list as part of the route's identity, this
would be as if kernel deleted one route and created a different one (the
merged one), but only sending one notification. That's a bit similar to
what happens during `ip route replace`, but it would be a nightmare to
find out which route was thereby replaced.
Likewise, when you delete a route, then kernel will "subtract" the
next-hop and send a RTM_DELROUTE notification only about the next-hop that
was deleted. To handle that, you would have to find the full multihop
route, and replace it with the remainder after the subtraction.
So far, NetworkManager ignored IPv6 routes with more than one next-hop. This
means you can start with one single-hop route (that NetworkManager sees
and has in the platform cache). Then you create a similar route (only
differing by the next-hop). Kernel will merge the routes, but not notify
NetworkManager that the single-hop route is no longer a single-hop
route. This can easily cause a cache inconsistency and subtle bugs. For
IPv6 we MUST handle multihop routes.
Kernel's behavior makes little sense, if you expect that routes have an
immutable identity and want to get notifications about addition/removal.
We can however make sense of it by pretending that all IPv6 routes are
single-hop! With only the twist that a single RTM_NEWROUTE notification
might notify about multiple routes at the same time. This is what the
patch does.
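As a rough illustration of why the single-hop view keeps the cache consistent, here is a toy model in Python. It is not the C implementation: the dict-based message layout and the names `expand_single_hop` and `RouteCache` are made up for this sketch, and the real route identity has more fields than shown.

```python
def expand_single_hop(msg):
    """Split one (hypothetical) route message into single-hop identities."""
    return [(msg["dst"], msg["metric"], hop) for hop in msg["nexthops"]]


class RouteCache:
    """Toy platform cache keyed by single-hop route identity."""

    def __init__(self):
        self.routes = set()

    def handle(self, event, msg):
        # One notification may carry several single-hop routes at once.
        for route in expand_single_hop(msg):
            if event == "RTM_NEWROUTE":
                self.routes.add(route)
            elif event == "RTM_DELROUTE":
                self.routes.discard(route)


cache = RouteCache()
# A plain single-hop route is added first.
cache.handle("RTM_NEWROUTE",
             {"dst": "2001:db8::/64", "metric": 100, "nexthops": ["fe80::1"]})
# `ip route append` with another next-hop: the merged list is reported.
cache.handle("RTM_NEWROUTE",
             {"dst": "2001:db8::/64", "metric": 100,
              "nexthops": ["fe80::1", "fe80::2"]})
# Deleting one next-hop: only the removed next-hop is reported.
cache.handle("RTM_DELROUTE",
             {"dst": "2001:db8::/64", "metric": 100, "nexthops": ["fe80::1"]})
```

In this model neither the merge nor the subtraction requires searching for a "replaced" route, because each next-hop is its own cache entry.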
The Patch
---------
Now one RTM_NEWROUTE message can contain multiple IPv6 routes
(NMPObject). That would mean that nmp_object_new_from_nl() needs to
return a list of objects. But it's not implemented that way. Instead,
we still call nmp_object_new_from_nl(), and the parsing code can
indicate that there is something more, telling the caller to call
nmp_object_new_from_nl() again in a loop to fetch more objects.
In practice, I think all RTM_DELROUTE messages for IPv6 routes are
single-hop. Still, we implement it to handle also multi-hop messages the
same way.
Note that we just parse the netlink message again from scratch. The alternative
would be to parse the first object once, and then clone the object and
only update the next-hop. That would be more efficient, but probably
harder to understand/implement.
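The calling convention described above can be sketched in Python as follows. `parse_route` is only a hypothetical stand-in for nmp_object_new_from_nl(); the `hop_index` argument and the dict layout are assumptions of this sketch, not the actual C signature.

```python
def parse_route(msg, hop_index):
    """Parse the message from scratch and return (route, has_more).

    `hop_index` selects which next-hop of the message the returned
    single-hop route describes.
    """
    hops = msg["nexthops"]
    route = {"dst": msg["dst"], "gateway": hops[hop_index]}
    return route, hop_index + 1 < len(hops)


def parse_all(msg):
    """Caller loop: keep asking the parser until nothing more is reported."""
    routes = []
    hop_index = 0
    while True:
        route, has_more = parse_route(msg, hop_index)
        routes.append(route)
        if not has_more:
            return routes
        hop_index += 1


routes = parse_all({"dst": "2001:db8::/64",
                    "nexthops": ["fe80::1", "fe80::2"]})
```

Each iteration re-parses the whole message, which mirrors the simpler (if less efficient) choice described above.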
https://bugzilla.redhat.com/show_bug.cgi?id=1837254#c20
(cherry picked from commit dac12a8d61)
(cherry picked from commit 698cf1092c)
(cherry picked from commit 87fe255c89)
The variable with this purpose is usually called "IS_IPv4".
It's upper case, because usually this is a const variable, and because
it reminds of the NM_IS_IPv4(addr_family) macro. That letter case
is unusual, but it makes sense to me for the special purpose that this
variable has.
Anyway. The naming of this variable is a different point. Let's
use the variable name that is consistent and widely used.
(cherry picked from commit 8085c0121f)
(cherry picked from commit eec32669a9)
(cherry picked from commit 99cd6ed25e)
To parse the RTA_MULTIPATH attribute, "policy" is not right (it is used
to parse the overall message). Instead, we don't really have a special
policy that we should use.
This was not a severe issue, because the allocated buffer (with
G_N_ELEMENTS(policy) elements) was larger than need be. And apparently,
using the wrong policy also didn't cause us to reject important
messages.
(cherry picked from commit 997d72932d)
(cherry picked from commit 21b1978072)
(cherry picked from commit ef1587bd88)
In function 'nm_uuid_unparse',
inlined from 'nm_uuid_generate_from_string_str' at src/libnm-glib-aux/nm-uuid.c:393:12,
inlined from 'nm_uuid_generate_from_strings.constprop' at src/libnm-glib-aux/nm-uuid.c:430:16:
src/libnm-glib-aux/nm-uuid.h:37:12: error: 'uuid' may be used uninitialized [-Werror=maybe-uninitialized]
37 | return nm_uuid_unparse_case(uuid, out_str, FALSE);
| ^
src/libnm-glib-aux/nm-uuid.c: In function 'nm_uuid_generate_from_strings.constprop':
src/libnm-glib-aux/nm-uuid.c:20:1: note: by argument 1 of type 'const struct NMUuid *' to 'nm_uuid_unparse_case.constprop' declared here
20 | nm_uuid_unparse_case(const NMUuid *uuid, char out_str[static 37], gboolean upper_case)
| ^
src/libnm-glib-aux/nm-uuid.c:390:12: note: 'uuid' declared here
390 | NMUuid uuid;
| ^
lto1: all warnings being treated as errors
The problem is code paths with failed g_return*() assertions. Being in
a bad state already, they don't bother to ensure proper return values,
and with LTO the compiler might think there are valid code paths wrongly
handled. Work around.
(cherry picked from commit cb9ca67901)
(cherry picked from commit 634e023e72)
In function '_nm_auto_g_free',
inlined from 'test_tc_config_tfilter_matchall_mirred' at src/libnm-core-impl/tests/test-setting.c:2955:24:
./src/libnm-glib-aux/nm-macros-internal.h:58:1: error: 'str' may be used uninitialized [-Werror=maybe-uninitialized]
58 | NM_AUTO_DEFINE_FCN_VOID0(void *, _nm_auto_g_free, g_free);
| ^
src/libnm-core-impl/tests/test-setting.c: In function 'test_tc_config_tfilter_matchall_mirred':
src/libnm-core-impl/tests/test-setting.c:2955:24: note: 'str' was declared here
2955 | gs_free char *str;
| ^
lto1: all warnings being treated as errors
lto-wrapper: fatal error: gcc returned 1 exit status
(cherry picked from commit 6f0e22a64a)
(cherry picked from commit 6329f1db5a)
Recent python-black (22.0) dropped support for Python 2 and thus fails
for those files. Make the examples Python3 compatible.
(cherry picked from commit 95e6a0a6e2)
(cherry picked from commit 2e4d1e8dc6)
(cherry picked from commit b78ca328d2)
I got a report of a scenario where multiple servers reply to a REQUEST
in SELECTING, and all servers send NAKs except the one which sent the
offer, which replies with an ACK. In that scenario, n-dhcp4 is not able
to obtain a lease because it restarts from INIT as soon as the first
NAK is received. For comparison, dhclient can get a lease because it
ignores all NAKs in SELECTING.
Arguably, the network is misconfigured there, but it would be great if
n-dhcp4 could still work in such a scenario.
According to RFC 2131, ACK and NAK messages from server must contain a
server-id option. The RFC doesn't explicitly say that the client
should check the option, but I think it's a reasonable thing to do, at
least for NAKs.
This patch stores the server-id of the REQUEST in SELECTING, and
compares it with the server-id from NAKs, to discard other servers'
replies.
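The filtering can be modeled roughly like this in Python. It is a simplified sketch, not n-dhcp4 code: the `Client` class, its method names and the "BOUND" state name (borrowed from RFC 2131) are assumptions made for illustration.

```python
class Client:
    """Toy model of the relevant part of the client state machine."""

    def __init__(self):
        self.state = "INIT"
        self.requested_server_id = None

    def send_request(self, offer_server_id):
        # Remember which server's offer we are requesting.
        self.state = "SELECTING"
        self.requested_server_id = offer_server_id

    def receive(self, msg_type, server_id):
        if self.state != "SELECTING":
            return
        if msg_type == "NAK":
            if server_id != self.requested_server_id:
                return  # NAK from another server: discard it
            # NAK from the server we asked: start over.
            self.state = "INIT"
            self.requested_server_id = None
        elif msg_type == "ACK":
            self.state = "BOUND"


client = Client()
client.send_request("10.0.0.1")
client.receive("NAK", "10.0.0.2")   # unrelated server, ignored
client.receive("ACK", "10.0.0.1")   # the server that sent the offer
```

Unlike dhclient, which ignores all NAKs in SELECTING, this still honors a NAK from the server whose offer was requested.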
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1144
(cherry picked from commit 118561e284)
(cherry picked from commit 3abfdbab33)
(cherry picked from commit c499412ec4)
It's not clear how this could happen, but it did:
#0 _g_log_abort (breakpoint=1) at gmessages.c:580
#0 0x00007f4e782c5895 in _g_log_abort (breakpoint=1) at gmessages.c:580
#1 0x00007f4e782c6b98 in g_logv (log_domain=0x558436ef1520 "nm", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7ffd5b20b0c0) at gmessages.c:1391
#2 0x00007f4e782c6d63 in g_log (log_domain=log_domain@entry=0x558436ef1520 "nm", log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, format=format@entry=0x7f4e78313620 "%s: assertion '%s' failed") at gmessages.c:1432
#3 0x00007f4e782c759d in g_return_if_fail_warning (log_domain=log_domain@entry=0x558436ef1520 "nm", pretty_function=pretty_function@entry=0x558436e49820 <__func__.43636> "nm_ip6_config_reset_addresses_ndisc", expression=expression@entry=0x558436e48b00 "priv->ifindex > 0") at gmessages.c:2809
#4 0x0000558436bc47ca in nm_ip6_config_reset_addresses_ndisc (self=0x5584385cc190 [NMIP6Config], addresses=0x5584385952a0, addresses_n=1, plen=plen@entry=64 '@', ifa_flags=ifa_flags@entry=768) at src/core/nm-ip6-config.c:1468
#5 0x0000558436d32e50 in ndisc_config_changed (ndisc=<optimized out>, rdata=0x55843856e4d0, changed_int=159, self=0x5584385c00f0 [NMDeviceOvsInterface]) at src/core/devices/nm-device.c:10838
#6 0x00007f4e7323b09e in ffi_call_unix64 () at ../src/x86/unix64.S:76
#7 0x00007f4e7323aa4f in ffi_call (cif=cif@entry=0x7ffd5b20b550, fn=fn@entry=0x558436d32a30 <ndisc_config_changed>, rvalue=<optimized out>, avalue=avalue@entry=0x7ffd5b20b460) at ../src/x86/ffi64.c:525
#8 0x00007f4e787a0386 in g_cclosure_marshal_generic_va (closure=<optimized out>, return_value=<optimized out>, instance=<optimized out>, args_list=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=<optimized out>) at gclosure.c:1604
#9 0x00007f4e7879f616 in _g_closure_invoke_va (closure=0x55843850b200, return_value=0x0, instance=0x55843856e5d0, args=0x7ffd5b20b800, n_params=2, param_types=0x558438495e50) at gclosure.c:867
#10 0x00007f4e787bba9c in g_signal_emit_valist (instance=0x55843856e5d0, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7ffd5b20b800) at gsignal.c:3301
#11 0x00007f4e787bc093 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at gsignal.c:3448
#12 0x0000558436ddf04b in check_timestamps (ndisc=ndisc@entry=0x55843856e5d0 [NMLndpNDisc], now_msec=now_msec@entry=15132, changed=changed@entry=(NM_NDISC_CONFIG_DHCP_LEVEL | NM_NDISC_CONFIG_GATEWAYS | NM_NDISC_CONFIG_ADDRESSES | NM_NDISC_CONFIG_ROUTES | NM_NDISC_CONFIG_DNS_SERVERS | NM_NDISC_CONFIG_MTU)) at src/core/ndisc/nm-ndisc.c:1539
#13 0x0000558436de08d0 in nm_ndisc_ra_received (ndisc=ndisc@entry=0x55843856e5d0 [NMLndpNDisc], now_msec=now_msec@entry=15132, changed=changed@entry=(NM_NDISC_CONFIG_DHCP_LEVEL | NM_NDISC_CONFIG_GATEWAYS | NM_NDISC_CONFIG_ADDRESSES | NM_NDISC_CONFIG_ROUTES | NM_NDISC_CONFIG_DNS_SERVERS | NM_NDISC_CONFIG_MTU)) at src/core/ndisc/nm-ndisc.c:1556
#14 0x0000558436dd8d50 in receive_ra (ndp=<optimized out>, msg=0x5584385e77c0, user_data=<optimized out>) at src/core/ndisc/nm-lndp-ndisc.c:333
#15 0x00007f4e794718a3 in ndp_call_handlers (msg=0x5584385e77c0, ndp=0x5584384db840) at libndp.c:1993
#16 0x00007f4e794718a3 in ndp_sock_recv (ndp=0x5584384db840) at libndp.c:1871
#17 0x00007f4e794718a3 in ndp_call_eventfd_handler (ndp=ndp@entry=0x5584384db840) at libndp.c:2097
#18 0x00007f4e7947199f in ndp_callall_eventfd_handler (ndp=0x5584384db840) at libndp.c:2126
#19 0x0000558436dda229 in event_ready (fd=<optimized out>, condition=<optimized out>, user_data=<optimized out>) at src/core/ndisc/nm-lndp-ndisc.c:588
#20 0x00007f4e782bf95d in g_main_dispatch (context=0x558438409a40) at gmain.c:3193
#21 0x00007f4e782bf95d in g_main_context_dispatch (context=context@entry=0x558438409a40) at gmain.c:3873
#22 0x00007f4e782bfd18 in g_main_context_iterate (context=0x558438409a40, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3946
#23 0x00007f4e782c0042 in g_main_loop_run (loop=0x5584383e5150) at gmain.c:4142
Above is a stack trace of commit af00e39dd2 ('libnm: add NMIPAddress
and NMIPRoute dups backported symbols from 1.30.8').
As a workaround, ignore the ndisc signal if we currently don't have an
ifindex. Also, recreate the NMIP6Config instances if the ifindex doesn't
match (or we don't have one).
This workaround is probably good enough for the stable branch, as the
code on main (1.35+) was heavily reworked and the fix does not apply
there.
https://bugzilla.redhat.com/show_bug.cgi?id=2013266#c1
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1058
(cherry picked from commit 94215cdb07)
When the device gets realized, it may still be unmanaged by parent,
similar to the situation where the device is unmanaged by platform-init.
If we clear the assume state at that point, then when the device later
becomes managed, NM is not able to properly assume the device using the UUID.
Therefore, we should not clear the assume state if the device has only
the NM_UNMANAGED_PLATFORM_INIT or the NM_UNMANAGED_PARENT flag set
in the unmanaged flags.
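One plausible reading of this condition, sketched in Python. The numeric flag values and the `UNMANAGED_OTHER` flag are invented for the example; only the two flag names come from the text above, and the real check in the C code may differ in detail.

```python
# Illustrative bit values, not the real enum values.
UNMANAGED_PLATFORM_INIT = 1 << 0
UNMANAGED_PARENT = 1 << 1
UNMANAGED_OTHER = 1 << 2  # hypothetical stand-in for any other reason

KEEP_ASSUME_STATE = UNMANAGED_PLATFORM_INIT | UNMANAGED_PARENT


def should_clear_assume_state(unmanaged_flags):
    """Clear only if a flag other than PLATFORM_INIT/PARENT is set."""
    return bool(unmanaged_flags & ~KEEP_ASSUME_STATE)
```

With this reading, a device that is unmanaged only because its parent or platform initialization is not ready keeps its assume state until it becomes managed.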
The previous commit 3c4450aa4d ('core: don't reset assume state too
early') did something similar for NM_UNMANAGED_PLATFORM_INIT flag only.
(cherry picked from commit 87674740d8)
(cherry picked from commit 1b00c50d52)
valgrind might log warnings about syscalls that it doesn't implement.
When we run valgrind tests, we check that the command exits with
success, but we also check that there is no unexpected content in the
valgrind log.
Those warnings are not relevant for us. We don't unit-test valgrind, we
unit-test NetworkManager. Let's always remove such warnings with `sed`.
We already did that previously, but only for an explicit list of tests.
Now do it for all tests.
This is currently relevant on Fedora 35 and Ubuntu devel, where the
"close_range" syscall is used by libc, but not supported by valgrind.
While at it, rework the confusing logic of "HAS_ERRORS" variable.
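For illustration, the filtering of unimplemented-syscall warnings could look like this in Python. The script itself uses `sed`; the exact log line format matched here is an assumption based on typical valgrind output, and the function name is made up.

```python
import re

# Assumed shape of the warning block in the valgrind log.
START = re.compile(r"^--\d+-- WARNING: unhandled .* syscall: ")
END = re.compile(r"^--\d+-- it at http.*\.$")


def strip_syscall_warnings(lines):
    """Drop each block from the WARNING line through its closing URL line."""
    kept = []
    skipping = False
    for line in lines:
        if not skipping and START.search(line):
            skipping = True
        if skipping:
            if END.search(line):
                skipping = False
            continue
        kept.append(line)
    return kept


log = [
    "==100== Memcheck, a memory error detector",
    "--100-- WARNING: unhandled amd64-linux syscall: 436",
    "--100-- You may be able to write your own handler.",
    "--100-- Read the file README_MISSING_SYSCALL_OR_IOCTL.",
    "--100-- Nevertheless we consider this a bug.  Please report",
    "--100-- it at http://valgrind.org/support/bug_reports.html.",
    "==100== ERROR SUMMARY: 0 errors from 0 contexts",
]
filtered = strip_syscall_warnings(log)
```

Anything left in the log after this step would still count as unexpected content.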
(cherry picked from commit fc220f94af)
$ nmcli connection modify dummy1 ethtool.feature-rx a
(process:3077356): GLib-WARNING **: GError set over the top of a previous GError or uninitialized memory.
This indicates a bug in someone's code. You must ensure an error is NULL before it's set.
The overwriting error message was: 'a' is not valid; use 'on', 'off', or 'ignore'
Error: failed to modify ethtool.feature-rx: 'a' is not valid; use [true, yes, on], [false, no, off] or [unknown].
Fixes: e5b46aa38a ('cli: use nmc_string_to_ternary() to parse ternary in _set_fcn_ethtool()')
(cherry picked from commit 25e705c361)
(cherry picked from commit 2aa19708c2)
In NetworkManager, a profile cannot have "ipvx.dns" or "ipvx.dns-search"
while the corresponding IP method is disabled. Together with the oddity
that in NetworkManager DNS settings are separate per IPv4 and IPv6, this
causes problems:
$ cat wg0.conf
[Interface]
PrivateKey = CBXpiLxQ98TLISJ2cypEFtQb/djzYzENyy0jzhWa/UA=
Address = 192.168.1.100
DNS = 10.11.12.13, foobar.de
[Peer]
PublicKey = Wus1sBzZiQkyxr6ZitUFNvfYD7KJkwTsWlcxvJ/4SHI=
Endpoint = 1.2.3.4:51827
AllowedIPs = 0.0.0.0/0
$ nmcli connection import type wireguard file wg0.conf
Error: failed to import 'wg0.conf': Failed to create WireGuard connection: ipv6.dns-search: this property is not allowed for 'method=disabled'.
Fixes: 3ab082ed96 ('cli: support dns-search for import of WireGuard profiles')
(cherry picked from commit db53e5f3cd)
When NetworkManager is reloaded the config from active devices is not
being reloaded properly.
Related: https://bugzilla.redhat.com/1852445
Fixes: 121c58f0c4 ('core: set number of SR-IOV VFs asynchronously')
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
(cherry picked from commit ff9b64c923)
nm_vpn_plugin_info_new_from_file() may fail as NMVpnPluginInfo is a
GInitable. As such, the destructor must handle the case where the
instance was only partly initialized.
#0 g_logv (log_domain=0x7f7144703071 "GLib", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=<optimized out>) at ../glib/gmessages.c:1413
#1 0x00007f71446b3903 in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at ../glib/gmessages.c:1451
#2 0x000056455b8e58d0 in finalize (object=0x7f7128008180 [NMVpnPluginInfo]) at src/libnm-core-impl/nm-vpn-plugin-info.c:1280
#3 0x00007f71447b8b18 in g_object_unref (_object=<optimized out>) at ../gobject/gobject.c:3524
#4 g_object_unref (_object=0x7f7128008180) at ../gobject/gobject.c:3416
#5 0x00007f714486bc09 in g_initable_new_valist
(object_type=<optimized out>, first_property_name=0x56455b925c20 "filename", var_args=var_args@entry=0x7ffe702b1140, cancellable=cancellable@entry=0x0, error=error@entry=0x7ffe702b1248) at ../gio/ginitable.c:250
#6 0x00007f714486bcad in g_initable_new
(object_type=<optimized out>, cancellable=cancellable@entry=0x0, error=error@entry=0x7ffe702b1248, first_property_name=first_property_name@entry=0x56455b925c20 "filename")
at ../gio/ginitable.c:162
#7 0x000056455b8e69f6 in nm_vpn_plugin_info_new_from_file
(filename=filename@entry=0x56455c951ec0 "/opt/test/lib/NetworkManager/VPN/nm-openvpn-service.name", error=error@entry=0x7ffe702b1248) at src/libnm-core-impl/nm-vpn-plugin-info.c:1221
#8 0x000056455b88ce9a in vpn_dir_changed
(monitor=monitor@entry=0x7f7128007860 [GInotifyFileMonitor], file=file@entry=0x7f712c005600, other_file=other_file@entry=0x0, event_type=<optimized out>, user_data=<optimized out>)
at src/core/vpn/nm-vpn-manager.c:182
#9 0x00007f71448697a3 in _g_cclosure_marshal_VOID__OBJECT_OBJECT_ENUMv
(closure=0x56455c7e4250, return_value=<optimized out>, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x56455c7355a0) at ../gio/gmarshal-internal.c:1380
Fixes: d6226bd987 ('libnm: add NMVpnPluginInfo class')
(cherry picked from commit 841c45a4f5)
We use the cleanup attribute heavily. It's useful for deferring
deallocation. For example, we have code like:
gs_unref_object NMBluezManager *self_keep_alive = g_object_ref(self);
where we don't use the variable otherwise, except for owning (and
freeing) the reference. This already led to a compiler warning about
unused variable, which we would work around with
_nm_unused gs_unref_object NMBluezManager *self_keep_alive = g_object_ref(self);
With clang 13.0.0~rc1-1.fc35, this got worse. Now for example also
static inline void
nm_strvarray_set_strv(GArray **array, const char *const *strv)
{
gs_unref_array GArray *array_old = NULL;
array_old = g_steal_pointer(array);
if (!strv || !strv[0])
return;
nm_strvarray_ensure(array);
for (; strv[0]; strv++)
nm_strvarray_add(*array, strv[0]);
}
leads to a warning
./src/libnm-glib-aux/nm-shared-utils.h:3078:28: error: variable array_old set but not used [-Werror,-Wunused-but-set-variable]
gs_unref_array GArray *array_old = NULL;
^
This is really annoying. We don't want to plaster our code with _nm_unused,
because that might hide actual issues. But we also want to keep using this
pattern and need to avoid the warning.
A problem is also that GCC usually does not warn about truly unused
variables with cleanup attribute. Clang was very useful here to flag
such variables. But now clang warns about cases which are no bugs, which
is a problem. So this does lose some useful warnings. On the other hand,
a truly unused variable (with cleanup attribute) is ugly, but not an actual
problem.
Now, with clang 13, automatically mark nm_auto() variables as _nm_unused
as a workaround.
(cherry picked from commit 1c85bc5ead)
These containers are ancient. Also, when we update ci-templates
they will no longer build (because then a different container hub
will be used, which doesn't contain those images). Drop them.
(cherry picked from commit 82b72a7379)
Since Fedora 25, vala-tools was merged with the "vala" package. And on
rawhide (f36) it's gone completely and leads to a failure of the script.
Drop it.
(cherry picked from commit 53562b1915)
The formatting produced by clang-format depends on the version of the
tool. The version that we use is the one of the current Fedora release.
Fedora 34 recently updated clang (and clang-tools-extra) from version
12.0.0 to 12.0.1. This brings some changes.
Update the formatting.
(cherry picked from commit 10e0c4261e)
The nm_ip_address_dup() and nm_ip_route_dup() symbols were exposed in
libnm 1.32 and then backported to 1.30.8.
Export them also with version @libnm_1_30_8; this allows a program built
against libnm 1.30.8 to keep working with later versions of the library.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
(cherry picked from commit ec8df200f6)
The nm_setting_ip_config_get_required_timeout() symbol was introduced
in libnm 1.32.4 and then backported to 1.30.8.
Export it also with version @libnm_1_30_8; this allows a program built
against libnm 1.30.8 to keep working with later versions of the
library.
(cherry picked from commit 57c1982867)
Background
==========
Imagine you run a container on your machine. Then the routing table
might look like:
default via 10.0.10.1 dev eth0 proto dhcp metric 100
10.0.10.0/28 dev eth0 proto kernel scope link src 10.0.10.5 metric 100
[...]
10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink
10.42.1.2 dev cali02ad7e68ce1 scope link
10.42.1.3 dev cali8fcecf5aaff scope link
10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
10.42.3.0/24 via 10.42.3.0 dev flannel.1 onlink
That is, there are other interfaces with subnets and specific routes.
If nm-cloud-setup now configures rules:
0: from all lookup local
30400: from 10.0.10.5 lookup 30400
32766: from all lookup main
32767: from all lookup default
and
default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
10.0.10.1 dev eth0 table 30400 proto static scope link metric 10
then these other subnets will also be reached via the default route.
This container example is just one case where this is a problem. In
general, if you have specific routes on another interface, then the
default route in the 30400+ table will interfere badly.
The idea of nm-cloud-setup is to automatically configure the network for
secondary IP addresses. When the user has special requirements, then
they should disable nm-cloud-setup and configure whatever they want.
But the container use case is popular and important. It is not something
where the user actively configures the network. This case needs to work better,
out of the box. In general, nm-cloud-setup should work better with the
existing network configuration.
Change
======
Add new routing tables 30200+ with the individual subnets of the
interface:
10.0.10.0/24 dev eth0 table 30200 proto static metric 10
[...]
default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
10.0.10.1 dev eth0 table 30400 proto static scope link metric 10
Also add more important routing rules with priority 30200+, which select
these tables based on the source address:
30200: from 10.0.10.5 lookup 30200
These will do source based routing for the subnets on these
interfaces.
Then, add a rule with priority 30350
30350: lookup main suppress_prefixlength 0
which processes the routes from the main table, but ignores the default
routes. 30350 was chosen, because it's in between the rules 30200+ and
30400+, leaving a range for the user to configure their own rules.
Then, as before, the rules 30400+ again look at the corresponding 30400+
table, to find a default route.
Finally, process the main table again, this time honoring the default
route. That is for packets that have a different source address.
This change means that the source based routing is used for the
subnets that are configured on the interface and for the default route.
Whereas, if there are any more specific routes in the main table, they will
be preferred over the default route.
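The resulting lookup order can be checked with a small simulation in Python. This is a simplified model of policy routing (no metrics, no local table, made-up `lookup` helper and data layout), using the example rules and routes from above.

```python
import ipaddress


def lookup(rules, tables, src, dst):
    """Walk rules by ascending priority; return (table, prefix, dev) or None."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for _prio, rule in sorted(rules.items()):
        if rule.get("from") and src_ip != ipaddress.ip_address(rule["from"]):
            continue
        matches = [
            (ipaddress.ip_network(prefix), dev)
            for prefix, dev in tables.get(rule["table"], [])
            if dst_ip in ipaddress.ip_network(prefix)
        ]
        # suppress_prefixlength N: ignore results with prefix length <= N.
        suppress = rule.get("suppress_prefixlength")
        if suppress is not None:
            matches = [m for m in matches if m[0].prefixlen > suppress]
        if matches:
            net, dev = max(matches, key=lambda m: m[0].prefixlen)
            return rule["table"], str(net), dev
    return None


rules = {
    30200: {"from": "10.0.10.5", "table": "30200"},
    30350: {"table": "main", "suppress_prefixlength": 0},
    30400: {"from": "10.0.10.5", "table": "30400"},
    32766: {"table": "main"},
}
tables = {
    "30200": [("10.0.10.0/24", "eth0")],
    "30400": [("0.0.0.0/0", "eth0"), ("10.0.10.1/32", "eth0")],
    "main": [("0.0.0.0/0", "eth0"), ("10.0.10.0/28", "eth0"),
             ("10.42.0.0/24", "flannel.1"),
             ("10.42.1.2/32", "cali02ad7e68ce1")],
}

to_container = lookup(rules, tables, "10.0.10.5", "10.42.1.2")
to_internet = lookup(rules, tables, "10.0.10.5", "8.8.8.8")
other_source = lookup(rules, tables, "10.0.10.9", "8.8.8.8")
```

In this model, traffic from 10.0.10.5 to a container address is resolved by the specific route in the main table, while traffic to an external address falls through to the default route in table 30400.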
Apparently Amazon Linux solves this differently, by not configuring a
routing table for addresses on interface "eth0". That might be an
alternative, but it's not clear to me what is special about eth0 to
warrant this treatment. It also would imply that we somehow recognize
this primary interface. In practice that would be doable by selecting
the interface with "iface_idx" zero.
Instead choose this approach. This is remotely similar to what WireGuard does
for configuring the default route ([1]), however WireGuard uses fwmark to match
the packets instead of the source address.
[1] https://www.wireguard.com/netns/#improved-rule-based-routing
(cherry picked from commit fe80b2d1ec)
The table number is chosen as 30400 + iface_idx. That is, the range is
limited and we shouldn't handle more than 100 devices. Add a check for
that and error out.
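A minimal sketch of such a check in Python. The function and constant names are invented here; only the base value 30400 and the limit of 100 devices come from the text above.

```python
TABLE_BASE = 30400
MAX_IFACES = 100  # the limit mentioned above


def table_for_iface(iface_idx):
    """Return the routing table number, or error out if out of range."""
    if not 0 <= iface_idx < MAX_IFACES:
        raise ValueError(f"iface_idx {iface_idx} is out of range")
    return TABLE_BASE + iface_idx
```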
(cherry picked from commit b68d694b78)
The routes/rules that are configured are independent of the
order in which we process the devices. That is because they
use the "iface_idx" for cases where there is ambiguity.
Still, it feels nicer to always process them in a defined order.
(cherry picked from commit a95ea0eb29)
Sorted by iface_idx. The iface_idx is probably something useful and
stable, provided by the provider. E.g. it's the order in which
interfaces are exposed on the meta data.
(cherry picked from commit 1c5cb9d3c2)
get-config() gives a NMCSProviderGetConfigResult structure, and the
main part of data is the GHashTable of MAC addresses and
NMCSProviderGetConfigIfaceData instances.
Let NMCSProviderGetConfigIfaceData also have a reference to the MAC
address. This way, I'll be able to create a (sorted) list of interface
datas, that also contain the MAC address.
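The idea can be sketched in Python like this. The `IfaceData` class and its field names are simplified stand-ins for NMCSProviderGetConfigIfaceData, not its actual layout.

```python
from dataclasses import dataclass


@dataclass
class IfaceData:
    hwaddr: str     # back-reference to the MAC address (the hash table key)
    iface_idx: int  # index provided by the cloud provider


def sorted_iface_datas(by_mac):
    """Turn the MAC-keyed table into a list sorted by iface_idx."""
    return sorted(by_mac.values(), key=lambda d: d.iface_idx)


by_mac = {
    "52:54:00:00:00:02": IfaceData("52:54:00:00:00:02", 1),
    "52:54:00:00:00:01": IfaceData("52:54:00:00:00:01", 0),
}
ordered = sorted_iface_datas(by_mac)
```

Because each entry carries its own MAC address, the sorted list can be used without going back to the hash table.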
(cherry picked from commit ec56fe60fb)
nm-cloud-setup automatically configures the network. That may conflict
with what the user wants. In case the user configures some specific
setup, they are encouraged to disable nm-cloud-setup (and its
automatism).
Still, what we do by default matters, and should play well with the
user's expectations. Configuring policy routing and a higher priority
table (30400+) that hijacks the traffic can cause problems.
If the system only has one IPv4 address and one interface, then there
is no point in configuring policy routing at all. Detect that, and skip
the change in that case.
Note that of course we need to handle the case where previously multiple
IP addresses were configured and an update gives only one address. In
that case we need to clear the previously configured rules/routes. The
patch achieves this.
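The decision could be modeled roughly as follows in Python. The helper names, the input layout and the way the table index is derived are assumptions of this sketch and do not reflect the actual nm-cloud-setup code.

```python
def wants_policy_routing(iface_addrs):
    """iface_addrs maps an interface name to its list of IPv4 addresses."""
    n_ifaces = len(iface_addrs)
    n_addrs = sum(len(addrs) for addrs in iface_addrs.values())
    return not (n_ifaces == 1 and n_addrs == 1)


def desired_rules(iface_addrs):
    """Return the full set of rules that should exist after an update."""
    if not wants_policy_routing(iface_addrs):
        return []  # an empty set also clears previously configured rules
    return [
        f"from {addr} lookup {30400 + idx}"
        # enumerate() is a hypothetical stand-in for the real iface_idx
        for idx, name in enumerate(sorted(iface_addrs))
        for addr in iface_addrs[name]
    ]
```

Treating the result as the complete desired state is what makes the transition from several addresses to a single one clean up the old rules.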
(cherry picked from commit 5f047968d7)
Now that we return a struct from get_config(), we can have system-wide
properties returned.
Let it count and cache the number of valid iface_datas.
Currently that is not yet used, but it will be.
(cherry picked from commit a3cd66d3fa)