Commit graph

73 commits

Author SHA1 Message Date
Thomas Haller
e57b3c8072
cloud-setup: log a warning when no provider is detected
When using nm-cloud-setup (and enabling any providers), then we expect
to also detect a provider. Otherwise, the user is running in an
environment where none of the provider exists (in which case they should
disable nm-cloud-setup) or there is an unexpected failure. In either
case, that's worth a warning message.

https://bugzilla.redhat.com/show_bug.cgi?id=2214880

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1659
2023-06-14 16:24:21 +02:00
Thomas Haller
c70a5470be
cloud-setup: clear error variable in nmcs_device_reapply()
This is rather bad, because if we reach the "goto again" case,
the error variable is not cleared. Subsequently passing the
error location to nm_device_reapply_finish() will trigger a glib
warning.

Fixes: 29b0420be7 ('nm-cloud-setup: set preserve-external-ip flag during reapply')
2023-06-12 10:38:00 +02:00
Thomas Haller
dab114f038
cloud-setup: fix terminating in the middle of reconfiguring the system
Once we start reconfiguring the system, we need to finish on all
interfaces. Otherwise, we might reconfigure some interfaces, abort
and leave the network broken. When that happens, a subsequent run
might also be unable to recover, because we are unable to reach the
HTTP meta data service.

https://bugzilla.redhat.com/show_bug.cgi?id=2207812

Fixes: 69f048bf0c ('cloud-setup: add tool for automatic IP configuration in cloud')
2023-06-12 10:37:53 +02:00
Wen Liang
f04a9eb098 cloud-setup: add pre-up event to prevent reaching network-online.target
network-online.target should not be reached before nm-cloud-setup
completes configuring the network, which may make user service get
started before the network is fully configured.

Setting nm-cloud-setup.service as "Before=network-online.target" would
maybe have already achieved that. However, also use a pre-up dispatcher
script, so that the device activation in NetworkManager is also waiting
for nm-cloud-setup to complete.

https://bugzilla.redhat.com/show_bug.cgi?id=2151040
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1653
2023-06-09 09:18:20 -04:00
Thomas Haller
a34ce19be7
cloud-setup: log about cancellation of provider detection
Otherwise, the logging output is confusing as it's not clear what happened about
the provider detection.

Also, don't return from _provider_detect() unless all pending callbacks
returned. It is ugly to leave callbacks pending, because they will be
later dispatched when the caller iterates the GMainContext further.
Instead, cancel pending operations bug keep running until we processed
all cancellation callbacks.
2023-05-22 17:26:19 +02:00
Thomas Haller
bc836c3991
cloud-setup: debug log when URL is mocked 2023-05-22 17:26:18 +02:00
Thomas Haller
4ef944d504
cloud-setup: move implementation of getters for base URL to macro
It's the same code, four times. And once it was messed up.
Move the definition of the function to a macro.
2023-05-22 17:26:18 +02:00
Thomas Haller
2cc54b2bb1
cloud-setup: allow mapping device for testing
In production systems, the associate an interface by their permanate MAC address.
For testing and CI, we may want to hook that. That allows for example to run
nm-cloud-setup on a veth interface, which doesn't have a permanent MAC address.

This is only for testing.
2023-05-22 17:26:18 +02:00
Thomas Haller
eaa2188119
cloud-setup: refactor handling of environment variables
Previously, there was NMCS_ENV_VARIABLE() macro. That macro did nothing,
it merely acted as something to grep for, when searching the source for
which environment variables nm-cloud-setup honors. That is an
interesting thing to know, because nm-cloud-setup is configured via
environment variables.

Change that. Instead add a define for each environment variable. You can
now instead grep for "NMCS_ENV_" to find them all.
2023-05-22 17:26:18 +02:00
Thomas Haller
62104477c3
cloud-setup: fix _aliyun_base() initializing base URL
Fixes: f3404435a9 ('cloud-setup: configure secondary ip in Aliyun cloud')
2023-05-22 17:26:18 +02:00
Lubomir Rintel
79f6a7da56
cloud-setup/gcp: add ability to redirect metadata API requests
A different host can be specified with (undocumented, private)
NM_CLOUD_SETUP_GCP_HOST environment variable.
2023-05-12 12:42:55 +02:00
Lubomir Rintel
515e69df3a
cloud-setup/azure: add ability to redirect metadata API requests
A different host can be specified with (undocumented, private)
NM_CLOUD_SETUP_AZURE_HOST environment variable.
2023-05-12 12:42:55 +02:00
Thomas Haller
0888ed93f7
cloud-init: fix leaking iproutes for GCP provider
The routes in iproutes were leaked (and ownership stolen
in _nmc_mangle_connection(), leaving dangling pointers).

Fix that by using a GPtrArray instead.
2023-05-12 12:42:54 +02:00
Corentin Noël
5d28a0dd89
doc: replace all (allow-none) annotations by (optional) and/or (nullable)
The (allow-none) annotation is deprecated since a long time now, it is better to
use (nullable) and/or (optional) which clarifies what it means with the (out)
annotation.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1551
2023-03-27 11:49:43 +02:00
Lubomir Rintel
f07da04cd9 cloud-setup: actually pass the HTTP method in nm_http_client_poll_req()
https://bugzilla.redhat.com/show_bug.cgi?id=2179718

Fixes: 8b7e12c2d6 ('cloud-setup/ec2: start with requesting a IMDSv2 token')
Fixes: cd74d75002 ('cloud-setup: make nm_http_client_req() accept a method argument')
2023-03-21 23:35:42 +01:00
Thomas Haller
5a564ca17c
glib-aux,cloud-setup: move nm{cs,}_utils_poll() to libnm-glib-aux
The idea is to have useful and correct helper functions, that are
generic and reusable.

nmcs_utils_poll() was done with that intent, and it is a generally
useful function. As the implementation shows, it's not entirely trivial
to get all the parameters right, when it comes to glib-integration
(GMainContext and GTask) and polling.

Whether something like a generic poll helper is a useful thing at all,
may be a valid question. E.g. you need several hooks, and the usage is
still not trivial. Regardless of that, "nm-cloud-setup-utils.c" already
had such a helper. This is only about moving it to a place where it is
actually accessible to others.

And, if it turns out to be a good idea after all, then somebody else
could use it.
2023-03-13 17:13:01 +01:00
Thomas Haller
90f7d7d77b
cloud-setup: add register-object callback to nmcs_utils_poll()
nmcs_utils_poll() calls nmcs_wait_for_objects_register(), which is
specific to nm-cloud-setup.

nmcs_utils_poll() should move to libnm-glib-aux, so it should not have a
direct dependency on nm-cloud-setup code. Add instead a hook that the
caller can use for registering the object.
2023-03-13 17:13:01 +01:00
Thomas Haller
1ee35cb638
cloud-setup/trivial: fix gtkdoc comment for nmcs_utils_poll()
@probe_start_fcn is not called the first time synchronously. Fix the comment.
While at it, reword a bit.
2023-03-13 17:13:01 +01:00
Thomas Haller
841e06be5e
cloud-setup: don't pass separate user-data when polling in nm_http_client_poll_req()
nmcs_utils_poll() accepts two different user-data. One is passed to the
probe callbacks (and returned by nmcs_utils_poll_finish()). The other
one is passed to the callback.

Having separate user data might be useful. It's not useful for
nm_http_client_poll_req(), which both passes the same.

It's confusing to pass the same data as different user-data. Don't do
that. Use only one way.
2023-03-13 17:12:54 +01:00
Lubomir Rintel
8b7e12c2d6 cloud-setup/ec2: start with requesting a IMDSv2 token
The present version of the EC2 metadata API (IMDSv2) requires a header
with a token to be present in all requests. The token is essentially a
cookie that's not actually a cookie that's obtained with a PUT call that
doesn't put anything. Apparently it's too easy to trick someone into
calling a GET method.

EC2 now supports IMDSv2 everywhere with IMDSv1 being optional, so let's
just use IMDSv2 unconditionally. Also, the presence of a token API can
be used to detect the AWS EC2 cloud.

https://bugzilla.redhat.com/show_bug.cgi?id=2151986
2023-03-07 13:54:08 +01:00
Lubomir Rintel
088bfd817a cloud-setup: document detect() and get_config() methods
Clarify that detect() needs to succeed before get_config().

I thought it's sort of common sense, but it's better to be explicit as
we're going to rely on that.
2023-03-07 13:54:03 +01:00
Lubomir Rintel
cd74d75002 cloud-setup: make nm_http_client_req() accept a method argument
We'll need to be able to issue PUT calls.
2023-03-07 13:54:03 +01:00
Lubomir Rintel
85ce088616 cloud-setup: rename get/Get identifiers to req and Req
We're going to extend those to issue methods other than GET.
Also, "request" would've been too long, "req" looks nicer.
2023-03-07 13:54:03 +01:00
Lubomir Rintel
ce225b2c06 cloud_setup: unexport nm_http_client_get()
It's not used anywhere.
2023-03-07 13:54:03 +01:00
Thomas Haller
599fe234ea
cloud-setup: use nm_strv_dup_packed() in nm_http_client_poll_get()
No need to do a deep clone. The strv array is not ever modified and we
pack it together in one memory allocation.
2023-02-27 14:01:04 +01:00
Thomas Haller
dabfea2fc2
curl: use CURLOPT_PROTOCOLS_STR instead of deprecated CURLOPT_PROTOCOLS
CURLOPT_PROTOCOLS [0] was deprecated in libcurl 7.85.0 with
CURLOPT_PROTOCOLS_STR [1] as a replacement.

Well, technically it was only deprecated in 7.87.0, and retroactively
marked as deprecated since 7.85.0 [2]. But CURLOPT_PROTOCOLS_STR exists
since 7.85.0, so that's what we want to use.

This causes compiler warnings and build errors:

  ../src/core/nm-connectivity.c: In function 'do_curl_request':
  ../src/core/nm-connectivity.c:770:5: error: 'CURLOPT_PROTOCOLS' is deprecated: since 7.85.0. Use CURLOPT_PROTOCOLS_STR [-Werror=deprecated-declarations]
    770 |     curl_easy_setopt(ehandle, CURLOPT_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);
        |     ^~~~~~~~~~~~~~~~
  In file included from ../src/core/nm-connectivity.c:13:
  /usr/include/curl/curl.h:1749:3: note: declared here
   1749 |   CURLOPTDEPRECATED(CURLOPT_PROTOCOLS, CURLOPTTYPE_LONG, 181,
        |   ^~~~~~~~~~~~~~~~~

This patch is largely taken from systemd patch [2].

Based-on-patch-by: Frantisek Sumsal <frantisek@sumsal.cz>

[0] https://curl.se/libcurl/c/CURLOPT_PROTOCOLS.html
[1] https://curl.se/libcurl/c/CURLOPT_PROTOCOLS_STR.html
[2] 6967571bf2
[3] e61a4c0b7c

Fixes: 7a1734926a ('connectivity,cloud-setup: restrict curl protocols to HTTP and HTTPS')
2023-01-18 20:21:52 +01:00
Thomas Haller
0b1177cb18
all: use _NM_G_TYPE_CHECK_INSTANCE_CAST() for internal uses
G_TYPE_CHECK_INSTANCE_CAST() can trigger a "-Wcast-align":

    src/core/devices/nm-device-macvlan.c: In function 'parent_changed_notify':
    /usr/include/glib-2.0/gobject/gtype.h:2421:42: error: cast increases required alignment of target type [-Werror=cast-align]
     2421 | #  define _G_TYPE_CIC(ip, gt, ct)       ((ct*) ip)
          |                                          ^
    /usr/include/glib-2.0/gobject/gtype.h:501:66: note: in expansion of macro '_G_TYPE_CIC'
      501 | #define G_TYPE_CHECK_INSTANCE_CAST(instance, g_type, c_type)    (_G_TYPE_CIC ((instance), (g_type), c_type))
          |                                                                  ^~~~~~~~~~~
    src/core/devices/nm-device-macvlan.h:13:6: note: in expansion of macro 'G_TYPE_CHECK_INSTANCE_CAST'
       13 |     (G_TYPE_CHECK_INSTANCE_CAST((obj), NM_TYPE_DEVICE_MACVLAN, NMDeviceMacvlan))
          |      ^~~~~~~~~~~~~~~~~~~~~~~~~~

Avoid that by using _NM_G_TYPE_CHECK_INSTANCE_CAST().

This can only be done for our internal usages. The public headers
of libnm are not changed.
2022-12-16 10:55:03 +01:00
Thomas Haller
911b550140
nm-cloud-setup: simplify clearing variables in retry loop
The label "try_again" is only reached by one goto. So it was correct
and sufficient to reset the state only there.

It is still error prone. The slighlty clearer approach is to clear
the state at each begin of the "try_again" step.

There should be no change in behavior.

I didn't confirm, but an optimizing compiler should (could) be able
to see that the cleanup is only necessary on retry, and generate the
same code as before. In any case, we should write code that is easier
to read, not optimize for something that a compiler should be able to
optimize itself.
2022-12-14 17:33:57 +01:00
Thomas Haller
bbd32fba15
nm-cloud-setup: refactor skipping reapply be checking for skip first
There should be no change in behavior, but this way seems nicer.
Now _nmc_mangle_connection() doesn't return FALSE, it always
will try to mangle the connection and requires the caller to
first check whether that is appropriate.

Just move some code outside of _nmc_mangle_connection() and let
the caller check for the skip first.

The point is consistency, as the caller already does some checks to
whether skip the reapply. So it should do all the checks, so that
"mangle" never fails/skips.
2022-12-14 17:33:56 +01:00
Thomas Haller
29b0420be7
nm-cloud-setup: set preserve-external-ip flag during reapply
Externally added IP addresses/routes should be preserved by
nm-cloud-setup. This allows other tools to also configure the interface
and the Reapply() call from nm-cloud-setup would not interfere
with those tools.

https://bugzilla.redhat.com/show_bug.cgi?id=2132754
2022-12-14 17:33:56 +01:00
Thomas Haller
6b74f3cc14
cloud-setup,glib-aux: use NULL instead of g_direct_equal() for hash tables 2022-08-31 09:47:48 +02:00
Thomas Haller
08eff4c46e
glib-aux: rename IP address related helpers from "nm-inet-utils.h"
- name things related to `in_addr_t`, `struct in6_addr`, `NMIPAddr` as
  `nm_ip4_addr_*()`, `nm_ip6_addr_*()`, `nm_ip_addr_*()`, respectively.

- we have a wrapper `nm_inet_ntop()` for `inet_ntop()`. This name
  of our wrapper is chosen to be familiar with the libc underlying
  function. With this, also name functions that are about string
  representations of addresses `nm_inet_*()`, `nm_inet4_*()`,
  `nm_inet6_*()`. For example, `nm_inet_parse_str()`,
  `nm_inet_is_normalized()`.

<<<<

  R() {
     git grep -l "$1" | xargs sed -i "s/\<$1\>/$2/g"
  }

  R NM_CMP_DIRECT_IN4ADDR_SAME_PREFIX          NM_CMP_DIRECT_IP4_ADDR_SAME_PREFIX
  R NM_CMP_DIRECT_IN6ADDR_SAME_PREFIX          NM_CMP_DIRECT_IP6_ADDR_SAME_PREFIX
  R NM_UTILS_INET_ADDRSTRLEN                   NM_INET_ADDRSTRLEN
  R _nm_utils_inet4_ntop                       nm_inet4_ntop
  R _nm_utils_inet6_ntop                       nm_inet6_ntop
  R _nm_utils_ip4_get_default_prefix           nm_ip4_addr_get_default_prefix
  R _nm_utils_ip4_get_default_prefix0          nm_ip4_addr_get_default_prefix0
  R _nm_utils_ip4_netmask_to_prefix            nm_ip4_addr_netmask_to_prefix
  R _nm_utils_ip4_prefix_to_netmask            nm_ip4_addr_netmask_from_prefix
  R nm_utils_inet4_ntop_dup                    nm_inet4_ntop_dup
  R nm_utils_inet6_ntop_dup                    nm_inet6_ntop_dup
  R nm_utils_inet_ntop                         nm_inet_ntop
  R nm_utils_inet_ntop_dup                     nm_inet_ntop_dup
  R nm_utils_ip4_address_clear_host_address    nm_ip4_addr_clear_host_address
  R nm_utils_ip4_address_is_link_local         nm_ip4_addr_is_link_local
  R nm_utils_ip4_address_is_loopback           nm_ip4_addr_is_loopback
  R nm_utils_ip4_address_is_zeronet            nm_ip4_addr_is_zeronet
  R nm_utils_ip4_address_same_prefix           nm_ip4_addr_same_prefix
  R nm_utils_ip4_address_same_prefix_cmp       nm_ip4_addr_same_prefix_cmp
  R nm_utils_ip6_address_clear_host_address    nm_ip6_addr_clear_host_address
  R nm_utils_ip6_address_same_prefix           nm_ip6_addr_same_prefix
  R nm_utils_ip6_address_same_prefix_cmp       nm_ip6_addr_same_prefix_cmp
  R nm_utils_ip6_is_ula                        nm_ip6_addr_is_ula
  R nm_utils_ip_address_same_prefix            nm_ip_addr_same_prefix
  R nm_utils_ip_address_same_prefix_cmp        nm_ip_addr_same_prefix_cmp
  R nm_utils_ip_is_site_local                  nm_ip_addr_is_site_local
  R nm_utils_ipaddr_is_normalized              nm_inet_is_normalized
  R nm_utils_ipaddr_is_valid                   nm_inet_is_valid
  R nm_utils_ipx_address_clear_host_address    nm_ip_addr_clear_host_address
  R nm_utils_parse_inaddr                      nm_inet_parse_str
  R nm_utils_parse_inaddr_bin                  nm_inet_parse_bin
  R nm_utils_parse_inaddr_bin_full             nm_inet_parse_bin_full
  R nm_utils_parse_inaddr_prefix               nm_inet_parse_with_prefix_str
  R nm_utils_parse_inaddr_prefix_bin           nm_inet_parse_with_prefix_bin
  R test_nm_utils_ip6_address_same_prefix      test_nm_ip_addr_same_prefix

  ./contrib/scripts/nm-code-format.sh -F
2022-08-25 19:05:51 +02:00
Thomas Haller
863b71a8fe
all: use internal _nm_utils_ip4_netmask_to_prefix()
We have two variants of the function: nm_utils_ip4_netmask_to_prefix()
and _nm_utils_ip4_netmask_to_prefix(). The former only exists because it
is public API in libnm. Internally, only use the latter.
2022-06-27 10:50:24 +02:00
Thomas Haller
a1d9fce147
cloud-setup: only pass config-iface-data as user-data for async functions
From config_iface_data->get_config_data we have access to the other pointer
already. No need to allocate a user data.

(cherry picked from commit 7d71aff247)
2022-05-05 08:37:29 +02:00
Thomas Haller
16ba58cfb1
cloud-setup: use union for NMCSProviderGetConfigIfaceData.priv
Use a union, it makes more sense.

Note that with union, C's struct initialization might not sufficiently
set all fields to the default. In practice yes, but theoretically in C
a NULL pointer and floats must not have all zero bits, so the following
is not guaranteed to work:

    struct {
        int some_field;
        union {
             void *v_ptr;
             int v_int;
        };
    } variable = {
        .some_field = 24,
    };

    assert(variable.union.v_ptr == 0);
    assert(variable.union.v_int == 0);

When initializing the variable, we should not rely on automatically
initialize all union members correctly. It cannot at the same time
set NULL pointers and zero integers -- well, on our architectures it
probably can, but not as far as guaranteed by C language.
We need to know which union field we are going to use and initialize
it explicitly.
As we know the provider type, we can do that.

Also, maybe in the future we need special free/unref calls when
destroying the type specific data in NMCSProviderGetConfigIfaceData.
As we know the provider, we can.

Note that having type specific data in NMCSProviderGetConfigIfaceData.priv
is a layering violation. But it is still simpler than implementing
type specific handlers (callbacks) or tracking the data somewhere else.
After all, we know at compile time all the existing provider types.

(cherry picked from commit 1e696c7e93)
2022-05-05 08:37:29 +02:00
Thomas Haller
061c05ca39
cloud-setup: track config-task-data in iface-data
Let NMCSProviderGetConfigIfaceData.get_config_data have a pointer to the
NMCSProviderGetConfigTaskData. This will allow two things:

- at several places we pass on `nm_utils_user_data_pack(get_config_data,
  config_iface_data)` as user data. We can avoid that, by just letting
  config_iface_data have a pointer to get_config_data.

- NMCSProviderGetConfigIfaceData contains a provider specific field
  "priv". That may also require special initialization or destruction,
  depending on the type. We thus need access to the provider type,
  which we have via iface_data->get_config_data->self.

Also let NMCSProviderGetConfigTaskData have a pointer "self" to the
NMCSProvider. While there was already the "task", which contains the
provider as source-object, this is more convenient.

(cherry picked from commit 069946cda1)
2022-05-05 08:37:29 +02:00
Thomas Haller
3a9ec3c5a3
cloud-setup: reorder addresses to honor "primary_ip_address"
The order of IPv4 addresses matters, in particular if they are in
the same subnet. Kernel will mark all but the first one as "secondary".
In NetworkManager's ipv4.addresses, the first address is the primary.

It seems that on aliyun cloud, "private-ipv4s" URL may give the
addresses in arbitrary order. The primary can be fetched from
"primary-ip-address".

Fix that by also fetching "primary-ip-address". Then, resort the array
so that the primary is the first one in the list.

https://bugzilla.redhat.com/show_bug.cgi?id=2079849
(cherry picked from commit 191baf84e2)
2022-05-05 08:37:29 +02:00
Thomas Haller
216c46c881
all: prefer nm wrappers to automatically attach GSource to default context
We often create the source with default priority, no destroy function and
attach it to the default context (g_main_context_default()). For that
case, we have wrapper functions like nm_g_timeout_add_source()
and nm_g_idle_add_source(). Use those.

There should be no change in behavior.
2022-03-13 11:59:42 +01:00
Thomas Haller
9b030a3988
all: change scheduling priority for idle actions to G_PRIORITY_DEFAULT_IDLE
g_idle_add() uses G_PRIORITY_DEFAULT_IDLE priority. Most of the time we don't
care much about the priority.

But at the places that this patch changes, I think that using
G_PRIORITY_DEFAULT_IDLE (and following g_idle_add()) is more correct. The
reason for this is not very strong, except that it's probably the better
choice. And the old choice was made because I didn't realize that
g_idle_add() uses another default priority. Hence, the old choice was not
for good reasons either.
2022-03-13 11:59:42 +01:00
Thomas Haller
7a1734926a
connectivity,cloud-setup: restrict curl protocols to HTTP and HTTPS
See-also: https://fedoraproject.org/wiki/Changes/CurlMinimal_as_Default#Benefit_to_Fedora
See-also: 55b90ee00b

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1121
2022-02-24 09:37:58 +01:00
Thomas Haller
615221a99c format: reformat source tree with clang-format 13.0
We use clang-format for automatic formatting of our source files.
Since clang-format is actively maintained software, the actual
formatting depends on the used version of clang-format. That is
unfortunate and painful, but really unavoidable unless clang-format
would be strictly bug-compatible.

So the version that we must use is from the current Fedora release, which
is also tested by our gitlab-ci. Previously, we were using Fedora 34 with
clang-tools-extra-12.0.1-1.fc34.x86_64.

As Fedora 35 comes along, we need to update our formatting as Fedora 35
comes with version "13.0.0~rc1-1.fc35".
An alternative would be to freeze on version 12, but that has different
problems (like, it's cumbersome to rebuild clang 12 on Fedora 35 and it
would be cumbersome for our developers which are on Fedora 35 to use a
clang that they cannot easily install).

The (differently painful) solution is to reformat from time to time, as we
switch to a new Fedora (and thus clang) version.
Usually we would expect that such a reformatting brings minor changes.
But this time, the changes are huge. That is mentioned in the release
notes [1] as

  Makes PointerAligment: Right working with AlignConsecutiveDeclarations. (Fixes https://llvm.org/PR27353)

[1] https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html#clang-format
2021-11-29 09:31:09 +00:00
Thomas Haller
fe80b2d1ec
cloud-setup: use suppress_prefixlength rule to honor non-default-routes in the main table
Background
==========

Imagine you run a container on your machine. Then the routing table
might look like:

    default via 10.0.10.1 dev eth0 proto dhcp metric 100
    10.0.10.0/28 dev eth0 proto kernel scope link src 10.0.10.5 metric 100
    [...]
    10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink
    10.42.1.2 dev cali02ad7e68ce1 scope link
    10.42.1.3 dev cali8fcecf5aaff scope link
    10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
    10.42.3.0/24 via 10.42.3.0 dev flannel.1 onlink

That is, there are another interfaces with subnets and specific routes.

If nm-cloud-setup now configures rules:

    0:  from all lookup local
    30400:  from 10.0.10.5 lookup 30400
    32766:  from all lookup main
    32767:  from all lookup default

and

    default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
    10.0.10.1 dev eth0 table 30400 proto static scope link metric 10

then these other subnets will also be reached via the default route.

This container example is just one case where this is a problem. In
general, if you have specific routes on another interface, then the
default route in the 30400+ table will interfere badly.

The idea of nm-cloud-setup is to automatically configure the network for
secondary IP addresses. When the user has special requirements, then
they should disable nm-cloud-setup and configure whatever they want.
But the container use case is popular and important. It is not something
where the user actively configures the network. This case needs to work better,
out of the box. In general, nm-cloud-setup should work better with the
existing network configuration.

Change
======

Add new routing tables 30200+ with the individual subnets of the
interface:

    10.0.10.0/24 dev eth0 table 30200 proto static metric 10
    [...]
    default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
    10.0.10.1 dev eth0 table 30400 proto static scope link metric 10

Also add more important routing rules with priority 30200+, which select
these tables based on the source address:

    30200:  from 10.0.10.5 lookup 30200

These will do source based routing for the subnets on these
interfaces.

Then, add a rule with priority 30350

    30350:  lookup main suppress_prefixlength 0

which processes the routes from the main table, but ignores the default
routes. 30350 was chosen, because it's in between the rules 30200+ and
30400+, leaving a range for the user to configure their own rules.

Then, as before, the rules 30400+ again look at the corresponding 30400+
table, to find a default route.

Finally, process the main table again, this time honoring the default
route. That is for packets that have a different source address.

This change means that the source based routing is used for the
subnets that are configured on the interface and for the default route.
Whereas, if there are any more specific routes in the main table, they will
be preferred over the default route.

Apparently Amazon Linux solves this differently, by not configuring a
routing table for addresses on interface "eth0". That might be an
alternative, but it's not clear to me what is special about eth0 to
warrant this treatment. It also would imply that we somehow recognize
this primary interface. In practise that would be doable by selecting
the interface with "iface_idx" zero.

Instead choose this approach. This is remotely similar to what WireGuard does
for configuring the default route ([1]), however WireGuard uses fwmark to match
the packets instead of the source address.

[1] https://www.wireguard.com/netns/#improved-rule-based-routing
2021-09-16 17:30:25 +02:00
Thomas Haller
0978be5e43
cloud-setup: cleanup configuring addresses/routes/rules in _nmc_mangle_connection() 2021-09-16 15:51:03 +02:00
Thomas Haller
b68d694b78
cloud-setup: limit number of supported interfaces to avoid overlapping table numbers
The table number is chosen as 30400 + iface_idx. That is, the range is
limited and we shouldn't handle more than 100 devices. Add a check for
that and error out.
2021-09-16 15:51:03 +02:00
Thomas Haller
a95ea0eb29
cloud-setup: process iface-datas in sorted order
The routes/rules that are configured are independent of the
order in which we process the devices. That is, because they
use the "iface_idx" for cases where there is ambiguity.

Still, it feels nicer to always process them in a defined order.
2021-09-16 15:51:02 +02:00
Thomas Haller
1c5cb9d3c2
cloud-setup: track sorted list of NMCSProviderGetConfigIfaceData
Sorted by iface_idx. The iface_idx is probably something useful and
stable, provided by the provider. E.g. it's the order in which
interfaces are exposed on the meta data.
2021-09-16 15:51:02 +02:00
Thomas Haller
ec56fe60fb
cloud-setup: add "hwaddr" to NMCSProviderGetConfigIfaceData struct
get-config() gives a NMCSProviderGetConfigResult structure, and the
main part of data is the GHashTable of MAC addresses and
NMCSProviderGetConfigIfaceData instances.

Let NMCSProviderGetConfigIfaceData also have a reference to the MAC
address. This way, I'll be able to create a (sorted) list of interface
datas, that also contain the MAC address.
2021-09-16 15:51:02 +02:00
Thomas Haller
5f047968d7
cloud-setup: skip configuring policy routing if there is only one interface/address
nm-cloud-setup automatically configures the network. That may conflict
with what the user wants. In case the user configures some specific
setup, they are encouraged to disable nm-cloud-setup (and its
automatism).

Still, what we do by default matters, and should play as well with
user's expectations. Configuring policy routing and a higher priority
table (30400+) that hijacks the traffic can cause problems.

If the system only has one IPv4 address and one interface, then there
is no point in configuring policy routing at all. Detect that, and skip
the change in that case.

Note that of course we need to handle the case where previously multiple
IP addresses were configured and an update gives only one address. In
that case we need to clear the previously configured rules/routes. The
patch achieves this.
2021-09-16 15:51:02 +02:00
Thomas Haller
7969ae1a82
cloud-setup: count numbers of valid IPv4 addresses in get-config result
Will be used next.
2021-09-16 15:51:02 +02:00
Thomas Haller
a3cd66d3fa
cloud-setup: cache number of valid interfaces in get-config result
Now that we return a struct from get_config(), we can have system-wide
properties returned.

Let it count and cache the number of valid iface_datas.

Currently that is not yet used, but it will be.
2021-09-16 15:51:02 +02:00