NML3Cfg is stateful, that means it remembers which address/route it
configured earlier. That is important because the API users of NML3Cfg
only say what the want to configure now, and NML3Cfg needs to remove
addresses/routes that it configured earlier but are no longer to be
present. Also, NetworkManager wants to allow the user to add
addresses/routes externally with `ip addr|route add` and NetworkManager
not removing it. This is a common use case for dispatcher scripts, but
in general, we want to allow other components to add addresses/routes.
We try something similar with the removal of routes/addresses managed by
NetworkManager. When NetworkManager adds a route/address, which later
disappears, then we assume that the user intentionally removed the
address/route and take the hint to not re-add it.
However, it doesn't work. It is problematic for two reasons:
- kernel can automatically remove routes. For example, deleting an IPv4
address that is the prefsrc of a route, will cause kernel to delete
that route. Sure, we may be unable to re-configure the route at this
moment, but we shouldn't remember indefinitely that the route is
supposed to be absent. Rather, we should re-add it when possible.
- kernel is a pain with validating consistencies of routes. For example,
when a route has a nexthop gateway, then the gateway must be onlink
(directly reachable), or kernel refuses to add it with "Nexthop has
invalid gateway". Of course, when removing the onlink route kernel is
fine leaving the gateway route behind, which it would otherwise refuse
to add.
Anyway. Such interdependencies for when kernel rejects adding a route
with "Nexthop has invalid gateway" are non-trivial. We try to work
around that by always adding the necessary onlink routes. See
nm_l3_config_data_add_dependent_onlink_routes(). But if the user
externally removed the dependent onlink route, and NetworkManager
remembers to not re-adding it, then the efforts from
nm_l3_config_data_add_dependent_onlink_routes() are ignored. This
causes ripple effects and NetworkManager will also be unable to add the
nexthop route.
Trying to preserve absence of routes that NetworkManager would like to
configure is not tenable. Don't do it anymore. There was anyway no
guarantee that on the next update NetworkManager wouldn't try to re-add
the route in question. For example, if the route came from DHCP, and the
lease temporarily went away and came back, then NetworkManager probably
would have (correctly) forgotten that the user wished that the route be
absent. This did not work reliably and it just causes problems.
NMPrioq is taken from systemd's "prioq.c". It is a nice data structure,
that accepts and an index pointer, to directly access elements inside
the heap.
Previously, the API didn't require a consistent index, while the data is
not inside the heap. nm_prioq_{update,shuffle,remove}()) call find_item(),
which silently accepts wrong indexes and assumes the element is not in
the heap.
Keeping the index in sync with the data seems error prone. Accepting any
index without asserting may be convenient for the user (as the user is
not required to pre-initialize the index with NM_PRIOQ_IDX_NULL).
However, it also misses to catch potential bugs.
Now the index must be kept consistent, in particular also if the element
is not enqueued. This means, you must initialize them with
NM_PRIOQ_IDX_NULL.
NMPrioq is copied from systemd's "prioq.c". It uses the index pointer to
find elements in the heap directly. However, it's not very careful about
clearing or maintaining the index pointer if the element is not in the
queue. Like, find_item() accepts a bogus index and only finds the data
if the data and the index matches. That seems error prone.
In the first step, when we remove an item from the queue, ensure to
reset the index to NM_PRIOQ_IDX_NULL.
The nm_ip_addr_*() APIs are supposed to work with unaligned input,
so we could use them while parsing binary data (where the field
may not be properly aligned). For that reason, it used to first
copy the argument to a local (properly aligned) variable.
Rework that a bit, and use unaligned_read_ne32() for IPv4.
NMIPAddr is a union, so you always need to know the right addr_family.
Often we track the addr_family separately or know it implicitly. Then we
don't need to glue them together.
But also often we need to associate the address_family with the address
union. Add a struct that bundles the NMIPAddr with the addr_family,
making it a tagged union. The benefit is that we now also have one
implementation for equal, cmp and hash-update, instead of reimplementing
them.
For each connection that corresponds to a software device, we create a
"unrealized" device that then becomes realized just before the
connection starts activating. Currently, in certain conditions NM
creates two devices with the same name and type, one realized and one
not; this is not expected and can lead to other issues especially when
a software device is reactivated.
Avoid that by relaxing the check in system_create_virtual_device(): if
a device exists with the same name and type, we don't want to create
another even if the type-specific parameters differ.
The idea is to have useful and correct helper functions, that are
generic and reusable.
nmcs_utils_poll() was done with that intent, and it is a generally
useful function. As the implementation shows, it's not entirely trivial
to get all the parameters right, when it comes to glib-integration
(GMainContext and GTask) and polling.
Whether something like a generic poll helper is a useful thing at all,
may be a valid question. E.g. you need several hooks, and the usage is
still not trivial. Regardless of that, "nm-cloud-setup-utils.c" already
had such a helper. This is only about moving it to a place where it is
actually accessible to others.
And, if it turns out to be a good idea after all, then somebody else
could use it.
nmcs_utils_poll() calls nmcs_wait_for_objects_register(), which is
specific to nm-cloud-setup.
nmcs_utils_poll() should move to libnm-glib-aux, so it should not have a
direct dependency on nm-cloud-setup code. Add instead a hook that the
caller can use for registering the object.
nmcs_utils_poll() accepts two different user-data. One is passed to the
probe callbacks (and returned by nmcs_utils_poll_finish()). The other
one is passed to the callback.
Having separate user data might be useful. It's not useful for
nm_http_client_poll_req(), which both passes the same.
It's confusing to pass the same data as different user-data. Don't do
that. Use only one way.
In practice, this does not cause an issue, because NMEtherAddr quite
quite obviously only contains uint8 values and has alignment 1.
It's still ugly to case nm_ip_addr_zero to NMEtherAddr if they are
entirely unrelated types.
Fixes: 58e2ba0535 ('glib-aux: drop ethernet fields from NMIPAddr union')
It is slightly confusing to be required to be aware whether something is
a union or a struct. Hence, the union was wrapped in a struct.
However, we anyway almost always use the typedef NMIPAddr. The single
place where we forward declare the type, we can correctly use the union
specifier.
It's not really used anyway.
The idea was, that the ethernet MAC address would fit from the size, and
we might at a few places use that for convenience. But it's more
confusing. Also, because there is already NMEtherAddr and `struct
ether_addr`.
Having a list with only one element is often interesting to know. For
example, if you are about to unlink an element, you may want to check
whether afterwards the list is empty.
Add c_list_is_empty_or_single() for that. It is probably more efficient than
plain c_list_length_is(list, 1) and also a better name.
When creating a parent dependent device it can have software device as
parent without an ifindex. In that case, it will fail on an ssertion on
parent being missing.
In order to avoid this, we are handling the situation similar to what we
do for VLAN devices. NetworkManager will raise different error and block
the autoconnection instead of asserting.
This solves the assert error for the following commands:
```
nmcli connection add type macvlan ifname mv1 con-name mv1+ macvlan.parent dummy0 mode vepa
nmcli connection add type dummy ifname dummy0 con-name dummy0+ autoconnect no
```
There were a few places where we did already this but there was one
place where we missed it, in nm-manager.c:do_sleep_wake(). Therefore,
the device end in DISCONNECTED state and did not assume the connection.
When the state is DISCONNECTED is being set from a
configuring/configured state we might want to always DECONFIGURE the
interface (ifindex, ip addresses, routes..) except if the
sys-iface-state is REMOVED in that case we would like to remove it.
Apparently, the pager being able to execute commands takes some people
by surprpise, making their poor configuration choices have consequences.
Let's pray for some mercy on their souls with the LESSECURE variable,
which makes less less likely to conduct evil deeds.
Systemd also deals with this, but being systemd they make it as
complicated as possible. We just set it unconditionally, hoping nobody
wanted the extra functionality and they're in only for the scrolling.
If anyone minds they can just set LESSSECURE=0 and we'll leave it alone.
See also: SYSTEMD_PAGERSECURE in systemctl(1) manual.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1559
The present version of the EC2 metadata API (IMDSv2) requires a header
with a token to be present in all requests. The token is essentially a
cookie that's not actually a cookie that's obtained with a PUT call that
doesn't put anything. Apparently it's too easy to trick someone into
calling a GET method.
EC2 now supports IMDSv2 everywhere with IMDSv1 being optional, so let's
just use IMDSv2 unconditionally. Also, the presence of a token API can
be used to detect the AWS EC2 cloud.
https://bugzilla.redhat.com/show_bug.cgi?id=2151986
Clarify that detect() needs to succeed before get_config().
I thought it's sort of common sense, but it's better to be explicit as
we're going to rely on that.