In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.
That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.
Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).
Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
- previously, parsing wireguard genl data resulted in memory corruption:
- _wireguard_update_from_allowedips_nla() takes pointers to
allowedip = &g_array_index (buf->allowedips, NMWireGuardAllowedIP, buf->allowedips->len - 1);
but resizing the GArray will invalidate this pointer. This happens
when there are multiple allowed-ips to parse.
- there was some confusion who owned the allowedips pointers.
_wireguard_peers_cpy() and _vt_cmd_obj_dispose_lnk_wireguard()
assumed each peer owned their own chunk, but _wireguard_get_link_properties()
would not duplicate the memory properly.
- rework memory handling for allowed_ips. Now, the NMPObjectLnkWireGuard
keeps a pointer _allowed_ips_buf. This buffer contains the instances for
all peers.
The parsing of the netlink message is the complicated part, because
we don't know upfront how many peers/allowed-ips we receive. During
construction, the tracking of peers/allowed-ips is complicated,
via a CList/GArray. At the end of that, we prettify the data
representation and put everything into two buffers. That is more
efficient and simpler for user afterwards. This moves complexity
to the way how the object is created, vs. how it is used later.
- ensure that we nm_explicit_bzero() private-key and preshared-key. However,
that only works to a certain point, because our netlink library does not
ensure that no data is leaked.
- don't use a "struct sockaddr" union for the peer's endpoint. Instead,
use a combintation of endpoint_family, endpoint_port, and
endpoint_addr.
- a lot of refactoring.
Also, add two more features "tx-tcp-segmentation" and
"tx-tcp6-segmentation". There are two reasons for that:
- systemd-networkd supports setting these two features,
so lets support them too (apparently they are important
enough for networkd).
- these two features are already implicitly covered by "tso".
Like for the "ethtool" program, "tso" is an alias for several
actual features. By adding two features that are already
also covered by an alias (which sets multiple kernel names
at once), we showcase how aliases for the same feature can
coexist. In particular, note how setting
"tso on tx-tcp6-segmentation off" will behave as one would
expect: all 4 tso features covered by the alias are enabled,
except that particular one.
Add platform support for IP6GRE and IP6GRETAP tunnels. The former is a
virtual tunnel interface for GRE over IPv6 and the latter is the L2
variant.
The platform code internally reuses and extends the same structure
used by IPv6 tunnels.
Add support for a new wireguard link type to the platform code. For now
this only covers querying existing links via genetlink and parsing them
into platform objects.
Devices of different link types can actually have the same MAC address.
We'll want to use this to find a device of a particular type by its
hardware address.
For completeness, extend the API to support non-persistant
device. That requires that nm_platform_link_tun_add()
returns the file descriptor.
While NetworkManager doesn't create such devices itself,
it recognizes the IFLA_TUN_PERSIST / IFF_PERSIST flag.
Since ip-tuntap (obviously) cannot create such devices,
we cannot add a test for how non-persistent devices look
in the platform cache. Well, we could instead add them
with ioctl directly, but instead, just extend the platform
API to allow for that.
Also, use the function from test-lldp.c to (optionally) use
nm_platform_link_tun_add() to create the tap device.
Instead have one particular nm_ip6_config_get_address_first_nontentative() function,
make it more extendable. Now, we pass a match-type argument, which can control which
element to search.
This patch has no change in behavior, but it already makes clear, that
nm_ip6_config_get_address_first_nontentative() was buggy, because it would
also return addresses that failed DAD.
Kernel recently got support for exposing TUN/TAP information on netlink
[1], [2], [3]. Add support for it to the platform cache.
The advantage of using netlink is that querying sysctl bypasses the
order of events of the netlink socket. It is out of sync and racy. For
example, platform cache might still think that a tun device exists, but
a subsequent lookup at sysfs might fail because the device was deleted
in the meantime. Another point is, that we don't get change
notifications via sysctl and that it requires various extra syscalls
to read the device information. If the tun information is present on
netlink, put it into the cache. This bypasses checking sysctl while
we keep looking at sysctl for backward compatibility until we require
support from kernel.
Notes:
- we had two link types NM_LINK_TYPE_TAP and NM_LINK_TYPE_TUN. This
deviates from the model of how kernel treats TUN/TAP devices, which
makes it more complicated. The link type of a NMPlatformLink instance
should match what kernel thinks about the device. Point in case,
when parsing RTM_NETLINK messages, we very early need to determine
the link type (_linktype_get_type()). However, to determine the
type of a TUN/TAP at that point, we need to look into nested
netlink attributes which in turn depend on the type (IFLA_INFO_KIND
and IFLA_INFO_DATA), or even worse, we would need to look into
sysctl for older kernel vesions. Now, the TUN/TAP type is a property
of the link type NM_LINK_TYPE_TUN, instead of determining two
different link types.
- various parts of the API (both kernel's sysctl vs. netlink) and
NMDeviceTun vs. NMSettingTun disagree whether the PI is positive
(NM_SETTING_TUN_PI, IFLA_TUN_PI, NMPlatformLnkTun.pi) or inverted
(NM_DEVICE_TUN_NO_PI, IFF_NO_PI). There is no consistent way,
but prefer the positive form for internal API at NMPlatformLnkTun.pi.
- previously NMDeviceTun.mode could not change after initializing
the object. Allow for that to happen, because forcing some properties
that are reported by kernel to not change is wrong, in case they
might change. Of course, in practice kernel doesn't allow the device
to ever change its type, but the type property of the NMDeviceTun
should not make that assumption, because, if it actually changes, what
would it mean?
- note that as of now, new netlink API is not yet merged to mainline Linus
tree. Shortcut _parse_lnk_tun() to not accidentally use unstable API
for now.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1277457
[2] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=1ec010e705934c8acbe7dbf31afc81e60e3d828b
[3] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=118eda77d6602616bc523a17ee45171e879d1818https://bugzilla.redhat.com/show_bug.cgi?id=1547213https://github.com/NetworkManager/NetworkManager/pull/77
GByteArray is a mutable array of bytes. For every practical purpose, the hwaddr
property of NMDhcpClient is an immutable sequence of bytes. Thus, make it a
GBytes.
We want to add addresses in a particular order so that source address
selection works.
Note that @known_addresses contains the desired addresses in order of
least-important first, while @plat_addresses contains them in opposite
order. Previously, this inverted order was not considered, and we
essentially ended up removing and re-adding all addresses every time.
Fix that. While at it, get rid of the O(n^2) runtime complexity, and
make it O(n) by iterating both lists simultaneously.
Often, we want in API that an input argument is read-only and not modified
by the function call. Not modifying input arguments is a good
convention.
However, in this case there are only two callers, and both clearly do
not care whether the @known_addresses array will be modified.
Clear out addresses that are already expired and enforce that there are
no duplicate addresses. Basically, use @known_addresses for bookkeeping
which addresses are to be ignored.
Add a function that allows to re-request all objects of a certain type.
Usually, the cache is supposed to keep itself in a consistent state and
this function is not useful.
It is however useful during testing and debugging to explicitly reload
an object type.
If you ever think to need this function in non-testing code, then
something else is probably wrong with the cache implementation.
XXX was used to either raise attention (NOTE) or to indicate
that this is ugly code that should be fixed (FIXME). The usage
was inconsistent.
Let's avoid XXX and use either NOTE or FIXME.
It only makes sense to call delete() with NMPObjects that
we obtained from the platform cache. Otherwise, if we didn't
get it from the cache in the first place, we wouldn't know
what to delete.
Hence, the input argument is (almost) always an NMPObject
in the first place. That is different from add(), where
we might create a new specific NMPlatform* instance on the
stack. For add() it makes slightly more sense to have different
functions depending on the type. For delete(), it doesn't.
The "onlink" flag for IPv4 routes is part of the route ID.
Consider it in nm_platform_ip4_route_cmp().
Also, allow configuring the flag when adding a route.
Note that for IPv6, the onlink flag is still ignored.
Pretty much like kernel does.
Setting the MTU failes under regular conditions, for example when
setting the MTU of a master larger then the MTU of the slaves.
Logging a warning it too alarming.
We often want to cascade hashing, meaning, to combine the
outcome of various hash functions in a larger hash.
Instead of having each hash function return a guint hash value,
accept a hash state argument. This saves the overhead of initializing
and completing the intermediate hash states.
It also avoids loosing entropy when we reduce the larger hash state
into the intermediate guint hash value.
Whenever we call a platform operation that reads or writes the netlink
socket, there is the possibility that the cache gets updated, as we
receive netlink events.
It is thus racy, if nm_platform_ip_route_sync() *first* adds routes, and
then obtains a list of routes to delete. The correct approach is to
determine which routes to delete first (and keep it in a list
@routes_prune), and pass that list down to nm_platform_ip_route_sync().
Arguably, this doesn't yet solve every race. For example, NMDevice
calls update_ext_ip_config() during ip4_config_merge_and_apply().
That is good, as it resyncs with platform. However, before calling
nm_ip4_config_commit() it calls other platform operations, like
_commit_mtu(). So, the race is still there.
Kernel does not allow to add a route with table 0 (RT_TABLE_UNSPEC). It
effectively is an alias for the main table. We must consider that when
comparing routes sementically.
Kernel does not allow to add IPv6 routes with "src", as long as the
corresponding address is still tentative (related bug rh#1457196).
The workaround for this is cumbersome. First, when we fail to add such a
route with "pref_src", we guess that it happend due to this issue. In
that case, nm_ip6_config_commit() returns the list of routes that could
not be added for the moment (but hopefully can be added later).
We track this list in NMDevice, and keep trying to merge the routes
back into ip6_config. In order to not try indefinitely, keep track of a
timestamp when we tried to add this route for the first time.
Another uglyness is that pending tentative routes don't explicitly block
activation. In practice they may do, because for these routes we also have
an IPv6 address that is still doing DAD, so the IP configuration is
still pending due to that.
https://bugzilla.redhat.com/show_bug.cgi?id=1452684
Let's not treat those routes special. I think this was originally done, because
we relied on kernel to add the IPv4 device route, so we would ignore RTPROT_KERNEL
routes and not delete them.
We want to track them for various reasons:
- for consistency, there is nothing special except that they might be
added by kernel.
- we expose the routes of NMIP4Config/NMIP6Config on D-Bus. That should
include also routes such as device routes. Note, this commit changes
that we now expose device routes on D-Bus too.