Devices of different link types can actually have the same MAC address.
We'll want to use this to find a device of a particular type by its
hardware address.
For completeness, extend the API to support non-persistant
device. That requires that nm_platform_link_tun_add()
returns the file descriptor.
While NetworkManager doesn't create such devices itself,
it recognizes the IFLA_TUN_PERSIST / IFF_PERSIST flag.
Since ip-tuntap (obviously) cannot create such devices,
we cannot add a test for how non-persistent devices look
in the platform cache. Well, we could instead add them
with ioctl directly, but instead, just extend the platform
API to allow for that.
Also, use the function from test-lldp.c to (optionally) use
nm_platform_link_tun_add() to create the tap device.
Instead have one particular nm_ip6_config_get_address_first_nontentative() function,
make it more extendable. Now, we pass a match-type argument, which can control which
element to search.
This patch has no change in behavior, but it already makes clear, that
nm_ip6_config_get_address_first_nontentative() was buggy, because it would
also return addresses that failed DAD.
Kernel recently got support for exposing TUN/TAP information on netlink
[1], [2], [3]. Add support for it to the platform cache.
The advantage of using netlink is that querying sysctl bypasses the
order of events of the netlink socket. It is out of sync and racy. For
example, platform cache might still think that a tun device exists, but
a subsequent lookup at sysfs might fail because the device was deleted
in the meantime. Another point is, that we don't get change
notifications via sysctl and that it requires various extra syscalls
to read the device information. If the tun information is present on
netlink, put it into the cache. This bypasses checking sysctl while
we keep looking at sysctl for backward compatibility until we require
support from kernel.
Notes:
- we had two link types NM_LINK_TYPE_TAP and NM_LINK_TYPE_TUN. This
deviates from the model of how kernel treats TUN/TAP devices, which
makes it more complicated. The link type of a NMPlatformLink instance
should match what kernel thinks about the device. Point in case,
when parsing RTM_NETLINK messages, we very early need to determine
the link type (_linktype_get_type()). However, to determine the
type of a TUN/TAP at that point, we need to look into nested
netlink attributes which in turn depend on the type (IFLA_INFO_KIND
and IFLA_INFO_DATA), or even worse, we would need to look into
sysctl for older kernel vesions. Now, the TUN/TAP type is a property
of the link type NM_LINK_TYPE_TUN, instead of determining two
different link types.
- various parts of the API (both kernel's sysctl vs. netlink) and
NMDeviceTun vs. NMSettingTun disagree whether the PI is positive
(NM_SETTING_TUN_PI, IFLA_TUN_PI, NMPlatformLnkTun.pi) or inverted
(NM_DEVICE_TUN_NO_PI, IFF_NO_PI). There is no consistent way,
but prefer the positive form for internal API at NMPlatformLnkTun.pi.
- previously NMDeviceTun.mode could not change after initializing
the object. Allow for that to happen, because forcing some properties
that are reported by kernel to not change is wrong, in case they
might change. Of course, in practice kernel doesn't allow the device
to ever change its type, but the type property of the NMDeviceTun
should not make that assumption, because, if it actually changes, what
would it mean?
- note that as of now, new netlink API is not yet merged to mainline Linus
tree. Shortcut _parse_lnk_tun() to not accidentally use unstable API
for now.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1277457
[2] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=1ec010e705934c8acbe7dbf31afc81e60e3d828b
[3] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=118eda77d6602616bc523a17ee45171e879d1818https://bugzilla.redhat.com/show_bug.cgi?id=1547213https://github.com/NetworkManager/NetworkManager/pull/77
GByteArray is a mutable array of bytes. For every practical purpose, the hwaddr
property of NMDhcpClient is an immutable sequence of bytes. Thus, make it a
GBytes.
We want to add addresses in a particular order so that source address
selection works.
Note that @known_addresses contains the desired addresses in order of
least-important first, while @plat_addresses contains them in opposite
order. Previously, this inverted order was not considered, and we
essentially ended up removing and re-adding all addresses every time.
Fix that. While at it, get rid of the O(n^2) runtime complexity, and
make it O(n) by iterating both lists simultaneously.
Often, we want in API that an input argument is read-only and not modified
by the function call. Not modifying input arguments is a good
convention.
However, in this case there are only two callers, and both clearly do
not care whether the @known_addresses array will be modified.
Clear out addresses that are already expired and enforce that there are
no duplicate addresses. Basically, use @known_addresses for bookkeeping
which addresses are to be ignored.
Add a function that allows to re-request all objects of a certain type.
Usually, the cache is supposed to keep itself in a consistent state and
this function is not useful.
It is however useful during testing and debugging to explicitly reload
an object type.
If you ever think to need this function in non-testing code, then
something else is probably wrong with the cache implementation.
XXX was used to either raise attention (NOTE) or to indicate
that this is ugly code that should be fixed (FIXME). The usage
was inconsistent.
Let's avoid XXX and use either NOTE or FIXME.
It only makes sense to call delete() with NMPObjects that
we obtained from the platform cache. Otherwise, if we didn't
get it from the cache in the first place, we wouldn't know
what to delete.
Hence, the input argument is (almost) always an NMPObject
in the first place. That is different from add(), where
we might create a new specific NMPlatform* instance on the
stack. For add() it makes slightly more sense to have different
functions depending on the type. For delete(), it doesn't.
The "onlink" flag for IPv4 routes is part of the route ID.
Consider it in nm_platform_ip4_route_cmp().
Also, allow configuring the flag when adding a route.
Note that for IPv6, the onlink flag is still ignored.
Pretty much like kernel does.
Setting the MTU failes under regular conditions, for example when
setting the MTU of a master larger then the MTU of the slaves.
Logging a warning it too alarming.
We often want to cascade hashing, meaning, to combine the
outcome of various hash functions in a larger hash.
Instead of having each hash function return a guint hash value,
accept a hash state argument. This saves the overhead of initializing
and completing the intermediate hash states.
It also avoids loosing entropy when we reduce the larger hash state
into the intermediate guint hash value.
Whenever we call a platform operation that reads or writes the netlink
socket, there is the possibility that the cache gets updated, as we
receive netlink events.
It is thus racy, if nm_platform_ip_route_sync() *first* adds routes, and
then obtains a list of routes to delete. The correct approach is to
determine which routes to delete first (and keep it in a list
@routes_prune), and pass that list down to nm_platform_ip_route_sync().
Arguably, this doesn't yet solve every race. For example, NMDevice
calls update_ext_ip_config() during ip4_config_merge_and_apply().
That is good, as it resyncs with platform. However, before calling
nm_ip4_config_commit() it calls other platform operations, like
_commit_mtu(). So, the race is still there.
Kernel does not allow to add a route with table 0 (RT_TABLE_UNSPEC). It
effectively is an alias for the main table. We must consider that when
comparing routes sementically.
Kernel does not allow to add IPv6 routes with "src", as long as the
corresponding address is still tentative (related bug rh#1457196).
The workaround for this is cumbersome. First, when we fail to add such a
route with "pref_src", we guess that it happend due to this issue. In
that case, nm_ip6_config_commit() returns the list of routes that could
not be added for the moment (but hopefully can be added later).
We track this list in NMDevice, and keep trying to merge the routes
back into ip6_config. In order to not try indefinitely, keep track of a
timestamp when we tried to add this route for the first time.
Another uglyness is that pending tentative routes don't explicitly block
activation. In practice they may do, because for these routes we also have
an IPv6 address that is still doing DAD, so the IP configuration is
still pending due to that.
https://bugzilla.redhat.com/show_bug.cgi?id=1452684
Let's not treat those routes special. I think this was originally done, because
we relied on kernel to add the IPv4 device route, so we would ignore RTPROT_KERNEL
routes and not delete them.
We want to track them for various reasons:
- for consistency, there is nothing special except that they might be
added by kernel.
- we expose the routes of NMIP4Config/NMIP6Config on D-Bus. That should
include also routes such as device routes. Note, this commit changes
that we now expose device routes on D-Bus too.
For kernel, route ID compare identical according to NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID.
Well, mostly. In practice, NM ignores several route properties that
kernel considers part of the ID too. This leaves the possibility that
kernel allows addition of two routes that compare idential for
NetworkManager.
Anyway, NMIP4Config/NMIP6Config should use the same equality as platform
cache. Otherwise, there is the odd situation that ip-config merges routes
that are treated as different by kernel.
For IP addresses the ID operator already corresponded to what kernel
does. There is no change for addresses.
Note that NMSettingIPConfig also uses a different algorithm for
comparing routes. But that doesn't really matter here, it it differed
before too.
Change the output of nm_platform_error_to_string() to print the numeric value.
Also, accept a string buffer instead of using an alloca() allocated buffer.
There is still a macro to provide the previous functionality, but it
was ill-suited to call from inside a loop.
Let nm_platform_ip_route_add() and friends return an NMPlatformError
failure reason.
Also, do_add_addrroute() did not return the response from kernel.
Instead, it determined success/failure based on the presence of the
object in the cache. That is racy and does not allow to give a failure
reason from kernel.
Instead, determine success solely based on the netlink reply from
kernel. The received errno shall be authorative, there is no need
to second guess the response.
There is a problem that netlink is not a reliable protocol. In case
of receive buffer overflow, the response is lost and we don't know
whether the command succeeded (it likely did). It's unclear how to fix
that, but for now just return "unspecified" error. We probably avoid
that already by having a huge buffer size.
Also, downgrade the error message to <warn> level. <error> is really
for bugs only.
Inspired from iproute2. As such, don't use libnl3's "struct nl_msg", but
add _nl_addattr_l() and use a stack-allocated "struct nlmsghdr". With
this, we are closer to the raw netlink API. It really is simple enough.
The complicated part of the patch is that we re-use the existing netlink
socket for events. Hence, we must process the socket via our common
event_handler_recvmsgs(). That also means, that we get the netlink
response a few layers down the stack and have to return the result
via DelayedActionWaitForNlResponseData.
Previously, we would add exclusive routes via netlink message flags
NLM_F_CREATE | NLM_F_REPLACE for RTM_NEWROUTE. Similar to `ip route replace`.
Using that form of RTM_NEWROUTE message, we could only add a certain
route with a certain network/plen,metric triple once. That was already
hugely inconvenient, because
- when configuring routes, multiple (managed) interfaces may get
conflicting routes (multihoming). Only one of the routes can be actually
configured using `ip route replace`, so we need to track routes that are
currently shadowed.
- when configuring routes, we might replace externally configured
routes on unmanaged interfaces. We should not interfere with such
routes.
That was worked around by having NMRouteManager (and NMDefaultRouteManager).
NMRouteManager would keep a list of the routes which NetworkManager would like
to configure, even if momentarily being unable to do so due to conflicting routes.
This worked mostly well but was complicated. It involved bumping metrics to
avoid conflicts for device routes, as we might require them for gateway routes.
Drop that now. Instead, use the corresponding of `ip route append` to configure
routes. This allows NetworkManager to confiure (almost) all routes that we care.
Especially, it can configure all routes on a managed interface, without
replacing/interfering with routes on other interfaces. Hence, NMRouteManager
becomes obsolete.
It practice it is a bit more complicated because:
- when adding an IPv4 address, kernel will automatically create a device route
for the subnet. We should avoid that by using the IFA_F_NOPREFIXROUTE flag for
IPv4 addresses (still to-do). But as kernel may not support that flag for IPv4
addresses yet (and we don't require such a kernel yet), we still need functionality
similar to nm_route_manager_ip4_route_register_device_route_purge_list().
This functionality is now handled via nm_platform_ip4_dev_route_blacklist_set().
- trying to configure an IPv6 route with a source address will be rejected
by kernel as long as the address is tentative (see related bug rh#1457196).
Preferably, NMDevice would keep the list of routes which should be configured,
while kernel would have the list of what actually is configured. There is a
feed-back loop where both affect each other (for example, when externally deleting
a route, NMDevice must forget about it too). Previously, NMRouteManager would have
the task of remembering all routes which we currently want to configure, but cannot
due to conflicting routes.
We get rid of that, because now we configure non-exclusive routes. We however still
will need to remember IPv6 routes with a source address, that currently cannot be
configured yet. Hence, we will need to keep track of routes that
currently cannot be configured, but later may be.
That is still not done yet, as NMRouteManager didn't handle this
correctly either.
Rename to nm_platform_ip_address_flush(), it's more consistent with naming
for other platform functions.
Also, pass an address family argument. Sometimes I feel an option makes it clearer
what the function does. Otherwise, from the name it's not clear which address
families are affected. As an API, it feels more correct to me.
We soon also get a nm_platform_ip_route_flush() function, which will
look similar.