We often want to cascade hashing, meaning, to combine the
outcome of various hash functions in a larger hash.
Instead of having each hash function return a guint hash value,
accept a hash state argument. This saves the overhead of initializing
and completing the intermediate hash states.
It also avoids losing entropy when we reduce the larger hash state
into the intermediate guint hash value.
By using a macro, we don't cast all the types to guint. Instead,
we use their native types directly. Hence, we don't need
nm_hash_update_uint64() nor nm_hash_update_ptr().
Also, for types smaller than guint, like char, we avoid hashing
the all-zero padding bytes.
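A minimal sketch of the idea, not the actual implementation. All names
(HashState, hash_update_val(), Item) are made up for illustration, and the
mixing step is plain djb2. It shows a type updating the caller's hash state
with its fields in their native size, instead of returning an intermediate
hash value:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash state; the names are illustrative only. */
typedef struct {
    uint32_t h;
} HashState;

static void hash_init(HashState *st, uint32_t seed)
{
    st->h = 5381u ^ seed;
}

/* Feed raw bytes into the state, djb2 style (h = h * 33 + byte). */
static void hash_update_mem(HashState *st, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++)
        st->h = st->h * 33u + p[i];
}

/* Hash an lvalue using its native size: a char contributes one byte,
 * a uint64_t contributes eight. Nothing is cast to a fixed-width type. */
#define hash_update_val(st, val) \
    hash_update_mem((st), &(val), sizeof(val))

static uint32_t hash_complete(const HashState *st)
{
    return st->h;
}

typedef struct {
    char     family;
    uint64_t id;
} Item;

/* Cascading: the item updates the caller's state. */
static void item_hash_update(const Item *item, HashState *st)
{
    hash_update_val(st, item->family);
    hash_update_val(st, item->id);
}

/* One state is initialized and completed once for the combined hash. */
static uint32_t item_pair_hash(const Item *a, const Item *b, uint32_t seed)
{
    HashState st;

    hash_init(&st, seed);
    item_hash_update(a, &st);
    item_hash_update(b, &st);
    return hash_complete(&st);
}
```

Because the fields are hashed one by one, struct padding never enters the
hash.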
siphash24() is widely used by projects nowadays.
It's certainly slower than our djb hashing that we used before.
But quite likely it's fast enough for us, given how widely it is
used. I think it would be hard to profile NetworkManager to show
that the performance of hash tables is the issue, be it with
djb or siphash24.
Certainly with siphash24() it's much harder to exploit the hashing
algorithm to cause worst case hash operations (provided that the
seed is kept private). Does this better resistance against a denial
of service matter for us? Probably not, but better to be safe than
sorry.
Note that systemd's implementation uses a different seed for each hash
table (at least, after the hash table grows to a certain size).
We don't do that and use only one global seed.
The previous NM_HASH_* macros directly operated on a guint value
and were thus close to the actual implementation.
Replace them by adding a NMHashState struct and accessors to
update the hash state. This hides the implementation better
and would allow us to carry more state. For example, we could
switch to siphash24() transparently.
For now, we still do basically a form of djb2 hashing, albeit with
a differing start seed.
Also add nm_hash_str() and nm_str_hash():
- nm_hash_str() is our own string hashing implementation
- nm_str_hash() is our own string hashing implementation, but with a
GHashFunc signature, suitable to pass it to g_hash_table_new().
Also, it has this name in order to remind you of g_str_hash(),
which it is replacing.
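As a rough sketch of the two entry points, using plain C types instead of
GLib's. The names my_hash_str() and my_str_hash() and the zero seed are
placeholders, not the real functions. The second function only differs in
taking an untyped pointer, which is the shape of GHashFunc
(guint (*)(gconstpointer)):

```c
#include <stdint.h>

/* Hypothetical global seed; a real implementation would randomize it. */
static uint32_t hash_seed = 0;

/* String hashing: djb2 over the bytes, starting from a seeded value. */
static unsigned my_hash_str(const char *str)
{
    uint32_t h = 5381u ^ hash_seed;

    if (str) {
        while (*str)
            h = h * 33u + (unsigned char) *str++;
    }
    return h;
}

/* Same hash behind an untyped-pointer signature, so that it could be
 * handed to a hash table constructor expecting such a callback. */
static unsigned my_str_hash(const void *str)
{
    return my_hash_str(str);
}
```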
Introduce a NM_HASH_INIT() function. It makes the places
where we initialize a hash with a certain seed visually clear.
Also, move them from "shared/nm-utils/nm-shared-utils.h" to
"shared/nm-utils/nm-macros-internal.h". We might want to
have NM_HASH_INIT() non-inline (hence, define it in the
source file).
We added "ipv4.route-table-sync" and "ipv6.route-table-sync" to not change
behavior for users that configured policy routing outside of NetworkManager,
for example, via a dispatcher script. Users had to explicitly opt-in
for NetworkManager to fully manage all routing tables.
These settings were awkward. Replace them with new settings "ipv4.route-table"
and "ipv6.route-table". Note that this commit breaks API/ABI on the unstable
development branch by removing recently added API.
As before, a connection will have no route-table set by default. This
has the meaning that policy-routing is not enabled and only the main table
will be fully synced. Once the user sets a table, we recognize that and
NetworkManager manages all routing tables.
The new route-table setting has other important uses: analogous to
"ipv4.route-metric", it is the default that applies to all routes.
Currently it only works for static routes, not DHCP, SLAAC,
default-route, etc. That will be implemented later.
For static routes, each route still can explicitly set a table, and
overwrite the per-connection setting in "ipv4.route-table" and
"ipv6.route-table".
gcc doesn't consider variables with cleanup attribute as unused.
clang does, and warns about them.
In one case, clang is right, in the other one the warning is bogus.
Fix both.
- use nm_utils_addr_family_to_char(). It asserts that the input argument
is either AF_INET or AF_INET6.
- rename variable @family to @addr_family for consistency.
- when logging addr_family for activation-stage, use v4 or v6 instead
of numeric AF_INET/AF_INET6.
Whenever we call a platform operation that reads or writes the netlink
socket, there is the possibility that the cache gets updated, as we
receive netlink events.
It is thus racy, if nm_platform_ip_route_sync() *first* adds routes, and
then obtains a list of routes to delete. The correct approach is to
determine which routes to delete first (and keep it in a list
@routes_prune), and pass that list down to nm_platform_ip_route_sync().
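The ordering can be sketched as follows. Routes are reduced to plain integer
IDs and routes_get_prune_list() is a made-up helper, so this only
illustrates that the prune list is a snapshot taken before any add
operation touches the netlink socket:

```c
#include <stdbool.h>
#include <stddef.h>

static bool list_contains(const int *list, size_t n, int id)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (list[i] == id)
            return true;
    }
    return false;
}

/* Collect the currently configured routes that are not in the desired
 * set. This must run before any route is added, because adding may
 * process netlink events and change what "current" looks like.
 * @prune must have room for @n_current entries. */
static size_t routes_get_prune_list(const int *current, size_t n_current,
                                    const int *desired, size_t n_desired,
                                    int *prune)
{
    size_t i, n = 0;

    for (i = 0; i < n_current; i++) {
        if (!list_contains(desired, n_desired, current[i]))
            prune[n++] = current[i];
    }
    return n;
}
```

The caller would then add the desired routes and only afterwards delete
the entries of the precomputed list.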
Arguably, this doesn't yet solve every race. For example, NMDevice
calls update_ext_ip_config() during ip4_config_merge_and_apply().
That is good, as it resyncs with platform. However, before calling
nm_ip4_config_commit() it calls other platform operations, like
_commit_mtu(). So, the race is still there.
Kernel does not allow to add a route with table 0 (RT_TABLE_UNSPEC). It
effectively is an alias for the main table. We must consider that when
comparing routes semantically.
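A small sketch of such a comparison. The helper names are hypothetical;
the constants mirror RT_TABLE_UNSPEC (0) and RT_TABLE_MAIN (254) from
linux/rtnetlink.h:

```c
#include <stdbool.h>
#include <stdint.h>

#define TABLE_UNSPEC 0u   /* same value as RT_TABLE_UNSPEC */
#define TABLE_MAIN   254u /* same value as RT_TABLE_MAIN */

/* Table 0 is treated as an alias for the main table. */
static uint32_t route_table_coerce(uint32_t table)
{
    return table == TABLE_UNSPEC ? TABLE_MAIN : table;
}

/* Semantic comparison: compare the coerced values, not the raw ones. */
static bool route_table_equal(uint32_t a, uint32_t b)
{
    return route_table_coerce(a) == route_table_coerce(b);
}
```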
No need for duplicate log lines
<debug> [1506146476.8462] platform: link: adding tap tap0 owner 107 group -1
<debug> [1506146476.8462] platform-linux: link: add tap tap0 owner 107 group -1
Merge them.
Also, for consistency change the logging output for adding generic
interfaces in nm_platform_link_add().
Before commit 6698bf58bb, we would rely on
kernel to add the device-route for manual IPv6 routes. We broke that and now
kernel would still add the device-route, however nm_platform_ip_route_sync()
would delete it immediately after.
That is because previously nm_platform_ip_route_sync() would ignore routes
with rtm_protocol RTPROT_KERNEL. Now, it will sync and delete those too.
Fix that by adding the device-route like we do it for IPv4. This also
fixes an actual issue where the automatically added route always had
route-metric 256. Instead, we now use the metric from ipv6.route-metric
setting.
Fixes: 6698bf58bb
Kernel does not allow to add IPv6 routes with "src", as long as the
corresponding address is still tentative (related bug rh#1457196).
The workaround for this is cumbersome. First, when we fail to add such a
route with "pref_src", we guess that it happened due to this issue. In
that case, nm_ip6_config_commit() returns the list of routes that could
not be added for the moment (but hopefully can be added later).
We track this list in NMDevice, and keep trying to merge the routes
back into ip6_config. In order to not try indefinitely, keep track of a
timestamp when we tried to add this route for the first time.
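The bookkeeping might look roughly like this. The struct, the function and
especially the timeout value are assumptions for illustration and are not
taken from the actual code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for how long we keep retrying a route. */
#define PENDING_ROUTE_TIMEOUT_MSEC 20000

typedef struct {
    int     route_id;       /* stand-in for the real route object */
    int64_t first_try_msec; /* when we first failed to add the route */
} PendingRoute;

/* Keep retrying only while the first attempt is recent enough. */
static bool pending_route_should_retry(const PendingRoute *p, int64_t now_msec)
{
    return now_msec - p->first_try_msec < PENDING_ROUTE_TIMEOUT_MSEC;
}
```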
Another ugliness is that pending tentative routes don't explicitly block
activation. In practice they may do, because for these routes we also have
an IPv6 address that is still doing DAD, so the IP configuration is
still pending due to that.
https://bugzilla.redhat.com/show_bug.cgi?id=1452684
Let's not treat those routes specially. I think this was originally done, because
we relied on kernel to add the IPv4 device route, so we would ignore RTPROT_KERNEL
routes and not delete them.
We want to track them for various reasons:
- for consistency, there is nothing special except that they might be
added by kernel.
- we expose the routes of NMIP4Config/NMIP6Config on D-Bus. That should
include also routes such as device routes. Note, this commit changes
that we now expose device routes on D-Bus too.
Kernel's notion of route identity corresponds to NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID.
Well, mostly. In practice, NM ignores several route properties that
kernel considers part of the ID too. This leaves the possibility that
kernel allows addition of two routes that compare identical for
NetworkManager.
Anyway, NMIP4Config/NMIP6Config should use the same equality as platform
cache. Otherwise, there is the odd situation that ip-config merges routes
that are treated as different by kernel.
For IP addresses the ID operator already corresponded to what kernel
does. There is no change for addresses.
Note that NMSettingIPConfig also uses a different algorithm for
comparing routes. But that doesn't really matter here, it differed
before too.
Remove NMDefaultRouteManager. Instead, add the default-route to the
NMIP4Config/NMIP6Config instance.
This basically reverts commit e8824f6a52.
We added NMDefaultRouteManager because we used the equivalent of `ip
route replace` when configuring routes. That would replace default-routes
on other interfaces so we needed a central manager to coordinate routes.
Now, we use the equivalent of `ip route append` to configure routes,
and each interface can configure routes independently.
In NMDevice, when creating the default-route, ignore @auto_method for
external devices. We shall not touch these devices.
Especially the code in NMPolicy regarding selection of the best-device
seems wrong. It probably needs further adjustments in the future.
Especially get_best_ip_config() should be replaced, because this
distinction VPN vs. devices seems wrong to me.
Thereby, remove the @ignore_never_default argument. It was added by
commit bb75026004, I don't think it's
needed anymore.
This brings another change. Now that we track default-routes in
NMIP4Config/NMIP6Config, they are also exposed on D-Bus like regular
routes. I think that makes sense, but it is a change in behavior, as
previously such routes were not exposed there.
Previously, we would first delete routes that are not to be added,
before adding the new ones.
This has the advantage, that even if delete removes the wrong route,
add would restore the expected state. This tries to workaround the fact
that RTM_DELROUTE allows for wild-card fields, and might delete the
wrong route.
However, for example when bumping the route metric after connectivity
check (removing the default-route with metric 20100 and adding the one
with metric 100), there is a short moment when there is no
default-route.
To avoid that, don't do delete-then-add, but add-then-delete.
The rework to use nm_platform_ip_route_sync() broke failing
activation when we were unable to configure a route.
Fix it. As before, we only do this for routes that
are configured manually by the user. Invalid routes from
DHCP do not break activation.
Also, improve logging to give a hint what's wrong.
Change the output of nm_platform_error_to_string() to print the numeric value.
Also, accept a string buffer instead of using an alloca() allocated buffer.
There is still a macro to provide the previous functionality, but it
was ill-suited to call from inside a loop.
Let nm_platform_ip_route_add() and friends return an NMPlatformError
failure reason.
Also, do_add_addrroute() did not return the response from kernel.
Instead, it determined success/failure based on the presence of the
object in the cache. That is racy and does not allow to give a failure
reason from kernel.
Instead, determine success solely based on the netlink reply from
kernel. The received errno shall be authoritative, there is no need
to second guess the response.
There is a problem that netlink is not a reliable protocol. In case
of receive buffer overflow, the response is lost and we don't know
whether the command succeeded (it likely did). It's unclear how to fix
that, but for now just return "unspecified" error. We probably avoid
that already by having a huge buffer size.
Also, downgrade the error message to <warn> level. <error> is really
for bugs only.
Inspired from iproute2. As such, don't use libnl3's "struct nl_msg", but
add _nl_addattr_l() and use a stack-allocated "struct nlmsghdr". With
this, we are closer to the raw netlink API. It really is simple enough.
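For illustration, a helper in the style of iproute2's addattr_l() could look
like the sketch below. It is named sketch_nl_addattr_l() to make clear that
it is not the function from this patch. It relies only on the macros from
the Linux UAPI headers:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Append one attribute to a netlink message that lives in a caller
 * provided (for example stack-allocated) buffer of @maxlen bytes.
 * Returns 0 on success, -1 if the attribute does not fit. */
static int sketch_nl_addattr_l(struct nlmsghdr *n, size_t maxlen, int type,
                               const void *data, size_t alen)
{
    size_t len = RTA_LENGTH(alen);
    struct rtattr *rta;

    if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen)
        return -1;

    /* The new attribute starts at the aligned end of the message. */
    rta = (struct rtattr *) (((char *) n) + NLMSG_ALIGN(n->nlmsg_len));
    rta->rta_type = (unsigned short) type;
    rta->rta_len = (unsigned short) len;
    if (alen)
        memcpy(RTA_DATA(rta), data, alen);

    n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
    return 0;
}
```

A caller would typically declare a struct that holds a "struct nlmsghdr"
followed by a byte buffer, set nlmsg_len to NLMSG_LENGTH() of the
fixed-size payload, and then append attributes one by one.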
The complicated part of the patch is that we re-use the existing netlink
socket for events. Hence, we must process the socket via our common
event_handler_recvmsgs(). That also means, that we get the netlink
response a few layers down the stack and have to return the result
via DelayedActionWaitForNlResponseData.
For IPv6 addresses we have used IFA_F_NOPREFIXROUTE for a long time.
If we detect that kernel does not support the flag (for IPv6), we
add addresses as /128 to prevent kernel from adding an onlink route.
We add IPv6 device routes explicitly, whenever needed according
to the onlink RA flag.
For IPv4, we also don't want the route added by kernel. The reason is
that it has an undesired metric of zero. However, usually we want the route
to have a different metric. The complicated part is that kernel does
not add the route immediately but sometimes later. For that we have
nm_platform_ip4_dev_route_blacklist_set() (previously that was
nm_route_manager_ip4_route_register_device_route_purge_list()). It
watches the interface and when a registered device route shows up,
it deletes it.
The better solution is to use the IFA_F_NOPREFIXROUTE flag to prevent
the creation of the route in the first place. It was added for IPv4 to
kernel in commit 7b1311807f3d3eb8bef3ccc53127838b3bea3771, October 2015.
Contrary to IPv6, we cannot (easily) detect whether kernel supports IFA_F_NOPREFIXROUTE
for IPv4 routes. Hence keep nm_platform_ip4_dev_route_blacklist_set() for older
kernels.
- cache the result in NMPlatformPrivate. No need to call the virtual
function every time. The result is not ever going to change.
- if we are unable to detect support, assume support. Those features
were added quite a while ago to kernel, we should default to "support".
Note, that we detect support based on the presence or absence of
certain netlink flags. That means, we will still detect no support.
The only moment when we actually use the fallback value, is when we
didn't encounter an RTM_NEWADDR or AF_INET6-IFLA_AF_SPEC message yet,
which would be very unusual, because we fill the cache initially and
usually will have some addresses there.
- for no strong reason, track "undetected" as numerical value zero,
and "support"/"no-support" as 1/-1. We already did that previously for
_support_user_ipv6ll, so this just unifies the implementations.
The minor reason is that this puts @_support_user_ipv6ll to the BSS
section and allows us to omit initializing priv->check_support_user_ipv6ll_cached
in the platform's constructor.
- detect _support_kernel_extended_ifa_flags also based on IPv4
RTM_NEWADDR messages. Originally, extended flags were added for IPv6,
and later to IPv4 as well. Once we see an IPv4 message with IFA_FLAGS,
we know we have support.
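The tri-state tracking described in the list above can be sketched like
this. The variable and function names are invented, and the sketch assumes
that the first detection decides the value:

```c
#include <stdbool.h>

/* 0: undetected, 1: support, -1: no support.
 * Zero is the default for static storage, so no explicit
 * initialization is needed. */
static int support_extended_ifa_flags;

/* Called when a message arrives from which support can be derived.
 * Only the first detection counts; the result never changes later. */
static void support_detect(bool message_has_flags)
{
    if (support_extended_ifa_flags == 0)
        support_extended_ifa_flags = message_has_flags ? 1 : -1;
}

/* While undetected, fall back to assuming support. */
static bool support_check(void)
{
    return support_extended_ifa_flags >= 0;
}
```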
Previously, we would add exclusive routes via netlink message flags
NLM_F_CREATE | NLM_F_REPLACE for RTM_NEWROUTE. Similar to `ip route replace`.
Using that form of RTM_NEWROUTE message, we could only add a certain
route with a certain network/plen,metric triple once. That was already
hugely inconvenient, because
- when configuring routes, multiple (managed) interfaces may get
conflicting routes (multihoming). Only one of the routes can be actually
configured using `ip route replace`, so we need to track routes that are
currently shadowed.
- when configuring routes, we might replace externally configured
routes on unmanaged interfaces. We should not interfere with such
routes.
That was worked around by having NMRouteManager (and NMDefaultRouteManager).
NMRouteManager would keep a list of the routes which NetworkManager would like
to configure, even if momentarily being unable to do so due to conflicting routes.
This worked mostly well but was complicated. It involved bumping metrics to
avoid conflicts for device routes, as we might require them for gateway routes.
Drop that now. Instead, use the equivalent of `ip route append` to configure
routes. This allows NetworkManager to configure (almost) all routes that we care about.
Especially, it can configure all routes on a managed interface, without
replacing/interfering with routes on other interfaces. Hence, NMRouteManager
becomes obsolete.
In practice it is a bit more complicated because:
- when adding an IPv4 address, kernel will automatically create a device route
for the subnet. We should avoid that by using the IFA_F_NOPREFIXROUTE flag for
IPv4 addresses (still to-do). But as kernel may not support that flag for IPv4
addresses yet (and we don't require such a kernel yet), we still need functionality
similar to nm_route_manager_ip4_route_register_device_route_purge_list().
This functionality is now handled via nm_platform_ip4_dev_route_blacklist_set().
- trying to configure an IPv6 route with a source address will be rejected
by kernel as long as the address is tentative (see related bug rh#1457196).
Preferably, NMDevice would keep the list of routes which should be configured,
while kernel would have the list of what actually is configured. There is a
feed-back loop where both affect each other (for example, when externally deleting
a route, NMDevice must forget about it too). Previously, NMRouteManager would have
the task of remembering all routes which we currently want to configure, but cannot
due to conflicting routes.
We get rid of that, because now we configure non-exclusive routes. We however still
will need to remember IPv6 routes with a source address, that currently cannot be
configured yet. Hence, we will need to keep track of routes that
currently cannot be configured, but later may be.
That is still not done yet, as NMRouteManager didn't handle this
correctly either.
Kernel does not allow to add an IPv4 route with rt_scope RT_SCOPE_NOWHERE
(255). It would fail with EINVAL.
While adding a route, we coerce/normalize the scope in
nm_platform_ip_route_normalize(). However, that should only be
done, if the scope is not explicitly set already. Otherwise,
leave it unchanged.
nm_platform_ip_route_normalize() is related to the compare functions.
Several compare modes do a fuzzy comparison, and they should compare
equal as if they would be normalized. Hence, we must do the same
normalization there.
One peculiarity in NetworkManager is that we track scope as its
inverse. The reason is to have a default value of zero meaning
RT_SCOPE_NOWHERE. Hence "scope_inv".
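A sketch of the inversion, assuming it is a bitwise complement of the 8-bit
scope. The helper name is made up; the constants mirror RT_SCOPE_UNIVERSE
(0), RT_SCOPE_LINK (253) and RT_SCOPE_NOWHERE (255) from
linux/rtnetlink.h:

```c
#include <stdint.h>

#define SCOPE_UNIVERSE 0u   /* same value as RT_SCOPE_UNIVERSE */
#define SCOPE_LINK     253u /* same value as RT_SCOPE_LINK */
#define SCOPE_NOWHERE  255u /* same value as RT_SCOPE_NOWHERE */

/* Convert between scope and its inverse. The function is its own
 * inverse, so it works in both directions. */
static uint8_t route_scope_inv(uint8_t scope)
{
    return (uint8_t) ~scope;
}
```

With this, a zero-initialized route struct carries scope_inv 0, which maps
back to RT_SCOPE_NOWHERE and can be recognized as "not explicitly set".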
Rename to nm_platform_ip_address_flush(), it's more consistent with naming
for other platform functions.
Also, pass an address family argument. Sometimes I feel an option makes it clearer
what the function does. Otherwise, from the name it's not clear which address
families are affected. As an API, it feels more correct to me.
We soon also get a nm_platform_ip_route_flush() function, which will
look similar.
In several cases, the route compare function does a fuzzy match, to get
the same result as what would happen if you added that route to kernel.
The rt_source enum contains some NetworkManager specific values which
are mapped to a certain rtm_protocol value. Especially, when adding
a route to kernel, the resulting value will be coerced (and end up being
different).
We must take this coercion into account.
Until now, NetworkManager's platform cache for routes used the quadruple
network/plen,metric,ifindex for equality. That is not kernel's
understanding of how routes behave. For example, with `ip route append`
you can add two IPv4 routes that only differ by their gateway. To
the previous form of platform cache, these two routes would wrongly
look identical, as the cache could not contain both routes. This also
easily leads to cache-inconsistencies.
Now that we have NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID, fix the route's
compare operator to match kernel's.
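To illustrate the difference with a deliberately simplified route struct
(the real ID comparison covers more fields than shown here, and all names
are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t network;
    uint8_t  plen;
    uint32_t metric;
    int      ifindex;
    uint32_t gateway;
} Route4;

/* Old notion of equality: the network/plen,metric,ifindex quadruple. */
static bool route4_equal_old(const Route4 *a, const Route4 *b)
{
    return a->network == b->network
        && a->plen == b->plen
        && a->metric == b->metric
        && a->ifindex == b->ifindex;
}

/* Closer to kernel's view: routes that differ only by their gateway
 * are different routes and can coexist. */
static bool route4_equal_id(const Route4 *a, const Route4 *b)
{
    return route4_equal_old(a, b)
        && a->gateway == b->gateway;
}
```

Two routes added via `ip route append` that only differ by gateway compare
equal under the old operator, but not under the ID operator.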
Well, not entirely. Kernel understands more properties for routes than
NetworkManager. Some of these properties may also be part of the ID according
to kernel. To NetworkManager such routes would still look identical as
they only differ in a property that is not understood. This can still
cause cache-inconsistencies. The only fix here is to add support for
all these properties in NetworkManager as well. However, it's less serious,
because with this commit we support several of the more important properties.
See also the related bug rh#1337855 for kernel.
Another difficulty is that `ip route replace` and `ip route change`
change an existing route. The replaced route has the same
NM_PLATFORM_IP_ROUTE_CMP_TYPE_WEAK_ID, but differs in the actual
NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID:
# ip -d -4 route show dev v
# ip monitor route &
# ip route add 192.168.5.0/24 dev v
192.168.5.0/24 dev v scope link
# ip route change 192.168.5.0/24 dev v scope 10
192.168.5.0/24 dev v scope 10
# ip -d -4 route show dev v
unicast 192.168.5.0/24 proto boot scope 10
Note that we only got one RTM_NEWROUTE message, although from NMPCache's
point of view, a new route (with a particular ID) was added and another
route (with a different ID) was deleted. The cumbersome workaround is,
to keep an ordered list of the routes, and figure out which route was
replaced in response to an RTM_NEWROUTE. In absence of bugs, this should
work fine. However, as we only rely on events, we might wrongly
introduce a cache-inconsistency as well. See the related bug rh#1337860.
Also drop nm_platform_ip4_route_get() and the like. The ID of routes
is complex, so it makes little sense to look up a route directly.
Via the flags of the RTM_NEWROUTE netlink message, kernel and iproute2
support various variants to add a route.
- ip route add
- ip route change
- ip route replace
- ip route prepend
- ip route append
- ip route test
Previously, our nm_platform_ip4_route_add() function was basically
`ip route replace`. In the future, we should rather use `ip route
append` instead.
Anyway, expose the netlink message flags in the API. This allows to
use the various forms, and makes it also more apparent to the user that
they even exist.
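For reference, the mapping that iproute2 uses between these commands and
the netlink message flags can be sketched as below. The function name is
made up and the constants are local copies of the NLM_F_* values from
linux/netlink.h, so that the snippet does not depend on kernel headers:

```c
#include <string.h>

/* Local copies of the values from linux/netlink.h. */
#define SK_NLM_F_REPLACE 0x100
#define SK_NLM_F_EXCL    0x200
#define SK_NLM_F_CREATE  0x400
#define SK_NLM_F_APPEND  0x800

/* Map an `ip route` subcommand to the flags of the RTM_NEWROUTE
 * message. Returns -1 for an unknown command. */
static int route_cmd_to_nlm_flags(const char *cmd)
{
    if (strcmp(cmd, "add") == 0)
        return SK_NLM_F_CREATE | SK_NLM_F_EXCL;
    if (strcmp(cmd, "change") == 0)
        return SK_NLM_F_REPLACE;
    if (strcmp(cmd, "replace") == 0)
        return SK_NLM_F_CREATE | SK_NLM_F_REPLACE;
    if (strcmp(cmd, "prepend") == 0)
        return SK_NLM_F_CREATE;
    if (strcmp(cmd, "append") == 0)
        return SK_NLM_F_CREATE | SK_NLM_F_APPEND;
    if (strcmp(cmd, "test") == 0)
        return SK_NLM_F_EXCL;
    return -1;
}
```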
- kernel ignores rtm_tos for IPv6 routes. While iproute2 accepts it,
let libnm reject TOS attribute for routes as well.
- move the tos field from NMPlatformIPRoute to NMPlatformIP4Route.
- the tos field is part of the weak-id of an IPv4 route. Meaning,
`ip route add` can add routes that only differ by their TOS.
There are various notions of how to compare routes. Collect them all
in nm_platform_ip4_route_cmp(), nm_platform_ip4_route_hash(),
nm_platform_ip6_route_cmp(), and nm_platform_ip6_route_hash().
This way, we have them side-by-side, which makes the differences more
discoverable.