IPv4 routes that are a response to RTM_GETROUTE must have the cloned
flag while IPv6 routes don't have to. Don't check the flag for IPv6
routes and add a test case to verify that RTM_GETROUTE works for IPv6.
https://bugzilla.gnome.org/show_bug.cgi?id=793962
(cherry picked from commit 2d1fad641b)
The bridge test (and no other either) no longer sets sysfs properties,
so this whole madness is no longer needed. That is good, because Linux
got somewhat stricter (at least in 4.15) about mounting sysfs and the
whole thing wouldn't work with containers where /sys is red-only from
the start.
(cherry picked from commit 6788ced98d)
This is basically the case in the COPR build system where this
(mount -o bind,ro /proc/sys /proc/sys) is the case for reasons unknown.
(cherry picked from commit 984e9d5655)
It only makes sense to call delete() with NMPObjects that
we obtained from the platform cache. Otherwise, if we didn't
get it from the cache in the first place, we wouldn't know
what to delete.
Hence, the input argument is (almost) always an NMPObject
in the first place. That is different from add(), where
we might create a new specific NMPlatform* instance on the
stack. For add() it makes slightly more sense to have different
functions depending on the type. For delete(), it doesn't.
(cherry picked from commit 7573594a21)
# random seed: R02S4ca8cfc3dace399c0f15b42411e45d2e
1..48
# Start of link tests
ok 1 /link/bogus
PASS: src/platform/tests/test-link-linux 1 /link/bogus
ok 2 /link/loopback
PASS: src/platform/tests/test-link-linux 2 /link/loopback
nmtst: initialize nmtst_get_rand() with NMTST_SEED_RAND=2697682474
ok 3 /link/internal
PASS: src/platform/tests/test-link-linux 3 /link/internal
ok 4 /link/external
PASS: src/platform/tests/test-link-linux 4 /link/external
# Start of software tests
./tools/run-nm-test.sh: line 193: 7589 Trace/breakpoint trap (core dumped) "${NMTST_DBUS_RUN_SESSION[@]}" "$TEST" "$@"
NMPlatformSignalAssert: src/platform/tests/test-link.c:298, test_slave(): failure to accept signal 0 times: 'link-changed-changed' ifindex 9 (1 times received)
ERROR: src/platform/tests/test-link-linux - too few tests run (expected 48, got 4)
ERROR: src/platform/tests/test-link-linux - exited with status 133 (terminated by signal 5?)
src/platform/tests/test-common.c: In function ‘_ip4_route_get’:
./src/platform/nmp-object.h:336:18: error: ‘o’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
return obj ? obj->_class->obj_type : NMP_OBJECT_TYPE_UNKNOWN;
~~~^~~~~~~~
src/platform/tests/test-common.c:308:19: note: ‘o’ was declared here
const NMPObject *o;
^
Fixes: cdd8c65799
Let nm_platform_ip_route_add() and friends return an NMPlatformError
failure reason.
Also, do_add_addrroute() did not return the response from kernel.
Instead, it determined success/failure based on the presence of the
object in the cache. That is racy and does not allow to give a failure
reason from kernel.
Instead, determine success solely based on the netlink reply from
kernel. The received errno shall be authorative, there is no need
to second guess the response.
There is a problem that netlink is not a reliable protocol. In case
of receive buffer overflow, the response is lost and we don't know
whether the command succeeded (it likely did). It's unclear how to fix
that, but for now just return "unspecified" error. We probably avoid
that already by having a huge buffer size.
Also, downgrade the error message to <warn> level. <error> is really
for bugs only.
Inspired from iproute2. As such, don't use libnl3's "struct nl_msg", but
add _nl_addattr_l() and use a stack-allocated "struct nlmsghdr". With
this, we are closer to the raw netlink API. It really is simple enough.
The complicated part of the patch is that we re-use the existing netlink
socket for events. Hence, we must process the socket via our common
event_handler_recvmsgs(). That also means, that we get the netlink
response a few layers down the stack and have to return the result
via DelayedActionWaitForNlResponseData.
This will make us stop worry how relevant are chunks of compat code with
older kernels when deciding whether it's worth supporting/testing them.
As if we actually were testing old kernels.
nmp_lookup_init_route_visible() was originally named this way, to only return routes
that are nmp_object_is_visible(). However, all routes are visible (as long as they are
nmp_object_is_alive()). Hence, this is a historic misnomer.
Also, passing @only_default FALSE is identical to the
nmp_lookup_init_addrroute() lookup.
So, rename the function to indicate it is a lookup for default routes
only. Also, get rid of the unsupported ifindex argument for which there
is no index.
Until now, NetworkManager's platform cache for routes used the quadruple
network/plen,metric,ifindex for equaliy. That is not kernel's
understanding of how routes behave. For example, with `ip route append`
you can add two IPv4 routes that only differ by their gateway. To
the previous form of platform cache, these two routes would wrongly
look identical, as the cache could not contain both routes. This also
easily leads to cache-inconsistencies.
Now that we have NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID, fix the route's
compare operator to match kernel's.
Well, not entirely. Kernel understands more properties for routes then
NetworkManager. Some of these properties may also be part of the ID according
to kernel. To NetworkManager such routes would still look identical as
they only differ in a property that is not understood. This can still
cause cache-inconsistencies. The only fix here is to add support for
all these properties in NetworkManager as well. However, it's less serious,
because with this commit we support several of the more important properties.
See also the related bug rh#1337855 for kernel.
Another difficulty is that `ip route replace` and `ip route change`
changes an existing route. The replaced route has the same
NM_PLATFORM_IP_ROUTE_CMP_TYPE_WEAK_ID, but differ in the actual
NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID:
# ip -d -4 route show dev v
# ip monitor route &
# ip route add 192.168.5.0/24 dev v
192.168.5.0/24 dev v scope link
# ip route change 192.168.5.0/24 dev v scope 10
192.168.5.0/24 dev v scope 10
# ip -d -4 route show dev v
unicast 192.168.5.0/24 proto boot scope 10
Note that we only got one RTM_NEWROUTE message, although from NMPCache's
point of view, a new route (with a particular ID) was added and another
route (with a different ID) was deleted. The cumbersome workaround is,
to keep an ordered list of the routes, and figure out which route was
replaced in response to an RTM_NEWROUTE. In absence of bugs, this should
work fine. However, as we only rely on events, we might wrongly
introduce a cache-inconsistancy as well. See the related bug rh#1337860.
Also drop nm_platform_ip4_route_get() and the like. The ID of routes
is complex, so it makes little sense to look up a route directly.
NMPCache can preserve the order of the objects. Until now, the order
was however arbitrary. Soon we will require to preserve the order of
routes.
During a dump, force appending new objects at the end. That ensures,
correct ordering during the dump.
Note that we track objects in several distrinct indexes. Those partition the
set of all objects. Outside a dump when receiving events about new objects (e.g.
RTM_NEWROUTE), it is very unclear at which place the new object should be sorted.
It is especially unclear, as an object might move from one partition (of
an index) to another.
In general, a deterministic order will only be useful in one particular
instance: the NMP_CACHE_ID_TYPE_ROUTES_BY_DESTINATION index for routes.
In this case, we will ensure a particular order of the routes.
Via the flags of the RTM_NEWROUTE netlink message, kernel and iproute2
support various variants to add a route.
- ip route add
- ip route change
- ip route replace
- ip route prepend
- ip route append
- ip route test
Previously, our nm_platform_ip4_route_add() function was basically
`ip route replace`. In the future, we should rather user `ip route
append` instead.
Anyway, expose the netlink message flags in the API. This allows to
use the various forms, and makes it also more apparent to the user that
they even exist.
Routes are complicated.
`ip route add` and `ip route append` behaves differently with respect to
determine whether an existing route is idential or not.
Extend the cmp() and hash() functions to have a compare type, that
covers the different semantics.
Contrary to addresses, routes have no ID. When deleting a route,
you cannot just specify certain properties like network/plen,metric.
Well, actually you can specify only certain properties, but then kernel
will treat unspecified properties as wildcard and delete the first matching
route. That is not something we want, because we need to be in control which
exact route shall be deleted.
Also, rtm_tos *must* match. Even if we like the wildcard behavior,
we would need to pass TOS to nm_platform_ip4_route_delete() to be
able to delete routes with non-zero TOS. So, while certain properties
may be omitted, some must not. See how test_ip4_route_options() was
broken.
For NetworkManager it only makes ever sense to call delete on a route,
if the route is already fully known. Which means, we only delete routes
that we have already in the platform cache (otherwise, how would we know
that there is something to delete). Because of that, no longer have separate
IPv4 and IPv6 functions. Instead, have nm_platform_ip_route_delete() which
accepts a full NMPObject from the platform cache.
The code in core doesn't jet make use of this new functionality. It will
in the future.
At least, it fixes deleting routes with differing TOS.
The return value for the delete methods checks whether the object
is actually deleted. That is questionable behavior, because if the netlink
request succeeds, there is little point in checking with the platform cache.
As it is, it is racy.
Anyway, the previous value was totally wrong.
But it also uncovers another platform bug, which currently breaks
route tests. Will be fixed next.
and refactor NMFakePlatform to also track links via NMPCache.
For one, now NMFakePlatform also tests NMPCache, increasing the
coverage of what we care about.
Also, all our NMPlatform implementations now use NMPObject and NMPCache.
That means, we can expose those as part of the public API. Which is
great, because callers can keep a reference to the NMPObject object
and make use of generic functions like nmp_object_to_string().
Routes and addresses don't implement cmd_obj_is_visible(),
hence they are always visible, and NMP_CACHE_ID_TYPE_OBJECT_TYPE_VISIBLE_ONLY
is identical to NMP_CACHE_ID_TYPE_OBJECT_TYPE.
Only link objects can be alive but invisible. Still, drop the index
for looking up visible links entirely. Let callers do the filtering,
if they care.
Maintaining an index is expensive.Not so much in term of runtime, but
in term of memory.
Drop some indexes, and require the caller to use a more broad index (and
filter out unwanted elements).
Dropped:
- can no longer lookup visible default-routes by ifindex.
If you care about default-routes, lookup all and search for the
desired ifindex. The overall number of default-routes is expected
to be small.
We drop NMP_CACHE_ID_TYPE_ROUTES_VISIBLE_BY_IFINDEX_WITH_DEFAULT
entirely.
- no longer have a separate index for non-default routes. We
expect that the most routes are non-default routes. So, don't
have an index without default-routes, instead let the caller
just lookup all routes, and reject default-routes themself.
We keep NMP_CACHE_ID_TYPE_ROUTES_VISIBLE_BY_DEFAULT, but it
now no longer tracks non-default routes.
This drops 1 out of 6 route indexes, and modifes another one, so
that we expect that there are almost no entires tracked by it.
Implement the reference counting of NMPObject as part of
NMDedupMultiObj and get rid of NMDedupMultiBox.
With this change, the NMPObject is aware in which NMDedupMultiIndex
instance it is tracked.
- this saves an additional GSlice allocation for the NMDedupMultiBox.
- it is immediately known, whether an NMPObject is tracked by a
certain NMDedupMultiIndex or not. This saves an additional hash
lookup.
- previously, when all idx-types cease to reference an NMDedupMultiObj
instance, it was removed. Now, a tracked objects stays in the
NMDedupMultiIndex until it's last reference is deleted. This possibly
extends the lifetime of the object and we may reuse it better.
- it is no longer possible to add one object to more then one
NMDedupMultiIndex instance. As we anyway want to have only one
instance to deduplicate the objects, this is fine.
- the ref-counting implementation is now part of NMDedupMultiObj.
Previously, NMDedupMultiIndex could also track objects that were
not ref-counted. Hoever, the object anyway *must* implement the
NMDedupMultiObj API, so this flexibility is unneeded and was not
used.
- a downside is, that NMPObject grows by one pointer size, even if
it isn't tracked in the NMDedupMultiIndex. But we really want to
put all objects into the index for sharing and deduplication. So
this downside should be acceptable. Still, code like
nmp_object_stackinit*() needs to handle a larger object.
Rework platform object cache to use NMDedupMultiIndex.
Already previously, NMPCache used NMMultiIndex and had thus
O(1) for most operations. What is new is:
- Contrary to NMMultiIndex, NMDedupMultiIndex preserves the order of
the cached items. That is crucial to handle routes properly as kernel
will replace the first matching route based on network/plen/metric
properties. See related bug rh#1337855.
Without tracking the order of routes as they are exposed
by kernel, we cannot properly maintain the route cache.
- All NMPObject instances are now treated immutable, refcounted
and get de-duplicated via NMDedupMultiIndex. This allows
to have a global NMDedupMultiIndex that can be shared with
NMIP4Config and NMRouteManager. It also allows to share the
objects themselves.
Immutable objects are so much nicer. We can get rid of the
update pre-hook callback, which was required previously because
we would mutate the object inplace. Now, we can just update
the cache, and compare obj_old and obj_new after the fact.
- NMMultiIndex was treated as an internal of NMPCache. On the other
hand, NMDedupMultiIndex exposes NMDedupMultiHeadEntry, which is
basically an object that allows to iterate over all related
objects. That means, we can now lookup objects in the cache
and give the NMDedupMultiHeadEntry instance to the caller,
which then can iterate the list on it's own -- without need
for copying anything.
Currently, at various places we still create copies of lookup
results. That can be improved later.
The ability to share NMPObject instances should enable us to
significantly improve performance and scale with large number
of routes.
Of course there is a memory overhead of having an index for each list
entry. Each NMPObject may also require an NMDedupMultiEntry,
NMDedupMultiHeadEntry, and NMDedupMultiBox item, which are tracked
in a GHashTable. Optimally, one NMDedupMultiHeadEntry is the head
for multiple objects, and NMDedupMultiBox is able to deduplicate several
NMPObjects, so that there is a net saving.
Also, each object type has several indexes of type NMPCacheIdType.
So, worst case an NMPlatformIP4Route in the platform cache is tracked
by 8 NMPCacheIdType indexes, for each we require a NMDedupMultiEntry,
plus the shared NMDedupMultiHeadEntry. The NMDedupMultiBox instance
is shared between the 8 indexes (and possibly other).
Platform has it's own, simple implementation of object types:
NMPObject. Extract a base type and move it to "shared/nm-utils/nm-obj.h"
so it can be reused.
The base type is trival, but it allows us to implement other objects
which are compatible with NMPObjects. Currently there is no API for generic
NMObjBaseInst type, so compatible in this case only means, that they
can be used in the same context (see example below).
The only thing that you can do with a NMObjBaseInst is check it's
NMObjBaseClass.
Incidentally, NMObjBaseInst is also made compatible to GTypeInstance.
It means, an NMObjBaseInst is not necessarily a valid GTypeInstance (like NMPObject
is not), but it could be implemented as such.
For example, you could do:
if (NMP_CLASS_IS_VALID ((NMPClass *) obj->klass)) {
/* is an NMPObject */
} else if (G_TYPE_CHECK_INSTANCE_TYPE (obj, NM_TYPE_SOMETHING)) {
/* it a NMSometing GType */
} else {
/* something else? */
}
The reason why NMPObject is not implemented as proper GTypeInstance is
because it would require us to register a GType (like
g_type_register_fundamental). However, then the NMPClass struct can
no longer be const and immutable memory. But we could.
NMObjBaseInst may or may not be a GTypeInstance. In a sense, it's
a base type of GTypeInstance and all our objects should be based
on it (optionally, they we may make them valid GTypes too).
- no need to call nm_platform_process_events() after
nmtstp_wait_for_signal(). The latter processes all events
that are pending.
- with addr_n number of addresses, we still only want to wait
a maximum time, not for each addresss individually. Basically,
the for-loop must be inside NMTST_WAIT(), not the other way around.
Having an unsigned "guint timeout_ms" argument is very inconvenient, because
nmtstp_wait_for_signal (NM_PLATFORM_GET (), end_time - now_time);
might easily be negative. In such a case, the correct behavior
is to wait not at all.
The previous behavior, of treating timeout_ms as *no timeout*, makes
no sense. At least not for unit tests. If you have a really long timeout,
then set it. "0" should really mean to schedule a zero timeout.
when adding a route with RTA_PREFSRC some kernel versions will reject
the request if the specified source address is still tentative: be sure
that the just added addresses are no more tentative before adding the
routes.
nmtstp_env1_add_test_func() allows to register test functions in a
particular test environment ("env1", for lack of a better name).
It will be reused for "test-route.c"