Routing rules are unlike addresses or routes not tied to an interface.
NetworkManager thinks in terms of connection profiles. That works well
for addresses and routes, as one profile configures addresses and routes
for one device. For example, when activating a profile on a device, the
configuration does not interfere with the addresses/routes of other
devices. That is not the case for routing rules, which are global, netns-wide
entities.
When one connection profile specifies rules, then this per-device configuration
must be merged with the global configuration. And when a device disconnects later,
the rules must be removed.
Add a new NMPRulesManager API to track/untrack routing rules. Devices can
register/add there the routing rules they require. And the sync method will
apply the configuration. This is be implemented on top of NMPlatform's
caching API.
Until now, all implemented NMPObject types have an ifindex field (from
links, addresses, routes, qdisc to tfilter).
The NMPObject structure contains a union of all available types, that
makes it easier to down-case from an NMPObject pointer to the actual
content.
The "object" field of NMPObject of type NMPlatformObject is the lowest
common denominator.
We will add NMPlatformRoutingRules (for policy routing rules). That type
won't have an ifindex field.
Hence, drop the "ifindex" field from NMPlatformObject type. But also add
a new type NMPlatformObjWithIfindex, that can represent all types that
have an ifindex.
Older versions of meson don't support building the same names
multiple times.
Meson encountered an error in file src/tests/meson.build, line 14, column 2:
Tried to create target "test-general", but a target of that name already exists.
We really need to use unique filenames everywhere. Revert the name
change for now.
This breaks again the valgrind workaround in "tools/run-nm-test.sh".
This reverts commit 5466edc63e.
Meson and autotools should name the tests the same way.
Also, all tests binaries built by autotools start on purpose
with "test-". Do that for meson too.
Also, otherwise "tools/run-nm-test.sh" fails to workaround
valgrind failures for platform tests as it does not expect
the tests to be named that way:
if [ $HAS_ERRORS -eq 0 ]; then
# valgrind doesn't support setns syscall and spams the logfile.
# hack around it...
if [ "$TEST_NAME" = 'test-link-linux' -o \
"$TEST_NAME" = 'test-acd' ]; then
if [ -z "$(sed -e '/^--[0-9]\+-- WARNING: unhandled .* syscall: /,/^--[0-9]\+-- it at http.*\.$/d' "$LOGFILE")" ]; then
HAS_ERRORS=1
fi
fi
fi
The defaults for test timeouts in meson is 30 seconds. That is not long
enough when running
$ NMTST_USE_VALGRIND=1 ninja -C build test
Note that meson supports --timeout-multiplier, and automatically
increases the timeout when running under valgrind. However, meson
does not understand that we are running tests under valgrind via
NMTST_USE_VALGRIND=1 environment variable.
Timeouts are really not expected to be reached and are a mean of last
resort. Hence, increasing the timeout to a large value is likely to
have no effect or to fix test failures where the timeout was too rigid.
It's unlikely that the test indeed hangs and the increase of timeout
causes a unnecessary increase of waittime before aborting.
We will need more flags.
WireGuard internal tools solve this by embedding the change flags inside
the structure that corresponds to NMPlatformLnkWireGuard. We don't do
that, NMPlatformLnkWireGuard is only for containing the information about
the link.
NMPNetns instances are immutable, hence they can be easily shared
between threads. All we need, is that the stack of namespaces is
thread-local.
Also note that NMPNetns uses almost no other API, except some bits from
"shared/nm-utils/" and nm-logging. These parts are already supposed to
be thread-safe.
The only complications is that when the thread exits, we need to
destroy the NMPNetns instances. That is especially important because
they hold file descriptors. This is accomplished using pthread's
thread-specific data. An alternative would be C11 threads' tss_create(),
but not all systems that we run against support that yet. This means,
we need to link with pthreads, but we already do that anyway.
Note that glib also requires pthreads. So, we don't get an additional
dependency here.
The caller may not wish to replace existing peers, but only update/add
the peers explicitly passed to nm_platform_link_wireguard_change().
I think that is in particular interesting, because for the most part
NetworkManager will configure the same set of peers over and over again
(whenever we resolve the DNS name of an IP endpoint of the WireGuard
peer).
At that point, it seems disruptive to drop all peers and re-add them
again. Setting @replace_peers to %FALSE allows to only update/add.
Platform had it's own scheme for reporting errors: NMPlatformError.
Before, NMPlatformError indicated success via zero, negative integer
values are numbers from <errno.h>, and positive integer values are
platform specific codes. This changes now according to nm-error:
success is still zero. Negative values indicate a failure, where the
numeric value is either from <errno.h> or one of our error codes.
The meaning of positive values depends on the functions. Most functions
can only report an error reason (negative) and success (zero). For such
functions, positive values should never be returned (but the caller
should anticipate them).
For some functions, positive values could mean additional information
(but still success). That depends.
This is also what systemd does, except that systemd only returns
(negative) integers from <errno.h>, while we merge our own error codes
into the range of <errno.h>.
The advantage is to get rid of one way how to signal errors. The other
advantage is, that these error codes are compatible with all other
nm-errno values. For example, previously negative values indicated error
codes from <errno.h>, but it did not entail error codes from netlink.
While nm_utils_inet*_ntop() accepts a %NULL buffer to fallback
to a static buffer, don't do that.
I find the possibility of using a static buffer here error prone
and something that should be avoided. There is of course the downside,
that in some cases it requires an additional line of code to allocate
the buffer on the stack as auto-variable.
We want that all code paths assert strictly and gracefully.
That means, if we have function nm_platform_link_get() which calls
nm_platform_link_get_obj(), then we don't need to assert the same things
twice. Don't have the calling function assert itself, if it is obvious
that the first thing that it does, is calling a function that itself
asserts the same conditions.
On the other hand, it simply indicates a bug passing a non-positive
ifindex to any of these platform functions. No longer let
nm_platform_link_get_obj() handle negative ifindex gracefully. Instead,
let it directly pass it to nmp_cache_lookup_link(), which eventually
does a g_return_val_if_fail() check. This quite possible enables
assertions on a lot of code paths. But note that g_return_val_if_fail()
is graceful and does not lead to a crash (unless G_DEBUG=fatal-criticals
is set for debugging).
nm_platform_link_delete() will soon assert against positive ifindex
argument.
nm_platform_link_delete (NM_PLATFORM_GET, nm_platform_link_get_ifindex (NM_PLATFORM_GET, DEVICE_NAME));
will result in an assertion, if the link does not exist.
Extend nmtstp_link_delete() to gracefully skip deleting the link
so that it can be used in such situations.
Also, rename nmtstp_link_del() to nmtstp_link_delete(), because it's
closer to nm_platform_link_delete().
Sometimes the test fail:
$ make -j 10 src/platform/tests/test-address-linux
$ while true; do
NMTST_DEBUG=d ./tools/run-nm-test.sh src/platform/tests/test-address-linux 2>&1 > log.txt || break;
done
fails with:
ERROR: src/platform/tests/test-address-linux - Bail out! test:ERROR:src/platform/tests/test-common.c:790:nmtstp_ip_address_assert_lifetime: assertion failed (adr <= lft): (1001 <= 1000)
That is, because of a wrong check. Fix it.
In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.
That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.
Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).
Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
When reading a file, we may allocate intermediate buffers (realloc()).
Also, reading might fail halfway through the process.
Add a new flag that makes sure that this memory is cleared. The
point is when reading secrets, that we don't accidentally leave
private sensitive material in memory.
Also, add two more features "tx-tcp-segmentation" and
"tx-tcp6-segmentation". There are two reasons for that:
- systemd-networkd supports setting these two features,
so lets support them too (apparently they are important
enough for networkd).
- these two features are already implicitly covered by "tso".
Like for the "ethtool" program, "tso" is an alias for several
actual features. By adding two features that are already
also covered by an alias (which sets multiple kernel names
at once), we showcase how aliases for the same feature can
coexist. In particular, note how setting
"tso on tx-tcp6-segmentation off" will behave as one would
expect: all 4 tso features covered by the alias are enabled,
except that particular one.
We commonly don't use the glib typedefs for char/short/int/long,
but their C types directly.
$ git grep '\<g\(char\|short\|int\|long\|float\|double\)\>' | wc -l
587
$ git grep '\<\(char\|short\|int\|long\|float\|double\)\>' | wc -l
21114
One could argue that using the glib typedefs is preferable in
public API (of our glib based libnm library) or where it clearly
is related to glib, like during
g_object_set (obj, PROPERTY, (gint) value, NULL);
However, that argument does not seem strong, because in practice we don't
follow that argument today, and seldomly use the glib typedefs.
Also, the style guide for this would be hard to formalize, because
"using them where clearly related to a glib" is a very loose suggestion.
Also note that glib typedefs will always just be typedefs of the
underlying C types. There is no danger of glib changing the meaning
of these typedefs (because that would be a major API break of glib).
A simple style guide is instead: don't use these typedefs.
No manual actions, I only ran the bash script:
FILES=($(git ls-files '*.[hc]'))
sed -i \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>\( [^ ]\)/\1\2/g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\> /\1 /g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>/\1/g' \
"${FILES[@]}"
Add platform support for IP6GRE and IP6GRETAP tunnels. The former is a
virtual tunnel interface for GRE over IPv6 and the latter is the L2
variant.
The platform code internally reuses and extends the same structure
used by IPv6 tunnels.
Otherwise, we easily get a failure
test:ERROR:src/platform/tests/test-cleanup.c:78:test_cleanup_internal: assertion failed (addresses6->len == 2): (1 == 2)
Avoid that by waiting for kernel to add the link-local
address.
Coccinelle:
@@
expression a, b;
@@
-a ? a : b
+a ?: b
Applied with:
spatch --sp-file ternary.cocci --in-place --smpl-spacing --dir .
With some manual adjustments on spots that Cocci didn't catch for
reasons unknown.
Thanks to the marvelous effort of the GNU compiler developer we can now
spare a couple of bits that could be used for more important things,
like this commit message. Standards commitees yet have to catch up.
There are multiple tests with the same in different directories; add a
unique prefix to test names so that it is clear from the output which
one is running.
For completeness, extend the API to support non-persistant
device. That requires that nm_platform_link_tun_add()
returns the file descriptor.
While NetworkManager doesn't create such devices itself,
it recognizes the IFLA_TUN_PERSIST / IFF_PERSIST flag.
Since ip-tuntap (obviously) cannot create such devices,
we cannot add a test for how non-persistent devices look
in the platform cache. Well, we could instead add them
with ioctl directly, but instead, just extend the platform
API to allow for that.
Also, use the function from test-lldp.c to (optionally) use
nm_platform_link_tun_add() to create the tap device.
Previously, it was not (reliably) possible to use nmtstp_wait_for_link*() to
only look into the platform cache, without trying to poll the netlink
socket for events.
Add this option. Now, if the timeout is specified as zero, we never actually
read the netlink socket.
Currently, there are no callers who make use of this (by passing
a zero timeout). So, this is no change in existing behavior.
Implement nmtstp_assert_wait_for_link() and nmtstp_assert_wait_for_link_until()
as macros, based on nmtst_assert_nonnull().
This way, the assertion will report a more helpful file:line location,
instead of being somewhere nested inside test-common.c.