Checking whether the link exists in the cache, before talking to kernel
serves no purpose.
- in all cases, the caller already has a good indication that the link
in fact exists. That is, because the caller makes decisions on what to
do, based on what platform told it earlier. Thus, the check usually succeeds
anyway.
- in the unexpected case it doesn't succeed, we
- should not silently return without logging at least a message
- we possibly still want to send the netlink message to kernel,
just to have it fail. Note that the ifindex is indeed the identifier
for the link, so there is no danger of accidentally killing the
wrong link.
Well, theoretically there is, because the kernel's ifindex counter can
wrap or get reused when moving links between namespaces. But checking
the cache would not protect against that anyway! Worst case, the cache
would already have the impostor link and would not prevent from doing
the wrong thing. After all, they do have the same identifier, so how
would we know that this is in fact a different link?
We want that all code paths assert strictly and gracefully.
That means, if we have function nm_platform_link_get() which calls
nm_platform_link_get_obj(), then we don't need to assert the same things
twice. Don't have the calling function assert itself, if it is obvious
that the first thing that it does, is calling a function that itself
asserts the same conditions.
On the other hand, it simply indicates a bug passing a non-positive
ifindex to any of these platform functions. No longer let
nm_platform_link_get_obj() handle negative ifindex gracefully. Instead,
let it directly pass it to nmp_cache_lookup_link(), which eventually
does a g_return_val_if_fail() check. This quite possible enables
assertions on a lot of code paths. But note that g_return_val_if_fail()
is graceful and does not lead to a crash (unless G_DEBUG=fatal-criticals
is set for debugging).
nm_platform_link_delete() will soon assert against positive ifindex
argument.
nm_platform_link_delete (NM_PLATFORM_GET, nm_platform_link_get_ifindex (NM_PLATFORM_GET, DEVICE_NAME));
will result in an assertion, if the link does not exist.
Extend nmtstp_link_delete() to gracefully skip deleting the link
so that it can be used in such situations.
Also, rename nmtstp_link_del() to nmtstp_link_delete(), because it's
closer to nm_platform_link_delete().
We want to assert for valid input arguments, but we don't want
multiple assertions for the same.
Move the assertion from nm_platform_link_get() to
nm_platform_link_get_obj().
That way, nm_platform_link_get_obj() also checks the input arguments.
At the same time, nm_platform_link_get() gets simpler and still does
the same amount of assertions.
In nmp_cache_lookup_link_full(), we may have multiple candidates that match.
Continue searching, until we find a visible one. That way, visible results
are preferred.
Note that for links, nmp_object_is_visible() checks whether the link is
visible in netlink (instead of only udev).
Correct the spelling across the *entire* tree, including translations,
comments, etc. It's easier that way.
Even the places where it's not exposed to the user, such as tests, so
that we learn how is it spelled correctly.
Add helper nm_platform_link_get_ifi_flags() to access the
ifi-flags.
This replaces the internal API _link_get_flags() and makes it public.
However, the return value also allows to distinguish between errors
and valid flags.
Also, consider non-visible links. These are links that are in netlink,
but not visible in udev. The ifi-flags are inherrently netlink specific,
so it seems wrong to pretend that the link doesn't exist.
Sometimes the test fail:
$ make -j 10 src/platform/tests/test-address-linux
$ while true; do
NMTST_DEBUG=d ./tools/run-nm-test.sh src/platform/tests/test-address-linux 2>&1 > log.txt || break;
done
fails with:
ERROR: src/platform/tests/test-address-linux - Bail out! test:ERROR:src/platform/tests/test-common.c:790:nmtstp_ip_address_assert_lifetime: assertion failed (adr <= lft): (1001 <= 1000)
That is, because of a wrong check. Fix it.
In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.
That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.
Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).
Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
Rename variables for the error number. Commonly the naming
is:
- errno: the error number from <errno.h> itself
- errsv: a copy of errno
- nlerr: a netlink error number
- err: an error code, but not a errno/errsv and not
a netlink error number.
(cherry picked from commit f4de941d98)
- previously, parsing wireguard genl data resulted in memory corruption:
- _wireguard_update_from_allowedips_nla() takes pointers to
allowedip = &g_array_index (buf->allowedips, NMWireGuardAllowedIP, buf->allowedips->len - 1);
but resizing the GArray will invalidate this pointer. This happens
when there are multiple allowed-ips to parse.
- there was some confusion who owned the allowedips pointers.
_wireguard_peers_cpy() and _vt_cmd_obj_dispose_lnk_wireguard()
assumed each peer owned their own chunk, but _wireguard_get_link_properties()
would not duplicate the memory properly.
- rework memory handling for allowed_ips. Now, the NMPObjectLnkWireGuard
keeps a pointer _allowed_ips_buf. This buffer contains the instances for
all peers.
The parsing of the netlink message is the complicated part, because
we don't know upfront how many peers/allowed-ips we receive. During
construction, the tracking of peers/allowed-ips is complicated,
via a CList/GArray. At the end of that, we prettify the data
representation and put everything into two buffers. That is more
efficient and simpler for user afterwards. This moves complexity
to the way how the object is created, vs. how it is used later.
- ensure that we nm_explicit_bzero() private-key and preshared-key. However,
that only works to a certain point, because our netlink library does not
ensure that no data is leaked.
- don't use a "struct sockaddr" union for the peer's endpoint. Instead,
use a combintation of endpoint_family, endpoint_port, and
endpoint_addr.
- a lot of refactoring.
Move NMLinuxPlatformPrivate earlier.
In the past, I moved the declaration of NMLinuxPlatformPrivate
after utility functions which are independent from platform
instance.
However, parsing netlink messages actually requires
NMLinuxPlatformPrivate, because we want to access the "genl"
socket.
So, move the types to the beginning of the file, like we do
for most other source files.
The _lookup_cached_link() function, should not skip over links which are
currently in the cache, but not in netlink. Instead, let the callers
skip them, as they see fit.
No change in behavior, because the few callers now explicitly check
for this.
- drop "goto error_label" in favor of returning right away.
At most places, there was no need to do any cleanup or
the cleanup is handled via nm_auto().
- adjust the return types of wireguard functions to return
a boolean success/failure, instead of some error code which
we didn't use.
- the change to _wireguard_get_link_properties() is intentional.
This was wrong previously, because in _wireguard_get_link_properties()
obj is always a newly created instance, and never has a genl
family ID set. This will be improved later.
It's only used internally, and it seems not very useful to have.
As it is confusing to have multiple functions for doing something
similar, drop it -- since it's not really used. I also cannot imagine
a good use-case for it.
When we print info about the link, we also want to print
info about the referenced lnk instance, which commonly contains
link-specific data.
For NMP_OBJECT_TO_STRING_PUBLIC this was done correctly, by
calling to-string of public fields on the lnk object.
For NMP_OBJECT_TO_STRING_ALL, we wrongly just delegated to the
public to-string function, this means, for the lnk object we
would not print all fields.
Fix that.
When reading a file, we may allocate intermediate buffers (realloc()).
Also, reading might fail halfway through the process.
Add a new flag that makes sure that this memory is cleared. The
point is when reading secrets, that we don't accidentally leave
private sensitive material in memory.
Using '#ifdef' is generally error prone. It's better to always define
a define and check for it explicitly. This way, the compiler can issue
a warning if the define does not exist.
Also, note how meson would always define NM_MORE_LOGGING, possibly to
"0". That means, for meson, we unintentionally always enabled more
logging because the define was always present.
Fix that.
GBytes makes more sense, because it's immutable.
Also, since at other places we use GBytes, having
different types is combersome and requires needless
conversions.
Also:
- avoid nm_utils_escape_ssid() instead of _nm_utils_ssid_to_string().
We use nm_utils_escape_ssid() when we want to log the SSID. However, it
does not escape newlines, which is bad.
- also no longer use nm_utils_same_ssid(). Since it no longer
treated trailing NUL special, it is not different from
g_bytes_equal().
- also, don't use nm_utils_ssid_to_utf8() for logging anymore.
For logging, _nm_utils_ssid_escape_utf8safe() is better because
it is loss-less escaping which can be unambigously reverted.
These should be logged on DEBUG level:
<warn> platform-linux: do-change-link[2]: failure changing link: failure 97 (Address family not supported by protocol)
<warn> device (wlo1): failed to enable userspace IPv6LL address handling (unspecified)
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/10
As of upstream kernel v4.18-rc8.
Note that we name the features like they are called in ethtool's
ioctl API ETH_SS_FEATURES.
Except, for features like "tx-gro", which ethtool utility aliases
as "gro". So, for those features where ethtool has a built-in,
alternative name, we prefer the alias.
And again, note that a few aliases of ethtool utility ("sg", "tso", "tx")
actually affect more than one underlying kernel feature.
Note that 3 kernel features which are announced via ETH_SS_FEATURES are
explicitly exluded because kernel marks them as "never_changed":
#define NETIF_F_NEVER_CHANGE (NETIF_F_VLAN_CHALLENGED | \
NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)