Clean up the handling of close().
First of all, closing an invalid, non-negative file descriptor (EBADF) is
always a serious bug. We want to catch that. Hence, we should use nm_close()
(or nm_close_with_error()), which asserts against such bugs. Never call
close() directly, so that we always get that additional assertion.
Also, our nm_close() handles EINTR internally and correctly. Recent
POSIX defines that on EINTR the close should be retried. On Linux,
that is never correct. After close() returns, the file descriptor is
always closed (or invalid). nm_close() gets this right, and pretends
that EINTR is a success (without retrying).
The majority of our file descriptors are sockets, etc. That means
an error from close is often not something that we want to handle. Adjust
nm_close() to return no error and preserve the caller's errno. Ignoring
the error is the appropriate reaction in most of our cases.
An error from close (other than EINTR and EBADF) may mean that there was
an IO error. In a few cases, we may want to handle that. For those
cases we have nm_close_with_error().
TL;DR: almost always use nm_close(). If you want to handle the error
code, use nm_close_with_error(). Never use close() directly.
There is much reading on the internet about handling errors of close and
in particular EINTR. See the following links:
https://lwn.net/Articles/576478/
https://askcodes.net/coding/what-to-do-if-a-posix-close-call-fails-
https://www.austingroupbugs.net/view.php?id=529
https://sourceware.org/bugzilla/show_bug.cgi?id=14627
https://news.ycombinator.com/item?id=3363819
https://peps.python.org/pep-0475/
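The behavior described above can be sketched roughly as follows. This is
not the actual nm_close() implementation: the demo_* names are
hypothetical, plain assert() stands in for the internal assertion, and
treating a negative fd as a no-op is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for nm_close_with_error(): returns 0 on success
 * or a negative errno value. It never retries on EINTR. */
static int
demo_close_with_error(int fd)
{
    if (fd < 0)
        return 0; /* assumption of this sketch: nothing to close */

    if (close(fd) != 0) {
        int errsv = errno;

        /* Closing an invalid, non-negative fd is a bug in the caller. */
        assert(errsv != EBADF);

        if (errsv == EINTR)
            return 0; /* on Linux the fd is already gone, pretend success */
        return -errsv;
    }
    return 0;
}

/* Hypothetical stand-in for nm_close(): ignores the error and leaves
 * the caller's errno untouched. */
static void
demo_close(int fd)
{
    int errsv = errno;

    (void) demo_close_with_error(fd);
    errno = errsv;
}
```

A caller that only wants the descriptor gone uses demo_close(); a caller
that cares about a possible IO error checks the return value of
demo_close_with_error().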
For g_assert() and g_return*() we already do the same, in
"src/libnm-glib-aux/nm-gassert-patch.h"
Also patch __assert_fail() so that it omits the condition text and the
function name in production builds.
Note that this is a bit ugly, for two reasons:
- again, we make assumptions that __assert_fail() exists. In practice,
this is the case for glibc and musl.
- <assert.h> can be included multiple times, while also forward
declaring __assert_fail(). That means, we cannot add a macro
#define __assert_fail(...)
because that would break the forward declaration. Instead,
just `#define __assert_fail _nm_assert_fail_internal`
Of course, this only affects direct calls to assert(), which we have
few. nm_assert() is not affected, because that anyway doesn't do
anything, unless NM_MORE_ASSERTS is enabled.
1) ensure the compiler always sees the condition (even if
it's unreachable). That is important, to avoid warnings
about unused variables and to ensure the condition compiles.
Previously, if NM_MORE_ASSERTS was enabled and NDEBUG (or
G_DISABLE_ASSERT) was defined, the condition was not seen by
the compiler.
2) to achieve point 1, we now always evaluate the expression directly
in the nm_assert() macro. This also has the benefit that we
control exactly what is done there, and the implementation is
guaranteed to not use any code that is not async-signal-safe
(unless the assertion fails, of course).
3) add NM_MORE_ASSERTS_EFFECTIVE.
When using no glib (libnm-std-aux), the assert is implemented
by C89 assert(), while libnm-glib-aux redirects that to g_assert().
Note that these assertions are only in effect, if both NM_MORE_ASSERTS
and NDEBUG/G_DISABLE_ASSERT say so. NM_MORE_ASSERTS_EFFECTIVE
is thus the effectively used level for nm_assert().
4) use the proper __assert_fail() and g_assertion_message_expr()
calls. __assert_fail() is not standard, but it is there for glibc
and musl. So relying on this is probably fine. Otherwise, we will
get a compilation error and notice it.
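A minimal sketch of points 1 to 3, using hypothetical DEMO_* names rather
than the real macros, and plain assert() as the failure path (as in the
no-glib case). It only illustrates the pattern: the condition sits behind
a compile-time constant, so it is always compiled but only evaluated when
the effective level is positive.

```c
#include <assert.h>

/* Hypothetical build-time knob, analogous to NM_MORE_ASSERTS. */
#ifndef DEMO_MORE_ASSERTS
#define DEMO_MORE_ASSERTS 0
#endif

/* The effective level: the knob only counts if assert() is not
 * compiled out via NDEBUG. */
#ifdef NDEBUG
#define DEMO_MORE_ASSERTS_EFFECTIVE 0
#else
#define DEMO_MORE_ASSERTS_EFFECTIVE DEMO_MORE_ASSERTS
#endif

/* The compiler always sees (cond), so it must compile and any variable
 * it uses counts as used. At run time, (cond) is only evaluated when
 * the effective level is positive. */
#define demo_assert(cond)                                   \
    do {                                                    \
        if (DEMO_MORE_ASSERTS_EFFECTIVE > 0 && !(cond))     \
            assert(0 && "demo_assert() failed");            \
    } while (0)

static int
demo_half(int value)
{
    int is_even = (value % 2) == 0;

    /* No "unused variable" warning for is_even, even when disabled. */
    demo_assert(is_even);
    return value / 2;
}
```

With the default level of 0, demo_half() never aborts; building with
-DDEMO_MORE_ASSERTS=1 (and without NDEBUG) turns the check on.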
While we usually don't do that, we also want to be able to build with
NDEBUG. But in that case, we don't want the assertions in our unit
tests to be disabled.
Solve that by undefining NDEBUG and re-including <assert.h>.
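The technique relies on <assert.h> redefining assert() according to the
current state of NDEBUG every time it is included. A small sketch, where
the explicit #define only simulates a build with -DNDEBUG and
demo_asserts_enabled() is a hypothetical helper:

```c
/* Simulate a build that was configured with -DNDEBUG. */
#ifndef NDEBUG
#define NDEBUG
#endif
#include <assert.h> /* assert() expands to nothing here */

/* In the test sources: switch assertions back on. */
#undef NDEBUG
#include <assert.h> /* assert() is redefined as a real check */

/* Returns 1 if assert() evaluates its argument, 0 if it is compiled out. */
static int
demo_asserts_enabled(void)
{
    int evaluated = 0;

    assert((evaluated = 1));
    return evaluated;
}
```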
It's not needed outside the source file, and lgtm.com complains
that global variables should have a long name.
Poor global variable name 'gl'. Prefer longer, descriptive names for
globals (eg. kMyGlobalConstant, not foo).
We currently use the systemd LLDP client, which we consume by forking
systemd code. That is a maintenance burden, because it's not a
self-contained, stable library that we use. Hence there is a need for an
individual library or properly integrating the fork in our tree.
Optimally, we would create a new nettools project with an LLDP library.
That was not done because:
- nettools may want to be dual licensed with LGPL-2.1+ and Apache.
Systemd code is LGPL-2.1+ so it is fine for NetworkManager but
possibly not for nettools.
- nettools provides independent libraries, as such they don't have an
event loop, instead they expose an epoll file descriptor and the user
needs to integrate it. Systemd and NetworkManager on the other hand
have their established event loop (sd_event and GMainContext,
respectively). It's simpler to implement the library on those terms,
in particular porting the systemd library from sd_event to
GMainContext.
- NetworkManager uses glib and has various helper utils. While it's
possible to do without them, it's more work.
The main reason to not write a new NetworkManager-agnostic library from
scratch, is that it's much simpler to fork the systemd library and make
it part of NetworkManager, than making it a nettools library.
Do it.
The boottime argument might come from the system, and we should not
assert that it's reasonably small. It might be infinity. In that
case, keep it at infinity.
It belongs there, beside NMEtherAddr. Maybe NMEtherAddr should be moved to a
separate header, but it is here for now.
The only oddity is that nm_ether_addr_zero actually aliases nm_ip_addr_zero,
which is in "libnm-glib-aux/nm-inet-utils.h". We can work around that.
Taken from systemd's "Prioq".
Differences from Prioq:
- It is glib-ized, so certain operations cannot fail since g_malloc()
never fails.
- Unlike Prioq, this structure is stack allocated. I think that makes
sense, because we basically always want to embed the data structure
in another object. There is never a need for passing this around as a
pointer. And if you really want, you can box it yourself.
- The queue either accepts a GCompareFunc or a GCompareDataFunc. This
is for convenience. The prioq_ensure_allocated() and
prioq_ensure_put() consequently are dropped, as they would be
cumbersome with this pattern and don't seem useful.
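To illustrate the embedding pattern from the list above, here is a small
binary min-heap in plain C. It is only a sketch under assumptions: the
Demo* names are hypothetical, it uses malloc-family calls with abort() on
failure in place of glib, and it only supports a compare function with a
user-data argument (the GCompareDataFunc flavor).

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*DemoCompareDataFunc)(const void *a, const void *b, void *user_data);

/* Meant to be embedded in another struct or placed on the stack. */
typedef struct {
    DemoCompareDataFunc compare;
    void               *user_data;
    void              **items;
    size_t              len;
    size_t              alloc;
} DemoPrioq;

static void
demo_prioq_init(DemoPrioq *q, DemoCompareDataFunc compare, void *user_data)
{
    q->compare   = compare;
    q->user_data = user_data;
    q->items     = NULL;
    q->len       = 0;
    q->alloc     = 0;
}

static void
demo_prioq_clear(DemoPrioq *q)
{
    free(q->items);
    q->items = NULL;
    q->len   = 0;
    q->alloc = 0;
}

static void
demo_prioq_swap(DemoPrioq *q, size_t i, size_t j)
{
    void *tmp = q->items[i];

    q->items[i] = q->items[j];
    q->items[j] = tmp;
}

/* Cannot fail: like g_malloc(), we abort on allocation failure. */
static void
demo_prioq_put(DemoPrioq *q, void *item)
{
    size_t i;

    if (q->len == q->alloc) {
        q->alloc = q->alloc ? q->alloc * 2 : 8;
        q->items = realloc(q->items, q->alloc * sizeof(void *));
        if (!q->items)
            abort();
    }
    i           = q->len++;
    q->items[i] = item;
    while (i > 0) { /* sift up */
        size_t parent = (i - 1) / 2;

        if (q->compare(q->items[parent], q->items[i], q->user_data) <= 0)
            break;
        demo_prioq_swap(q, parent, i);
        i = parent;
    }
}

/* Returns the smallest item, or NULL if the queue is empty. */
static void *
demo_prioq_pop(DemoPrioq *q)
{
    void  *top;
    size_t i = 0;

    if (q->len == 0)
        return NULL;
    top         = q->items[0];
    q->items[0] = q->items[--q->len];
    for (;;) { /* sift down */
        size_t l = 2 * i + 1, r = l + 1, m = i;

        if (l < q->len && q->compare(q->items[l], q->items[m], q->user_data) < 0)
            m = l;
        if (r < q->len && q->compare(q->items[r], q->items[m], q->user_data) < 0)
            m = r;
        if (m == i)
            break;
        demo_prioq_swap(q, i, m);
        i = m;
    }
    return top;
}

static int
demo_compare_int(const void *a, const void *b, void *user_data)
{
    int x = *(const int *) a, y = *(const int *) b;

    (void) user_data;
    return (x > y) - (x < y);
}
```

The owner declares a DemoPrioq as a plain member, calls demo_prioq_init()
once, and demo_prioq_clear() when done. No separate heap allocation is
needed for the queue object itself.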
Instead of always re-requesting secrets on authentication failure, ask NMSetting
whether this is really needed. Currently this only behaves differently for the
case "802.1x with TLS", i.e. no re-request.
When an authentication attempt fails, NetworkManager re-requests new secrets
from agents before retrying. This is currently decided outside of the NMSetting
objects. With this change, the decision whether a re-request of new secrets is
really needed is moved down to the NMSetting implementations.
For the case "802.1x authentication with TLS", a certificate with password is
configured and the assumption is that this can never be wrong, so no re-request
is needed.
The pager_fallback() runs in the forked child process.
As such, it can only use functions from `man signal-safety`
or that are explicitly allowed.
We are mostly good, but g_printerr() is not allowed. It can deadlock.
Just avoid it. It's not very important to print those error messages anyway.
setenv() cannot be called after fork, because it might allocate memory,
which can deadlock.
Instead, prepare the environment and use execvpe().
`man 2 fork` says:
After a fork() in a multithreaded program, the child can safely call
only async-signal-safe functions (see signal-safety(7)) until such time
as it calls execve(2).
This means we are quite limited in what can be done in the child
process before exec. setenv() is not listed as async-signal-safe, obviously
because it allocates memory, and malloc() isn't async-signal-safe either.
See also glib's documentation of GSpawnChildSetupFunc ([1]) about what
can be done in the child process.
[1] 08cb200aec/glib/gspawn.h (L124)
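The resulting pattern can be sketched like this: the environment array is
built by the caller before fork(), and the child does nothing but exec.
demo_run_with_env() is a hypothetical helper for illustration, not the
code from this change; execvpe() is a GNU extension and therefore needs
_GNU_SOURCE.

```c
#define _GNU_SOURCE /* for execvpe() */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "file" with a caller-prepared environment and returns its exit
 * status, or -1 on error. envp must be fully built before calling,
 * so that nothing needs to be allocated after fork(). */
static int
demo_run_with_env(const char *file, char *const argv[], char *const envp[])
{
    pid_t pid;
    int   status;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no setenv(), no malloc(), no stdio. Just exec. */
        execvpe(file, argv, envp);
        _exit(127); /* exec failed */
    }

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing an envp array that contains "DEMO_PAGER_OPT=FRSXMK"
(a made-up variable) makes that variable visible to the executed program
without the child ever touching its own environment.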
Currently, when performing DNS resolution with systemd-resolved,
NetworkManager tells systemd-resolved to consider only DNS configuration
for the network interface that the connectivity check request will be
routed through. But this is not correct because DNS and routing are
configured entirely separately. For example, say we have a VPN that
receives all DNS but only a subset of routing. NetworkManager will
configure systemd-resolved with no DNS servers on any interface except
for the VPN interface, but will still route traffic through other
interfaces. This is entirely legitimate and works fine in practice,
except for the connectivity check.
To fix this, we just drop the restriction and allow systemd-resolved to
consider its full configuration, which is what gets used normally
anyway. This allows our connectivity check to match the real
configuration instead of failing spuriously.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1107
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1415
Fix the following crash:
$ nmcli device monitor a
Error: Device 'a' not found.
Segmentation fault (core dumped)
Found by coverity:
1. NetworkManager-1.41.3/src/nmcli/devices.c:0: scope_hint: In function 'do_devices_monitor'
2. NetworkManager-1.41.3/src/nmcli/devices.c:2932:28: warning[-Wanalyzer-null-dereference]: dereference of NULL 'devices'
2930| }
2931|
2932|-> for (i = 0; i < devices->len; i++)
2933| device_watch(nmc, g_ptr_array_index(devices, i));
2934|
Fixes: 2074b28976 ('nmcli/devices: return GPtrArray instead of GSList from get_device_list()')
g_memdup()'s size argument is a guint. There was CVE-2021-27219
about an integer overflow, which results in a buffer overflow.
In response to that, g_memdup2() was introduced in 2.68.
We can't use g_memdup2(), because our currently required glib
version is still 2.40.
There was no bug at those two places where g_memdup() was used.
It's just that g_memdup() is a code smell. Prevent any questions that
a reader of the code might have regarding the correctness of g_memdup()
(w.r.t. integer/buffer overflow), by not using it.
Instead, use our internal nm_memdup() variant, which exists for exactly
this reason.
See-also: https://gitlab.gnome.org/GNOME/glib/-/issues/2319
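The overflow concern can be shown in a few lines. This is only a sketch:
demo_memdup() is a hypothetical stand-in that takes a size_t (it is not
the actual nm_memdup() source), and demo_truncate_like_guint() merely
mimics what passing a large size through an unsigned int parameter does.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Duplicates "size" bytes. Because size is a size_t, a large length
 * is never silently truncated on the way in. */
static void *
demo_memdup(const void *data, size_t size)
{
    void *copy;

    if (!data || size == 0)
        return NULL;
    copy = malloc(size);
    if (!copy)
        abort(); /* mirror g_malloc(): do not return on failure */
    memcpy(copy, data, size);
    return copy;
}

/* What happens to a size_t when it is passed as a 32-bit unsigned:
 * on 64-bit platforms the upper bits are dropped. */
static unsigned int
demo_truncate_like_guint(size_t size)
{
    return (unsigned int) size;
}
```

On a platform where size_t is wider than unsigned int, a length of
UINT_MAX + 2 passed through demo_truncate_like_guint() comes out as 1,
so a callee would allocate one byte while the caller goes on to use
the full length.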