IPv6 router advertisement messages contain the following parameters
(RFC 4861):
- Reachable time: 32-bit unsigned integer. The time, in
milliseconds, that a node assumes a neighbor is reachable after
having received a reachability confirmation. Used by the Neighbor
Unreachability Detection algorithm. A value of zero means
unspecified (by this router).
- Retrans Timer: 32-bit unsigned integer. The time, in milliseconds,
between retransmitted Neighbor Solicitation messages. Used by
address resolution and the Neighbor Unreachability Detection
algorithm. A value of zero means unspecified (by this router).
Currently NM ignores them; however, since it leaves accept_ra=1, the
kernel parses RAs and applies those parameters for us [1].
In the next commit kernel handling of RAs will be disabled, so let NM
set those neighbor-related parameters.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv6/ndisc.c?h=v5.2#n1353
We no longer add these. If you use Emacs, configure it yourself.
Also, due to our "smart-tab" usage the editor anyway does a subpar
job handling our tabs. However, on the upside every user can choose
whatever tab-width he/she prefers. If "smart-tabs" are used properly
(like we do), every tab-width will work.
No manual changes, just ran commands:
F=($(git grep -l -e '-\*-'))
sed '1 { /\/\* *-\*- *[mM]ode.*\*\/$/d }' -i "${F[@]}"
sed '1,4 { /^\(#\|--\|dnl\) *-\*- [mM]ode/d }' -i "${F[@]}"
Check remaining lines with:
git grep -e '-\*-'
The ultimate purpose of this is to cleanup our files and eventually use
SPDX license identifiers. For that, first get rid of the boilerplate lines.
For the most part, we only have one main .gitignore file.
There were a few nested files, merge them into the main file.
I find it better to have only one gitignore file, otherwise the
list of ignored files is spread out through the working directory.
The defaults for test timeouts in meson is 30 seconds. That is not long
enough when running
$ NMTST_USE_VALGRIND=1 ninja -C build test
Note that meson supports --timeout-multiplier, and automatically
increases the timeout when running under valgrind. However, meson
does not understand that we are running tests under valgrind via
NMTST_USE_VALGRIND=1 environment variable.
Timeouts are really not expected to be reached and are a mean of last
resort. Hence, increasing the timeout to a large value is likely to
have no effect or to fix test failures where the timeout was too rigid.
It's unlikely that the test indeed hangs and the increase of timeout
causes a unnecessary increase of waittime before aborting.
Use the NM_ERRNO_NATIVE() macro that asserts that these errno numbers are
indeed positive. Using the macro also serves as a documentation of what
the meaning of these numbers is.
That is often not obvious, whether we have an nm_errno(), an nm_errno_native()
(from <errno.h>), or another error number (e.g. WaitForNlResponseResult). This
situation already improved by merging netlink error codes (nle),
NMPlatformError enum and <errno.h> as nm_errno(). But we still must
always be careful about not to mix error codes from different
domains or transform them appropriately (like nm_errno_from_native()).
For static functions inside a module, the compiler determines on its own
whether to inline the function.
Also, "inline" was used at some places that don't immediatly look like
candidates for inlining. It was most likely a copy&paste error.
While nm_utils_inet*_ntop() accepts a %NULL buffer to fallback
to a static buffer, don't do that.
I find the possibility of using a static buffer here error prone
and something that should be avoided. There is of course the downside,
that in some cases it requires an additional line of code to allocate
the buffer on the stack as auto-variable.
Correct the spelling across the *entire* tree, including translations,
comments, etc. It's easier that way.
Even the places where it's not exposed to the user, such as tests, so
that we learn how is it spelled correctly.
In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.
That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.
Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).
Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
I am not sure, we ever call complete_address() for router-configurations.
Maybe not, so the dad-counter is never incremented and does not matter either.
If we however do, then we certainly want to preserve the DAD counter
when the address is already tracked.
we use get_expiry() to compare two lifetimes. Note, that previously,
it would correctly truncate the calculated expiry at G_MAXINT32-1.
However, that means, that two different lifetimes that both lie
more than 68 years in the future would compare equal.
Fix that, but extending the range to int64, so that no overflow
can happen.
RFC4862 5.5.3, points d) and e) make it clear, that the list of
addresses should be compared based on the prefix.
d) If the prefix advertised is not equal to the prefix of an
address configured by stateless autoconfiguration already in the
list of addresses associated with the interface (where "equal"
means the two prefix lengths are the same and the first prefix-
length bits of the prefixes are identical), and if the Valid
Lifetime is not 0, form an address (and add it to the list) by
combining the advertised prefix with an interface identifier of
the link as follows:
That means, we should not initialize the interface identifier first
(via complete_address()) and then search for the full address.
See-also: https://tools.ietf.org/search/rfc4862#section-5.5.3
Previously, we would coerce the value so that preferred is the same
as lifetime. However, RFC4862 5.5.3.c) says:
c) If the preferred lifetime is greater than the valid lifetime,
silently ignore the Prefix Information option. A node MAY wish to
log a system management error in this case.
See-also: https://tools.ietf.org/search/rfc4862#section-5.5.3
Note how the nm_ndisc_add_*() return a boolean to indicate whether
anything changes. That is taken to decide whether to emit a changed
signal.
Previously, we would not consider all fields which are exposed
as public API.
Note that nm-ip6-config.c would care about the lifetime of NMNDiscAddress.
For that, nm_ndisc_add_address() would correctly consider a change of
the lifetime as relevant. So, this was for the most part not broken.
However, for example nm_ndisc_add_route() would ignore changes to the
gateway.
Always signal changes if anything changes at all. It's more correct
and robust.
It should never happen that we are unable to switch the namespace.
However, in case it does, we cannot just return G_SOURCE_CONTINUE,
because we will just endlessly trying to process IO without actually
reading from the socket.
This shouldn't happen, but the instance is hosed and something is
very wrong. No longer handle the socket to avoid an endless loop.
event_ready() calls ndp_callall_eventfd_handler(), which invokes
our own callback, which may invoke change notification.
At that point, it's not guaranteed that the signal handler won't
destroy the ndisc instance, which means, the "struct ndp" gets destroyed
while invoking callbacks. That's bad, because libndp is not robust
against that.
Ensure the object stays alive long enough.
We first iterate over addresses that might have failed IPv6 DAD and
update the state in NMNDisc.
However, while we do that, don't yet invoke the changed signal.
Otherwise, we will invoke it multiple times (in case multiple addresses
failed). Instead, keep track of whether something changed, and handle
it once a bit later.
(cherry picked from commit f312620276)
There are multiple tests with the same in different directories; add a
unique prefix to test names so that it is clear from the output which
one is running.
Platform invokes change events while reading netlink events. However,
platform code is not re-entrant and calling into platform again is not
allowed (aside operations that do not process the netlink socket, like
lookup of the platform cache).
That basically means, we have to always process events in an idle
handler. That is not a too strong limitation, because we anyway don't
know the call context in which the platform event is emitted and we
should avoid unguarded recursive calls into platform.
Otherwise, we get hit an assertion/crash in nm-iface-helper:
1 raise()
2 abort()
3 g_assertion_message()
4 g_assertion_message_expr()
5 do_delete_object()
6 ip6_address_delete()
>>> 7 nm_platform_ip6_address_delete()
8 nm_platform_ip6_address_sync()
9 nm_ip6_config_commit()
10 ndisc_config_changed()
11 ffi_call_unix64()
12 ffi_call()
13 g_cclosure_marshal_generic_va()
14 _g_closure_invoke_va()
15 g_signal_emit_valist()
16 g_signal_emit()
>>> 17 nm_ndisc_dad_failed()
18 ffi_call_unix64()
19 ffi_call()
20 g_cclosure_marshal_generic()
21 g_closure_invoke()
22 signal_emit_unlocked_R()
23 g_signal_emit_valist()
24 g_signal_emit()
>>> 25 nm_platform_cache_update_emit_signal()
26 event_handler_recvmsgs()
27 event_handler_read_netlink()
28 delayed_action_handle_one()
29 delayed_action_handle_all()
30 do_delete_object()
31 ip6_address_delete()
32 nm_platform_ip6_address_delete()
33 nm_platform_ip6_address_sync()
>>> 34 nm_ip6_config_commit()
35 ndisc_config_changed()
36 ffi_call_unix64()
37 ffi_call()
38 g_cclosure_marshal_generic_va()
39 _g_closure_invoke_va()
40 g_signal_emit_valist()
41 g_signal_emit()
42 check_timestamps()
43 receive_ra()
44 ndp_call_eventfd_handler()
45 ndp_callall_eventfd_handler()
46 event_ready()
47 g_main_context_dispatch()
48 g_main_context_iterate.isra.22()
49 g_main_loop_run()
>>> 50 main()
NMPlatform already has a check to assert against recursive calls
in delayed_action_handle_all():
g_return_val_if_fail (priv->delayed_action.is_handling == 0, FALSE);
priv->delayed_action.is_handling++;
...
priv->delayed_action.is_handling--;
Fixes: f85728ecffhttps://bugzilla.redhat.com/show_bug.cgi?id=1546656
In ndisc_set_router_config(), we initialize NMNDiscAddress based on
NMPlatformIP6Address instances. Note that their handling of timestamps
is not entirely identical.
For convenience of the user, NMPlatformIP6Address allows to not specify
any timestamp. On the contrary, for convenience of implementation does
NMNDiscAddress always require fully specified timestamps.
Properly convert one representation into the other.
Previously, we would directly log get_expiry(), which is the absolute timestamp
inn nm_utils_get_monotonic_timestamp_s() scale. This time scale starts counting
somewhere around the time when the NetworkManager process starts, hence it is not
very intuitive to look at.
Instead, print the remaining time that is left counting from now. Since
we anyway only track timeouts with a granularity of whole seconds,
printing up to 4 decimal places is sufficiently precise.
Some targets are missing dependencies on some generated sources in
the meson port. These makes the build to fail due to missing source
files on a highly parallelized build.
These dependencies have been resolved by taking advantage of meson's
internal dependencies which can be used to pass source files,
include directories, libraries and compiler flags.
One of such internal dependencies called `core_dep` was already in
use. However, in order to avoid any confusion with another new
internal dependency called `nm_core_dep`, which is used to include
directories and source files from the `libnm-core` directory, the
`core_dep` dependency has been renamed to `nm_dep`.
These changes have allowed minimizing the build details which are
inherited by using those dependencies. The parallelized build has
also been improved.
Insert the new gateway at the end when it has the least preference.
Fixes the following runtime error:
src/ndisc/nm-ndisc.c:204:_ASSERT_data_gateways: assertion failed:
(_preference_to_priority (item_prev->preference) >=
_preference_to_priority (item->preference))
and nm_utils_ip6_property_path(). The API with static buffers
looks a bit nicer. But I think they are dangerous, because
we tend to pass the buffer down several layers of the stack, and
it's not immediately clear, that we don't overwrite the static
buffer again (which we probably did not, but it's hard to verify
that there is no bug there).
- add assert code to check that our internal tracking of
the gateways is consistent.
- assert (gracefully) against libndp returning :: as gateway
address.
We encounter the same enum in 3 forms:
- NMNDiscPreference in NetworkManager
- "enum ndp_route_preference" in <ndp.h>
- ICMPV6_ROUTER_PREF_* in <linux/icmpv6.h>
Move our enum to nm-core-utils.h, so that it can be used
by platform code as well (platform code should not include
ndisc/nm-ndisc.h).
Also, NMNDiscPreference was not numerically identical to their
native values (meaning: it shuffled the names and numbers).
Make them all numerically equal, so that they can be used in
the same context.
This means, while previously we could compare NMNDiscPreference
directly according to their priority, we now need _preference_to_priority().
On the other hand, we could omit translate_preference() -- but actually,
we still have _route_preference_coerce() because pref comes from libndp
and is thus untrusted. We still have to range check it.
We have the timestamp nm_utils_get_monotonic_time_s(), which should be
gint32 type. Then we also have timestamps in the NMNDisc* objects, which
consist of guint32 timestamp and lifetime.
Cleanup handling the times and calculation of the timestamps by using
the correct integer type consistently and ensuring that no integer overflow
occurs.
The @i variable to loop over the arrays should have the same type as
GArray.len, to which it is compared. In this case "guint".
As we remove items from the arrays while iterating over it, it gets
a bit complicated either way. I disliked that
g_array_remove_index (rdata->gateways, i--);
would underflow for unsigned integers. While that would work fine,
I think that is confusing. So, the variable is no longer incremented
in the increment statement of the for loop.