Heavily rework NML3Cfg's ACD handling.
- the (user-facing) API changed, so that the current ACD state of an
address can be queried with nm_l3cfg_get_acd_addr_info(). The
acd-event signal therefore only notifies when the state changes; it
does not carry information that you couldn't fetch at any time.
- add clearer ACD states (NML3AcdAddrState). The current (ACD) state
of an address is important and becomes part of the information that
we expose.
- add new ACD state "USED", when ACD fails. This blocks the address from
being used. Usually the caller would either remove the (used) address
or force reconfigure it (by setting acd_timeout_msec to zero).
- add new ACD state "CONFLICT". Previously conflicts were not handled.
Now the API allows specifying the defend policy. A conflicted address
also gets blocked from being used.
- add new ACD state "EXTERNAL_REMOVED". This happens when we have an
address we wanted to configure, but then the address is no longer
on the interface. For example because the user removed it from the
interface. This also leaves the device indefinitely blocked, and
is important to stop announcing the address.
- add a new ACD state "READY". This indicates that the address is ready
to be configured, but not yet actually configured on the device. This
is the step before "DEFENDING".
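The states listed above can be sketched roughly as follows. This is only an illustration: the enum and function names are hypothetical, the real NML3AcdAddrState may have more members, and the sketch only encodes what the list says about which states block an address.

```c
#include <stdbool.h>

/* Hypothetical, simplified mirror of the states named above. The real
 * NML3AcdAddrState enum may use other names and have more members. */
typedef enum {
    ACD_STATE_READY,            /* may be configured, but is not yet on the device */
    ACD_STATE_DEFENDING,        /* the step after READY */
    ACD_STATE_USED,             /* ACD failed */
    ACD_STATE_CONFLICT,         /* a conflict was detected */
    ACD_STATE_EXTERNAL_REMOVED, /* address is no longer on the interface */
} AcdState;

/* Returns true if an address in this state must not be configured. */
static bool
acd_state_blocks_address(AcdState state)
{
    switch (state) {
    case ACD_STATE_USED:
    case ACD_STATE_CONFLICT:
    case ACD_STATE_EXTERNAL_REMOVED:
        return true;
    case ACD_STATE_READY:
    case ACD_STATE_DEFENDING:
        return false;
    }
    return true; /* unknown state: be conservative and block */
}
```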
ACD is handled by NML3Cfg and it intercepts the IP addresses when
merging the NML3ConfigData.
Originally, I thought that in such a case, the merged l3cd instance
would simply not contain any addresses for which ACD is still pending
or which have a conflict.
However, I think it's better (clearer and possibly useful), to still
merge such addresses, but flag them that they are ignored when syncing
the addresses to platform.
It is not yet used, but it will be used to mark instances that
are not supposed to be configured in platform, because ACD is
either still pending or failed.
When a wifi device is in a bridge, the supplicant must be aware of it,
as a socket must be opened on the bridge to receive packets.
Set the BridgeIfname property of the supplicant Interface object
before starting the association. Note that the property was read-only
in the past and recently [1] became read-write. When using a
supplicant version without the patch, writing the property will return
an InvalidArgs error and NetworkManager will print a warning.
[1] https://w1.fi/cgit/hostap/commit/?id=1c58317f56e312576b6872440f125f794e45f991
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/83
The underscore somehow indicated that these would be internal
functions. Which they are in the sense that they are in "shared/nm-glib-aux/".
But they are part of our internal helper functions, and in our code base
their use is not discouraged or "private".
Also, next I'll replace the function call with a macro, so I will
need the underscore name.
Rename.
This is potentially a breaking change. Formerly, specifying 'none|off'
in the kernel cmdline option 'ip' was understood by the dracut
network-module as doing 'ipv6.method=auto', which is clearly inconsistent
with the 'off' naming. Thus 'off|none' now means to actually disable
both ipv6 and ipv4 (unless a static ip is provided).
Unit test added.
https://bugzilla.redhat.com/show_bug.cgi?id=1883958
Reverts: 440a0b4078 ('initrd: set ipv6.method=auto when the autoconfiguration field is 'none'')
Signed-off-by: Antonio Cardace <acardace@redhat.com>
- use a guint64 variable to avoid wrapping the counter
- cache the used ID in NMDevice. This way, the same NMDevice
instance will get the same UDI path when it realizes
and unrealizes multiple times.
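A minimal sketch of the two points above, assuming a simplified stand-in for NMDevice. The names Device, udi_counter and device_get_udi_id() are hypothetical and not taken from the actual code.

```c
#include <stdint.h>

/* Hypothetical stand-in for NMDevice; only the cached ID matters here. */
typedef struct {
    uint64_t udi_id; /* 0 means "no ID assigned yet" */
} Device;

/* A 64-bit counter will not wrap in practice, unlike a 32-bit one. */
static uint64_t udi_counter = 0;

/* Return the device's ID. It is allocated on first use and cached, so
 * the same instance keeps the same ID across realize/unrealize cycles. */
static uint64_t
device_get_udi_id(Device *dev)
{
    if (dev->udi_id == 0)
        dev->udi_id = ++udi_counter;
    return dev->udi_id;
}
```

Because the ID lives in the instance, repeated calls for the same device return the same value, while a different device gets a fresh one.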
Change the default DNS priority of VPNs to -50, to avoid leaking
queries out of full-tunnel VPNs.
This is a change in behavior. In particular:
- when using dns=default (i.e. no split-dns) before this patch both
VPN and the local name server were added (in this order) to
resolv.conf; the result was that depending on resolv.conf options
and resolver implementation, the name servers were tried in a
certain manner which does not prevent DNS leaks.
With this change, only the VPN name server is added to resolv.conf.
- When using a split-dns plugin (systemd-resolved or dnsmasq), before
this patch the full-tunnel VPN would get all queries except those
ending in a local domain, that would instead be directed to the
local server.
After this patch, the VPN gets all queries.
To revert to the old behavior, set the DNS priority to 50 in the
connection profile.
If a VPN has never-default=no but doesn't get a default route (this
can happen for example when the server pushes routes with
openconnect), and there are no search domains, then the name servers
pushed by the server would be unused. It is preferable in this case to
use the VPN DNS server for all queries.
https://bugzilla.redhat.com/show_bug.cgi?id=1863041
The test spawns processes and tries to kill them, with timeouts and retry.
That is inherently racy, and it's hard to deterministically test the
interesting cases, without having unstable tests.
Try to adjust the timeout, to make it more stable:
14:02:27 /builds/NetworkManager/NetworkManager/tools/run-nm-test.sh --called-from-make /builds/NetworkManager/NetworkManager/build --launch-dbus=auto /builds/NetworkManager/NetworkManager/build/src/tests/test-core-with-expect
--- stdout ---
# random seed: R02S7748fae8fc946b7a755b72efb5815250
1..5
# Start of general tests
ok 1 /general/nm_utils_monotonic_timestamp_as_boottime
# NetworkManager-DEBUG: <debug> [1601992953.4091] kill child process 'test-s-1-3' (18615): sending SIGKILL...
# NetworkManager-DEBUG: <debug> [1601992953.4242] kill child process 'test-s-1-3' (18615): waiting for process to terminate after sending no signal (0) and SIGKILL...
# NetworkManager-DEBUG: <debug> [1601992953.4257] kill child process 'test-s-1-3' (18615): after sending no signal (0) and SIGKILL, process 18615 exited by signal 9 (20807 usec elapsed)
Bail out! GLib:ERROR:../src/tests/test-core-with-expect.c:154:test_nm_utils_kill_child_sync_do: Did not see expected message NetworkManager-DEBUG: *<debug> [*] kill child process 'test-s-1-3' (*): waiting up to 1 milliseconds for process to terminate normally after sending no signal (0)...
Bail out! test:ERROR:../src/tests/test-core-with-expect.c:457:test_nm_utils_kill_child: assertion failed (exit_status == 0): (6 == 0)
--- stderr ---
**
GLib:ERROR:../src/tests/test-core-with-expect.c:154:test_nm_utils_kill_child_sync_do: Did not see expected message NetworkManager-DEBUG: *<debug> [*] kill child process 'test-s-1-3' (*): waiting up to 1 milliseconds for process to terminate normally after sending no signal (0)...
**
test:ERROR:../src/tests/test-core-with-expect.c:457:test_nm_utils_kill_child: assertion failed (exit_status == 0): (6 == 0)
/builds/NetworkManager/NetworkManager/tools/run-nm-test.sh: line 279: 18325 Aborted "${NMTST_DBUS_RUN_SESSION[@]}" "${NMTST_LIBTOOL[@]}" "$NMTST_VALGRIND" --quiet --error-exitcode=$VALGRIND_ERROR --leak-check=full --gen-suppressions=all "${NMTST_SUPPRESSIONS[@]}" --num-callers=100 --log-file="$LOGFILE" "$TEST" "$@"
GDBusObjectManagerClient's interface-added and interface-removed signals
are not emitted when the new interfaces are added to a completely new
object or the removal results in the object disappearing. In other
words one interface is never reported both through interface-added and
object-added (or -removed) signals. This kind of makes sense but isn't
documented explicitly, so interface-added seemed to correspond to the
DBus InterfacesAdded signal, which it doesn't.
We need to watch for both kinds of signals and although most things
work without us receiving the signals at all, it causes some race
conditions. For example on hotplug, devices wouldn't transition to
"disconnected" if a device was discovered by NMManager before it
appeared on IWD's dbus interface because that scenario relied on the
dbus signal.
The automatic scanning every 20 seconds while connected has been
annoying users because of the extra connection latency, drop it. The
UIs are supposed to be requesting scans whenever an AP list update is
needed (?).
Fix a crash on device unplugging caused by keeping our signal handlers
for GDBusProxies connected after a call to dispose(). Do this by
replacing most cleanup steps by a nm_device_iwd_set_dbus_object(self, NULL)
call which is more meticulous.
As one of the arguments is unsigned, the calculation is performed as
unsigned integers. That can actually lead to the wrong result. Fix it by
casting to the right (signed) types.
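The class of bug can be illustrated with a generic example. This is not the code touched by the patch; the function names and the deadline scenario are made up, and the sketch assumes a platform where int is 32 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy: "now" is converted to unsigned for the subtraction, so a
 * negative difference wraps around to a large positive value. */
static bool
has_expired_buggy(int32_t now, uint32_t deadline)
{
    int64_t diff = deadline - now; /* computed as uint32_t, then widened */

    return diff < 0; /* never true */
}

/* Fixed: cast both operands to a signed type that can hold either
 * value before subtracting. */
static bool
has_expired_fixed(int32_t now, uint32_t deadline)
{
    int64_t diff = (int64_t) deadline - (int64_t) now;

    return diff < 0;
}
```

With now = 10 and deadline = 5, the buggy variant reports "not expired" while the fixed one reports "expired".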
Emitting signals is relatively expensive, because the arguments have to be packed
into a GValue. Avoid some overhead by only passing one signal argument: the notify-data,
which also contains the type. Also, with this we can use g_cclosure_marshal_VOID__POINTER.
Also, it's nice to have the type field part of the notify-data. Because clearly
the notify-data union is unusable without knowing the type. That means, if a user
passes the notify-data to a function, they anyway would also need to pass along
the type.
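The idea of keeping the type inside the notify-data can be sketched as a tagged union. The type, field and function names below are hypothetical and only illustrate the shape; they are not the actual NML3Cfg definitions.

```c
/* Hypothetical notification types, for illustration only. */
typedef enum {
    NOTIFY_TYPE_ACD_EVENT,
    NOTIFY_TYPE_PLATFORM_CHANGE,
} NotifyType;

/* The type travels with the data, so a single pointer argument is
 * enough to interpret the union correctly. */
typedef struct {
    NotifyType type;
    union {
        struct {
            unsigned addr_count;
        } acd_event;
        struct {
            int ifindex;
        } platform_change;
    } u;
} NotifyData;

/* A handler that receives only the notify-data pointer, which is the
 * shape a VOID__POINTER marshaller delivers. Returns the payload value
 * of the active union member, or -1 for an unknown type. */
static int
handle_notify(const NotifyData *data)
{
    switch (data->type) {
    case NOTIFY_TYPE_ACD_EVENT:
        return (int) data->u.acd_event.addr_count;
    case NOTIFY_TYPE_PLATFORM_CHANGE:
        return data->u.platform_change.ifindex;
    }
    return -1;
}
```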
NML3Cfg tends to perform actions on an idle handler. That means, when
it configures something on platform, it tends to ignore the changes and
process them later.
That means the currently tracked NMPObject with the platform link may
not be the same as NMPlatform currently has cached.
Instead, track them both, and extend the API so that it's clear that
there is a difference. You now need to say whether you want the instance
from the platform cache (the "next") or the currently used instance. Of
course, after the idle handler runs, "next" and the current one
converge.
This is useful because we want to reason about the link state (also) by
looking at our NML3Cfg instance. Since it already is connected to
platform, it can expose the same NMPObject.
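A rough sketch of tracking both instances, under the assumption of a much simplified model. Link, L3Cfg and the l3cfg_*() functions are hypothetical names, not the real API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NMPObject of a platform link. */
typedef struct {
    int  ifindex;
    bool up;
} Link;

typedef struct {
    const Link *link;      /* instance currently in use */
    const Link *link_next; /* latest instance from the platform cache */
} L3Cfg;

/* The caller must say which instance it wants. */
static const Link *
l3cfg_get_link(const L3Cfg *self, bool get_next)
{
    return get_next ? self->link_next : self->link;
}

/* A platform change only updates "next"; processing is deferred. */
static void
l3cfg_platform_changed(L3Cfg *self, const Link *latest)
{
    self->link_next = latest;
}

/* When the idle handler runs, the two instances converge. */
static void
l3cfg_on_idle(L3Cfg *self)
{
    self->link = self->link_next;
}
```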
- add nm_l3cfg_platform_commit_on_idle_schedule() so that internal (and
external) code can schedule a commit on an idle handler. This already
existed, but is exposed now.
- rename nm_l3cfg_platform_commit() to simply nm_l3cfg_commit(). There
is no other form than "platform" commit, so the name was
unnecessarily long.
- also don't let nm_l3cfg_commit() return a boolean success. It's not
useful, because commits can be triggered internally (by NML3Cfg
itself) or by other users. Instead, there is the "post-commit" event,
and anybody who cares about such a failure would need to handle it
there.
Our NML3Cfg instance is the IP configuration manager of one ifindex.
Often users have an NML3Cfg instance at hand, but they still need to
react to platform signals. Instead of requiring those users to register
their own signal (which also gets notifications about uninteresting
interfaces), re-emit the signal from NML3Cfg.
We already had NM_L3_CONFIG_NOTIFY_TYPE_PLATFORM_CHANGE_ON_IDLE which
does something similar, but collects multiple changes and emits them
on an idle handler.
This flag indicates that the NML3ConfigData should be ignored for most
purposes, except for doing ACD.
Note that as users can call nm_l3cfg_add_config() multiple times for
the same NML3ConfigData, a higher layer that enables ACD/IPv4LL can
then decide to actually use the configuration, while some layers
only have it hooked up to do ACD.
Now that NMPlatformIP[46]Route can contain a wildcard table/metric, we
can set the effective table/metric per NML3ConfigData that we merge.
Pass it to nm_l3cfg_add_config().
When we (for example) receive a DHCP lease, we track the routes that
should be configured via NMPlatformIP[46]Route instances. Thus, this
structure does not only track the routes that are configured (and
cached in NMPlatform), but it is also used to track the routes that
we want to configure.
This is also the case with the "rt_source" field, which represents the
NMIPConfigSource enum for routes that we want to configure, but
for routes in the cache it corresponds to rtm_protocol.
Note that NMDhcpClient creates NMIP4Config instances, which tracks the
routes as NMPlatformIP4Route instances. Previously, NMDhcpClient didn't
have any way to leave the table/metric undecided, but this information
isn't part of the DHCP lease itself. Instead, NMDevice knows the table/metric
to use. This has various problems:
- NMDhcpClient needs to know the table/metric, for no other purpose
than to set the value when creating the NMIP4Config instance for the
lease. We first pass the information down, only so that it can be
returned with the lease information.
- during reapply or when connectivity check changes, the effectively
used table/metric can change. Previously, we would have to
re-generate the NMIP4Config instances.
Improve that by allowing to leave the table/metric undecided. Higher
layers can decide the effective metric to use.
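Leaving the table/metric undecided and resolving it later might look like the sketch below. The Route struct, the table_any/metric_any flags and route_resolve() are simplified assumptions for illustration, not the actual NMPlatformIP[46]Route layout, and the real resolution rules may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified route with wildcard flags. */
typedef struct {
    uint32_t table;
    uint32_t metric;
    bool     table_any;  /* true: table left undecided by the producer */
    bool     metric_any; /* true: metric left undecided by the producer */
} Route;

/* A higher layer fills in the undecided fields with its effective
 * values. Explicitly set fields are left untouched. */
static Route
route_resolve(Route r, uint32_t effective_table, uint32_t effective_metric)
{
    if (r.table_any) {
        r.table     = effective_table;
        r.table_any = false;
    }
    if (r.metric_any) {
        r.metric     = effective_metric;
        r.metric_any = false;
    }
    return r;
}
```

Since the stored route keeps its wildcards, it can be resolved again with different values (for example after a reapply) without regenerating the lease data.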