Commit 345bd1b187 ('libnmc: fix secrets request on 2nd stage of 2FA
authentication') and commit 27c701ebfb ('libnmc: allow user input in
ECHO mode for 2FA challenges') introduced 2 new tags that hints for the
secret agents can have as prefix.
These tags were processed (and removed) in the secret agents, not in the
daemon. This is wrong because a system with an updated VPN plugin but a
not yet updated secret agent (like nm-plasma) will fail: it won't remove
the prefix and the daemon will save the secret with the prefix, i.e.
"x-dynamic-challenge:challenge-response" instead of just
"challenge-response". Then, VPN plugins doesn't recognize it, failing the
profile's activation. This is, in fact, an API break.
Also, if the VPN connection already existed before updating NM and the
VPN plugin, the secret flags are not added to the profile (they are only
added when the profile is created or modified). This causes the user's
first time response is saved to the profile, so the activation fails the
second and next times.
See:
- https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1536
- https://gitlab.gnome.org/GNOME/NetworkManager-openvpn/-/issues/142
Anyway, in a good design the daemon should contain almost all the logic
and the clients should keep as simple as possible. Fix above's problems
by letting the daemon to receive the secret names with the prefix
already included. The daemon will strip it and will know what it means.
Note that this is done only in the functions that saves the secrets from
the data received via D-Bus. For example, nm_setting_vpn_add_secret
doesn't need to do it because this value shouldn't come from VPN
plugin's hints.
(cherry picked from commit 0583e1f843)
(cherry picked from commit 574741783c)
Usually, when the method is "auto" we want to avoid configuring routes
until the automatic method completes. To achieve that, we clear the
"allow_routes_without_address" flag of l3cds when the method is "auto".
For VPNs, IP configurations with only routes are perfectly valid,
therefore set the flag.
(cherry picked from commit d1ffdb28eb)
(cherry picked from commit 5b4ed809cc)
Add a function to set the allow-routes-without-address flag for
l3cds. It will be used in the next commit.
(cherry picked from commit a3ce13c947)
(cherry picked from commit 5fa063f90d)
The name "dhcp_enabled" is misleading because the flag is set for
method=auto, which doesn't necessarily imply DHCP. Also, it doesn't
convey what the flag is used for. Rename it to
"allow_routes_without_address".
(cherry picked from commit b31febea22)
(cherry picked from commit 6897b6ecfd)
An IPv4-over-IPv6 (or vice-versa) IPsec VPN can return IP
configurations with routes and without addresses. For example, in this
scenario:
+---------------+ +---------------+
| fd01::10/64 <-- VPN --> fd02::20/64 |
| host1 | | host2 |
+-------^-------+ +-------^-------+
| |
+-------v-------+ +-------v-------+
| subnet1 | | subnet2 |
| 172.16.1.0/24 | | 172.16.2.0/24 |
+---------------+ +---------------+
host1 and host2 establish a IPv6 tunnel which encapsulates packets
between the two IPv4 subnets. Therefore, in routed mode, host1 will
need to configure a route like "172.16.2.0/24 via ipsec1" even if the
host doesn't have any IPv4 address on the VPN interface.
Accept IP configurations without address from the VPN; only check that
the address and prefix are sane if they are provided.
(cherry picked from commit 97f185e1f8)
(cherry picked from commit 518b7c5bd5)
Problem:
Given a OVS port with `autoconnect-ports` set to default or false,
when reactivation required for checkpoint rollback,
previous activated OVS interface will be in deactivate state after
checkpoint rollback.
The root cause:
The `activate_stage1_device_prepare()` will mark the device as
failed when controller is deactivating or deactivated.
In `activate_stage1_device_prepare()`, the controller device is
retrieved from NMActiveConnection, it will be NULL when NMActiveConnection
is in deactivated state. This will cause device been set to
`NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED` which prevent all follow
up `autoconnect` actions.
Fix:
When noticing controller is deactivating or deactivated with reason
`NM_DEVICE_STATE_REASON_NEW_ACTIVATION`, use new function
`nm_active_connection_set_controller_dev()` to wait on controller
device state between NM_DEVICE_STATE_PREPARE and
NM_DEVICE_STATE_ACTIVATED. After that, use existing
`nm_active_connection_set_controller()` to use new
NMActiveConnection of controller to move on.
Resolves: https://issues.redhat.com/browse/RHEL-31972
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit a68d2fd780)
(cherry picked from commit 4726822fb0)
If a connection is in-memory (i.e. has flag "unsaved"), after a
checkpoint and rollback it can be wrongly persisted to disk:
- if the connection was modified and written to disk after the
rollback, during the rollback we update it again with persist mode
"keep", which keeps it on disk;
- if the connection was deleted after the rollback, during the
rollback we add it again with persist mode "to-disk".
Instead, remember whether the connection had the "unsaved" flag set
and try to restore the previous state.
However, this is not straightforward as there are 4 different possible
states for the settings connection: persistent; in-memory only;
in-memory shadowing a persistent file; in-memory shadowing a detached
persistent file (i.e. the deletion of the connection doesn't delete
the persistent file). Handle all those cases.
Fixes: 3e09aed2a0 ('checkpoint: add create, rollback and destroy D-Bus API')
(cherry picked from commit c979bfeb8b)
GCC 14 with LTO complains with:
In function 'nm_team_link_watcher_new_ethtool',
inlined from 'nm_team_link_watcher_new_ethtool' at src/libnm-core-impl/nm-setting-team.c:106:1:
src/libnm-core-impl/nm-setting-team.c:130:33: error: array subscript 'struct NMTeamLinkWatcher[0]' is partly outside array bounds of 'unsigned char[16]' [-Werror=array-bounds=]
130 | watcher->ref_count = 1;
| ^
src/libnm-core-impl/nm-setting-team.c:128:15: note: object of size 16 allocated by 'g_malloc'
128 | watcher = g_malloc(nm_offsetofend(NMTeamLinkWatcher, ethtool));
| ^
even if the warning is disabled via pragma directives in that
code. This looks like the following GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80922
saying
We do not track warning options (and thus optimize pragmas /
attributes) across LTO because they are not saved in the function
specific optimization flag section.
We use a (NMTeamLinkWatcher *) to point to a memory area that is
shorter than the struct, because depending on the watcher type we need
to store different parameters; in this way we can save few bytes of
memory for some watcher types. However, this often breaks when
upgrading the compiler; instead just allocate the full struct.
(cherry picked from commit d369f55192)
GCC 14 complans with:
src/libnm-glib-aux/nm-uuid.c: In function 'nm_uuid_generate_from_strings_strv':
src/libnm-glib-aux/nm-uuid.c:492:12: error: '_1' may be used uninitialized [-Werror=maybe-uninitialized]
492 | return nm_uuid_generate_from_string_str(s, slen, uuid_type, type_args);
| ^
src/libnm-glib-aux/nm-uuid.c:392:1: note: by argument 1 of type 'const char *' to 'nm_uuid_generate_from_string_str' declared here
392 | nm_uuid_generate_from_string_str(const char *s,
| ^
"-Wmaybe-uninitialized" diagnoses passing pointers or references to
uninitialized memory to functions taking const-qualified arguments.
In this case, nm_uuid_generate_from_string_str()'s first argument is a
"const char *" and so the compiler expects that the string is always
initialized. However, it is not initialized when len is zero.
A non-null zero-length array can be specified in two ways: by setting
len to zero, or by setting len to -1 and having NULL as first
element. Handle both cases in the same way.
(cherry picked from commit 2386c0f52d)
When we recibe a Netlink message with a "route change" event, normally
we just ignore it if it's a route that we don't track (i.e. because of
the route protocol).
However, it's not that easy if it has the NLM_F_REPLACE flag because
that means that it might be replacing another route. If the kernel has
similar routes which are candidates for the replacement, it's hard for
NM to guess which one of those is being replaced (as the kernel doesn't
have a "route ID" or similar field to indicate it). Moreover, the kernel
might choose to replace a route that we don't have on cache, so we know
nothing about it.
It is important to note that we cannot just discard Netlink messages of
routes that we don't track if they has the NLM_F_REPLACE. For example,
if we are tracking a route with proto=static, we might receive a replace
message, changing that route to proto=other_proto_that_we_dont_track. We
need to process that message and remove the route from our cache.
As NM doesn't know what route is being replaced, trying to guess will
lead to errors that will leave the cache in an inconsistent state.
Because of that, it just do a cache resync for the routes.
For IPv4 there was an optimization to this: if we don't have in the
cache any route candidate for the replacement there are only 2 possible
options: either add the new route to the cache or discard it if we are
not interested on it. We don't need a resync for that.
This commit is extending that optimization to IPv6 routes. There is no
reason why it shouldn't work in the same way than with IPv4. This
optimization will only work well as long as we find potential candidate
routes in the same way than the kernel (comparing the same fields). NM
calls to this "comparing by WEAK_ID". But this can also happen with IPv4
routes.
It is worth it to enable this optimization because there are routing
daemons using custom routing protocols that makes tens or hundreds of
updates per second. If they use NLM_F_REPLACE, this caused NM to do a
resync hundreds of times per second leading to a 100% CPU usage:
https://issues.redhat.com/browse/RHEL-26195
An additional but smaller optimization is done in this commit: if we
receive a route message for routes that we don't track AND doesn't have
the NLM_F_REPLACE flag, we can ignore the entire message, thus avoiding
the memory allocation of the nmp_object. That nmp_object was going to be
ignored later, anyway, so better to avoid these allocations that, with
the routing daemon of the above's example, can happen hundreds of times
per second.
With this changes, the CPU usage doing `ip route replace` 300 times/s
drops from 100% to 1%. Doing `ip route replace` as fast as possible,
without any rate limitting, still keeps NM with a 3% CPU usage in the
system that I have used to test.
(cherry picked from commit 4d426f581d)
In the gtkdoc comments, the text below tags like `Since: 1.2` is
discarded. In the property `autoconnect-slaves` a line indicating its
deprecation was below one of these tags. As a result, it was missing in
the man page. Fix it.
Fixes: 194455660d ('connection: deprecate NMSettingConnection autoconnect-slaves property')
(cherry picked from commit 7427e9d320)
When creating the socket for listening to LLDP frames we are setting
NM_ETHERTYPE_LLDP (0x88cc) as protocol. In most of the cases, that is
correct but when the interface is attached as a port to a OVS bridge,
kernel is not matching the protocol correctly. The reason might be that
some metadata is added to the packet, but we are not completely sure
about it.
Instead, we should use ETH_P_ALL to match all the protocols. Later, we
have a eBPF filter to drop the packet by multicast MAC address or
protocol. This is how lldpd is doing it for example.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1903
(cherry picked from commit 9ac1d6e22b)
While enumerating devices at startup, we take a snapshot of existing
links from platform and we start creating device instances for
them. It's possible that in the meantime, while processing netlink
events in platform_link_added(), a link gets renamed. If that happens,
then we have two different views of the same ifindex: the cached link
from `links` and the link in platform.
This can cause issues: in platform_link_added() we create the device
with the cached name; then in NMDevice's constructor(), we look up
from platform the ifindex for the given name. Because of the rename,
this lookup can match a newly created, different link.
The end result is that the ifindex from the initial snapshot doesn't
get a NMDevice and is not handled by NetworkManager.
Fix this problem by fetching the latest version of the link from
platform to make sure we have a consistent view of the state.
https://issues.redhat.com/browse/RHEL-25808https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1897
(cherry picked from commit de130df3e2)
'stdout' is NULL when the script didn't write anything or failed.
Fixes the following crash detected by NMCI in test
'dispatcher_device_handler_dummy'.
nm-dispatcher[936339]: g_strsplit: assertion 'string != NULL' failed
build_result_options (nm-dispatcher)
complete_request (nm-dispatcher)
complete_script (nm-dispatcher)
script_watch_cb (nm-dispatcher)
g_child_watch_dispatch (libglib-2.0.so.0)
g_main_dispatch (libglib-2.0.so.0)
g_main_context_iterate (libglib-2.0.so.0)
g_main_context_iteration (libglib-2.0.so.0)
main (nm-dispatcher)
__libc_start_main (libc.so.6)
_start (nm-dispatcher)
Fixes: d72f26b875 ('dispatcher: read device-handler's stdout into a dictionary')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1889
(cherry picked from commit e5c2c5f1c2)
When creating VLAN over OVS internal interface which holding the same
name as its controller OVS bridge, NetworkManager will fail with error:
Error: Connection activation failed: br0.101 failed to create
resources: cannot retrieve ifindex of interface br0 (Open vSwitch
Bridge)
Expanded the `find_device_by_iface()` with additional argument
`child: NmConnection *` which will validate whether candidate is
suitable to be parent device.
In `nm_device_check_parent_connection_compatible()`, we only not allow OVS
bridge and OVS port being parent.
Resolves: https://issues.redhat.com/browse/RHEL-26753
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit 7096f52a59)
With `NM_CHECKPOINT_CREATE_FLAG_TRACK_INTERNAL_GLOBAL_DNS` flag set on
checkpoint creation, the checkpoint rollback will restore the
global DNS in internal configure file
`/var/lib/NetworkManager/NetworkManager-intern.conf`.
If user has set global DNS in /etc folder, this flag will not take any
effect.
Resolves: https://issues.redhat.com/browse/RHEL-23446
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit 86d67da28d)
GKeyfile changed something about how to handle invalid escape sequences.
As we don't want to test GKeyfile (per-se), just adjust to test to not
hit the problem.
This would fail with glib2-2.79.1-1.fc40:
# ./tools/run-nm-test.sh -m src/core/tests/config/test-config -p /config/set-values
TAP version 13
# random seed: R02Sb8afff1ec38ca5a1b7713e8c40eb4f56
# Start of config tests
# GLib-GIO-DEBUG: _g_io_module_get_default: Found default implementation local (GLocalVfs) for ?gio-vfs?
# (src/core/tests/config/test-config.c:1107) invalid value in config-data .intern.with-whitespace.key2 = (null) (instead of " b c\, d ")
./tools/run-nm-test.sh: line 307: 245847 Trace/breakpoint trap (core dumped) "${NMTST_DBUS_RUN_SESSION[@]}" "$TEST" "${TEST_ARGV[@]}"
exec "src/core/tests/config/test-config" failed with exit code 133
(cherry picked from commit 7f2a32fa11)
Upstream systemd fixed this compiler warning. What really needs to be
done, is re-importing the upstream code.
In the meantime, suppress the warning that hits on GCC 14.
This is a temporary workaround!
See-also: fdd84270df
(cherry picked from commit b1016e3be8)
systemd uses that too. We cannot enable compiler warnings that
upstream doesn't want to support.
See-also: b59bce308d
(cherry picked from commit ad22a96da9)
If a generic device is present and the name matches, it is compatible
with any link type.
For example, if a generic connection has a device-handler that creates
a dummy interface, the link is compatible with the NMDeviceGeneric.
(cherry picked from commit 5978fb2b27)
When a generic connection has a custom device-handler, it always
generates a NMDeviceGeneric, even when the link that gets created is
of a type natively supported by NM. On service restart, we need to
keep track that the device is generic or otherwise a different device
type will be instantiated.
(cherry picked from commit f2613be150)
If the device-handler of the generic connection is set, the connection
is virtual and the device is created by invoking the device-handler
via NetworkManager-dispatcher service.
With this change, a generic device now represents two different device
classes:
- existing interfaces that are not natively supported or recognized
by NetworkManager. Those devices have the `has_device_handler`
property set to FALSE;
- interfaces that are created by NM by invoking the device-handler;
they have `has_device_handler` set to TRUE.
(cherry picked from commit df6c35ec75)
Device handlers need a way to pass data back to NetworkManager, such
as the ifindex and an error message. Allow them to return a dictionary
on standard output, where each line contains a "$key=$value" pair.
In the daemon, the dictionary is returned via the callback function.
(cherry picked from commit d72f26b875)
"device-add" and "device-delete" actions are called for
device-handlers of generic devices. They differ from other actions in
the following aspects:
- only one script is invoked, the one with name specified by the
device-handler property;
- the script is searched in the "device" subdirectory;
- since there is only one script executed, the result and error
string from that script are returned by NM in the callback function.
(cherry picked from commit ee5845063d)
Call the Action2() method first, and fall back to the old Action() if
the new one is not available. This allows full interoperability
between different versions of the dispatcher service and NM.
(cherry picked from commit 8fd0d39444)
Currently, the dispatcher service implements an Action() method to
dispatch events. In the next commits, we'll need to add new
parameters, which is not possible with the current signature.
Introduce a new Action2() method, similar to the existing one but with
the following changes:
- it accepts an additional "options" input parameter of type a{sv};
- for each script executed, it also returns a dictionary of type
a{sv}.
The new parameters will allow to easily extend functionality in the
future without having to implement an Action3().
(cherry picked from commit abf0f03d25)
Introduce request_dbus_method_return() and call it whenever we need to
return a result. Don't collect the list of scripts in case the
parameters can't be parsed.
(cherry picked from commit 703efdfbbf)
Move common code from nm_dispatcher_call_device() and
nm_dispatcher_call_device_sync() to a new function; it will also be
used in the next commits by a new variant of the function.
(cherry picked from commit fa6ce51a0b)
Add a new "generic.device-handler" property that specifies the name of
a dispatcher script to be invoked to add and delete the interface for
this connection.
(cherry picked from commit e686ab35b3)
Properties in the "user" setting are a convenient way to associate any
kind of user-provided metadata to connections.
However, nmcli doesn't support the user setting at the moment and
adding this feature requires a significant effort. Without nmcli
support, dispatcher scripts can only access user properties by either
parsing connection files or by using D-Bus (with or without libnm and
GObject introspection). Since both these solutions are not very
convenient, provide an alternative way: pass the properties as
environment variables.
(cherry picked from commit d7c311eb85)
The error messages are logged by the dispatcher and passed back to
NetworkManager which also logs them. NetworkManager log messages
usually don't end with a dot: remove it.
(cherry picked from commit 7b769e9e49)