Commit graph

1878 commits

Author SHA1 Message Date
Beniamino Galvani
9c74fa8e36 device: remove the prefix-delegation IP configuration on cleanup
When a device in IPv6 shared mode obtains a prefix, it adds a new l3cd
of type L3_CONFIG_DATA_TYPE_PD_6 for that prefix. However, that l3cd
is never removed later and so the address lingers on the interface
even after the connection goes down. Remove the l3cd on cleanup.

(cherry picked from commit 4a8bedcd89)
2025-06-27 10:04:39 +02:00
Íñigo Huguet
9ec498f321 core: ovs: fix NULL pointer dereference in ovsdb read timeout callback
Fixes: f7d321c6d6 ('ovsdb: add watchdog for unparsable JSON data in socket')
(cherry picked from commit dc9bf255ee)
2025-05-14 07:59:44 -04:00
Jan Vaclav
ae420a8dd6 firewall/utils: replace ipv4 iptables macro with ipxtables macro
(cherry picked from commit 2106251e46)
2025-05-12 13:38:38 +02:00
Jan Vaclav
3f2c0869dc firewall/utils: remove _share prefix from iptables_get_name
It's no longer used just for shared mode.

(cherry picked from commit 18d5b7d641)
2025-05-12 13:38:38 +02:00
Jan Vaclav
4d0223f8a4 firewall/wireguard: drop packets received to wrong interface
If we receive a packet sent to the WG interface's address,
but it does not come from the WG tunnel, let's assume something
is broken and drop the packet.

This is also inspired by wg-quick firewall rules:
https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick/linux.bash?id=17c78d31c27a3c311a2ff42a881057753c6ef2a4#n221

(cherry picked from commit a769c17af7)
2025-05-12 13:38:38 +02:00
Jan Vaclav
2afcebe0c7 wireguard: add firewall rules to copy mark
When a WG connection is connecting to an IPv6 endpoint, configures a
default route, and firewalld is active with IPv6_rpfilter=yes, it never
handshakes and doesn't pass traffic. This is because firewalld has an
IPv6 reverse path filter which is discarding these packets.

Thus, we add some firewall rules whenever a WG connection is brought up
that ensure the conntrack mark and packet mark are copied over.
These rules are largely inspired by wg-quick:

https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick/linux.bash?id=17c78d31c27a3c311a2ff42a881057753c6ef2a4#n221
(cherry picked from commit db557908a2)
2025-05-12 13:38:38 +02:00
Jan Vaclav
57321f78c9 build: add path definition for ip6tables
(cherry picked from commit 0f469b30ad)
2025-05-12 13:38:38 +02:00
Jan Vaclav
ff853203d9 firewall/utils: move logs from sharing to firewall domain
(cherry picked from commit 10c2892d57)
2025-05-12 13:38:38 +02:00
Jan Vaclav
e77a1df6e7 firewall/utils: fix ntf -> nft typo
Fixes: 4badc1f33a ('firewall: fix signalling timeout error reason from _fw_nft_call()')
(cherry picked from commit e39e119636)
2025-05-12 13:38:38 +02:00
Beniamino Galvani
6f480d9494 ovs: allow reapplying ovs-bridge and ovs-port properties
Allow reapplying the following properties:

 - ovs-bridge.fail-mode
 - ovs-bridge.mcast-snooping-enable
 - ovs-bridge.rstp-enable
 - ovs-bridge.stp-enable
 - ovs-port.bond-downdelay
 - ovs-port.bond-mode
 - ovs-port.bond-updelay
 - ovs-port.lacp
 - ovs-port.tag
 - ovs-port.trunks
 - ovs-port.vlan-mode

(cherry picked from commit 4f577d677f)
2025-05-09 16:45:50 +02:00
Íñigo Huguet
094a542546 core: optimize hash table search in _ethtool_fec_set
Break the loop as soon as we've found the value.

Fixes: 19bed3121f ('ethtool: support Forward Error Correction(fec)')
(cherry picked from commit 245f0e0b35)
2025-04-07 08:10:47 -04:00
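
The "break as soon as the value is found" idea from the commit above can be sketched as follows. This is a minimal illustration over a plain array, not the actual _ethtool_fec_set() code; struct option_entry and find_option() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value entry; not NetworkManager's real data structure. */
struct option_entry {
    const char *name;
    unsigned    value;
};

/* Look up `name`, stopping at the first match instead of scanning the rest. */
static bool
find_option(const struct option_entry *entries,
            size_t                     n,
            const char                *name,
            unsigned                  *out_value)
{
    bool   found = false;
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(entries[i].name, name) == 0) {
            *out_value = entries[i].value;
            found      = true;
            break; /* nothing left to learn from the remaining entries */
        }
    }
    return found;
}
```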
Íñigo Huguet
b7e34f225a core: fail early if we cannot get current FEC value
If we cannot get the current FEC value, we probably won't be able to set
it a few lines later either. Also, if setting fails, we try to use the
old value, which we tried to retrieve without success. In that case, the
variable old_fec_mode would be uninitialized. Fix it by returning early
if we cannot get the current value.

Fixes: 19bed3121f ('ethtool: support Forward Error Correction(fec)')
(cherry picked from commit cbdd0d9cca)
2025-04-07 08:10:39 -04:00
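
A minimal sketch of the early-return pattern described in the commit above. struct fake_dev, dev_get_fec(), dev_set_fec() and apply_fec() are hypothetical stand-ins for the ethtool calls, and the real function does more on a failed set than this sketch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device abstraction standing in for the ethtool calls. */
struct fake_dev {
    bool     can_read;
    bool     can_write;
    uint32_t fec_mode;
};

static bool
dev_get_fec(const struct fake_dev *dev, uint32_t *out_mode)
{
    if (!dev->can_read)
        return false; /* *out_mode is deliberately left untouched */
    *out_mode = dev->fec_mode;
    return true;
}

static bool
dev_set_fec(struct fake_dev *dev, uint32_t mode)
{
    if (!dev->can_write)
        return false;
    dev->fec_mode = mode;
    return true;
}

static bool
apply_fec(struct fake_dev *dev, uint32_t new_mode, uint32_t *out_old_mode)
{
    uint32_t old_mode;

    /* Return before old_mode can ever be read uninitialized. */
    if (!dev_get_fec(dev, &old_mode))
        return false;

    if (!dev_set_fec(dev, new_mode))
        return false;

    *out_old_mode = old_mode;
    return true;
}
```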
Tomas Korbar
873adc4dc0 dns: Refactor changing of Dnsconfd plugin state
(cherry picked from commit 7ba27f7a13)
2025-03-24 09:15:39 +01:00
Tomas Korbar
de4f4e870d dns: Fix invalid memory access on Dnsconfd DBUS error
DBus errors were not properly handled after DBus calls and
that caused SIGSEGV. Now they are checked.

Fixes #1738
Fixes: b8714e86e4 ('dns: introduce configuration_serial support to the dnsconfd plugin')

(cherry picked from commit 4ad20787bb)
2025-03-24 09:15:39 +01:00
Lubomir Rintel
a0779a9339 keyfile: don't crash on failure to write
The log statement ended up using wrong (always NULL) connection to get
ID from. Fix.

Resolves: https://issues.redhat.com/browse/RHEL-77157
(cherry picked from commit a7cf9d399f)
2025-02-28 09:18:26 +01:00
Lubomir Rintel
b7114d00ed Reapply "manager: create virtual devices on AddAndActivate()"
This reverts commit ccae5dc0e2.

(cherry picked from commit 11045cfa00)
2025-02-26 13:29:53 +01:00
Lubomir Rintel
4a1c51317e manager: make system_create_virtual_device() return a GError
This is done so that AddAndActivate() will return sensible errors in a
future patch that makes it support creating virtual devices.

In effect, all errors are logged in one place, therefore the log levels
are different. I don't think we're losing anything of value by being
a little less verbose here.

(cherry picked from commit 45d82f720c)
2025-02-26 13:29:49 +01:00
Íñigo Huguet
949c7b84a3 policy: fix uninitialized variable
The variable 'change' may be used uninitialized.

Fixes: 7acc66699a ('policy: always reset retries when unblocking children or ports')
(cherry picked from commit af6aca3527)
2025-02-24 16:47:19 +01:00
Beniamino Galvani
ceef38d9a5 l3cfg: only add MPTCP endpoints for non-tentative IPv6 addresses
An IPv6 endpoint is not usable until the address is non-tentative. Add
a mechanism to wait until the address is ready.

(cherry picked from commit 227cd6307b)
2025-02-24 09:07:54 +01:00
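
As a rough illustration of the check described in the commit above, an address can be treated as usable for an endpoint only once the kernel has cleared IFA_F_TENTATIVE on it. struct addr6 and both helpers are hypothetical; the waiting mechanism the commit mentions is not modeled here.

```c
#include <linux/if_addr.h> /* IFA_F_TENTATIVE */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down address record. */
struct addr6 {
    uint32_t flags; /* IFA_F_* bits as reported by the kernel */
};

/* An address still undergoing DAD is not yet usable as an endpoint. */
static bool
addr6_usable_for_endpoint(const struct addr6 *a)
{
    return (a->flags & IFA_F_TENTATIVE) == 0;
}

/* Count the addresses for which an endpoint could be configured now. */
static size_t
count_usable(const struct addr6 *addrs, size_t n)
{
    size_t i;
    size_t usable = 0;

    for (i = 0; i < n; i++) {
        if (addr6_usable_for_endpoint(&addrs[i]))
            usable++;
    }
    return usable;
}
```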
Beniamino Galvani
2c5a51201d l3cfg: wait for the address before configuring an MPTCP endpoint
Skip the configuration of the MPTCP endpoint when the address is in
the l3cd but is not yet configured in the platform. This typically
happens when IPv4 DAD is enabled and the address is being probed.

If we configure the endpoint without the address set, the kernel will
try to use the endpoint immediately but it will fail. Then, the
endpoint will not be used ever again after the address is added.

(cherry picked from commit 6bf859af79)
2025-02-24 09:07:54 +01:00
Beniamino Galvani
a301c259f2 core: split nm_netns_watcher_remove_all()
The name suggests that the function always removes all the watchers
with the given tag; instead it removes only "dirty" ones when the
"all" parameter is FALSE. Split the function into two variants.

(cherry picked from commit b6e67c6abc)
2025-02-24 09:07:53 +01:00
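
The refactoring in the commit above follows a common pattern: replace a boolean parameter that silently changes a function's meaning with two explicitly named entry points. A minimal sketch, with a hypothetical struct watcher and function names that are not the real NetworkManager ones:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical watcher record. */
struct watcher {
    int  tag;
    bool dirty;
    bool removed;
};

/* Shared worker; the flag is an internal detail, not part of the API. */
static size_t
remove_matching(struct watcher *w, size_t n, int tag, bool only_dirty)
{
    size_t i;
    size_t removed = 0;

    for (i = 0; i < n; i++) {
        if (w[i].removed || w[i].tag != tag)
            continue;
        if (only_dirty && !w[i].dirty)
            continue;
        w[i].removed = true;
        removed++;
    }
    return removed;
}

/* Removes every watcher with the given tag, as the name says. */
static size_t
watchers_remove_all(struct watcher *w, size_t n, int tag)
{
    return remove_matching(w, n, tag, false);
}

/* Removes only the watchers with the given tag that are marked dirty. */
static size_t
watchers_remove_dirty(struct watcher *w, size_t n, int tag)
{
    return remove_matching(w, n, tag, true);
}
```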
Tomas Korbar
39b7a8df91 dns: fix Dnsconfd autostart
When Dnsconfd service is enabled but not started, NetworkManager
should attempt to start it through DBus at least once.

Fixes: c6e1925dec ('dns: Add dnsconfd DNS plugin')
(cherry picked from commit 1463b1c0a3)
2025-02-20 19:02:25 +01:00
Fernando Fernandez Mancera
b8ef2a551e core: prevent the activation of unavailable OVS interfaces only
Preventing the activation of unavailable devices for all device types is
too aggressive and leads to race conditions, e.g. when a non-virtual bond
port gets a carrier, preventing the device from being a good candidate
for the connection.

Instead, enforce this check only on OVS interfaces as NetworkManager
just makes sure that ovsdb->ready is set to TRUE.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2139

Fixes: 774badb151 ('core: prevent the activation of unavailable devices')
(cherry picked from commit a1c05d2ce6)
2025-02-18 12:29:19 +01:00
Fernando Fernandez Mancera
2daeef668d policy: always reset retries when unblocking children or ports
When calling activate_port_or_children_connections() we are unblocking
the ports and children but we are not resetting the number of retries if
it is an internal activation.

This is wrong as even if it's an internal activation the number of
retries should be reset. It won't interfere with other blocking reasons
like USER_REQUESTED or MISSING_SECRETS.

(cherry picked from commit 7acc66699a)
2025-02-13 12:03:05 +01:00
Beniamino Galvani
5e18da31a4 dnsconfd: drop "connection-*" entries from the update method
Stop passing "connection-*" entries in the update method to
dnsconfd. The plugin tries to determine the connection from the
ifindex, but it's not possible to do it right at the moment because
the same ifindex can be used at the same time e.g. by a policy-based
VPN like ipsec and a normal device. Instead, it should be NM that
explicitly passes the information about the connection to the DNS
plugin. Anyway, these variables are not used at the moment by
dnsconfd.

Fixes: c6e1925dec ('dns: Add dnsconfd DNS plugin')
(cherry picked from commit 4d84e6cddf)
2025-02-13 10:38:34 +01:00
Beniamino Galvani
e20794989b dnsconfd: set the state to idle when connection fails
If the plugin can't connect to D-Bus, it is not waiting for an update;
set the state to idle.

(cherry picked from commit 2bfd27f74d)
2025-02-13 10:38:34 +01:00
Beniamino Galvani
dc0ff10efb dnsconfd: fix handling of the update-pending flag
After every state change of the plugin there should be an invocation
of _nm_dns_plugin_update_pending_maybe_changed() to re-evaluate
whether we are waiting for an update. send_dnsconfd_update() doesn't
change the state and so there is no need to check again afterwards.

(cherry picked from commit 8ff1cbf38b)
2025-02-13 10:38:34 +01:00
Beniamino Galvani
774badb151 core: prevent the activation of unavailable devices
When autoconnecting ports of a controller, we look for all candidate
(device,connection) tuples through the following call trace:

 -> autoconnect_ports()
   -> find_ports()
     -> nm_manager_get_best_device_for_connection()
       -> nm_device_check_connection_available()
         -> _nm_device_check_connection_available()

The last function checks that a specific device is available to be
activated with the given connection. For virtual devices, it only
checks that the device is compatible with the connection based on the
device type and characteristics, without considering any live network
information.

For OVS interfaces, this doesn't work as expected. During startup, NM
performs a cleanup of the ovsdb to remove entries that were previously
added by NM. When the cleanup is terminated, NMOvsdb sets the "ready"
flag and is ready to start the activation of new OVS interfaces. With
the current mechanism, it is possible that an OVS-interface connection
gets activated via the autoconnect-ports mechanism without checking
the "ready" flag.

Fix that by also checking that the device is available for activation.
2025-02-12 09:53:06 +01:00
Beniamino Galvani
6c1eb99d32 core: cleanup nm_manager_get_best_device_for_connection()
Rename "unavailable_devices" to "exclude_devices", as the
"unavailable" term has a specific, different meaning in NetworkManager
(i.e. the device is in the UNAVAILABLE state). Also, use
nm_g_hash_table_contains() when needed.
2025-02-12 09:51:01 +01:00
Jason A. Donenfeld
c627bbea4c nm-random-utils: always generate good random bytes and prioritize getrandom support
The current mess of code seems like a hodgepodge of complex ideas,
partially copied from systemd, but then subtly different, and it's a
mess. Let's simplify this drastically.

First, assume that getrandom() is always available. If the kernel is too
old, we have an unoptimized slowpath for still supporting ancient
kernels, a path that should be removed at some point. If getrandom()
isn't available and the fallback path doesn't work, the system has much
larger problems, so just crash. This should basically never happen.
getrandom() and having randomness available in general is a critical
system API that should be expected to be available on any functioning
system.

Second, assume that the rng is initialized, so that asking for random
numbers should never block. This is virtually always true on modern
kernels. On ancient kernels, it usually becomes true. But, more
importantly, this is not the responsibility of various daemons, even
ones that run at boot. Instead, this is something for the kernel and/or
init to ensure.

Putting these together, we adopt new behavior:

- First, try getrandom(..., ..., 0). The 0 flags field means that this
  call will only return good random bytes, not insecure ones.

- If this fails for some reason that isn't ENOSYS, crash.

- If this fails due to ENOSYS, poll on /dev/random until 1 byte is
  available, suggesting that subsequent reads from the rng will almost
  certainly have good random bytes. If this fails, crash. Then, read from
  /dev/urandom. If this fails, crash.

We don't bother caching when getrandom() returns ENOSYS. We don't apply
any other fancy optimizations to the slow fallback path. We keep that as
barebones and minimal as we can. It works. It's for ancient kernels. It
should be removed soon. It's not worth spending cycles over. Instead,
the goal is to eventually reduce all of this down to a simple boring
call to getrandom(..., ..., 0).

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2127
2025-02-11 10:04:26 +01:00
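
The behavior listed in the commit above can be sketched roughly as follows. This is not the nm-random-utils code; random_bytes(), fallback_random_bytes() and read_all_or_abort() are hypothetical names, and "crash" is modeled with abort().

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes from fd, aborting on any error or EOF. */
static void
read_all_or_abort(int fd, unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t r = read(fd, buf, n);

        if (r < 0 && errno == EINTR)
            continue;
        if (r <= 0)
            abort();
        buf += r;
        n -= (size_t) r;
    }
}

/* Slow path for kernels without getrandom(): wait until /dev/random is
 * readable, then take the bytes from /dev/urandom. */
static void
fallback_random_bytes(unsigned char *buf, size_t n)
{
    struct pollfd pfd = {.events = POLLIN};
    int           fd;
    int           r;

    pfd.fd = open("/dev/random", O_RDONLY);
    if (pfd.fd < 0)
        abort();
    do {
        r = poll(&pfd, 1, -1);
    } while (r < 0 && errno == EINTR);
    if (r != 1 || !(pfd.revents & POLLIN))
        abort();
    close(pfd.fd);

    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        abort();
    read_all_or_abort(fd, buf, n);
    close(fd);
}

/* Fill buf with n good random bytes or crash trying. */
static void
random_bytes(unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t r = getrandom(buf, n, 0); /* flags 0: only good bytes */

        if (r >= 0) {
            buf += r;
            n -= (size_t) r;
            continue;
        }
        if (errno == EINTR)
            continue;
        if (errno == ENOSYS) {
            fallback_random_bytes(buf, n);
            return;
        }
        abort(); /* any failure other than ENOSYS is fatal */
    }
}
```

As in the commit message, the ENOSYS result is not cached, so every call on an ancient kernel retries getrandom() first.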
Lubomir Rintel
d1725cd288 Revert "manager: create virtual devices on AddAndActivate()"
This reverts commit eb635c23a7.
2025-02-06 10:35:02 +01:00
Beniamino Galvani
05efd5ab62 l3cfg: add the DNS routing rules explicitly
Add the DNS routing rules explicitly instead of tracking them via the
NMGlobalTracker mechanism. Since we do not plan to ever remove them,
there is no reason to track the rules. Also, the current
implementation is buggy because in some situations the rules are
wrongly removed when they should not.

Fixes: bf3ecd9031 ('l3cfg: fix DNS routes')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2125
2025-02-04 10:35:55 +01:00
Tomas Korbar
b8714e86e4 dns: introduce configuration_serial support to the dnsconfd plugin
"configuration_serial" dbus property ensures that the plugin
can mark an update 'not pending' only when the update is truly finished.
This mechanism exists because of the underlying problem of having
to restart, or perform a similarly time-consuming operation, to change
certain configuration parameters of the resolver. If Dnsconfd blocked
the update call until the update is finished, it could not
respond to any other requests until the call returned.
2025-01-29 14:41:47 +01:00
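
One way to picture the mechanism in the commit above: the caller remembers which serial it is waiting for and considers the update pending until the observed property value catches up. The sketch below assumes serials only ever increase and that the update call hands back the serial to wait for; both are assumptions, and struct dns_update_tracker and its helpers are hypothetical rather than the plugin's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one DNS plugin instance. */
struct dns_update_tracker {
    uint32_t awaited_serial; /* serial associated with the last update sent */
    uint32_t applied_serial; /* last "configuration_serial" value observed */
};

/* Record the serial that the most recent update is expected to reach. */
static void
tracker_update_sent(struct dns_update_tracker *t, uint32_t serial)
{
    t->awaited_serial = serial;
}

/* Called when the property changes on the bus. */
static void
tracker_property_changed(struct dns_update_tracker *t, uint32_t serial)
{
    t->applied_serial = serial;
}

/* The update is pending until the applied serial reaches the awaited one. */
static bool
tracker_update_pending(const struct dns_update_tracker *t)
{
    return t->applied_serial < t->awaited_serial;
}
```

The update call itself can return immediately, so the service stays responsive while a slow reconfiguration is in progress.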
Tomas Korbar
c6e1925dec dns: Add dnsconfd DNS plugin
dnsconfd can now be used as DNS configuration plugin.

If ipvX.routed-dns is set to -1 and dnsconfd plugin is enabled then
routes are added by default.
2025-01-29 14:41:47 +01:00
Tomas Korbar
c08ecfd5fe dns: Add resolve-mode and certification-authority keys to global-dns
Resolve-mode allows the user to specify how the global-dns domains
and DNS connection information should be merged and used.

Certification-authority allows the user to specify the certification
authority that should be used to verify certificates of encrypted
DNS servers.
2025-01-29 14:41:47 +01:00
Beniamino Galvani
c9be26cf9a format: run nm-code-format
The new clang-format changed the formatting output, update the code.
2025-01-29 14:38:22 +01:00
Beniamino Galvani
98b124a661 dhcp: drop dhcpcanon support
Drop support for the "dhcpcanon" DHCP client. It's unmaintained, as the
last code change was in 2018:

  https://github.com/juga0/dhcpcanon/commits

There is no need to first deprecate it because it was still marked as
"experimental" in NM. Also, it's not packaged by any recent distro, so
we can assume that nobody will miss it.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2112
2025-01-20 18:56:41 +01:00
Lubomir Rintel
79219553be cloud-setup: fix build
Fixes: 6ff4b9e57c ('cloud-setup: create VLANs for multiple VNICs on OCI')
2025-01-20 17:53:58 +01:00
Lubomir Rintel
eb635c23a7 manager: create virtual devices on AddAndActivate()
If the connection didn't exist in advance, there's no unrealized device,
and find_device_by_iface() is not going to get us one.

Call system_create_virtual_device() after nm_utils_complete_generic()
completes the connection for virtual devices. Make sure we do proper
cleanup if we happen to fail the activation later, so that the device
doesn't end up hanging there.
2025-01-20 06:18:45 +01:00
Lubomir Rintel
57e140d961 manager: split device creation off from validate_activation_request()
Make validate_activation_request() only do the validation -- split the
determination of the device into find_device_for_activation().

The point of this is to be able to complete the connection and actually
create a virtual device after the validation.

I believe this is also somewhat easier to follow now that the procedure
does what its name says.
2025-01-20 06:15:54 +01:00
Lubomir Rintel
25871f1971 manager: reword some error messages
They've been a little too cryptic and unnecessarily long before.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
cfe6e730b3 device: don't log connection UUIDs on device creation
It's irrelevant, doesn't look good, and might possibly be not there
because the connection has not been normalized yet.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
be034a1f3f device: simplify the nm_utils_complete_generic() machinery
The point is to get rid of device/connection type specific arguments, to
eventually be able to complete the connection on AddAndActivate before knowing
which factory is going to take care of creating the device.

Aside from that, the whole thing is pretty awful -- with complicated
macros and variadic argument (ugh). Let's get rid of that.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
6635aeed99 device: get_connection_parent() accept incomplete connections
All of these wrongly assert that a connection has a particular
setting. On AddAndActivate, the connection can be pretty much empty:

  impl_manager_add_and_activate_connection ()
    validate_activation_request ()
      nm_manager_get_best_device_for_connection ()
      iface = nm_manager_get_connection_iface ()
        find_parent_device_for_connection ()
          nm_device_factory_get_connection_parent () <====== *shriek*
        nm_device_factory_get_connection_iface ()
      find_device_by_iface (iface)
    nm_device_complete_connection ()

Remove those assertions.
2025-01-20 06:13:58 +01:00
Lubomir Rintel
b7a8486c53 device: cleanup get_connection_iface() callbacks
Some of them are wrong: they assert a connection has a particular
setting even though this can be called on AddAndActivate against a
connection that is not complete or normalized:

  impl_manager_add_and_activate_connection ()
    validate_activation_request ()
      nm_manager_get_best_device_for_connection ()
      iface = nm_manager_get_connection_iface ()
        find_parent_device_for_connection ()
          nm_device_factory_get_connection_parent ()
        nm_device_factory_get_connection_iface () <====== here
      find_device_by_iface (iface)
    nm_device_complete_connection ()

Fix those by removing the assertions.

Some of them also fall back to just calling
nm_connection_get_interface_name(), which is a pretty useless thing to do
because nm_device_factory_get_connection_iface() only calls the
device-specific routine if nm_device_factory_get_connection_iface()
doesn't return anything, to give the factory a chance to make up a name
(like <parent>.<vlan-id> for Vlan) on its own. Drop those.
2025-01-20 06:13:58 +01:00
Lubomir Rintel
e3d3f1315a device/factory: document that some callbacks get an incomplete connection
It's get_connection_parent() and get_connection_iface().
2025-01-20 06:13:58 +01:00
Jan Vaclav
4107a6883f platform/test: reenable xgress qos tests
Fixes: 6e30e37ebe ('test: disable vlan_xgress unit test')
2025-01-16 11:08:44 +00:00
Beniamino Galvani
bf3ecd9031 l3cfg: fix DNS routes
The current approach is flawed. During a commit of the L3
configuration we do a RTM_GETROUTE to find the next-hop to the DNS
server on the current interface, in order to create the DNS route to
inject into the l3cd. However, we haven't added routes to kernel yet
and so the result of the RTM_GETROUTE is going to be wrong.

In some cases, for example when IPv4 DAD is enabled, the bug can't be
easily noticed because we perform multiple commits for the interface,
and the regular routes are already set in kernel from the 2nd commit
on.

To fix the problem, do the following: during a commit we first add
addresses and routes to platform. Then, we create a list of DNS routes
to configure, we collect the old DNS routes, and do a comparison. If
they changed, we need to add the DNS routes to platform in a 2nd step.

Note that in the previous approach we tracked the routes in the
committed-l3cd object of the l3cfg, and so they were applied to kernel
automatically. Because of the 2-step requirement, that no longer works
and we must apply the DNS routes manually.

Fixes: 5449b18a94 ('core: support automatically adding DNS routes')
2025-01-14 23:31:59 +01:00
Beniamino Galvani
aefc7732f0 l3cfg: add the DNS routing rule only when needed
Don't try to add the routing rule that points to the table containing
DNS routes at every commit.

Instead, look into the platform cache to see if the rule already
exists and add it only when needed.
2025-01-14 23:31:59 +01:00
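
The "look in the cache first, add only when missing" approach from the commit above can be sketched like this. struct rule, struct rule_cache and ensure_rule() are hypothetical and only model the idea; they are not the platform cache API.

```c
#include <stdbool.h>
#include <stddef.h>

#define RULE_CACHE_MAX 16

/* Hypothetical, simplified routing rule. */
struct rule {
    int      family;
    unsigned table;
    unsigned priority;
};

/* Hypothetical stand-in for the cached view of kernel state. */
struct rule_cache {
    struct rule rules[RULE_CACHE_MAX];
    size_t      len;
};

static bool
rule_cache_contains(const struct rule_cache *c, const struct rule *r)
{
    size_t i;

    for (i = 0; i < c->len; i++) {
        if (c->rules[i].family == r->family && c->rules[i].table == r->table
            && c->rules[i].priority == r->priority)
            return true;
    }
    return false;
}

/* Returns true only when the rule actually had to be added. */
static bool
ensure_rule(struct rule_cache *c, const struct rule *r)
{
    if (rule_cache_contains(c, r))
        return false; /* already present: skip the redundant add */
    if (c->len == RULE_CACHE_MAX)
        return false; /* sketch only: real code would report an error */
    c->rules[c->len++] = *r;
    return true;
}
```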
Wen Liang
96ff17fd48 dhcp-client4: do not send release message when there is no lease
The daemon crashes when NM tries to send the release message when there
is no lease yet and the UDP socket is still in the PACKET state, which
causes an assertion failure as the result.

Add the condition to guarantee that n-dhcp4 only sends the release
message when there is a lease.

Resolves: https://issues.redhat.com/browse/RHEL-69132

Stack trace of the crash:
0  0x00007f5ac248bacc __pthread_kill_implementation (libc.so.6 + 0x8bacc)
1  0x00007f5ac243e686 __GI_raise (libc.so.6 + 0x3e686)
2  0x00007f5ac2428833 __GI_abort (libc.so.6 + 0x28833)
3  0x00007f5ac242875b __assert_fail_base (libc.so.6 + 0x2875b)
4  0x00007f5ac24373c6 __assert_fail (libc.so.6 + 0x373c6)
5  0x00005607ec7f194a n_dhcp4_c_connection_udp_send (NetworkManager + 0x8594a)
6  0x00005607eca228cc n_dhcp4_c_connection_start_request (NetworkManager + 0x2b68cc)
7  0x00005607eca14b31 nm_dhcp_client_stop (NetworkManager + 0x2a8b31)
8  0x00005607eca8a4ce _dev_ipdhcpx_cleanup (NetworkManager + 0x31e4ce)
9  0x00005607ecac144d _cleanup_ip_pre (NetworkManager + 0x35544d)
10 0x00005607ecac3f04 _cleanup_generic_pre (NetworkManager + 0x357f04)
11 0x00005607ecad5006 nm_device_cleanup (NetworkManager + 0x369006)
12 0x00005607ecac5230 _set_state_full (NetworkManager + 0x359230)
13 0x00005607ecac8c4a nm_device_state_changed (NetworkManager + 0x35cc4a)
14 0x00007f5ac2daa47b g_idle_dispatch (libglib-2.0.so.0 + 0x5147b)
15 0x00007f5ac2dadf4f g_main_dispatch (libglib-2.0.so.0 + 0x54f4f)
16 0x00007f5ac2e03268 g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa268)
17 0x00007f5ac2dad5a3 g_main_loop_run (libglib-2.0.so.0 + 0x545a3)
18 0x00005607ec7c3eed main (NetworkManager + 0x57eed)
19 0x00007f5ac24295d0 __libc_start_call_main (libc.so.6 + 0x295d0)
20 0x00007f5ac2429680 __libc_start_main_impl (libc.so.6 + 0x29680)
21 0x00005607ec7c43f5 _start (NetworkManager + 0x583f5)
2025-01-14 10:58:36 -05:00
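
The guard added by the commit above amounts to "only release what you hold". A minimal sketch, in which struct dhcp_client and client_stop() are hypothetical and a counter stands in for actually sending the message:

```c
#include <stdbool.h>

/* Hypothetical, minimal client state. */
struct dhcp_client {
    bool     has_lease;
    unsigned releases_sent; /* stands in for actually sending a RELEASE */
};

/* Stop the client; a release is only meaningful if a lease exists. */
static void
client_stop(struct dhcp_client *c, bool send_release)
{
    if (send_release && c->has_lease)
        c->releases_sent++;
    c->has_lease = false;
}
```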