Commit graph

17575 commits

Author SHA1 Message Date
Lubomir Rintel
a6af0ad443 cloud-setup: avoid accidental cast into a smaller type
This resulted in what looked like more significant bits of GType pointer
sometimes falling off the cliff, presumably because of a cast to
NMDeviceType enum (that probably ends up actually being a char).

This was silent enough to not emit compiler warnings and only occurring
with some very rare situations (needs GCC with LTO and some of the
optimization flags used by Fedora 41).

Fixes: cf6af54ffa ('cloud-setup: allow VETH along with ETHERNET too')
Fixes: 6ff4b9e57c ('cloud-setup: create VLANs for multiple VNICs on OCI')
2025-01-28 08:25:16 +01:00
Lubomir Rintel
0b26733189 test-client: collect pexpect subprocess status
Always explicitly tear down pexpect instances and collect their
results. Assert on the results after orderly teardowns.

Track the current pexpect instance in test context so that it could be
still collected if the test blows up. That could provide more clue into
what went wrong in the test if it's due to a crash the testee.

Before:

  [1573928.02238] <debug> config device C0:00:00:00:00:10: creating vlan connection for VLAN 700 on C0:00:00:00:00:10...
  [1573928.02330] <debug> config device C0:00:00:00:00:10: connection "vlan2" (ac3c08f5-3e5c-38a3-a366-c16253de6db2) created
  ======================================================================
  ERROR: test_oci_vlans (__main__.TestNmCloudSetup.test_oci_vlans)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
  ...
      pexp.expect("some changes were applied for provider oci")
      ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ...
  pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.

After:

  [1573928.02238] <debug> config device C0:00:00:00:00:10: creating vlan connection for VLAN 700 on C0:00:00:00:00:10...
  [1573928.02330] <debug> config device C0:00:00:00:00:10: connection "vlan2" (ac3c08f5-3e5c-38a3-a366-c16253de6db2) created
  *** pexpect'd process killed by SIGABRT ***
  ======================================================================
  ERROR: test_oci_vlans (__main__.TestNmCloudSetup.test_oci_vlans)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
  ...
      pexp.expect("some changes were applied for provider oci")
      ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ...
  pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
2025-01-24 18:29:57 +01:00
Lubomir Rintel
a6402fed71 test-client: fix TestNmCloudSetup valgrind run
Allow running the following locally (for quick loval nm-c-s valgrind check),
without requiring $NM_TEST_CLIENT_NMCLI_PATH to be set.

  $ NM_TEST_CLIENT_CLOUD_SETUP_PATH=build/src/nm-cloud-setup/nm-cloud-setup \
      NMTST_USE_VALGRIND=1 python src/tests/client/test-client.py TestNmCloudSetup
2025-01-24 18:29:57 +01:00
Beniamino Galvani
98b124a661 dhcp: drop dhcpcanon support
Drop support for the "dhcpcanon" DHCP client. It's unmantained, as the
last code change was in 2018:

  https://github.com/juga0/dhcpcanon/commits

There is no need to first deprecate it because it was still marked as
"experimental" in NM. Also, it's not packaged by any recent distro, so
we can assume that nobody will miss it.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2112
2025-01-20 18:56:41 +01:00
Lubomir Rintel
79219553be cloud-setup: fix build
Fixes: 6ff4b9e57c ('cloud-setup: create VLANs for multiple VNICs on OCI')
2025-01-20 17:53:58 +01:00
Lubomir Rintel
cf6af54ffa cloud-setup: allow VETH along with ETHERNET too
Pairs of veth devides are used for CI testing in place of real
ethernets. Use GLib types instead of NM numbers, since it's possible to
match them in hierarchical manner, with NMDeviceVeth being a subclass of
NMDeviceEthernet.

Fixes: 6ff4b9e57c ('cloud-setup: create VLANs for multiple VNICs on OCI')
2025-01-20 14:24:11 +01:00
Lubomir Rintel
9b258faab4 client/test: add test for VLANs on OCI 2025-01-20 14:08:12 +01:00
Lubomir Rintel
6ff4b9e57c cloud-setup: create VLANs for multiple VNICs on OCI
The idea is to create a pair of VLAN and MACVLAN with AddAndActivate if
they are not present, and otherwise follow the ordinary (GetApplied &
Reapply) procedure if the devices are already present.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
55ed4f7f6d cloud-setup: skip connections unless given type mismatches
Wired and Ipv4 always there, rest varies by connection type (Wired,
Vlan, MacVlan).
2025-01-20 14:08:12 +01:00
Lubomir Rintel
daef3b7b3f cloud-setup: lookup device by MAC + type instead of just MAC
This will be useful for updating configuration of Vlans and MacVlans,
some of having same MAC addresses as devices of other type.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
04f0491a58 cloud-setup: make _device_get_hwaddr() work with all kinds of devices
We'll have Vlans and MacVlans soon, and those don't have permanent
addresses.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
04e426a7bf cloud-setup: s/_nmc_get_hwaddrs()/_nmc_get_ethernet_hwaddrs()/
Make it clear that this only works for Ethernet devices.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
c3861bc50a cloud-setup: add a sync wrapper around AddAndActivate
These will be used to create the software devices.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
87b23669fa client/test: add nm-c-s OCI test
Very basic.
2025-01-20 14:08:12 +01:00
Lubomir Rintel
ecafa051df client/test: add ability to log pexpect traffic 2025-01-20 14:08:12 +01:00
Lubomir Rintel
a0c14665a3 client/test: move run_post() into TestNmcli
It attempts to modify attributes clearly belong to TestNmcli such as
_skip_test_for_l10n_diff or call methods that are in unittest.TestCase:

  ======================================================================
  ERROR: test_002 (__main__.TestNmcli.test_002)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File ".../src/tests/client/test-client.py", line 1508, in f
      self.ctx.run_post()
      ~~~~~~~~~~~~~~~~~^^
    File ".../src/tests/client/test-client.py", line 1185, in run_post
      self.fail(
      ^^^^^^^^^
  AttributeError: 'NMTestContext' object has no attribute 'fail'

It has presumably been moved out of TestNmcli at some point, but that
seems to have been in error, as it's also pretty specific to the nmcli
test cases. Not useful for cloud-init tests that also utilize
NMTestContext. Move it back.
2025-01-20 14:08:12 +01:00
Íñigo Huguet
98f8224376 cloud-setup: oci: remove the max 2 phys NICs limit
Right now, on any baremetal only max. 2 physical NICs are available.
This might change in the future, so better to directly accept larger
nicIndex if we receive it. No behaviour change with this, just remove
an artificial limit.
2025-01-20 14:08:12 +01:00
Íñigo Huguet
cfd7dd86c9 cloud-setup: parse OCI metadata related to VLAN config
Baremetal instances in Oracle Cloud require special VLAN config. Parse
the metadata related to it.
2025-01-20 14:08:12 +01:00
Íñigo Huguet
dd5b4fcf24 nmcs: remove nmcs_provider_*_get_type forward declaration
There is no need to avoid including the full header, they are small
headers with some GLib type system stuff and no more. Just include them
where they are needed.
2025-01-20 06:18:45 +01:00
Íñigo Huguet
7fc22ef5de nmcs: oci: use macro to log warnings 2025-01-20 06:18:45 +01:00
Lubomir Rintel
eb635c23a7 manager: create virtual devices on AddAndActivate()
If the connection didn't exist in advance, there's no unrealized device,
and find_device_by_iface() is not going to get us one.

Call system_create_virtual_device() afrer nm_utils_complete_generic()
completes the connection for virtual devices. Make sure we do proper
cleanup if we happen to fail the activation later, so that de device
doesn't end up hanging there.
2025-01-20 06:18:45 +01:00
Lubomir Rintel
57e140d961 manager: split device creation off from validate_activation_request()
Make validate_activation_request() only do the validation -- split the
determination of the device into find_device_for_activation().

The point of this is to be able complete the connection and actually
create a virtual device after the validation.

I believe this is also somewhat easier to follow now that the procedure
does what its name says.
2025-01-20 06:15:54 +01:00
Lubomir Rintel
25871f1971 manager: reword some error messages
They've been a little too cryptic and unnecessarily long before.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
cfe6e730b3 device: don't log connection UUIDs on device creation
It's irrelevant, doesn't look good, and might possibly be not there
because the connection has not been normalized yet.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
be034a1f3f device: simplify the nm_utils_complete_generic() machinery
The point is to get rid of device/connection type specific arguments, to
eventually be able to complete the connection on AddAndActivate before knowing
which factory is going to take care of creating the device.

Aside from that, the whole thing is pretty awful -- with complicated
macros and variadic argument (ugh). Let's get rid of that.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
6635aeed99 device: get_connection_parent() accept incomplete connections
All of these are wrong asserting that a connection has a particular
setting. On AddAndActivate, the connection can be pretty much empty:

  impl_manager_add_and_activate_connection ()
    validate_activation_request ()
      nm_manager_get_best_device_for_connection ()
      iface = nm_manager_get_connection_iface ()
        find_parent_device_for_connection ()
          nm_device_factory_get_connection_parent () <====== *shriek*
        nm_device_factory_get_connection_iface ()
      find_device_by_iface (iface)
    nm_device_complete_connection ()

Remove those assertions.
2025-01-20 06:13:58 +01:00
Lubomir Rintel
b7a8486c53 device: cleanup get_connection_iface() callbacks
Some of them are wrong: they assert a connection has a particular
setting even though this can be called on AddAndActivate against a
connection that is not complete or normalized:

  impl_manager_add_and_activate_connection ()
    validate_activation_request ()
      nm_manager_get_best_device_for_connection ()
      iface = nm_manager_get_connection_iface ()
        find_parent_device_for_connection ()
          nm_device_factory_get_connection_parent ()
        nm_device_factory_get_connection_iface () <====== here
      find_device_by_iface (iface)
    nm_device_complete_connection ()

Fix those by removing the assertions.

Some of them are also fall back to just calling
nm_connection_get_interface_name() which is a pretty useless thing to do
because nm_device_factory_get_connection_iface() only calls the
device-specific routine if nm_device_factory_get_connection_iface()
doesn't return anything, to give the factory a chance to make up a name
(like <parent>.<vlan-id> for Vlan) on its own. Drop those.
2025-01-20 06:13:58 +01:00
Lubomir Rintel
e3d3f1315a device/factory: document that some callbacks get an incomplete connection
It's get_connection_parent() and get_connection_iface().
2025-01-20 06:13:58 +01:00
Jan Vaclav
4107a6883f platform/test: reenable xgress qos tests
Fixes: 6e30e37ebe ('test: disable vlan_xgress unit test')
2025-01-16 11:08:44 +00:00
Jan Vaclav
84bcc0eab9 platform/vlan: fix incorrect type for ingress/egress qos mappings
The kernel was updated to add stricter validation to netlink messages,
which revealed this bug:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6c21660fe221a15c789dee2bc2fd95516bc5aeaf

Fixes: a5ea141956 ('platform/vlan: add support for ingress/egress-qos-mappings and changing flags')
2025-01-16 11:08:44 +00:00
Beniamino Galvani
bf3ecd9031 l3cfg: fix DNS routes
The current approach is flawed. During a commit of the L3
configuration we do a RTM_GETROUTE to find the next-hop to the DNS
server on the current interface, in order to create the DNS route to
inject into the l3cd. However, we haven't added routes to kernel yet
and so the result of the RTM_GETROUTE is going to be wrong.

In some cases, for example when IPv4 DAD is enabled, the bug can't be
easily noticed because we perform multiple commits for the interface,
and the regular routes are already set in kernel from the 2nd commit
on.

To fix the problem, do the following: during a commit we first add
addresses and routes to platform. Then, we create a list of DNS routes
to configure, we collect the old DNS routes, and do a comparison. If
they changed, we need to add the DNS routes to platform in a 2nd step.

Note that in the previous approach we tracked the routes in the
committed-l3cd object of the l3cfg, and so they were applied to kernel
automatically. Because of the 2-step requirement, that no longer works
and we must apply the DNS routes manually.

Fixes: 5449b18a94 ('core: support automatically adding DNS routes')
2025-01-14 23:31:59 +01:00
Beniamino Galvani
aefc7732f0 l3cfg: add the DNS routing rule only when needed
Don't try to add the routing rule that points to the table containing
DNS routes at every commit.

Instead, look into the platform cache to see if the rule already
exists and add it only when needed.
2025-01-14 23:31:59 +01:00
Wen Liang
55710d3d7c n-dhcp4: send request directly to avoid unnecessary retransmission timeout
Using `n_dhcp4_c_connection_start_request()` will cause staying in
`connection->request`, as a result, it will cause the resending of
DHCPRELEASE and DHCPDECLINE message, thus, use
`n_dhcp4_c_connection_send_request()` directly instead to avoid
unnecessary retransmission timeout, as suggested by
f030927a54 (r1531834009).
2025-01-14 10:58:36 -05:00
Wen Liang
bccf031591 n-dhcp4: do not send the release message in wrong connection state
We should not send the DHCP release message when udp socket is still in
the PACKET state, this state is typically used during the discovery and
offer phases, where the client broadcasts DHCP packets like DHCPDISCOVER
and receives responses like DHCPOFFER. At this point, the client has no
lease because it has not yet completed the DHCP handshake.
2025-01-14 10:58:36 -05:00
Wen Liang
96ff17fd48 dhcp-client4: do not send release message when there is no lease
The daemon crashes when NM tries to send the release message when there
is no lease yet and the UDP socket is still in the PACKET state, which
causes an assertion failure as the result.

Add the condition to guarantee that n-dhcp4 only sends the release
message when there is a lease.

Resolves: https://issues.redhat.com/browse/RHEL-69132

Stack trace of the crash:
0  0x00007f5ac248bacc __pthread_kill_implementation (libc.so.6 + 0x8bacc)
1  0x00007f5ac243e686 __GI_raise (libc.so.6 + 0x3e686)
2  0x00007f5ac2428833 __GI_abort (libc.so.6 + 0x28833)
3  0x00007f5ac242875b __assert_fail_base (libc.so.6 + 0x2875b)
4  0x00007f5ac24373c6 __assert_fail (libc.so.6 + 0x373c6)
5  0x00005607ec7f194a n_dhcp4_c_connection_udp_send (NetworkManager + 0x8594a)
6  0x00005607eca228cc n_dhcp4_c_connection_start_request (NetworkManager + 0x2b68cc)
7  0x00005607eca14b31 nm_dhcp_client_stop (NetworkManager + 0x2a8b31)
8  0x00005607eca8a4ce _dev_ipdhcpx_cleanup (NetworkManager + 0x31e4ce)
9  0x00005607ecac144d _cleanup_ip_pre (NetworkManager + 0x35544d)
10 0x00005607ecac3f04 _cleanup_generic_pre (NetworkManager + 0x357f04)
11 0x00005607ecad5006 nm_device_cleanup (NetworkManager + 0x369006)
12 0x00005607ecac5230 _set_state_full (NetworkManager + 0x359230)
13 0x00005607ecac8c4a nm_device_state_changed (NetworkManager + 0x35cc4a)
14 0x00007f5ac2daa47b g_idle_dispatch (libglib-2.0.so.0 + 0x5147b)
15 0x00007f5ac2dadf4f g_main_dispatch (libglib-2.0.so.0 + 0x54f4f)
16 0x00007f5ac2e03268 g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa268)
17 0x00007f5ac2dad5a3 g_main_loop_run (libglib-2.0.so.0 + 0x545a3)
18 0x00005607ec7c3eed main (NetworkManager + 0x57eed)
19 0x00007f5ac24295d0 __libc_start_call_main (libc.so.6 + 0x295d0)
20 0x00007f5ac2429680 __libc_start_main_impl (libc.so.6 + 0x29680)
21 0x00005607ec7c43f5 _start (NetworkManager + 0x583f5)
2025-01-14 10:58:36 -05:00
Wen Liang
5993ee8a8a nm-dhcp-client: add argument controlling whether to get next or current lease
In the scenario for sending the release message, we need to guarantee
that NM only sends the release message when the client received a lease
from the server. However, there is some distinction between the
`l3cd_curr` and `l3cd_next` when ACD is pending, because `l3cd_curr` is
NULL but `l3cd_next` is not NULL when ACD is pending. Regardless of
whether ACD is pending or completed, these are all considered the client
have received the release from the server. Therefore, adapt the function
`nm_dhcp_client_get_lease()` to control whether to get next or current
lease.
2025-01-14 10:58:36 -05:00
Beniamino Galvani
8f569814ae libnm-core: remove old DNS parsing functions
They are no longer used.
2025-01-07 15:41:45 +01:00
Beniamino Galvani
dd09af121b dispatcher: fix serialization of DNS servers
In the "Action()" D-Bus method, the "nameservers" key used to contain
an array of binary addresses. If we change the key to contain
something else, there can be problems when the NM and the
NM-dispatcher versions mismatch (right after an upgrade or a
downgrade).

To avoid such problem, still send the old key in the old format, and
introduce a new key for the new format. The new format carries the
name servers as a string list, and can encode encrypted DNS servers.
2025-01-07 15:41:45 +01:00
Beniamino Galvani
4c7f1857df initrd-generator: add support for the "rd.net.dns" command line option
Introduce a new kernel command line option named "rd.net.dns" that can
be used to specify a global name server. It accepts name server in a
URI-like form, as for example:

  rd.net.dns=dns+tls://[fd01::1]:5353#mydomain.com
2025-01-07 15:41:44 +01:00
Beniamino Galvani
4422b14704 core, libnm: support per-connection DNS URIs
Accept name servers with a URI syntax in the ipv4.dns and ipv6.dns
properties; and accept them everywhere else in the core and libnm.
2025-01-07 15:41:44 +01:00
Beniamino Galvani
8416a58e26 core: accept DNS URIs in global configuration
Accept name servers specified with an URI syntax in the global
configuration. A plugin that doesn't support a specific scheme can
decide to ignore it and use only the servers it understands. At the
moment there is no plugin that supports DNS-over-TLS servers in the
global configuration.
2025-01-07 15:41:43 +01:00
Beniamino Galvani
4dee109b8d libnm-core: add new functions for DNS parsing
Introduce new functions to parse and normalize name servers. Their
name contains "dns_uri" because they also support a URI-like syntax
as: "dns+tls://192.0.2.0:553#example.org".
2025-01-07 15:41:43 +01:00
Beniamino Galvani
28668f8698 core: simplify nm_l3_config_data_add_nameserver_detail()
Remove unused "server_name" argument. It is still possible to pass the
server name, if needed, with the nm_l3_config_data_add_nameserver()
function. After this change, rename the function to
nm_l3_config_data_add_nameserver_addr(), since the function only
accepts an address.
2025-01-07 15:41:43 +01:00
Wen Liang
308e34a501 vpn: fix routing rules support in vpn conenctions
This commit introduces the ability to manage routing rules specifically
for VPN connections. These rules allow finer control over traffic
routing by enabling the specification of policy-based routing for
traffic over the VPN.

- Updated the connection backend to apply rules during VPN activation.
- Ensured proper cleanup of routing rules upon VPN deactivation.

This enhancement improves VPN usability in scenarios requiring advanced
routing configurations, such as split tunneling and traffic
prioritization.

Resolves: https://issues.redhat.com/browse/RHEL-70160
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2092
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1842
2025-01-07 08:57:11 -05:00
eaglegai
9c42177d09 mptcp: fix error handling rp_filter when kernel don't support mptcp
When the kernel don't support mptcp, NetworkManager should disable mptcp
and shouldn't change rp_filter from 1 to 2. However, when checking file
/proc/sys/net/mptcp/enabled, val v's type is defined to guint32, and
nm_platform_sysctl_get_int32 return -1, v becomes a very large number
and can't set mptcp_flags to NM_MPTCP_FLAGS_DISABLED.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1686
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2093

Fixes: c00873e08f ('mptcp: rework "connection.mptcp-flags" for enabling MPTCP')
2025-01-07 11:20:16 +01:00
Íñigo Huguet
fdf1a9fb1e libnm: use nm_strndup_a instead of strndupa
Alpine doesn't have strndupa.

Fixes: 38d1bcee3b ('ip: configurable address pool and lease time of DHCP server in shared mode')
2024-12-20 20:25:14 +01:00
Fernando Fernandez Mancera
3f2f922dd9 bonding: send ARP announcement on bonding-slb link/carrier down
When a bond in balance-slb is created, the ports are enabled or disabled
based on carrier and link state. If the link/carrier goes down, the port
becomes disabled and we must make sure the MAC tables of the switches
are updated properly so the traffic is redirected.

In order to solve this, we send a GARP or RARP broadcast packet on the
bond. This fix cover 3 different balance-slb scenarios.

Scenario 1: The bond in balance-slb mode has IPv4 address configured and
some ports connected. Here the bond is acting like active-backup as the
packets will always have as source MAC the address of the bond
interface. When a port goes down, NetworkManager will send a GARP
broadcast announcing the address configured on the bond with the MAC
address configured on the port.

Scenario 2: The bond in balance-slb mode is connected to a bridge and has
some ports connected. The bridge has IPv4 configured. When a port goes
down, NetworkManager will send a GARP broadcast announcing the address
configured on the bridge with the MAC address configured on the port.

Scenario 3: The bond in balance-slb mode is connected to a bridge and
has some ports connected. The bridge does not have IP configuration and
therefore everything is L2. When a port goes down, NetworkManager will
query the FDB table and filter the entries by the ones belonging to the
bridge and the bond ifindexes. Then, it will send a RARP broadcast
announcing every learned MAC address from FDB.

Fixes: e9268e3924 ('firewall: add mlag firewall utils for multi chassis link aggregation (MLAG) for bonding-slb')
2024-12-18 14:45:54 +01:00
Fernando Fernandez Mancera
00f47efcb2 linux-platform: add helper function to query FDB table
The function introduced queries the FDB table via netlink socket. It
accepts a list of ifindexes to filter out the FDB content not related to
it. It returns an array of MAC addresses.

To cltarify this function is unusually exposed directly on
nm-linux-platform.h as we don't want this be part of the whole
NMPlatform object or cache. This, is an exception to the rule to
simplify the integration of this functionality on NetworkManager.

In addition, it also doesn't use the async mechanism that is widely used
on netlink communication across nm-linux-platform. Again, the reason is
to simplify its use, as async communication won't provide a benefit to
the use cases we have planned for this, i.e balance-slb RARP announcing.
2024-12-18 14:45:54 +01:00
Fernando Fernandez Mancera
a63eec924c glib-aux: add nm_ether_addr_hash() helper
Add a hash generation helper for NMEtherAddr struct. This can be used
for HashTables containing pointers to NMEtherAddr structs.
2024-12-18 14:45:54 +01:00
Fernando Fernandez Mancera
69f3493670 l3cfg: add helper function to fetch all the IPv4 configured addresses
This function would be useful when performing operations related to the
IPv4 addresses configured on the l3cfg. E.g this function will be used
for getting the IPv4 to announce on a GARP on bonding-slb when one of
the ports failover.
2024-12-18 14:45:54 +01:00