Commit graph

467 commits

Author SHA1 Message Date
Matthieu Baerts (NGI0)
2b03057de0 mptcp: add 'laminar' endpoint support
This new endpoint type has been recently added to the kernel in v6.18
[1]. It will be used to create new subflows from the associated address
to additional addresses announced by the other peer. This will be done
if allowed by the MPTCP limits, and if the associated address is not
already being used by another subflow from the same MPTCP connection.

Note that the fullmesh flag takes precedence over the laminar one.
Without any of these two flags, the path-manager will create new
subflows to additional addresses announced by the other peer by
selecting the source address from the routing tables, which is harder to
configure if the announced address is not known in advance.

The support of the new flag is easy: simply by declaring a new flag for
NM, and adding it in the related helpers and existing checks looking at
the different MPTCP endpoint. The documentation now references the new
endpoint type.

Note that only the new 'define' has been added in the Linux header file:
this file has changed a bit since the last sync, now split in two files.
Only this new line is needed, so the minimum has been modified here.

Link: https://git.kernel.org/torvalds/c/539f6b9de39e [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
2025-11-19 12:54:09 +00:00
Jan Vaclav
17efec8b06 platform: configure HSR interlink from property
Uses the `hsr.interlink` property defined in the previous
commit to configure the property in the kernel.
2025-10-08 22:52:03 +02:00
Jan Vaclav
0b99629278 platform: configure HSR protocol version from property
Uses the `hsr.protocol-version` property defined in the previous
commit to configure the property in the kernel.
2025-09-30 14:28:49 +02:00
Beniamino Galvani
394f6281ea platform: fix GError free function
Fixes: dd7810c473 ('platform: destroy VFs before changing the eswitch mode')
2025-07-23 11:11:59 +02:00
Beniamino Galvani
b45d5f41dd platform: fix harmless typo
The function should modify the "ip6_address" member of the union. In
practice, it doesn't matter because the ifindex is the first member of
both "ip4_address" and "ip6_address".
2025-07-10 09:12:05 +02:00
Beniamino Galvani
7d23ed9f73 platform: rename nm_linux_platform_get_link_fdb_table()
Rename nm_linux_platform_get_link_fdb_table() to
nm_linux_platform_get_bridge_fdb(). The new name better indicates that
the function returns the bridge FDB entries.
2025-07-10 09:12:02 +02:00
Beniamino Galvani
45ab9d96f1 platform: use g_strdup() instead of strdup() in ethtool code
The string is freed with g_free(), it needs to be allocated with
g_strdup(). In practice, the GLib allocator uses malloc() nowadays,
but it is better to be consistent.
2025-07-10 09:12:00 +02:00
Beniamino Galvani
00257a9cf7 platform: parse the RT_VIA route attribute
Parse the "via" attribute in netlink routes received by kernel, so
that we can update the internal cache.
2025-06-26 11:37:16 +02:00
Beniamino Galvani
9c70a43775 platform: use the "via" attribute in route NMPObject methods
Update the cmd_obj_hash_update(), cmd_obj_cmp(), cmd_obj_to_string()
NMPObject methods for IPv4 routes to consider the "via" attribute.
2025-06-26 11:37:16 +02:00
Mary Strodl
2ffaebd4ae platform: support the RT_VIA attribute for IPv4 routes
The RT_VIA attribute is used to specify a gateway of a different
address family. It is currently used only for IPv4 routes.

[bgalvani@redhat.com: amended the commit message]
2025-06-26 11:37:15 +02:00
Íñigo Huguet
bcbe386823 all: code format 2025-05-13 11:43:33 +02:00
Beniamino Galvani
250475c0fd platform: replace ring ethtool ioctl calls with netlink 2025-04-17 08:10:53 +02:00
Beniamino Galvani
3580dfe517 platform: replace EEE ethtool ioctl calls with netlink 2025-04-17 08:10:51 +02:00
Beniamino Galvani
62c841afcf platform: use the new ethtool-netlink API for pause settings 2025-04-17 08:10:51 +02:00
Beniamino Galvani
79ba228c59 platform: add ethtool netlink implementation
Introduce some basic infrastructure to perform ethtool operations via
netlink. As a proof of concept, implement the pause settings.

Netlink has some advantages over ioctl():

 - it can be easily extended with new attributes;

 - it can return descriptive error messages via the extended ack
   mechanism. For example, when setting the ring parameters to a value
   outside the allowed range, userspace receives error code -EINVAL
   and message "requested ring size exceeds maximum". ioctl() gets
   only -EINVAL, which is shared among many error reasons;

 - since it's possible to specify an ifindex in the request, there are
   no race conditions when the interface name changes;

New ethtool API is available only via netlink; however it makes sense
to start using netlink also for the old API that NM is already using
(pause, eee, rings, etc.) over ioctl() because of the advantages
described above.
2025-04-17 08:10:50 +02:00
Beniamino Galvani
e8a3cd611e platform: move ethtool ioctl functions to a separate file
We're going to replace most of the ioctl-based ethtool functions with
a netlink-based equivalent. Move the ioctl ones to a separate file so
that it's easier to see what still needs to be converted. Also add a
common prefix to the function names.
2025-04-17 08:10:49 +02:00
Beniamino Galvani
37785a57e0 platform: use consistent naming for ethtool functions
For unknown reasons (wrong copy and paste?) the getter functions had a
"link" in the name. Remove it.
2025-04-17 08:10:48 +02:00
Beniamino Galvani
6478e5158a platform: always set the lock flag for RTO_MIN
The rto-min value is ignored by kernel unless the lock flag is set.
2025-04-14 16:41:39 +02:00
Beniamino Galvani
2b922a93a5 platform: accept 0 as valid rto_min value
iproute2 and the kernel accept 0 as valid rto_min value:

  # ip route add 172.16.0.1 dev enp1s0 rto_min 0ms
  # ip route show
  172.16.0.1 dev enp1s0 scope link rto_min lock 0ms

Even if a value of 0ms would not be useful in practice, it is better
to exactly track what kernel reports, instead of assuming that when
the value is zero it is "not set".
2025-04-14 16:41:39 +02:00
Herman Semenov
3aa6e689ec libnm-platform: fix not set MACVTAP when cache ops added or updated
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2151
2025-03-31 14:58:45 +02:00
Michael Biebl
898db303c3 typo fix: allow to -> allow one to
Detected by lintian:

Example:
I: network-manager: typo-in-manual-page "allow to" "allow one to" [usr/share/man/man5/NetworkManager.conf.5.gz:1392]
2025-03-26 19:22:56 +01:00
Michael Biebl
10e58f7c3c typo fix: allows to -> allows one to
Detected by lintian:

Example:
I: network-manager: typo-in-manual-page "allows to" "allows one to" [usr/share/man/man5/NetworkManager.conf.5.gz:1266]
2025-03-26 19:22:01 +01:00
Jan Vaclav
84bcc0eab9 platform/vlan: fix incorrect type for ingress/egress qos mappings
The kernel was updated to add stricter validation to netlink messages,
which revealed this bug:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6c21660fe221a15c789dee2bc2fd95516bc5aeaf

Fixes: a5ea141956 ('platform/vlan: add support for ingress/egress-qos-mappings and changing flags')
2025-01-16 11:08:44 +00:00
Fernando Fernandez Mancera
00f47efcb2 linux-platform: add helper function to query FDB table
The function introduced queries the FDB table via netlink socket. It
accepts a list of ifindexes to filter out the FDB content not related to
it. It returns an array of MAC addresses.

To cltarify this function is unusually exposed directly on
nm-linux-platform.h as we don't want this be part of the whole
NMPlatform object or cache. This, is an exception to the rule to
simplify the integration of this functionality on NetworkManager.

In addition, it also doesn't use the async mechanism that is widely used
on netlink communication across nm-linux-platform. Again, the reason is
to simplify its use, as async communication won't provide a benefit to
the use cases we have planned for this, i.e balance-slb RARP announcing.
2024-12-18 14:45:54 +01:00
Íñigo Huguet
e330eb9c4a l3cfg: remove routes added by NM on reapply
By default, on reapply we were only syncing the main routes table. This
causes that routes added by NM to other tables are not removed on
reapply. This was done to preserve routes added externally, but routes
added by NM itself should be removed.

Add a new route table syncing mode "main + NM routes". This mode
maintains the normal behaviour of syncing completely the main table,
and for other tables removes only routes that were added by us, leaving
the rest untouched. Use this mode by default, as this is what a user
would expect on reapply.

Note: this might not work if NM is restarted between the profile being
modified and the reapply, because NM forgets what routes were added by
itself because of the restart. This is a rare corner case, though.

Use the D-Bus property "VersionInfo" to expose a capability flag
indicating that this bug is fixed. It is the first capability that we
expose in this way. However, it is convenient to do it this way as it's
something that clients like nmstate needs to know, so they can decide
whether a conn down is needed or not. It is not enough to decide that by
version number because it might be fixed via a downstream patch in distros
like RHEL.

https://issues.redhat.com/browse/RHEL-67324
https://issues.redhat.com/browse/RHEL-66262

Fixes: e9c17fcc9b ('l3cfg: default to 'main' route table sync mode')
2024-12-11 15:52:09 +00:00
Íñigo Huguet
e1840ad5fb platform: rename NM_IP_ROUTE_TABLE_SYNC_MODE_FULL -> ALL_EXCEPT_LOCAL
The difference between FULL and ALL was not obvious without reading the
documentation. Moreover, a new mode is going to be introduced so the
confusion could grow. Rename to a more explicit name.
2024-12-11 15:52:09 +00:00
Beniamino Galvani
eb620e0e7e platform: fix to_string() functions for IPv6 tunnels
We can hit an assertion at trace log level when printing IPv6 tunnel
links, because the buffer for the local and remote addresses is not
big enough. Increase the buffer size.

Fixes: 32f6e1ef2e ('platform: add IP6TNL links support')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2063
2024-11-14 09:57:26 +01:00
Gris Ge
19bed3121f ethtool: support Forward Error Correction(fec)
Introducing support of ethtool FEC mode:

D-BUS API: `fec-mode: uint32_t`.
Keyfile:

```
[ethtool]
fec-mode=<uint32_t>
```

nmcli: `ethtool.fec-mode` allowing values are any combination of:
 * auto
 * off
 * rs
 * baser
 * llrs

Unit test cases included.

Resolves: https://issues.redhat.com/browse/RHEL-24055

Signed-off-by: Gris Ge <fge@redhat.com>
2024-11-07 17:38:04 +08:00
Beniamino Galvani
45535cbf9f platform: support specifying the fwmark in ip_route_get()
Add an optional argument to specify the fwmark, which will be used in
the next commits to return results that match a specific rule.
2024-10-23 15:06:59 +02:00
Beniamino Galvani
bb6881f88c format: run nm-code-format
Reformat with:

  clang-format version 19.1.0 (Fedora 19.1.0-1.fc41)

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2046
2024-10-04 11:07:35 +02:00
Fernando Fernandez Mancera
d238ff487b ipvlan: add support to IPVLAN interface
This patch add support to IPVLAN interface. IPVLAN is a driver for a
virtual network device that can be used in container environment to
access the host network. IPVLAN exposes a single MAC address to the
external network regardless the number of IPVLAN device created inside
the host network. This means that a user can have multiple IPVLAN
devices in multiple containers and the corresponding switch reads a
single MAC address. IPVLAN driver is useful when the local switch
imposes constraints on the total number of MAC addresses that it can
manage.
2024-09-18 13:19:42 +02:00
Dominique Martinet
beaf4f8db3 l3cfg/ipv4ll: add new nm_platform_ip4_address_is_link_local() helper
Move the static _ip4_address_is_link_local() check to a new global
nm_platform_ip4_address_is_link_local() helper so we can check if
an IPv4 is link local in other files
2024-09-02 08:16:18 +00:00
Íñigo Huguet
50c8e6e6b5 format: run nm-code-format 2024-08-21 08:13:08 +02:00
Íñigo Huguet
830dd4ad9c platform: add small backoff time before resync
If the socket's RX buffer is full it's probably because other
process is doing lot of changes very quickly, faster than we
can process them. Let's give the writer a small time to finish:
1. Avoid contending the kernel's RTNL lock, so we don't make
   the whole situation even worse and it can finish earlier.
2. Avoid having to resync again and again due to trying to
   resync while the writer is still doing quick changes, so
   we are unable to catch up yet.

This won't help if this situation takes a long time or is
continuous, but that's unlikely to happen, and if it does,
it's the writer's fault for starving the whole system.

There is no need to progresively increase the backoff time
for the same reason: if this situation takes lot of time,
it's the writer's fault. It's neither a good idea because the whole NM
process will end being sleeping long times, not doing anything at all,
without being able to react when the Netlink messages burst stops.
2024-08-21 07:32:22 +02:00
Beniamino Galvani
1c43fe5235 platform: add nmp_utils_bridge_normalized_vlans_equal()
Add a function to compare two arrays of NMPlatformBridgeVlan. It will
be used in the next commit to compare the VLANs from platform to the
ones we want to set.

To compare in a performant way, the vlans need to be normalized (no
duplicated VLANS, ranges into their minimal expression...). Add the
function nmp_utils_bridge_vlan_normalize.

Co-authored-by: Íñigo Huguet <ihuguet@redhat.com>
2024-08-21 07:29:39 +02:00
Beniamino Galvani
7ae4660a77 platform: support reading bridge VLANs
Add a function to read the list of bridge VLANs on an interface.
2024-08-21 07:29:38 +02:00
Beniamino Galvani
e00c81b153 bridge: change the signature for nm_platform_link_set_bridge_vlans()
Currently, nm_platform_link_set_bridge_vlans() accepts an array of
pointers to vlan objects; to avoid multiple allocations,
setting_vlans_to_platform() creates the array by piggybacking the
actual data after the pointers array.

In the next commits, the array will need to be manipulated and
extended, which is difficult with the current structure. Instead, pass
separately an array of objects and its size.
2024-08-21 07:29:36 +02:00
Beniamino Galvani
7d3bfb101f platform: add define for IFLA_BOND_SLAVE_PRIO
The enum value was added in kernel 5.19; add a define for it so that
the compilation doesn't fail with earlier kernels.

Fixes: 79221f79a2 ('src: drop most slave references from the code')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1596
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2012
2024-08-20 13:29:48 +02:00
Fernando Fernandez Mancera
79221f79a2 src: drop most slave references from the code
While we cannot remove all the references to "slave" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Fernando Fernandez Mancera
090d617017 src: drop most master references from the code
While we cannot remove all the references to "master" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Íñigo Huguet
d219277f1a platform: log routes dump failure as error 2024-08-06 10:15:56 +00:00
Beniamino Galvani
7efab8baeb platform: add a retry mechanism in case route dump fails
In case the platform fails dumping a specific route protocol, retry
multiple times. If all attempts fail, emit a warning and proceed as
there is nothing more to do.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
b2635d3461 platform: assert that we only generate route message of tracked proto 2024-08-06 10:15:56 +00:00
Beniamino Galvani
f6411ed941 platform: dump only selected route protocols
When doing a dump of routes, we want to exclude routes having
protocols we do not care about. Since the netlink socket has
STRICT_CHK enabled, we can request multiple dumps for the protocols we
need.

While doing 6 dumps is less efficient than doing 1, it normally
doesn't matter. However, the new implementation is more efficient when
there are e.g. millions of BGP routes that can be excluded from the
results.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
c0ac920f9c platform: introduce array of tracked protocols
Introduce an array of tracked route protocols that will be used in the
next commit. To have the list of protocols defined in a single place,
define a macro.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
185932a1a2 platform: enable strict check on netlink socket dumps
In the future we might want to specify filters when requesting netlink
dumps; this requires that strict check is enabled on the socket.

When enabling strict check, we need to pass a full struct in the
netlink message, otherwise kernel ignores it.

This commit doesn't change behavior.
2024-06-26 09:52:50 +02:00
Beniamino Galvani
2b8d8fe92a platform: don't set RTM_F_LOOKUP_TABLE for IPv6
RTM_F_LOOKUP_TABLE is only needed for IPv4. IPv6 dumps with the flag
are rejected in strict mode.
2024-06-26 09:52:50 +02:00
Fernando Fernandez Mancera
a4bbdeaf54 src: fix code formatting to last clang version 2024-05-30 15:23:37 +02:00
Íñigo Huguet
4d426f581d platform: avoid routes resync for routes that we don't track
When we recibe a Netlink message with a "route change" event, normally
we just ignore it if it's a route that we don't track (i.e. because of
the route protocol).

However, it's not that easy if it has the NLM_F_REPLACE flag because
that means that it might be replacing another route. If the kernel has
similar routes which are candidates for the replacement, it's hard for
NM to guess which one of those is being replaced (as the kernel doesn't
have a "route ID" or similar field to indicate it). Moreover, the kernel
might choose to replace a route that we don't have on cache, so we know
nothing about it.

It is important to note that we cannot just discard Netlink messages of
routes that we don't track if they has the NLM_F_REPLACE. For example,
if we are tracking a route with proto=static, we might receive a replace
message, changing that route to proto=other_proto_that_we_dont_track. We
need to process that message and remove the route from our cache.

As NM doesn't know what route is being replaced, trying to guess will
lead to errors that will leave the cache in an inconsistent state.
Because of that, it just do a cache resync for the routes.

For IPv4 there was an optimization to this: if we don't have in the
cache any route candidate for the replacement there are only 2 possible
options: either add the new route to the cache or discard it if we are
not interested on it. We don't need a resync for that.

This commit is extending that optimization to IPv6 routes. There is no
reason why it shouldn't work in the same way than with IPv4. This
optimization will only work well as long as we find potential candidate
routes in the same way than the kernel (comparing the same fields). NM
calls to this "comparing by WEAK_ID". But this can also happen with IPv4
routes.

It is worth it to enable this optimization because there are routing
daemons using custom routing protocols that makes tens or hundreds of
updates per second. If they use NLM_F_REPLACE, this caused NM to do a
resync hundreds of times per second leading to a 100% CPU usage:
https://issues.redhat.com/browse/RHEL-26195

An additional but smaller optimization is done in this commit: if we
receive a route message for routes that we don't track AND doesn't have
the NLM_F_REPLACE flag, we can ignore the entire message, thus avoiding
the memory allocation of the nmp_object. That nmp_object was going to be
ignored later, anyway, so better to avoid these allocations that, with
the routing daemon of the above's example, can happen hundreds of times
per second.

With this changes, the CPU usage doing `ip route replace` 300 times/s
drops from 100% to 1%. Doing `ip route replace` as fast as possible,
without any rate limitting, still keeps NM with a 3% CPU usage in the
system that I have used to test.
2024-04-30 13:13:46 +02:00
Jan Vaclav
886146b5b1 platform/netlink: use nm_random_get_bytes() for initial seq value
Coverity warns when a time_t is cast to 32-bits -- however, we do not
need to use the time here at all, since it is only used as an initializing value
that is not expected to be a timestamp, and we can use random bytes instead.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1904
2024-04-17 08:30:46 +00:00