Commit graph

202 commits

Author SHA1 Message Date
Íñigo Huguet
bcbe386823 all: code format 2025-05-13 11:43:33 +02:00
Beniamino Galvani
250475c0fd platform: replace ring ethtool ioctl calls with netlink 2025-04-17 08:10:53 +02:00
Beniamino Galvani
3580dfe517 platform: replace EEE ethtool ioctl calls with netlink 2025-04-17 08:10:51 +02:00
Beniamino Galvani
62c841afcf platform: use the new ethtool-netlink API for pause settings 2025-04-17 08:10:51 +02:00
Beniamino Galvani
e8a3cd611e platform: move ethtool ioctl functions to a separate file
We're going to replace most of the ioctl-based ethtool functions with
a netlink-based equivalent. Move the ioctl ones to a separate file so
that it's easier to see what still needs to be converted. Also add a
common prefix to the function names.
2025-04-17 08:10:49 +02:00
Beniamino Galvani
6478e5158a platform: always set the lock flag for RTO_MIN
The rto-min value is ignored by kernel unless the lock flag is set.
2025-04-14 16:41:39 +02:00
Beniamino Galvani
2b922a93a5 platform: accept 0 as valid rto_min value
iproute2 and the kernel accept 0 as valid rto_min value:

  # ip route add 172.16.0.1 dev enp1s0 rto_min 0ms
  # ip route show
  172.16.0.1 dev enp1s0 scope link rto_min lock 0ms

Even if a value of 0ms would not be useful in practice, it is better
to exactly track what kernel reports, instead of assuming that when
the value is zero it is "not set".
2025-04-14 16:41:39 +02:00
Herman Semenov
3aa6e689ec libnm-platform: fix not set MACVTAP when cache ops added or updated
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2151
2025-03-31 14:58:45 +02:00
Jan Vaclav
84bcc0eab9 platform/vlan: fix incorrect type for ingress/egress qos mappings
The kernel was updated to add stricter validation to netlink messages,
which revealed this bug:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6c21660fe221a15c789dee2bc2fd95516bc5aeaf

Fixes: a5ea141956 ('platform/vlan: add support for ingress/egress-qos-mappings and changing flags')
2025-01-16 11:08:44 +00:00
Fernando Fernandez Mancera
00f47efcb2 linux-platform: add helper function to query FDB table
The function introduced queries the FDB table via netlink socket. It
accepts a list of ifindexes to filter out the FDB content not related to
it. It returns an array of MAC addresses.

To cltarify this function is unusually exposed directly on
nm-linux-platform.h as we don't want this be part of the whole
NMPlatform object or cache. This, is an exception to the rule to
simplify the integration of this functionality on NetworkManager.

In addition, it also doesn't use the async mechanism that is widely used
on netlink communication across nm-linux-platform. Again, the reason is
to simplify its use, as async communication won't provide a benefit to
the use cases we have planned for this, i.e balance-slb RARP announcing.
2024-12-18 14:45:54 +01:00
Beniamino Galvani
45535cbf9f platform: support specifying the fwmark in ip_route_get()
Add an optional argument to specify the fwmark, which will be used in
the next commits to return results that match a specific rule.
2024-10-23 15:06:59 +02:00
Beniamino Galvani
bb6881f88c format: run nm-code-format
Reformat with:

  clang-format version 19.1.0 (Fedora 19.1.0-1.fc41)

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2046
2024-10-04 11:07:35 +02:00
Fernando Fernandez Mancera
d238ff487b ipvlan: add support to IPVLAN interface
This patch add support to IPVLAN interface. IPVLAN is a driver for a
virtual network device that can be used in container environment to
access the host network. IPVLAN exposes a single MAC address to the
external network regardless the number of IPVLAN device created inside
the host network. This means that a user can have multiple IPVLAN
devices in multiple containers and the corresponding switch reads a
single MAC address. IPVLAN driver is useful when the local switch
imposes constraints on the total number of MAC addresses that it can
manage.
2024-09-18 13:19:42 +02:00
Íñigo Huguet
830dd4ad9c platform: add small backoff time before resync
If the socket's RX buffer is full it's probably because other
process is doing lot of changes very quickly, faster than we
can process them. Let's give the writer a small time to finish:
1. Avoid contending the kernel's RTNL lock, so we don't make
   the whole situation even worse and it can finish earlier.
2. Avoid having to resync again and again due to trying to
   resync while the writer is still doing quick changes, so
   we are unable to catch up yet.

This won't help if this situation takes a long time or is
continuous, but that's unlikely to happen, and if it does,
it's the writer's fault for starving the whole system.

There is no need to progresively increase the backoff time
for the same reason: if this situation takes lot of time,
it's the writer's fault. It's neither a good idea because the whole NM
process will end being sleeping long times, not doing anything at all,
without being able to react when the Netlink messages burst stops.
2024-08-21 07:32:22 +02:00
Beniamino Galvani
7ae4660a77 platform: support reading bridge VLANs
Add a function to read the list of bridge VLANs on an interface.
2024-08-21 07:29:38 +02:00
Beniamino Galvani
e00c81b153 bridge: change the signature for nm_platform_link_set_bridge_vlans()
Currently, nm_platform_link_set_bridge_vlans() accepts an array of
pointers to vlan objects; to avoid multiple allocations,
setting_vlans_to_platform() creates the array by piggybacking the
actual data after the pointers array.

In the next commits, the array will need to be manipulated and
extended, which is difficult with the current structure. Instead, pass
separately an array of objects and its size.
2024-08-21 07:29:36 +02:00
Beniamino Galvani
7d3bfb101f platform: add define for IFLA_BOND_SLAVE_PRIO
The enum value was added in kernel 5.19; add a define for it so that
the compilation doesn't fail with earlier kernels.

Fixes: 79221f79a2 ('src: drop most slave references from the code')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1596
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2012
2024-08-20 13:29:48 +02:00
Fernando Fernandez Mancera
79221f79a2 src: drop most slave references from the code
While we cannot remove all the references to "slave" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Fernando Fernandez Mancera
090d617017 src: drop most master references from the code
While we cannot remove all the references to "master" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Íñigo Huguet
d219277f1a platform: log routes dump failure as error 2024-08-06 10:15:56 +00:00
Beniamino Galvani
7efab8baeb platform: add a retry mechanism in case route dump fails
In case the platform fails dumping a specific route protocol, retry
multiple times. If all attempts fail, emit a warning and proceed as
there is nothing more to do.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
b2635d3461 platform: assert that we only generate route message of tracked proto 2024-08-06 10:15:56 +00:00
Beniamino Galvani
f6411ed941 platform: dump only selected route protocols
When doing a dump of routes, we want to exclude routes having
protocols we do not care about. Since the netlink socket has
STRICT_CHK enabled, we can request multiple dumps for the protocols we
need.

While doing 6 dumps is less efficient than doing 1, it normally
doesn't matter. However, the new implementation is more efficient when
there are e.g. millions of BGP routes that can be excluded from the
results.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
c0ac920f9c platform: introduce array of tracked protocols
Introduce an array of tracked route protocols that will be used in the
next commit. To have the list of protocols defined in a single place,
define a macro.
2024-08-06 10:15:56 +00:00
Beniamino Galvani
185932a1a2 platform: enable strict check on netlink socket dumps
In the future we might want to specify filters when requesting netlink
dumps; this requires that strict check is enabled on the socket.

When enabling strict check, we need to pass a full struct in the
netlink message, otherwise kernel ignores it.

This commit doesn't change behavior.
2024-06-26 09:52:50 +02:00
Beniamino Galvani
2b8d8fe92a platform: don't set RTM_F_LOOKUP_TABLE for IPv6
RTM_F_LOOKUP_TABLE is only needed for IPv4. IPv6 dumps with the flag
are rejected in strict mode.
2024-06-26 09:52:50 +02:00
Íñigo Huguet
4d426f581d platform: avoid routes resync for routes that we don't track
When we recibe a Netlink message with a "route change" event, normally
we just ignore it if it's a route that we don't track (i.e. because of
the route protocol).

However, it's not that easy if it has the NLM_F_REPLACE flag because
that means that it might be replacing another route. If the kernel has
similar routes which are candidates for the replacement, it's hard for
NM to guess which one of those is being replaced (as the kernel doesn't
have a "route ID" or similar field to indicate it). Moreover, the kernel
might choose to replace a route that we don't have on cache, so we know
nothing about it.

It is important to note that we cannot just discard Netlink messages of
routes that we don't track if they has the NLM_F_REPLACE. For example,
if we are tracking a route with proto=static, we might receive a replace
message, changing that route to proto=other_proto_that_we_dont_track. We
need to process that message and remove the route from our cache.

As NM doesn't know what route is being replaced, trying to guess will
lead to errors that will leave the cache in an inconsistent state.
Because of that, it just do a cache resync for the routes.

For IPv4 there was an optimization to this: if we don't have in the
cache any route candidate for the replacement there are only 2 possible
options: either add the new route to the cache or discard it if we are
not interested on it. We don't need a resync for that.

This commit is extending that optimization to IPv6 routes. There is no
reason why it shouldn't work in the same way than with IPv4. This
optimization will only work well as long as we find potential candidate
routes in the same way than the kernel (comparing the same fields). NM
calls to this "comparing by WEAK_ID". But this can also happen with IPv4
routes.

It is worth it to enable this optimization because there are routing
daemons using custom routing protocols that makes tens or hundreds of
updates per second. If they use NLM_F_REPLACE, this caused NM to do a
resync hundreds of times per second leading to a 100% CPU usage:
https://issues.redhat.com/browse/RHEL-26195

An additional but smaller optimization is done in this commit: if we
receive a route message for routes that we don't track AND doesn't have
the NLM_F_REPLACE flag, we can ignore the entire message, thus avoiding
the memory allocation of the nmp_object. That nmp_object was going to be
ignored later, anyway, so better to avoid these allocations that, with
the routing daemon of the above's example, can happen hundreds of times
per second.

With this changes, the CPU usage doing `ip route replace` 300 times/s
drops from 100% to 1%. Doing `ip route replace` as fast as possible,
without any rate limitting, still keeps NM with a 3% CPU usage in the
system that I have used to test.
2024-04-30 13:13:46 +02:00
Íñigo Huguet
dda94d6f66 sriov: don't fail if sriov_totalvfs sysfs file is missing
If sriov_totalvfs file doesn't exist we don't need to consider it a
fatal failure. Try to create the required number of VFs as we were doing
before.

Note: at least netdevsim doesn't have sriov_totalvfs file, I don't know
if there are real drivers that neither has it.

(cherry picked from commit 27eaf34fcf)
2024-02-21 11:27:34 +01:00
Íñigo Huguet
f0133e1a5e sriov: set the devlink's eswitch inline-mode and encap-mode
Set these parameters according to the values set in the new properties
sriov.eswitch-inline-mode and sriov.eswitch-encap-mode.

The number of parameters related to SR-IOV was becoming too big.
Refactor to group them in a NMPlatformSriovParams struct and pass it
around.

(cherry picked from commit 4669f01eb0)
2024-02-21 11:27:32 +01:00
Íñigo Huguet
03aaff8fc2 devlink: get and set eswitch inline-mode and encap-mode
The setter function allow to set to "preserve" to modify only some of
them.

(cherry picked from commit bf654ef39e)
2024-02-21 11:27:31 +01:00
Íñigo Huguet
dd7810c473 platform: destroy VFs before changing the eswitch mode
It is not safe to change the eswitch mode when there are VFs already
created: it often fails, or even worse, doesn't fail immediatelly but
there are later problems with the VFs.

What is supposed to be well tested in all drivers is to change the
eswitch mode with no VFs created, and then create the VFs, so let's set
num_vfs=0 before changing the eswitch mode.

As we want to change num_vfs asynchronously in a separate thread, we
need to do a multi-step process with callbacks each time that a step
finish (before it was just set num_vfs asynchronously and invoke the
callback when it's done).

This makes link_set_sriov_params_async to become even larger and more
complex than it already was. Refactor it to make it cleaner and easier
to follow, and hopefully less error prone, and implement that multi-step
process.

(cherry picked from commit 770340627b)
2024-02-21 11:27:29 +01:00
Íñigo Huguet
1ba2b77402 sriov: set the devlink's eswitch mode
Use the new property sriov.eswitch-mode to select between legacy SRIOV
and switchdev mode.

(cherry picked from commit 837549ea94)
2024-02-21 11:27:29 +01:00
Gris Ge
f990f9b4e4 bridge: skip VLAN filtering resetting in reapply if no vlan change changed
When doing reapply on linux bridge interface, NetworkManager will reset
the VLAN filtering and default PVID which cause PVID been readded to all
bridge ports regardless they are managed by NetworkManager.

This is because Linux kernel will re-add PVID to bridge port upon the
changes of bridge default-pvid value.

To fix the issue, this patch introduce netlink parsing code for
`vlan_filtering` and `default_pvid` of NMPlatformLnkBridge, and use that
to compare desired VLAN filtering settings, skip the reset of VLAN
filter if `default_pvid` and `vlan_filtering` are unchanged.

Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit 02c34d538c)
2024-02-09 10:03:39 +01:00
Fernando Fernandez Mancera
0e893593a9 hsr: drop supervision-address from HSR setting
The supervision address is read-only. It is constructed by kernel and
only the last byte can be modified by setting the multicast-spec as
documented indeed.

As 1.46 was not released yet, we still can drop the whole API for this
setting property. We are keeping the NMDeviceHsr property as it is a
nice to have for reading it.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1823

Fixes: 5426bdf4a1 ('HSR: add support to HSR/PRP interface')
2023-12-19 13:54:21 +01:00
Fernando Fernandez Mancera
5426bdf4a1 HSR: add support to HSR/PRP interface
This patch add support to HSR/PRP interface. Please notice that PRP
driver is represented as HSR too. They are different drivers but on
kernel they are integrated together.

HSR/PRP is a network protocol standard for Ethernet that provides
seamless failover against failure of any network component. It intends
to be transparent to the application. These protocols are useful for
applications that request high availability and short switchover time
e.g electrical substation or high power inverters.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1791
2023-12-05 08:05:56 +01:00
Thomas Haller
b4dd83975e
all: use NM_MIN() instead of MIN() 2023-11-15 09:32:20 +01:00
Thomas Haller
5acd30ca44
all: use NM_MIN_CONST()/NM_MAX_CONST() instead of MIN()/MAX()
glib's MIN()/MAX() will be replaced by NM_MIN()/NM_MAX().
There are however a few places where NM_MIN()/NM_MAX() cannot
be used.

Adjust those places to use NM_MIN_CONST()/NM_MAX_CONST() instead.
2023-11-15 09:32:19 +01:00
Javier Sánchez Parra
b38e8c053b platform: add netlink support for bridge port options
sysfs is deprecated and kernel will not add new bridge port options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
2023-10-09 12:25:45 +00:00
Emmanuel Grumbach
3476135911 platform: remove CSME related code
Remove all the code that was added for the CSME coexistence.
The Intel WiFi team can't commit on when, if at all, this feature will
be completely integrated and tested in the NetworkManager.
The preferred solution for now is the solution that involves the kernel
only.
Remove the code that was merged so far.
2023-09-25 11:46:24 +00:00
Thomas Haller
5ff1468717
all: ensure signendess for arguments of NM_{MIN,MAX,CLAMP}() macros matches 2023-08-07 09:24:36 +02:00
Thomas Haller
f727c233c4
platform: rename NMP_SYSCTL_PATHID_NETDIR() to have "_A" suffix
The macro uses g_alloca(). Using alloca() is potentially dangerous. For
example, it must never be used in an unbounded loop. This should be
immediately obvious from the name, so we don't accidentally use them
in the wrong context.

All other alloca() macros should have such a prefix already. And they
always have to be macros, because you couldn't use alloca() to return
memory from a function.
2023-06-26 15:15:49 +02:00
Thomas Haller
b8b74f4000
libnm-base: move nmp_utils_new_infiniband_name() to nm_net_devname_infiniband() 2023-05-30 08:52:17 +02:00
Fernando Fernandez Mancera
e200b16291 platform: add support to prio property in bond ports 2023-05-03 10:43:58 +02:00
Fernando Fernandez Mancera
bb435674b5 platform: add netlink support for bond port options
sysfs is deprecated and kernel will not add new bond port options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
2023-05-03 09:55:45 +02:00
Thomas Haller
5eb584f84b
platform: explicitly compare seq_result number against WAIT_FOR_NL_RESPONSE_RESULT_UNKNOWN
We have other places like

  nm_assert(!out_seq_result || *out_seq_result == WAIT_FOR_NL_RESPONSE_RESULT_UNKNOWN);

where we explicitly compare against WAIT_FOR_NL_RESPONSE_RESULT_UNKNOWN.
Do that here too.
2023-03-29 15:27:51 +02:00
Lubomir Rintel
da9745b961
platform: always retry when netlink drops messages
Netlink is capable of dropping not only outbout messages, but also the
requests. We should always try to recover from those.
2023-03-29 15:27:51 +02:00
Lubomir Rintel
0a549bfad2
platform: increase log level for some failures
These are not expected to happen. While probably harmless, we should notice
when they do.
2023-03-29 11:49:59 +02:00
Lubomir Rintel
090ff4ae95
platform: limit retry count on link change
This is a nice safeguard, also consistent with ip_route_get().
2023-03-29 11:49:59 +02:00
Lubomir Rintel
fee7832bde
platform: increase netlink resync retry count
With a small buffer (of 4K) and many links (100 ethernet adapters), I've
seen up to ~15 retries of link change until things settled.

Let's increase this. Still a »bulharská konštanta« but possibly safer and
more broadly useful (so we can cap the link change retry count too).
2023-03-29 11:49:58 +02:00
Lubomir Rintel
e45b27a937
platform: create a define for retry count when netlink drops data
We're going to use it elsewhere.
2023-03-29 11:49:58 +02:00