It doesn't scale to export all addresses/routes on D-Bus as properties.
In particular not combined with PropertiesChanged signal. On a busy
system, this causes severe performance issues. It also doesn't seem very
useful. Routes and addresses are complex things (e.g. policy routing).
If you want to do anything serious, you must check netlink (or find
another way to get the information).
Note that NMPlatform already ignores routes of certain protocols
(ip_route_is_alive()). It also does not expose most route attributes,
making the output only useful for very limited cases (e.g. displaying to
the user for information).
Limit the number of exported entries to 100.
Try adding 100K routes one-by-one. Run a `nmcli monitor` instance.
Re-nice the nmcli process and/or keep the CPUs busy. Then start a script
that adds 100k routes. Observe. Glib's D-Bus worker thread receives the
messages and queues them for the main thread. The main thread is too
slow to process them, the memory consumption grows very quickly in Giga
bytes. Afterwards, the memory also is not returned to the operation
system, either because of fragmentation or because the libc allocator
does anyway not return heap memory.
It doesn't work to expose an unlimited number of objects on D-Bus. At
least not with an API, that sends the full list of all routes, whenever
a route changes. Nobody can use that feature either, because the only
use is a quick overview in `nmcli` output or a GUI. If you see 100+
routes there, that becomes unmanageable anyway. Instead use netlink if
you want to handle the full list of addresses/routes (or some other
API).
It doesn't scale. If you add 100k routes one-by-one, then upon each
change from platform, we will send the growing list of the routes on
D-Bus.
That is too expensive. Especially, if you imagine that the receiving end
is a NMClient instance. There is a D-Bus worker thread that queues the
received GVariant messages, while the main thread may not be able to
process them fast enough. In that case, the memory keeps growing very
fast and due to fragmentation it is not freed.
Instead, rate limit updates to 3 per second.
Note that the receive buffer of the netlink socket can fill up and we
loose messages. Therefore, already on the lowest level, we may miss
addresses/routes. Next, on top of NMPlatform, NMIPConfig listens to
NM_L3_CONFIG_NOTIFY_TYPE_PLATFORM_CHANGE_ON_IDLE events. Thereby it
further will miss intermediate state (e.g. a route that exists only for
a short moment).
Now adding another delay and rate limiting on top of that, does not make
that fundamentally different, we anyway didn't get all intermediate states
from netlink. We may miss addresses/routes that only exist for a short
amount of time. This makes "the problem" worse, but not fundamentally
new.
We can only get a (correct) settled state, after all events are
processed. And we never know, whether there isn't the next event
just waiting to be received.
Rate limiting is important to not overwhelm D-Bus clients. In reality,
none of the users really need this information, because it's also
incomplete. Users who really need to know addresses/routes should use
netlink or find another way (a way that scales and where they explicitly
request this information).
_nm_utils_dns_option_validate() allows specifying the address family,
and filters based on that. Note that all options are valid for IPv6,
but some are not valid for IPv4.
It's not obvious, that such filtering is only performed if
"option_descs" argument is provied. Otherwise, the "ipv6" argument is
ignored.
Regardless, it's also confusing to have a boolean "ipv6". When most
callers don't want a filtering based on the address family. They
actually don't want any filtering at all, as they don't pass an
"option_descs". At the same time passing a TRUE/FALSE "ipv6" is
redundant and ignored. It should be possible, to explicitly not select
an address family (as it's ignored anyway).
Instead, make the "gboolean ipv6" argument an "int addr_family".
Selecting AF_UNSPEC means clearly to accept any address family.
When connection down is explicitly called on the controller, the port
connection should also be deactivated with the reason user-requested,
otherwise any following connection update on the controller profile
will unblock the port connection and unnessarily make the port to
autoconnet again.
Fixes: 645a1bb0ef ('core: unblock autoconnect when master profile changes')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1568
Add a new "stable-ssid" mode that generates the MAC address based on the
Wi-Fi's SSID.
Note that this gives the same MAC address as setting
connection.stable-id="${NETWORK_SSID}"
wifi.cloned-mac-address="stable"
The difference is that changing the stable ID of a profile also affects
"ipv6.addr-gen-mode=stable-privacy" and other settings.
For Wi-Fi profiles, this will encode the SSID in the stable-id.
For other profiles, this encodes the connection UUID (but the SSID and
the UUID will always result in distinct stable IDs).
Also escape the SSID, so that the generated stable-id is always valid
UTF-8.
Comparing integers of different signedness gives often unexpected
results. Adjust usages of MIN()/MAX() to ensure that the arguments agree
in signedness.
nm_hash_siphash42() uses a randomized seed like nm_hash*(). In this case,
we want to always generate the same fake timestamp, based on the host-id.
In practice, it doesn't really matter, because this is only the fallback
path for something gone horribly wrong already.
Most profiles don't have "connection.permissions" set. Avoid resolving the
UID to the name via getpwuid() (in nm_auth_is_subject_in_acl()), until we
know that we require it.
I think GSource* is preferable, because it's more type-safe than the
guint numbers. Also, g_source_remove() only works with
g_main_context_default(), while g_source_detach() of a GSource pointer
works with any GMainContext (so it's more general, even if we in this
case only have sources attached to g_main_context_default()).
Handling GSource* pointers is possibly also faster, since it saves one
extra g_main_context_find_source_by_id() -- on the other hand, tracking
the pointer costs 8 bytes while tracking the guint only costs 4 bytes.
Whether it's faster is unproven, but possibly it is. In any case it's
not an argument against using pointers.
Anyway. Update another usage of source-ids to use GSource pointers.
This is also the pattern that "checkpatch.pl" suggests.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1774
Some Applications require to explicitly enable or disable EEE.
Therefore introduce EEE (Energy Efficient Ethernet) support with:
* ethtool.eee on/off
Unit test case included.
Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
When creating default connections automatically, we check if udev has
set the NM_AUTO_DEFAULT_LINK_LOCAL_ONLY variable, and if so, we create
the connection with method=link-local. It was checked only for ethernet
connection type, which works fine because it's the only device type that
we create connections automatically for.
However, link-local connections are not specific to Ethernet, and if we
add auto-default connections for more devices in the future we should
honor this configuration too. Do it from nm-device, but only if the
device class has defined a "new_default_connection" method so the
behaviour is identical as the current one, but will be used by future
implementors of this method too.
See-also: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1780https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1779
We honored "NM_AUTO_DEFAULT_LINK_LOCAL_ONLY" udev property, for when we
generate a "Wired connection 1" (aka the "auto-default connection").
Systemd now also honors and may set ID_NET_AUTO_LINK_LOCAL_ONLY for
a similar purpose. Honore that too.
The NM specific variable still is preferred, also because "NM_AUTO_DEFAULT_LINK_LOCAL_ONLY"
is about something very NetworkManager specific (controlling "Wired connection 1").
Maybe one day, we should drop "data/90-nm-thunderbolt.rules" and only
rely on what systemd provides. But not yet.
See-also: https://github.com/systemd/systemd/pull/29767https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1413
If a ovs interface has the cloned-mac-address property set, we pass
the desired MAC to ovsdb when creating the db entry, and openvswitch
will eventually assign it to the interface.
Note that usually the link will not have the desired MAC when it's
created. Therefore, currently we also change the MAC via netlink
before proceeding with IP configuration. This is important to make
sure that ARP announcements, DHCP client-id, etc. will use the correct
MAC address.
This doesn't work when using the "netdev" (userspace) datapath, as the
attempts to change the MAC of the tun interface via netlink fail,
leading to an activation failure.
To properly handle both cases in the same way, adopt a different
strategy: now we don't set the MAC address explicitly via netlink but
we only wait until ovs does that.
The code to determine if we are using the netdev datapath is logically
separated from the code to start IP configuration; move it to its own
function to make the code easier to follow.
The deactivation can happen while we are waiting for the ifindex, and
it can happen via two code paths, depending on the state. For a
regular deactivation, method deactivate_async() is called. Otherwise,
if the device goes directly to UNMANAGED or UNAVAILABLE, deactivate()
is called. We need to make sure that signal and source handlers are
disconnected, so that they are not called at the wrong time.
Fixes: 99a6c6eda6 ('ovs, dpdk: fix creating ovs-interface when the ovs-bridge is netdev')
If the device is being disconnected for a user request, at the moment
the active connection goes to state DEACTIVATED through the following
transitions, independently of the reason for the disconnection:
- state: DEACTIVATING, reason: UNKNOWN
- state: DEACTIVATED, reason: DEVICE_DISCONNECTED
For VPNs, a disconnection is always user-initiated, and the active
connection states emitted are:
- state: DEACTIVATING, reason: USER_DISCONNECTED
- state: DEACTIVATED, reason: USER_DISCONNECTED
This difference poses problems for clients that want to handle device
and VPNs in the same way, especially because WireGuard is implemented
as a device, but is logically a VPN.
Let NMActRequest translate the USER_REQUESTED device state reason to
USER_DISCONNECTED active connection state reason, in case of
disconnection.
This is an API change, but the previous behavior of reporting generic
uninformative reasons seems a bug. See for example
nmc_activation_get_effective_state(), which inspects the AC state
reason and in case it's generic (DEVICE_DISCONNECTED), it considers
the device state instead.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1405
NMActiveConnection implements method device_state_changed() that
re-emits device state changes as convenience for subclasses. Add the
reason for the state change to the handler, as it will be used in the
next commit.
This reverts commit 05c31da4d9.
In the linked commit the callback was made repeating on the assumption
that forward progress would result in the callback getting canceled in
cb_data_complete. However, this assumption does not hold since a timeout
callback does not guarantee completion (or error out) of a request.
curl tweaked some internals in v8.4.0 and started giving 0 timeouts, and
a repeating callback is firing back-to-back without making any progress
in doing so.
Revert the change and make the callback non-repeating again.
Fixes: 05c31da4d9 ('connectivity: don't cancel curl timerfunction from timeout')
nm_strv_find_first() is useful (and used) to find the first index (if
any). I can thus also used to check for membership.
However, we also have nm_strv_contains(), which seems better for
readability, when we check for membership. Use it.
When IPv6 is disabled in kernel but ipv6.method is set to auto, NetworkManager repeatedly attempts
IPv6 configuration internally, resulting in unnecessary warning messages being output infinitely.
platform-linux: do-add-ip6-address[2: fe80::5054:ff:fe7c:4293]: failure 95 (Operation not supported)
ipv6ll[e898db403d9b5099,ifindex=2]: changed: no IPv6 link local address to retry after Duplicate Address Detection failures (back off)
platform-linux: do-add-ip6-address[2: fe80::5054:ff:fe7c:4293]: failure 95 (Operation not supported)
ipv6ll[e898db403d9b5099,ifindex=2]: changed: no IPv6 link local address to retry after Duplicate Address Detection failures (back off)
platform-linux: do-add-ip6-address[2: fe80::5054:ff:fe7c:4293]: failure 95 (Operation not supported)
ipv6ll[e898db403d9b5099,ifindex=2]: changed: no IPv6 link local address to retry after Duplicate Address Detection failures (back off)
To prevent this issue, let's disable IPv6 in NetworkManager when it is disabled in the kernel.
In order to do it in activate_stage3_ip_config() only once during activation,
the firewall initialization needed to be moved earlier. Otherwise, the IPv6 disablement could occur
twice during activation because activate_stage3_ip_config() is also executed from subsequent of fw_change_zone().
ethtool "channels" parameters can be used to configure multiple queues
for a NIC, which helps to improve performances. Until now, users had
to use dispatcher scripts to change those parameters. Introduce native
support in NetworkManager by adding the following properties:
- ethtool.channels-rx
- ethtool.channels-tx
- ethtool.channels-other
- ethtool.channels-combined
`_ethtool_*_reset()` functions already check that the state is not
NULL, no need to check it before. The only exception was for "feature"
settings, where the check was missing.
If the client-id has been set to "none", the DHCP client-id option
(option 61) mustn't be sent. Honor this when the dhclient plugin is
used.
If dhclient has been called with the -i option (Use a DUID with DHCPv4
clients), it will send a Client-ID even without setting one in dhclient.conf.
In this case, this option needs to be explicitly overwritten with:
send dhcp-client-identifier = "";
At least in RHEL 8, dhclient is launched with `-i` turned on by default.
The function merge_dhclient_config was called only once from
create_dhclient_config. The content of both of them is short and simple,
so moving the content from merge_dhclient_config to the caller
improves the readability and makes the functions call chain easier to
follow. Also, both functions takes a long list of arguments which are
almost the same, so we can avoid having to pass them over and over in a
long call chain.
Sending a client-id is not mandatory according to RFC2131. It is
mandatory according to RFC4361 that superseedes it.
Some weird DHCP servers conforming RFC2131 can get confused and break
existing DHCP leases if they start receiving a client-id when it was not
being previously received. Users that were using other DHCP client like
dhclient, but want to use NetworkManager's internal DHCP client, can
suffer this problem.
Add "none" as accepted value in ipv4.dhcp-client-id to specify that
client-id must not be sent. Note that this is generally not recommended
unless it's explicitly needed for some reason like the explained above.
Client-id is mandatory in DHCPv6.
This commit allow to set the "none" value and properly parse it in the
NMDhcpClientConfig struct. Next commits will modify the different DHCP
plugins to honor it.
If a commit is invoked without any change to the l3cd or to the ACD
data, in _l3cfg_update_combined_config() we skip calling
_l3_acd_data_add_all(), which should clear the dirty flag from ACDs.
Therefore, in case of such no-op commits the ACDs still marked as
dirty - but valid - are removed via:
_l3_commit()
_l3_acd_data_process_changes()
_l3_acd_data_prune()
_l3_acd_data_prune_one()
Invoking a l3cfg commit without any actual changes is allowed, see the
explanation in commit e773559d9d ('device: schedule an idle commit
when setting device's sys-iface-state').
The bug is visible by running test 'bond_addreses_restart_persistence'
with IPv4 ACD/DAD is enabled by default: after restart IPv6 completes
immediately, the devices becomes ACTIVATED, the sys-iface-state
transitions from ASSUME to MANAGED, a commit is done, and it
incorrectly prunes the ACD data. The result is that the IPv4 address
is never added again.
Fix this by doing the pruning only when we update the dirty flags.
This is a respin of commit ed565f9146 ('l3cfg: fix pruning of ACD
data') that was reverted because it was causing a crash. The crash was
caused by unconditionally clearing `acd_data_pruning_needed` in
_l3cfg_update_combined_config(), while we need to do it only when
actually committing the configuration.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1749
sysfs is deprecated and kernel will not add new bridge port options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
The commit causes the following assertion failure:
0 0x00007f4187e22884 in __pthread_kill_implementation () from target:/lib64/libc.so.6
1 0x00007f4187dd1afe in raise () from target:/lib64/libc.so.6
2 0x00007f4187dba87f in abort () from target:/lib64/libc.so.6
3 0x00007f4188386f4e in g_assertion_message (domain=domain@entry=0x6fc1bc "nm", file=file@entry=0x722e94 "../src/core/nm-l3cfg.c", line=line@entry=2134,
func=func@entry=0x727730 <__func__.49> "_l3_acd_data_add_all", message=message@entry=0x23b3bb0 "assertion failed: (acd_data->info.track_infos[i]._priv.acd_dirty_track)")
at ../glib/gtestutils.c:3450
4 0x00007f41883f1597 in g_assertion_message_expr (domain=domain@entry=0x6fc1bc "nm", file=file@entry=0x722e94 "../src/core/nm-l3cfg.c", line=line@entry=2134,
func=func@entry=0x727730 <__func__.49> "_l3_acd_data_add_all", expr=expr@entry=0x726450 "acd_data->info.track_infos[i]._priv.acd_dirty_track") at ../glib/gtestutils.c:3476
5 0x0000000000587209 in _l3_acd_data_add_all (self=self@entry=0x23a7020, infos=infos@entry=0x0, infos_len=infos_len@entry=0, reapply=reapply@entry=1)
at ../src/core/nm-l3cfg.c:2134
6 0x0000000000587702 in _l3cfg_update_combined_config (self=self@entry=0x23a7020, to_commit=to_commit@entry=1, reapply=reapply@entry=1, out_old=out_old@entry=0x7ffd09ea4ca8,
out_changed_combined_l3cd=out_changed_combined_l3cd@entry=0x7ffd09ea4c7c) at ../src/core/nm-l3cfg.c:3858
7 0x000000000058a202 in _l3_commit (self=0x23a7020, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY, is_idle=is_idle@entry=0) at ../src/core/nm-l3cfg.c:5046
8 0x000000000058a49f in nm_l3cfg_commit (self=<optimized out>, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY) at ../src/core/nm-l3cfg.c:5115
9 0x00000000004856cd in nm_device_l3cfg_commit (self=self@entry=0x23ab870, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY, commit_sync=commit_sync@entry=1)
at ../src/core/devices/nm-device.c:4155
10 0x00000000004b1814 in nm_device_cleanup (self=self@entry=0x23ab870, reason=reason@entry=NM_DEVICE_STATE_REASON_NEW_ACTIVATION,
cleanup_type=cleanup_type@entry=CLEANUP_TYPE_DECONFIGURE) at ../src/core/devices/nm-device.c:15884
11 0x00000000004b26c9 in _set_state_full (self=self@entry=0x23ab870, state=state@entry=NM_DEVICE_STATE_DISCONNECTED, reason=NM_DEVICE_STATE_REASON_NEW_ACTIVATION,
quitting=quitting@entry=0) at ../src/core/devices/nm-device.c:16291
12 0x00000000004b2fe4 in nm_device_state_changed (self=self@entry=0x23ab870, state=state@entry=NM_DEVICE_STATE_DISCONNECTED, reason=<optimized out>)
at ../src/core/devices/nm-device.c:16505
13 0x00000000004b69de in queued_state_set (user_data=user_data@entry=0x23ab870) at ../src/core/devices/nm-device.c:16532
14 0x00007f41883bf4fd in g_idle_dispatch (source=0x23a88e0, callback=0x4b6956 <queued_state_set>, user_data=0x23ab870) at ../glib/gmain.c:6163
15 0x00007f41883c34fc in g_main_dispatch (context=0x22c4d10) at ../glib/gmain.c:3460
16 g_main_context_dispatch (context=0x22c4d10) at ../glib/gmain.c:4200
17 0x00007f41884216b8 in g_main_context_iterate.isra.0 (context=0x22c4d10, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4276
18 0x00007f41883c2aff in g_main_loop_run (loop=0x22c3b50) at ../glib/gmain.c:4479
19 0x0000000000423a37 in main (argc=<optimized out>, argv=<optimized out>) at ../src/core/main.c:519
This reverts commit ed565f9146.
It seems more useful to have a best effort approach and configure
everything we can; in that way we achieve at least some connectivity,
and then sysadmin can check the logs in case something is
missing. Currently instead, the whole activation fails (so, no address
is configured) if just one of the addresses fails DAD.
Ideally, we should have a way to make this configurable; but for now,
implement the more useful behavior as default.