It's wrong, and it breaks certain uses.
Fixes: 13d25f9d0b ('glib-aux: add support for starting with stack-allocated buffer in NMStrBuf')
(cherry picked from commit c5ec4ebd77)
(cherry picked from commit 7b487e6951)
NM_STR_BUF_INIT() and nm_str_buf_init() were pretty much redundant. Drop one of
them.
Usually our pattern is that we don't have functions that return structs.
But NM_STR_BUF_INIT() returns a struct, because it's convenient to use
with
nm_auto_str_buf NMStrBuf strbuf = NM_STR_BUF_INIT(...);
So use that variant instead.
(cherry picked from commit 532f3e34a8)
Allow to initialize NMStrBuf with an externally allocated array.
Usually a stack buffer. If the NMStrBuf grows beyond the size of
that initial buffer, then it would switch using malloc.
The idea is to support the common case where the result is small enough
to fit on the stack.
I always wanted to do such optimization because the main purpose of
NMStrBuf is to put it on the stack and ad-hoc construct a string.
I just figured, it would complicate the implementation and add
a runtime overhead. But turns out, it doesn't really.
The biggest question is how NMStrBuf should behave with a pre-allocated
buffer? Turns out, most choices can be made in a rather obvious way.
The only non-obvious thing is that nm_str_buf_finalize() would malloc()
a buffer, but that too seems consistent and what a user would probably
expect. As such, this doesn't seem to add unexpected semantics to the API.
(cherry picked from commit 13d25f9d0b)
Sync/blocking methods are ugly. Their name should highlight this.
Also, we may have an async variant, so we will need the "good" name
for apply() and apply_finish().
(cherry picked from commit dc66fb7d04)
(cherry picked from commit 558bcd5aae)
Blocking calls are ugly. Rename those to have a "_sync()" suffix.
Also, split from _fw_nft_set_shared() the part that constructs the
stdin for nft.
(cherry picked from commit 7362ad6266)
(cherry picked from commit bbf3d01e82)
In practice there is little difference.
Previously, "strbuf" would own the string until the end of the function,
when the "nm_auto_str_buf" cleanup attribute destroys it. In the
meantime, we would pass it on to _fw_nft_call_sync(), which in fact
won't access the string after returning.
Instead, we can just transfer ownership to the GBytes instance. That seems
more logical and safer than aliasing the buffer owned by NMStrBuf with
a g_bytes_new_static(). That way, we don't add a non-obvious restriction
on the lifetime of the string. The lifetime is now guarded by the GBytes
instance, which, could be referenced and kept alive longer.
There is also no runtime/memory overhead in doing this.
(cherry picked from commit 6a04bcc59d)
NMPGlobalTracker allows to track objects for independent users/callers.
That is, callers that are not aware whether another caller tracks the
same/similar object. It thus groups all objects by their nmp_object_id_equal()
(as `TrackObjData` struct), while keeping a list of each individually tracked
object (as `TrackData` struct which honors the object and the user-tag parameter).
When the same caller (based on the user-tag) tracks the same object again, then
NMPGlobalTracker will only track it once and combine the objects. That is done by
also having a dictionary for the `TrackData` entries (`self->by_data`).
This latter dictionary lookup wrongly considered nmp_object_id_equal().
Instead, it needs to consider all minor differences of the objects, and
use nmp_object_equal().
For example, for NMPlatformMptcpAddress, only the "address" is part of
the ID. Other fields, like the MPTCP flags are not. Imagine a profile is
active with MPTCP endpoints configured with flags "subflow". During reapply,
the user can only update the MPTCP flags (e.g. to "signal"). When that happens,
the caller (NML3Cfg) would track a new NMPlatformMptcpAddress instance, that only
differs by MPTCP flags. In this case, we need to track the new address for the
differences that it has according to nmp_object_equal(), and not
nmp_object_id_equal().
Due to this bug, reapply might not work correctly. For other supported types (routing
rules and routes) this bug may have been harder to reproduce, because most attributes
of rules/routes are also part of the ID and because it's uncommon to reapply a minor
change to a rule/route.
https://bugzilla.redhat.com/show_bug.cgi?id=2120471
Fixes: b8398b9e79 ('platform: add NMPRulesManager for syncing routing rules')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1375
(cherry picked from commit d8aacba3b2)
(cherry picked from commit c456bfa7c4)
It is allowed to have a connection with empty connection.slave-type
and a NMSettingBondPort; the property will be set automatically during
normalization if a master is set, otherwise the setting will be removed.
With this change, it becomes possible to remove a port from a bond
from nmcli, turning it into a non-slave connection. Before, this used
to fail with:
$ nmcli connection add type ethernet ifname test con-name test+ connection.master bond0 connection.slave-type bond
$ nmcli connection modify test+ connection.master '' connection.slave-type ''
Error: Failed to modify connection 'test+': connection.slave-type: A connection with a 'bond-port' setting must have the slave-type set to 'bond'
https://bugzilla.redhat.com/show_bug.cgi?id=2126262https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1382
Fixes: 9958510f28 ('bond: add support of queue_id of bond port')
(cherry picked from commit 23ce9cff99)
(cherry picked from commit 30366e5b3a)
When disposing NMPolicy all the devices in the devices hash-table should
be unregistered and removed from the hash-table.
Fixes: 7e3d090acb ('policy: refactor tracking of registered devices')
(cherry picked from commit 5a87683b14)
(cherry picked from commit d23c6040f8)
We've been outright ignoring master-slave checks if the link ended up
without a master since commit 2e22880894 ('device: don't remove the
device from master if its link has no master').
This was done to deal with OpenVSwitch port-interface relationship,
where the interface's platform link lacked an actual master in platform
(what matters there is the OVSDB entry), but the fix was too wide.
Let's limit the special case to devices whose were not enslaved to
masters that lack a platform link, which pretty much happens for
OpenVSwitch only.
Morale: Write better commit messages of future you is going to be upset
Fixes: 2e22880894 ('device: don't remove the device from master if its link has no master')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1358
(cherry picked from commit a1de6810df)
(cherry picked from commit dc2d2da9db)
The @dracut_NM_vlan_over_team_no_boot sometimes fails, among other
things, because it fails to assume an indicated connection after a
restart.
That seems to happen because after the decision to activate the
indicated connection, the device does not move from DISCONNECTED state
quickly enough. Another assumption recheck runs in between and decides
to generate a connection, because the assume state was already reset
in between.
First start, creates and activates b3a61b68-f744-4a4c-a513-61399c154a67
on vlan0017:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
...
settings: update[b3a61b68-f744-4a4c-a513-61399c154a67]: adding connection "vlan0017"
(45113870df0a4cfb/keyfile)
Second start:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(after a restart, asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
Assumption attempt successfully picks the right connection and thus
proceeds to reset the assume state:
manager: (vlan0017): assume: will attempt to assume matching connection 'vlan0017'
(b3a61b68-f744-4a4c-a513-61399c154a67) (indicated)
device[c7c5101cf0b73f5f] (vlan0017): assume-state: set guess-assume=0, connection=(null)
Everything great so far, activation of the right connection is enqueued
and the device moves away from unavailable state. However, the
activation can't proceed immediately:
device (vlan0017): state change: unmanaged -> unavailable
(reason 'connection-assumed', sys-iface-state: 'assume')
device (vlan0017): state change: unavailable -> disconnected
(reason 'connection-assumed', sys-iface-state: 'assume')
active-connection[0x55ba1162f1c0]: set device "vlan0017" [0x55ba1163c4f0]
device[c7c5101cf0b73f5f] (vlan0017): queue activation request waiting for carrier
Now another assumption attempt is done. The original assume state is
gone, so a connection is generated:
platform-linux: UDEV event: action 'add' subsys 'net' device 'vlan0017' (6); seqnum=1959
device[c7c5101cf0b73f5f] (vlan0017): queued link change for ifindex 6
manager: (vlan0017): assume: generated connection 'vlan0017' (57627119-8c20-4f9e-bf4d-4fc427b4a6a9)
keyfile: commit: 57627119-8c20-4f9e-bf4d-4fc427b4a6a9 (vlan0017) added as
"/run/NetworkManager/system-connections/vlan0017-57627119-8c20-4f9e-bf4d-4fc427b4a6a9.nmconnection"
(nm-generated,volatile,external)
I think this shouldn't have happened. We've picked the correct
connection already and it's enqueued for activation!
Change the check in nm_device_emit_recheck_assume() to also consider
any queued activation.
Fixes-test: @dracut_NM_vlan_over_team_no_boot
Co-authored-by: Lubomir Rintel <lkundrak@v3.sk>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1351
(cherry picked from commit 9eb8cbca76)
(cherry picked from commit bdaba47a68)
If the MAC changes there is the possibility that the DHCP client will
not be able to renew the address because it uses the old MAC as
CHADDR. Depending on the implementation, the DHCP server might use
CHADDR (so, the old address) as the destination MAC for DHCP replies,
and those packets will be lost.
To avoid this problem, restart the DHCP client when the MAC changes.
https://bugzilla.redhat.com/show_bug.cgi?id=2110000
(cherry picked from commit 905adabdba)
(cherry picked from commit 5a49a2f6b2)
During the deactivation of ovs interfaces, ovsdb receives the command to
remove the interface but for OVS system ports the device won't
disappear.
When reconnecting, ovsdb will update first the status and it will notice
that the OVS system interface was removed and it will set the status as
DEACTIVATING. This is incorrect if the status is already DEACTIVATING,
DISCONNECTED, UNMANAGED or UNAVAILABLE because it will block the
activation of the interface.
https://bugzilla.redhat.com/show_bug.cgi?id=2080236
(cherry picked from commit bc6e28e585)
When only one of those connection.{lldp,mdns,llmnr,dns-over-tls}
settings changes, we still need to do a full restart of the IP
configuration to reapply the changes.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
(cherry picked from commit f4b128c63b)
_host_id_read() is the only place where we really care to have good
random numbers, because that is the secret key that we persist to disk.
Previously, we tried only nm_random_get_bytes_full(), which is a best
effort to get strong random numbers. If it fails to generate those,
it would simply remember the generated key in memory and proceed, but not
persist it to disk.
nm_random_get_bytes_full() does not block waiting for good numbers.
Change that. Now, first call nm_random_get_crypto_bytes(), which would
block and try hard to get good random numbers. Only if that fails,
fallback to nm_random_get_bytes_full() as before. The difference is of
course only in early boot, when we might not yet have entropy. In that
case, I think it's better for NetworkManager to block.
(cherry picked from commit 67a5cf7675)
Heavily inspired by systemd ([1]).
We now also have nm_random_get_bytes{,_full}() and
nm_random_get_crypto_bytes(), like systemd's random_bytes()
and crypto_random_bytes(), respectively.
Differences:
- instead of systemd's random_bytes(), our nm_random_get_bytes_full()
also estimates whether the output is of high quality. The caller
may find that interesting. Due to that, we will first try to call
getrandom(GRND_NONBLOCK) before getrandom(GRND_INSECURE). That is
reversed from systemd's random_bytes(), because we want to find
out whether we can get good random numbers. In most cases, kernel
should have entropy already, and it makes no difference.
Otherwise, heavily rework the code. It should be easy to understand
and correct.
There is also a major bugfix here. Previously, if getrandom() failed
with ENOSYS and we fell back to /dev/urandom, we would assume that we
have high quality random numbers. That assumption is not warranted.
Now instead poll on /dev/random to find out.
[1] a268e7f402/src/basic/random-util.c (L81)
(cherry picked from commit d20343c9d0)
nm_utils_random_bytes() is supposed to give us good random number from
kernel. It guarantees to always provide some bytes, and it has a
boolean return value that estimates whether the bytes are good
randomness. In practice, most callers ignore that return value, because
what would they do about it anyway?
Of course, we want to primarily use getrandom() (or "/dev/urandom"). But
if we fail to get random bytes from them, we have a fallback path that
tries to generate "random" bytes.
It does so, by initializing a global seed from various sources, and keep
sha256 hashing the buffer in a loop. That's certainly not efficient nor
elegant, but we already are in a fallback path.
Still, we can do slightly better. Instead of just using the global state
and keep updating it (entirely deterministically), every time also mix in
the results from getrandom() and a current timestamp. The idea is that if you
have a virtual machine that gets cloned, we don't want that our global
state keeps giving the same random numbers. In particular, because
getrandom() might handle that case, even if it doesn't have good
entropy.
(cherry picked from commit 3c349ee11b)
By default, wpa_supplicant sets these parameters according to the
802.11 standard:
dot11RSNAConfigPMKLifetime = 43200 seconds (12 hours)
dot11RSNAConfigPMKReauthThreshold = 70%
With these, the supplicant triggers a new EAP authentication every 8
hours and 24 minutes. If the network uses one-time secrets, the
reauthentication fails and the supplicant disconnects. It doesn't seem
desirable that the client starts a reauthentication so early; bump the
lifetime to a week.
Currently, due to a bug, the new value is ignored by wpa_supplicant
when set via D-Bus. This patch needs the fix at [1], not yet merged.
[1] http://lists.infradead.org/pipermail/hostap/2022-July/040664.htmlhttps://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1306
(cherry picked from commit e35f2494f8)
n-dhcp4 only supports calling ACCEPT during the GRANTED state.
Not during a EXTENDED event. So usually, we would not want
to call accept in that case.
And we didn't. During EXTENDED event, we would usually skip ACD (because
it's either not enabled or we already passed ACD for the current address).
In that case, in _nm_dhcp_client_notify() we hit the line
if (client_event_type == NM_DHCP_CLIENT_EVENT_TYPE_BOUND && priv->l3cd_curr
&& nm_l3_config_data_get_num_addresses(priv->l3cd_curr, priv->config.addr_family) > 0)
priv->l3cfg_notify.wait_dhcp_commit = TRUE;
else
priv->l3cfg_notify.wait_dhcp_commit = FALSE;
and would not set `wait_dhpc_commit`. That means, we never called _dhcp_client_accept().
For nettools, that doesn't really matter because calling ACCEPT during EXTENDED
is invalid anyway. However, for dhclient that is fatal because we wouldn't reply the
D-Bus request from nm-dhcp-helper. The helper times out after 60 seconds and dhclient
would misbehave.
We need to fix that by also calling _dhcp_client_accept() in the case when we don't
need to wait (the EXTENDED case).
However, previously _dhcp_client_accept() was rather peculiar and didn't like to be
called in an unexpected state. Relax that. Now, when calling accept in an unexpected
state, just do nothing and signal success. That frees the caller from the complexity
to understand when they must/must not call accept.
https://bugzilla.redhat.com/show_bug.cgi?id=2109285
Fixes: 156d84217c ('dhcp/dhclient: implement accept/decline (ACD) for dhclient plugin')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1308
(cherry picked from commit 5077018ff4)
It's no longer necessary, as modem devices get the priority from the
ipmanual configuration created from the profile.
(cherry picked from commit 8c17760f62)
Before 1.36, manual addresses from the profile were assigned to the
interface; restore that behavior.
The manual IP configuration also contains the DNS priority from the
profile; so this change ensures that the merged l3cd has a DNS
priority and that dynamically discovered DNS servers are not ignored
by the DNS manager.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
(cherry picked from commit 0717589972)