Sync/blocking methods are ugly. Their name should highlight this.
Also, we may have an async variant, so we will need the "good" name
for apply() and apply_finish().
(cherry picked from commit dc66fb7d04)
(cherry picked from commit 558bcd5aae)
Blocking calls are ugly. Rename those to have a "_sync()" suffix.
Also, split from _fw_nft_set_shared() the part that constructs the
stdin for nft.
(cherry picked from commit 7362ad6266)
(cherry picked from commit bbf3d01e82)
In practice there is little difference.
Previously, "strbuf" would own the string until the end of the function,
when the "nm_auto_str_buf" cleanup attribute destroys it. In the
meantime, we would pass it on to _fw_nft_call_sync(), which in fact
won't access the string after returning.
Instead, we can just transfer ownership to the GBytes instance. That seems
more logical and safer than aliasing the buffer owned by NMStrBuf with
a g_bytes_new_static(). That way, we don't add a non-obvious restriction
on the lifetime of the string. The lifetime is now guarded by the GBytes
instance, which, could be referenced and kept alive longer.
There is also no runtime/memory overhead in doing this.
(cherry picked from commit 6a04bcc59d)
When disposing NMPolicy all the devices in the devices hash-table should
be unregistered and removed from the hash-table.
Fixes: 7e3d090acb ('policy: refactor tracking of registered devices')
(cherry picked from commit 5a87683b14)
(cherry picked from commit d23c6040f8)
We've been outright ignoring master-slave checks if the link ended up
without a master since commit 2e22880894 ('device: don't remove the
device from master if its link has no master').
This was done to deal with OpenVSwitch port-interface relationship,
where the interface's platform link lacked an actual master in platform
(what matters there is the OVSDB entry), but the fix was too wide.
Let's limit the special case to devices whose were not enslaved to
masters that lack a platform link, which pretty much happens for
OpenVSwitch only.
Morale: Write better commit messages of future you is going to be upset
Fixes: 2e22880894 ('device: don't remove the device from master if its link has no master')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1358
(cherry picked from commit a1de6810df)
(cherry picked from commit dc2d2da9db)
The @dracut_NM_vlan_over_team_no_boot sometimes fails, among other
things, because it fails to assume an indicated connection after a
restart.
That seems to happen because after the decision to activate the
indicated connection, the device does not move from DISCONNECTED state
quickly enough. Another assumption recheck runs in between and decides
to generate a connection, because the assume state was already reset
in between.
First start, creates and activates b3a61b68-f744-4a4c-a513-61399c154a67
on vlan0017:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
...
settings: update[b3a61b68-f744-4a4c-a513-61399c154a67]: adding connection "vlan0017"
(45113870df0a4cfb/keyfile)
Second start:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(after a restart, asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
Assumption attempt successfully picks the right connection and thus
proceeds to reset the assume state:
manager: (vlan0017): assume: will attempt to assume matching connection 'vlan0017'
(b3a61b68-f744-4a4c-a513-61399c154a67) (indicated)
device[c7c5101cf0b73f5f] (vlan0017): assume-state: set guess-assume=0, connection=(null)
Everything great so far, activation of the right connection is enqueued
and the device moves away from unavailable state. However, the
activation can't proceed immediately:
device (vlan0017): state change: unmanaged -> unavailable
(reason 'connection-assumed', sys-iface-state: 'assume')
device (vlan0017): state change: unavailable -> disconnected
(reason 'connection-assumed', sys-iface-state: 'assume')
active-connection[0x55ba1162f1c0]: set device "vlan0017" [0x55ba1163c4f0]
device[c7c5101cf0b73f5f] (vlan0017): queue activation request waiting for carrier
Now another assumption attempt is done. The original assume state is
gone, so a connection is generated:
platform-linux: UDEV event: action 'add' subsys 'net' device 'vlan0017' (6); seqnum=1959
device[c7c5101cf0b73f5f] (vlan0017): queued link change for ifindex 6
manager: (vlan0017): assume: generated connection 'vlan0017' (57627119-8c20-4f9e-bf4d-4fc427b4a6a9)
keyfile: commit: 57627119-8c20-4f9e-bf4d-4fc427b4a6a9 (vlan0017) added as
"/run/NetworkManager/system-connections/vlan0017-57627119-8c20-4f9e-bf4d-4fc427b4a6a9.nmconnection"
(nm-generated,volatile,external)
I think this shouldn't have happened. We've picked the correct
connection already and it's enqueued for activation!
Change the check in nm_device_emit_recheck_assume() to also consider
any queued activation.
Fixes-test: @dracut_NM_vlan_over_team_no_boot
Co-authored-by: Lubomir Rintel <lkundrak@v3.sk>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1351
(cherry picked from commit 9eb8cbca76)
(cherry picked from commit bdaba47a68)
If the MAC changes there is the possibility that the DHCP client will
not be able to renew the address because it uses the old MAC as
CHADDR. Depending on the implementation, the DHCP server might use
CHADDR (so, the old address) as the destination MAC for DHCP replies,
and those packets will be lost.
To avoid this problem, restart the DHCP client when the MAC changes.
https://bugzilla.redhat.com/show_bug.cgi?id=2110000
(cherry picked from commit 905adabdba)
(cherry picked from commit 5a49a2f6b2)
During the deactivation of ovs interfaces, ovsdb receives the command to
remove the interface but for OVS system ports the device won't
disappear.
When reconnecting, ovsdb will update first the status and it will notice
that the OVS system interface was removed and it will set the status as
DEACTIVATING. This is incorrect if the status is already DEACTIVATING,
DISCONNECTED, UNMANAGED or UNAVAILABLE because it will block the
activation of the interface.
https://bugzilla.redhat.com/show_bug.cgi?id=2080236
(cherry picked from commit bc6e28e585)
When only one of those connection.{lldp,mdns,llmnr,dns-over-tls}
settings changes, we still need to do a full restart of the IP
configuration to reapply the changes.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
(cherry picked from commit f4b128c63b)
_host_id_read() is the only place where we really care to have good
random numbers, because that is the secret key that we persist to disk.
Previously, we tried only nm_random_get_bytes_full(), which is a best
effort to get strong random numbers. If it fails to generate those,
it would simply remember the generated key in memory and proceed, but not
persist it to disk.
nm_random_get_bytes_full() does not block waiting for good numbers.
Change that. Now, first call nm_random_get_crypto_bytes(), which would
block and try hard to get good random numbers. Only if that fails,
fallback to nm_random_get_bytes_full() as before. The difference is of
course only in early boot, when we might not yet have entropy. In that
case, I think it's better for NetworkManager to block.
(cherry picked from commit 67a5cf7675)
Heavily inspired by systemd ([1]).
We now also have nm_random_get_bytes{,_full}() and
nm_random_get_crypto_bytes(), like systemd's random_bytes()
and crypto_random_bytes(), respectively.
Differences:
- instead of systemd's random_bytes(), our nm_random_get_bytes_full()
also estimates whether the output is of high quality. The caller
may find that interesting. Due to that, we will first try to call
getrandom(GRND_NONBLOCK) before getrandom(GRND_INSECURE). That is
reversed from systemd's random_bytes(), because we want to find
out whether we can get good random numbers. In most cases, kernel
should have entropy already, and it makes no difference.
Otherwise, heavily rework the code. It should be easy to understand
and correct.
There is also a major bugfix here. Previously, if getrandom() failed
with ENOSYS and we fell back to /dev/urandom, we would assume that we
have high quality random numbers. That assumption is not warranted.
Now instead poll on /dev/random to find out.
[1] a268e7f402/src/basic/random-util.c (L81)
(cherry picked from commit d20343c9d0)
By default, wpa_supplicant sets these parameters according to the
802.11 standard:
dot11RSNAConfigPMKLifetime = 43200 seconds (12 hours)
dot11RSNAConfigPMKReauthThreshold = 70%
With these, the supplicant triggers a new EAP authentication every 8
hours and 24 minutes. If the network uses one-time secrets, the
reauthentication fails and the supplicant disconnects. It doesn't seem
desirable that the client starts a reauthentication so early; bump the
lifetime to a week.
Currently, due to a bug, the new value is ignored by wpa_supplicant
when set via D-Bus. This patch needs the fix at [1], not yet merged.
[1] http://lists.infradead.org/pipermail/hostap/2022-July/040664.htmlhttps://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1306
(cherry picked from commit e35f2494f8)
n-dhcp4 only supports calling ACCEPT during the GRANTED state.
Not during a EXTENDED event. So usually, we would not want
to call accept in that case.
And we didn't. During EXTENDED event, we would usually skip ACD (because
it's either not enabled or we already passed ACD for the current address).
In that case, in _nm_dhcp_client_notify() we hit the line
if (client_event_type == NM_DHCP_CLIENT_EVENT_TYPE_BOUND && priv->l3cd_curr
&& nm_l3_config_data_get_num_addresses(priv->l3cd_curr, priv->config.addr_family) > 0)
priv->l3cfg_notify.wait_dhcp_commit = TRUE;
else
priv->l3cfg_notify.wait_dhcp_commit = FALSE;
and would not set `wait_dhpc_commit`. That means, we never called _dhcp_client_accept().
For nettools, that doesn't really matter because calling ACCEPT during EXTENDED
is invalid anyway. However, for dhclient that is fatal because we wouldn't reply the
D-Bus request from nm-dhcp-helper. The helper times out after 60 seconds and dhclient
would misbehave.
We need to fix that by also calling _dhcp_client_accept() in the case when we don't
need to wait (the EXTENDED case).
However, previously _dhcp_client_accept() was rather peculiar and didn't like to be
called in an unexpected state. Relax that. Now, when calling accept in an unexpected
state, just do nothing and signal success. That frees the caller from the complexity
to understand when they must/must not call accept.
https://bugzilla.redhat.com/show_bug.cgi?id=2109285
Fixes: 156d84217c ('dhcp/dhclient: implement accept/decline (ACD) for dhclient plugin')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1308
(cherry picked from commit 5077018ff4)
It's no longer necessary, as modem devices get the priority from the
ipmanual configuration created from the profile.
(cherry picked from commit 8c17760f62)
Before 1.36, manual addresses from the profile were assigned to the
interface; restore that behavior.
The manual IP configuration also contains the DNS priority from the
profile; so this change ensures that the merged l3cd has a DNS
priority and that dynamically discovered DNS servers are not ignored
by the DNS manager.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
(cherry picked from commit 0717589972)
==30980== 8 bytes in 1 blocks are definitely lost in loss record 1,117 of 6,137
==30980== at 0x4841C38: malloc (vg_replace_malloc.c:309)
==30980== by 0x4A246C7: g_malloc (gmem.c:106)
==30980== by 0x4A4A4BB: g_variant_get_strv (gvariant.c:1607)
==30980== by 0x4A4CA73: g_variant_valist_get_nnp (gvariant.c:4901)
==30980== by 0x4A4CA73: g_variant_valist_get_leaf (gvariant.c:5058)
==30980== by 0x4A4CA73: g_variant_valist_get (gvariant.c:5239)
==30980== by 0x4A4D11D: g_variant_get_va (gvariant.c:5502)
==30980== by 0x4A4D1BD: g_variant_lookup (gvariant.c:989)
==30980== by 0xE9389: parse_capabilities (nm-supplicant-interface.c:1241)
==30980== by 0xEBF99: _properties_changed_main (nm-supplicant-interface.c:1941)
==30980== by 0xEF549: _properties_changed (nm-supplicant-interface.c:2867)
==30980== by 0xEF7ED: _get_all_main_cb (nm-supplicant-interface.c:2972)
==30980== by 0x262057: _nm_dbus_connection_call_default_cb (nm-dbus-aux.c:70)
==30980== by 0x48DB6A3: g_task_return_now (gtask.c:1215)
==30980== by 0x48DBF43: g_task_return.part.3 (gtask.c:1285)
==30980== by 0x4918885: g_dbus_connection_call_done (gdbusconnection.c:5765)
==30980== by 0x48DB6A3: g_task_return_now (gtask.c:1215)
==30980== by 0x48DB6D7: complete_in_idle_cb (gtask.c:1229)
==30980== by 0x4A20981: g_main_dispatch (gmain.c:3325)
==30980== by 0x4A20981: g_main_context_dispatch (gmain.c:4016)
==30980== by 0x4A20BEF: g_main_context_iterate.isra.23 (gmain.c:4092)
==30980== by 0x4A20E33: g_main_loop_run (gmain.c:4290)
==30980== by 0x2C5C9: main (main.c:509)
Fixes: cd1e0193ab ('supplicant: add BIP interface capability')
(cherry picked from commit 8c5356cec6)
dhclient itself doesn't do ACD. However, it expects the dhclient-script
to exit with non-zero status, which causes dhclient to send a DECLINE.
`man dhclient-script`:
BOUND:
Before actually configuring the address, dhclient-script should
somehow ARP for it and exit with a nonzero status if it receives a
reply. In this case, the client will send a DHCPDECLINE message to
the server and acquire a different address. This may also be done in
the RENEW, REBIND, or REBOOT states, but is not required, and indeed may
not be desirable.
See also Fedora's dhclient-script ([1]).
https://gitlab.isc.org/isc-projects/dhcp/-/issues/67#note_9722633226f2d76/client/dhclient.c (L1652)
[1] a8f6fd046f/f/dhclient-script (_878)https://bugzilla.redhat.com/show_bug.cgi?id=1713380
(cherry picked from commit 156d84217c)
- assign the result of NM_DHCP_CLIENT_GET_CLASS() to a local variable.
It feels nicer to only call the macro once. Of course, the macro
expands to plain pointer dereferences, so there is little difference
in terms of executed code.
- handle the default case with no virtual function first.
(cherry picked from commit 0f6df633fa)
It's pretty pointless to log
<trace> [1653389116.6288] dhcp4 (br0): client event 7
<debug> [1653389116.6288] dhcp4 (br0): received OFFER of 192.168.121.110 from 192.168.121.1
where the obscure event #7 is only telling you that we are going
to log something. Handle logging events first.
In general, drop the "client event %d" message and make sure that all
code paths log something (useful), so we can see in the log that the
event was reached.
(cherry picked from commit 85b15e02fd)
When we accept/decline a lease, then that only works if we are in state
GRANTED. n-dhcp4 API also requires us, to provide the exact lease, that
we were announced earlier.
As such, we need to make sure that we don't accept/decline in the wrong
state. That means, to keep track of what we are doing more carefully.
The functions _dhcp_client_accept()/_dhcp_client_decline() now take
a l3cd argument, the one that we announced earlier. And we check that it
still matches.
(cherry picked from commit 52a0fe584c)
They are no longer used from outside, NMDhcpClient fully handles this.
Make them static and internal.
Also, decline is currently unused. It will be used soon, with ACD
support.
(cherry picked from commit 4a256092ee)
The function subscribes a callback l3_cfg_notify_cb(). Rename so that
related functions have a clearly related name.
(cherry picked from commit 9abcf3a53c)
The l3_cfg_notify_cb() handler is used for different purposes, and
different events will be considered.
Usually a switch statement is very nice for enums, especially if all
enum values should be handled (because the compiler can warn about
unhandled cases). In this case, not all events are supposed to be
handled. At this point, it seems nicer to just use an if block. It
better composes.
The compiler should be able to optimize both variants to the same
result. In any case, checking some integers for equality is in any case
going to be efficient.
(cherry picked from commit 7db07faa5e)
As almost always, there is a point in keeping IPv4 and IPv6 implementations
similar. Behave different where there is an actual difference, at the bottom
of the stack.
(cherry picked from commit 7f943f5fa6)
Technically, g_warn_if_reached() may not be an assertion, according to
glib. However, there is G_DEBUG=fatal-warnings and we want to run with
that.
So this is an assertion to us. Also, logging to stderr/stdout is not a
useful thing to the daemon. Don't do this. Especially, since it depends
on user provided (untrusted) input.
(cherry picked from commit 892cde1436)
Optimally we want stateless, pure code. Obviously, NMDhcpClient needs to
keep state to know what it's doing. However, we should well encapsulate
the state inside NMDhcpClient, and only accept events/notifications that
mutate the internal state according to certain rules.
Having a function public set_state(self, new_state) means that other
components (subclasses of NMDhcpClient) can directly mangle the state.
That means, you no longer need to only reason about the internal state
of NMDhcpClient (and the events/notifications/state-changes that it
implements). You also need to reason that other components take part of
maintaining that internal state.
Rename nm_dhcp_client_set_state() to nm_dhcp_client_notify(). Also, add
a new enum NMDhcpClientEventType with notification/event types.
In practice, this is only renaming. But naming is important, because it
suggests the reader how to think about the code.
(cherry picked from commit 97e65e4b50)
The "noop" state is almost unused, however, nm_dhcp_set_state()
has a check "if (new_state >= NM_DHCP_STATE_TIMEOUT)", so the order
of the NOOP state matters.
Fix that by reordering.
Also, just return right away from NOOP.
(cherry picked from commit 9761e38f7e)
NMDhcpState is very tied to events from dhclient. But most of these
states we don't care about, and NMDhcpClient definitely should abstract
and hide them.
We should repurpose NMDhcpState to simpler state. For that, first drop
the state from nm_dhcp_client_handle_event().
This is only the first step (which arguably makes the code more
complicated, because reason_to_state() gets spread out and the logic
happens more than once). That will be addressed next.
(cherry picked from commit f102051a29)