The "noop" state is almost unused. However, nm_dhcp_set_state()
has a check "if (new_state >= NM_DHCP_STATE_TIMEOUT)", so the order
of the NOOP state matters.
Fix that by reordering.
Also, just return right away from NOOP.
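The ordering issue can be illustrated with a minimal sketch. The enum values and function names below are hypothetical stand-ins, not the actual NMDhcpState definition. They only show why NOOP must sort before TIMEOUT when a ">=" range check is used, and what returning early for NOOP looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state enum. NOOP sorts before TIMEOUT, so a
 * ">= TIMEOUT" range check does not accidentally match it. */
typedef enum {
    DHCP_STATE_UNKNOWN = 0,
    DHCP_STATE_NOOP,
    DHCP_STATE_BOUND,
    DHCP_STATE_TIMEOUT,
    DHCP_STATE_DONE,
    DHCP_STATE_FAIL,
} DhcpState;

/* Returns true if the state is TIMEOUT or any state ordered after it. */
bool
dhcp_state_is_timeout_or_later(DhcpState state)
{
    return state >= DHCP_STATE_TIMEOUT;
}

/* Returns true if the state change was applied, false if it was ignored. */
bool
dhcp_set_state(DhcpState *current, DhcpState new_state)
{
    assert(current != NULL);

    if (new_state == DHCP_STATE_NOOP)
        return false; /* return right away, nothing to do */

    *current = new_state;
    return true;
}
```

If NOOP were listed after FAIL instead, dhcp_state_is_timeout_or_later() would return true for it, which is the kind of surprise the reordering avoids.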
NMDhcpState is very tied to events from dhclient. But most of these
states we don't care about, and NMDhcpClient definitely should abstract
and hide them.
We should repurpose NMDhcpState as a simpler state. For that, first drop
the state from nm_dhcp_client_handle_event().
This is only the first step (which arguably makes the code more
complicated, because reason_to_state() gets spread out and the logic
happens more than once). That will be addressed next.
- return early to avoid nested block.
- use NM_STR_HAS_PREFIX() over g_str_has_prefix(), because that
can be inlined and only accepts a C literal as prefix argument.
- the code comment was unclear/wrong. If something comes from an environment
variable it is *NOT* UTF-8 safe. Also, we convert all non-ASCII characters,
not only non UTF-8 characters.
- as we already convert the string to ASCII, the check whether it's UTF-8
is bogus.
- using GString is unnecessary.
- use NM_IN_STRSET_ASCII_CASE().
- don't use else block after we return.
- don't accept the "iface" argument just for logging. The caller
can do the logging, if they wish.
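Two of the points above can be sketched in isolation. The macro below is a hypothetical illustration in the spirit of NM_STR_HAS_PREFIX(), not its actual definition: concatenating the argument with "" only compiles for a string literal, and sizeof() yields the prefix length at compile time. The second helper shows one way to treat every non-ASCII byte uniformly. Replacing such bytes with '?' is an assumption made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix check. The "" prefix "" concatenation is a compile
 * error unless "prefix" is a C string literal. */
#define STR_HAS_PREFIX(str, prefix) \
    (strncmp((str), "" prefix "", sizeof(prefix) - 1) == 0)

/* Replace every non-ASCII byte (>= 0x80) with '?', in place.
 * Returns the number of bytes that were replaced. */
size_t
str_make_ascii(char *str)
{
    size_t n_replaced = 0;

    assert(str != NULL);

    for (; *str != '\0'; str++) {
        if ((unsigned char) *str > 0x7f) {
            *str = '?';
            n_replaced++;
        }
    }
    return n_replaced;
}
```

After str_make_ascii() the string is plain ASCII, and therefore trivially valid UTF-8, which is why a separate UTF-8 check afterwards adds nothing.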
Log messages when invalid DHCP options are found. For example:
<info> dhcp4 (eth0): error parsing DHCP option 6 (domain_name_servers): address 0.0.0.0 is ignored
<info> dhcp4 (eth0): error parsing DHCP option 12 (host_name): '.example.com' is not a valid DNS domain
<info> dhcp4 (eth0): error parsing DHCP option 26 (interface_mtu): value 60 is smaller than 68
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1225
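As a rough sketch of how such a message could be produced, the helper below validates an interface_mtu value and writes the reason for rejection into a caller-provided buffer, so that the caller can decide how to log it. The function name and signature are hypothetical. The limit of 68 is taken from the example log line above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define DHCP_MTU_MIN 68 /* lower bound, as in the example log message */

/* Validate a DHCP interface_mtu value. On failure, write a short reason
 * into errbuf and return false. On success, errbuf is set to "". */
bool
dhcp_mtu_validate(unsigned value, char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);

    if (value < DHCP_MTU_MIN) {
        snprintf(errbuf, errlen, "value %u is smaller than %d", value, DHCP_MTU_MIN);
        return false;
    }

    errbuf[0] = '\0';
    return true;
}
```

The caller would prepend the interface and option name when logging the reason.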
audit_encode_nv_string() is documented as possibly failing. Handle
the error.
Also, the returned string was allocated with malloc(). We must free
that with free()/nm_auto_free, not g_free()/gs_free.
Currently nm_setting_bond_get_option_normalized() and
nm_setting_bond_get_option_or_default() are identical functions. As the
first one is exposed as public API and has a better name, let's drop the
second one.
IPv6 temporary addresses are configured by kernel, with the
"ipv6.ip6-privacy" setting ("use_tempaddr" sysctl) and the
IFA_F_MANAGETEMPADDR flag.
As such, the idea was that during reapply we would not remove them.
However, that is wrong.
The only case where we want to keep those addresses is when, during reapply,
we are going to configure the same primary address (with the mngtmpaddr
flag) again. Otherwise, these addresses must always go away.
This is quite serious, and it does not only affect reapply. During disconnect
we also clear the IP configuration via l3cfg.
Have an ethernet profile active with "ipv6.ip6-privacy". Unplug
the cable, the device disconnects but the temporary IPv6 address is not
cleared. As such, nm_device_generate_connection() will now generate
an external profile (with "ipv6.method=disabled" and no manual IP addresses).
The result is, that the device cannot properly autoconnect again,
once you replug the cable.
This is serious for disconnect. But I could not actually reproduce the
problem using reapply. That is, because during reapply we usually
toggle ipv6_disable sysctl, which drops all IPv6 addresses. I still
went through the effort of trying to preserve addresses that we still
want to have, because I am not sure whether there are cases where we
don't toggle ipv6_disable. Also, doing ipv6_disable during reapply is
bad anyway, and we might want to avoid that in the future.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
NM_STR_BUF_INIT() and nm_str_buf_init() were pretty much redundant. Drop one of
them.
Usually our pattern is that we don't have functions that return structs.
But NM_STR_BUF_INIT() returns a struct, because it's convenient to use
with

    nm_auto_str_buf NMStrBuf strbuf = NM_STR_BUF_INIT(...);

So use that variant instead.
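The pattern of a struct-returning initializer combined with a cleanup attribute can be sketched as follows. StrBuf, str_buf_init() and auto_str_buf are hypothetical names loosely modelled on the idea, not the NMStrBuf implementation. The cleanup attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal string buffer. */
typedef struct {
    char  *str;
    size_t len;
    size_t allocated;
} StrBuf;

/* Returns the initialized struct by value, so that it can be used directly
 * as the initializer of an automatic variable. */
static inline StrBuf
str_buf_init(size_t reserve)
{
    StrBuf buf = {
        .str       = reserve > 0 ? malloc(reserve) : NULL,
        .len       = 0,
        .allocated = reserve,
    };

    if (buf.str)
        buf.str[0] = '\0';
    else
        buf.allocated = 0;
    return buf;
}

static inline void
str_buf_destroy(StrBuf *buf)
{
    free(buf->str);
    memset(buf, 0, sizeof(*buf));
}

/* Destroy the buffer automatically when the variable goes out of scope. */
#define auto_str_buf __attribute__((cleanup(str_buf_destroy)))

/* Append a string, growing the buffer as needed. Returns 0 on success. */
int
str_buf_append(StrBuf *buf, const char *s)
{
    size_t n;

    assert(buf != NULL && s != NULL);
    n = strlen(s);

    if (buf->len + n + 1 > buf->allocated) {
        size_t new_alloc = (buf->len + n + 1) * 2;
        char  *p         = realloc(buf->str, new_alloc);

        if (!p)
            return -1;
        buf->str       = p;
        buf->allocated = new_alloc;
    }
    memcpy(buf->str + buf->len, s, n + 1);
    buf->len += n;
    return 0;
}
```

With this, a declaration such as "auto_str_buf StrBuf sb = str_buf_init(64);" both initializes the buffer and arranges for it to be released at the end of the scope, which is the convenience the struct-returning variant offers.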
For some device types the attach-port operation doesn't complete
immediately. NMDevice needs to wait for the operation to complete
before proceeding (for example, before starting stage3 for the port).
Change attach_port() so that it can return TERNARY_DEFAULT to indicate
that the operation will complete asynchronously. Most devices are
not affected by this and can continue returning TRUE/FALSE as before,
without a callback.
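A minimal sketch of this three-valued return convention follows. All types and function names are hypothetical and much simpler than the real NMDevice API. In particular, that the synchronous path does not invoke the callback is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Three-valued result. */
typedef enum {
    TERNARY_DEFAULT = -1, /* operation will complete asynchronously */
    TERNARY_FALSE   = 0,  /* failed immediately */
    TERNARY_TRUE    = 1,  /* succeeded immediately */
} Ternary;

typedef void (*AttachPortCallback)(bool success, void *user_data);

/* Hypothetical device with a slot for one pending callback. */
typedef struct {
    bool               attach_is_async;
    AttachPortCallback pending_cb;
    void              *pending_user_data;
} Device;

/* Returns TRUE for synchronous completion (callback not invoked), or
 * DEFAULT after storing the callback for later. */
Ternary
device_attach_port(Device *dev, AttachPortCallback cb, void *user_data)
{
    assert(dev != NULL);

    if (!dev->attach_is_async)
        return TERNARY_TRUE;

    dev->pending_cb        = cb;
    dev->pending_user_data = user_data;
    return TERNARY_DEFAULT;
}

/* Called once the asynchronous operation finishes. */
void
device_attach_port_complete(Device *dev, bool success)
{
    AttachPortCallback cb;

    assert(dev != NULL && dev->pending_cb != NULL);

    cb              = dev->pending_cb;
    dev->pending_cb = NULL;
    cb(success, dev->pending_user_data);
}

/* Example callback: records the result in an int provided by the caller. */
void
record_result(bool success, void *user_data)
{
    *(int *) user_data = success ? 1 : 0;
}
```

A caller would proceed immediately on TRUE, fail on FALSE, and on DEFAULT wait for the callback before continuing (for example, before starting stage3 for the port).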
DHCP leases for a given interface are already exported on D-Bus
through DHCP4Config and DHCP6Config objects. It is useful to have the
same information also available on the filesystem so that it can be
easily used by scripts.
NM already saves some information about DHCP leases in /var, however
that directory can only be accessed by root, for good reasons.
Append lease options to the existing state file
/run/NetworkManager/devices/$ifindex. Contrary to /var this directory
is not persistent, but it seems more correct to expose the lease only
when it is active and not after it expired or after a reboot.
Since the file is in keyfile format, we add new [dhcp4] and [dhcp6]
sections; however, since some options have the same name for DHCPv4
and DHCPv6, we add a "dhcp4." or "dhcp6." prefix to make the parsing
by scripts (e.g. via "grep") easier.
The option name is the same we use on D-Bus. Since some DHCPv6 options
also have a "dhcp6_" prefix, the key name can contain "dhcp6" twice.
The new sections look like this:
[dhcp4]
dhcp4.broadcast_address=172.25.1.255
dhcp4.dhcp_lease_time=120
dhcp4.dhcp_server_identifier=172.25.1.4
dhcp4.domain_name_servers=172.25.1.4
dhcp4.domain_search=example.com
dhcp4.expiry=1641214444
dhcp4.ip_address=172.25.1.182
dhcp4.next_server=172.25.1.4
dhcp4.routers=172.25.1.4
dhcp4.subnet_mask=255.255.255.0
[dhcp6]
dhcp6.dhcp6_name_servers=fd01::1
dhcp6.dhcp6_ntp_servers=ntp.example.com
dhcp6.ip6_address=fd01::1aa
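Because every key carries its "dhcp4." or "dhcp6." prefix, a consumer can find a value by matching the key alone, without tracking which section it is in. The following is a minimal sketch of such a lookup over the file contents. The function name is hypothetical and it is not part of NetworkManager. A real consumer might rather use a keyfile parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Look up "key=value" in keyfile-style text, ignoring section headers.
 * Copies the value into out (truncating if necessary) and returns true
 * if the key was found. */
bool
state_file_lookup(const char *content, const char *key, char *out, size_t outlen)
{
    size_t      keylen;
    const char *line;

    assert(content != NULL && key != NULL && out != NULL && outlen > 0);
    keylen = strlen(key);

    for (line = content; *line != '\0';) {
        const char *eol     = strchr(line, '\n');
        size_t      linelen = eol ? (size_t) (eol - line) : strlen(line);

        if (linelen > keylen && strncmp(line, key, keylen) == 0
            && line[keylen] == '=') {
            size_t vallen = linelen - keylen - 1;

            if (vallen >= outlen)
                vallen = outlen - 1;
            memcpy(out, line + keylen + 1, vallen);
            out[vallen] = '\0';
            return true;
        }

        /* advance to the start of the next line */
        line += linelen;
        if (*line == '\n')
            line++;
    }
    return false;
}
```

Looking up "dhcp4.ip_address" in the example above would yield "172.25.1.182", regardless of where the [dhcp4] section header appears.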
Instead of logging the event-id, which is composed from options that
are already visible in the log, it's more interesting to log that the
lease was merged.
In practice there is little difference.
Previously, "strbuf" would own the string until the end of the function,
when the "nm_auto_str_buf" cleanup attribute destroys it. In the
meantime, we would pass it on to _fw_nft_call_sync(), which in fact
won't access the string after returning.
Instead, we can just transfer ownership to the GBytes instance. That seems
more logical and safer than aliasing the buffer owned by NMStrBuf with
a g_bytes_new_static(). That way, we don't add a non-obvious restriction
on the lifetime of the string. The lifetime is now guarded by the GBytes
instance, which could be referenced and kept alive longer.
There is also no runtime/memory overhead in doing this.
ASSUME causes more trouble than the benefits it provides. This patch
drops NM_L3_CFG_COMMIT_TYPE_ASSUME and assume_config_once. NML3Cfg
will commit as if the sys-iface-state is MANAGED.
This patch is part of the effort to remove ASSUME from NetworkManager.
Once ASSUME is dropped, NetworkManager will take full control of the
interface on startup, re-configuring it. The interface will be
managed from the start instead of assumed and then managed.
This will solve the situations where an interface is half-up and then a
restart happens. When NetworkManager is back it won't add the missing
addresses (which is what assume does) so the interface will fail during
the activation and will require a full activation.
https://bugzilla.redhat.com/show_bug.cgi?id=2050216
https://bugzilla.redhat.com/show_bug.cgi?id=2077605
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1196
When attaching a bond port, kernel will reset the MTU of the port ([1],
[2]). Configuring a different MTU on the port seems not a sensible
thing for the user to do.
Still, before commit e67ddd826f ('device: commit MTU during stage2')
we would first attach the bond port before setting the MTU. That
changed, and now the MTU set by kernel wins.
Btw, this change in behavior happens because we attach the port in
stage3 (ip-config), which seems an ugly thing to do.
Anyway, fix this by setting the MTU after attaching the ports, but still
in stage3.
It is probably not sensible for the user to configure a different MTU.
Still, if the user requested it by configuration, we should apply it.
Note that NetworkManager has some logic to constrain the MTU based on
the parent/child and controller/port. In many regards however, NetworkManager
does not fully understand or enforce the correct MTU and relies on the
user to configure it correctly. After all, if the user misconfigures the
MTU, the setup will have problems anyway (and in many cases neither
kernel nor NetworkManager could know that the configuration is wrong).
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/bonding/bond_main.c?h=v5.17#n3603
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/bonding/bond_main.c?h=v5.17#n4372
https://bugzilla.redhat.com/show_bug.cgi?id=2071985
Fixes: e67ddd826f ('device: commit MTU during stage2')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1199
On glibc, HOST_NAME_MAX is defined as 64. Also, Linux'
sethostname() enforces that limit (__NEW_UTS_LEN). Also,
`man gethostname` comments that HOST_NAME_MAX on Linux is
64.
However, when building against musl, HOST_NAME_MAX is defined as 255.
That seems wrong. We use this limit to validate the hostname, and that
should not depend on the libc or on the compilation.
Hardcode the value to 64.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1197
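A minimal sketch of a length check against a fixed limit, independent of the libc's HOST_NAME_MAX. The macro and function names are hypothetical, and a real validator would also check the characters of the hostname.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Fixed limit, deliberately not derived from HOST_NAME_MAX. */
#define HOSTNAME_MAX_LEN 64

/* Length check only. Returns true for 1 to HOSTNAME_MAX_LEN characters. */
bool
hostname_length_is_valid(const char *hostname)
{
    size_t len;

    assert(hostname != NULL);

    len = strlen(hostname);
    return len > 0 && len <= HOSTNAME_MAX_LEN;
}
```

With the constant hardcoded, the same hostname is accepted or rejected regardless of whether the binary was built against glibc or musl.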
This was only for unit testing, to check whether our reader
for "/etc/machine-id" agrees with systemd's.
That unit test was anyway flawed, because it actually accesses
the machine-id on the test system.
Anyway. Drop this. Most likely our parser is good enough, and
if we get a bug report with a defect, we can unit test against
that.
The goal would be to ensure that a device cannot move to activated,
while a DNS update is still pending.
This does not really work for most cases. That is because NMDevice does
not directly push DNS updates to NMDnsManager. Instead, NMPolicy watches
all device changes and pushes the updates. But the moment when NMPolicy
decides to do that may not be the right one.
We really should let NMDevice (or better, NML3Cfg) directly talk to
NMDnsManager. Why not? They have all the information when new DNS
configuration is available. The only thing that NMPolicy does on top of
that, is determining which device has the best default route. NMPolicy
could continue to do that (or maybe NMDnsManager could), but the update
needs to be directly triggered by NMDevice/NML3Cfg.
nm_dns_manager_get() is already a singleton. So users usually
can just get it whenever they need -- except during shutdown
after the singleton was destroyed. This is usually fine, because
users really should not try to get it late during shutdown.
However, if you subscribe a signal handler on the singleton, then you
will also eventually want to unsubscribe it. While the moment when you
subscribe it is clearly not during late-shutdown, it's not clear how
to ensure that the signal listener gets destroyed before the DNS manager
singleton.
So usually, whenever you are going to subscribe a signal, you need to
make sure that the target object stays alive long enough. Which may
mean to keep a reference to it.
Next, we will have NMDevice subscribe to the singleton. With the above said,
that would mean that potentially every NMDevice needs to keep a
reference to the NMDnsManager. That is not ideal. Also, later NMManager
will face the same problem, because it will also subscribe to
NMDnsManager.
So, instead let NMManager own a reference to the NMDnsManager. This
ensures the lifetimes are properly guarded (NMDevice also references
NMManager already).
Also, access nm_dns_manager_get() lazily on first use, to only initialize
it when it is first needed (which might be quite late).
For example, if you have a dnsmasq service running and bound to port 53, then
NetworkManager's [main].dns=dnsmasq will fail to start. And we keep retrying
to start it. But then update pending would hang indefinitely, and devices could
not become active. That must not happen.
Give the DNS update only 5 seconds. If it's not done by then, assume we
have a problem and unblock.
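The timeout logic can be sketched as a pure function over timestamps. The struct and function names are hypothetical, and the sketch ignores how the timer is actually scheduled. It only shows the decision of when a pending DNS update stops blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_UPDATE_PENDING_TIMEOUT_MSEC 5000 /* 5 seconds */

/* Hypothetical tracking state for one in-flight DNS update. */
typedef struct {
    bool    pending;
    int64_t started_msec; /* monotonic timestamp when the update began */
} DnsUpdate;

/* Returns true while the update should still block device activation. */
bool
dns_update_is_blocking(const DnsUpdate *upd, int64_t now_msec)
{
    assert(upd != NULL);

    if (!upd->pending)
        return false;

    /* Stop waiting once the timeout has elapsed, even if still pending. */
    return (now_msec - upd->started_msec) < DNS_UPDATE_PENDING_TIMEOUT_MSEC;
}
```

An update that never completes, such as the dnsmasq case above, therefore stops blocking once the 5 seconds have elapsed.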