When the attach_port()/detach_port() methods do not return immediately
(currently, only for OVS ports), the following situation can arise:
- nm_device_controller_attach_port() starts the attachment by sending
the command to ovsdb. Note that here we don't set
`PortInfo->port_is_attached` to TRUE yet; that happens only after
the asynchronous command returns;
- the activation of the port gets interrupted because the connection
is deleted;
- the port device enters the deactivating state, triggering function
port_state_changed()
- the function calls nm_device_controller_release_port() which checks
whether the port is already attached; since
`PortInfo->port_is_attached` is not set yet, it assumes the port
doesn't need to be detached;
- in the meantime, the ovsdb operation succeeds. As a consequence,
the kernel link is created even if the connection no longer exists.
Fix this by turning `port_is_attached` into a tri-state variable that
also tracks when the port is attaching. When it is, we need to perform
an explicit detach during deactivation.
Fixes: 9fcbc6b37d ('device: make attach_port() asynchronous')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2043
Resolves: https://issues.redhat.com/browse/RHEL-58026
This patch add support to IPVLAN interface. IPVLAN is a driver for a
virtual network device that can be used in container environment to
access the host network. IPVLAN exposes a single MAC address to the
external network regardless the number of IPVLAN device created inside
the host network. This means that a user can have multiple IPVLAN
devices in multiple containers and the corresponding switch reads a
single MAC address. IPVLAN driver is useful when the local switch
imposes constraints on the total number of MAC addresses that it can
manage.
When using the netdev datapath, we wait for the link to appear in
different steps:
1. initially, in act_stage3_ip_config() connects to platform's
"link-changed" signal to detect when the TUN interface appears;
2. when the interface appears, _netdev_tun_link_cb() schedules
_set_ip_ifindex_tun() in a idle handler;
3. _set_ip_ifindex_tun() checks if the link is ready (e.g. if the MAC
address is correct) and in that case it reschedules stage3, which
will move forward with the activation;
4. if the link is not ready in _set_ip_ifindex_tun(), the function
connects again to platform's "link-changed" signal to react to link
changes;
5. after the link changes and it is ready, _netdev_tun_link_cb()
reschedules stage3, which moves forward with the activation;
With the current implementation it is possible that after step 2, if
act_stage3_ip_config() runs because it was already scheduled, it
registers again to the "link-changed" event; then when
_set_ip_ifindex_tun() is invoked it will hit assertion:
nm_assert(!priv->wait_link.tun_link_signal_id);
Fix this by preventing that the signal gets registered again after
step 2.
Fixes-test: @ovs_datapath_type_netdev_with_cloned_mac
Fixes: acf485196c ('ovs-interface: wait that the cloned MAC changes instead of setting it')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2024
The string array returned by nm_l3_config_data_get_searches() is not
NULL-terminated; we need to pass the exact length to
nm_utils_buf_utf8safe_escape_strv() instead of letting the function
scan for the NULL terminator.
Fix the following error reported by valgrind:
Conditional jump or move depends on uninitialised value(s)
at 0x4B287DB: g_strv_length (gstrfuncs.c:2948)
by 0x6EBDBE: nm_utils_buf_utf8safe_escape_strv (nm-shared-utils.c:3047)
by 0x59A3F1: get_property_ip (nm-ip-config.c:198)
by 0x4A6E150: UnknownInlinedFun (gobject.c:2140)
by 0x4A6E150: g_object_get_property (gobject.c:3454)
by 0x56FB1A: nm_dbus_utils_get_property (nm-dbus-utils.c:95)
by 0x44B343: _obj_get_property (nm-dbus-manager.c:880)
by 0x44DC4F: _nm_dbus_manager_obj_notify (nm-dbus-manager.c:1201)
by 0x56EE77: dispatch_properties_changed (nm-dbus-object.c:253)
by 0x4A5BF1E: g_object_notify_queue_thaw.lto_priv.0 (gobject.c:755)
by 0x5997BD: _handle_l3cd_changed (nm-ip-config.c:837)
by 0x59A129: _l3cfg_notify_cb (nm-ip-config.c:147)
by 0x4A5B649: g_closure_invoke (gclosure.c:834)
Fixes: 522a7d6baf ('nm-ip-config: escape searches when exposing to dbus')
Previously, when a connection was configured with search domains
that contained non-ASCII characters, GLib would try to parse the
search name as UTF-8, and an assertion would fail (which meant
that if NM was running with fatal assertions, it would crash).
Expose the search domains only as an escaped string to avoid this.
When a connection with ipv4.method=auto (DHCP) is configured with
ipv4.link-local=enable we were leaving the link-local address forever,
but this is not correct according to RFC3927[1] which says:
a host SHOULD NOT have both an operable routable address and an IPv4
Link-Local address configured on the same interface.
This adds a new mode that is more compliant, which only sets an IPv4
link-local address if no other address is set (through either DHCP lease
or ivp4.addresses setting)
Closes#1562
Link: https://github.com/systemd/systemd/issues/13316
Link: https://datatracker.ietf.org/doc/html/rfc3927#section-1.9 [1]
Add a new l3cfg DatFlag to specify that a given l3cd has a
non-link-local IPv4 set.
This will be used to enable or disable IPv4LL automatically in fallback
mode.
Move the static _ip4_address_is_link_local() check to a new global
nm_platform_ip4_address_is_link_local() helper so we can check if
an IPv4 is link local in other files
When lease is lost we would keep the DHCP state to READY, but we are
trying to get a new lease at this point so it is closer to PENDING.
Note this does not change how the device is displayed in `nmcli device`,
a connection with an expired lease is still displayed as `connected`.
When handling event TIMEOUT, "acd_data->probing_timeout_msec" needs to
be always initialized before jumping to "handle_start_probing:";
otherwise, an assertion failure is triggered at:
static void
_l3_acd_data_timeout_schedule_probing_restart(AcdData *acd_data, gint64 now_msec)
{
...
nm_assert(acd_data->probing_timeout_msec > 0);
Even if the ACD data is already in state PROBE, that doesn't mean that
the timeout is already initialized because the PROBE state can also be
reached from a INSTANCE_RESET event; and depending on the previous
state "acd_data->probing_timeout_msec" could be uninitialized.
Fixes-test: @iptunnel_restart
Fixes: b8f9d7b5dd ('l3cfg: rework ACD handling in NML3Cfg to support handling conflicts')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2023
Managed type = managed is a bit unclear, because all managed types are
for devices that are managed, but with different levels. Managed type =
managed could be interpreted as other types are unmanaged. Change it to
managed type = full.
Don't enforce IP cleanup when devices are in deactivating state, to
make sure that network connection is still available for pre-down
dispatcher phase.
Fixes ac4e63ddda ('ip: support dhcp-send-release in NMSettingIpConfig')
https://bugzilla.suse.com/show_bug.cgi?id=1228154
Currently if the system hostname can't be determined, NetworkManager
only retries when something changes: a new address is added, the DHCP
lease changes, etc.
However, it might happen that the current failure in looking up the
hostname is caused by an external factor, like a temporary outage of
the DNS server.
Add a mechanism to retry the resolution with an increasing timeout.
https://issues.redhat.com/browse/RHEL-17972
For now, always reapply the VLANs unconditionally, even if they didn't
change in kernel.
To set again the VLANs on the port we need to clear all the existing
one before. However, this deletes also the VLAN for the default-pvid
on the bridge. Therefore, we need some additional logic to inject the
default-pvid in the list of VLANs.
Co-authored-by: Íñigo Huguet <ihuguet@redhat.com>
Currently, nm_platform_link_set_bridge_vlans() accepts an array of
pointers to vlan objects; to avoid multiple allocations,
setting_vlans_to_platform() creates the array by piggybacking the
actual data after the pointers array.
In the next commits, the array will need to be manipulated and
extended, which is difficult with the current structure. Instead, pass
separately an array of objects and its size.
During nm_lldp_neighbor_parse(), the NMLldpNeighbor is not yet added to
the NMLldpRX instance. Consequently, n->lldp_rx is NULL.
Note how we use lldp_x for logging, because we need it for the context
for which interface the logging statement is.
Thus, those debug logging statements will follow a NULL pointer and lead
to a crash.
Fixes: 630de288d2 ('lldp: add libnm-lldp as fork of systemd's sd_lldp_rx')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1550
As part of the conscious language effort we must provide an alternative
option to configure autoconnect-ports system-wide on NetworkManager
configuration file.
The in_buffer is initialized with a NULL buffer. If we never receive and
data, the buffer might still be NULL.
Maybe it actually can never happen, but it's not clear that this is
always the case. To be sure, ensure we don't return a NULL buffer on
success.
It is possible that we learn the link is ready on stage3_ip_config
rather than in link_changed event due to a stage3_ip_config scheduled by
another component. In such cases, we proceed with IP configuration
without allocating the resources needed like initializing DHCP client.
In order to avoid that, if we learn during stage3_ip_config that the
link is now ready, we need to schedule another stage3_ip_config to
allocate the resources we might need.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2004
Fixes: 83bf7a8cdb ('ovs: wait for the link to be ready before activating')
If we add multiple default routes with the same metric and different
preferences, kernel merges them into a single ECMP route, with overall
preference equal to the preference of the first route
added. Therefore, the preference of individual routes is not
respected.
To avoid that, add routes with different metrics if they have
different preferences, so that they are not merged together.
We could configure only the route(s) with highest preference ignoring
the others, and the effect would be the same. However, it is better to
add all routes so that users can easily see from "ip route" that there
are multiple routers available.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1468https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1983
Fixes: 032b4e4371 ('core: use router preference for IPv6 routes')
When activating an ovs-interface we already wait for the cloned MAC
address to be set, ifindex is present and platform link also present but
in some cases this is not enough.
If an udev rule is in place it might modify the interface when it is in
a later stage of the activation causing some race conditions or
problems. In order to solve that, we must wait until the link is fully
initialized.
We are currently asserting that the list of devices waiting for
auto-activation in NMPolicy is not empty. This condition is always
false because:
- NMDevice holds a reference to NMManager
- NMManager holds a reference to NMPolicy
- on dispose, NMDevice asserts that it's not in NMPolicy's
auto-activate list
Therefore if there is any NMDevice alive, NMPolicy must be alive as
well. Instead, if there is no NMDevice alive the list must be empty.
The assertion could fail only when the NMPolicy instance gets
disposed, which usually doesn't happen because it's still referenced
at shutdown.
Fixes: aede228974 ('core: assert that devices are not registered when disposing NMPolicy')
When activating a port with its controller deactivating by new
activation, NM will register `state-change` signal waiting controller to
have new active connections. Once controller got new active connection,
the port will invoke `nm_active_connection_set_controller()` which lead
to assert error on
g_return_if_fail(!nm_dbus_object_is_exported(NM_DBUS_OBJECT(self)))
because this active connection is already exposed as DBUS object.
To fix the problem, we remove the restriction on controller been
write-only and notify DBUS object changes for controller property.
Signed-off-by: Gris Ge <fge@redhat.com>
It's deprecated. Warn any time it is being considered for loading,
instead of at load time, so that the user gets a warning when they got
the plugin in configuration, even if it's build time disabled.
https://issues.redhat.com/browse/RHEL-24622
Prioritize internal, which is what most people should be using. Try
dhclient last, so that it's not attempted when not explicitly
configured or everything else fails.
Before introducing the hostname lookup via nm-daemon-helper and
systemd-resolved, we used GLib's GResolver which internally relies on
the libc resolver and generally also returns results from /etc/hosts.
With the new mechanism we only ask to systemd-resolved (with
NO_SYNTHESIZE) or perform the lookup via the "dns" NSS module. In both
ways, /etc/hosts is not evaluated.
Since users relied on having the hostname resolved via /etc/hosts,
restore that behavior. Now, after trying the resolution via
systemd-resolved and the "dns" NSS module, we also try via the "files"
NSS module which reads /etc/hosts.
Fixes: 27eae4043b ('device: add a nm_device_resolve_address()')
When ModemManager become available, NetworkManager resets
GDBusObjectManagerClient object.
But there is a race condition if object-added is emitted before
modm_ensure_manager(), we need to check existing objects if we want to be
in sync with ModemManager.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1957
Also check for gateway equality when deduplicate routing entries. This
allows to support multiple routes to the same network using different
gateways. This is useful for Thread networks where multiple BRs route
to the same Thread network. If one of these BRs go offline, fallback to
a different router will be much quicker if multiple entries are present.
Note that quick fallback to a different router requires IPv6
reachability probe to be active. Typically Linux disables reachability
probes on Linux machines which act as IPv6 gateway (when forwarding is
enabled).
When the lease expires, the DHCP client emits a LEASE_UPDATE event
with a NULL l3cd. After returning from the handler, it sends
immediately a DHCP DISCOVER message to try to get a new lease.
It is important that when the DISCOVER gets sent the address is no
longer configured on the interface. Otherwise, the server could see
that it is already in use and assign a different one. Therefore,
remove the address synchronously when handling the event.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1532
Currently, when the agent manager is sent a registration request
containing UTF-8 characters, it will form an invalid error message
using only one of the bytes from the UTF-8 sequence, which causes
an assertion in glib to fail, which replaces the returned error message
with "[Invalid UTF-8]". It will also print an assertion failure to the
console, or crash NetworkManager on non-release builds.
This commit makes it so that it instead prints out the character in
hexadecimal form if it isn't normally printable, so that it is once
again a valid UTF-8 string.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1965
Fixes: a30cf19858 ('agent: add agent manager and minimal agent class')
Adds an option in the connectivity section to change the timeout before
the interface is deemed "limited". Previously, it was hardcoded to
20 seconds, but for our usecase (failing over to cell modem if
hardwired ethernet drops), it's nice to be able to failover to another
interface more quickly.
The OVS interface can be matched via MAC address; in that case, the
"connection.interface-name" property of the connection is empty.
When populating the ovsdb, we need to pass the actual interface name
from the device, not the one from the connection.
Fixes: 830a5a14cb ('device: add support for OpenVSwitch devices')
https://issues.redhat.com/browse/RHEL-34617
The group interface is only used during activation; there is no need
to add a pending action for it, because when the device is in
activating state it already delays "startup-complete" via other
pending actions.