It seems more useful to have a best effort approach and configure
everything we can; in that way we achieve at least some connectivity,
and then sysadmin can check the logs in case something is
missing. Currently instead, the whole activation fails (so, no address
is configured) if just one of the addresses fails DAD.
Ideally, we should have a way to make this configurable; but for now,
implement the more useful behavior as default.
IPv4 and IPv6 DAD work slightly differently: for IPv4 the presence or
absence of carrier doesn't have any effect on the duration of the
probe; for IPv6, DAD never completes without carrier because kernel
never removes the tentative flag.
In both cases, we shouldn't ignore the DAD result because that would
mean that we complete the ipmanual method without addresses actually
configured.
Currently, IPv4 shared mode fails to start when DAD is enabled because
dnsmasq tries to bind to an address that is not yet configured on the
interface. Delay the start of dnsmasq until the shared4 l3cd is ready.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
The condition in `_get_maybe_ipv6_disabled()` is improperly set which
returns the wrong value on if an device is disabled or not when
generating the assume connection. And when
`/proc/sys/net/ipv6/conf/$DEV/disable_ipv6` is not existed (not
disabling ipv6 through sysctl setting), IPv6 is disabled by default.
Fixes: be655e6ed1 ('core: read "disable_ipv6" sysctl before nm_ip6_config_create_setting()')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1743
On very particular timing, if a connection is currently activating
on a modem device and user remove the remote settings associated
an device state change:
prepare -> deactivating (reason 'connection-removed', sys-iface-state: 'managed')
pops before entering into modem_prepare_result, resulting to a crash
on assertion.
We can simply check for the modem state to failed, set the success flag
to FALSE and continue.
Closes: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1354
Signed-off-by: Frederic Martinsons <frederic.martinsons@unabiz.com>
l3cfg emits a log for ACD conflicts. However, l3cfg is not aware of
what are the related NMDevice or the currently active connection, and
so it can't log the proper metadata fields (NM_DEVICE and
NM_CONNECTION) to the journal.
Instead, let NMDevice log about ACD collisions; in this way, it is
possible to get the message when filtering by device and connection.
For example:
$ journalctl -e NM_CONNECTION=d1df47be-721f-472d-a1bf-51815ac7ec3d + NM_DEVICE=veth0
<info> device (veth0): IP address 172.25.42.1 cannot be configured because it is already in use in the network by host 00:99:88:77:66:55
<info> device (veth0): state change: ip-config -> failed (reason 'ip-config-unavailable', sys-iface-state: 'managed')
<warn> device (veth0): Activation: failed for connection 'veth0+'
Parse the access point announced bandwidth in MHz. This is considering
both HT and VHT. Please notice that for VHT 80+80 MHz we are representing it
as 160 MHz.
Software devices that are controllers like bond/bridge/team when
configured to not ignore carrier are being deleted when deactivating the
device. Software devices that are not controllers, shouldn't be deleted.
Otherwise, if a VLAN link is deleted because the ethernet carrier-change
then NetworkManager won't be able to reactivate the VLAN once the
ethernet gets carrier because the link is not present.
This is restoring the previous behaviour and it's know to be relied on
by users.
https://bugzilla.redhat.com/show_bug.cgi?id=2224479https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1701
Fixes: efa63aef3a ('device: delete software device when software devices lose carrier')
The `nm_device_hw_addr_reset()` should only set MAC address on NIC
with valid(>0) interface index.
The failure was found by `ovs_mtu` test of NMCI, failed to reproduce
the original problem (`ovs_mtu` test of NMCI) with 100 times retry.
And no trace log found for original test failure, hence cannot tell why
`nm_device_hw_addr_reset()` been invoked with iface index 0.
Signed-off-by: Gris Ge <fge@redhat.com>
We delete devices when the connection goes down and NetworkManager
created the device earlier.
Software devices like bond/bridge/team default to ignoring carrier.
However, when configuring them to not ignore carrier
([device].ignore-carrier), they were not deleted when deactivating the
devices.
This adjusts commit d0c2a24b71 ('device: do not remove software devices
on initial disconnected (rh #1035814)'). Note that back then there was
no check whether the device has an activation queued, so it behaved
differently then.
When the software device enters the UNAVAILABLE state from UNMANAGED,
during cleanup we shouldn't delete the link.
Co-Authored-By: Beniamino Galvani <bgalvani@redhat.com>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1686
When user are changing SR-IOV VF settings for options like `max-tx-rate`
which some hardware not supported yet, the failure of this VF will fail
the whole activation, then the SR-IOV will be disabled means all the VFs
will be deleted.
Deleting VFs might break network connectivity and this collateral
damage of VF option failure is not acceptable for OpenShift use cases
even they have checkpoint protection.
This patch only log warn message on failure of VF options and will not
fail the activation.
NetworkManager also ignore MTU failure during activation, I believe this
fit into the same assumption.
User case reference: https://bugzilla.redhat.com/show_bug.cgi?id=2210164
Signed-off-by: Gris Ge <fge@redhat.com>
The macro uses g_alloca(). Using alloca() is potentially dangerous. For
example, it must never be used in an unbounded loop. This should be
immediately obvious from the name, so we don't accidentally use them
in the wrong context.
All other alloca() macros should have such a prefix already. And they
always have to be macros, because you couldn't use alloca() to return
memory from a function.
By default, bond/bridge/team devices ignore carrier, and so do their
ports. However, it can make sense to set '[device*].ignore-carrier' for
the controller device. Meaningfully support that.
This is a follow up to commit 8c91422954 ('device: handle carrier
changes for master device differently'), which didn't fully solve the
problem.
What already works, is that when you set ignore-carrier for the
controller, then after loss of carrier and a carrier wait timeout, the
controller and ports go down. If both the controller and port profiles
have autoconnect disabled, they stay down and that's it. It works as
expected, but is not very useful, because when we want to automatically
react on carrier loss, we also want to automatically reconnect.
For controller profiles, carrier only makes sense when ports are
attached. However, we can (auto) activate controller profiles without
ports. So when the user enables autoconnect for the controller profile,
then the profile will eagerly reconnect. That means, after loss of
carrier, the device goes down and reconnects right away. It means, when
configuring a bond with ignore-carrier=no and autoconnect=yes, then
the sensible thing happens (an immediate reconnect). That is just not
a useful configuration.
The useful way to configure configure ignore-carrier=no for a controller
device, autoconnect on the master must be disabled while being enabled
on the ports. After all, it's the ports that will autoconnect based on
the carrier state and bring up the controller with them.
Note that at the moment when a port decides to autoconnect, the
controller profile is not yet selected. That only happens later during
_internal_activate_device() after searching it with find_master(). At
that point, the port profile checks whether it should autoconnect based
on its own carrier state, and abort if not.
If autoconnect is aborted due to lack of carrier, the profile gets
blocked from autoconnect with reason "failed". Hence, when the carrier
returns, we need to clear any "failed" blocked reasons and schedule
another autoconnect check,
Note that this really only works if the port is itself a simple device,
like an ethernet. If the port is itself a software device (like a bond,
or a VLAN), then the carrier state in _internal_activate_device() is
unknown, and we cannot avoid autoconnect. It's unclear how that could
make sense, if at all.
This setup can be combined with "connection.autoconnect-slaves=yes". In
that case, we have the first port to autoconnect when they get carrier,
bringing up the controller too. Usually the other ports that don't have
carrier would not autoconnect, but with autoconnect-slaves they will.
The effect is, that we autoconnect whenever any of the ports has
carrier, and then we immediately also bring up the ports that don't have
carrier (which we usually would not).
https://bugzilla.redhat.com/show_bug.cgi?id=2156684https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1658
Struct allow named arguments, which seems easier to maintain instead of
a function with many arguments. Also, adding a new parameter does not
require changes to most of the callers.
The real advantage of this is that we encode all the search parameters
in one argument. And we can add that argument to
_match_section_infos_lookup(), alongside lookup by NMDevice or
NMPlatformLink.
All callers eventually want a boolean instead of a NMMatchSpecMatchType.
I think the NMMatchSpecMatchType enum still has value at the lower
layers, where the enum values are clearer (when reading the code). So
don't drop NMMatchSpecMatchType entirely.
However, let's add nm_match_spec_match_type_to_bool() to convert the
match-type to a boolean to avoid duplicating the code.
Arguably, a kernel link is needed for DHCP and so the interface name
univocally identifies a device (for example, the OVS interface). But
for consistency and clarity, store the device type to be used for
logging.
When logging, messages include the interface name to specify what
device they refer to. In most case the interface name is unique.
There are some devices that don't have a kernel link associated, and
their interface name is not guaranteed to be unique. This is currently
the case for OVS bridges and OVS ports. When reading a log with
duplicate interface names, it is difficult to understand what is
happening. And this is made worse by the fact that it is common
practice to assign the same name to all devices in a OVS hierarchy
(bridge, port, interface).
To make logs unambiguous, we want to print the device type together
with the name; however we don't want to *always* print the type
because in most cases it's not useful and it would consume valuable
real estate on the screen. Adopt a simple heuristic of showing the
type only for OVS devices.
This commit adds a helper function to return the device type to show
in logs, when it is needed.
It's not obvious, why we couldn't have a pending dever action
at that point. Maybe we cannot, but just to be explicit about it,
handle that we potentially might.
For example, we tend to schedule the timeout priv->carrier_defer_source
only from within nm_device_set_carrier() if `priv->carrier` is FALSE.
At the same time, nm_device_set_carrier() does nothing `if
(priv->carrier == carrier)`. So probably there is no problem.
However, we also set priv->carrier directly in
nm_device_set_carrier_from_platform() without clearing the timer. It's
hard to imagine whether there can be a case where we might have two
timeouts pending.
commit_option() was used in the past to set both bridge and bridge port
options using sysfs. Currently it is only used for bridge port options.
This patch removes the dead code for bridge options and unify it on
commit_port_options(). This is simplifying the work needed to support
bridge port option through netlink.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1643
When a fixed address is assigned by the P2P group owner, then the code
would set the IPv4 configuration method to DISABLED internally. However,
this causes issues, because it means that IPv4 is considered to not have
come up internally which can cause the connection to later time out even
though it was configured properly.
As such, map this method to MANUAL in this case. The AUTO mapping
becomes then:
* MANUAL: If the remote part is the GO and assigned an IP address
* DHCP: If the remote part is the GO and did not assign an address
* SHARED: If we are the GO
This fixes an issue where the connection established by GNOME Network
Displays would fail once IPv6 configuration also times out.
See-also: https://gitlab.gnome.org/GNOME/gnome-network-displays/-/issues/279https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1636
This is the version shipped in Fedora 38. As Fedora 38 is now out, the
core developers switch to it. Our gitlab-ci will also use that as base
image for the check-{patch.tree} tests and to generate the pages. There
is a need that everybody agrees on which clang-format version to use,
and that version should be the one of the currently used Fedora release.
Also update the used Fedora image in "contrib/scripts/nm-code-format-container.sh"
script.
The gitlab-ci still needs update in the following commit. This change
in isolation will break the "check-tree" test.
In constructed(), NMDevice starts watching the D-Bus name owner or
monitoring the unix socket, and so it is always aware if teamd is
running. When it is, NMDevice connects to it and initializes
priv->tdc.
It is not useful to try to connect to teamd in update_connection()
because warnings will be generated by NM and by libteam if teamd is
not running. As explained above the connection is always initialized
when teamd is available, and so we can just check priv->tdc.
Fixes: ab586236e3 ('core: implement update_connection() for Team')
https://bugzilla.redhat.com/show_bug.cgi?id=2182029https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1631
According to systemd, IPv6 forwarding is special anyway, and they only
enable forwarding for "net.ipv6.conf.all.forwarding" ([1]).
Since commit 46e63e03af ('device: announce the managed IPv6
configuration with ipv6.method=shared') we support "ipv6.method=shared"
and enable forwarding for IPv6, on the interface. Whether that makes
sense is questionable, given [1] and the claim that setting it
per-interface is not useful.
Anyway, since that change we always reset the "forwarding" sysctl to
zero, when we don't enable shared mode. That is not right, because the
user didn't explicitly ask for that (and there is no configuration
option like systemd-networkd's "IPForward=" setting to control that).
What we instead should do, not touch/reset the sysctl, unless we really
want to.
No longer set "forwarding" to zero by default. And only restore the
previous value (_dev_sysctl_save_ip6_properties()) if we actually
changed the value to "1".
[1] b8fba0cded/src/network/networkd-sysctl.c (L79)https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/923
Fixes: 46e63e03af ('device: announce the managed IPv6 configuration with ipv6.method=shared')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1616
Add per port priority support for bond active port re-selection during
failover. A higher number means a higher priority in selection. The
primary port still has the highest priority. This option is only
compatible with active-backup, balance-tlb and balance-alb modes.
sysfs is deprecated and kernel will not add new bond port options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
Before commit a42682d44f ('device: take reference to device object
before 'delete_on_deactivate''), we used a weak pointer to track the
idle action.
As we now use a strong reference, we can store all data about the idle
action in NMDevice itself. Drop DeleteOnDeactivateData.
Hook the information for tracking the activation of a device, to the
NMDevice itself. Sure, that slightly couples the NMPolicy closer to
NMDevice, but the result is still simpler code because we don't need a
separate ActivateData.
It also means we can immediately tell whether the auto activation check
for NMDevice is already scheduled and don't need to search through the
list.