When activating a port with its controller deactivating by new
activation, NM will register `state-change` signal waiting controller to
have new active connections. Once controller got new active connection,
the port will invoke `nm_active_connection_set_controller()` which lead
to assert error on
g_return_if_fail(!nm_dbus_object_is_exported(NM_DBUS_OBJECT(self)))
because this active connection is already exposed as DBUS object.
To fix the problem, we remove the restriction on controller been
write-only and notify DBUS object changes for controller property.
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit 83a2595970)
Connection timestamps are updated (saved to disk) on connection up and
down. This way, the last used connection will take precedence for
autoconnect if they have the same priority.
But as we don't actually do connection down when NM stops, the last
connection timestamp of all active connections is the timestamp of when
they were brought up. Then, the activation order might be wrong on next
start.
One case where timestamps are wrong (although it is not clear how
important it is because the connections are activated on different
interfaces):
1. Activate con1 <- timestamp updated
2. Activate con2 <- timestamp updated
3. Deactivate con2 <- timestamp updated
4. Stop NM <- timestamp of con2 is higher than con1, but con1 was still
active when con2 was brought down.
Other case that is reproducible (from
https://issues.redhat.com/browse/RHEL-35539):
1. Activate con1
2. Activate con2 on same interface:
- As a consequence con1 is deactivated and its timestamp updated
- The timestamp of con2 is also updated
3. Stop NM <- timestamp of con1 and con2 is the same, next activation
order will be undefined.
Fix by saving the timestamps on NM shutdown.
Resolves: https://issues.redhat.com/browse/RHEL-35539
(cherry picked from commit 4bf11b7d66)
Problem:
Given a OVS port with `autoconnect-ports` set to default or false,
when reactivation required for checkpoint rollback,
previous activated OVS interface will be in deactivate state after
checkpoint rollback.
The root cause:
The `activate_stage1_device_prepare()` will mark the device as
failed when controller is deactivating or deactivated.
In `activate_stage1_device_prepare()`, the controller device is
retrieved from NMActiveConnection, it will be NULL when NMActiveConnection
is in deactivated state. This will cause device been set to
`NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED` which prevent all follow
up `autoconnect` actions.
Fix:
When noticing controller is deactivating or deactivated with reason
`NM_DEVICE_STATE_REASON_NEW_ACTIVATION`, use new function
`nm_active_connection_set_controller_dev()` to wait on controller
device state between NM_DEVICE_STATE_PREPARE and
NM_DEVICE_STATE_ACTIVATED. After that, use existing
`nm_active_connection_set_controller()` to use new
NMActiveConnection of controller to move on.
Resolves: https://issues.redhat.com/browse/RHEL-31972
Signed-off-by: Gris Ge <fge@redhat.com>
(cherry picked from commit a68d2fd780)
Fix the following:
NetworkManager: file ../src/libnm-core-impl/nm-connection.c: line 321 (nm_connection_get_setting): should not be reached
NetworkManager.service: Main process exited, code=dumped, status=5/TRAP
Fixes: bd38a19832 ('connection: add support to down-on-poweroff')
While enumerating devices at startup, we take a snapshot of existing
links from platform and we start creating device instances for
them. It's possible that in the meantime, while processing netlink
events in platform_link_added(), a link gets renamed. If that happens,
then we have two different views of the same ifindex: the cached link
from `links` and the link in platform.
This can cause issues: in platform_link_added() we create the device
with the cached name; then in NMDevice's constructor(), we look up
from platform the ifindex for the given name. Because of the rename,
this lookup can match a newly created, different link.
The end result is that the ifindex from the initial snapshot doesn't
get a NMDevice and is not handled by NetworkManager.
Fix this problem by fetching the latest version of the link from
platform to make sure we have a consistent view of the state.
https://issues.redhat.com/browse/RHEL-25808https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1897
When creating VLAN over OVS internal interface which holding the same
name as its controller OVS bridge, NetworkManager will fail with error:
Error: Connection activation failed: br0.101 failed to create
resources: cannot retrieve ifindex of interface br0 (Open vSwitch
Bridge)
Expanded the `find_device_by_iface()` with additional argument
`child: NmConnection *` which will validate whether candidate is
suitable to be parent device.
In `nm_device_check_parent_connection_compatible()`, we only not allow OVS
bridge and OVS port being parent.
Resolves: https://issues.redhat.com/browse/RHEL-26753
Signed-off-by: Gris Ge <fge@redhat.com>
With `NM_CHECKPOINT_CREATE_FLAG_TRACK_INTERNAL_GLOBAL_DNS` flag set on
checkpoint creation, the checkpoint rollback will restore the
global DNS in internal configure file
`/var/lib/NetworkManager/NetworkManager-intern.conf`.
If user has set global DNS in /etc folder, this flag will not take any
effect.
Resolves: https://issues.redhat.com/browse/RHEL-23446
Signed-off-by: Gris Ge <fge@redhat.com>
The new option at NMSettingConnection allow the user to specify if the
connection needs to be down when powering off the system. This is useful
for IP address removal prior powering off. In order to accomplish that,
we listen on "Shutdown" systemd DBus signal.
The option is set to FALSE by default, it can be specified globally on
configuration file or per profile.
The code that is adding the devices to the sleeping list and taking them
down should be moved to a separated function. This way we can reuse it
and we avoid duplicating code.
In order to provide the NMSleepMonitor a more generic usage, let's
rename the whole module to NMPowerMonitor. Nothing is exposed to the API
so it is a trivial renaming.
If a generic device is present and the name matches, it is compatible
with any link type.
For example, if a generic connection has a device-handler that creates
a dummy interface, the link is compatible with the NMDeviceGeneric.
(cherry picked from commit 5978fb2b27)
When a generic connection has a custom device-handler, it always
generates a NMDeviceGeneric, even when the link that gets created is
of a type natively supported by NM. On service restart, we need to
keep track that the device is generic or otherwise a different device
type will be instantiated.
(cherry picked from commit f2613be150)
Mark the methods/properties deprecated in the D-Bus API (via
org.freedesktop.DBus.Introspectable.Introspect(), [1]).
It affects those properties that are documented as deprecated in
introspection XML.
$ busctl -j call \
org.freedesktop.NetworkManager \
/org/freedesktop/NetworkManager \
org.freedesktop.DBus.Introspectable \
Introspect | \
jq '.data[0]' -r | \
grep -5 Deprecated
[1] https://dbus.freedesktop.org/doc/dbus-specification.html#standard-interfaces-introspectable
This option was only introduced only to allow keeping the old behavior
in RHEL7, while the default order was changed from 'ifindex' to 'name'
in RHEL8. The usefulness of this option is questionable, as 'name'
together with predictable interface names should give predictable order.
When not using predictable interface names, the name is unpredictable
but so is the ifindex.
https://issues.redhat.com/browse/NMT-926https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1814
Remove all the code that was added for the CSME coexistence.
The Intel WiFi team can't commit on when, if at all, this feature will
be completely integrated and tested in the NetworkManager.
The preferred solution for now is the solution that involves the kernel
only.
Remove the code that was merged so far.
The device authentication request is an async process, it can not know
the answer right away, it is not guarantee that device is still
exported on D-Bus when authentication finishes. Thus, do not return
SUCCESS and abort the authentication request when device is not alive.
https://bugzilla.redhat.com/show_bug.cgi?id=2210271
When activating a port connection it will require the controller
connection is active or a valid controller device candidate is available
for activation.
One of the conditions we consider for a controller device to be a valid
candidate for the connection is that it is not active, therefore we
should also consider as valid a device that is currently deactivating.
Otherwise, we could fail during the port activation just because the
deactivation of the controller device candidate didn't finish yet.
https://bugzilla.redhat.com/show_bug.cgi?id=2125615https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1693
By default, bond/bridge/team devices ignore carrier, and so do their
ports. However, it can make sense to set '[device*].ignore-carrier' for
the controller device. Meaningfully support that.
This is a follow up to commit 8c91422954 ('device: handle carrier
changes for master device differently'), which didn't fully solve the
problem.
What already works, is that when you set ignore-carrier for the
controller, then after loss of carrier and a carrier wait timeout, the
controller and ports go down. If both the controller and port profiles
have autoconnect disabled, they stay down and that's it. It works as
expected, but is not very useful, because when we want to automatically
react on carrier loss, we also want to automatically reconnect.
For controller profiles, carrier only makes sense when ports are
attached. However, we can (auto) activate controller profiles without
ports. So when the user enables autoconnect for the controller profile,
then the profile will eagerly reconnect. That means, after loss of
carrier, the device goes down and reconnects right away. It means, when
configuring a bond with ignore-carrier=no and autoconnect=yes, then
the sensible thing happens (an immediate reconnect). That is just not
a useful configuration.
The useful way to configure configure ignore-carrier=no for a controller
device, autoconnect on the master must be disabled while being enabled
on the ports. After all, it's the ports that will autoconnect based on
the carrier state and bring up the controller with them.
Note that at the moment when a port decides to autoconnect, the
controller profile is not yet selected. That only happens later during
_internal_activate_device() after searching it with find_master(). At
that point, the port profile checks whether it should autoconnect based
on its own carrier state, and abort if not.
If autoconnect is aborted due to lack of carrier, the profile gets
blocked from autoconnect with reason "failed". Hence, when the carrier
returns, we need to clear any "failed" blocked reasons and schedule
another autoconnect check,
Note that this really only works if the port is itself a simple device,
like an ethernet. If the port is itself a software device (like a bond,
or a VLAN), then the carrier state in _internal_activate_device() is
unknown, and we cannot avoid autoconnect. It's unclear how that could
make sense, if at all.
This setup can be combined with "connection.autoconnect-slaves=yes". In
that case, we have the first port to autoconnect when they get carrier,
bringing up the controller too. Usually the other ports that don't have
carrier would not autoconnect, but with autoconnect-slaves they will.
The effect is, that we autoconnect whenever any of the ports has
carrier, and then we immediately also bring up the ports that don't have
carrier (which we usually would not).
https://bugzilla.redhat.com/show_bug.cgi?id=2156684https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1658
When managing a device after it is realized, we previously always set
the NOW_MANAGED reason, that makes the device fully-managed.
This works based on the assumption that initially an external device
has unmanaged flag EXTERNAL_DOWN set, and therefore the device stays
unmanaged during realization.
It is possible that an external device appears already with addresses
(or attached to a controller); we need to set reason
CONNECTION_ASSUMED if it's an external device, so that we don't set
sys-iface-state=managed.
Reproducer:
ip link add br1 type bridge
killall -STOP NetworkManager
ip link add dummy1 type dummy
ip link set dummy1 master br1
ip link set dummy1 up
sleep .5
killall -CONT NetworkManager
After this, dummy1 is fully managed by NM while it shouldn't.
https://bugzilla.redhat.com/show_bug.cgi?id=2149012
If there are ports that refer the controllers by a device name, and
multiple autoconnectable controller devices of that name, the
situation gets messy. In particular, the autoconnect logic can start
activating a device with a higher autoconnect priority, but then a port
can override it by bringing up another controller of possibly lower
autoconnect priority.
Let's
1.) prefer controller connections with higher autoconnect priority
and
2.) prefer connections that are already active so that we don't
disrupt existing activation.
https://bugzilla.redhat.com/show_bug.cgi?id=2121451
When managing the interface after wake/reenable, the reason determines
whether the device will be sys-iface-state=managed or external.
Commit 5a9a7623c5 ('core: set STATE_REASON_CONNECTION_ASSUMED when
waking up') changed the reason from 'now-managed' to
'connection-assumed'; the effect was that devices that were fully
managed before sleeping become external after a wake up. For example:
$ nmcli connection add type ethernet ifname enp1s0
Connection 'ethernet-enp1s0' (47fcd81e-bf00-4c02-b25b-354894f5657e) successfully added.
$ nmcli device | grep enp1s0
enp1s0 ethernet connected ethernet-enp1s0
$ nmcli networking off
$ nmcli device | grep enp1s0
enp1s0 ethernet unmanaged --
$ nmcli networking on
$ nmcli device | grep enp1s0
enp1s0 ethernet unavailable --
Set the correct reason during wake up so that the previous state is
restored.
Fixes: 5a9a7623c5 ('core: set STATE_REASON_CONNECTION_ASSUMED when waking up')
https://bugzilla.redhat.com/show_bug.cgi?id=2193422
Cleanup logging to always print a "block-autoconnect:" prefix to related
lines. Also, make sure that everywhere where the state changes, a line
gets logged. Also, for devconf data print both the interface and the
profile.
We only have a few blocked reasons. Some of them can be only set on the
devcon data, and some only on the settings connection. Assert that we
don't mix that up.
NMPolicy really should be merged into NMManager. It has not a clear responsiblity
so that there are two separate objects only makes things confusing. Anyway. It
is permissible to look up the NMPolicy instance of a NMManager. Add an accessor.
It's the better name. Especially since there is no more signal involved,
the term "emit" doesn't match.
Note also how the previous approach using a signal tried to abstract
what is happening. So we were no longer rechecking-autoconnect, instead,
we were emitting-a-signal-to-recheck-autoconnect. Just be plain about
what it is doing and don't go through a layer of signal.
GObject signals don't make the code easier to understand, on the
contrary. They may have their purpose, when objects truly must/should
not be aware of each other, and need to be composed very loosely. That
is not the case here.
There really is only one subscriber to NM_DEVICE_RECHECK_AUTO_ACTIVATE
signal, and it only makes sense this way. Instead of going through a
signal invocation, just call the well known method directly. It becomes
clearer who calls this code (and it has a lower overhead).
When using cscope/ctags it also is easier to follow the code because the
tools understand function calls.
F_SETFL will reset the flags. That is wrong, as we only want to add
O_NONBLOCK flag and leaving the other flags alone. Usually, we would
need to call F_GETFL first.
Note that on Linux, F_SETFL can only set certain flags, so the
O_RDWR|O_CLOEXEC flags were unaffected by this. That means, most likely
there are no other flags that our use of F_SETFL would wrongly clear.
Still, it's ugly, because it's not obvious whether there might be other
flags.
Avoid that altogether, by setting the flag already during open().
Fixes: 67e092abcb ('core: better handling of rfkill for WiMAX and WiFi (bgo #629589) (rh #599002)')
- drop annotations from "@error" which has defaults.
- ensure all annotations are on the same line. That's useful
when searching for an annotation, to find the line that specifies
the argument name.
- convert a few plain docs into gtkdoc annotations.
If we are deactivating active-connections for a specific
settings-connection, also consider active-connections that are waiting
for authorization. Otherwise, when the connection is deleted, a
active-connection might still reference it.
For each connection that corresponds to a software device, we create a
"unrealized" device that then becomes realized just before the
connection starts activating. Currently, in certain conditions NM
creates two devices with the same name and type, one realized and one
not; this is not expected and can lead to other issues especially when
a software device is reactivated.
Avoid that by relaxing the check in system_create_virtual_device(): if
a device exists with the same name and type, we don't want to create
another even if the type-specific parameters differ.