Commit graph

146 commits

Author SHA1 Message Date
Beniamino Galvani
ba86c208e0 Revert "core: prevent the activation of unavailable OVS interfaces only"
This was a workaround until the real cause of the issue was found.

This reverts commit a1c05d2ce6.
2025-04-02 10:01:38 +02:00
Lubomir Rintel
b7114d00ed Reapply "manager: create virtual devices on AddAndActivate()"
This reverts commit ccae5dc0e2.

(cherry picked from commit 11045cfa00)
2025-02-26 13:29:53 +01:00
Lubomir Rintel
4a1c51317e manager: make system_create_virtual_device() return a GError
This is done so that AddAndActivate() will return sensible errors in a
future patch that makes it support creating virtual devices.

In effect, all errors are logged in one place, therefore the log levels
are different. I don't think we're losing anything of value by being
a little less verbose here.

(cherry picked from commit 45d82f720c)
2025-02-26 13:29:49 +01:00
Fernando Fernandez Mancera
b8ef2a551e core: prevent the activation of unavailable OVS interfaces only
Preventing the activation of unavailable devices for all device types is
too aggresive and leads to race conditions, e.g when a non-virtual bond
port gets a carrier, preventing the device to be a good candidate for
the connection.

Instead, enforce this check only on OVS interfaces as NetworkManager
just makes sure that ovsdb->ready is set to TRUE.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2139

Fixes: 774badb151 ('core: prevent the activation of unavailable devices')
(cherry picked from commit a1c05d2ce6)
2025-02-18 12:29:19 +01:00
Beniamino Galvani
774badb151 core: prevent the activation of unavailable devices
When autoconnecting ports of a controller, we look for all candidate
(device,connection) tuples through the following call trace:

 -> autoconnect_ports()
   -> find_ports()
     -> nm_manager_get_best_device_for_connection()
       -> nm_device_check_connection_available()
         -> _nm_device_check_connection_available()

The last function checks that a specific device is available to be
activated with the given connection. For virtual devices, it only
checks that the device is compatible with the connection based on the
device type and characteristics, without considering any live network
information.

For OVS interfaces, this doesn't work as expected. During startup, NM
performs a cleanup of the ovsdb to remove entries that were previously
added by NM. When the cleanup is terminated, NMOvsdb sets the "ready"
flag and is ready to start the activation of new OVS interfaces. With
the current mechanism, it is possible that a OVS-interface connection
gets activated via the autoconnect-ports mechanism without checking
the "ready" flag.

Fix that by also checking that the device is available for activation.
2025-02-12 09:53:06 +01:00
Beniamino Galvani
6c1eb99d32 core: cleanup nm_manager_get_best_device_for_connection()
Rename "unavailable_devices" to "exclude_devices", as the
"unavailable" term has a specific, different meaning in NetworkManager
(i.e. the device is in the UNAVAILABLE state). Also, use
nm_g_hash_table_contains() when needed.
2025-02-12 09:51:01 +01:00
Lubomir Rintel
d1725cd288 Revert "manager: create virtual devices on AddAndActivate()"
This reverts commit eb635c23a7.
2025-02-06 10:35:02 +01:00
Lubomir Rintel
79219553be cloud-setup: fix build
Fixes: 6ff4b9e57c ('cloud-setup: create VLANs for multiple VNICs on OCI')
2025-01-20 17:53:58 +01:00
Lubomir Rintel
eb635c23a7 manager: create virtual devices on AddAndActivate()
If the connection didn't exist in advance, there's no unrealized device,
and find_device_by_iface() is not going to get us one.

Call system_create_virtual_device() afrer nm_utils_complete_generic()
completes the connection for virtual devices. Make sure we do proper
cleanup if we happen to fail the activation later, so that de device
doesn't end up hanging there.
2025-01-20 06:18:45 +01:00
Lubomir Rintel
57e140d961 manager: split device creation off from validate_activation_request()
Make validate_activation_request() only do the validation -- split the
determination of the device into find_device_for_activation().

The point of this is to be able complete the connection and actually
create a virtual device after the validation.

I believe this is also somewhat easier to follow now that the procedure
does what its name says.
2025-01-20 06:15:54 +01:00
Lubomir Rintel
25871f1971 manager: reword some error messages
They've been a little too cryptic and unnecessarily long before.
2025-01-20 06:13:59 +01:00
Lubomir Rintel
be034a1f3f device: simplify the nm_utils_complete_generic() machinery
The point is to get rid of device/connection type specific arguments, to
eventually be able to complete the connection on AddAndActivate before knowing
which factory is going to take care of creating the device.

Aside from that, the whole thing is pretty awful -- with complicated
macros and variadic argument (ugh). Let's get rid of that.
2025-01-20 06:13:59 +01:00
Íñigo Huguet
e330eb9c4a l3cfg: remove routes added by NM on reapply
By default, on reapply we were only syncing the main routes table. This
causes that routes added by NM to other tables are not removed on
reapply. This was done to preserve routes added externally, but routes
added by NM itself should be removed.

Add a new route table syncing mode "main + NM routes". This mode
maintains the normal behaviour of syncing completely the main table,
and for other tables removes only routes that were added by us, leaving
the rest untouched. Use this mode by default, as this is what a user
would expect on reapply.

Note: this might not work if NM is restarted between the profile being
modified and the reapply, because NM forgets what routes were added by
itself because of the restart. This is a rare corner case, though.

Use the D-Bus property "VersionInfo" to expose a capability flag
indicating that this bug is fixed. It is the first capability that we
expose in this way. However, it is convenient to do it this way as it's
something that clients like nmstate needs to know, so they can decide
whether a conn down is needed or not. It is not enough to decide that by
version number because it might be fixed via a downstream patch in distros
like RHEL.

https://issues.redhat.com/browse/RHEL-67324
https://issues.redhat.com/browse/RHEL-66262

Fixes: e9c17fcc9b ('l3cfg: default to 'main' route table sync mode')
2024-12-11 15:52:09 +00:00
Beniamino Galvani
bb6881f88c format: run nm-code-format
Reformat with:

  clang-format version 19.1.0 (Fedora 19.1.0-1.fc41)

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2046
2024-10-04 11:07:35 +02:00
Íñigo Huguet
7dae55f0f2 core: rename NM_DEVICE_MANAGED_TYPE_MANAGED to _TYPE_FULL
Managed type = managed is a bit unclear, because all managed types are
for devices that are managed, but with different levels. Managed type =
managed could be interpreted as other types are unmanaged. Change it to
managed type = full.
2024-08-28 15:35:56 +02:00
Íñigo Huguet
573c48d034 core: rename sys-iface-state to managed-type internally
The previous name was not very self explanatory. Managed type indicates
a bit better what the meaning is.
2024-08-28 15:35:56 +02:00
Fernando Fernandez Mancera
ad68b28843 config: parse autoconnect-ports value on config
As part of the conscious language effort we must provide an alternative
option to configure autoconnect-ports system-wide on NetworkManager
configuration file.
2024-08-09 15:47:32 +02:00
Fernando Fernandez Mancera
79221f79a2 src: drop most slave references from the code
While we cannot remove all the references to "slave" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Fernando Fernandez Mancera
090d617017 src: drop most master references from the code
While we cannot remove all the references to "master" we can remove most
of them.
2024-08-09 15:47:32 +02:00
Gris Ge
83a2595970 activation: Allow changing controller of exposed active connection
When activating a port with its controller deactivating by new
activation, NM will register `state-change` signal waiting controller to
have new active connections. Once controller got new active connection,
the port will invoke `nm_active_connection_set_controller()` which lead
to assert error on
    g_return_if_fail(!nm_dbus_object_is_exported(NM_DBUS_OBJECT(self)))

because this active connection is already exposed as DBUS object.

To fix the problem, we remove the restriction on controller been
write-only and notify DBUS object changes for controller property.

Signed-off-by: Gris Ge <fge@redhat.com>
2024-07-12 17:38:01 +08:00
Íñigo Huguet
4bf11b7d66 manager: save timestamps when shutting down
Connection timestamps are updated (saved to disk) on connection up and
down. This way, the last used connection will take precedence for
autoconnect if they have the same priority.

But as we don't actually do connection down when NM stops, the last
connection timestamp of all active connections is the timestamp of when
they were brought up. Then, the activation order might be wrong on next
start.

One case where timestamps are wrong (although it is not clear how
important it is because the connections are activated on different
interfaces):
1. Activate con1 <- timestamp updated
2. Activate con2 <- timestamp updated
3. Deactivate con2 <- timestamp updated
4. Stop NM <- timestamp of con2 is higher than con1, but con1 was still
   active when con2 was brought down.

Other case that is reproducible (from
https://issues.redhat.com/browse/RHEL-35539):
1. Activate con1
2. Activate con2 on same interface:
   - As a consequence con1 is deactivated and its timestamp updated
   - The timestamp of con2 is also updated
3. Stop NM <- timestamp of con1 and con2 is the same, next activation
   order will be undefined.

Fix by saving the timestamps on NM shutdown.
2024-05-22 12:49:59 +02:00
Gris Ge
a68d2fd780 checkpoint: fix port reactivation when controller is deactivating
Problem:

    Given a OVS port with `autoconnect-ports` set to default or false,
    when reactivation required for checkpoint rollback,
    previous activated OVS interface will be in deactivate state after
    checkpoint rollback.

The root cause:

    The `activate_stage1_device_prepare()` will mark the device as
    failed when controller is deactivating or deactivated.
    In `activate_stage1_device_prepare()`, the controller device is
    retrieved from NMActiveConnection, it will be NULL when NMActiveConnection
    is in deactivated state. This will cause device been set to
    `NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED` which prevent all follow
    up `autoconnect` actions.

Fix:
    When noticing controller is deactivating or deactivated with reason
    `NM_DEVICE_STATE_REASON_NEW_ACTIVATION`, use new function
    `nm_active_connection_set_controller_dev()` to wait on controller
    device state between NM_DEVICE_STATE_PREPARE and
    NM_DEVICE_STATE_ACTIVATED. After that, use existing
    `nm_active_connection_set_controller()` to use new
    NMActiveConnection of controller to move on.

Resolves: https://issues.redhat.com/browse/RHEL-31972

Signed-off-by: Gris Ge <fge@redhat.com>
2024-05-14 11:39:21 +08:00
Íñigo Huguet
56179465df Updated code format
The CI will use Fedora 40 for code formatting check. Update the code
formatting so it passes.
2024-04-08 06:35:20 +00:00
Beniamino Galvani
1b60dd9a9e manager: fix assertion failure during shutdown
Fix the following:

  NetworkManager: file ../src/libnm-core-impl/nm-connection.c: line 321 (nm_connection_get_setting): should not be reached
  NetworkManager.service: Main process exited, code=dumped, status=5/TRAP

Fixes: bd38a19832 ('connection: add support to down-on-poweroff')
2024-04-04 11:12:17 +02:00
Beniamino Galvani
de130df3e2 manager: fix race condition while enumerating devices at startup
While enumerating devices at startup, we take a snapshot of existing
links from platform and we start creating device instances for
them. It's possible that in the meantime, while processing netlink
events in platform_link_added(), a link gets renamed. If that happens,
then we have two different views of the same ifindex: the cached link
from `links` and the link in platform.

This can cause issues: in platform_link_added() we create the device
with the cached name; then in NMDevice's constructor(), we look up
from platform the ifindex for the given name. Because of the rename,
this lookup can match a newly created, different link.

The end result is that the ifindex from the initial snapshot doesn't
get a NMDevice and is not handled by NetworkManager.

Fix this problem by fetching the latest version of the link from
platform to make sure we have a consistent view of the state.

https://issues.redhat.com/browse/RHEL-25808
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1897
2024-03-26 10:26:02 +01:00
Gris Ge
7096f52a59 ovs: Do not allow OVS bridge and port to be parent
When creating VLAN over OVS internal interface which holding the same
name as its controller OVS bridge, NetworkManager will fail with error:

    Error: Connection activation failed: br0.101 failed to create
    resources: cannot retrieve ifindex of interface br0 (Open vSwitch
    Bridge)

Expanded the `find_device_by_iface()` with additional argument
`child: NmConnection *` which will validate whether candidate is
suitable to be parent device.

In `nm_device_check_parent_connection_compatible()`, we only not allow OVS
bridge and OVS port being parent.

Resolves: https://issues.redhat.com/browse/RHEL-26753

Signed-off-by: Gris Ge <fge@redhat.com>
2024-03-15 16:12:37 +08:00
Fernando Fernandez Mancera
170e128215 core: deprecate master in NMActiveConnection internal API
PROP_INT_MASTER_READY and PROP_INT_MASTER are internal API only, that
means we can replace it right away. In addition, replace the functions
related to the properties.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1885
2024-03-13 18:24:47 +01:00
Fernando Fernandez Mancera
1f05526ed7 core: drop NMDevice master and introduce controller
The master property for NMDevice is internal only therefore we can
replace it directly with controller. In addition, I have adapted related
functions to use controller instead of master.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1881
2024-03-13 18:00:40 +01:00
Gris Ge
86d67da28d checkpoint: Allow rollback on internal global DNS
With `NM_CHECKPOINT_CREATE_FLAG_TRACK_INTERNAL_GLOBAL_DNS` flag set on
checkpoint creation, the checkpoint rollback will restore the
global DNS in internal configure file
`/var/lib/NetworkManager/NetworkManager-intern.conf`.

If user has set global DNS in /etc folder, this flag will not take any
effect.

Resolves: https://issues.redhat.com/browse/RHEL-23446

Signed-off-by: Gris Ge <fge@redhat.com>
2024-03-13 20:52:37 +08:00
Wen Liang
db5b92fa03 libnm: use nm_setting_connection_get_controller() where possible
To enforce conscious language support, use
`nm_setting_connection_get_controller()` where possible and replace
`nm_setting_connection_get_master()`.

https://issues.redhat.com/browse/RHEL-28623

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1882
2024-03-12 09:54:31 +01:00
Fernando Fernandez Mancera
bd38a19832 connection: add support to down-on-poweroff
The new option at NMSettingConnection allow the user to specify if the
connection needs to be down when powering off the system. This is useful
for IP address removal prior powering off. In order to accomplish that,
we listen on "Shutdown" systemd DBus signal.

The option is set to FALSE by default, it can be specified globally on
configuration file or per profile.
2024-03-04 18:16:54 +00:00
Fernando Fernandez Mancera
c8cf02e6b8 manager: abstract code from do_sleep_wake() to reuse it
The code that is adding the devices to the sleeping list and taking them
down should be moved to a separated function. This way we can reuse it
and we avoid duplicating code.
2024-03-04 18:29:07 +01:00
Fernando Fernandez Mancera
5ab87886f3 power: rename NMSleepMonitor to NMPowerMonitor
In order to provide the NMSleepMonitor a more generic usage, let's
rename the whole module to NMPowerMonitor. Nothing is exposed to the API
so it is a trivial renaming.
2024-03-04 18:29:07 +01:00
Beniamino Galvani
f9c0f7ae64 manager: make generic devices compatible with all link types
If a generic device is present and the name matches, it is compatible
with any link type.

For example, if a generic connection has a device-handler that creates
a dummy interface, the link is compatible with the NMDeviceGeneric.

(cherry picked from commit 5978fb2b27)
2024-02-21 11:49:21 +01:00
Beniamino Galvani
3bb34edc53 core: persist state of software generic devices across restarts
When a generic connection has a custom device-handler, it always
generates a NMDeviceGeneric, even when the link that gets created is
of a type natively supported by NM. On service restart, we need to
keep track that the device is generic or otherwise a different device
type will be instantiated.

(cherry picked from commit f2613be150)
2024-02-21 11:49:20 +01:00
Fernando Fernandez Mancera
b0b068e103 all: use the new NMSettingConnection autoconnect-ports property
(cherry picked from commit 8a08a74abf)
2024-02-02 12:52:20 +01:00
Fernando Fernandez Mancera
027b259602 all: use the new NMSettingConnection port-type property 2024-01-23 08:21:16 +01:00
Thomas Haller
2fe8ec25b9
core: mark deprecated D-Bus API as deprecated in Introspect()
Mark the methods/properties deprecated in the D-Bus API (via
org.freedesktop.DBus.Introspectable.Introspect(), [1]).

It affects those properties that are documented as deprecated in
introspection XML.

  $ busctl -j call \
        org.freedesktop.NetworkManager \
        /org/freedesktop/NetworkManager \
        org.freedesktop.DBus.Introspectable \
        Introspect | \
    jq '.data[0]' -r | \
    grep -5 Deprecated

[1] https://dbus.freedesktop.org/doc/dbus-specification.html#standard-interfaces-introspectable
2024-01-16 09:28:18 +01:00
Fernando Fernandez Mancera
6576ddc532 config: drop slaves-order config option
This option was only introduced only to allow keeping the old behavior
in RHEL7, while the default order was changed from 'ifindex' to 'name'
in RHEL8. The usefulness of this option is questionable, as 'name'
together with predictable interface names should give predictable order.
When not using predictable interface names, the name is unpredictable
but so is the ifindex.

https://issues.redhat.com/browse/NMT-926

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1814
2023-12-12 15:28:52 +01:00
Emmanuel Grumbach
3476135911 platform: remove CSME related code
Remove all the code that was added for the CSME coexistence.
The Intel WiFi team can't commit on when, if at all, this feature will
be completely integrated and tested in the NetworkManager.
The preferred solution for now is the solution that involves the kernel
only.
Remove the code that was merged so far.
2023-09-25 11:46:24 +00:00
Wen Liang
b341161e2a nm-manager: ensure device is exported on D-Bus in authentication request
The device authentication request is an async process, it can not know
the answer right away, it is not guarantee that device is still
exported on D-Bus when authentication finishes. Thus, do not return
SUCCESS and abort the authentication request when device is not alive.

https://bugzilla.redhat.com/show_bug.cgi?id=2210271
2023-08-22 12:17:16 -04:00
Thomas Haller
5ff1468717
all: ensure signendess for arguments of NM_{MIN,MAX,CLAMP}() macros matches 2023-08-07 09:24:36 +02:00
Fernando Fernandez Mancera
fb362e0583 manager: allow controller activation if device is deactivating
When activating a port connection it will require the controller
connection is active or a valid controller device candidate is available
for activation.

One of the conditions we consider for a controller device to be a valid
candidate for the connection is that it is not active, therefore we
should also consider as valid a device that is currently deactivating.
Otherwise, we could fail during the port activation just because the
deactivation of the controller device candidate didn't finish yet.

https://bugzilla.redhat.com/show_bug.cgi?id=2125615

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1693
2023-07-19 12:09:09 +02:00
Thomas Haller
be438ec103
device/trivial: add code comment to _internal_activate_device() about how activation should work 2023-06-15 09:58:24 +02:00
Thomas Haller
b9231a0e18
core: better handle ignore-carrier=no for bond/bridge/team devices
By default, bond/bridge/team devices ignore carrier, and so do their
ports. However, it can make sense to set '[device*].ignore-carrier' for
the controller device. Meaningfully support that.

This is a follow up to commit 8c91422954 ('device: handle carrier
changes for master device differently'), which didn't fully solve the
problem.

What already works, is that when you set ignore-carrier for the
controller, then after loss of carrier and a carrier wait timeout, the
controller and ports go down. If both the controller and port profiles
have autoconnect disabled, they stay down and that's it. It works as
expected, but is not very useful, because when we want to automatically
react on carrier loss, we also want to automatically reconnect.

For controller profiles, carrier only makes sense when ports are
attached. However, we can (auto) activate controller profiles without
ports. So when the user enables autoconnect for the controller profile,
then the profile will eagerly reconnect. That means, after loss of
carrier, the device goes down and reconnects right away. It means, when
configuring a bond with ignore-carrier=no and autoconnect=yes, then
the sensible thing happens (an immediate reconnect). That is just not
a useful configuration.

The useful way to configure configure ignore-carrier=no for a controller
device, autoconnect on the master must be disabled while being enabled
on the ports. After all, it's the ports that will autoconnect based on
the carrier state and bring up the controller with them.

Note that at the moment when a port decides to autoconnect, the
controller profile is not yet selected. That only happens later during
_internal_activate_device() after searching it with find_master().  At
that point, the port profile checks whether it should autoconnect based
on its own carrier state, and abort if not.

If autoconnect is aborted due to lack of carrier, the profile gets
blocked from autoconnect with reason "failed". Hence, when the carrier
returns, we need to clear any "failed" blocked reasons and schedule
another autoconnect check,

Note that this really only works if the port is itself a simple device,
like an ethernet. If the port is itself a software device (like a bond,
or a VLAN), then the carrier state in _internal_activate_device() is
unknown, and we cannot avoid autoconnect. It's unclear how that could
make sense, if at all.

This setup can be combined with "connection.autoconnect-slaves=yes". In
that case, we have the first port to autoconnect when they get carrier,
bringing up the controller too. Usually the other ports that don't have
carrier would not autoconnect, but with autoconnect-slaves they will.
The effect is, that we autoconnect whenever any of the ports has
carrier, and then we immediately also bring up the ports that don't have
carrier (which we usually would not).

https://bugzilla.redhat.com/show_bug.cgi?id=2156684

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1658
2023-06-14 18:46:46 +02:00
Thomas Haller
bd3162ad89
core: allow resetting blocked reason for a device in nm_manager_devcon_autoconnect_blocked_reason_set() 2023-06-14 11:15:52 +02:00
Thomas Haller
6d75b7f348
core: reorder return in find_master()
It feels ugly to set the out arguments, in case we are failing the
function. Note that there is no change in behavior here. This is purely
cosmetic.
2023-06-14 11:15:52 +02:00
Thomas Haller
44076802a9
core: various minor code cleanups around find_master() 2023-06-14 11:15:52 +02:00
Thomas Haller
ac7b6e532f
core: rename nm_config_data_get_device_config_*() variants
The fully generic way to lookup a device config is using the
NMMatchSpecDeviceData. Rename the accessors that operate on a NMDevice
instead.
2023-06-14 11:07:34 +02:00
Beniamino Galvani
9d18510437 manager: set the right reason when managing device after realize
When managing a device after it is realized, we previously always set
the NOW_MANAGED reason, that makes the device fully-managed.

This works based on the assumption that initially an external device
has unmanaged flag EXTERNAL_DOWN set, and therefore the device stays
unmanaged during realization.

It is possible that an external device appears already with addresses
(or attached to a controller); we need to set reason
CONNECTION_ASSUMED if it's an external device, so that we don't set
sys-iface-state=managed.

Reproducer:

   ip link add br1 type bridge
   killall -STOP NetworkManager
   ip link add dummy1 type dummy
   ip link set dummy1 master br1
   ip link set dummy1 up
   sleep .5
   killall -CONT NetworkManager

After this, dummy1 is fully managed by NM while it shouldn't.

https://bugzilla.redhat.com/show_bug.cgi?id=2149012
2023-05-29 14:23:23 +02:00