nm_uuid_generate_from_strings() uses variant3 UUIDs based on MD5.
We shouldn't use that in the future.
We will add a replacement, so rename this function so that the "good"
name is free again. Of course, code that uses this function currently
relies on that the behavior doesn't change. We cannot just drop it
entirely, but will replace it by something that gives the same result.
Rename.
This is something that does happen.
Is that a bug? If so, this should not be a warning message but an
assertion failure. If it's not a bug, then this does not warrant warning
level, because the user wouldn't know what to do about this and it's
something that occasionally happens.
Granted, the state handling in NMDevice is complex, that it's unclear
whether this indicates a problem or not. In any case, having a warning
does only confuse the user.
Have "len" before "elem_size". That is consistent with g_qsort_with_data()
and bsearch(), and is also what I would expect.
Note that the previous commit just renamed the function. If a user
of the new, changed API gets backported to an older branch, we will
get a compilation error and note that the arguments need to be adjusted.
The "nm_utils_" prefix is just too verbose. Drop it.
Also, Posix has a bsearch function. As this function
is similar, rename it.
Note that currently the arguments are provided in differnt
order from bsearch(). That will be partly addressed next.
That is the main reason for the rename. The next commit
will swap the arguments, so do a rename first to get a compilation
error when backporting a patch that uses the changed API.
find_parent_device_for_connection() must return a parent device even
if it's unrealized. In fact, callers already handle the unrealized
case; in specific:
- _internal_activate_device() will try to autoactivate a connection
on the unrealized parent to realize it;
- system_create_virtual_device()'s goal is to create a NMDevice
object, so it doesn't matter whether the parent is realized or not.
Relax the condition about managed-ness, since any unrealized
device is also unmanaged.
This change fixes the following scenario, where all profiles have
autoconnect=no and autoconnect-slaves=yes:
vrf0
-------------^----------------
| | |
| bridge0 bond0.4000
| .
bond0 <......................
---^---
| |
veth0 veth1
----> = controller
....> = VLAN parent
When profiles are added, unrealized devices are created for bond0 and
bridge0, but not for bond0.4000 becase its parent is unrealized. Then
the autoconnect-slaves machinery for vrf0 tries to activate all ports
but fails for bond0.4000 because it can't find a device for it.
https://bugzilla.redhat.com/show_bug.cgi?id=2101317
When disposing NMPolicy all the devices in the devices hash-table should
be unregistered and removed from the hash-table.
Fixes: 7e3d090acb ('policy: refactor tracking of registered devices')
We already have src/linux-headers, where we have complete copies of linux
user space headers. Of course that exists, because we want to use certain
features and don't depend on the installed kernel headers. Which works
well, because kernel user space API is stable, and we anyway want to
support compiling against a newer kernel and run against an older (e.g.
in a container). So having our copy of newer kernel headers is merely
as if we compiled against as newer kernel.
Add "src/nm-compat-headers" which has a similar purpose, but a different
approach. Instead of replacing the included header entirely, include
the system header and patch it with #define.
Use this for "linux/if_addr.h". Of course, the approach here is that we
no longer include <linux/if_addr.h> directly, but instead include
"nm-compat-headers/linux/if_addr.h".
Sync/blocking methods are ugly. Their name should highlight this.
Also, we may have an async variant, so we will need the "good" name
for apply() and apply_finish().
Previously, we used _nm_utils_ascii_str_to_bool(). That can accept any
kind of input (like "true"), so one might think that this is better to
use on user-input. However, NMSettingBond already validates the these
options are integers (either "0" or "1"). So a value like "true"
could never be here.
Use _nm_setting_bond_opt_value_as_intbool() because that asserts that
the option if of the expected type (integer).
These variants provide additional nm_assert() checks, and are thus
preferable.
Note that we cannot just blindly replace &g_array_index() with
&nm_g_array_index(), because the latter would not allow getting a
pointer at index [arr->len]. That might be a valid (though uncommon)
usecase. The correct replacement of &g_array_index() is thus
nm_g_array_index_p().
I checked the code manually and replaced uses of nm_g_array_index_p()
with &nm_g_array_index(), if that was a safe thing to do. The latter
seems preferable, because it is familar to &g_array_index().
Add nm_g_array_index() as a replacement for g_array_index(). The value
of nm_g_array_index(), nm_g_array_index_p(), nm_g_array_first() and
nm_g_array_last() is that they add nm_assert() checks for valid
parameters.
nm_g_array_{first,last}() now returns an lvalue and not a pointer.
As such, they are just shorthands for nm_g_array_index() at index
0 and len-1, respectively.
`nm_g_array_index_p(arr, Type, idx)` is almost the same as
`&nm_g_array_index(arr, Type, idx)`. The only difference (and why the
former variant exists), is that nm_g_array_index_p() allows to get a
pointer one after the end.
This means, this is correct and valid to do:
// arr->len might be zero
arr = nm_g_array_index_p(arr, Type, 0);
for (i = 0; i < arr->len; i++, arr++)
...
ptr = nm_g_array_index_p(arr, Type, 0);
end = nm_g_array_index_p(arr, Type, arr->len);
for (; ptr < end; ptr++)
...
This would not be valid to do with nm_g_array_{index,first,last}().
Also fix supporting "const GArray *arr" parameter. Of course, the function
casts the constness away. Technically, that matches the fact that arr->data
is also not a const pointer. In practice, we might want to propagate the
constness of the container to the constness of the element lookup. While
doable, that is not implemented.
Bond option netlink support requires primary property to be a ifindex
instead of the interface name. This is a workaround for supporting
specifying a primary that does not exist yet.
```
nmcli con add type bond ifname mybond0 bond.options "mode=active-backup,primary=veth1"
Connection 'bond-mybond0' (38100ef9-11e2-4003-aff9-cb2d152ce34f) successfully added.
nmcli con add type ethernet ifname veth1 master mybond0
cat /sys/class/net/mybond0/bonding/primary
veth1
```
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1362
Fixes: e064eb9d13 ('bond: use netlink to set bond options')
_nm_config_data_log_sort() is used for sorting the groups in the
keyfile during nm_config_data_log(). The idea is to present the keyfile
in a defined, but useful order.
However, it is not a total order. That is, it will return c=0 (equal) for
certain groups, if the pre-existing order in the GKeyFile should be
honored. For example, we want to sort all [device*] sections close to
each other, but we want to preserve their relative order. In that case,
the function would return 0 although the group names differed.
Also, _nm_config_data_log_sort() does not expect to receive duplicate names.
It would return c!=0 for comparing "device" and "device".
This means, _nm_config_data_log_sort() is fine for sorting the input as
we have it. However, it cannot be used to binary search the groups. This
caused that some sections might be duplicated in the `NetworkManager
--print-config` output. Otherwise, it had no bad effects.
Fixes(no-backport): 78d34d7c2e ('config: fix printing default values for missing sections')
Previously, nm_platform_ip_address_sync() would always add the "IFA_F_NOPREFIXROUTE"
flag. Add a way to let the caller control that.
Add a flags argument, with a new flag "with-noprefixroute". By default
(with flags "none"), nm_platform_ip_address_sync() would no longer
add "IFA_F_NOPREFIXROUTE" flag, but the caller can now opt-in to that.
The purpose is that on "lo" interface we will want to let kernel
handle the prefix route. So have a per-ifindex opt-in for controlling
this.
During nm_platform_ip_address_flush() we use "none" flags, because the
function anyway doesn't add any addresses, so it wouldn't matter.
There is no change in behavior.
Co-authored-by: Thomas Haller <thaller@redhat.com>
We've been outright ignoring master-slave checks if the link ended up
without a master since commit 2e22880894 ('device: don't remove the
device from master if its link has no master').
This was done to deal with OpenVSwitch port-interface relationship,
where the interface's platform link lacked an actual master in platform
(what matters there is the OVSDB entry), but the fix was too wide.
Let's limit the special case to devices whose were not enslaved to
masters that lack a platform link, which pretty much happens for
OpenVSwitch only.
Morale: Write better commit messages of future you is going to be upset
Fixes: 2e22880894 ('device: don't remove the device from master if its link has no master')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1358
For the ipoib connection, it is still considered as valid if the
profile does not set the device name. Also, the ifcfg reader should not
duplicate the checks that `nm_connection_verify()` performs (especially
not wrongly). Therefore, NM should skip validating the DEVICE when
reading the ifcfg file for the ipoib connection.
https://bugzilla.redhat.com/show_bug.cgi?id=2122703
When writing the p-key setting to the ifcfg file and reading the
setting back, the value has to be consistent. This is not limited to
p-key only, any setting value during the ifcfg write and read also has
to be consistent.
This was probably added in commit cb5606cf1c ('ifcfg-rh:
add support for Infiniband partitions') as this is also what
ifup-ib does ([1]). For NetworkManager profiles however, the
p-key is also valid without the high bit set, so the ifcfg-rh
reader must honor that.
[1] 0c9fb6ca7b/rdma.ifup-ib (L75)
Otherwise we're likely interfering with an in-progress activation.
Consider the following connections, first two being active:
id=bond0a type=bond interface-name=bond0, (Active)
id=dummy0a type=dummy interface-name=dummy0 master=bond0a, (Active)
id=bond0b type=bond interface-name=bond0
id=dummy0b type=dummy interface-name=dummy0 master=bond0b
Note there's two hierarchies with bond0 bond having a dummy0 port,
first one (bond0a, dummy0a) being active.
Suppose the users wants to bring the other one up (bond0b, dummy0b) and
does a "nmcli c up bond0b". This is what happens:
1.) bond0b starts activation due to user request
2.) bond0a starts deactivation due to new activation
3.) dummy0 loses its master, begins deactivation
4.) dummy0 finishes deactivation
5.) both dummy0 being deactivated and bond0b check for slaves enqueues
auto-activation check for dummy0
6.) auto-activation picks dummy0a for dummy0
7.) dummy0a begins activation
8.) dummy0a looks for master connection, picks bond0a
9.) bond0a starts activating on bond0, kicks bond0b away
10.) bond0a and dummy0a end up finishing activation
11.) Everybody unhappy :(
NM's auto-activation logic is only takes autoconnect priority into
account when figuring out a connection to activate and can't be expected
to bring up most sensible combination of connection when there's
multiple ones for the same devices with complex dependencies.
Nevertheless, it shouldn't ever undo the activations if the user is
bringing up the connections manually.
This patch prevents bringing up of master devices that are not
DISCONNECTED and therefore shouldn't be up for grabs. This was
previously done for hardware devices only whereas I believe it should be
the case for *all* realized devices.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1172https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1364
The dhclient plugin already supports sending a decline when IPv4 ACD
fails. Also implement support for IPv6 DAD.
See-also: 156d84217c ("dhcp/dhclient: implement accept/decline (ACD) for dhclient plugin")
Currently we accept the DHCPv6 just after addresses are configured on
kernel, without waiting DAD result. Instead, wait that DAD completes
and decline the lease if all addresses are detected as duplicate.
Note that when an address has non-infinite lifetime and fails DAD,
kernel removes it automatically. With iproute2 we see something like:
602: testX6 inet6 2620:🔢5678/128 scope global tentative dynamic noprefixroute
valid_lft 7500sec preferred_lft 7200sec
Deleted 602: testX6 inet6 2620:🔢5678/128 scope global dadfailed tentative dynamic noprefixroute
valid_lft 7500sec preferred_lft 7200sec
Since the address gets removed from the platform cache, at the moment
we don't have a way to check the flags of the removal
message. Therefore, we assume that any address that goes away in
tentative state was detected as duplicate.
https://bugzilla.redhat.com/show_bug.cgi?id=2096386
The @dracut_NM_vlan_over_team_no_boot sometimes fails, among other
things, because it fails to assume an indicated connection after a
restart.
That seems to happen because after the decision to activate the
indicated connection, the device does not move from DISCONNECTED state
quickly enough. Another assumption recheck runs in between and decides
to generate a connection, because the assume state was already reset
in between.
First start, creates and activates b3a61b68-f744-4a4c-a513-61399c154a67
on vlan0017:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
...
settings: update[b3a61b68-f744-4a4c-a513-61399c154a67]: adding connection "vlan0017"
(45113870df0a4cfb/keyfile)
Second start:
NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting...
(after a restart, asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
Assumption attempt successfully picks the right connection and thus
proceeds to reset the assume state:
manager: (vlan0017): assume: will attempt to assume matching connection 'vlan0017'
(b3a61b68-f744-4a4c-a513-61399c154a67) (indicated)
device[c7c5101cf0b73f5f] (vlan0017): assume-state: set guess-assume=0, connection=(null)
Everything great so far, activation of the right connection is enqueued
and the device moves away from unavailable state. However, the
activation can't proceed immediately:
device (vlan0017): state change: unmanaged -> unavailable
(reason 'connection-assumed', sys-iface-state: 'assume')
device (vlan0017): state change: unavailable -> disconnected
(reason 'connection-assumed', sys-iface-state: 'assume')
active-connection[0x55ba1162f1c0]: set device "vlan0017" [0x55ba1163c4f0]
device[c7c5101cf0b73f5f] (vlan0017): queue activation request waiting for carrier
Now another assumption attempt is done. The original assume state is
gone, so a connection is generated:
platform-linux: UDEV event: action 'add' subsys 'net' device 'vlan0017' (6); seqnum=1959
device[c7c5101cf0b73f5f] (vlan0017): queued link change for ifindex 6
manager: (vlan0017): assume: generated connection 'vlan0017' (57627119-8c20-4f9e-bf4d-4fc427b4a6a9)
keyfile: commit: 57627119-8c20-4f9e-bf4d-4fc427b4a6a9 (vlan0017) added as
"/run/NetworkManager/system-connections/vlan0017-57627119-8c20-4f9e-bf4d-4fc427b4a6a9.nmconnection"
(nm-generated,volatile,external)
I think this shouldn't have happened. We've picked the correct
connection already and it's enqueued for activation!
Change the check in nm_device_emit_recheck_assume() to also consider
any queued activation.
Fixes-test: @dracut_NM_vlan_over_team_no_boot
Co-authored-by: Lubomir Rintel <lkundrak@v3.sk>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1351
If teamd crashes, we restore it. That's very nice, but if it really
crashed then it left ports attached and the slave connections are not
going to fail and the port configuration (e.g. priority or link watcher) in
teamd's memory will be gone.
This will restore the port configuration when the teamd connection is
re-established. This probably also fixes a race where a slave connection
would be enslaved (only possible externally and manually?) while we
didn't establish a connection to teamd yet. We'll just send the port
configuration in once're connected.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1361
Add option to set ofport_request when configuring ovs interface. When
connection with ofport_request configured is activated ovsdb will first
try to activated on the port set by ofport_request.