Have "len" before "elem_size". That is consistent with g_qsort_with_data()
and bsearch(), and is also what I would expect.
Note that the previous commit just renamed the function. If a user
of the new, changed API gets backported to an older branch, we will
get a compilation error and note that the arguments need to be adjusted.
The "nm_utils_" prefix is just too verbose. Drop it.
Also, Posix has a bsearch function. As this function
is similar, rename it.
Note that currently the arguments are provided in differnt
order from bsearch(). That will be partly addressed next.
That is the main reason for the rename. The next commit
will swap the arguments, so do a rename first to get a compilation
error when backporting a patch that uses the changed API.
find_parent_device_for_connection() must return a parent device even
if it's unrealized. In fact, callers already handle the unrealized
case; in specific:
- _internal_activate_device() will try to autoactivate a connection
on the unrealized parent to realize it;
- system_create_virtual_device()'s goal is to create a NMDevice
object, so it doesn't matter whether the parent is realized or not.
Relax the condition about managed-ness, since any unrealized
device is also unmanaged.
This change fixes the following scenario, where all profiles have
autoconnect=no and autoconnect-slaves=yes:
vrf0
-------------^----------------
| | |
| bridge0 bond0.4000
| .
bond0 <......................
---^---
| |
veth0 veth1
----> = controller
....> = VLAN parent
When profiles are added, unrealized devices are created for bond0 and
bridge0, but not for bond0.4000 becase its parent is unrealized. Then
the autoconnect-slaves machinery for vrf0 tries to activate all ports
but fails for bond0.4000 because it can't find a device for it.
https://bugzilla.redhat.com/show_bug.cgi?id=2101317
These variants provide additional nm_assert() checks, and are thus
preferable.
Note that we cannot just blindly replace &g_array_index() with
&nm_g_array_index(), because the latter would not allow getting a
pointer at index [arr->len]. That might be a valid (though uncommon)
usecase. The correct replacement of &g_array_index() is thus
nm_g_array_index_p().
I checked the code manually and replaced uses of nm_g_array_index_p()
with &nm_g_array_index(), if that was a safe thing to do. The latter
seems preferable, because it is familar to &g_array_index().
Otherwise we're likely interfering with an in-progress activation.
Consider the following connections, first two being active:
id=bond0a type=bond interface-name=bond0, (Active)
id=dummy0a type=dummy interface-name=dummy0 master=bond0a, (Active)
id=bond0b type=bond interface-name=bond0
id=dummy0b type=dummy interface-name=dummy0 master=bond0b
Note there's two hierarchies with bond0 bond having a dummy0 port,
first one (bond0a, dummy0a) being active.
Suppose the users wants to bring the other one up (bond0b, dummy0b) and
does a "nmcli c up bond0b". This is what happens:
1.) bond0b starts activation due to user request
2.) bond0a starts deactivation due to new activation
3.) dummy0 loses its master, begins deactivation
4.) dummy0 finishes deactivation
5.) both dummy0 being deactivated and bond0b check for slaves enqueues
auto-activation check for dummy0
6.) auto-activation picks dummy0a for dummy0
7.) dummy0a begins activation
8.) dummy0a looks for master connection, picks bond0a
9.) bond0a starts activating on bond0, kicks bond0b away
10.) bond0a and dummy0a end up finishing activation
11.) Everybody unhappy :(
NM's auto-activation logic is only takes autoconnect priority into
account when figuring out a connection to activate and can't be expected
to bring up most sensible combination of connection when there's
multiple ones for the same devices with complex dependencies.
Nevertheless, it shouldn't ever undo the activations if the user is
bringing up the connections manually.
This patch prevents bringing up of master devices that are not
DISCONNECTED and therefore shouldn't be up for grabs. This was
previously done for hardware devices only whereas I believe it should be
the case for *all* realized devices.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1172https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1364
The current behavior of "nmcli networking off" is that it starts
disconnecting the devices, but doesn't wait for them to actually
come down.
That is not too helpful: the user never knows when the network is
actually disconnected.
Some users, notably the NetworkManager-CI test suite, seem to expect the
devices are all disconnected after the command finishes. Even worse,
it immediately proceeds activating the connections:
@ovs_cloned_mac_set_on_iface
...
* Execute "nmcli networking off && nmcli networking on"
This results in pure utter chaos. In particular, the slave connections
sometimes refuse to activate after "nmcli networking on", because the
master connections are still getting disconnected in response to
preceding "nmcli networking off".
Let's make Enable(FALSE) and Sleep(TRUE) block until none of the devices
are expected to go down.
Note that this makes those call also return when Enable(TRUE) and
Sleep(FALSE) is issued in meanwhile. Therefore a return from
Enable(FALSE) doesn't necessarily imply the networking is disabled.
This is a feature, not a bug -- the actual manager state is available in
the "state" property.
Fixes-test: @ovs_cloned_mac_set_on_iface
https://bugzilla.redhat.com/show_bug.cgi?id=2093175https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1292
DHCP leases for a given interface are already exported on D-Bus
through DHCP4Config and DHCP6Config objects. It is useful to have the
same information also available on the filesystem so that it can be
easily used by scripts.
NM already saves some information about DHCP leases in /var, however
that directory can only be accessed by root, for good reasons.
Append lease options to the existing state file
/run/NetworkManager/devices/$ifindex. Contrary to /var this directory
is not persistent, but it seems more correct to expose the lease only
when it is active and not after it expired or after a reboot.
Since the file is in keyfile format, we add new [dhcp4] and [dhcp6]
sections; however, since some options have the same name for DHCPv4
and DHCPv6, we add a "dhcp4." or "dhcp6." prefix to make the parsing
by scripts (e.g. via "grep") easier.
The option name is the same we use on D-Bus. Since some DHCPv6 options
also have a "dhcp6_" prefix, the key name can contain "dhcp6" twice.
The new sections look like this:
[dhcp4]
dhcp4.broadcast_address=172.25.1.255
dhcp4.dhcp_lease_time=120
dhcp4.dhcp_server_identifier=172.25.1.4
dhcp4.domain_name_servers=172.25.1.4
dhcp4.domain_search=example.com
dhcp4.expiry=1641214444
dhcp4.ip_address=172.25.1.182
dhcp4.next_server=172.25.1.4
dhcp4.routers=172.25.1.4
dhcp4.subnet_mask=255.255.255.0
[dhcp6]
dhcp6.dhcp6_name_servers=fd01::1
dhcp6.dhcp6_ntp_servers=ntp.example.com
dhcp6.ip6_address=fd01::1aa
nm_dns_manager_get() is already a singleton. So users usually
can just get it whenever they need -- except during shutdown
after the singleton was destroyed. This is usually fine, because
users really should not try to get it late during shutdown.
However, if you subscribe a signal handler on the singleton, then you
will also eventually want to unsubscribe it. While the moment when you
subscribe it is clearly not during late-shutdown, it's not clear how
to ensure that the signal listener gets destroyed before the DNS manager
singleton.
So usually, whenever you are going to subscribe a signal, you need to
make sure that the target object stays alive long enough. Which may
mean to keep a reference to it.
Next, we will have NMDevice subscribe to the singleton. With above said,
that would mean that potentially every NMDevice needs to keep a
reference to the NMDnsManager. That is not best. Also, later NMManager
will face the same problem, because it will also subscribe to
NMDnsManager.
So, instead let NMManager own a reference to the NMDnsManager. This
ensures the lifetimes are properly guarded (NMDevice also references
NMManager already).
Also, access nm_dns_manager_get() lazy on first use, to only initialize
it when needed the first time (which might be quite late).
We store the timestamp when a profile activated the last time to
"/var/lib/NetworkManager/timestamps". There was also a timer which
would update the timestamp of activated connections every 300 seconds.
That seems unnecessary, drop it.
For one, waking up every 5 minutes and rewriting a file to disk seems
undesirable, for example if /var is a device where unnecessary writes
should be minimized.
Note that we already update the timestamp when a device goes down,
and of course when it comes up. Updating the timestamp in between seems
unnecessary.
This reverts commit 607350294d ('core: update timestamp in active
system connections every 5 mins (bgo #583756)').
An alternative would be to only update the timestamp in memory (so that
it would appear updated on D-Bus), but delay writing the file until
something important happens. `nm_key_file_db_*()` already tracks whether
there are changes ("dirty") and whether it's necessary to write the
file. It would be possible to track two dirty flags: one that requires
immediate update, and one that only ensures we will re-write dirty files
eventually.
See-also: https://bugzilla.gnome.org/show_bug.cgi?id=583756https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1171
Introduce a RadioFlags property on the manager object. For now it
contains two bits WLAN_AVAILABLE, WWAN_AVAILABLE to indicate whether
any radio interface is present in the system. The presence of a radio
is detected by looking at devices and rfkill switches.
In future, any radio-related read-only boolean flag can be exposed via
this property, including the already existing WirelessHardwareEnabled
and WwanHardwareEnabled properties.
This will allow migrating a connection. If specified, the connection will
be confined to a particular settings plugin when written back. If the
plugin differs from the existing one, it will be removed from the old one.
When we have a bridge interface with ports attached externally (that is,
not by NetworkManager itself), then it can make sense that during
checkpoint rollback we want to keep those ports attached.
During rollback, we may need to deactivate the bridge device and
re-activate it. Implement this, by setting a flag before deactivating,
which prevents external ports to be detached. The flag gets cleared,
when the device state changes to activated (the following activation)
or unmanaged.
This is an ugly solution, for several reasons.
For one, NMDevice tracks its ports in the "slaves" list. But what
it does is ugly. There is no clear concept to understand what it
actually tacks. For example, it tracks externally added interfaces
(nm_device_sys_iface_state_is_external()) that are attached while
not being connected. But it also tracks interfaces that we want to attach
during activation (but which are not yet actually enslaved). It also tracks
slaves that have no actual netdev device (OVS). So it's not clear what this
list contains and what it should contain at any point in time. When we skip
the change of the slaves states during nm_device_master_release_slaves_all(),
it's not really clear what the effects are. It's ugly, but probably correct
enough. What would be better, if we had a clear purpose of what the
lists (or several lists) mean. E.g. a list of all ports that are
currently, physically attached vs. a list of ports we want to attach vs.
a list of OVS slaves that have no actual netdev device.
Another problem is that we attach state on the device
("activation_state_preserve_external_ports"), which should linger there
during the deactivation and reactivation. How can we be sure that we don't
leave that flag dangling there, and that the desired following activation
is the one we cared about? If the follow-up activation fails short (e.g. an
unmanaged command comes first), will we properly disconnect the slaves?
Should we even? In practice, it might be correct enough.
Also, we only implement this for bridges. I think this is where it makes
the most sense. And after all, it's an odd thing to preserve unknown,
external things during a rollback -- unknown, because we have no knowledge
about why these ports are attached and what to do with them.
Also, the change doesn't remember the ports that were attached when the
checkpoint was created. Instead, we preserve all ports that are attached
during rollback. That seems more useful and easier to implement. So we
don't actually rollback to the configuration when the checkpoint was
created. Instead, we rollback, but keep external devices.
Also, we do this now by default and introduce a flag to get the previous
behavior.
https://bugzilla.redhat.com/show_bug.cgi?id=2035519https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/ # 909
The kernel may add a reason for hardware rfkill. Make the NetworkManager
able eto fetch it and parse it.
For now, no action will be taken upon the new reasons.
The different reasons that the kernel can expose are either the radio
was switched off by a hardware rfkill switch. This reason is adveritsed
by bit 0 in the bitmap returned by RFKILL_STATE_REASON udev property.
This is the rfkill that existed until now.
The new reason is mapped to bit 1 and teaches the user space that the
wifi device is currently used by the CSME firmware on the platform. In
that case, the NetworkManager can ask CSME (through the iwlmei kernel
module) what BSSID the CSME firmware is associated to. Once the
NetworkManager gets to the conclusion is has the credentials to connect
to that very same AP, it can request the wifi device and the CSME
firmware will allow the host to take the ownership on the device. CSME
will give 3 seconds to the host to get an IP or it'll take the device
back. In order to complete all the process until we get the DHCP ACK
within 3 seconds, the NetworkManager will need to optimize the scan and
limit the scan to that specific BSSID on that specific channel.
All this flow is not implemented yet, but the first step is to identify
that the device is not owned by the host.
The signal parameters are G_TYPE_UINT. We should not assume that our
enums are a compatible integer type.
In practice of course they always were. Let's just be clear about when
we have an integer and when we have an enum.
Naming is important. Especially when we have a 8k LOC monster, that
manages everything.
Rename things related to Rfkill to give them a common prefix.
Also, move code around so it's beside each other.
RadioState contained both the mutable user-data and meta-data about the
radio state. The latter is immutable. The parts that cannot change, should
be separate from those that can change. Move them to a separate array.
Of course, we only have on NMManager instance, so having these fields embedded
in NMManagerPrivate did not waste memory. This change is done, because immutable
fields should be const. In this case, they are now const global data, which
is even protected by the linker from accidental mutation.
Names in header files should have an "nm" prefix. We do that pretty
consistently. Fix the offenders RfKillState and RfKillType.
Also, rename the RfKillState enums to follow the type name. For example,
NM_RFKILL_STATE_SOFT_BLOCKED instead of RFKILL_SOFT_BLOCKED.
Also, when we camel-case a typedef (NMRfKillState) we would want that
the lower-case names use underscore between the words. So it should be
`nm_rf_kill_state_to_string()`. But that looks awkward. So the right solution
here is to also rename "RfKill" to "Rfkill". That make is consistent
with the spelling of the existing `NMRfkillManager` type and the
`nm-rfkill-manager.h` file.
- use "const RadioState" where possible
- use "bool" bitfields in RadioState (boolean flags in structs
should be `bool` types for consistency) and order fields by
alignment.
- break lines for variable declaration in manager_rfkill_update_one_type().
- return (and nm_assert()) from update_rstate_from_rfkill(). By
not adding a default case, compiler would warn if we forget to
handle an enum value. We can easily do that, by just returning,
and let the "default" case be handled by nm_assert_not_reached()
-- which unlike g_warn_if_reached() compiles to nothing in
release build.
- add nm_assert() that `priv->radio_states[rtype]` is not out of range.
- use designated initializers for priv->radio_states[].
When the NM_UNMANAGED_PLATFORM_INIT flag is cleared last in
device_link_changed(), a recheck-assume is scheduled and then the
device goes immediately to UNAVAILABLE. During the state transition,
addresses and routes are removed from the interface. Then,
recheck-assume finds that the device can be assumed but it's too late
since the device was already deconfigured.
This is a problem as the whole point of assuming a device is to
activate a connection while leaving the device untouched.
In the NMCI "dracut_NM_vlan_over_bridge and dracut_NM_vlan_over_bond"
test, NM in real root tries to assume a vlan device that was activated
in initrd. When the interface gets deconfigured in UNAVAILABLE, the
connection to the NFS server breaks and the rootfs becomes
inaccessible.
The fix to this problem is to delay state transitions in
device_link_changed() to a idle handler, so that recheck-assume can
run before.
Fixes-test: @dracut_NM_vlan_over_bridge
Fixes-test: @dracut_NM_vlan_over_bond
https://bugzilla.redhat.com/show_bug.cgi?id=2047302
We have at least static and transient hostnames. Let's be clear which
one we are talking about.
Note that also NM_SETTINGS_HOSTNAME gets renamed to
NM_SETTINGS_STATIC_HOSTNAME, because it seems clearer.
The only purpose of NM_SETTINGS_STATIC_HOSTNAME is to be the backing
property for the "Hostname" D-Bus property for the NMDBusObject glue.
So, while the new name makes more sense to me, it's now also
inconsistent with it's primary use (the D-Bus property). Still...
When the device gets realized, similar to the situation that the device
is unmanaged by platform-init, if the device is still unmanaged by
parent and we clear the assume state. Then, when the device becomes
managed, NM is not able to properly assume the device using the UUID.
Therefore, we should not clear the assume state if the device has only
the NM_UNMANAGED_PLATFORM_INIT or the NM_UNMANAGED_PARENT flag set
in the unmanaged flags.
The previous commit 3c4450aa4d ('core: don't reset assume state too
early') did something similar for NM_UNMANAGED_PLATFORM_INIT flag only.
We use clang-format for automatic formatting of our source files.
Since clang-format is actively maintained software, the actual
formatting depends on the used version of clang-format. That is
unfortunate and painful, but really unavoidable unless clang-format
would be strictly bug-compatible.
So the version that we must use is from the current Fedora release, which
is also tested by our gitlab-ci. Previously, we were using Fedora 34 with
clang-tools-extra-12.0.1-1.fc34.x86_64.
As Fedora 35 comes along, we need to update our formatting as Fedora 35
comes with version "13.0.0~rc1-1.fc35".
An alternative would be to freeze on version 12, but that has different
problems (like, it's cumbersome to rebuild clang 12 on Fedora 35 and it
would be cumbersome for our developers which are on Fedora 35 to use a
clang that they cannot easily install).
The (differently painful) solution is to reformat from time to time, as we
switch to a new Fedora (and thus clang) version.
Usually we would expect that such a reformatting brings minor changes.
But this time, the changes are huge. That is mentioned in the release
notes [1] as
Makes PointerAligment: Right working with AlignConsecutiveDeclarations. (Fixes https://llvm.org/PR27353)
[1] https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html#clang-format
Completely rework IP configuration in the daemon. Use NML3Cfg as layer 3
manager for the IP configuration of an interface. Use NML3ConfigData as
pieces of configuration that the various components collect and
configure. NMDevice is managing most of the IP configuration at a higher
level, that is, it starts DHCP and other IP methods. Rework the state
handling there.
This is a huge rework of how NetworkManager daemon handles IP
configuration. Some fallout is to be expected.
It appears the patch deletes many lines of code. That is not accurate, because
you also have to count the files `src/core/nm-l3*`, which were unused previously.
Co-authored-by: Beniamino Galvani <bgalvani@redhat.com>
The two nm-sudo helper functions are only used by the OVS device plugin,
but they are part of NetworkManager core binary.
This is done commonly, where the NetworkManager binary has symbols that
are used by the plugins. But in this case, NetworkManager itself doesn't
use the symbols. That will case the linker to drop them.
A previous solution for that was commit 684f2acffe ('build: add way to
keep unused symbols when linking NetworkManager'), but that doesn't seem
to work with clang 3.4 (rhel-7).
Instead, actually use the symbol so that it cannot be dropped.
Naming is important, because the name of a thing should give you a good
idea what it does. Also, to find a thing, it needs a good name in the
first place. But naming is also hard.
Historically, some strv helper API was named as nm_utils_strv_*(),
and some API had a leading underscore (as it is internal API).
This was all inconsistent. Do some renaming and try to unify things.
We get rid of the leading underscore if this is just a regular
(internal) helper. But not for example from _nm_strv_find_first(),
because that is the implementation of nm_strv_find_first().
- _nm_utils_strv_cleanup() -> nm_strv_cleanup()
- _nm_utils_strv_cleanup_const() -> nm_strv_cleanup_const()
- _nm_utils_strv_cmp_n() -> _nm_strv_cmp_n()
- _nm_utils_strv_dup() -> _nm_strv_dup()
- _nm_utils_strv_dup_packed() -> _nm_strv_dup_packed()
- _nm_utils_strv_find_first() -> _nm_strv_find_first()
- _nm_utils_strv_sort() -> _nm_strv_sort()
- _nm_utils_strv_to_ptrarray() -> nm_strv_to_ptrarray()
- _nm_utils_strv_to_slist() -> nm_strv_to_gslist()
- nm_utils_strv_cmp_n() -> nm_strv_cmp_n()
- nm_utils_strv_dup() -> nm_strv_dup()
- nm_utils_strv_dup_packed() -> nm_strv_dup_packed()
- nm_utils_strv_dup_shallow_maybe_a() -> nm_strv_dup_shallow_maybe_a()
- nm_utils_strv_equal() -> nm_strv_equal()
- nm_utils_strv_find_binary_search() -> nm_strv_find_binary_search()
- nm_utils_strv_find_first() -> nm_strv_find_first()
- nm_utils_strv_make_deep_copied() -> nm_strv_make_deep_copied()
- nm_utils_strv_make_deep_copied_n() -> nm_strv_make_deep_copied_n()
- nm_utils_strv_make_deep_copied_nonnull() -> nm_strv_make_deep_copied_nonnull()
- nm_utils_strv_sort() -> nm_strv_sort()
Note that no names are swapped and none of the new names existed
previously. That means, all the new names are really new, which
simplifies to find errors due to this larger refactoring. E.g. if
you backport a patch from after this change to an old branch, you'll
get a compiler error and notice that something is missing.
Add a new 'keep-configuration' device option, set to 'yes' by
default. When set to 'no', on startup NetworkManager ignores that the
interface is pre-configured and doesn't try to keep its
configuration. Instead, it activates one of the persistent
connections.