It's the better name. Especially since there is no more signal involved,
the term "emit" doesn't match.
Note also how the previous approach using a signal tried to abstract
what is happening. So we were no longer rechecking-autoconnect, instead,
we were emitting-a-signal-to-recheck-autoconnect. Just be plain about
what it is doing and don't go through a layer of signal.
GObject signals don't make the code easier to understand, on the
contrary. They may have their purpose, when objects truly must/should
not be aware of each other, and need to be composed very loosely. That
is not the case here.
There really is only one subscriber to NM_DEVICE_RECHECK_AUTO_ACTIVATE
signal, and it only makes sense this way. Instead of going through a
signal invocation, just call the well known method directly. It becomes
clearer who calls this code (and it has a lower overhead).
When using cscope/ctags it also is easier to follow the code because the
tools understand function calls.
The delete_on_deactivate_link_delete() handler may be called after the
device was already removed from NMManager. Don't allow that.
Check whether the device is still exported on D-Bus as indication.
NM_reboot_openvswitch_vlan_configuration_var2 test exposes a race. What
the test does, is to create OVS profiles and repeatedly restart
NetworkManager, checking that those profiles autoconnect and the OVS
configuration gets created.
There is a race, where:
- the OVS interface exists, and an NMDeviceOvsInterface is created
- first ovsdb cleans up old interfaces, sending a json request.
- OVS deletes the interface, and NetworkManager first picks up the
platform signal (there is a race here, usually the ovsdb request
completes first, which will cleanup the NMDeviceOvsInterface in
a different way).
- when the device gets unrealized, we don't schedule a
check-autoactivate, so the device stays down.
See https://bugzilla.redhat.com/show_bug.cgi?id=2152864#c5 for a log
file with more details.
What should instead happen, is to autoactivate the OVS interface, which
then also fully configures the port and bridge interfaces.
Explicitly schedule an autoactivate when unrealizing devices.
Note that there are now several cases, where NetworkManager autoconnects
more eagerly. This even affects some CI tests and user-visible behavior.
But I think relying on "just don't call nm_device_emit_recheck_auto_activate()
to hope that autoconnect doesn't happen is wrong. It must always be
possible to trigger an autoconnect check, and the right thing must
happen. We only don't trigger autoconnect checks *all* the time, because
it would be a waste of CPU resources, but whenever we slightly suspect
that an autoconnect may happen, we must be allowed to trigger a check.
If a device is in a condition where it previously did not autoconnect,
and it also *should* not autoconnect, then we need to fix the code that
evaluates whether an autoconnect may happen (not avoid triggering a
check).
https://bugzilla.redhat.com/show_bug.cgi?id=2152864
Fixes-test: @NM_reboot_openvswitch_vlan_configuration_var2
Currently, when we delete a device then autoconnect does not kick in
right away. But that is only, because we happen not to schedule a
"autoactivate" recheck.
What should be happen, is that rechecking whether to autoconnect is
always allowed, and that we have the necessary state to know that
autoconnect currently should not work.
Instead, block autoconnect of the involved profile. That makes sense,
because clearly we don't want to autoconnect right again after `nmcli
device delete $iface`.
<error> is mostly about "really should not happen" scenarios. It's
closer to an assertion failure, and something that NetworkManager should
not happen.
Of course, things can go wrong, but <warn> is a sufficient. When ovsdb
gives unexpected communication, it's just a warning. At least, that's
also what all the similar cases in "nm-ovsdb.c" already do
GSocketConnection/GOutputStream/GInputStream seems rather unnecessary.
Maybe they make sense when you want to write portable code (for
Windows). Otherwise, watching a file descriptor and reading/writing it
directly is simpler (and also more efficient).
For example, we passed no GCancellable to g_input_stream_read_async().
What does that mean w.r.t. destroying the NMOvsdb instance? I suspect
it's wrong, but it's hard to say, because there are so many layers of
code.
Note that we anyway keep state in NMOvsdb, namely the data we want to
send (output_buf) and the data we partially received (input_buf). All we
need, are poll notifications when the file descriptor is ready. To
those, we hook up the read/write callbacks. Also before was the code
async, and there were callbacks when when read/write was done. That does
not simplify the code in any way.
- we no longer use separate NMOvsdbPrivate.buf and NMOvsdbPrivate.input
buffers. There is just a NMOvsdbPrivate.input_buf that can we can fill
directly.
The "priv->bufp" offset is only used while parsing a message at a time.
It's unnecessary to track it in NMOvsdbPrivate and keep it between
parsing messages. Tracking the state in NMOvsdbPrivate makes it more
complicated to understand, because one needs to reason at which times
the state is used (when it really is not used).
Also, move the parsing to a separate function.
The device shouldn't change state from DEACTIVATING to DISCONNECTED
until its detached from its controller; otherwise, the port detach
that is in progress can conflict with the following activation.
This changes the signature of detach_port() to be asynchronous,
similarly to attach_port(). The implementation can return TRUE/FALSE
on immediate completion.
Current implementations return immediately and so there is no change
in behavior for now.
- drop annotations from "@error" which has defaults.
- ensure all annotations are on the same line. That's useful
when searching for an annotation, to find the line that specifies
the argument name.
- convert a few plain docs into gtkdoc annotations.
If the automatically selected channel for an AP is set as NO-IR in the
current regulatory domain, the hotspot connection will fail to
start. NO-IR means that any mechanisms that initiate radiation are not
permitted on this channel, this includes sending probe requests or
modes of operation that require beaconing such as AP. Skip channels
with the NO-IR flag.
The sys-iface-state "assume" means to gracefully take over a device (for
example, after a restart). The end result is a fully managed interface.
The flag only has meaning while activating, and for most practical
purposes, such devices should be treated the same as fully activated
ones.
Without this, the MTU is not reset until the device reaches fully
activated state, at which point the sys-iface-state switches from
"assume" to "managed". With the previous commit, at that point we also
schedule an idle commit, which ends up also setting the MTU. Before
that, the MTU was only reset some undefined time later, when we happened
to do another NML3Cfg commit. Nonetheless, even waiting until we reach
fully activated state is wrong. Also during activation, commit the MTU.
I guess, what theoretically could happen is that we get our MTU via
ip-config (like DHCP). Then, during assuming we hit _commit_mtu()
without having the DHCP lease yet. This happens after a restart, so it
would be wrong to first reset the MTU, before we re-receive the DHCP
lease. However, if the MTU is really to be set due via
NM_DEVICE_MTU_SOURCE_IP_CONFIG, then all other MTU sources are also not
in effect (because ip-config has a low priority). In that case, we would
not have an MTU to reset and the code would not commit a new MTU. Thus
this should still be fine, also during activation when we didn't yet get
the DHCP lease (or other information to dynamically set the MTU).
When assuming a device, the NMActiveConnection switches the
sys-iface-state from "assume" to "managed" when the device reaches the
activated state.
<debug> [1679353062.8884] active-connection[000055bd310b92e0]: set state activated (was activating)
<debug> [1679353062.8885] active-connection[000055bd310b92e0]: update activation type from assume to managed
Note that the "assume" state is probably a misfeature, and should be
dropped in favor of more appropriate flags. Meaning, "assume" state for
the most part is very similar to sys-iface-state "managed", and the
cases where (during activation) we need to be graceful, may be better
covered with other (more specialized) state flags. Regardless, for most
practical purposes, sys-iface-state "assume" should be treated similar
to "managed" state.
When we fully activated, we should be sure to do yet another idle
commit. Note that scheduling an idle-commit is something that must
always be allowed to any users of NML3Cfg. The users have no knowledge
about each other and coordinate by registering their commit type
handles. Issuing an idle "auto" commit must be therefore allowed to
them at any time. If that were not the case, then there would be a bug
to fix. The only reason to maybe not do it, is when we are sure there is
nothing to commit and we would want to avoid unnecessary work.
You can easily reproduce this and see that we don't in fact schedule a
commit after becoming managed. A commit usually only happens later, for
example when we receive an autoconf6 update.
This affects for example setting the MTU. Currently, _commit_mtu() bails
out for nm_device_sys_iface_state_is_external_or_assume() and thus
during activation the MTU will not be set. Later, once we reach
activated state, due to this it still is not set right away. This patch
fixes that, although we should also change _commit_mtu() to not bail out
for sys-iface-state "assume".
"wifi.seen-bssids" looks like a regular property, but it is not. Unlike
almost all other properties, it does not contain user configuration,
rather it gets filled by the daemon.
The values are thus stored in "/var/lib/NetworkManager/seen-bssids"
file, and the daemon maintains the values separately from the profile.
Only before exporting the profile on D-Bus, the value gets merged (see
NM_SETTINGS_CONNECTION_GET_PRIVATE(self)->>getsettings_cached and
nm_connection_to_dbus_full().
Hence, looking at nm_setting_wireless_get_num_seen_bssids() is not
working. Fix that.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1253
Fixes: 0f3203338c ('wifi: roam aggressively if we on a multi-AP network')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1577
Previously, there was "temporary-not-available" mechanism in NML3Cfg,
which aimed to handle IPv6 routes with prefsrc. Theoretically, that
mechanism may have been extended to other use-cases, like IPv4 routes
with prefsrc. What it attempted to handle, is the inability to configure
such routes, unless the respective prefsrc address is configured and
non-tentative. However, the address that we are waiting for, could also
be on another interface, so that mechanism wasn't applicable. This is
now replaced by _routes_watch_ip_addrs(). It seems there isn't anything
useful left for the "temporary-not-available" mechanism and it can go,
except...
We want to log a warning when we are unable to configure a route. Also,
in the future we might want to know when the IP configuration is
degradated due to inability to configure the desired routes (a condition
that we might want to expose to the user, not only via logging; or we
may want to react on that).
However, with prefsrc routes we don't know right away whether the
inability to configure the route right away indicates an actual problem,
or whether that will resolve itself (e.g. after the address passes
DAD/ACD, after we received an DHCP lease or after the address was
configured on another interface). Consequently, to know whether the
current inability to configure such a route is a problem, we need to
know the larger context. nm_platform_ip_route_sync() does not have that
context.
Instead, nm_platform_ip_route_sync() needs only do debug log about
failure to configure routes. It will now also return all the failed
routes to NML3Cfg, which can decide whether that is a problem.
This reworks the previous "temporary-not-available" mechanism to track
the state of the failed routes, to eventually decide whether there is an
actual problem (and log about it).
Another problem this solves is that since commit ('platform: always
reconfigure IP routes even if removed externally'), we will eagerly
re-try to configure the same route over and over. We cannot just spam
the log with warnings about the same failure on every commit. We need to
remember that we already logged about the problem and rate limit
warnings otherwise. This is what the new mechanism also achieves.
Indeed, all this is mostly for the sole benefit of logging better
warnings (and not duplicated).
When creating a parent dependent device it can have software device as
parent without an ifindex. In that case, it will fail on an ssertion on
parent being missing.
In order to avoid this, we are handling the situation similar to what we
do for VLAN devices. NetworkManager will raise different error and block
the autoconnection instead of asserting.
This solves the assert error for the following commands:
```
nmcli connection add type macvlan ifname mv1 con-name mv1+ macvlan.parent dummy0 mode vepa
nmcli connection add type dummy ifname dummy0 con-name dummy0+ autoconnect no
```
There were a few places where we did already this but there was one
place where we missed it, in nm-manager.c:do_sleep_wake(). Therefore,
the device end in DISCONNECTED state and did not assume the connection.
When the state is DISCONNECTED is being set from a
configuring/configured state we might want to always DECONFIGURE the
interface (ifindex, ip addresses, routes..) except if the
sys-iface-state is REMOVED in that case we would like to remove it.
This is the IPv6 equivalent of arp_ip_target option. It requires
arp_interval set and allow the user to specify up to 16 IPv6 addresses
as targets. By default, the list is empty.
The valid values for this option are 0 (off) and 1 (on). By default the
value is 1 (on). Please notice that this option is only compatible with
802.3AD mode.
The new arp_missed_max option valid range is 0-255 where value 0 means
not set. Please notice that this option is not compatible with 802.3AD,
balance-tlb and balance-alb modes.
OVS interfaces are special: the kernel link is created only after the
device is attached to the ovs-port, and as with all ports this happens
during stage3(ip-config). That means that the link doesn't exist
during stage2(config); therefore, explicitly update link properties
once the link appears.
We need to set the ethtool and tc properties for assumed devices,
since they go through a normal activation. External devices should not
be touched by NM.
Autoconnect retries are not being tracked by connection anymore. Now it
is tracked per Device x Connection. In addition, autoconnect might be
blocked for the connection due to no secrets or user requested.
All the properties tracking the retries and blocked time were move to
DevConData and the functions to manipulate them aswell. In NMPolicy the
logic didn't change very much. Instead of looking into the connection
when the device failed activation it looks for DevConData.
This setting allows the user to remove the local route rule that is
autogenerated for both IPv4 and IPv6. By default, NetworkManager won't
touch the local route rule.