This reverts commit 2ad5fbf025.
It is actually a partial revert. The changes to documentation don't need
to be reverted.
Fixes: 2ad5fbf025 ('policy: refresh IPv4 forwarding after connection activation and disconnection')
When we do `nmcli networking off`, the state is shown as "asleep". This
is confusing, and the only reason is that internally we share the code
that handles both situations.
Rename the state to the more generic "disabled", which covers both
causes: sleeping and networking off.
Clients cannot tell the exact reason from the NMState value alone, but
it is better that they show "network off", as this is the most common
reason they will get to display. If the system is suspending, they can
show the state only for a short time, and "network off" is not wrong
because that is what NM has done in response to the suspend.
In the logs, let's make explicit the exact reason why state is changing
to DISABLED: sleeping or networking off.
Logs before:
manager: disable requested (sleeping: no enabled: yes)
manager: NetworkManager state is now ASLEEP
Logs after:
manager: disable requested (sleeping: no enabled: yes)
manager: NetworkManager state is now DISABLED (NETWORKING OFF)
State before:
$ nmcli general
STATE ...
asleep ...
State after:
$ nmcli general
STATE ...
network off ...
Previously, the IPv4 shared method automatically enabled global IPv4
forwarding, which could change all per-interface IPv4 forwarding
settings to match the global setting. Also, the per-interface
forwarding settings could not be restored when deactivating the
shared connection. This is problematic as it may disrupt custom
configurations and lead to inconsistent forwarding behavior across
different network interfaces.
To address this, the implementation now ensures that the original
per-interface forwarding settings are preserved. Upon activating a
shared connection, instead of enabling IPv4 global forwarding
automatically, the per-interface forwarding is enabled on all other
connections unless a connection explicitly has the forwarding set to
"no" in its configuration. Upon deactivating all shared connections,
per-interface forwarding settings are restored to sysctl's default
value. Furthermore, deactivating any connection explicitly sets the
forwarding to sysctl's default value ensuring that network forwarding
behavior remains consistent.
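The behavior described above can be sketched as a small decision function. This is an illustration in Python, not NetworkManager's C code; the function name, its parameters and the boolean model of the sysctl default are assumptions made for the example.

```python
def per_interface_forwarding(shared_active, explicit_no, sysctl_default):
    """Return True if IPv4 forwarding should be enabled on an interface.

    shared_active:  at least one shared connection is currently active
    explicit_no:    the profile on this interface sets forwarding to "no"
    sysctl_default: the per-interface default taken from sysctl
    (all three parameters are hypothetical names for this sketch)
    """
    if shared_active:
        # While sharing, forward on every other interface unless the
        # profile opted out explicitly.
        return not explicit_no
    # No shared connection left: fall back to the sysctl default.
    return sysctl_default
```

The sketch only models the two transitions named in the text (a shared connection being active, and all shared connections being gone); global forwarding is never touched.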
Prevent NetworkManager from trying to determine the
transient hostname via DHCP or other means if "localhost"
is already configured as a static hostname, as the transient
hostname will be ignored by hostnamed if a static hostname
has already been set.
When calling activate_port_or_children_connections() we are unblocking
the ports and children but we are not resetting the number of retries if
it is an internal activation.
This is wrong as even if it's an internal activation the number of
retries should be reset. It won't interfere with other blocking reasons
like USER_REQUESTED or MISSING_SECRETS.
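A minimal sketch of the intended behavior, in Python rather than the project's C. `PortState`, `unblock_port()` and the `FAILED` reason name are assumptions for illustration; only USER_REQUESTED and MISSING_SECRETS are named in the text above.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Blocked(Flag):
    NONE = 0
    FAILED = auto()           # assumed name for the failed-dependency reason
    USER_REQUESTED = auto()
    MISSING_SECRETS = auto()


@dataclass
class PortState:
    retries: int
    blocked: Blocked


def unblock_port(port, default_retries, internal):
    """Unblock a port or child when its controller or parent activates."""
    # Clear only the failure reason; other blocking reasons stay in place.
    port.blocked &= ~Blocked.FAILED
    # `internal` is deliberately not consulted: the retry counter is reset
    # for internal activations too.
    port.retries = default_retries
    return port
```

A port blocked with both FAILED and USER_REQUESTED keeps the USER_REQUESTED block after the call, which is why the reset is safe.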
When the "ipvX.routed-dns" property is set to true, add a route for
each DNS server via the current interface. The feature works in the
following way.
A new routing rule is created ("priority $PRIO not fwmark $MARK lookup
$TABLE") where $PRIO, $MARK and $TABLE are fixed values and are the
same for all interfaces. This rule is evaluated before standard rules
and tries to look up routes in table $TABLE, where NM adds the routes
to DNS servers.
To determine the next-hop to the name server, NM issues a RTM_GETROUTE
netlink request to kernel, specifying to return the route via the
current interface. In order to avoid results from $TABLE, NM also sets
the fwmark as $MARK in the request.
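To see why the request carries the fwmark, the rule evaluation can be modeled in a few lines of Python. This is a toy model, not kernel or NetworkManager code: the `MARK` value, the dictionary tables and `resolve_next_hop()` are assumptions, and the restriction to the current interface is left out.

```python
MARK = 0x1234  # hypothetical; the text only says $MARK is a fixed value


def resolve_next_hop(dest, fwmark, dns_table, main_table):
    """Model the rule "not fwmark $MARK lookup $TABLE" followed by the
    standard rules. Tables map a destination to a next-hop description."""
    # The extra rule matches only when the mark is NOT $MARK.
    if fwmark != MARK and dest in dns_table:
        return dns_table[dest]
    # Otherwise (or when $TABLE has no route) the standard rules apply.
    return main_table.get(dest)
```

Ordinary traffic to the name server (no mark) is answered from $TABLE, while a lookup that sets the mark skips that table and sees the route the rest of the system would use.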
Currently if the system hostname can't be determined, NetworkManager
only retries when something changes: a new address is added, the DHCP
lease changes, etc.
However, it might happen that the current failure in looking up the
hostname is caused by an external factor, like a temporary outage of
the DNS server.
Add a mechanism to retry the resolution with an increasing timeout.
https://issues.redhat.com/browse/RHEL-17972
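One way to picture a retry with an increasing timeout is a capped exponential schedule. The sketch below is in Python and is only an assumption about the shape of such a schedule; the function name, the growth factor and the cap are hypothetical and do not come from this commit.

```python
def retry_delays(initial, factor, maximum, attempts):
    """Return the delays (in seconds) before each retry.

    Each delay is `factor` times the previous one, capped at `maximum`.
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * factor, maximum)
    return delays
```

With `retry_delays(1, 2, 8, 5)` the delays grow as 1, 2, 4, 8 and then stay at the cap of 8, so a temporary DNS outage is retried soon while a persistent one does not cause constant lookups.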
We are currently asserting that the list of devices waiting for
auto-activation in NMPolicy is not empty. This condition is always
false because:
- NMDevice holds a reference to NMManager
- NMManager holds a reference to NMPolicy
- on dispose, NMDevice asserts that it's not in NMPolicy's
auto-activate list
Therefore if there is any NMDevice alive, NMPolicy must be alive as
well. Instead, if there is no NMDevice alive the list must be empty.
The assertion could fail only when the NMPolicy instance gets
disposed, which usually doesn't happen because it's still referenced
at shutdown.
Fixes: aede228974 ('core: assert that devices are not registered when disposing NMPolicy')
Instruct the `NMDnsManager` to emit the `CONFIG_CHANGED` signal even
when `dns=none` is set or when it failed to modify `/etc/resolv.conf`.
The `NMPolicy` will only update the hostname when DNS is managed.
Signed-off-by: Gris Ge <fge@redhat.com>
When we register the auto-activate, the device has to be registered in
NMPolicy; the assertion is correct and ensures that.
This reverts commit 712729f652.
When a port cannot activate because the controller is not ready, it gets
blocked from autoconnect (see commit 725fed01cf ('policy: block
connection from autoconnect in case of failed dependency')).
Later, when the master activates we call activate_slave_connections()
(see commit 32efb87d4d ('core: unblock failed connections when the
master is available')), which unblocks those port profiles so they can
autoconnect.
However, imagine you add a port profile with autoconnect enabled. The
profile tries to autoconnect, finds no master and gets blocked. Then,
add the controller profile with autoconnect disabled. The controller does
not autoactivate, so activate_slave_connections() is never called and the
port profiles stay down.
Fix that by unblocking autoconnect of the ports when the controller
profile changes.
It seems better for readability, because reacting based on the state-reason
is ugly already. This way, we access nm_device_state_reason_check(reason) in
only one place. With the if, it's not immediately obvious that both if/else
branches only switch on the reason too.
Cleanup logging to always print a "block-autoconnect:" prefix to related
lines. Also, make sure that everywhere where the state changes, a line
gets logged. Also, for devconf data print both the interface and the
profile.
We only have a few blocked reasons. Some of them can be only set on the
devcon data, and some only on the settings connection. Assert that we
don't mix that up.
NMDevice holds a reference to NMManager, which holds a reference to NMPolicy.
It is not possible that we try to dispose NMPolicy while there are still devices
registered. That would be a bug that we need to find and solve
differently. Add an assertion instead of trying to handle it.
Add an assertion to nm_policy_device_recheck_auto_activate_schedule(),
that the device is currently registered in NMPolicy. Calling it outside
would be odd, and likely a bug.
But if we only register the auto-activate while being registered, we
don't need to take an additional reference. We know that the object must
be alive (also, we have assertions that it is in fact still alive).
Hook the information for tracking the activation of a device, to the
NMDevice itself. Sure, that slightly couples the NMPolicy closer to
NMDevice, but the result is still simpler code because we don't need a
separate ActivateData.
It also means we can immediately tell whether the auto activation check
for NMDevice is already scheduled and don't need to search through the
list.
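The idea can be sketched as follows, in Python instead of C. `Device`, `Policy`, `schedule_recheck()` and `run_pending()` are hypothetical names for the example; the point is only that the "already scheduled" state lives on the device.

```python
class Device:
    def __init__(self, name):
        self.name = name
        # Tracking state hooked onto the device itself.
        self.auto_activate_scheduled = False


class Policy:
    def __init__(self):
        self.pending = []

    def schedule_recheck(self, device):
        """Schedule an auto-activation check; return False if already queued."""
        if device.auto_activate_scheduled:
            # O(1) answer, no search through self.pending needed.
            return False
        device.auto_activate_scheduled = True
        self.pending.append(device)
        return True

    def run_pending(self):
        """Process all queued devices and clear their scheduled flag."""
        devices, self.pending = self.pending, []
        for device in devices:
            device.auto_activate_scheduled = False
        return [device.name for device in devices]
```

Scheduling the same device twice leaves a single entry in the pending list, and no separate per-activation record is needed.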
GObject signals don't make the code easier to understand, on the
contrary. They may have their purpose, when objects truly must/should
not be aware of each other, and need to be composed very loosely. That
is not the case here.
There really is only one subscriber to the NM_DEVICE_RECHECK_AUTO_ACTIVATE
signal, and it only makes sense this way. Instead of going through a
signal invocation, just call the well known method directly. It becomes
clearer who calls this code (and it has a lower overhead).
When using cscope/ctags it also is easier to follow the code because the
tools understand function calls.
Don't try to block a device/connection pair when the connection was
removed. Doing so would create a new devcon entry associated with the
connection that is being deleted.
Fixes: b73b34c3ee ('policy: track autoconnect retries per Device x Connection')
Autoconnect retries are no longer tracked per connection. They are now
tracked per Device x Connection. In addition, autoconnect might be
blocked for the connection due to missing secrets or a user request.
All the properties tracking the retries and blocked time were moved to
DevConData, along with the functions that manipulate them. In NMPolicy the
logic didn't change very much. Instead of looking into the connection
when the device failed activation, it looks for DevConData.
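A rough Python sketch of per-pair tracking follows. Only the name DevConData comes from the text; `AutoconnectTracker`, its methods and the field names are assumptions, and the real data structure in the C code may differ.

```python
from dataclasses import dataclass


@dataclass
class DevConData:
    retries: int                # remaining autoconnect attempts for this pair
    blocked_until: float = 0.0  # blocked time; unused in this sketch


class AutoconnectTracker:
    def __init__(self, default_retries):
        self.default_retries = default_retries
        self._data = {}  # keyed by (device, connection)

    def get(self, device, connection):
        """Return the entry for the pair, creating it on first use."""
        key = (device, connection)
        if key not in self._data:
            self._data[key] = DevConData(retries=self.default_retries)
        return self._data[key]

    def activation_failed(self, device, connection):
        """Consume one retry for this pair and return how many remain."""
        data = self.get(device, connection)
        if data.retries > 0:
            data.retries -= 1
        return data.retries
```

Because the key is the pair, a profile that exhausts its retries on one device still has its full budget on another device.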
Improve logging:
- log only when something changes
- print the new resolver state, instead of the old one
- rename state "in-progress" to "started"
- log when the resolver state is reset due to DNS changes
With multi-connect enabled, this can cause infinite retries to autoconnect,
see [1].
That has bad consequences for example in initrd, where
nm-wait-online-initrd.service would wait up to one hour before failing
and blocking boot.
This reverts commit 1656d82045.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2039734#c5
Fixes: 1656d82045 ('policy: track the autoconnect retries in devices for multi-connect')
The warning "-Wcast-align=strict" seems useful and will be enabled
next. Fix places that currently cause the warning by using the
new macro NM_CAST_ALIGN(). This macro also nm_assert()s that the alignment
is correct.
In some scenarios, autoconnect should not be blocked if the device is
activated on the external connection (e.g. autoconnect on the loopback
device).
Add the `allow_autoconnect_on_external` flag to support such
behavior.
We soon will handle loopback, so -- if no loopback profile is activated
in NetworkManager -- we will have an externally managed profile on
loopback. This messes up the result.
In general, external connections don't make much sense for
build_device_hostname_infos(). Ignore them.