Commit graph

16781 commits

Author SHA1 Message Date
Thomas Haller
87b46e1663
core: improve handling for blocking autoconnect
Cleanup logging to always print a "block-autoconnect:" prefix to related
lines. Also, make sure that everywhere where the state changes, a line
gets logged. Also, for devconf data print both the interface and the
profile.
2023-05-04 10:34:12 +02:00
Thomas Haller
fc624b8de8
core: assert for valid blocked reasons in autoconnect code
We only have a few blocked reasons. Some of them can be only set on the
devcon data, and some only on the settings connection. Assert that we
don't mix that up.
2023-05-04 10:34:12 +02:00
Fernando Fernandez Mancera
2f0571f193 bonding: add support to prio property in bond ports
Add per port priority support for bond active port re-selection during
failover. A higher number means a higher priority in selection. The
primary port still has the highest priority. This option is only
compatible with active-backup, balance-tlb and balance-alb modes.
2023-05-03 10:44:06 +02:00
Fernando Fernandez Mancera
e200b16291 platform: add support to prio property in bond ports 2023-05-03 10:43:58 +02:00
Fernando Fernandez Mancera
bb435674b5 platform: add netlink support for bond port options
sysfs is deprecated and kernel will not add new bond port options to
sysfs. Netlink is a stable API and therefore is the right method to
communicate with kernel in order to set the link options.
2023-05-03 09:55:45 +02:00
Fernando Fernandez Mancera
762cd06ffa libnm: fix ifcfg variable documentation at queue-id property
The correct variable for queue-id in ifcfg is BOND_PORT_QUEUE_ID.
2023-05-03 09:55:45 +02:00
Thomas Haller
d3b5496362
firewall: create "dynamic" sets for nft rules for slb-bonding
A workaround for a nftables issue ([1]). I don't know why that matters.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2177667

Fixes: e9268e3924 ('firewall: add mlag firewall utils for multi chassis link aggregation (MLAG) for bonding-slb')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1614
2023-05-03 08:12:15 +02:00
Thomas Haller
db3da65c6c
dns: refactor domain_is_valid() to combine #if blocks 2023-05-02 11:42:55 +02:00
Thomas Haller
4ddbf32f1b
dns/trivial: rename check_public_suffix parameter of domain_is_valid()
Names are important. The previous name was counter intuitive for what
the behavior was.
2023-05-02 11:42:49 +02:00
Thomas Haller
601605dbea
dns: use NM_STR_HAS_SUFFIX() instead of g_str_has_suffix()
It translates to a plain memcmp() as the argument is a string literal.
2023-05-02 11:40:34 +02:00
Thomas Haller
b4338de984
dns: fix logging for resetting the host-domain
The previous logging happened, when the value did not change. Log
instead, when the value changes.

Fixes: 86bb09c93b ('dns: generate correct search domain for hostnames on non-public TLD')
2023-05-02 11:40:33 +02:00
Tom Sobczynski
86bb09c93b
dns: generate correct search domain for hostnames on non-public TLD
dns-manager uses the Mozilla Public Suffix List to determine an
appropriate search domain when generating /etc/resolv.conf. It is
presumed that if the hostname is "example.com", the user does not want
to automatically search "com" for unqualified hostnames, which is
reasonable.  To implement that, prior to the fix, domain_is_valid()
implicitly used the PSL "prevailing star rule", which had the
consequence of assuming that any top-level domain (TLD) is public
whether it is on the official suffix list or not. That meant
"example.local" or "example.localdomain" would not result in searching
"local" or "localdomain" respectively, but rather /etc/resolv.conf would
contain the full hostname "example.local" as the search domain and not
give users what they expect.  The fix here uses the newer PSL API
function that allows us to turn off the "prevailing star rule" so that
"local" and "localdomain" are NOT considered public TLDs because they
are not literally on the suffix list. That in turn gives us the search
domain "local" or "localdomain" in /etc/resolv.conf and allows
unqualified hostname lookups "e.g., resolvectl query example" to find
example.local while example.com still maintains the previous behavior
(i.e., search domain of "example.com" rather than "com").

[thaller@redhat.com: reworded commit message]

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1281

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1613
2023-05-02 11:23:09 +02:00
Thomas Haller
cb6f8b987c
all: fix various wrong "return FALSE" for returning pointers 2023-05-02 08:37:20 +02:00
Thomas Haller
6428ee04a8
systemd: define ENABLE_GSHADOW to zero
To be consistent with other defines.
2023-05-02 08:36:37 +02:00
Marc Muehlfeld
0cb43c9e42
man: rewrite ipv4.method and ipv6.method man page descriptions
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1275

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1611
2023-04-27 09:09:11 +02:00
Thomas Haller
b48c314328
core: simplify tracking of delete_on_deactivate idle action
Before commit a42682d44f ('device: take reference to device object
before 'delete_on_deactivate''), we used a weak pointer to track the
idle action.

As we now use a strong reference, we can store all data about the idle
action in NMDevice itself. Drop DeleteOnDeactivateData.
2023-04-27 08:40:12 +02:00
Thomas Haller
aede228974
core: assert that devices are not registered when disposing NMPolicy
NMDevice holds a reference to NMManager, which holds a reference to NMPolicy.
It is not possible that we try to dispose NMPolicy while there are still devices
registered. That would be a bug, that we need to find and solve
differently. Add an assertion instead of trying to handle it.
2023-04-27 08:40:12 +02:00
Thomas Haller
0dd4724446
core: don't take reference on NMDevice to track auto-activate
Add an assertion to nm_policy_device_recheck_auto_activate_schedule(),
that the device is currently registered in NMPolicy. Calling it outside
would be odd, and likely a bug.

But if we only register the auto-activate while being registered, we
don't need to take an additional reference. We know that the object must
be be alive (also, we have assertions that in fact it is still alive).
2023-04-27 08:40:12 +02:00
Thomas Haller
a22e5080a0
core: rework tracking of auto-activating devices in NMPolicy
Hook the information for tracking the activation of a device, to the
NMDevice itself. Sure, that slightly couples the NMPolicy closer to
NMDevice, but the result is still simpler code because we don't need a
separate ActivateData.

It also means we can immediately tell whether the auto activation check
for NMDevice is already scheduled and don't need to search through the
list.
2023-04-27 08:40:12 +02:00
Thomas Haller
520fcc8667
core: add nm_manager_get_policy() accessor
NMPolicy really should be merged into NMManager. It has not a clear responsiblity
so that there are two separate objects only makes things confusing. Anyway. It
is permissible to look up the NMPolicy instance of a NMManager. Add an accessor.
2023-04-27 08:40:12 +02:00
Thomas Haller
a81925ad32
core: call nm_manager_device_recheck_auto_activate_schedule() from "nm-manager.c"
No need to call down to the device, to call back up to the NMManager.
2023-04-27 08:40:12 +02:00
Thomas Haller
751b927cf2
core: rename nm_device_emit_recheck_auto_activate() to nm_device_recheck_auto_activate_schedule()
It's the better name. Especially since there is no more signal involved,
the term "emit" doesn't match.

Note also how the previous approach using a signal tried to abstract
what is happening. So we were no longer rechecking-autoconnect, instead,
we were emitting-a-signal-to-recheck-autoconnect. Just be plain about
what it is doing and don't go through a layer of signal.
2023-04-27 08:40:12 +02:00
Thomas Haller
3c59c6b393
core: drop NM_DEVICE_RECHECK_AUTO_ACTIVATE signal and call policy directly
GObject signals don't make the code easier to understand, on the
contrary.  They may have their purpose, when objects truly must/should
not be aware of each other, and need to be composed very loosely. That
is not the case here.

There really is only one subscriber to NM_DEVICE_RECHECK_AUTO_ACTIVATE
signal, and it only makes sense this way. Instead of going through a
signal invocation, just call the well known method directly. It becomes
clearer who calls this code (and it has a lower overhead).

When using cscope/ctags it also is easier to follow the code because the
tools understand function calls.
2023-04-27 08:35:28 +02:00
Thomas Haller
aa2569a9cd
core: use GSource for tracking reset_connections_retries idle action
The numeric source IDs are discouraged. Use a GSource instead.
2023-04-27 08:35:28 +02:00
Thomas Haller
1559c37b9f
core: use GSource for tracking _device_recheck_auto_activate_all_cb idle action
The numeric source IDs are discouraged. Use a GSource instead.
2023-04-27 08:35:28 +02:00
Thomas Haller
886786ee0b
core: rename internal function nm_policy_device_recheck_auto_activate_all_schedule()
The "all" variant is strongly related to nm_policy_device_recheck_auto_activate_schedule().
Rename, so that the names express that better.
2023-04-27 08:35:28 +02:00
Thomas Haller
f1c15f0ae7
core: expose and rename nm_policy_device_recheck_auto_activate_schedule()
Let's simplify this part of the code. This is the first step.
2023-04-27 08:35:27 +02:00
Thomas Haller
49c1e01519
core: don't trigger recheck to auto activate for deleted devices
The delete_on_deactivate_link_delete() handler may be called after the
device was already removed from NMManager. Don't allow that.

Check whether the device is still exported on D-Bus as indication.
2023-04-27 08:35:27 +02:00
Thomas Haller
e699dff46a
device: trigger a recheck to autoconnect when unrealizing ovs-interface
NM_reboot_openvswitch_vlan_configuration_var2 test exposes a race. What
the test does, is to create OVS profiles and repeatedly restart
NetworkManager, checking that those profiles autoconnect and the OVS
configuration gets created.

There is a race, where:

- the OVS interface exists, and an NMDeviceOvsInterface is created
- first ovsdb cleans up old interfaces, sending a json request.
- OVS deletes the interface, and NetworkManager first picks up the
  platform signal (there is a race here, usually the ovsdb request
  completes first, which will cleanup the NMDeviceOvsInterface in
  a different way).
- when the device gets unrealized, we don't schedule a
  check-autoactivate, so the device stays down.

See https://bugzilla.redhat.com/show_bug.cgi?id=2152864#c5 for a log
file with more details.

What should instead happen, is to autoactivate the OVS interface, which
then also fully configures the port and bridge interfaces.

Explicitly schedule an autoactivate when unrealizing devices.

Note that there are now several cases, where NetworkManager autoconnects
more eagerly. This even affects some CI tests and user-visible behavior.
But I think relying on "just don't call nm_device_emit_recheck_auto_activate()
to hope that autoconnect doesn't happen is wrong. It must always be
possible to trigger an autoconnect check, and the right thing must
happen. We only don't trigger autoconnect checks *all* the time, because
it would be a waste of CPU resources, but whenever we slightly suspect
that an autoconnect may happen, we must be allowed to trigger a check.
If a device is in a condition where it previously did not autoconnect,
and it also *should* not autoconnect, then we need to fix the code that
evaluates whether an autoconnect may happen (not avoid triggering a
check).

https://bugzilla.redhat.com/show_bug.cgi?id=2152864
Fixes-test: @NM_reboot_openvswitch_vlan_configuration_var2
2023-04-26 17:11:52 +02:00
Thomas Haller
14d429dd17
device: block autoconnect of profile when deleting device
Currently, when we delete a device then autoconnect does not kick in
right away. But that is only, because we happen not to schedule a
"autoactivate" recheck.

What should be happen, is that rechecking whether to autoconnect is
always allowed, and that we have the necessary state to know that
autoconnect currently should not work.

Instead, block autoconnect of the involved profile. That makes sense,
because clearly we don't want to autoconnect right again after `nmcli
device delete $iface`.
2023-04-26 11:05:18 +02:00
Thomas Haller
c68cbcb8fa
device: minor cleanup of code path in delete_cb() 2023-04-26 11:05:18 +02:00
Thomas Haller
7deea767d3
core: use NMStrBuf in nm_utils_stable_id_parse() 2023-04-21 12:51:43 +02:00
Thomas Haller
21cf2dc58f
libnm,core: make "default${CONNECTION}" the built-in stable ID
The "connection.stable-id" supports placeholders like "${CONNECTION}" or
"${DEVICE}".

The stable-id can also be specified in global connection defaults in
NetworkManager.conf, by leaving it unset in the profile. Global
connection defaults always follow the pattern, that they correspond to a
per-profile property, and only when the per-profile value indicates a
special default/unset value, the global connection default is consulted.
Finally, if the global connection default is also not configured in
NetworkManager.conf, a built-in default is used (which may not be
constant either, for example ipv6.ip6-privacy's built-in default depends
on a sysctl value).

In any case, every possible configuration that can be achieved should be
configurable both per-profile and via global connection default. That
was not given for the stable-id, because the built-in default generated
an ID in a way that could not be explicitly expressed otherwise.

So you could not:
- explicitly set the per-profile value to the built-in default, to avoid
  that the global-connection-default overwrites it.
- explicitly set the global-connection-default to the built-in default,
  to avoid that a lower priority [connection*] section overwrites the
  stable-id again.

Fix that inconsistency to make it possible to explicitly set the
built-in default.

Change behavior for literally "default${CONNECTION}" and make it behave
as the built-in default. Also document that the built-in default has that
value.

It's unlikely that this breaks an existing configuration, but of course,
if any user configured "connection.stable-id=default${CONNECTION}", then
the behavior changes for them.
2023-04-21 12:49:18 +02:00
Beniamino Galvani
cab80c5129 device: emit dhcp-change dispatcher event also after a lease renewal
Fixes: e1648d0665 ('core: commit l3cd asynchronously on DHCP bound event')
Co-authored-by: Thomas Haller <thaller@redhat.com>

https://bugzilla.redhat.com/show_bug.cgi?id=2179537
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1609
2023-04-18 18:18:55 +02:00
Thomas Haller
afe80171b2
ppp: move ppp code to "nm-pppd-compat.c"
Using the ppp code is rather ugly.

Historically, the pppd headers don't follow a good naming convention,
and define things that cause conflicts with our headers:

  /usr/include/pppd/patchlevel.h:#define VERSION          "2.4.9"
  /usr/include/pppd/pppd.h:typedef unsigned char  bool;

Hence we had to include the pppd headers in certain order, and be
careful.

ppp 2.5 changes API and cleans that up. But since we need to support
also old versions, it does not immediately simplify anything.

Only include "pppd" headers in "nm-pppd-compat.c" and expose a wrapper
API from "nm-pppd-compat.h". The purpose is that "nm-pppd-compat.h"
exposes clean names, while all the handling of ppp is in the source
file.
2023-04-17 18:27:50 +02:00
Eivind Næss
8469c09a50
ppp, adding support for compiling against pppd-2.5.0
This change does the following
* Adding in nm-pppd-compat.h to mask details regarding different
  versions of pppd.
* Fix the nm-pppd-plugin.c regarding differences in API between
  2.4.9 (current) and latet pppd 2.5.0 in master branch
* Additional fixes to the configure.ac to appropriately set defines used
  for compilation
2023-04-16 21:05:07 +02:00
Thomas Haller
290bac0af9
libnm: fix annotation for out_is_valid of nm_wireguard_peer_get_allowed_ip()
Fixes: 5d28a0dd89 ('doc: replace all (allow-none) annotations by (optional) and/or (nullable)')
2023-04-16 16:49:37 +02:00
Beniamino Galvani
89a8f51235 device: stop activation when queueing the unmanaged state
When the unmanaged state is queued, we must ensure that the current
activation doesn't overwrite the queue stated with a new one. This can
happen for example if a dispatcher script or a firewall call
terminate, or if the next activation stage is dispatched.

Fixes-test: @preserve_master_and_ip_settings
https://bugzilla.redhat.com/show_bug.cgi?id=2178269
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1599
2023-04-11 09:19:03 +02:00
Thomas Haller
640c82710f
platform/tests: fix unit test creating ip6gre tunnel with old iproute2
Older versions of iproute2 don't support the "enclimit" argument. Work
around that from the unit tests.

Fixes: 1505ca3626 ('platform/tests: ip6gre & ip6gretap test cases (ip6 tunnel flags)')
2023-04-07 17:25:26 +02:00
Beniamino Galvani
24461954d0 dhcp: reset IPv6 DAD flag on lease update
If the client was waiting for IPv6 DAD to complete and the lease was
updated or lost, `wait_ipv6_dad` needs to be cleared; otherwise, at
the next platform change the client will try to evaluate the DAD state
with a different or no lease. In particular if there is no lease the
client will try to decline it because there are no valid addresses,
leading to an assertion failure:

 ../src/core/dhcp/nm-dhcp-client.c:997:_dhcp_client_decline: assertion failed: (l3cd)

Backtrace:

  __GI_raise ()
  __GI_abort ()
  g_assertion_message ()
  g_assertion_message_expr ()
  _dhcp_client_decline (self=0x1af13b0, l3cd=0x0, error_message=0x8e25e1 "DAD failed", error=0x7ffec2c45cb0) at ../src/core/dhcp/nm-dhcp-client.c:997
  l3_cfg_notify_cb (l3cfg=0x1bc47f0, notify_data=0x7ffec2c46c60, self=0x1af13b0) at ../src/core/dhcp/nm-dhcp-client.c:1190
  g_closure_invoke ()
  g_signal_emit_valist ()
  g_signal_emit ()
  _nm_l3cfg_emit_signal_notify () at ../src/core/nm-l3cfg.c:629
  _nm_l3cfg_notify_platform_change_on_idle () at ../src/core/nm-l3cfg.c:1390
  _platform_signal_on_idle_cb () at ../src/core/nm-netns.c:411
  g_idle_dispatch ()

Fixes: 393bc628ff ('dhcp: wait DAD completion for DHCPv6 addresses')

https://bugzilla.redhat.com/show_bug.cgi?id=2179890
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1594
2023-04-06 15:56:59 +02:00
Thomas Haller
3a76d717da
ovsdb: debug log all messages of socket buffer 2023-04-04 08:58:06 +02:00
Thomas Haller
0ee60b943d
ovsdb: downgrade error logging to warnings
<error> is mostly about "really should not happen" scenarios. It's
closer to an assertion failure, and something that NetworkManager should
not happen.

Of course, things can go wrong, but <warn> is a sufficient. When ovsdb
gives unexpected communication, it's just a warning. At least, that's
also what all the similar cases in "nm-ovsdb.c" already do
2023-04-04 08:43:21 +02:00
Thomas Haller
25c97817d2
ovsdb: limit maxiumum data size for receive buffer from ovsdb 2023-04-04 08:43:21 +02:00
Thomas Haller
f7d321c6d6
ovsdb: add watchdog for unparsable JSON data in socket 2023-04-04 08:43:21 +02:00
Thomas Haller
7e12d437fe
ovsdb: use the FD directly instead of GSocketConnection/GOutputStream
GSocketConnection/GOutputStream/GInputStream seems rather unnecessary.
Maybe they make sense when you want to write portable code (for
Windows). Otherwise, watching a file descriptor and reading/writing it
directly is simpler (and also more efficient).

For example, we passed no GCancellable to g_input_stream_read_async().
What does that mean w.r.t. destroying the NMOvsdb instance? I suspect
it's wrong, but it's hard to say, because there are so many layers of
code.

Note that we anyway keep state in NMOvsdb, namely the data we want to
send (output_buf) and the data we partially received (input_buf). All we
need, are poll notifications when the file descriptor is ready. To
those, we hook up the read/write callbacks. Also before was the code
async, and there were callbacks when when read/write was done. That does
not simplify the code in any way.

- we no longer use separate NMOvsdbPrivate.buf and NMOvsdbPrivate.input
  buffers. There is just a NMOvsdbPrivate.input_buf that can we can fill
  directly.
2023-04-04 08:43:21 +02:00
Thomas Haller
f862d4bbce
ovsdb: use nm_auto_free cleanup attribute in "nm-ovsdb.c" 2023-04-04 08:43:21 +02:00
Thomas Haller
64825b4f58
ovsdb: don't track buffer offset in NMOvsdb data and refactor parsing JSON messages
The "priv->bufp" offset is only used while parsing a message at a time.
It's unnecessary to track it in NMOvsdbPrivate and keep it between
parsing messages. Tracking the state in NMOvsdbPrivate makes it more
complicated to understand, because one needs to reason at which times
the state is used (when it really is not used).

Also, move the parsing to a separate function.
2023-04-04 08:43:21 +02:00
Thomas Haller
1378ed7d96
core: drop unnecessary initialization in nm_utils_spawn_helper()
We did not initialize "child_stderr". If that were necessary, we would need
to add it too. However, it is clearly not necessary to initialize those fields.
2023-04-04 08:43:21 +02:00
Thomas Haller
ce414933a7
core: use nm_io_fcntl_setfl_update_nonblock() helper 2023-04-04 08:43:21 +02:00
Thomas Haller
f4943e07f1
glib-aux: add nm_io_fcntl_setfl_update_nonblock() helper 2023-04-04 08:43:20 +02:00