Commit graph

2030 commits

Author SHA1 Message Date
Thomas Haller
640c82710f
platform/tests: fix unit test creating ip6gre tunnel with old iproute2
Older versions of iproute2 don't support the "enclimit" argument. Work
around that from the unit tests.

Fixes: 1505ca3626 ('platform/tests: ip6gre & ip6gretap test cases (ip6 tunnel flags)')
2023-04-07 17:25:26 +02:00
Beniamino Galvani
24461954d0 dhcp: reset IPv6 DAD flag on lease update
If the client was waiting for IPv6 DAD to complete and the lease was
updated or lost, `wait_ipv6_dad` needs to be cleared; otherwise, at
the next platform change the client will try to evaluate the DAD state
with a different or no lease. In particular if there is no lease the
client will try to decline it because there are no valid addresses,
leading to an assertion failure:

 ../src/core/dhcp/nm-dhcp-client.c:997:_dhcp_client_decline: assertion failed: (l3cd)

Backtrace:

  __GI_raise ()
  __GI_abort ()
  g_assertion_message ()
  g_assertion_message_expr ()
  _dhcp_client_decline (self=0x1af13b0, l3cd=0x0, error_message=0x8e25e1 "DAD failed", error=0x7ffec2c45cb0) at ../src/core/dhcp/nm-dhcp-client.c:997
  l3_cfg_notify_cb (l3cfg=0x1bc47f0, notify_data=0x7ffec2c46c60, self=0x1af13b0) at ../src/core/dhcp/nm-dhcp-client.c:1190
  g_closure_invoke ()
  g_signal_emit_valist ()
  g_signal_emit ()
  _nm_l3cfg_emit_signal_notify () at ../src/core/nm-l3cfg.c:629
  _nm_l3cfg_notify_platform_change_on_idle () at ../src/core/nm-l3cfg.c:1390
  _platform_signal_on_idle_cb () at ../src/core/nm-netns.c:411
  g_idle_dispatch ()

Fixes: 393bc628ff ('dhcp: wait DAD completion for DHCPv6 addresses')

https://bugzilla.redhat.com/show_bug.cgi?id=2179890
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1594
2023-04-06 15:56:59 +02:00
Thomas Haller
3a76d717da
ovsdb: debug log all messages of socket buffer 2023-04-04 08:58:06 +02:00
Thomas Haller
0ee60b943d
ovsdb: downgrade error logging to warnings
<error> is mostly about "really should not happen" scenarios. It's
closer to an assertion failure, and something that NetworkManager should
not happen.

Of course, things can go wrong, but <warn> is a sufficient. When ovsdb
gives unexpected communication, it's just a warning. At least, that's
also what all the similar cases in "nm-ovsdb.c" already do
2023-04-04 08:43:21 +02:00
Thomas Haller
25c97817d2
ovsdb: limit maxiumum data size for receive buffer from ovsdb 2023-04-04 08:43:21 +02:00
Thomas Haller
f7d321c6d6
ovsdb: add watchdog for unparsable JSON data in socket 2023-04-04 08:43:21 +02:00
Thomas Haller
7e12d437fe
ovsdb: use the FD directly instead of GSocketConnection/GOutputStream
GSocketConnection/GOutputStream/GInputStream seems rather unnecessary.
Maybe they make sense when you want to write portable code (for
Windows). Otherwise, watching a file descriptor and reading/writing it
directly is simpler (and also more efficient).

For example, we passed no GCancellable to g_input_stream_read_async().
What does that mean w.r.t. destroying the NMOvsdb instance? I suspect
it's wrong, but it's hard to say, because there are so many layers of
code.

Note that we anyway keep state in NMOvsdb, namely the data we want to
send (output_buf) and the data we partially received (input_buf). All we
need, are poll notifications when the file descriptor is ready. To
those, we hook up the read/write callbacks. Also before was the code
async, and there were callbacks when when read/write was done. That does
not simplify the code in any way.

- we no longer use separate NMOvsdbPrivate.buf and NMOvsdbPrivate.input
  buffers. There is just a NMOvsdbPrivate.input_buf that can we can fill
  directly.
2023-04-04 08:43:21 +02:00
Thomas Haller
f862d4bbce
ovsdb: use nm_auto_free cleanup attribute in "nm-ovsdb.c" 2023-04-04 08:43:21 +02:00
Thomas Haller
64825b4f58
ovsdb: don't track buffer offset in NMOvsdb data and refactor parsing JSON messages
The "priv->bufp" offset is only used while parsing a message at a time.
It's unnecessary to track it in NMOvsdbPrivate and keep it between
parsing messages. Tracking the state in NMOvsdbPrivate makes it more
complicated to understand, because one needs to reason at which times
the state is used (when it really is not used).

Also, move the parsing to a separate function.
2023-04-04 08:43:21 +02:00
Thomas Haller
1378ed7d96
core: drop unnecessary initialization in nm_utils_spawn_helper()
We did not initialize "child_stderr". If that were necessary, we would need
to add it too. However, it is clearly not necessary to initialize those fields.
2023-04-04 08:43:21 +02:00
Thomas Haller
ce414933a7
core: use nm_io_fcntl_setfl_update_nonblock() helper 2023-04-04 08:43:21 +02:00
Thomas Haller
31fd8f60cf
all: use G_SPAWN_CLOEXEC_PIPES with g_spawn_async_with_pipes()
G_SPAWN_CLOEXEC_PIPES is supported since glib 2.40, which we already
depend on.
2023-04-04 08:43:20 +02:00
Thomas Haller
62a85fa845
core: fix setting FD flags in _rfkill_update_system()
F_SETFL will reset the flags. That is wrong, as we only want to add
O_NONBLOCK flag and leaving the other flags alone. Usually, we would
need to call F_GETFL first.

Note that on Linux, F_SETFL can only set certain flags, so the
O_RDWR|O_CLOEXEC flags were unaffected by this. That means, most likely
there are no other flags that our use of F_SETFL would wrongly clear.
Still, it's ugly, because it's not obvious whether there might be other
flags.

Avoid that altogether, by setting the flag already during open().

Fixes: 67e092abcb ('core: better handling of rfkill for WiMAX and WiFi (bgo #629589) (rh #599002)')
2023-04-04 08:43:20 +02:00
Thomas Haller
d1f7e439c6
core: fix setting non-blocking stderr in nm_utils_spawn_helper()
Fixes: d65702803c ('core: print stderr from nm-daemon-helper')
2023-04-04 08:43:20 +02:00
Thomas Haller
fd123315e5
core: fix setting non-blocking FD in nm_utils_spawn_helper()
Fixes: 6ac21ba916 ('core: add infrastructure for spawning a helper process')
2023-04-04 08:43:20 +02:00
Beniamino Galvani
fc13215826 ovs: implement asynchronous detach_port()
Make detach_port() return only after ovsdb reports that the operation
finished.
2023-04-04 08:21:22 +02:00
Beniamino Galvani
07dc237e5c device: wait port detach before leaving the DEACTIVATING state
The device shouldn't change state from DEACTIVATING to DISCONNECTED
until its detached from its controller; otherwise, the port detach
that is in progress can conflict with the following activation.
2023-04-04 08:21:22 +02:00
Beniamino Galvani
82d0fa2a87 device: make detach_port() method asynchronous
This changes the signature of detach_port() to be asynchronous,
similarly to attach_port(). The implementation can return TRUE/FALSE
on immediate completion.

Current implementations return immediately and so there is no change
in behavior for now.
2023-04-04 08:21:22 +02:00
Thomas Haller
de8104c71c
device: fix assertion condition in _dev_ipdhcpx_start()
src/core/devices/nm-device.c: In function '_dev_ipdhcpx_start':
  src/core/devices/nm-device.c:10672:13: error: logical 'or' of collectively exhaustive tests is always true [-Werror=logical-op]
               nm_assert(pd_hint_length > 0 || pd_hint_length <= 128);
               ^
  src/core/devices/nm-device.c:10672:13: error: logical 'or' of collectively exhaustive tests is always true [-Werror=logical-op]
  src/core/devices/nm-device.c:10672:13: error: logical 'or' of collectively exhaustive tests is always true [-Werror=logical-op]

Fixes: e2b9019ac0 ('dhcp: support prefix delegation hint')
2023-04-04 08:19:20 +02:00
Beniamino Galvani
fa997be216 dhcp: export the prefix delegation
Export the IA_PD option so that it is available via D-Bus and in the
lease file in /run.
2023-04-03 16:04:55 +02:00
Beniamino Galvani
e2b9019ac0 dhcp: support prefix delegation hint
Support the prefix delegation hint in the DHCP client.

dhclient only supports a prefix length, emit a warning if the user set
a non-zero prefix.
2023-04-03 16:04:55 +02:00
Beniamino Galvani
f9c1d06e64 libnm,nmcli: add ipv6.dhcp-pd-hint property
Add a new property to specify a hint for DHCPv6 prefix delegation.
2023-04-03 16:04:55 +02:00
Thomas Haller
f58a69656a
core: error out in nm_utils_kill_child_{sync,async}() if first waitpid() gives ECHILD
It is wrong trying to send the signal still. Just error out.

Note that ECHILD indicates that the process is either not a child
or was already reaped. In both cases, that is a bug of the caller
who must keep accurate track of the child's process ID.
2023-04-03 10:27:43 +02:00
Thomas Haller
a65e80e8b6
core: pre-allocate exact buffer size for output in nm_utils_spawn_helper()
It's easy enough to know how many bytes are needed. Just allocate the
right size (+1, because NMStrBuf really likes to reserve that extra byte
for the trailing NUL, even if it's not needed in this case).
2023-04-03 10:27:43 +02:00
Thomas Haller
346196792c
core/trivial: add code comment about using GChildWatchSource 2023-04-03 10:27:43 +02:00
Thomas Haller
597e127bcb
core: use nm_g_child_watch_source_new() in nm_utils_spawn_helper() 2023-04-03 10:27:43 +02:00
Thomas Haller
a52d620549
core: assert that nm_utils_spawn_helper() is used with default context 2023-04-03 10:27:43 +02:00
Thomas Haller
3411f42418
core: store main context in variable in nm_utils_spawn_helper()
There is no change in behavior, because the GTask's context
is of course g_main_context_get_thread_default(). Still, not point
in making that unclear.
2023-04-03 10:27:43 +02:00
Thomas Haller
f9c409d34c
core: qualify logging lines related to helper with "nm-daemon-helper"
Seems to be the better name, because that is also the name of the
executable.
2023-04-03 10:27:43 +02:00
Thomas Haller
f74109e4b0
core: rename nmlog defines in "nm-core-utils.c"
Those are related to _NMLOG2() macro. Rename.
2023-04-03 10:27:43 +02:00
Thomas Haller
b5875ec79e
core: use nm_clear_fd() helper in nm_utils_spawn_helper() 2023-04-03 10:27:43 +02:00
Ratchanan Srirattanamet
bb226d4ed1
wwan/ofono: account for port in the Proxy property 2023-03-30 08:41:35 +02:00
Ratchanan Srirattanamet
264fed4778
wwan/ofono: correct MMS proxy property lookup
The property name under `Settings` dict is just `Proxy`, unlike the one
outside which is `MessageProxy`. See [1].

[1] https://kernel.googlesource.com/pub/scm/network/ofono/ofono/+/refs/heads/master/doc/connman-api.txt#253

Fixes: a6e81af87f ('wwan: add support for using oFono as a modem manager')
2023-03-30 08:41:35 +02:00
Thomas Haller
6043910bff
doc: use "Returns:" annotation instead of deprecated aliases 2023-03-29 11:46:48 +02:00
Thomas Haller
98dd4180ec
all: various fixes to gtk-doc annotations
- drop annotations from "@error" which has defaults.

- ensure all annotations are on the same line. That's useful
  when searching for an annotation, to find the line that specifies
  the argument name.

- convert a few plain docs into gtkdoc annotations.
2023-03-29 11:46:48 +02:00
Beniamino Galvani
b3e5504972 core: don't block a connection that was removed
Don't try to block a device/connection pair when the connection was
removed. Doing so would create a new devcon entry associated with the
connection that is being deleted.

Fixes: b73b34c3ee ('policy: track autoconnect retries per Device x Connection')
2023-03-29 11:19:35 +02:00
Beniamino Galvani
6d6bd92510 core: also deactivate ACs that are not authorized yet
If we are deactivating active-connections for a specific
settings-connection, also consider active-connections that are waiting
for authorization. Otherwise, when the connection is deleted, a
active-connection might still reference it.
2023-03-29 10:28:37 +02:00
Beniamino Galvani
e6b3a6a2b6 core: move deactivation of active connections to the manager
It's needed for the next commit.
2023-03-29 10:28:26 +02:00
Beniamino Galvani
1399aa925d wifi: skip no-ir channels when determining AP channel
If the automatically selected channel for an AP is set as NO-IR in the
current regulatory domain, the hotspot connection will fail to
start. NO-IR means that any mechanisms that initiate radiation are not
permitted on this channel, this includes sending probe requests or
modes of operation that require beaconing such as AP. Skip channels
with the NO-IR flag.
2023-03-28 09:46:11 +02:00
Beniamino Galvani
0a02995175 core: fix l3cd comparison
NM_CMP_SELF(a, b) returns immediately if the objects are the same.

Fixes: cb29244552 ('core: support compare flags in nm_l3_config_data_cmp_full()')
Fixes-test: @dracut_NM_iSCSI_ibft_table

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1583
2023-03-28 09:15:37 +02:00
Alfonso Sánchez-Beato
2566e99fbe
core: do not check 'match' settings when comparing connections
Match settings are already used for matching an existing connection to
a device, it does not really make sense to compare them with an
auto-generated connection that is not going to have them.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1585
2023-03-27 22:08:44 +02:00
Corentin Noël
5d28a0dd89
doc: replace all (allow-none) annotations by (optional) and/or (nullable)
The (allow-none) annotation is deprecated since a long time now, it is better to
use (nullable) and/or (optional) which clarifies what it means with the (out)
annotation.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1551
2023-03-27 11:49:43 +02:00
Thomas Haller
15101447c3
device: also configure MTU while assuming devices
The sys-iface-state "assume" means to gracefully take over a device (for
example, after a restart). The end result is a fully managed interface.
The flag only has meaning while activating, and for most practical
purposes, such devices should be treated the same as fully activated
ones.

Without this, the MTU is not reset until the device reaches fully
activated state, at which point the sys-iface-state switches from
"assume" to "managed". With the previous commit, at that point we also
schedule an idle commit, which ends up also setting the MTU. Before
that, the MTU was only reset some undefined time later, when we happened
to do another NML3Cfg commit. Nonetheless, even waiting until we reach
fully activated state is wrong. Also during activation, commit the MTU.

I guess, what theoretically could happen is that we get our MTU via
ip-config (like DHCP). Then, during assuming we hit _commit_mtu()
without having the DHCP lease yet. This happens after a restart, so it
would be wrong to first reset the MTU, before we re-receive the DHCP
lease. However, if the MTU is really to be set due via
NM_DEVICE_MTU_SOURCE_IP_CONFIG, then all other MTU sources are also not
in effect (because ip-config has a low priority). In that case, we would
not have an MTU to reset and the code would not commit a new MTU. Thus
this should still be fine, also during activation when we didn't yet get
the DHCP lease (or other information to dynamically set the MTU).
2023-03-27 08:54:47 +02:00
Thomas Haller
e773559d9d
device: schedule an idle commit when setting device's sys-iface-state
When assuming a device, the NMActiveConnection switches the
sys-iface-state from "assume" to "managed" when the device reaches the
activated state.

  <debug> [1679353062.8884] active-connection[000055bd310b92e0]: set state activated (was activating)
  <debug> [1679353062.8885] active-connection[000055bd310b92e0]: update activation type from assume to managed

Note that the "assume" state is probably a misfeature, and should be
dropped in favor of more appropriate flags. Meaning, "assume" state for
the most part is very similar to sys-iface-state "managed", and the
cases where (during activation) we need to be graceful, may be better
covered with other (more specialized) state flags. Regardless, for most
practical purposes, sys-iface-state "assume" should be treated similar
to "managed" state.

When we fully activated, we should be sure to do yet another idle
commit. Note that scheduling an idle-commit is something that must
always be allowed to any users of NML3Cfg. The users have no knowledge
about each other and coordinate by registering their commit type
handles.  Issuing an idle "auto" commit must be therefore allowed to
them at any time.  If that were not the case, then there would be a bug
to fix. The only reason to maybe not do it, is when we are sure there is
nothing to commit and we would want to avoid unnecessary work.

You can easily reproduce this and see that we don't in fact schedule a
commit after becoming managed. A commit usually only happens later, for
example when we receive an autoconf6 update.

This affects for example setting the MTU. Currently, _commit_mtu() bails
out for nm_device_sys_iface_state_is_external_or_assume() and thus
during activation the MTU will not be set. Later, once we reach
activated state, due to this it still is not set right away. This patch
fixes that, although we should also change _commit_mtu() to not bail out
for sys-iface-state "assume".
2023-03-27 08:52:54 +02:00
Thomas Haller
a6f5dbb426
core/dbus: "RequestName" of NetworkManager D-Bus API later to fix race
NetworkManager.service is "Type=dbus". Systemd takes that as indication
for declaring the service as started when the D-Bus name is acquired.

Currently, we acquire the name very early. The benefit is, that the
service appears to start very fast. However, most the D-Bus API is not
yet populated or ready to use. So if you order your service
`After=NetworkManager.service`, then there is a race that NetworkManager
might not yet be fully usable.

Another benefit was that requesting a D-Bus name is atomic. That means,
we could take that to ensure only one NetworkManager daemon was running.
If we noticed that NetworkManager is already running, we would quit
without doing anything. In practice, systemd already ensures that the
daemon is not running in parallel. This was still useful for catching
misuse when testing manually. This is now no longer done. We will notice
a concurrent NetworkManager only very late, at which point we might have
already broken things (e.g. rewrite wrong state files).

Fix the race with `After=` by acquiring the name much later.

Note that NetworkManager is pretty slow during initialization. This
easily adds several hundreds of milliseconds to the startup.
2023-03-23 13:06:57 +01:00
Thomas Haller
4699a4c3cd
core/dbus: split RequestName D-Bus call out of initialization for NMDBusManager 2023-03-23 13:06:57 +01:00
Thomas Haller
dabfa26a41
core: don't configure IP routes unless there are also IP addresses
Since l3cfg rework, NetworkManager tracks IP routes early, not not only
when IP configuration is ready. That means, with `ipv4.method=auto` and
static `ipv4.routes`, then routes are most likely already configured
before the IP address is obtained via DHCP.

That may be desirable in some cases, but for many cases it's probably
wrong.

Instead, only configure the routes (with an ifindex) when we also have
an IP address.

https://bugzilla.redhat.com/show_bug.cgi?id=2102212

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1574
2023-03-22 17:47:49 +01:00
Thomas Haller
07c6f933d1
wifi: fix aggressively roaming (background Wi-Fi scanning) based on seen-bssids
"wifi.seen-bssids" looks like a regular property, but it is not. Unlike
almost all other properties, it does not contain user configuration,
rather it gets filled by the daemon.

The values are thus stored in "/var/lib/NetworkManager/seen-bssids"
file, and the daemon maintains the values separately from the profile.
Only before exporting the profile on D-Bus, the value gets merged (see
NM_SETTINGS_CONNECTION_GET_PRIVATE(self)->>getsettings_cached and
nm_connection_to_dbus_full().

Hence, looking at nm_setting_wireless_get_num_seen_bssids() is not
working. Fix that.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1253

Fixes: 0f3203338c ('wifi: roam aggressively if we on a multi-AP network')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1577
2023-03-22 17:15:54 +01:00
Thomas Haller
1feaf427d2
platform: rework handling of failed routes during nm_platform_ip_route_sync()
Previously, there was "temporary-not-available" mechanism in NML3Cfg,
which aimed to handle IPv6 routes with prefsrc. Theoretically, that
mechanism may have been extended to other use-cases, like IPv4 routes
with prefsrc. What it attempted to handle, is the inability to configure
such routes, unless the respective prefsrc address is configured and
non-tentative.  However, the address that we are waiting for, could also
be on another interface, so that mechanism wasn't applicable. This is
now replaced by _routes_watch_ip_addrs(). It seems there isn't anything
useful left for the "temporary-not-available" mechanism and it can go,
except...

We want to log a warning when we are unable to configure a route. Also,
in the future we might want to know when the IP configuration is
degradated due to inability to configure the desired routes (a condition
that  we might want to expose to the user, not only via logging; or we
may want to react on that).

However, with prefsrc routes we don't know right away whether the
inability to configure the route right away indicates an actual problem,
or whether that will resolve itself (e.g. after the address passes
DAD/ACD, after we received an DHCP lease or after the address was
configured on another interface).  Consequently, to know whether the
current inability to configure such a route is a problem, we need to
know the larger context.  nm_platform_ip_route_sync() does not have that
context.

Instead, nm_platform_ip_route_sync() needs only do debug log about
failure to configure routes. It  will now also  return all the failed
routes to NML3Cfg, which can decide whether that is a problem.

This reworks the previous "temporary-not-available" mechanism to track
the state of the failed routes, to eventually decide whether there is an
actual problem (and log about it).

Another problem this solves is that since commit ('platform: always
reconfigure IP routes even if removed externally'), we will eagerly
re-try to configure the same route over and over. We cannot just spam
the log with warnings about the same failure on every commit. We need to
remember that we already logged about the problem and rate limit
warnings otherwise. This is what the new mechanism also achieves.

Indeed, all this is mostly for the sole benefit of logging better
warnings (and not duplicated).
2023-03-21 15:58:55 +01:00
Thomas Haller
b8dba58892
l3cfg: don't return success/failure from _l3_commit_one()
It was unused anyway.

But also, what would we do with this? We are in the middle of a commit,
if something goes wrong, we cannot just abort but need to continue on
and make the best of it.

Maybe there are very specific error cases that we need to handle, but
those are not covered by a boolean return value. Instead, we might need
to take specific action.

The boolean success variable was meaningless. Drop it.
2023-03-21 15:58:53 +01:00