Commit graph

138 commits

Author SHA1 Message Date
Matthieu Baerts (NGI0)
e3f20ecf95 mptcp: add 'laminar' endpoint support
This new endpoint type has been recently added to the kernel in v6.18
[1]. It will be used to create new subflows from the associated address
to additional addresses announced by the other peer. This will be done
if allowed by the MPTCP limits, and if the associated address is not
already being used by another subflow from the same MPTCP connection.

Note that the fullmesh flag takes precedence over the laminar one.
Without any of these two flags, the path-manager will create new
subflows to additional addresses announced by the other peer by
selecting the source address from the routing tables, which is harder to
configure if the announced address is not known in advance.

The support of the new flag is easy: simply by declaring a new flag for
NM, and adding it in the related helpers and existing checks looking at
the different MPTCP endpoint. The documentation now references the new
endpoint type.

Note that only the new 'define' has been added in the Linux header file:
this file has changed a bit since the last sync, now split in two files.
Only this new line is needed, so the minimum has been modified here.

Link: https://git.kernel.org/torvalds/c/539f6b9de39e [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
(cherry picked from commit 2b03057de0)
2025-11-19 15:01:58 +01:00
Jan Vaclav
b1614ffb90 l3cfg: add info about n-acd eBPF state to log messages 2025-10-22 21:49:56 +02:00
Beniamino Galvani
d62c25ef2f dns: return error from nm_dns_uri_parse()
Return a GError from nm_dns_uri_parse() to indicate why the URI could
not be parsed. This is useful for logging and user reporting.
2025-08-14 09:40:26 +02:00
Beniamino Galvani
eb0a41ce1f l3cfg: simplify the ACD timeouts
ACD_WAIT_PROBING_EXTRA_TIME_MSEC and ACD_WAIT_PROBING_EXTRA_TIME2_MSEC
now are always used together. Consolidate them into a single constant.
2025-07-22 10:24:27 +02:00
Beniamino Galvani
127f73a5c2 l3cfg: fix the interval of the ACD restart timer
After ACD_WAIT_PROBING_EXTRA_TIME_MSEC has elapsed,
_l3_acd_data_timeout_schedule_probing_restart() keeps rescheduling the
timer with a zero interval, resulting in 100% CPU usage. This
continues until the probe is destroyed after
ACD_WAIT_PROBING_EXTRA_TIME2_MSEC.

When computing the interval, we need to use
(ACD_WAIT_PROBING_EXTRA_TIME_MSEC + ACD_WAIT_PROBING_EXTRA_TIME2_MSEC)
as the expiry time.
2025-07-22 10:24:26 +02:00
Beniamino Galvani
407d753a5a l3cfg: don't reset the ACD probe timestamp during timer events
acd_data->probing_timestamp_msec indicates when the probing
started. It is used in different places to calculate the timeout for
certain operations. In particular, it is used to detect that the probe
creation took too long when handling the ACD_STATE_CHANGE_MODE_TIMEOUT
event.

If we reset this timestamp at every timer event, we'll never hit the
probe creation timeout. Therefore, the l3cfg will keep trying forever
to create the probe.
See: https://lists.freedesktop.org/archives/networkmanager/2025-July/000418.html

Fix this by not updating the timestamp during a timeout event.

Fixes: a09f9cc616 ('l3cfg: ensure the probing timeout is initialized on probe start')
2025-07-22 10:24:26 +02:00
Beniamino Galvani
74cf2a2bd8 l3cfg: fix logging message
Fix spacing in:

 acd[192.168.122.42, probing]: probing currently  stillnot possible
                                                 ^^^^^^^^^

Fixes: b8f9d7b5dd
2025-07-10 10:04:36 +02:00
Michael Biebl
10e58f7c3c typo fix: allows to -> allows one to
Detected by lintian:

Example:
I: network-manager: typo-in-manual-page "allows to" "allows one to" [usr/share/man/man5/NetworkManager.conf.5.gz:1266]
2025-03-26 19:22:01 +01:00
Beniamino Galvani
ceef38d9a5 l3cfg: only add MPTCP endpoints for non-tentative IPv6 addresses
An IPv6 endpoint is not usable until the address is non-tentative. Add
a mechanism to wait until the address is ready.

(cherry picked from commit 227cd6307b)
2025-02-24 09:07:54 +01:00
Beniamino Galvani
2c5a51201d l3cfg: wait for the address before configuring an MPTCP endpoint
Skip the configuration of the MPTCP endpoint when the address is in
the l3cd but is not yet configured in the platform. This typically
happens when IPv4 DAD is enabled and the address is being probed.

If we configure the endpoint without the address set, the kernel will
try to use the endpoint immediately but it will fail. Then, the
endpoint will not be used ever again after the address is added.

(cherry picked from commit 6bf859af79)
2025-02-24 09:07:54 +01:00
Beniamino Galvani
a301c259f2 core: split nm_netns_watcher_remove_all()
The name suggests that the function always removes all the watchers
with the given tag; instead it removes only "dirty" ones when the
"all" parameter is FALSE. Split the function in two variants.

(cherry picked from commit b6e67c6abc)
2025-02-24 09:07:53 +01:00
Beniamino Galvani
05efd5ab62 l3cfg: add the DNS routing rules explicitly
Add the DNS routing rules explicitly instead of tracking them via the
NMGlobalTracker mechanism. Since we do not plan to ever remove them,
there is no reason to track the rules. Also, the current
implementation is buggy because in some situations the rules are
wrongly removed when they should not.

Fixes: bf3ecd9031 ('l3cfg: fix DNS routes')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2125
2025-02-04 10:35:55 +01:00
Tomas Korbar
c6e1925dec dns: Add dnsconfd DNS plugin
dnsconfd can now be used as DNS configuration plugin.

If ipvX.routed-dns is set to -1 and dnsconfd plugin is enabled then
routes are added by default.
2025-01-29 14:41:47 +01:00
Beniamino Galvani
bf3ecd9031 l3cfg: fix DNS routes
The current approach is flawed. During a commit of the L3
configuration we do a RTM_GETROUTE to find the next-hop to the DNS
server on the current interface, in order to create the DNS route to
inject into the l3cd. However, we haven't added routes to kernel yet
and so the result of the RTM_GETROUTE is going to be wrong.

In some cases, for example when IPv4 DAD is enabled, the bug can't be
easily noticed because we perform multiple commits for the interface,
and the regular routes are already set in kernel from the 2nd commit
on.

To fix the problem, do the following: during a commit we first add
addresses and routes to platform. Then, we create a list of DNS routes
to configure, we collect the old DNS routes, and do a comparison. If
they changed, we need to add the DNS routes to platform in a 2nd step.

Note that in the previous approach we tracked the routes in the
committed-l3cd object of the l3cfg, and so they were applied to kernel
automatically. Because of the 2-step requirement, that no longer works
and we must apply the DNS routes manually.

Fixes: 5449b18a94 ('core: support automatically adding DNS routes')
2025-01-14 23:31:59 +01:00
Beniamino Galvani
aefc7732f0 l3cfg: add the DNS routing rule only when needed
Don't try to add the routing rule that points to the table containing
DNS routes at every commit.

Instead, look into the platform cache to see if the rule already
exists and add it only when needed.
2025-01-14 23:31:59 +01:00
Beniamino Galvani
4422b14704 core, libnm: support per-connection DNS URIs
Accept name servers with a URI syntax in the ipv4.dns and ipv6.dns
properties; and accept them everywhere else in the core and libnm.
2025-01-07 15:41:44 +01:00
Fernando Fernandez Mancera
69f3493670 l3cfg: add helper function to fetch all the IPv4 configured addresses
This function would be useful when performing operations related to the
IPv4 addresses configured on the l3cfg. E.g this function will be used
for getting the IPv4 to announce on a GARP on bonding-slb when one of
the ports failover.
2024-12-18 14:45:54 +01:00
Íñigo Huguet
c06d130c38 l3cfg: get routes to prune from the list of routes configured by NM
We always sync routes in the main table, but routes in tables other
than main are only pruned if were added by NM, by default. Get the list
of routes to prune from other tables using obj_state->os_nm_configured,
as this tracks what routes were effectively added by NM.

The list should be the same that the one obtained from l3cfg_old. It
could be different if we commited the l3cfg with an NMIPRouteTableSyncMode
of NM_IP_ROUTE_TABLE_SYNC_MODE_MAIN, thus not deleting some routes at
commit time. However, since the previous commit, we never do it.

What all this shows is that starting to use different NMIPRouteTableSyncModes
is probably a bad idea: it will be a source of bugs of routes not being
always synced as users expect, and the use case for them is still to be
known.
2024-12-11 15:52:09 +00:00
Íñigo Huguet
e330eb9c4a l3cfg: remove routes added by NM on reapply
By default, on reapply we were only syncing the main routes table. This
causes that routes added by NM to other tables are not removed on
reapply. This was done to preserve routes added externally, but routes
added by NM itself should be removed.

Add a new route table syncing mode "main + NM routes". This mode
maintains the normal behaviour of syncing completely the main table,
and for other tables removes only routes that were added by us, leaving
the rest untouched. Use this mode by default, as this is what a user
would expect on reapply.

Note: this might not work if NM is restarted between the profile being
modified and the reapply, because NM forgets what routes were added by
itself because of the restart. This is a rare corner case, though.

Use the D-Bus property "VersionInfo" to expose a capability flag
indicating that this bug is fixed. It is the first capability that we
expose in this way. However, it is convenient to do it this way as it's
something that clients like nmstate needs to know, so they can decide
whether a conn down is needed or not. It is not enough to decide that by
version number because it might be fixed via a downstream patch in distros
like RHEL.

https://issues.redhat.com/browse/RHEL-67324
https://issues.redhat.com/browse/RHEL-66262

Fixes: e9c17fcc9b ('l3cfg: default to 'main' route table sync mode')
2024-12-11 15:52:09 +00:00
Íñigo Huguet
e1840ad5fb platform: rename NM_IP_ROUTE_TABLE_SYNC_MODE_FULL -> ALL_EXCEPT_LOCAL
The difference between FULL and ALL was not obvious without reading the
documentation. Moreover, a new mode is going to be introduced so the
confusion could grow. Rename to a more explicit name.
2024-12-11 15:52:09 +00:00
Wen Liang
883399606f l3cfg: never retry ACD on NOARP interfaces
After upgrading to RHEL-9.4, customers have reported that `ip monitor`
repeatedly logs the same route additions every 30 seconds. This issue
appears to stem from NetworkManager continually retrying to add the same
routes due to keep retrying Address Conflict Detection (ACD) on NOARP
interfaces.

To prevent unnecessary route additions and reduce log noise, this change
modifies NetworkManager's behavior to stop retrying ACD on interfaces
with the NOARP flag.

This fix addresses route instability and excessive logging for affected
NOARP configurations.

https://issues.redhat.com/browse/RHEL-59125
2024-11-15 13:46:37 +00:00
Beniamino Galvani
5449b18a94 core: support automatically adding DNS routes
When the "ipvX.routed-dns" property is set to true, add a route for
each DNS server via the current interface. The feature works in the
following way.

A new routing rule is created ("priority $PRIO not fwmark $MARK lookup
$TABLE") where $PRIO, $MARK and $TABLE are fixed values and are the
same for all interfaces. This rule is evaluated before standard rules
and tries to look up routes in table $TABLE, where NM adds the routes
to DNS servers.

To determine the next-hop to the name server, NM issues a RTM_GETROUTE
netlink request to kernel, specifying to return the route via the
current interface. In order to avoid results from $TABLE, NM also sets
the fwmark as $MARK in the request.
2024-10-23 15:38:36 +02:00
Beniamino Galvani
3eb45c1d40 l3cfg: simplify signals
During a commit of layer-3 configuration, multiple signals are
emitted:

 - if the combined l3cd configuration changes, we first emit a
   L3CD_CHANGED signal, with flag `commited` FALSE;
 - if the previously committed configuration is different from the one
   we want to commit, we emit again the same signal with `commited`
   TRUE;
 - a PRE_COMMIT signal
 - a POST_COMMIT signal

The usefulness of the first and third signals is questionable: there
is no need to signal that the configuration changes if we are not
going to commit it. Also, PRE_COMMIT is redundant as we just emitted
L3CD_CHANGED. Nobody is using those 2 signals.

Simplify this by leaving only PRE_COMMIT and POST_COMMIT, which are
always emitted during a commit and provide information on the l3cd
changes.

This commit doesn't change behavior.
2024-10-23 15:06:58 +02:00
Beniamino Galvani
bb6881f88c format: run nm-code-format
Reformat with:

  clang-format version 19.1.0 (Fedora 19.1.0-1.fc41)

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2046
2024-10-04 11:07:35 +02:00
Beniamino Galvani
a09f9cc616 l3cfg: ensure the probing timeout is initialized on probe start
When handling event TIMEOUT, "acd_data->probing_timeout_msec" needs to
be always initialized before jumping to "handle_start_probing:";
otherwise, an assertion failure is triggered at:

  static void
  _l3_acd_data_timeout_schedule_probing_restart(AcdData *acd_data, gint64 now_msec)
  {
    ...
    nm_assert(acd_data->probing_timeout_msec > 0);

Even if the ACD data is already in state PROBE, that doesn't mean that
the timeout is already initialized because the PROBE state can also be
reached from a INSTANCE_RESET event; and depending on the previous
state "acd_data->probing_timeout_msec" could be uninitialized.

Fixes-test: @iptunnel_restart
Fixes: b8f9d7b5dd ('l3cfg: rework ACD handling in NML3Cfg to support handling conflicts')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2023
2024-09-02 10:04:11 +02:00
Beniamino Galvani
6897b6ecfd core: rename l3cd's "dhcp_enabled" to "allow_routes_without_address"
The name "dhcp_enabled" is misleading because the flag is set for
method=auto, which doesn't necessarily imply DHCP. Also, it doesn't
convey what the flag is used for. Rename it to
"allow_routes_without_address".

(cherry picked from commit b31febea22)
2024-05-28 09:50:09 +02:00
Beniamino Galvani
b185d21c95 l3cfg: fix handling of ipv6 hop limit
Fixes: 5c48c5d5d6 ('l3cfg: set IPv6 sysctls during NML3Cfg commit')

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1497

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1883
2024-03-12 09:58:28 +01:00
Wen Liang
cf28660b6a core: check the dhcp enabled flag in l3cfg
The decision to configure or not configure routes without addresses only
related to what method is configured - DHCP and non-DHCP cases. For DHCP
case, the deamon waits until addresses appear first before configuring
the static routes to preserve the behavior mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=2102212, otherwise, the
daemon can configure the routes immediately for non-DHCP case.
2024-01-24 09:15:39 -05:00
Thomas Haller
5d26452da5 l3cfg: handle dynamic added routes tracking and deletion
Dynamic added routes i.e ECMP single-hop routes, are not managed by l3cd
as the other ones. Therefore, they need to be tracked properly and
marked as dirty when they are.

Without this, the state of those ECMP single hop routes is not properly
tracked, and they are for example not removed by NML3Cfg when they
should.

Usually, routes to be configured originate from the combined
NML3ConfigData and are resolved early during a commit. For example,
_obj_states_update_all() creates for each such route an obj_state_hash
entry. Let's call those static, or "non-dynamic".

Later, we can receive additional routes. We get them back from NMNetns
during nm_netns_ip_route_ecmp_commit() (_commit_collect_routes()).
Let's call them "dynamic".

For those routes, we also must track an obj-state.

Now we have two reasons why an obj-state is tracked. The "non-dynamic"
and the "dynamic" one. Add two flags "os_dynamic" and "os_non_dynamic"
to the ObjStateData and consider the flags at the necessary places.

Co-authored-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
2023-12-04 17:00:13 +01:00
Fernando Fernandez Mancera
ffe22c498f l3cfg: factor out _obj_state_data_new() for creating object state
Will be used next.
2023-12-04 17:00:13 +01:00
Thomas Haller
8d205faa46 core: assert that tracked objects in NML3Cfg don't have metric_any
It wouldn't work otherwise. The object state is used to track routes
and compare them to what we find in platform.

A "metric_any" is useful at higher layers, to track a route where the
metric is decided by somebody else. But at the point when we add such an
object to the object-state, a fixed metric must be chosen.

Assert for that.
2023-12-04 17:00:13 +01:00
Beniamino Galvani
5b16c128bb l3cfg: fix pruning of ACD data (take 2)
If a commit is invoked without any change to the l3cd or to the ACD
data, in _l3cfg_update_combined_config() we skip calling
_l3_acd_data_add_all(), which should clear the dirty flag from ACDs.
Therefore, in case of such no-op commits the ACDs still marked as
dirty - but valid - are removed via:

 _l3_commit()
   _l3_acd_data_process_changes()
     _l3_acd_data_prune()
       _l3_acd_data_prune_one()

Invoking a l3cfg commit without any actual changes is allowed, see the
explanation in commit e773559d9d ('device: schedule an idle commit
when setting device's sys-iface-state').

The bug is visible by running test 'bond_addreses_restart_persistence'
with IPv4 ACD/DAD is enabled by default: after restart IPv6 completes
immediately, the devices becomes ACTIVATED, the sys-iface-state
transitions from ASSUME to MANAGED, a commit is done, and it
incorrectly prunes the ACD data. The result is that the IPv4 address
is never added again.

Fix this by doing the pruning only when we update the dirty flags.

This is a respin of commit ed565f9146 ('l3cfg: fix pruning of ACD
data') that was reverted because it was causing a crash. The crash was
caused by unconditionally clearing `acd_data_pruning_needed` in
_l3cfg_update_combined_config(), while we need to do it only when
actually committing the configuration.

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1749
2023-10-16 16:58:46 +02:00
Beniamino Galvani
6a63d79fe6 Revert "l3cfg: fix pruning of ACD data"
The commit causes the following assertion failure:

  0  0x00007f4187e22884 in __pthread_kill_implementation () from target:/lib64/libc.so.6
  1  0x00007f4187dd1afe in raise () from target:/lib64/libc.so.6
  2  0x00007f4187dba87f in abort () from target:/lib64/libc.so.6
  3  0x00007f4188386f4e in g_assertion_message (domain=domain@entry=0x6fc1bc "nm", file=file@entry=0x722e94 "../src/core/nm-l3cfg.c", line=line@entry=2134,
     func=func@entry=0x727730 <__func__.49> "_l3_acd_data_add_all", message=message@entry=0x23b3bb0 "assertion failed: (acd_data->info.track_infos[i]._priv.acd_dirty_track)")
     at ../glib/gtestutils.c:3450
  4  0x00007f41883f1597 in g_assertion_message_expr (domain=domain@entry=0x6fc1bc "nm", file=file@entry=0x722e94 "../src/core/nm-l3cfg.c", line=line@entry=2134,
     func=func@entry=0x727730 <__func__.49> "_l3_acd_data_add_all", expr=expr@entry=0x726450 "acd_data->info.track_infos[i]._priv.acd_dirty_track") at ../glib/gtestutils.c:3476
  5  0x0000000000587209 in _l3_acd_data_add_all (self=self@entry=0x23a7020, infos=infos@entry=0x0, infos_len=infos_len@entry=0, reapply=reapply@entry=1)
     at ../src/core/nm-l3cfg.c:2134
  6  0x0000000000587702 in _l3cfg_update_combined_config (self=self@entry=0x23a7020, to_commit=to_commit@entry=1, reapply=reapply@entry=1, out_old=out_old@entry=0x7ffd09ea4ca8,
     out_changed_combined_l3cd=out_changed_combined_l3cd@entry=0x7ffd09ea4c7c) at ../src/core/nm-l3cfg.c:3858
  7  0x000000000058a202 in _l3_commit (self=0x23a7020, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY, is_idle=is_idle@entry=0) at ../src/core/nm-l3cfg.c:5046
  8  0x000000000058a49f in nm_l3cfg_commit (self=<optimized out>, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY) at ../src/core/nm-l3cfg.c:5115
  9  0x00000000004856cd in nm_device_l3cfg_commit (self=self@entry=0x23ab870, commit_type=commit_type@entry=NM_L3_CFG_COMMIT_TYPE_REAPPLY, commit_sync=commit_sync@entry=1)
     at ../src/core/devices/nm-device.c:4155
  10 0x00000000004b1814 in nm_device_cleanup (self=self@entry=0x23ab870, reason=reason@entry=NM_DEVICE_STATE_REASON_NEW_ACTIVATION,
     cleanup_type=cleanup_type@entry=CLEANUP_TYPE_DECONFIGURE) at ../src/core/devices/nm-device.c:15884
  11 0x00000000004b26c9 in _set_state_full (self=self@entry=0x23ab870, state=state@entry=NM_DEVICE_STATE_DISCONNECTED, reason=NM_DEVICE_STATE_REASON_NEW_ACTIVATION,
     quitting=quitting@entry=0) at ../src/core/devices/nm-device.c:16291
  12 0x00000000004b2fe4 in nm_device_state_changed (self=self@entry=0x23ab870, state=state@entry=NM_DEVICE_STATE_DISCONNECTED, reason=<optimized out>)
     at ../src/core/devices/nm-device.c:16505
  13 0x00000000004b69de in queued_state_set (user_data=user_data@entry=0x23ab870) at ../src/core/devices/nm-device.c:16532
  14 0x00007f41883bf4fd in g_idle_dispatch (source=0x23a88e0, callback=0x4b6956 <queued_state_set>, user_data=0x23ab870) at ../glib/gmain.c:6163
  15 0x00007f41883c34fc in g_main_dispatch (context=0x22c4d10) at ../glib/gmain.c:3460
  16 g_main_context_dispatch (context=0x22c4d10) at ../glib/gmain.c:4200
  17 0x00007f41884216b8 in g_main_context_iterate.isra.0 (context=0x22c4d10, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4276
  18 0x00007f41883c2aff in g_main_loop_run (loop=0x22c3b50) at ../glib/gmain.c:4479
  19 0x0000000000423a37 in main (argc=<optimized out>, argv=<optimized out>) at ../src/core/main.c:519

This reverts commit ed565f9146.
2023-10-05 21:26:31 +02:00
Beniamino Galvani
6ebf2c6ba1 l3cfg: log the reason when marking IP configuration dirty 2023-10-05 09:05:13 +02:00
Beniamino Galvani
e83e8b73f4 l3cfg: improve logging
- avoid "update" as it is also a commit type
 - make clear that the commit is not happening now
2023-10-05 09:05:07 +02:00
Beniamino Galvani
ed565f9146 l3cfg: fix pruning of ACD data
If a commit is invoked without any change to the l3cd or to the ACD
data, in _l3cfg_update_combined_config() we skip calling
_l3_acd_data_add_all(), which should clear the dirty flag from ACDs.
Therefore, in case of such no-op commits the ACDs still marked as
dirty - but valid - are removed via:

 _l3_commit()
   _l3_acd_data_process_changes()
     _l3_acd_data_prune()
       _l3_acd_data_prune_one()

Invoking a l3cfg commit without any actual changes is allowed, see the
explanation in commit e773559d9d ('device: schedule an idle commit
when setting device's sys-iface-state').

The bug is visible by running test 'bond_addreses_restart_persistence'
with IPv4 ACD/DAD is enabled by default: after restart IPv6 completes
immediately, the devices becomes ACTIVATED, the sys-iface-state
transitions from ASSUME to MANAGED, a commit is done, and it
incorrectly prunes the ACD data. The result is that the IPv4 address
is never added again.

Fix this by doing the pruning only when we update the dirty flags.
2023-10-05 09:04:32 +02:00
Beniamino Galvani
7548ff57d3 l3cfg: skip ACD for interfaces with IFF_NOARP
Interfaces with IFF_NOARP don't support Address Conflict Detection,
which is based on ARP. Trying to start ACD on them would result in
ENOBUFS always being returned by send(), and n-acd handles such error
by retrying indefinitely.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
2023-10-05 09:04:09 +02:00
Beniamino Galvani
687051368f l3cfg: schedule a commit when ACD is not supported
On interfaces not supporting ACD (for example, layer3 interfaces), the
probe fails to be created with message:

 l3cfg[...,ifindex=2]: acd[172.25.17.1, init]: probe-good (interface does not support acd, initial post-commit)
 l3cfg[...,ifindex=2]: acd[172.25.17.1, ready]: set state to ready (probe is ready, waiting for address to be configured)

During the post-commit event, if the address is not yet configured, we
need to schedule a new commit to actually add it.

Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
2023-10-05 09:03:40 +02:00
Beniamino Galvani
aed21d50af l3cfg: remove tna_dirty member
The member is no longer used.

Fixes: 1feaf427d2 ('platform: rework handling of failed routes during nm_platform_ip_route_sync()')
2023-09-04 18:25:42 +02:00
Beniamino Galvani
3fb1c4dc23 l3cfg: fix typo in variable name
Replace "mesc" with "msec".

Fixes: 1feaf427d2 ('platform: rework handling of failed routes during nm_platform_ip_route_sync()')
2023-09-04 18:25:41 +02:00
Beniamino Galvani
8da4d088ba l3cfg: fix log message
nm_utils_addr_family_to_char() requires a valid address family.

Fixes: 1feaf427d2 ('platform: rework handling of failed routes during nm_platform_ip_route_sync()')
2023-09-04 18:25:41 +02:00
Beniamino Galvani
68dc2d3ca9 l3cfg: demote logging level for ACD conflict messages
NMDevice is now emitting those logs at info level.
2023-08-11 13:30:38 +02:00
Beniamino Galvani
db307e69cb l3cfg: return the conflicting MAC address with ACD events
When a collision is detected by the Address Conflict Detection
mechanism, store the conflicting MAC address in NML3AcdAddrInfo, so
that it is available to listeners of NML3Cfg for events of type
NM_L3_CONFIG_NOTIFY_TYPE_ACD_EVENT.
2023-08-11 13:30:38 +02:00
Thomas Haller
dabfa26a41
core: don't configure IP routes unless there are also IP addresses
Since l3cfg rework, NetworkManager tracks IP routes early, not not only
when IP configuration is ready. That means, with `ipv4.method=auto` and
static `ipv4.routes`, then routes are most likely already configured
before the IP address is obtained via DHCP.

That may be desirable in some cases, but for many cases it's probably
wrong.

Instead, only configure the routes (with an ifindex) when we also have
an IP address.

https://bugzilla.redhat.com/show_bug.cgi?id=2102212

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1574
2023-03-22 17:47:49 +01:00
Thomas Haller
1feaf427d2
platform: rework handling of failed routes during nm_platform_ip_route_sync()
Previously, there was "temporary-not-available" mechanism in NML3Cfg,
which aimed to handle IPv6 routes with prefsrc. Theoretically, that
mechanism may have been extended to other use-cases, like IPv4 routes
with prefsrc. What it attempted to handle, is the inability to configure
such routes, unless the respective prefsrc address is configured and
non-tentative.  However, the address that we are waiting for, could also
be on another interface, so that mechanism wasn't applicable. This is
now replaced by _routes_watch_ip_addrs(). It seems there isn't anything
useful left for the "temporary-not-available" mechanism and it can go,
except...

We want to log a warning when we are unable to configure a route. Also,
in the future we might want to know when the IP configuration is
degradated due to inability to configure the desired routes (a condition
that  we might want to expose to the user, not only via logging; or we
may want to react on that).

However, with prefsrc routes we don't know right away whether the
inability to configure the route right away indicates an actual problem,
or whether that will resolve itself (e.g. after the address passes
DAD/ACD, after we received an DHCP lease or after the address was
configured on another interface).  Consequently, to know whether the
current inability to configure such a route is a problem, we need to
know the larger context.  nm_platform_ip_route_sync() does not have that
context.

Instead, nm_platform_ip_route_sync() needs only do debug log about
failure to configure routes. It  will now also  return all the failed
routes to NML3Cfg, which can decide whether that is a problem.

This reworks the previous "temporary-not-available" mechanism to track
the state of the failed routes, to eventually decide whether there is an
actual problem (and log about it).

Another problem this solves is that since commit ('platform: always
reconfigure IP routes even if removed externally'), we will eagerly
re-try to configure the same route over and over. We cannot just spam
the log with warnings about the same failure on every commit. We need to
remember that we already logged about the problem and rate limit
warnings otherwise. This is what the new mechanism also achieves.

Indeed, all this is mostly for the sole benefit of logging better
warnings (and not duplicated).
2023-03-21 15:58:55 +01:00
Thomas Haller
b8dba58892
l3cfg: don't return success/failure from _l3_commit_one()
It was unused anyway.

But also, what would we do with this? We are in the middle of a commit,
if something goes wrong, we cannot just abort but need to continue on
and make the best of it.

Maybe there are very specific error cases that we need to handle, but
those are not covered by a boolean return value. Instead, we might need
to take specific action.

The boolean success variable was meaningless. Drop it.
2023-03-21 15:58:53 +01:00
Thomas Haller
85816d1f19
core: skip watching prefsrc addresses if the address is ready 2023-03-21 15:58:49 +01:00
Thomas Haller
e4ac0c407d
core: watch IP addresses appearing/disappearing and recommit pref_src routes
Routes with pref_src (RTA_PREFSRC) can only be added when the
corresponding IP address is configured (and non-tentative, in case of
IPv6). Additionally, that address may be on any interface, not only on
the one we want to configure the route on. This means, when we first
activate a profile with a route that has a src attrbute, then that src
address might only be configured later. For example, with IPv6, it takes
a while for the address to become non-tentative. Or the address might
come from DHCP, and not be present initially. Or the address might even
be configured on another interface/profile. That means, while we might
be unable to configure the route now, we may become able any time later.

Solve that by subscribing to NMNetns to get notifications whenever such
an address gets added. In that case, schedule an idle commit, which may
then succeed.
2023-03-21 15:58:47 +01:00
Thomas Haller
7fa63c23b4
platform,l3cfg: remove force-commit flag for addresses/routes
We no longer need this. We now always force-commit routes and addresses.
See the previous commit.
2023-03-21 15:58:43 +01:00
Thomas Haller
7ca95cee15
platform: always reconfigure IP routes even if removed externally
NML3Cfg is stateful, that means it remembers which address/route it
configured earlier. That is important because the API users of NML3Cfg
only say what the want to configure now, and NML3Cfg needs to remove
addresses/routes that it configured earlier but are no longer to be
present. Also, NetworkManager wants to allow the user to add
addresses/routes externally with `ip addr|route add` and NetworkManager
not removing it. This is a common use case for dispatcher scripts, but
in general, we want to allow other components to add addresses/routes.

We try something similar with the removal of routes/addresses managed by
NetworkManager. When NetworkManager adds a route/address, which later
disappears, then we assume that the user intentionally removed the
address/route and take the hint to not re-add it.

However, it doesn't work. It is problematic for two reasons:

- kernel can automatically remove routes. For example, deleting an IPv4
  address that is the prefsrc of a route, will cause kernel to delete
  that route. Sure, we may be unable to re-configure the route at this
  moment, but we shouldn't remember indefinitely that the route is
  supposed to be absent. Rather, we should re-add it when possible.

- kernel is a pain with validating consistencies of routes. For example,
  when a route has a nexthop gateway, then the gateway must be onlink
  (directly reachable), or kernel refuses to add it with "Nexthop has
  invalid gateway". Of course, when removing the onlink route kernel is
  fine leaving the gateway route behind, which it would otherwise refuse
  to add.
  Anyway. Such interdependencies for when kernel rejects adding a route
  with "Nexthop has invalid gateway" are non-trivial. We try to work
  around that by always adding the necessary onlink routes. See
  nm_l3_config_data_add_dependent_onlink_routes(). But if the user
  externally removed the dependent onlink route, and NetworkManager
  remembers to not re-adding it, then the efforts from
  nm_l3_config_data_add_dependent_onlink_routes() are ignored. This
  causes ripple effects and NetworkManager will also be unable to add the
  nexthop route.

Trying to preserve absence of routes that NetworkManager would like to
configure is not tenable. Don't do it anymore. There was anyway no
guarantee that on the next update NetworkManager wouldn't try to re-add
the route in question. For example, if the route came from DHCP, and the
lease temporarily went away and came back, then NetworkManager probably
would have (correctly) forgotten that the user wished that the route be
absent. This did not work reliably and it just causes problems.
2023-03-21 15:58:41 +01:00