Currently it is possible to specify a list of default settings plugins
to be used when configuration doesn't contain the main.plugins key.
We want to add a mechanism that allows to automatically load any
plugin found in the plugins directory without needing
configuration. This mechanism is useful when plugins are shipped in a
different, optional subpackage, to automatically use them.
With such mechanism, the actual list of plugins will be determined
(in order of evaluation):
1. via explicit user configuration in /etc, if any
2. via distro configuration in /usr, if any
3. using the build-time default, if any
4. looking for known plugins in /usr/lib
(cherry picked from commit 392daa5dab)
When we have a bridge interface with ports attached externally (that is,
not by NetworkManager itself), then it can make sense that during
checkpoint rollback we want to keep those ports attached.
During rollback, we may need to deactivate the bridge device and
re-activate it. Implement this, by setting a flag before deactivating,
which prevents external ports to be detached. The flag gets cleared,
when the device state changes to activated (the following activation)
or unmanaged.
This is an ugly solution, for several reasons.
For one, NMDevice tracks its ports in the "slaves" list. But what
it does is ugly. There is no clear concept to understand what it
actually tacks. For example, it tracks externally added interfaces
(nm_device_sys_iface_state_is_external()) that are attached while
not being connected. But it also tracks interfaces that we want to attach
during activation (but which are not yet actually enslaved). It also tracks
slaves that have no actual netdev device (OVS). So it's not clear what this
list contains and what it should contain at any point in time. When we skip
the change of the slaves states during nm_device_master_release_slaves_all(),
it's not really clear what the effects are. It's ugly, but probably correct
enough. What would be better, if we had a clear purpose of what the
lists (or several lists) mean. E.g. a list of all ports that are
currently, physically attached vs. a list of ports we want to attach vs.
a list of OVS slaves that have no actual netdev device.
Another problem is that we attach state on the device
("activation_state_preserve_external_ports"), which should linger there
during the deactivation and reactivation. How can we be sure that we don't
leave that flag dangling there, and that the desired following activation
is the one we cared about? If the follow-up activation fails short (e.g. an
unmanaged command comes first), will we properly disconnect the slaves?
Should we even? In practice, it might be correct enough.
Also, we only implement this for bridges. I think this is where it makes
the most sense. And after all, it's an odd thing to preserve unknown,
external things during a rollback -- unknown, because we have no knowledge
about why these ports are attached and what to do with them.
Also, the change doesn't remember the ports that were attached when the
checkpoint was created. Instead, we preserve all ports that are attached
during rollback. That seems more useful and easier to implement. So we
don't actually rollback to the configuration when the checkpoint was
created. Instead, we rollback, but keep external devices.
Also, we do this now by default and introduce a flag to get the previous
behavior.
https://bugzilla.redhat.com/show_bug.cgi?id=2035519https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/ # 909
(cherry picked from commit 98b3056604)
For devices that configure IP by themselves (by returning
"->ready_for_ip_config() = TRUE" and implementing
->act_stage3_ip_config()), we skip manual configuration. Currently,
manual configuration is the only one that sets flag HAS_DNS_PRIORITY
into the resulting l3cd.
So, the merged l3cd for such devices misses a dns-priority and is
ignored by the DNS manager.
Explicitly initialize the priority to 0; in this way, the default
value for the device will be set in the final l3cd during the merge.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/931
(cherry picked from commit b2e559fab2)
We can get a platform signal for any number of reasons. In particular,
we can get a signal that the object is present in platform, while the object
is tracked as zombie.
"Zombies" are objects that were actively configured by NetworkManager, but
now no longer and thus will need to be removed. We remember them as objects
that we need to delete.
The assertion was wrong. We don't need to handle the case "in_platform"
and linked in "os_zombie_lst" specially. If we get a signal that the
object exists while being a zombie, that is fine and not something to
handle specially.
Backtrace:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f6a208f1db5 in __GI_abort () at abort.c:79
#2 0x00007f6a212ed123 in g_assertion_message (domain=<optimized out>, file=<optimized out>, line=<optimized out>,
func=0x560e23ada2c0 <__func__.39909> "_obj_states_externally_removed_track", message=<optimized out>) at gtestutils.c:2533
#3 0x00007f6a2134620e in g_assertion_message_expr (domain=domain@entry=0x560e23b781a0 "nm", file=file@entry=0x560e23acec60 "src/core/nm-l3cfg.c", line=line@entry=920,
func=func@entry=0x560e23ada2c0 <__func__.39909> "_obj_states_externally_removed_track", expr=expr@entry=0x560e23ad1980 "c_list_is_empty(&obj_state->os_zombie_lst)") at gtestutils.c:2556
#4 0x0000560e23853f38 in _obj_states_externally_removed_track (self=self@entry=0x560e25f168e0, obj=<optimized out>, obj@entry=0x560e25e466a0, in_platform=in_platform@entry=1)
at src/core/nm-l3cfg.c:920
#5 0x0000560e2385b8ea in _nm_l3cfg_notify_platform_change (self=0x560e25f168e0, change_type=change_type@entry=NM_PLATFORM_SIGNAL_CHANGED, obj=0x560e25e466a0) at src/core/nm-l3cfg.c:1364
#6 0x0000560e23861251 in _platform_signal_cb (platform=<optimized out>, obj_type_i=<optimized out>, ifindex=<optimized out>, platform_object=0x560e25e466b8, change_type_i=2,
p_self=<optimized out>) at ./src/libnm-platform/nmp-object.h:443
#7 0x00007f6a1c4a914e in ffi_call_unix64 () at ../src/x86/unix64.S:76
#8 0x00007f6a1c4a8aff in ffi_call (cif=cif@entry=0x7fffac40e570, fn=fn@entry=0x560e23861100 <_platform_signal_cb>, rvalue=<optimized out>, avalue=avalue@entry=0x7fffac40e480)
at ../src/x86/ffi64.c:525
#9 0x00007f6a217fee85 in g_cclosure_marshal_generic (closure=<optimized out>, return_gvalue=<optimized out>, n_param_values=<optimized out>, param_values=<optimized out>,
invocation_hint=<optimized out>, marshal_data=<optimized out>) at gclosure.c:1490
#10 0x00007f6a217fe3bd in g_closure_invoke (closure=0x560e25df53c0, return_value=0x0, n_param_values=5, param_values=0x7fffac40e7a0, invocation_hint=0x7fffac40e720) at gclosure.c:804
#11 0x00007f6a21811945 in signal_emit_unlocked_R (node=node@entry=0x7f6a00008870, detail=detail@entry=0, instance=instance@entry=0x560e25ddd080, emission_return=emission_return@entry=0x0,
instance_and_params=instance_and_params@entry=0x7fffac40e7a0) at gsignal.c:3636
#12 0x00007f6a2181aa56 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7fffac40e9c0) at gsignal.c:3392
#13 0x00007f6a2181b093 in g_signal_emit (instance=instance@entry=0x560e25ddd080, signal_id=<optimized out>, detail=detail@entry=0) at gsignal.c:3448
#14 0x0000560e2392deea in nm_platform_cache_update_emit_signal (self=0x560e25ddd080, cache_op=NMP_CACHE_OPS_UPDATED, obj_old=<optimized out>, obj_new=<optimized out>)
at src/libnm-platform/nm-platform.c:8824
#15 0x0000560e238fd520 in event_handler_recvmsgs () at src/libnm-platform/nm-linux-platform.c:7183
#16 0x0000560e238fdcbf in event_handler_read_netlink () at src/libnm-platform/nm-linux-platform.c:9403
#17 0x0000560e238ffab3 in delayed_action_handle_one () at src/libnm-platform/nm-linux-platform.c:6238
#18 0x0000560e238ffcae in delayed_action_handle_all () at src/libnm-platform/nm-linux-platform.c:6256
#19 0x0000560e23901acc in do_delete_object () at src/libnm-platform/nm-linux-platform.c:7392
#20 0x0000560e2390227c in ip4_address_delete () at src/libnm-platform/nm-linux-platform.c:8782
#21 0x0000560e23922709 in nm_platform_ip4_address_delete (self=self@entry=0x560e25ddd080, ifindex=ifindex@entry=150, address=16843009, plen=<optimized out>, peer_address=16843009)
at src/libnm-platform/nm-platform.c:3574
#22 0x0000560e239275ab in nm_platform_ip_address_sync (self=0x560e25ddd080, addr_family=addr_family@entry=2, ifindex=150, known_addresses=<optimized out>, known_addresses@entry=0x0,
addresses_prune=0x560e25e81aa0) at src/libnm-platform/nm-platform.c:3984
#23 0x0000560e23855e17 in _l3_commit_one (self=0x560e25f168e0, addr_family=2, commit_type=<optimized out>, l3cd_old=<optimized out>, changed_combined_l3cd=<optimized out>)
at src/core/nm-l3cfg.c:4256
#24 0x0000560e2385fc5c in _l3_commit (self=0x560e25f168e0, commit_type=NM_L3_CFG_COMMIT_TYPE_REAPPLY, is_idle=<optimized out>) at src/core/nm-l3cfg.c:4353
#25 0x0000560e239c6a6d in nm_device_cleanup (self=0x560e25e985e0, reason=<optimized out>, cleanup_type=CLEANUP_TYPE_DECONFIGURE) at src/core/devices/nm-device.c:15082
#26 0x0000560e239c7522 in _set_state_full (self=0x560e25e985e0, state=<optimized out>, reason=<optimized out>, quitting=0) at src/core/devices/nm-device.c:15467
#27 0x0000560e239cd482 in queued_state_set (user_data=user_data@entry=0x560e25e985e0) at src/core/devices/nm-device.c:15706
#28 0x00007f6a2131b27b in g_idle_dispatch (source=0x560e25ebab60, callback=0x560e239cd3d0 <queued_state_set>, user_data=0x560e25e985e0) at gmain.c:5579
#29 0x00007f6a2131e95d in g_main_dispatch (context=0x560e25d97bc0) at gmain.c:3193
#30 g_main_context_dispatch (context=context@entry=0x560e25d97bc0) at gmain.c:3873
#31 0x00007f6a2131ed18 in g_main_context_iterate (context=0x560e25d97bc0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3946
#32 0x00007f6a2131f042 in g_main_loop_run (loop=0x560e25d730f0) at gmain.c:4142
#33 0x0000560e237c06ec in main (argc=<optimized out>, argv=<optimized out>) at src/core/main.c:509
Fixes: 929eae245d ('l3cfg: implement NM_L3CFG_CONFIG_FLAGS_ASSUME_CONFIG_ONCE and rework object state')
(cherry picked from commit 849a4eee5c)
After the first time committing, the routes and addresses are removed
directly by bypassing the l3cfg in `nm_device_cleanup()`, then when
committing the second time, the l3cfg think that some addresses are
still configured but they are actually already disappeared from the
kernel already.
To fix it, commit the l3cd changes through l3cfg instead of removing
the addresses/routes directly.
https://bugzilla.redhat.com/show_bug.cgi?id=2043514
Fixes-test: @nmcli_general_activate_static_connection_carrier_not_ignored
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1115
(cherry picked from commit 9f6114afe8)
The DPDK port will not have a link after the devbind which is needed for
configuring an interface to be a DPDK port. The MTU is being committed
during the link change but for DPDK ports there is no link.
The DPDK port MTU should be set on ovsdb right after the interface is
added to ovsdb. This way the users will be able to set MTU for DPDK
ports and modify it.
Please see the following results:
```
port 2: iface0 (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3,
configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=2000, requested_rx_queues=1,
requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=false)
```
(cherry picked from commit 59c60cccf5)
Since commit ecc73eb239 ('ovs-port: always remove the OVSDB entry on
slave release'), ovs port were removing the ovsdb entry upon being
un-enslaved, no matter what the reason for un-enslavement was. The idea
was to remove the stale ovsdb entry upon forcible device removal.
This cleanup is specific to OpenVSwitch, since for other device types,
the device master is the property of the slave and thus goes away along
with the device.
Turns out we're now removing the ovsdb entry even when the device
actually doesn't go away, but we're pretending it does because the
daemon is shutting down.
To add insult to injury, we generally end up removing one entry,
because the other ovsdb calls end up in a queue and don't get serviced
before the daemon shuts down. The result is a mess. (This patch
doesn't solve that -- if someone terminates the daemon with in-flight
ovsdb calls they're still out of luck).
Let's do the cleanup now only if the device was actually physically
removed.
Fixes-test: @NM_reboot_openvswitch_vlan_configuration
Fixes: ecc73eb239 ('ovs-port: always remove the OVSDB entry on slave release')
https://bugzilla.redhat.com/show_bug.cgi?id=2055665https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1117
(cherry picked from commit 897977e960)
pppd is a delicate flower. On orderly shutdown, it likes to tell the
other side. This seems to take at least a second even when no real
network latency is at play, on busy systems 1.5 seconds easily ends up
being inadequate.
A violent shutdown is generally okay apart from that it can leave
garbage (port lock) behind and the other side potentially confused for a
while.
As it happens, this interacts badly with modemu.pl which is used for
testing: the pseudo terminal in PPP line discipline mode has no idea
that the remote disconnected and while ModemManager is learning that
something wrong the hard way (AT command timing out, because the remote
still expects to talk PPP), the test times out.
Let's increase the timeout to something more reasonable.
https://bugzilla.redhat.com/show_bug.cgi?id=2049596https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1103
(cherry picked from commit 47ff99515f)
Don't progress to the IP ready state until all objects are committed
to platform. Note that l3cfg has a 20 seconds timeout after which
unavailable objects are considered "definitely unavailable" and are
removed from the list.
Fixes-test: @ipv6_routes_with_src
https://bugzilla.redhat.com/show_bug.cgi?id=2043133
(cherry picked from commit f15b3f15a7)
l3cfg has a "temp_not_available" list of objects that couldn't be
added to platform, but can be added once some preconditions become
true (for example, a IPv6 route with a "src" attribute requires a
non-tentative src address to be present).
Retry to commit those objects once all addresses have completed
ACD/DAD.
(cherry picked from commit 9a090fdf7b)
nm_l3_config_data_get_nameservers() returns a pointer to "struct in6_addr". Not
a pointer to pointers.
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:389
#1 0x00007f8060dd9109 in memcpy (__len=<optimized out>, __src=0xfd, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:29
#2 g_array_append_vals (len=1, data=0xfd, farray=0x55dd69332130) at ../glib/garray.c:522
#3 g_array_append_vals (farray=0x55dd69332130, data=0xfd, len=1) at ../glib/garray.c:509
#4 0x000055dd68d2a27d in _garray_inaddr_add (p_arr=<optimized out>, addr_family=<optimized out>, addr=0xfd) at src/core/nm-l3-config-data.c:295
#5 0x000055dd68ef6510 in nm_l3_config_data_add_nameserver (nameserver=<optimized out>, addr_family=10, self=0x55dd6949f900) at src/core/nm-l3-config-data.c:1442
#6 nm_device_copy_ip6_dns_config (self=0x55dd693c4420, from_device=<optimized out>) at src/core/devices/nm-device.c:10468
#7 0x00007f8060f28aba in _g_closure_invoke_va (param_types=0x0, n_params=<optimized out>, args=0x7fffed43d610, instance=0x55dd693c4420, return_value=0x0, closure=0x55dd693cdb10)
at ../gobject/gclosure.c:893
#8 g_signal_emit_valist (instance=0x55dd693c4420, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7fffed43d610) at ../gobject/gsignal.c:3406
#9 0x00007f8060f28c03 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at ../gobject/gsignal.c:3553
#10 0x000055dd68efd1fb in _dev_ipac6_start (self=0x55dd693c4420) at src/core/devices/nm-device.c:11348
#11 0x000055dd68efd698 in _dev_ipac6_start_continue (self=0x55dd693c4420) at src/core/devices/nm-device.c:11373
#12 _dev_ipll6_set_llstate (self=0x55dd693c4420, llstate=<optimized out>, lladdr=<optimized out>) at src/core/devices/nm-device.c:10576
#13 0x000055dd68e7915e in _emit_changed_on_idle_cb (user_data=user_data@entry=0x55dd6941ca50) at src/core/nm-l3-ipv6ll.c:221
#14 0x00007f8060e0639b in g_idle_dispatch (source=0x55dd693eea30, callback=0x55dd68e78fd0 <_emit_changed_on_idle_cb>, user_data=0x55dd6941ca50) at ../glib/gmain.c:5897
#15 0x00007f8060e0a05f in g_main_dispatch (context=0x55dd6922c800) at ../glib/gmain.c:3381
#16 g_main_context_dispatch (context=0x55dd6922c800) at ../glib/gmain.c:4099
#17 0x00007f8060e5f2a8 in g_main_context_iterate.constprop.0 (context=0x55dd6922c800, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4175
#18 0x00007f8060e09773 in g_main_loop_run (loop=0x55dd69211010) at ../glib/gmain.c:4373
#19 0x000055dd68d09c7b in main (argc=<optimized out>, argv=<optimized out>) at src/core/main.c:509
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
(cherry picked from commit a2c8a3228b)
Certain route types (blackhole, unreachable, prohibit) are not tied to
an interface. They are thus global and we need to track them system wide
(or better: per network namespace). That is done by NMPRouteManager.
For the routing rules, it's NMDevice itself to track/untrack the rules.
That is done for historical reasons, at the time, NML3Cfg did not exit.
Now with NML3Cfg, it seems that also NML3Cfg should be the part that
handles nodev routes. One reason is that we want to move IP
functionality out of NMDevice. So callers (NMDevice) would just add
blackhole routes to the NML3ConfigData and let NML3Cfg handle them.
Still, to handle these routes is rather different from regular routes.
Normally, NML3Cfg tracks an object state (ObjStateData) for each address/route,
and it hooks into platform signals to update the os_plobj field. Those signals
are dispatched by NMNetns and are only per-ifindex. Hence, NML3Cfg
wouldn't be notified about those nodev routes. Consequently, there
os_plobj could not be (efficiently) maintained and there is no
ObjStateData for such routes.
Instead, all that NML3Cfg does is have the routes in the NML3ConfigData and
tell NMPRouteManager about them. Seems simple enough. The only question
is when should NMPRouteManager sync? For now, we sync when the
track/untracking brings any changes and during reapply. Which is
probably fine.
(cherry picked from commit 9ab53e561a)
Specifically, in nm_utils_ip_route_attribute_to_platform() and in
_l3_config_data_add_obj() handle such new route type. For the moment,
they cannot be stored in a valid NMSettingIPConfig, but later this will
be necessary.
(cherry picked from commit 6255e0dcac)
This will be required next, when we will have also routes without a
device. Split the generation of the route list out.
(cherry picked from commit e32bc6d248)
The general idea is that when we have entries tracked by the
route-manager, that we can mark them all as dirty. Then, calling the
"track" function will reset the dirty flag. Finally, there is a method
to delete all dirty entries.
As we can lookup an entry with O(1) (using dictionaries), we can
sync the list of tracked objects with O(n). We just need to track
all the ones we care about, and then delete those that were not touched
(that is, are still dirty).
Previously, we had to explicitly mark all entries as dirty. We can do
better. Just let nmp_route_manager_untrack_all() mark the survivors as
dirty right away. This way, we can save iterating the list once.
It also makes sense because the only purpose of the dirty flag is to
aid this prune mechanism with track/untrack-all. So, untrack-all can
just help out, and leave the remaining entries dirty, so that the next
track does the right thing.
(cherry picked from commit 9e90bb0817)
Routes of type blackhole, unreachable, prohibit don't have an
ifindex/device. They are thus in many ways similar to routing rules,
as they are global. We need a mediator to keep track which routes
to configure.
This will be very similar to what NMPRulesManager already does for
routing rules. Rename the API, so that it also can be used for routes.
Renaming the file will be done next, so that git's rename detection
doesn't get too confused.
(cherry picked from commit ea4f6d7994)
So far, certain NMObject types could not have an ifindex of zero. Hence,
nmp_lookup_init_object() took such an ifindex to mean lookup all objects
of that type.
Soon, we will support blackhole/unreachable/prohibit route types, which
have their ifindex set to zero. It is still useful to lookup those routes
types via nmp_lookup_init_object().
Change behaviour how to interpret the ifindex. Note that this also
affects various callers of nmp_lookup_init_object(). If somebody was
relying on the previous behavior, it would need fixing.
(cherry picked from commit d4ad9666bd)
We made the choice, that NMPlatformIPRoute does not contain the actual
route table, instead it contains a "remapped" number: table_coerced.
That remapping done, so that the default (which we want semantically to
be 254, RT_TABLE_MAIN) is numerical zero so that struct initialization
doesn't you require to explicitly set the default.
Hence, we must always distinguish whether we have the real table number
or the "table_coerced", and you must convert back and forth between the
two.
Now, the parameter of nm_l3_config_data_merge() are real table numbers
(as also indicated by their name not having the term "coerced"). So
usually they are set to actually 254.
When we set the field of NMPlatformIPRoute, we must coerce it. This was
wrong, and we would see wrong table numbers in the log:
l3cfg[17b98e59a477b0f4,ifindex=2]: obj-state: track: [2a32eca99405767e, ip4-route, type unicast table 0 0.0.0.0/0 via ...
Fixes: b4aa35e72d ('l3cfg: extend nm_l3cfg_add_config() to accept default route table and metric')
(cherry picked from commit e23ebe9183)
The metered status can depend on the DHCP lease, as we accept the
ANDROID_METERED vendor option. That means, on a DHCP update we need
to re-evaluate the metered flag.
This fixes a potential race, where IPv6 might succeed first and
activation completes (with GUESS_NO metered flag). A subsequent
DHCPv4 update requires to re-evaluate that decision.
Fixes-test: @connection_metered_guess_yes
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1080
When the NM_UNMANAGED_PLATFORM_INIT flag is cleared last in
device_link_changed(), a recheck-assume is scheduled and then the
device goes immediately to UNAVAILABLE. During the state transition,
addresses and routes are removed from the interface. Then,
recheck-assume finds that the device can be assumed but it's too late
since the device was already deconfigured.
This is a problem as the whole point of assuming a device is to
activate a connection while leaving the device untouched.
In the NMCI "dracut_NM_vlan_over_bridge and dracut_NM_vlan_over_bond"
test, NM in real root tries to assume a vlan device that was activated
in initrd. When the interface gets deconfigured in UNAVAILABLE, the
connection to the NFS server breaks and the rootfs becomes
inaccessible.
The fix to this problem is to delay state transitions in
device_link_changed() to a idle handler, so that recheck-assume can
run before.
Fixes-test: @dracut_NM_vlan_over_bridge
Fixes-test: @dracut_NM_vlan_over_bond
https://bugzilla.redhat.com/show_bug.cgi?id=2047302
nm_device_set_unmanaged_by_user_settings() does nothing when the
device is unmanaged by platform-init. Remove the if branch to make
this more explicit.
The ACD state handling is unfortunately very complicated. That is, because
we obviously need to track state about how ACD is going (the acd_data, and
in particular NML3AcdAddrState). Then there are various things that can
happen, which are the AcdStateChangeMode enums. All these state-changes
come together in one function: _l3_acd_data_state_change(), which is
therefore complicated (I don't think that it would become simpler by
spreading this code out to different functions, on the contrary).
Anyway.
So, what happens when we need to reset the n-acd instance? For example,
because the MAC address of the link changed or some error. I guess, we
need to restart probing.
Previously, I think this was not handled properly. We already tried to
fix this several times, the last was commit b331606386 ('l3cfg: on
n-acd instance-reset clear also ready ACD state'). There is still an
issue ([1]).
The bug [1] is, that we are in state NM_L3_ACD_ADDR_STATE_READY, during
ACD_STATE_CHANGE_MODE_TIMEOUT event. That leads to an assertion
failure.
#5 0x00007f23be74698f in g_assertion_message_expr (domain=0x5629aca70359 "nm", file=0x5629aca62aab "src/core/nm-l3cfg.c", line=2395, func=0x5629acb26b30 <__func__.72.lto_priv.4> "_l3_acd_data_state_change", expr=<optimized out>) at ../glib/gtestutils.c:3091
#6 0x00005629ac937e46 in _l3_acd_data_state_change (self=0x5629add69790, acd_data=0x5629add8d520, state_change_mode=ACD_STATE_CHANGE_MODE_TIMEOUT, sender_addr=0x0, p_now_msec=0x7ffded506460) at src/core/nm-l3cfg.c:2395
#7 0x00005629ac939f4d in _l3_acd_data_timeout_cb (user_data=user_data@entry=0x5629add8d520) at src/core/nm-l3cfg.c:1933
#8 0x00007f23be71c5a1 in g_timeout_dispatch (source=0x5629addd7a80, callback=0x5629ac939ee0 <_l3_acd_data_timeout_cb>, user_data=0x5629add8d520) at ../glib/gmain.c:4889
#9 0x00007f23be71bd4f in g_main_dispatch (context=0x5629adc6da00) at ../glib/gmain.c:3337
#10 g_main_context_dispatch (context=0x5629adc6da00) at ../glib/gmain.c:4055
That can only happen, (I think) when we scheduled the timeout
during an earlier ACD_STATE_CHANGE_MODE_INSTANCE_RESET event. Meaning,
we need to handle instance-reset better.
Instead, during instance-reset, switch always back to state PROBING, and
let the timeout figure it out.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2047788
The first point is that ACD timeout is strongly tied to the current state. That
is (somewhat) obvious, because _l3_acd_data_state_set_full() will clear any pending
timeout. So you can only schedule a timeout *after* setting the state,
and setting the next state, will clear the timeout.
Likewise, note that l3_acd_data_state_change() for the event
ACD_STATE_CHANGE_MODE_TIMEOUT asserts that it is only called in the few
states where that is expected. See rhbz#2047788 where that assertion
gets hit.
The first point means that we must only schedule a timer when we are
also in a state that supports that. Add an assertion for that at the
point when scheduling the timeout. The assert at this point is useful,
because it catches the moment when we do the wrong thing (instead of
getting the assertion later during the timeout, when we no longer know
where the error happened).
See-also: https://bugzilla.redhat.com/show_bug.cgi?id=2047788
Now that higher priorities numbers really mean more important,
it seems that the VPN configuration should be rather important.
Bump the number, also in relation to NMDevice's L3ConfigDataType.
It might not matter too much, because usually the VPN tunnel device does
not have NDevice to add other l3cds and those from VPN might be alone.
Except, maybe with routing VPN (libreswan) that is different. Dunno.
NM_L3CFG_CONFIG_PRIORITY_IPV6LL and L3_CONFIG_DATA_TYPE_LL_6 are
somewhat related. They probably should be numerically identical.
Now, L3_CONFIG_DATA_TYPE_LL_6 cannot be zero (because
L3_CONFIG_DATA_TYPE_LL_4 is zero and L3ConfigDataType values must be
distinct). Therefore, also bump NM_L3CFG_CONFIG_PRIORITY_IPV6LL to 1.
Of course, NM_L3CFG_CONFIG_PRIORITY_IPV6LL is not obviously more important
than NM_L3CFG_CONFIG_PRIORITY_IPV4LL. But to be consistent with
L3_CONFIG_DATA_TYPE_LL_6 is probably more important here.
There was a disagreement whether higher priority numbers mean more/less
important. NMDevice and callers of nm_l3cfg_add_config() assumed that
higher numbers are more important, while _l3_config_datas_get_sorted_cmp()
did the inverse.
There is no obvious right or wrong. We only need to agree. It seems
slightly more sensible to keep the caller's interpretation.
This commit does not seem correct. The enum was moved with the declared
intent to make manual IP configuration preferred. But the code comment
in L3ConfigDataType states that higher numbers are more important.
Also, all the other values are intentionally ordered so that more
important method have higher numbers. For example, LL_4 < DHCP_4 < SHARED_4 <
DEV_4 (in increasing priority). While it's often not clear whether to
prefer one method over the other, or what the actual effect of (LL_4 < DHCP_4)
is, the numbers were chosen intentionally.
Only moving MANUALIP first, counters the relative order of the other values.
If there is the problem, that the code comment is wrong (and lower
numbers mean more important), then we would have to reverse all
enum values.
The real problem is that NML3Cfg's _l3_config_datas_get_sorted_cmp()
has the wrong order/understanding. So the real fix is there and will
be done next. That is, unless we would agree that _l3_config_datas_get_sorted_cmp()
is in the right, and prefer lower numbers -- in which case all values had to be
reversed.
This reverts commit af1903fe3f.
Works for the internal DHCP client only as sd-dhcp does the option
parsing for us.
The option 56 is not understood by dhclient so we would need to parse it
ourselves. Let's not do it for now, as the RFC seems to written in a
somewhat poor taste.
https://bugzilla.redhat.com/show_bug.cgi?id=2047415#c2
We have global data NMDhcpOption that describes the DHCP meta data.
There is a consistency check with NM_MORE_ASSERTS.
Improve the error message when the meta data is inconsistent to
help finding the bug.
When NetworkManager shuts down, it is not done properly. We cannot
ensure that all pending async operations are cancelled and therefore
there are possible device leaks. This means that not all the devices
will be disposed and finalized because a different component is handling
a reference to it.
Since the l3cfg refactor, this is affecting to ovs ports. When shutting
down sometimes there are ovs-interface or ovs-port devices that are not
being cleared correctly. When starting NetworkManager back, these
devices are not going to be created again because they already exist and
the existing compatible connections will be instruct to use the device.
But currently, ovsdb is only unrealizing a device after removal if the
state is UNMANAGED. This is wrong, because it will left an inconsistent
state in NetworkManager and the ovs-port/ovs-interface connection won't
be activated.
The interfaces removed by ovsdb must be unrealized if they are on
UNAVAILABLE state.
https://bugzilla.redhat.com/show_bug.cgi?id=2029937
The properties NM_DEVICE_MODEM_CAPABILITIES and
NM_DEVICE_MODEM_CURRENT_CAPABILITIES are uint, with a range from zero to
G_MAXUINT32.
nm_modem_get_capabilities() passes the "untrusted" flags from
ModemManager. Ensure that they fit into 32 bit, and don't cause an
assertion failure due to the range check.
Yes, in practice, on all platforms where we build (known to me), guint
is always the same as guint32. So this has little effect.
Also, cast to guint. Previously, this was just a C enum
NMDeviceModemCapabilities, which theoretically has implementation
defined sizes. Yes, in practice, they too are of size guint, and this
was not an actual bug. But be specific about converting between
different integer types.