When we're deactivating an externally created device that has a master
because we're activating a connection on it, actually remove the device
from the master. Otherwise unpleasant things happen:
active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
device (dummy0): disconnecting for new activation request.
device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)
Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.
This patch changes that.
device (br0): bridge port dummy0 was detached
device (dummy0): released from master device br0
active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')
platform: (dummy0) emit signal link-changed changed: 102: dummy0
<NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746
Now the platform sent us a new link, the "master" property is still set.
device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare
l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
device (br0): master: add one slave 0a458361f9fed8f5/dummy0
Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above. Sad.
device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
active-connection[0x55ed7ba78400]: set state activating (was unknown)
manager: NetworkManager state is now CONNECTING
active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
platform: (dummy0) link: releasing 102 from master 'br0' (101)
device (br0): detached bridge port dummy0
Now things go south. The stage1 cleans the device up, removing it from
the master and the device itself decides it should deactivate itself
because it lots its master regardless of the fact that it should not
have one and it's in fact an unwanted carryover from previous activation.
I believe this is also wrong.
device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'
device[0a458361f9fed8f5] (dummy0): queue-state[deactivating, reason:connection-assumed, id:298]: queue state change
device[0a458361f9fed8f5] (dummy0): activation-stage: synchronously invoke activate_stage2_device_config
device (dummy0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Now things are really weird. We synchronously go to config, effectively
overriding the queued deactivation. We've really messed up.
Sometimes weird things happen.
Let dummy0 be an externally created device that has a master. We decide
to activate a connection that has no master on it:
active-connection[0x55ed7ba78400]: constructed (NMActRequest, version-id 4, type managed)
device[0a458361f9fed8f5] (dummy0): sys-iface-state: external -> managed
device[0a458361f9fed8f5] (dummy0): queue activation request waiting for currently active connection to disconnect
device (dummy0): disconnecting for new activation request.
device (dummy0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (enslaved)(no-config)
Note the "no-config" above. We'set priv->master = NULL, but didn't
communicate the change to the platform. I believe this is not good.
device (br0): bridge port dummy0 was detached
device (dummy0): released from master device br0
active-connection[0x55ed7ba782e0]: set state deactivating (was activated)
device (dummy0): ip4: set state none (was done, reason: ip-state-clear)
device (dummy0): ip6: set state none (was done, reason: ip-state-clear)
device (dummy0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')
platform: (dummy0) emit signal link-changed changed: 102: dummy0
<NOARP,UP,LOWER_UP;broadcast,noarp,up,running,lowerup> mtu 1500 master 101 arp 1 dummy* init
addrgenmode none addr EA:8D:DD:DF:1F:B7 brd FF:FF:FF:FF:FF:FF driver dummy rx:0,0 tx:39,4746
Now the platform sent us a new link, the "master" property is still set.
device[0a458361f9fed8f5] (dummy0): queued link change for ifindex 102
device[0a458361f9fed8f5] (dummy0): deactivating device (reason 'new-activation') [60]
device (dummy0): ip: set (combined) state none (was done, reason: ip-state-clear)
config: device-state: write #102 (/run/NetworkManager/devices/102); managed=managed, perm-hw-addr-fake=EA:8D:DD:DF:1F:B7, route-metric-default=0-0
active-connection[0x55ed7ba782e0]: set state deactivated (was deactivating)
active-connection[0x55ed7ba782e0]: check-master-ready: already signalled (state deactivated, master 0x55ed7ba781c0 is in state activated)
device (dummy0): Activation: starting connection 'dummy1' (ec6fca51-84e6-4a5b-a297-f602252c9f69)
device[0a458361f9fed8f5] (dummy0): activation-stage: schedule activate_stage1_device_prepare
l3cfg[ae290b5c1f585d6c,ifindex=102]: emit signal (platform-change-on-idle, obj-type-flags=0x2a)
device (br0): master: add one slave 0a458361f9fed8f5/dummy0
Amidst the new activation we're processing the netlink message we got.
We set priv->master back, effectively nullifying the release above.
device (dummy0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'in-state-change'
active-connection[0x55ed7ba78400]: set state activating (was unknown)
manager: NetworkManager state is now CONNECTING
active-connection[0x55ed7ba78400]: check-master-ready: not signalling (state activating, no master)
device[8fff58d61c7686ce] (br0): slave dummy0 state change 30 (disconnected) -> 40 (prepare)
device[0a458361f9fed8f5] (dummy0): remove_pending_action (1): 'in-state-change'
device (br0): master: release one slave 0a458361f9fed8f5/dummy0 (not enslaved) (force-configure)
platform: (dummy0) link: releasing 102 from master 'br0' (101)
device (br0): detached bridge port dummy0
Now stage1 cleans the device up, removing it from the master.
device[0a458361f9fed8f5] (dummy0): Activation: connection 'dummy1' master deactivated
device (dummy0): ip4: set state none (was pending, reason: ip-state-clear)
device (dummy0): ip6: set state none (was pending, reason: ip-state-clear)
device[0a458361f9fed8f5] (dummy0): add_pending_action (2): 'queued-state-change-deactivating'
We decide to deal with this by enqueuing a deactivation. That is not
great -- we shouldn't even have had this master!
This patch takes the deactivation path only if we were willingly
enslaved to the master in question.
The @bond_mode_8023ad test has been seen failing, with a log like this:
<debug> [...3.0484] device[...] (eth1): Activation: connection 'bond0.0' master deactivated
<debug> [...3.0484] device[...] (eth1): add_pending_action (2): 'queued-state-change-deactivating'
<debug> [...3.0484] device[...] (eth1): queue-state[deactivating, reason:new-activation, id:709]: queue state change
What happened is that eth1 has been activating. It was already enslaved
to a bond and was in an ip-config state when the bond was removed.
A change to "deactivating" state has been enqueued. But then this
happened:
<trace> [...3.0942] device[...] (eth1): ip4: check-state: state done => done, is_failed=0, is_pending=0,
is_started=0 temp_na=0, may-fail-4=1, may-fail-6=1; disabled4; manualip4=done; ignore6 manualip6=done
<trace> [...3.0942] device[...] (eth1): ip: check-state: (combined) state pending => done
<debug> [...3.0943] device[...] (eth1): ip: set (combined) state done (was pending, reason: check-ip-state)
<info> [...3.0943] device (eth1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
<debug> [...3.0943] device[...] (eth1): add_pending_action (3): 'in-state-change'
<debug> [...3.0943] device[...] (eth1): queue-state[deactivating, reason:new-activation, id:709]: clear queued state change
The IP config succeeded and the queued "deactivating" change was
overriden by the IP4 check result, prompting a change to "ip-check".
With the master still missing. Not good.
Let's terminate the appempts to check the IP state when we cancel the
activation, so that it doesn't override the enqueued state change.
Fixes-test: @bond_mode_8023ad
https://bugzilla.redhat.com/show_bug.cgi?id=2080928https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1245
Introduction of a new setting ipv4.link-local, which enables
link-local IP addresses concurrently with other IP address assignment
implementations such as dhcp or manually.
No way is implemented to obtain a link-local address as a fallback when
dhcp does not respond (as dhcpd does, for example). This could be be
added later.
To maintain backward compatibility with ipv4.method ipv4.link-local has
lower priority than ipv4.method. This results in:
* method=link-local overrules link-local=disabled
* method=disabled overrules link-local=enabled
Furthermore, link-local=auto means that method defines whether
link-local is enabled or disabled:
* method=link-local --> link-local=enabled
* else --> link-local=disabled
The upside is, that this implementation requires no normalization.
Normalization is confusing to implement, because to get it really
right, we probably should support normalizing link-local based on
method, but also vice versa. And since the method affects how other
properties validate/normalize, it's hard to normalize that one, so that
the result makes sense. Normalization is also often not great to the
user, because it basically means to modify the profile based on other
settings.
The downside is that the auto flag becomes API and exists because
we need backward compatibility with ipv4.method.
We would never add this flag, if we would redesign "ipv4.method"
(by replacing by per-method-specific settings).
Defining a default setting for ipv4.link-local in the global
configuration is also supported.
The default setting for the new property can be "default", since old
users upgrading to a new version that supports ipv4.link-local will not
have configured the global default in NetworkManager.conf. Therefore,
they will always use the expected "auto" default unless they change
their configuration.
Co-Authored-By: Thomas Haller <thaller@redhat.com>
During the deactivation of ovs interfaces, ovsdb receives the command to
remove the interface but for OVS system ports the device won't
disappear.
When reconnecting, ovsdb will update first the status and it will notice
that the OVS system interface was removed and it will set the status as
DEACTIVATING. This is incorrect if the status is already DEACTIVATING,
DISCONNECTED, UNMANAGED or UNAVAILABLE because it will block the
activation of the interface.
https://bugzilla.redhat.com/show_bug.cgi?id=2080236
On m68k we get a static assertion, that NMPlatformIP4Address.address
is not at the same offset as NMPlatformIPAddress.address_ptr.
On most architectures, the bitfields fits in a gap between the fields,
but not on m68k, where integers are 2-byte aligned.
On m68k, integers are 2-byte aligned. Hence the assertion was wrong.
What we really want to check, is that NMIPAddr has not a smaller
alignment than in_addr_t and similar.
While at it, also assert the alignment for NMEtherAddr.
We want to assert that our alignment-guarantees do not exceed the
guarantees of the system-linker or system-allocator on the target
platform. Hence, we check against max_align_t. This is a lower bound,
but not the exact check we actually want. And as it turns out, on m64k
it is too low. Add a static check against 4-byte alignment for m64k as
a workaround.
Reported-by: Michael Biebl
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
https://github.com/c-util/c-rbtree/issues/9eb778d3969
As almost always, there is a point in keeping IPv4 and IPv6 implementations
similar. Behave different where there is an actual difference, at the bottom
of the stack.
Technically, g_warn_if_reached() may not be an assertion, according to
glib. However, there is G_DEBUG=fatal-warnings and we want to run with
that.
So this is an assertion to us. Also, logging to stderr/stdout is not a
useful thing to the daemon. Don't do this. Especially, since it depends
on user provided (untrusted) input.
Optimally we want stateless, pure code. Obviously, NMDhcpClient needs to
keep state to know what it's doing. However, we should well encapsulate
the state inside NMDhcpClient, and only accept events/notifications that
mutate the internal state according to certain rules.
Having a function public set_state(self, new_state) means that other
components (subclasses of NMDhcpClient) can directly mangle the state.
That means, you no longer need to only reason about the internal state
of NMDhcpClient (and the events/notifications/state-changes that it
implements). You also need to reason that other components take part of
maintaining that internal state.
Rename nm_dhcp_client_set_state() to nm_dhcp_client_notify(). Also, add
a new enum NMDhcpClientEventType with notification/event types.
In practice, this is only renaming. But naming is important, because it
suggests the reader how to think about the code.
The "noop" state is almost unused, however, nm_dhcp_set_state()
has a check "if (new_state >= NM_DHCP_STATE_TIMEOUT)", so the order
of the NOOP state matters.
Fix that by reordering.
Also, just return right away from NOOP.
NMDhcpState is very tied to events from dhclient. But most of these
states we don't care about, and NMDhcpClient definitely should abstract
and hide them.
We should repurpose NMDhcpState to simpler state. For that, first drop
the state from nm_dhcp_client_handle_event().
This is only the first step (which arguably makes the code more
complicated, because reason_to_state() gets spread out and the logic
happens more than once). That will be addressed next.
- return early to avoid nested block.
- use NM_STR_HAS_PREFIX() over g_str_has_prefix(), because that
can be inlined and only accepts a C literal as prefix argument.
- the code comment was unclear/wrong. If something comes from an environment
variables it is *NOT* UTF-8 safe. Also, we convert all non-ASCII characters,
not only non UTF-8 characters.
- as we already convert the string to ASCII, the check whether it's UTF-8
is bogus.
- using GString is unnecessary.
- use NM_IN_STRSET_ASCII_CASE().
- don't use else block after we return.
- don't accept the "iface" argument just for logging. The caller
can do the logging, if they wish.
g_bytes_ref() does not accept NULL. But doing so can be convenient,
add a helper for that.
Note that g_bytes_unref() does accept NULL, so there is no corresponding
helper.
Log messages when invalid DHCP options are found. For example:
<info> dhcp4 (eth0): error parsing DHCP option 6 (domain_name_servers): address 0.0.0.0 is ignored
<info> dhcp4 (eth0): error parsing DHCP option 12 (host_name): '.example.com' is not a valid DNS domain
<info> dhcp4 (eth0): error parsing DHCP option 26 (interface_mtu): value 60 is smaller than 68
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1225
A virtual infiniband profile (with p-key>=0) can also contain a
"connection.interface-name". But it is required to match the
f"{parent}.{p-key}" format.
However, such a profile can also set "mac_address" instead of "parent".
In that case, the validation code was crashing.
nmcli connection add type infiniband \
infiniband.p-key 6 \
infiniband.mac-address 52:54:00:86:f4:eb:aa:aa:aa:aa:52:54:00:86:f4:eb:aa:aa:aa:aa \
connection.interface-name aaaa
The crash was introduced by commit 99d898cf1f ('libnm: rework caching
of virtual-iface-name for infiniband setting'). Previously, it would not
have crashed, because we just called
g_strdup_printf("%s.%04x", priv->parent, priv->p_key)
with a NULL string. It would still not have validated the connection
and passing NULL as string to printf is wrong. But in practice, it
would have worked mostly fine for users.
Fixes: 99d898cf1f ('libnm: rework caching of virtual-iface-name for infiniband setting')
clang 3.4.2-9.el7 does not like this:
$ clang -DHAVE_CONFIG_H -I. -I.. -I../src/libnm-core-public -I./src/libnm-core-public -I../src/libnm-client-public -I./src/libnm-client-public -pthread -I/usr/include/gio-unix-2.0/ -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_40 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_40 -Wall -Werror -Wextra -Wdeclaration-after-statement -Wfloat-equal -Wformat-nonliteral -Wformat-security -Wimplicit-function-declaration -Winit-self -Wmissing-declarations -Wmissing-include-dirs -Wmissing-prototypes -Wpointer-arith -Wshadow -Wstrict-prototypes -Wundef -Wvla -Wno-duplicate-decl-specifier -Wno-format-y2k -Wno-missing-field-initializers -Wno-sign-compare -Wno-tautological-constant-out-of-range-compare -Wno-unknown-pragmas -Wno-unused-parameter -Qunused-arguments -Wunknown-warning-option -Wtypedef-redefinition -Warray-bounds -Wparentheses-equality -Wunused-value -Wimplicit-fallthrough -fno-strict-aliasing -fdata-sections -ffunction-sections -Wl,--gc-sections -g -O2 -MT examples/C/glib/examples_C_glib_add_connection_libnm-add-connection-libnm.o -MD -MP -MF examples/C/glib/.deps/examples_C_glib_add_connection_libnm-add-connection-libnm.Tpo -c -o examples/C/glib/examples_C_glib_add_connection_libnm-add-connection-libnm.o `test -f 'examples/C/glib/add-connection-libnm.c' || echo '../'`examples/C/glib/add-connection-libnm.c
...
../src/libnm-client-public/nm-client.h:149:31: error: redefinition of typedef 'NMClient' is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef struct _NMClient NMClient;
^
Our code base is C11 internally (actually "-std=gnu11"), but this problem
happens when we build the example. The warning is actually correct, because
our public headers should be more liberal (and possibly be C99 or even C89,
this is undefined).
Fixes: 649314ddaa ('libnm: replace nm-types.h by defining the types in respective headers')
We no longer have "nm-types.h", which forward declares most relevant
typedefs. We also don't ensure that each header includes all the
headers that it has a dependency (instead, we rely on the user to
include "NetworkManager.h", which does the right thing).
The "right thing" depends on doing doing it in the right order.
Reorder the includes.
We want to warn the user if they're connecting to an insecure network:
$ nmcli d wifi
IN-USE BSSID SSID MODE CHAN RATE SIGNAL BARS SECURITY
BA:00:6A:3C:C2:09 Secured Network Infra 2 54 Mbit/s 100 ▂▄▆█ WPA3
FA:7C:46:CC:9F:BE Ye Olde Wlan Infra 1 54 Mbit/s 100 ▂▄▆█ WEP
$ nmcli d wifi connect 'Ye Olde Wlan'
Warning: WEP encryption is known to be insecure.
...
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1224