Reduce the use of NM_PLATFORM_GET / nm_platform_get() to get
the platform singleton instance.
For one, this is a step towards supporting namespaces, where we need
to use different NMNetns/NMPlatform instances depending on in which
namespace the device lives.
Also, we should reduce our use of singletons. They are difficult to
coordinate on shutdown. Instead there should be a clear order of
dependencies, expressed by owning a reference to those singelton
instances. We already own a reference to the platform singelton,
so use it and avoid NM_PLATFORM_GET.
(cherry picked from commit 94d9ee129d)
When deciding whether to touch a device we sometimes look at whether
the active connection is external/assumed. In many cases however,
there is no active connection around (e.g. while moving the device
from state unmanaged to disconnected before assuming).
So in most cases we instead look at the device-state-reason to decide
whether to touch the interface (see nm_device_state_reason_check()).
Often it's desirable to have no state and passing data as function
arguments. However, the state reason has to be passed along several hops
(e.g. a queued state change). Or a change to a master/slave can affect
the slave/master, where we pass on the state reason. Or an intermediate
event might invalidate a previous state reason. Passing the state
whether to touch a device or not as a state-reason is cumbersome
and limited.
Instead, the device should be aware of whats going on. Add a
sys-iface-state with:
- SYS_IFACE_STATE_EXTERNAL: meaning, NM should not touch it
- SYS_IFACE_STATE_ASSUME: meaning, NM is gracefully taking over
- SYS_IFACE_STATE_MANAGED: meaning, the device is managed by NM
- SYS_IFACE_STATE_REMOVED: the device no longer exists
This replaces most checks of nm_device_state_reason_check() and
nm_active_connection_get_activation_type() by instead looking at
the sys-iface-state of the device.
This patch probably has still issues, but the previous behavior was
not very clear either. We will need to identify those issues in future
tests and tweak the behavior. At least, now there is one flag that
describes how to behave.
nm_device_uses_assumed_connection() basically called
nm_active_connection_get_assumed() on the device.
Rename those functions to be closer to the activation-type
flags.
The concepts of "assume", "external", and "assume_or_external"
will make sense with the following commits.
Scenario:
Have a connection with DHCPv4 and a default-route. When externally
removing the default route (`ip route delete 0.0.0.0/0`) and issuing
`nmcli device reapply $IF`, the default route was not restored.
That was because when externally removing the default route,
we would remove the gateway from priv->con_ip4_config (see
update_ip4_config()). Later, when reapplying the connection,
the IP method doesn't actually change. So we would not restart
DHCP and thus there is no gateway around to add the default route.
The default route would only be restored after receiving a DHCP lease
in the far future.
Fix that, by always restarting the IP method.
The state-change of a device has a reason argument, which is mostly for information
only.
There are many places in code that are the source of a state-reason.
Mostly these are calls to:
- nm_device_state_changed()
- nm_device_queue_state()
- nm_device_queue_recheck_available()
- nm_device_set_unmanaged_by_*()
- nm_device_master_release_one_slave()
- nm_device_ip_method_failed()
- nm_modem_emit_prepare_result()
- nm_modem_emit_ppp_failed()
- nm_manager_deactivate_connection()
- NM_SET_OUT (out_failure_reason, NM_DEVICE_STATE_REASON_*);
However, there are a few places in code that look at the reason
to decide how to proceed. I think this is a bad pattern, because
cause and effect are decoupled and it gets hard to understand where
a certain reason is set and what consequences that has.
Add a nop-function nm_device_state_reason_check() to mark all uses
of the device state reason that derive decisions from it. That is,
highlight the "effect" part.
This argument is only relevant when the NMActStageReturn argument
indicates NM_ACT_STAGE_RETURN_FAILURE. In all other cases it is ignored.
Rename the argument to make the meaning clearer. The argument is passed
through several layers of code, it isn't obvious that this argument only
matters for the failure case. Also, the distinct name makes it easier
to distinguish from other uses of the "reason" name.
While at it, do some drive-by cleanup:
- use g_return_*() instead of g_assert() to have a more graceful
assertion.
- functions like dhcp4_start() don't need to return a failure reason.
Most callers don't care, and the caller who does can determine the
proper reason.
- allow omitting the out-argument via NM_SET_OUT().
After commit 22e8af6242 ("device: set a per-device default MTU on
activation") we explicitly set the VLAN MTU to 1500 if not overridden
by user settings. This has the advantage that the MTU is set to a
predictable value, while before it could have different values
depending on when the interface was created (for example, the
interface would get a 1500 MTU if created during boot, or would
inherit the parent's MTU if activated manually).
However, a better default value is the MTU of the parent interface
which is in most cases what the user wants. This value was the default
before commit 22e8af6242 for manually activated connections.
https://bugzilla.redhat.com/show_bug.cgi?id=1414186
(cherry picked from commit 7dde8d8106)
Instead of overwriting ip4_config_pre_commit(), add a new function
get_mtu().
This also adds a default value in case there is no user-configuration.
This will allow us later to reset a default MTU based on the device
type.
Most implementations of realize_start_notify() do the same
for link_changed().
Let NMDevice's base implementation of realize_start_notify() call
link_changed() -- which by default does notthing. This allows subclasses
to only overwrite link_changed().
This makes it easier to install the files with proper names.
Also, it makes the makefile rules slightly simpler.
Lastly, the documentation is now generated into docs/api, which makes it
possible to get rid of the awkward relative file names in docbook.
Keep the include paths clean and separate. We use directories to group source
files together. That makes sense (I guess), but then we should use this
grouping also when including files. Thus require to #include files with their
path relative to "src/".
Also, we build various artifacts from the "src/" tree. Instead of having
individual CFLAGS for each artifact in Makefile.am, the CFLAGS should be
unified. Previously, the CFLAGS for each artifact differ and are inconsistent
in which paths they add to the search path. Fix the inconsistency by just
don't add the paths at all.
We only needed proper glib enum types for having properties
and signal arguments. These got all converted to plain int,
so no longer generate such an enum type.
Had to rename "nm-enum-types.h" because it works badly with
"libnm/nm-enum-types.h". Maybe I could fix that differently,
but duplicate names is anyway error prone.
Note that "nm-core-enum-types.h" is already taken too, so
"nm-src-enum-types.h" it is.
An interface would make sense to allow the actual device-factory to inherit
from another type.
However, glib interfaces make code much harder to follow and less
efficient. The device factory shall be a very simple type with meta data
about supported device types and the ability to create device instances.
There is no need to make this an interface implementation, instead just
let the factories inherit from NM_TYPE_DEVICE_FACTORY directly.
- use _NM_GET_PRIVATE() and _NM_GET_PRIVATE_PTR() everywhere.
- reorder statements, to have GObject related functions (init, dispose,
constructed) at the bottom of each file and in a consistent order w.r.t.
each other.
- unify whitespaces in signal and properties declarations.
- use NM_GOBJECT_PROPERTIES_DEFINE() and _notify()
- drop unused signal slots in class structures
- drop unused header files for device factories
This retry loop was added by commit dc6341acec.
But I suspect, that the main-point there was not to retry the netlink
request to set the interface up. Why would that fail, and why would
a failure to set the interface up require a retry?
I think it was added to wait for carrier. But waiting for carrier was
later dropped with commit 5074898591
and it is not clear why we would wait for carrier at all -- we don't
do that for other device types either.
Instead of letting different subclasses call reset in their
virtual deactivate() function, do it in the parent class.
This works nicely, because the parent know whether the MAC
address is currently modified.
When a user want to explicitly spoof the MAC address, a failure
to do so should fail activation. For one, failing to do so may
be a security problem. In any case, if user asks to configure the
interface in a certain way and we fail to do so that shall result
in a failure to activate.
Extend the "ethernet.cloned-mac-address" and "wifi.cloned-mac-address"
settings. Instead of specifying an explicit MAC address, the additional
special values "permanent", "preserve", "random", "random-bia", "stable" and
"stable-bia" are supported.
"permanent" means to use the permanent hardware address. Previously that
was the default if no explict cloned-mac-address was set. The default is
thus still "permanent", but it can be overwritten by global
configuration.
"preserve" means not to configure the MAC address when activating the
device. That was actually the default behavior before introducing MAC
address handling with commit 1b49f941a6.
"random" and "random-bia" use a randomized MAC address for each
connection. "stable" and "stable-bia" use a generated, stable
address based on some token. The "bia" suffix says to generate a
burned-in address. The stable method by default uses as token the
connection UUID, but the token can be explicitly choosen via
"stable:<TOKEN>" and "stable-bia:<TOKEN>".
On a D-Bus level, the "cloned-mac-address" is a bytestring and thus
cannot express the new forms. It is replaced by the new
"assigned-mac-address" field. For the GObject property, libnm's API,
nmcli, keyfile, etc. the old name "cloned-mac-address" is still used.
Deprecating the old field seems more complicated then just extending
the use of the existing "cloned-mac-address" field, although the name
doesn't match well with the extended meaning.
There is some overlap with the "wifi.mac-address-randomization" setting.
https://bugzilla.gnome.org/show_bug.cgi?id=705545https://bugzilla.gnome.org/show_bug.cgi?id=708820https://bugzilla.gnome.org/show_bug.cgi?id=758301
VLAN and MACVLAN devices consider an ethernet.mac-address setting
to find the parent device. This setting shall be the permanent MAC
address of the device, not the current.
When the entire NMSettingWired setting is missing, it should be treated
exactly the same as each property having the default/unset value.
Otherwise, adding a NMSettingWired setting only to set (say) MTU,
would result in different behavior. Although effectively the
"cloned-mac-address" shall be in both cases the same.
Instead of accessing the singleton getter nm_settings_get(), obtain
the settings instance from the device instance itself via
nm_device_get_settings().
The device creation can be attempted if the name can be determined. It
alone is doesn't mean that there's a parent device -- the name could
just have been hardcoded in the connection.
NetworkManager[21519]: nm_device_get_ifindex: assertion 'NM_IS_DEVICE (self)' failed
Program received signal SIGTRAP, Trace/breakpoint trap.
g_logv (log_domain=0x5555557fb2e5 "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fffffffd3d0) at gmessages.c:1046
1046 g_private_set (&g_log_depth, GUINT_TO_POINTER (depth));
(gdb) bt
#0 0x00007ffff4ec88c3 in g_logv (log_domain=0x5555557fb2e5 "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fffffffd3d0) at gmessages.c:1046
#1 0x00007ffff4ec8a3f in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at gmessages.c:1079
#2 0x00005555555d2090 in nm_device_get_ifindex (self=0x0) at devices/nm-device.c:562
#3 0x00005555555ef77a in nm_device_supports_vlans (self=0x0) at devices/nm-device.c:9865
#4 0x00005555555bf2f9 in create_and_realize (device=0x555555c549b0 [NMDeviceVlan], connection=0x555555b451e0, parent=0x0, out_plink=0x7fffffffd5f8, error=0x7fffffffd700) at devices/nm-device-vlan.c:225
#5 0x00005555555d5757 in nm_device_create_and_realize (self=0x555555c549b0 [NMDeviceVlan], connection=0x555555b451e0, parent=0x0, error=0x7fffffffd700) at devices/nm-device.c:1783
#6 0x0000555555688601 in system_create_virtual_device (self=0x555555af51c0 [NMManager], connection=0x555555b451e0) at nm-manager.c:1120
#7 0x000055555568894e in connection_changed (settings=0x555555ae8220 [NMSettings], connection=0x555555b451e0, manager=0x555555af51c0 [NMManager]) at nm-manager.c:1172
#8 0x0000555555693448 in nm_manager_start (self=0x555555af51c0 [NMManager], error=0x7fffffffda30) at nm-manager.c:4466
#9 0x00005555555d166f in main (argc=1, argv=0x7fffffffdba8) at main.c:454
(gdb)
Fixes: 332994f1b1
The hardware address of a VLAN must be kept aligned with the one of
its parent device, and we already used a signal in NMDeviceVlan to
catch changes in parent address and update the VLAN device
accordingly.
But this didn't work in all cases because the change might happen
after the VLAN gets created but before we register the signal, so it
is necessary to add further checks to enforce the alignment during the
device activation.
https://bugzilla.redhat.com/show_bug.cgi?id=1325752
Don't rely on what's already on the device. It could be that the MAC address
set on the device is not meaningful -- the NM crashed while two devices were
teamed together and now they have the same hardware address and now it's
impossible to bond them with mode=5.
- All internal source files (except "examples", which are not internal)
should include "config.h" first. As also all internal source
files should include "nm-default.h", let "config.h" be included
by "nm-default.h" and include "nm-default.h" as first in every
source file.
We already wanted to include "nm-default.h" before other headers
because it might contains some fixes (like "nm-glib.h" compatibility)
that is required first.
- After including "nm-default.h", we optinally allow for including the
corresponding header file for the source file at hand. The idea
is to ensure that each header file is self contained.
- Don't include "config.h" or "nm-default.h" in any header file
(except "nm-sd-adapt.h"). Public headers anyway must not include
these headers, and internal headers are never included after
"nm-default.h", as of the first previous point.
- Include all internal headers with quotes instead of angle brackets.
In practice it doesn't matter, because in our public headers we must
include other headers with angle brackets. As we use our public
headers also to compile our interal source files, effectively the
result must be the same. Still do it for consistency.
- Except for <config.h> itself. Include it with angle brackets as suggested by
https://www.gnu.org/software/autoconf/manual/autoconf.html#Configuration-Headers
Get rid of NM_UNMANAGED_DEFAULT and refine the interaction between
unmanaged flags, device state and managed property.
Previously, the NM_UNMANAGED_DEFAULT was special in that a device was
still considered managed if it had solely the NM_UNMANAGED_DEFAULT flag
set and its state was managed. Thus, whether the device (state) was managed,
depended on the device state too.
Now, a device is considered managed (or unmanaged) based on the unmanaged
flags and realization state alone. At the same time, the device state
directly corresponds to the managed property of the device. Of course,
while changing the unmanaged flags, that invariant is shortly violated
until the state transistion is complete.
Introduce more unmanaged flags whereas some of them are non-authorative.
For example, the EXTERNAL_DOWN flag has only effect as long as the user
didn't explicitly manage the device (NM_UNMANAGED_USER_EXPLICIT). In other
words, certain flags can render other flags ineffective. Whether the device
is considered managed depends on the flags but also at the explicitly unset flags.
In a way, this is similar to previous where NM_UNMANAGED_DEFAULT was ignored
(if no other flags were present).
Also, previously a device that was NM_UNMANAGED_DEFAULT and in disconnected
state would transition back to unmanaged. No longer do that. Once a device is
managed, it stays managed as long as the flags indicate it should be managed.
However, the user can also modify the unmanaged flags via the D-Bus API.
Also get rid or nm_device_finish_init(). That was previously called
by NMManager after add_device(). As we now realize devices (possibly
multiple times) this should be handled during realization.
https://bugzilla.gnome.org/show_bug.cgi?id=746566
When we decide to add a new link, we alredy checked that no such link exists
(ignoring race conditions).
It is wrong to accept a EXITS failure when adding the link. There is no guarantee
that the existing link has all the same properties as the one we intend to add.
More importantly, this link was added externally outside of NetworkManager and it
should not be taken over.
Just treat EXISTS as a failure as any other.
Extract a function update_properties() similar to other device
implementations. It reads the device specific data from platform
and raises property-changed notifications.
update_properties() is now called by realize_start_notify().
NMDeviceVlan:realize() -- which previously implemented something like
update_properties() -- now does nothing. Note that previously
realize() might have failed, but this is different from other device
implementations and it was unclear what a failure really meant.
Ok, it might fail because the link was not found in the platform cache.
But other implementations don't check that either, so why vlan? And
how to handle that properly anyway?
Therefore realize()'s implementation is no longer needed because
nm_device_realize() already calls realize_start_setup(), which
calls realize_start_notify() and update_properties().
update_connection() no longer refreshes the device properties.
Instead it only modifies the passed-in NMConnection. It's a bit
ugly that update_connection() uses both cached properties from
NMDeviceVlan (vlan_id) and platform properties (xgress maps).
Also, update_connection() doesn't return early on error but continues
trying to update the NMConnection. The reason is that update_connection()
cannot return a failure status.
All implementations of NMDevice:setup_start() in derived classes
invoke the parent implementation first. Enforce that by moving
NMDevice:setup_start() to realize_start_setup() and only notify
derived classes afterwards via NMDevice:realize_start_notify().