This is a complete refactoring of the bluetooth code.
Now that BlueZ 4 support was dropped, the separation of NMBluezManager
and NMBluez5Manager makes no sense. They should be merged.
At that point, notice that BlueZ 5's D-Bus API is fully centered around
D-Bus's ObjectManager interface. Using that interface, we basically only
call GetManagedObjects() once and register to InterfacesAdded,
InterfacesRemoved and PropertiesChanged signals. There is no need to
fetch individual properties ever.
Note how NMBluezDevice used to query the D-Bus properties itself by
creating a GDBusProxy. This is redundant, because when using the ObjectManager
interfaces, we have all information already.
Instead, let NMBluezManager basically become the client-side cache of
all of BlueZ's ObjectManager interface. NMBluezDevice was mostly concerned
about caching the D-Bus interface's state, tracking suitable profiles
(pan_connection), and moderate between bluez and NMDeviceBt.
These tasks don't get simpler by moving them to a seprate file. Let them
also be handled by NMBluezManager.
I mean, just look how it was previously: NMBluez5Manager registers to
ObjectManager interface and sees a device appearing. It creates a
NMBluezDevice object and registers to its "initialized" and
"notify:usable" signal. In the meantime, NMBluezDevice fetches the
relevant information from D-Bus (although it was already present in the
data provided by the ObjectManager) and eventually emits these usable
and initialized signals.
Then, NMBlue5Manager emits a "bdaddr-added" signal, for which NMBluezManager
creates the NMDeviceBt instance. NMBluezManager, NMBluez5Manager and
NMBluezDevice are strongly cooperating to the point that it is simpler
to merge them.
This is not mere refactoring. This patch aims to make everything
asynchronously and always cancellable. Also, it aims to fix races
and inconsistencies of the state.
- Registering to a NAP server now waits for the response and delays
activation of the NMDeviceBridge accordingly.
- For NAP connections we now watch the bnep0 interface in platform, and tear
down the device when it goes away. Bluez doesn't send us a notification
on D-Bus in that case.
- Rework establishing a DUN connection. It no longer uses blocking
connect() and does not block until rfcomm device appears. It's
all async now. It also watches the rfcomm file descriptor for
POLLERR/POLLHUP to notice disconnect.
- drop nm_device_factory_emit_component_added() and instead let
NMDeviceBt directly register to the WWan factory's "added" signal.
NMDevice's act_stage1_prepare() now does nothing. Calling it is not
useful and has no effect.
In general, when a subclass overwrites a virtual function, it must be
defined whether the subclass must, may or must-not call the parents
implementation. Likewise, it must be clear when the parents
implementation should be chained: first, as last, does it matter?
In any case, that very much depends on how the parent is implemented
and this can only be solved by documentation and common conventions.
It's a forgiving approach to have a parents implementation do nothing,
then the subclass may call it at any time (or not call it at all).
This is especially useful if classes don't know their parent class well.
But in NetworkManager code the relationship between classes are known
at compile time, so every of these classes knows it derives directly
from NMDevice.
This forgingin approach was what NMDevice's act_stage1_prepare() was doing.
However, it also adds lines of code resulting in a different kind of complexity.
So, it's not clear that this forgiving approach is really better. Note
that it also has a (tiny) runtime and code-size overhead.
Change the expectation of how NMDevice's act_stage1_prepare() should be
called: it is no longer implemented, and subclasses *MUST* not chain up.
Not all masters type have a platform link and so it's wrong to check
for it to decide whether the slave should be really released. Move the
check to master devices that need it (bond, bridge and team).
OVS ports don't need the check because they don't call to platform to
remove a slave.
https://bugzilla.redhat.com/show_bug.cgi?id=1733709
We no longer add these. If you use Emacs, configure it yourself.
Also, due to our "smart-tab" usage the editor anyway does a subpar
job handling our tabs. However, on the upside every user can choose
whatever tab-width he/she prefers. If "smart-tabs" are used properly
(like we do), every tab-width will work.
No manual changes, just ran commands:
F=($(git grep -l -e '-\*-'))
sed '1 { /\/\* *-\*- *[mM]ode.*\*\/$/d }' -i "${F[@]}"
sed '1,4 { /^\(#\|--\|dnl\) *-\*- [mM]ode/d }' -i "${F[@]}"
Check remaining lines with:
git grep -e '-\*-'
The ultimate purpose of this is to cleanup our files and eventually use
SPDX license identifiers. For that, first get rid of the boilerplate lines.
We don't need GPtrArray to construct an array of fixed side.
Actually, we also don't need to malloc each NMPlatformBridgeVlan
element individually. Just allocate one buffer and append them
to the end.
(cherry picked from commit 6bc8ee87af)
In some cases it is convenient to specify ranges of bridge vlans, as
already supported by iproute2 and natively by kernel. With this commit
it becomes possible to add a range in this way:
nmcli connection modify eth0-slave +bridge-port.vlans "100-200 untagged"
vlan ranges can't be PVIDs because only one PVID vlan can exist.
https://bugzilla.redhat.com/show_bug.cgi?id=1652910
(cherry picked from commit 7093515777)
When STP is disabled, the bridge parameters 'priority', 'forward-delay',
'hello-time' and 'max-age' are irrelevant.
We already skip them when loading a connection profile from a ifcfg file.
Do the same when generating a connection from a configured device, in
order to possibly assume the connection.
...also when the connection is created at NetworkManager
startup to map an already configured bridge.
Ensure the device has configuration values that fall inside
NetworkManager boundaries, otherwise map the value with a default.
Platform had it's own scheme for reporting errors: NMPlatformError.
Before, NMPlatformError indicated success via zero, negative integer
values are numbers from <errno.h>, and positive integer values are
platform specific codes. This changes now according to nm-error:
success is still zero. Negative values indicate a failure, where the
numeric value is either from <errno.h> or one of our error codes.
The meaning of positive values depends on the functions. Most functions
can only report an error reason (negative) and success (zero). For such
functions, positive values should never be returned (but the caller
should anticipate them).
For some functions, positive values could mean additional information
(but still success). That depends.
This is also what systemd does, except that systemd only returns
(negative) integers from <errno.h>, while we merge our own error codes
into the range of <errno.h>.
The advantage is to get rid of one way how to signal errors. The other
advantage is, that these error codes are compatible with all other
nm-errno values. For example, previously negative values indicated error
codes from <errno.h>, but it did not entail error codes from netlink.
Note the special error codes NM_UTILS_ERROR_CONNECTION_AVAILABLE_*.
This will be used to determine, whether the profile is fundamentally
incompatible with the device, or whether just some other properties
mismatch. That information will be importand during a plain `nmcli
connection up`, where NetworkManager searches all devices for a device
to activate. If no device is found (and multiple errors happened),
we want to show the error that is most likely relevant for the user.
Also note, how NMDevice's check_connection_compatible() uses the new
class field "device_class->connection_type_check_compatible" to simplify
checks for compatible profiles.
The error reason is still unused.
It seems to me the NM_DEVICE_CLASS_DECLARE_TYPES() macro confuses more
than helping. Let's explicitly initialize the two fields, albeit with
another helper macro NM_DEVICE_DEFINE_LINK_TYPES() to get the list of
link-types right.
For consistency, also leave nop-lines like
device_class->connection_type_supported = NULL;
device_class->link_types = NM_DEVICE_DEFINE_LINK_TYPES ();
because all NMDevice class init methods should have this same
boiler plate code and to make it explicit that this is intended.
And there are only 3 occurences where this actually comes into play.
The majority of device implementations name their parent-class variable
"device_class". That also makes more sense as it is more consistant.
E.g. "parent" sounds like it's the direct parent, but that is not
the crucial point here. The crucial point at this place, is that we
access the NMDeviceClass typed pointer. Rename.
NMSettings exposes a cached list of all connection. We don't need
to clone it. Note that this is not save against concurrent modification,
meaning, add/remove of connections in NMSettings will invalidate the
list.
However, it wasn't save against that previously either, because
altough we cloned the container (GSList), we didn't take an additional
reference to the elements.
This is purely a performance optimization, we don't need to clone the
list. Also, since the original list is of type "NMConnection *const*",
use that type insistently, instead of dependent API requiring GSList.
IMO, GSList is anyway not a very nice API for many use cases because
it requires an additional slice allocation for each element. It's
slower, and often less convenient to use.
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more then 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue us more under our control, we can address issus and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straight forward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in and idle handle, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we emit now more PropertiesChanged signals then before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
Change the output of nm_platform_error_to_string() to print the numeric value.
Also, accept a string buffer instead of using an alloca() allocated buffer.
There is still a macro to provide the previous functionality, but it
was ill-suited to call from inside a loop.
The bluetooth device *never* manages NAP connection. Hence, checking for
nm_bt_vtable_network_server in "nm-bluez-manager.c" is wrong.
Especially, because nm_bt_vtable_network_server is only initialized
much later, so during initial start, the bluetooth factory would wronly
claim to support it. This leads to a crash when having a NAP connection.
Also, the bridge factory requires the bluetooth plugin. It should only
claim to support NAP when the bluetooth plugin is present. That
way, we get a proper "missing plugin" error message, instead of failing
later during activation.
It seems to me, distributing the logic to various match_connection()
functions makes it more complicated, because the implementation is
spread out and interact in complicated ways. Anyway.
Fixes: 8665cdfeff
Make it possible to register different factories for the same setting
type, and add a match_connection() method to let each factory decide
if it's capable of handling a connection.
This will be used to decide whether a PPPoE connection must be handled
through the legacy Ethernet factory or through the new PPP factory.
The settings "bridge.mac-address" and "ethernet.cloned-mac-address" have an
overlapping meaning. If the former is unset, fallback to the latter.
Effectively, "bridge.mac-address" is deprecated in favor of
"ethernet.cloned-mac-address", which is more powerful as it supports
various modes like "stable". However, if a connection specifies
"bridge.mac-address", it is used when creating the bridge interface,
while "ethernet.cloned-mac-address" is used shortly after, during
activation.
Previously, master device types like bridge, bond, and team
would overwrite is_available() and check_connection_available()
and always return TRUE.
The device already expresses via nm_device_is_master() that it
is of a master kind. Refactor the code, so, instead of having these
device types overwrite is_available() and check_connection_available(),
let the parents implementation react on nm_device_is_master().
There is no change in behavior at all. Instead, the knowledge how to
treat a master device moves from the device implementation to the
parent class.
The 3 bluetooth NAP hooks are called each only once. Inline them.
It is still very easy to understand where the bluetooth related
functions are invoked: grep for nm_bt_vtable_network_server.
In deactivate(), don't bother checking whether the current active
connection is a bluetooth type. Just always call unregister_bridge().
It's fast, and does nothing in case the bridge isn't registered.
I change it because I disagree with the previous naming.
For example bt_network_server_available() would not only call
is_available(). Instead, it checks whether the connection can
activate regarding availability of the bluetooth connection
(meaning, it returns TRUE if it's not a bluetooth connection or
if the bluez manager gives green light). In the bridge case,
it doesn't check any network-server availability.
There is already a function with a meaningful name for this behavior:
check_connection_available().
Same with bt_network_server_register(). It would indicate success,
if the applied connection is not a bluetooth connection. In cases,
where it didn't actually register anything. A function called
bt_network_server_register() should only return success if it actually
registered anything.
Branch f9b1bc16e9 added bluetooth NAP
support. A NAP connection is of connection.type "bluetooth", but it
also has a "bridge" setting. Also, it is primarily handled by NMDeviceBridge
and NMBridgeDeviceFactory (with help from NMBluezManager).
However, don't let nm_connection_get_connection_type() and
nm_connnection_is_type() lie about what the connection.type is.
The type is "bluetooth" for most purposes -- at least, as far as
the client is concerned (and the public API of libnm). This restores
previous API behavior, where nm_connection_get_connection_type()
and nm_connection_is_type() would be simple accessors to the
"connection.type" property.
Only a few places care about the bridge aspect, and those places need special
treatment. For example NMDeviceBridge needs to be fully aware that it can
handle bluetooth NAP connection. That is nothing new: if you handle a
connection of any type, you must know which fields matter and what they
mean. It's not enough that nm_connection_get_connection_type() for bluetooth
NAP connectins is claiming to be a bridge.
Counter examples, where the original behavior is right:
src/nm-manager.c- g_set_error (error,
src/nm-manager.c- NM_MANAGER_ERROR,
src/nm-manager.c- NM_MANAGER_ERROR_FAILED,
src/nm-manager.c- "NetworkManager plugin for '%s' unavailable",
src/nm-manager.c: nm_connection_get_connection_type (connection));
the correct message is: "no bluetooth plugin available", not "bridge".
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: if ( ( nm_connection_is_type (connection, NM_SETTING_WIRED_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: && !nm_connection_get_setting_pppoe (connection))
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_VLAN_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_WIRELESS_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_INFINIBAND_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_BOND_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_TEAM_SETTING_NAME)
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c: || nm_connection_is_type (connection, NM_SETTING_BRIDGE_SETTING_NAME))
src/settings/plugins/ifcfg-rh/nms-ifcfg-rh-writer.c- return TRUE;
the correct behavior is for ifcfg-rh plugin to reject bluetooth NAP
connections, not proceed and store it.
Reduce the use of NM_PLATFORM_GET / nm_platform_get() to get
the platform singleton instance.
For one, this is a step towards supporting namespaces, where we need
to use different NMNetns/NMPlatform instances depending on in which
namespace the device lives.
Also, we should reduce our use of singletons. They are difficult to
coordinate on shutdown. Instead there should be a clear order of
dependencies, expressed by owning a reference to those singelton
instances. We already own a reference to the platform singelton,
so use it and avoid NM_PLATFORM_GET.
(cherry picked from commit 94d9ee129d)
This argument is only relevant when the NMActStageReturn argument
indicates NM_ACT_STAGE_RETURN_FAILURE. In all other cases it is ignored.
Rename the argument to make the meaning clearer. The argument is passed
through several layers of code, it isn't obvious that this argument only
matters for the failure case. Also, the distinct name makes it easier
to distinguish from other uses of the "reason" name.
While at it, do some drive-by cleanup:
- use g_return_*() instead of g_assert() to have a more graceful
assertion.
- functions like dhcp4_start() don't need to return a failure reason.
Most callers don't care, and the caller who does can determine the
proper reason.
- allow omitting the out-argument via NM_SET_OUT().
The problem is that the bridge's MTU cannot be larger then the slaves'.
Configuring such a setting results in an error being logged and the
activation proceeds (without applying the desired MTU).
Unclear how to fix that best.
This makes it easier to install the files with proper names.
Also, it makes the makefile rules slightly simpler.
Lastly, the documentation is now generated into docs/api, which makes it
possible to get rid of the awkward relative file names in docbook.
Keep the include paths clean and separate. We use directories to group source
files together. That makes sense (I guess), but then we should use this
grouping also when including files. Thus require to #include files with their
path relative to "src/".
Also, we build various artifacts from the "src/" tree. Instead of having
individual CFLAGS for each artifact in Makefile.am, the CFLAGS should be
unified. Previously, the CFLAGS for each artifact differ and are inconsistent
in which paths they add to the search path. Fix the inconsistency by just
don't add the paths at all.
We only needed proper glib enum types for having properties
and signal arguments. These got all converted to plain int,
so no longer generate such an enum type.
Had to rename "nm-enum-types.h" because it works badly with
"libnm/nm-enum-types.h". Maybe I could fix that differently,
but duplicate names is anyway error prone.
Note that "nm-core-enum-types.h" is already taken too, so
"nm-src-enum-types.h" it is.
An interface would make sense to allow the actual device-factory to inherit
from another type.
However, glib interfaces make code much harder to follow and less
efficient. The device factory shall be a very simple type with meta data
about supported device types and the ability to create device instances.
There is no need to make this an interface implementation, instead just
let the factories inherit from NM_TYPE_DEVICE_FACTORY directly.
- use _NM_GET_PRIVATE() and _NM_GET_PRIVATE_PTR() everywhere.
- reorder statements, to have GObject related functions (init, dispose,
constructed) at the bottom of each file and in a consistent order w.r.t.
each other.
- unify whitespaces in signal and properties declarations.
- use NM_GOBJECT_PROPERTIES_DEFINE() and _notify()
- drop unused signal slots in class structures
- drop unused header files for device factories
- All internal source files (except "examples", which are not internal)
should include "config.h" first. As also all internal source
files should include "nm-default.h", let "config.h" be included
by "nm-default.h" and include "nm-default.h" as first in every
source file.
We already wanted to include "nm-default.h" before other headers
because it might contains some fixes (like "nm-glib.h" compatibility)
that is required first.
- After including "nm-default.h", we optinally allow for including the
corresponding header file for the source file at hand. The idea
is to ensure that each header file is self contained.
- Don't include "config.h" or "nm-default.h" in any header file
(except "nm-sd-adapt.h"). Public headers anyway must not include
these headers, and internal headers are never included after
"nm-default.h", as of the first previous point.
- Include all internal headers with quotes instead of angle brackets.
In practice it doesn't matter, because in our public headers we must
include other headers with angle brackets. As we use our public
headers also to compile our interal source files, effectively the
result must be the same. Still do it for consistency.
- Except for <config.h> itself. Include it with angle brackets as suggested by
https://www.gnu.org/software/autoconf/manual/autoconf.html#Configuration-Headers