Add an alternative to g_clear_pointer(). The differences are:
- nm_clear_pointer() is more type safe as it casts neither the
pointer nor the destroy function. Commonly, the types should be compatible
and not require a cast. Casting in the macro eliminates some of the
compiler's type checking. For example, while
g_clear_pointer (&priv->hash_table, g_ptr_array_unref);
compiles, nm_clear_pointer() would prevent such an invalid use.
- also, clear the destination pointer *before* invoking the destroy
function. Destroy might emit signals (like weak-pointer callbacks
or GArray clear functions). Clear the destination first, so that
we don't leave a dangling pointer there.
- return TRUE/FALSE depending on whether there was a pointer to clear.
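The three points above can be sketched as follows. This is a minimal illustration, not the actual nm_clear_pointer() definition: the names sketch_clear_pointer, struct buffer, buffer_new and buffer_free are hypothetical, and the macro assumes a compiler with GNU C statement expressions and __typeof__ (gcc or clang).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical macro, for illustration only. No casts are applied to the
 * pointer or to the destroy function, so the compiler can diagnose a
 * destroy function whose parameter type does not match the pointer. */
#define sketch_clear_pointer(pp, destroy)                      \
    ({                                                         \
        __typeof__ (*(pp)) *_pp = (pp);                        \
        __typeof__ (*_pp)   _p;                                \
        bool                _changed = false;                  \
                                                               \
        if (_pp && (_p = *_pp)) {                              \
            *_pp = NULL;     /* clear the destination first */ \
            (destroy) (_p);  /* then destroy the instance */   \
            _changed = true;                                   \
        }                                                      \
        _changed; /* whether there was a pointer to clear */   \
    })

/* Hypothetical type with a matching destroy function. */
struct buffer {
    int *data;
};

int buffer_free_count;

struct buffer *
buffer_new (void)
{
    struct buffer *b = malloc (sizeof (*b));

    assert (b);
    b->data = NULL;
    return b;
}

void
buffer_free (struct buffer *b)
{
    buffer_free_count++;
    free (b->data);
    free (b);
}
```

With "struct buffer *b = buffer_new ();", the call sketch_clear_pointer (&b, buffer_free) sets b to NULL, invokes buffer_free() once, and evaluates to true. A second call finds nothing to clear and evaluates to false.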
I tested that redefining g_clear_pointer()/g_clear_object() with our
more typesafe nm_* variants still compiles and indicates no bugs. So
that is good. It's not really expected that turning on more static checks
would yield a large number of bugs, because generally our code is in a good
shape already. We have few such bugs, because we already turn on all warnings
and extra checks that make sense. That however is not an argument for
not introducing (and using) a more restricted implementation.
It's slightly more correct to first clear the pointer location
before invoking the destroy function. The destroy function might
emit other callbacks, and at a certain point the pointer becomes
dangling. Avoid this dangling pointer by first clearing the
memory, and then destroying the instance.
With g_clear_pointer(pptr, g_free), pptr is cast to a non-const pointer,
and hence there is no compiler warning about calling g_free() on a const
pointer. However, it still feels ugly to pass a const pointer to
g_clear_pointer(). We should either add an explicit cast, or just
make the pointer non-const.
I guess part of the problem is that C's "const char *" means that the
string itself is immutable, but also that it cannot be freed. We most
often want a different semantic: the string itself is immutable after
being initialized once, but the memory itself can and will be freed.
Such a notion of immutable C strings cannot be expressed.
For that, just remove the "const" from the declarations. Although we
don't want to modify the (content of the) string, it's still more a
mutable string.
Only in _vt_cmd_obj_copy_lnk_vlan(), add an explicit cast but keep the
type as const. The reason is that we really want NMPObject
instances to be immutable (in the sense that they are not modified while
they exist), but that doesn't mean the memory cannot be freed.
The g_clear_pointer() macro already contains a cast to GDestroyNotify. No
need to do it ourselves. In fact, with the cast, this only works with the
particular g_clear_pointer() implementation, that first assigns the
destroy function to a local variable.
See-also: https://bugzilla.gnome.org/show_bug.cgi?id=674634#c52
We already build conditionally with "#if WITH_CONCHECK".
Get rid of the conditional in the makefile and instead do
conditional compilation inside the source file "nm-connectivity.c".
The advantage is that now, if you want to know which parts are built,
you only need to grep for the WITH_CONCHECK preprocessor define
instead of also caring about the conditional in Makefile.am and
meson.build.
It doesn't change the fact of conditional compilation. But it
consistently uses one mechanism to achieve it.
nm_connectivity_state_to_string() is entirely independent of the GObject implementation
of NMConnectivity. Move it to the beginning of the source file. It will be useful next
because we will *always* build "nm-connectivity.c" source file, but disable various
parts with #if. Hence, move the part that should always be built to the top.
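As a rough sketch of that layout: one source file that is always compiled, with the always-needed part at the top and the optional parts guarded by #if. The sketch_* names and the state values are hypothetical and not taken from "nm-connectivity.c"; it is also an assumption that the build system defines WITH_CONCHECK to 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: the build system passes -DWITH_CONCHECK=0 or =1.
 * Default to 0 so that the sketch compiles standalone. */
#ifndef WITH_CONCHECK
#define WITH_CONCHECK 0
#endif

/* Hypothetical states, for illustration only. */
typedef enum {
    SKETCH_STATE_UNKNOWN,
    SKETCH_STATE_NONE,
    SKETCH_STATE_FULL,
} SketchState;

/* Always built: independent of the optional checker, so it sits at the
 * top of the file, outside of any #if. */
const char *
sketch_state_to_string (SketchState state)
{
    switch (state) {
    case SKETCH_STATE_NONE:
        return "NONE";
    case SKETCH_STATE_FULL:
        return "FULL";
    default:
        return "UNKNOWN";
    }
}

/* Built in both configurations; only the body depends on WITH_CONCHECK. */
bool
sketch_check_enabled (void)
{
#if WITH_CONCHECK
    return true;
#else
    return false;
#endif
}
```

Grepping the file for WITH_CONCHECK then shows every place where the two build variants differ.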
We use a private D-Bus socket for example for DHCP clients to report back
at unix:path=/var/run/NetworkManager/private-dhcp.
By default, gdbus will enable the authentication mechanisms EXTERNAL
and DBUS_COOKIE_SHA1. However, DBUS_COOKIE_SHA1 requires a /root/.dbus-keyrings
directory, which is not available to NetworkManager as it is started with
ProtectHome=read-only. And writing to /root would be a bad idea anyway.
This leads to a warning
NetworkManager[10962]: Error adding entry to keyring: Error creating directory “/root/.dbus-keyrings”: Read-only file system
Disable all but the EXTERNAL mechanism.
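The policy itself is a simple name check. A minimal sketch with a hypothetical helper name follows. In a GDBus server, a check like this could back a handler for the "allow-mechanism" signal of GDBusAuthObserver, but that wiring is an assumption here and is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical policy helper: accept only the EXTERNAL authentication
 * mechanism and reject everything else, including DBUS_COOKIE_SHA1. */
bool
sketch_allow_mechanism (const char *mechanism)
{
    assert (mechanism);
    return strcmp (mechanism, "EXTERNAL") == 0;
}
```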
See-also: https://dbus.freedesktop.org/doc/dbus-specification.html#auth-mechanisms
https://bugzilla.gnome.org/show_bug.cgi?id=793116
https://github.com/NetworkManager/NetworkManager/pull/79
The documentation for the ipv4.dhcp-client-id property says:
If the property is not a hex string it is considered as a
non-hardware-address client ID and the 'type' field is set to 0.
However, currently we set the client-id without the leading zero byte
in the dhclient configuration and thus dhclient sends the first string
character as type and the remainder as client-id content. Looking
through git history, the dhclient plugin has always behaved this way
even if the intent was clearly that string client-id had to be zero
padded (this is evident by looking at
nm_dhcp_utils_client_id_string_to_bytes()). The internal plugin
instead sends the correct client-id with zero type.
Change the dhclient plugin to honor the documented behavior and add
the leading zero byte when the client-id is a string.
This commit introduces a change in behavior for users that have
dhcp=dhclient and have a plain string (not hexadecimal) set in
ipv4.dhcp-client-id, as NM will send a different client-id possibly
changing the IP address returned by the server.
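To illustrate the documented encoding, here is a minimal sketch that builds the client-id bytes for a plain string. The function name is hypothetical and this is not the code of nm_dhcp_utils_client_id_string_to_bytes().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build the client-id bytes for a plain (non-hex)
 * string. The result is a type byte of 0 followed by the bytes of the
 * string, without the trailing NUL. Returns a malloc()ed buffer (NULL on
 * allocation failure) and stores its length in *out_len. */
unsigned char *
sketch_client_id_from_string (const char *str, size_t *out_len)
{
    size_t         len;
    unsigned char *bytes;

    assert (str);
    assert (out_len);

    len   = strlen (str);
    bytes = malloc (len + 1);
    if (!bytes)
        return NULL;

    bytes[0] = 0;                 /* 'type' field: not a hardware address */
    memcpy (&bytes[1], str, len); /* client-id content */
    *out_len = len + 1;
    return bytes;
}
```

For the string "abc" this yields the four bytes 00 61 62 63. Without the leading zero, the first character 'a' would be taken as the type and only "bc" as the content, which is the dhclient behavior described above.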
https://bugzilla.gnome.org/show_bug.cgi?id=793957
The test tries to do IPv4 DAD. That necessarily involves waiting
for a timeout. Since the NMArpingManager spawns arping processes,
the precise timings depend on the load of the machine and may be
large in some cases.
Usually, our test would run fast to successful completion.
However, sometimes, it can take several hundred milliseconds.
Instead of increasing the timeout to a large value (which would
needlessly extend the run time of our tests in the common cases),
try first with a reasonably short timeout. A timeout which commonly
results in success. If the test with the short timeout fails, just
try again with an excessively large timeout.
This saves about 400 msec for the common case, but extends the
wait time for the races that we saw, where not even 250 msec of wait
time was sufficient.
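The retry scheme can be sketched roughly as follows. All names are hypothetical, the stand-in test only simulates timing, and the concrete timeout values are left to the caller. This is not the code of the actual test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical test body: returns true if the check passed within the
 * given wait time. */
typedef bool (*SketchTestFunc) (unsigned wait_msec, void *user_data);

/* Try with a short wait first. Only if that fails, retry once with a long
 * wait. Returns true if either attempt succeeded and reports the number
 * of attempts via *out_attempts (may be NULL). */
bool
sketch_run_with_retry (SketchTestFunc func,
                       void          *user_data,
                       unsigned       short_msec,
                       unsigned       long_msec,
                       unsigned      *out_attempts)
{
    unsigned attempts = 1;
    bool     ok;

    assert (func);

    ok = func (short_msec, user_data);
    if (!ok) {
        attempts++;
        ok = func (long_msec, user_data);
    }
    if (out_attempts)
        *out_attempts = attempts;
    return ok;
}

/* Stand-in for a timing-dependent test: passes only if the wait time is
 * at least the simulated time the machine needs, passed via user_data. */
bool
sketch_fake_test (unsigned wait_msec, void *user_data)
{
    unsigned needed_msec = *(const unsigned *) user_data;

    return wait_msec >= needed_msec;
}
```

In the common case only the short wait is paid. The long wait is spent only after the short attempt has already failed.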
This is the approach used by systemd-networkd.
I don't understand the logic that caused systemd-networkd to make the change -
9e49656037
Instead, I am suggesting it for consistency, and because it seems to me this is the
exact correct behaviour. Because if you enable NetworkManager, and rely on it to
configure your network devices, then network mounts will not start correctly at boot
time unless you also enable NetworkManager-wait-online.service.
Enabling NetworkManager-wait-online.service does not cause unnecessary serialization
of the boot process; it is only pulled in if something else (like a network mount)
pulls in network-online.target.
I am suggesting this in response to reading this user support request [1].
[1] https://unix.stackexchange.com/questions/429604/fstab-not-automatically-mounting-smb-storage
[thaller@redhat.com: reworded commit message]
https://github.com/NetworkManager/NetworkManager/pull/76
Previously we would kill the client when the lease expired and we
restarted it 3 times at 2 minutes intervals before failing the
connection. If the client is killed after it received a NACK from the
server, it doesn't have the chance to delete the lease file and the
next time it is started it will request the same lease again.
Also, the previous restart logic is a bit convoluted.
Since clients already know how to deal with NACKs, let them continue
for a grace period after the expiry. When the grace period ends, we
fail the method and this can either fail the whole connection or keep
it active depending on the may-fail configuration.
https://bugzilla.gnome.org/show_bug.cgi?id=783391
I dislike the static hash table to cache the integer counter for
numbered paths. Let's instead cache the counter at the class instance
itself -- since the class contains the information how the export
path should be exported.
However, we cannot use a plain integer field inside the class structure,
because the class is copied between derived classes. For example,
NMDeviceEthernet and NMDeviceBridge both get a copy of the NMDeviceClass
instance. Hence, the class doesn't contain the counter directly, but
a pointer to one counter that can be shared between sibling classes.
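A simplified model of why the pointer is needed, using plain C structs instead of GObject classes. All names are hypothetical. Copying the class struct copies the pointer, so sibling classes keep incrementing the same counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified class structure. */
typedef struct {
    const char *type_name;
    uint64_t   *path_counter; /* shared, not owned by a single class */
} SketchClass;

/* Derive a class by copying the parent class struct. The pointer is
 * copied along, so the parent and all siblings refer to one counter. A
 * plain integer field would instead be duplicated per class. */
SketchClass
sketch_class_derive (const SketchClass *parent, const char *type_name)
{
    SketchClass klass;

    assert (parent);
    memcpy (&klass, parent, sizeof (klass));
    klass.type_name = type_name;
    return klass;
}

/* Return the next number for a numbered export path. */
uint64_t
sketch_class_next_path_number (const SketchClass *klass)
{
    assert (klass && klass->path_counter);
    return ++(*klass->path_counter);
}
```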
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.
Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.
This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDBusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.
This adds a net plus of more than 1300 lines of hand written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue is more under our control, we can address issues and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.
Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straightforward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.
Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.
Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in an idle handler, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means that changes
to a property in quick succession will no longer be combined, and a signal
is guaranteed to be emitted for each individual state. Quite possibly
we now emit more PropertiesChanged signals than before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.
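The difference between immediate and grouped emission can be modeled in a few lines. This is a simplified stand-in and not GObject: properties are bits in a mask and all names are hypothetical.

```c
#include <assert.h>

/* Simplified model of notification batching. */
typedef struct {
    unsigned freeze_count; /* > 0 while notifications are frozen */
    unsigned pending;      /* properties changed while frozen */
    unsigned batches_sent; /* number of change signals sent out */
    unsigned last_batch;   /* properties contained in the last signal */
} SketchObject;

void
sketch_emit (SketchObject *obj, unsigned props)
{
    obj->batches_sent++;
    obj->last_batch = props;
}

void
sketch_freeze_notify (SketchObject *obj)
{
    assert (obj);
    obj->freeze_count++;
}

void
sketch_notify (SketchObject *obj, unsigned prop_bit)
{
    assert (obj);
    if (obj->freeze_count > 0)
        obj->pending |= prop_bit;    /* queue, coalescing duplicates */
    else
        sketch_emit (obj, prop_bit); /* not frozen: send right away */
}

void
sketch_thaw_notify (SketchObject *obj)
{
    assert (obj && obj->freeze_count > 0);
    if (--obj->freeze_count == 0 && obj->pending) {
        unsigned props = obj->pending;

        obj->pending = 0;
        sketch_emit (obj, props);    /* one signal for the whole batch */
    }
}
```

Two changes made outside a freeze/thaw pair produce two signals in this model. The same two changes made between sketch_freeze_notify() and sketch_thaw_notify() produce a single signal that carries both properties.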
Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.
Numbers (for contrib/rpm --without debug on x86_64):
- the patch changes the code size of NetworkManager by
- 2809360 bytes
+ 2537528 bytes (-9.7%)
- Runtime measurements are harder because there is a large variance
during testing. In other words, the numbers are not reproducible.
Currently, the implementation performs no caching of GVariants at all,
but it would be rather simple to add it, if that turns out to be
useful.
Anyway, without strong claim, it seems that the new form tends to
perform slightly better. That would be no surprise.
$ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .; done)
- real 1m39.355s
+ real 1m37.432s
$ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
- real 0m26.843s
+ real 0m25.281s
- Regarding RSS size, just looking at the processes in similar
conditions, doesn't give a large difference. On my system they
consume about 19MB RSS. It seems that the new version has a
slightly smaller RSS size.
- 19356 RSS
+ 18660 RSS
The next commit will completely rework NMBusManager and replace
NMExportedObject by a new type NMDBusObject.
Originally, NMDBusObject was added along NMExportedObject to ease
the rework and have compilable, intermediate stages of refactoring. Now,
I think the new name is better, because NMDBusObject is very strongly related
to the bus manager and the old name NMExportedObject didn't make that
clear.
I also slightly prefer the name NMDBusObject over NMBusObject, hence
for consistency, also rename NMBusManager to NMDBusManager.
This commit only renames the file for a nicer diff in the next commit.
It does not actually update the type name in sources. That will be done
later.
The change doesn't really make a difference. I thought it would, so I
did it. But turns out (as the code correctly assumes), while the
notifications are frozen, it's OK to leave the property still in an
inconsistent state while emitting the notify signal.
Still, it feels slightly more correct this way, so keep the change.
The notify() signal is not emitted while the object properties are
blocked via g_object_freeze_notify(). That makes it unsuitable to
emit a notification for "peer" property whenever the device's "parent"
property changes. Because especially with freeze/thaw, we want to emit
both signals in the same batch, not first emit change signals for "parent",
and then in a second run the signals for "peer".
Use the existing parent_changed_notify() virtual function instead.
The generated code is really just a thin wrapper around direct
GDBusProxy calls. GDBusProxy is reasonably convenient to use directly,
drop this wrapper.
We also don't use a generated wrapper for other cases where
NetworkManager acts as a D-Bus client. There is no reason to
do it in this case.
While the nmdbus_*() functions that we were using are small wrappers,
we also created a NMDBusSecretAgent instance, and hence several other
functions and symbols are used as well. It's unnecessary.
This saves 8552 bytes for NetworkManager binary (2817944 vs. 2809392
bytes for contrib/rpm on x86_64).
On exit during NMManager's dispose(), we must first remove active connections
via active_connection_remove(), before clearing the volatile-connection-list.
Otherwise, while deleting the active connection, we schedule an idle action
to delete the volatile connection, but at that point the dispose()
already cleaned up the idle list.
==3150== 72 (24 direct, 48 indirect) bytes in 1 blocks are definitely lost in loss record 3,411 of 6,079
==3150== at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
==3150== by 0x6AB7358: g_malloc (gmem.c:94)
==3150== by 0x6ACEF35: g_slice_alloc (gslice.c:1025)
==3150== by 0x1686B1: connection_flags_changed (nm-manager.c:1823)
==3150== by 0x661F73C: g_closure_invoke (gclosure.c:804)
==3150== by 0x66324DD: signal_emit_unlocked_R (gsignal.c:3635)
==3150== by 0x663AD04: g_signal_emit_valist (gsignal.c:3391)
==3150== by 0x663B66E: g_signal_emit (gsignal.c:3447)
==3150== by 0x2EC753: connection_flags_changed (nm-settings.c:824)
==3150== by 0x661F73C: g_closure_invoke (gclosure.c:804)
==3150== by 0x66324DD: signal_emit_unlocked_R (gsignal.c:3635)
==3150== by 0x663AD04: g_signal_emit_valist (gsignal.c:3391)
==3150== by 0x663B66E: g_signal_emit (gsignal.c:3447)
==3150== by 0x6623C03: g_object_dispatch_properties_changed (gobject.c:1080)
==3150== by 0x1DFD47: dispatch_properties_changed (nm-dbus-object.c:237)
==3150== by 0x6626178: g_object_notify_by_spec_internal (gobject.c:1173)
==3150== by 0x6626178: g_object_notify_by_pspec (gobject.c:1283)
==3150== by 0x2E7377: _notify (nm-settings-connection.c:53)
==3150== by 0x2E7377: nm_settings_connection_set_flags_full (nm-settings-connection.c:2346)
==3150== by 0x2E744D: nm_settings_connection_set_flags (nm-settings-connection.c:2316)
==3150== by 0x2E7466: set_visible (nm-settings-connection.c:316)
==3150== by 0x2E7774: nm_settings_connection_delete (nm-settings-connection.c:795)
==3150== by 0x1665A8: _delete_volatile_connection_do (nm-manager.c:598)
==3150== by 0x1668F4: active_connection_remove (nm-manager.c:625)
==3150== by 0x16ABA7: dispose (nm-manager.c:6726)
==3150== by 0x6624607: g_object_unref (gobject.c:3293)
==3150== by 0x1D779B: _nm_singleton_instance_destroy (nm-core-utils.c:138)
==3150== by 0x4011332: _dl_fini (in /usr/lib64/ld-2.26.so)
==3150== by 0x815FB57: __run_exit_handlers (in /usr/lib64/libc-2.26.so)
==3150== by 0x815FBA9: exit (in /usr/lib64/libc-2.26.so)
==3150== by 0x1383C7: main (main.c:467)
NMTST_ASSERT_PLATFORM_NETNS_CURRENT() already checks that the current namespace
is correct. Remove the duplicate assertion.
Also, NMP_CACHE_OPS_UNCHANGED is numerically identical to NM_PLATFORM_SIGNAL_NONE.
Use it in the assertion.
Return the extended ack message from the WaitForNlResponse delayed
action so that the caller can print a detailed reason with the
appropriate logging level.
From v4.12 the kernel appends some attributes to netlink acks
containing a textual description of the error and other fields (see
commit [1]). Parse those attributes and print the error message.
Examples:
platform-linux: netlink: recvmsg: error message from kernel: Network is unreachable (101) "Nexthop has invalid gateway" for request 12
platform-linux: netlink: recvmsg: error message from kernel: Invalid argument (22) "Local address cannot be multicast" for request 21
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d4bc93368f5a0ddb57c8c885cdad9c9b7a10ed5
If an operation is cancelled through the GCancellable, then the idiom is
that the operation is always cancelled, even if it has finished
successfully. To ensure this is the case, add calls to
g_simple_async_result_set_check_cancellable everywhere.
Without this, e.g. gnome-control-center will crash when switching away
from the power panel quickly, as the NMClient creation finishes
asynchronously and g-c-c assumes that G_IO_ERROR_CANCELLED is returned to
ensure it doesn't access the now invalid user_data parameter.
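The idiom can be modeled as a finish step that checks for cancellation before looking at the outcome of the operation. This is a plain C stand-in with hypothetical names and does not use the GIO types.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_RESULT_OK,
    SKETCH_RESULT_FAILED,
    SKETCH_RESULT_CANCELLED,
} SketchResult;

typedef struct {
    bool        op_succeeded; /* outcome of the operation itself */
    const bool *cancelled;    /* cancellation flag to check, may be NULL */
} SketchAsyncResult;

/* Finish step: cancellation is checked first and takes precedence, even
 * if the operation completed successfully in the meantime. */
SketchResult
sketch_async_finish (const SketchAsyncResult *res)
{
    assert (res);
    if (res->cancelled && *res->cancelled)
        return SKETCH_RESULT_CANCELLED;
    return res->op_succeeded ? SKETCH_RESULT_OK : SKETCH_RESULT_FAILED;
}
```

A caller that cancelled the operation and then released its user data therefore always sees the cancelled result and can return early without touching that data.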
https://bugzilla.gnome.org/show_bug.cgi?id=794088
NAP connections are a bit special, in that they also have a [bridge]
setting, but their connection.type is "bluetooth".
The canonical way to check whether a bluetooth connection is of NAP type
is by calling _nm_connection_get_setting_bluetooth_for_nap().
So, instead of checking for bluetooth.type "pan" or "dun", check the
opposite and whether the connection is of NAP type. In practice it's the
same, but let's check for NAP consistently via get_setting_bluetooth_for_nap().
Bluetooth tethering using DUN or PANU is a common way to expose a
metered 3G or 4G connection from a phone to a laptop. We deliberately
ignore NAP connections, which is where we’re sharing internet from the
laptop to another device.
We could also set GUESS_YES for WiMAX connections, but NetworkManager
doesn’t support them any more. Add a comment about that.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=794120
The condition was obviously inverted, blocking autoconnect when
it should not, and not blocking it when it should.
[thaller@redhat.com: modified original patch and rewrite commit message]
Fixes: e2c8ef45ac
https://bugzilla.gnome.org/show_bug.cgi?id=794014
IPv4 routes that are a response to RTM_GETROUTE must have the cloned
flag while IPv6 routes don't have to. Don't check the flag for IPv6
routes and add a test case to verify that RTM_GETROUTE works for IPv6.
https://bugzilla.gnome.org/show_bug.cgi?id=793962