Commit graph

2294 commits

Author SHA1 Message Date
Thomas Haller
d8a31794c8 connectivity: rework async connectivity check requests
An asynchronous request should either be cancellable or not keep
the target object alive. Preferably both.

Otherwise, it is impossible to do a controlled shutdown when terminating
NetworkManager. Currently, when NetworkManager is about to terminate,
it just quits the mainloop and essentially leaks everything. That is a
bug. If we ever want to fix that, every asynchronous request must be
cancellable in a controlled way (or it must not prevent objects from
getting disposed, where disposing the object automatically cancels the
callback).

Rework the asynchronous request for connectivity check to

- return a handle that can be used to cancel the operation.
  Cancelling is optional. The caller may choose to ignore the handle
  because the asynchronous operation does not keep the target object
  alive. That means, it is still possible to shutdown, by everybody
  giving up their reference to the target object. In which case the
  callback will be invoked during dispose() of the target object.

- also, the callback will always be invoked exactly once, and never
  synchronously from within the asynchronous start call. But during
  cancel(), the callback is invoked synchronously from within cancel().
  Note that it's only allowed to cancel an action at most once, and
  never after the callback is invoked (also not from within the callback
  itself).

- also, NMConnectivity already supports a fake handler, in case
  connectivity check is disabled via configuration. Hence, reuse
  the same code paths also when compiling without --enable-concheck.
  That means, instead of having #if WITH_CONCHECK at various callers,
  move them into NMConnectivity. The downside is, that if you build
  without concheck, there is a small overhead compared to before. The
  upside is, we reuse the same code paths when compiling with or without
  concheck.

- also, the patch synchronizes the connectivity states. For example,
  previously `nmcli networking connectivity check` would schedule
  requests in parallel, and return the accumulated result of the individual
  requests.
  However, the global connectivity state of the manager might not have
  been the same as the answer to the explicit connectivity check,
  because while the answer for the manual check is waiting for all
  pending checks to complete, the global connectivity state could
  already change. That is just wrong. There are not multiple global
  connectivity states at the same time, there is just one. A manual
  connectivity check should have the meaning of ensuring that the global
  state is up to date, but it still should return the global
  connectivity state -- not the answers for several connectivity checks
  issued in parallel.
  This is related to commit b799de281b
  (libnm: update property in the manager after connectivity check),
  which tries to address a similar problem client side.
  Similarly, each device has a connectivity state. While there might
  be several connectivity checks per device pending, whenever a check
  completes, it can update the per-device state (and return that device
  state as result), but the immediate answer of the individual check
  might not matter. This is especially the case, when a later request
  returns earlier and obsoletes all earlier requests. In that case,
  earlier requests return with the result of the current device's
  connectivity state.

This patch cleans up the internal API and gives a better defined behavior
to the user (thus, the simple API which simplifies implementation for the
caller). However, getting this API right and properly handling
cancellation and destruction of the target object makes the implementation
more complicated. But this is not just for the sake of a nicer API. This fixes
actual issues explained above.

Also, get rid of GAsyncResult to track information about the pending request.
Instead, allocate our own handle structure, which ends up to be nicer
because it's strongly typed and has exactly the properties that are
useful to track the request. Also, it gets rid of the awkward
_finish() API by passing the relevant arguments to the callback
directly.
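The handle semantics described above can be sketched in plain C. This is only an illustration under assumptions: every name (DemoChecker, DemoHandle, demo_check_start() and so on) is hypothetical and is not the NMConnectivity API, and demo_check_deliver() merely stands in for the mainloop delivering a result at a later point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical types, not NetworkManager's. */
typedef struct DemoChecker DemoChecker;
typedef struct DemoHandle DemoHandle;

typedef void (*DemoCallback)(DemoHandle *handle, bool cancelled,
                             int state, void *user_data);

struct DemoHandle {
    DemoHandle *next;      /* pending requests of one checker */
    DemoChecker *owner;
    DemoCallback callback;
    void *user_data;
};

struct DemoChecker {
    DemoHandle *pending;
};

/* Unlink the handle, invoke the callback exactly once, free the handle. */
static void
demo_handle_complete(DemoHandle *handle, bool cancelled, int state)
{
    DemoHandle **p = &handle->owner->pending;

    while (*p != handle) {
        assert(*p != NULL); /* a handle must be completed at most once */
        p = &(*p)->next;
    }
    *p = handle->next;
    handle->callback(handle, cancelled, state, handle->user_data);
    free(handle);
}

/* Start a request. The callback is never invoked from within this call. */
static DemoHandle *
demo_check_start(DemoChecker *checker, DemoCallback callback, void *user_data)
{
    DemoHandle *handle = calloc(1, sizeof(*handle));

    if (!handle)
        abort();
    handle->owner = checker;
    handle->callback = callback;
    handle->user_data = user_data;
    handle->next = checker->pending;
    checker->pending = handle;
    return handle;
}

/* Cancel a request. The callback is invoked synchronously from here. */
static void
demo_check_cancel(DemoHandle *handle)
{
    demo_handle_complete(handle, true, -1);
}

/* Stand-in for the mainloop completing all pending requests later on. */
static void
demo_check_deliver(DemoChecker *checker, int state)
{
    while (checker->pending)
        demo_handle_complete(checker->pending, false, state);
}

/* Stand-in for dispose(): requests that are still pending get cancelled. */
static void
demo_checker_dispose(DemoChecker *checker)
{
    while (checker->pending)
        demo_handle_complete(checker->pending, true, -1);
}

/* Sample callback that only records what happened. */
typedef struct {
    int calls;
    int cancelled;
    int last_state;
} DemoStats;

static void
demo_stats_cb(DemoHandle *handle, bool cancelled, int state, void *user_data)
{
    DemoStats *stats = user_data;

    (void) handle;
    stats->calls++;
    stats->cancelled += cancelled ? 1 : 0;
    stats->last_state = state;
}
```

Because the pending handles hang off the checker and hold no reference to it, a caller may drop the returned handle and still rely on the callback firing once, at the latest during dispose.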
2018-04-10 15:11:23 +02:00
Thomas Haller
ef93f6caad platform: support creating non-persistent TUN/TAP devices
For completeness, extend the API to support non-persistent
device. That requires that nm_platform_link_tun_add()
returns the file descriptor.

While NetworkManager doesn't create such devices itself,
it recognizes the IFLA_TUN_PERSIST / IFF_PERSIST flag.
Since ip-tuntap (obviously) cannot create such devices,
we cannot add a test for how non-persistent devices look
in the platform cache. Well, we could instead add them
with ioctl directly, but instead, just extend the platform
API to allow for that.

Also, use the function from test-lldp.c to (optionally) use
nm_platform_link_tun_add() to create the tap device.
2018-04-09 20:16:31 +02:00
Thomas Haller
c9f89cafdf platform: adding onlink gateway route for manual addresses
The kernel does not allow configuring a route via a gateway if the
gateway is not directly reachable.

For non-manually added routes (e.g. from DHCP), we ignore them as
server configuration errors. For manually added routes, we try to work
around them.

Note that if the user adds a manual route that references a gateway,
maybe he should be required to also add a matching onlink route for
the gateway (or an address that results in a device-route), otherwise
the configuration could be considered invalid. That was however not
done historically, and also, it seems a rather unhelpful behavior.
NetworkManager should just make it work, and not assume anything is
wrong with the configuration. Similarly, for IPv4, the user could
configure the route as onlink, however, that still requires extra
configuration of which the user might not be aware.

This would apply for example, when a connection has method=auto,
and would obtain the routes automatically. It seems sensible to
allow the user to add a route via the gateway, if he ~knows~ that
this particular network will provide such a configuration via DHCP.

In the past however, we tried not to automatically add a device route,
but instead see whether we will get a suitable route via DHCP. If we
wouldn't get such a route, we would however fail the connection.
However, this is really very hard to get right.
We call ip_config_merge_and_apply() possibly before receiving automatic
IP configuration (commit 7070d17ced, "device: reset
@con_ip6_config on failure before RA"). In this case, we could not yet
configure the route. Instead, we also cannot fail (yet), because we should
wait whether we will receive a route that makes this configuration
feasible.
That is hard to get right. How long should we wait? If we get a DHCP lease
and still cannot add the route, should we fail the IP configuration or wait
longer for another lease? Worse, if we decide to fail the IP configuration,
it might not fail the entire activation. Instead, we would only mark the
current address family as failed. If we later get a DHCP lease, should we
retry to add the route again? -- probably yes. If we still fail, we would
need to keep the IP configuration in failed state, regardless that DHCP
succeeded. Part of the problem is, that we are bad at tracking the
failed state per IP method. So, if manual configuration fails but DHCP
succeeds, we get the state wrong. That should be fixed separately, but it
just shows how hard it is to have this route that we currently cannot
add, and wanting to wait for something that might never come, but still
fail at some point.

Instead, if we cannot add a route due to a missing onlink gateway,
just retry and add the /32 or /128 direct route ourselves.

Note that for IPv6 routes that have a "src" address which is still
TENTATIVE, we also cannot currently add the route and retry later.
However, that is fundamentally different, because:
  - the configuration here is correct, it's only that the address
    didn't yet pass IPv6 DAD and kernel is being unhelpful (rh#1457196).
  - we only have to wait a few seconds for DAD to complete or fail.
    So, it's easy to implement this sensibly.
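The retry described above (add the direct /32 route to the gateway, then add the route again) can be sketched for IPv4 against a toy routing table. This is a simulation under assumptions: DemoTable and demo_kernel_route_add() are hypothetical stand-ins that only mimic the kernel's "gateway must be directly reachable" rule; they are not the platform API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ROUTES 16

/* Hypothetical IPv4 route; addresses are in host byte order. */
typedef struct {
    uint32_t dest;
    uint8_t plen;
    uint32_t gateway; /* 0 means a direct (device) route */
} DemoRoute;

typedef struct {
    DemoRoute routes[DEMO_MAX_ROUTES];
    int n_routes;
} DemoTable;

static uint32_t
demo_mask(uint8_t plen)
{
    return plen == 0 ? 0 : ~(uint32_t) 0 << (32 - plen);
}

/* Toy rule: a gateway is reachable if a direct route covers it. */
static bool
demo_gateway_onlink(const DemoTable *table, uint32_t gateway)
{
    for (int i = 0; i < table->n_routes; i++) {
        const DemoRoute *r = &table->routes[i];

        if (r->gateway == 0 && ((gateway ^ r->dest) & demo_mask(r->plen)) == 0)
            return true;
    }
    return false;
}

/* Stand-in for the kernel: refuse routes via an unreachable gateway. */
static bool
demo_kernel_route_add(DemoTable *table, DemoRoute route)
{
    if (table->n_routes >= DEMO_MAX_ROUTES)
        return false;
    if (route.gateway != 0 && !demo_gateway_onlink(table, route.gateway))
        return false;
    table->routes[table->n_routes++] = route;
    return true;
}

/* On failure, add a /32 direct route to the gateway and retry once. */
static bool
demo_route_add_with_fallback(DemoTable *table, DemoRoute route)
{
    DemoRoute host = { route.gateway, 32, 0 };

    if (demo_kernel_route_add(table, route))
        return true;
    if (route.gateway == 0)
        return false; /* failed for some other reason */
    if (!demo_kernel_route_add(table, host))
        return false;
    return demo_kernel_route_add(table, route);
}
```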
2018-04-04 14:57:07 +02:00
Thomas Haller
78ed0a4a23 device: add IPv6 link local address via merge-and-apply
The device must not directly add addresses or routes. Instead,
it must track the addresses/routes it wants to add in the NMIP6Config.

Otherwise, during reapply, the information is lost and the next
sync will remove them.

Fixes-test: @ipv6_preserve_cached_routes
2018-04-04 14:57:07 +02:00
Thomas Haller
21b262f268 device/trivial: rename NMIwdManagerPrivate.nm_manager field to "manager"
Similar cases of such a field are named "manager". Also,
internal names shall not have an "nm" prefix, contrary
to names in a header file, which shall have such a prefix.
2018-04-04 14:02:13 +02:00
Thomas Haller
8de522fad0 core: add macro for iterating CList of devices of NMManager
I find it slightly nicer and more explicit. Also, the list elements
are strictly speaking private, we should better not explicitly
use them outside of NMManager/NMDevice. The macro hides this.
2018-04-04 14:02:13 +02:00
Beniamino Galvani
eb8257dea5 core: properly initialize stable dhcp client-id
Fixes: 62a7863979
2018-04-01 16:28:47 +02:00
Thomas Haller
e49a32936c all: use nm_utils_hash_keys_to_array() 2018-03-27 09:58:00 +02:00
Thomas Haller
d7b1a911d9 wifi: rework tracking of wifi-aps to use CList
- no longer track APs in a hash table with their exported path
  as key. The exported path is already tracked by NMDBusManager's
  lookup index, so we can reuse that for fast lookup by path. Otherwise,
  track the APs in a CList per device.

- as we now track APs in a CList, their order is well defined.
  We no longer need to sort APs and obsoletes nm_wifi_aps_get_sorted()
  and simplifies nm_wifi_aps_find_first_compatible().
2018-03-27 09:58:00 +02:00
Thomas Haller
4a705e1a0c core: track devices in manager via embedded CList
Instead of using a GSList for tracking the devices, use a CList.
I think a CList is in most cases the more suitable data structure
than GSList:

 - you can find out in O(1) whether the object is linked. That
   is nice, for example to assert in NMDevice's destructor that
   the object was unlinked, and we will use that later in
   nm_manager_get_device_by_path().
 - you can unlink the element in O(1) and you can unlink the
   element without having access to the link's head
 - Contrary to GSList, this does not require an extra slice
   allocation for the link node. It quite possibly consumes
   slightly less memory because the CList structure is embedded
   in a struct that we already allocate. Even if slice allocation
   would be perfect to only consume 2*sizeof(gpointer) for the link
   node, it would at most be as good as CList. Quite possibly,
   there is an overhead though.
 - CList possibly has better memory locality, because the link
   structure and the data are close to each other.

Something which could be seen as a disadvantage is that with CList
one device can only be tracked in one NMManager instance at a time.
But that is fine. There exists only one NMManager instance for now,
and even if we would ever introduce multiple managers, we probably
would not associate one NMDevice instance with multiple managers.

The advantages are arguably not huge, but CList is IMHO clearly the
more suited data structure. No need to stick to a suboptimal data
structure for the job. Refactor it.
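The properties listed above follow from the list node being embedded in the tracked struct. A minimal sketch of such an intrusive circular list, assuming hypothetical names (DemoList, DemoDevice) rather than the real CList or NMDevice definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intrusive list node; an unlinked node points to itself. */
typedef struct DemoList {
    struct DemoList *next;
    struct DemoList *prev;
} DemoList;

static void
demo_list_init(DemoList *list)
{
    list->next = list;
    list->prev = list;
}

/* O(1): a node is linked if it does not point to itself. */
static bool
demo_list_is_linked(const DemoList *elem)
{
    return elem->next != elem;
}

static void
demo_list_link_tail(DemoList *head, DemoList *elem)
{
    elem->prev = head->prev;
    elem->next = head;
    head->prev->next = elem;
    head->prev = elem;
}

/* O(1), and no access to the list head is needed. */
static void
demo_list_unlink(DemoList *elem)
{
    elem->prev->next = elem->next;
    elem->next->prev = elem->prev;
    demo_list_init(elem);
}

static int
demo_list_length(const DemoList *head)
{
    int n = 0;

    for (const DemoList *iter = head->next; iter != head; iter = iter->next)
        n++;
    return n;
}

/* Recover the containing struct from a pointer to its embedded node. */
#define demo_list_entry(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* The link node lives inside the object: no extra allocation per link. */
typedef struct {
    int ifindex;
    DemoList devices_lst;
} DemoDevice;
```

The single embedded node is also why one object can sit in only one such list at a time, which is the trade-off the message mentions.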
2018-03-27 09:49:43 +02:00
Thomas Haller
1010cc777f device: merge IPv4 and IPv6 versions of _cleanup_ip_pre() 2018-03-20 21:03:20 +01:00
Thomas Haller
b95f974144 device: merge IPv4 and IPv6 versions of queued_ip_config_change() 2018-03-20 21:03:20 +01:00
Thomas Haller
9c330ab320 device: merge IPv4 and IPv6 versions of nm_device_set_ip_config() (pt2) 2018-03-20 21:03:20 +01:00
Thomas Haller
3de79deb1a device: merge IPv4 and IPv6 versions of nm_device_set_ip_config() (pt1)
Almost no change, just merge the functions in one, with a top-level
if/else.
2018-03-20 21:03:20 +01:00
Thomas Haller
7f0b43108d device/trivial: rename IPv4/IPv6 related fields in NMDevicePrivate struct
These fields have the same purpose for IPv4 and IPv6. Also, they have an alias
with name _x, that can be indexed by an IS_IPv4 1/0 value.

Rename the fields so that the distinguisher 4/6/x is at the end. The point
is to make the name more similar.
2018-03-20 21:03:20 +01:00
Thomas Haller
d0de8cb6d1 device: merge IPv4 and IPv6 versions of ip_config_merge_and_apply() (pt3) 2018-03-20 21:03:20 +01:00
Thomas Haller
259cd24f48 device: merge IPv4 and IPv6 versions of ip_config_merge_and_apply() (pt2) 2018-03-20 21:03:20 +01:00
Thomas Haller
d7d8611e72 device: merge IPv4 and IPv6 versions of ip_config_merge_and_apply() (pt1)
Functions like these are conceptually very similar. Commonly,
what we want to do for one address family we also want to do
for the other.

Merge the two functions. This moves the similar parts next
to each other. This is only the first part
of the merge, which is pretty trivial without larger changes
(to keep the diff simple). More next.
2018-03-20 21:03:20 +01:00
Thomas Haller
745d60c06e device: in nm_device_capture_initial_config() only update config once
Now that there is no difference between initial capturing of
the configuration, and a later update_ip_config() call during
a signal from platform, we only need to make sure that the
IP config instances are initialized at least once.

In case we are called multiple times, there is nothing to do.
2018-03-20 21:03:20 +01:00
Thomas Haller
454195c09d device: don't capture resolv.conf for initial device config
This was called via

  ...
  - manager:recheck_assume_connection()
    - manager:get_existing_connection()
      - nm_device_capture_initial_config()
        - update_ext_ip_config(initial=TRUE)

and would parse resolv.conf, and try to fill the device IP config
with nameservers and dns-options.

But why? It would only have effect if NM was started with
nm_dns_manager_get_resolv_conf_explicit(), but is that really sensible?
And it would only take effect on devices that have a default route.
And for what is this information even used?

Let's not do it this way. If we need this information for assuming or
external sys-iface mode, then it should be explicitly loaded at the
appropriate moment.

For now, drop it and see what breaks. Then we can fix it properly. If
it even matters.
2018-03-20 21:02:52 +01:00
Thomas Haller
453f9e5140 device: drop capture_lease_config() during connection assumption
Drop capture_lease_config(). It was added by commit
0321073b3c.

Note that it was only called by

  ...
  - manager:recheck_assume_connection()
    - manager:get_existing_connection()
      - nm_device_capture_initial_config()
        - update_ext_ip_config(addr_family=AF_INET, initial=TRUE)
          - capture_lease_config()

It had only effect when the device had IPv4 permanent addresses.
But then, capture_lease_config() would go on and iterate over
all connections (sorted by last-connect timestamp). It would
consider connection candidates that are compatible with the device,
and try to read the lease information from disk.

It's really unclear what this means. For assuming (graceful take over),
do we need the lease information in the device? I don't think so,
because we will match an existing connection. The lease information
shall be read while activating (if necessary).

For external connections, we don't even have a matching connection
and we always generate a new one. It doesn't seem right to consider
leases from disk, for a different connection.

Just drop this. It's really ugly. If this causes an issue, it must be
fixed differently. We want to behave deterministically and well defined.
I don't even comprehend all the implications of what this had.

Also note that update_ext_ip_config() no longer clears
priv->dev_ip4_config.
2018-03-20 21:00:31 +01:00
Thomas Haller
19e657474d device: fix assertion in queued_ip6_config_change()
Fixes: 31ca7962f8
2018-03-20 15:24:38 +01:00
Thomas Haller
1d88f50443 device: also export NMIPxConfig on error in nm_device_set_ipx_config()
A failure to configure an address family does not mean that the connection
is going to fail. It depends, for example on ipvx.may-fail.

Always export the NMIPxConfig instance in that case.
2018-03-20 15:24:38 +01:00
Thomas Haller
5fd82a2035 device: cleanup completing wait for linklocal6
linklocal6_complete() had only one caller. The caller would check
whether the conditions for linklocal6_complete() are satisfied, and
then call it. Note that linklocal6_complete() would again assert
that these conditions hold. Don't do this. Just move the check
inside linklocal6_complete(), and rename to linklocal6_check_complete().

Also, linklocal6_complete() was called by update_ip_config(),
which was called by nm_device_capture_initial_config() and
queued_ip6_config_change().
It doesn't make sense to call linklocal6_complete() during
nm_device_capture_initial_config(). Move the call to
queued_ip6_config_change().
2018-03-20 15:24:38 +01:00
Thomas Haller
6cdf0b1820 device: fix check for existing addresses to ignore DADFAILED
Likewise, in ndisc_ra_timeout() we also want to include tentative
addresses. Looking into priv->ip6_config to determine whether
an other IP configuration is active, is anyway odd, and likely
a bug.
2018-03-20 15:24:38 +01:00
Thomas Haller
a58d4f5d3f device: use nm_ip6_config_find_first_address() in check_and_add_ipv6ll_addr() 2018-03-20 15:24:38 +01:00
Thomas Haller
945339cba5 core: add nm_ip6_config_find_first_address() function and refactor lookup of code
Instead of having one particular nm_ip6_config_get_address_first_nontentative() function,
make the lookup more extensible. Now, we pass a match-type argument, which can control which
element to search.

This patch has no change in behavior, but it already makes clear, that
nm_ip6_config_get_address_first_nontentative() was buggy, because it would
also return addresses that failed DAD.
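A lookup driven by a match-type argument could look roughly like the sketch below. The flag names, the DemoAddress struct and demo_find_first_address() are assumptions made for illustration; they are not the actual nm_ip6_config_find_first_address() signature or NetworkManager's match flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical address states and match flags. */
typedef enum {
    DEMO_ADDR_NORMAL,
    DEMO_ADDR_TENTATIVE,
    DEMO_ADDR_DADFAILED,
} DemoAddrState;

typedef enum {
    DEMO_MATCH_NORMAL    = 1 << 0, /* address passed DAD */
    DEMO_MATCH_TENTATIVE = 1 << 1, /* DAD still in progress */
    DEMO_MATCH_DADFAILED = 1 << 2, /* DAD failed */
} DemoMatchFlags;

typedef struct {
    const char *addr;
    DemoAddrState state;
} DemoAddress;

static unsigned
demo_state_to_flag(DemoAddrState state)
{
    switch (state) {
    case DEMO_ADDR_TENTATIVE:
        return DEMO_MATCH_TENTATIVE;
    case DEMO_ADDR_DADFAILED:
        return DEMO_MATCH_DADFAILED;
    case DEMO_ADDR_NORMAL:
    default:
        return DEMO_MATCH_NORMAL;
    }
}

/* Return the first address whose state is selected by @match, or NULL. */
static const DemoAddress *
demo_find_first_address(const DemoAddress *addrs, size_t n, unsigned match)
{
    for (size_t i = 0; i < n; i++) {
        if (demo_state_to_flag(addrs[i].state) & match)
            return &addrs[i];
    }
    return NULL;
}
```

With explicit flags, a "first non-tentative" lookup has to say whether it also accepts DADFAILED addresses, which is the ambiguity the message points out.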
2018-03-20 15:24:38 +01:00
Thomas Haller
fe02bb4f2a device: minor cleanups for ipv6ll_handle boolean variable
Don't do "if (var == FALSE)" for boolean variables.

Also, make booleans in NMDevicePrivate structure bitfields
and reorder the fields beside other bitfields. This allows
a tighter packing of the structure.
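The packing effect can be illustrated with two hypothetical structs (the field names are made up and are not taken from NMDevicePrivate). Exact sizes are implementation-defined, so the sketch makes no claim beyond the usual outcome on common ABIs, where grouping one-bit fields avoids the padding that follows each scattered bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Booleans scattered between pointers: each one typically drags
 * padding along so that the next pointer stays aligned. */
typedef struct {
    void *config_a;
    bool flag_a;
    void *config_b;
    bool flag_b;
    void *config_c;
    bool ipv6ll_handle;
} DemoScattered;

/* The same fields with the booleans as adjacent one-bit bitfields. */
typedef struct {
    void *config_a;
    void *config_b;
    void *config_c;
    bool flag_a : 1;
    bool flag_b : 1;
    bool ipv6ll_handle : 1;
} DemoPacked;

/* Test a boolean directly instead of comparing it against FALSE. */
static const char *
demo_ipv6ll_owner(const DemoPacked *priv)
{
    if (!priv->ipv6ll_handle)
        return "kernel";
    return "daemon";
}
```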
2018-03-20 15:24:38 +01:00
Thomas Haller
041afd2c3a device/trivial: rename internal field "nm_ipv6ll" to "ipv6ll_handle"
The "nm_" prefix should not be used for internal names.
2018-03-20 15:24:38 +01:00
Thomas Haller
0cc605e72b device: simplify return values for addrconf6_start_with_link_ready() and linklocal6_start()
addrconf6_start_with_link_ready() cannot fail. Hence, don't return a boolean
success value.

linklocal6_start() can only either POSTPONE or succeed right away. Don't return
a NMActStageReturn value, TRUE/FALSE is enough.

This simplifies the callers that don't have to check for values that never
come.
2018-03-20 15:24:38 +01:00
Thomas Haller
f813164d55 device: drop and inline trivial function linklocal6_cleanup()
It doesn't seem to make code clearer, rather, slightly more complex.
2018-03-20 15:24:38 +01:00
Thomas Haller
fa09e7eb53 device/ndisc: skip link-local addresses from NDisc 2018-03-20 15:24:38 +01:00
Thomas Haller
e17cd1d742 core: avoid clone of all-connections list for nm_utils_complete_generic()
NMSettings exposes a cached list of all connection. We don't need
to clone it. Note that this is not safe against concurrent modification,
meaning, add/remove of connections in NMSettings will invalidate the
list.

However, it wasn't safe against that previously either, because
although we cloned the container (GSList), we didn't take an additional
reference to the elements.

This is purely a performance optimization, we don't need to clone the
list. Also, since the original list is of type "NMConnection *const*",
use that type consistently, instead of API that requires GSList.

IMO, GSList is anyway not a very nice API for many use cases because
it requires an additional slice allocation for each element. It's
slower, and often less convenient to use.
2018-03-20 15:08:18 +01:00
Thomas Haller
14adbc692a ofono: fix crash during complete-connection for Ofono modem
nm_modem_complete_connection() cannot just return FALSE in case
the modem doesn't overwrite complete_connection(). It must set
the error variable.

This leads to a crash when calling AddAndActivate for Ofono type
modem. It does not affect the ModemManager implementation
NMModemBroadband, because that one implements the method.
2018-03-20 15:08:18 +01:00
Thomas Haller
af97b9a41e device: minor cleanup of nm_device_complete_connection() and add code comment
Regarding the cleanup: remove the success variable and instead error
out early.
2018-03-20 15:08:18 +01:00
Thomas Haller
39ab38a04d core/platform: add TUN/TAP netlink support and various cleanup
Kernel recently got support for exposing TUN/TAP information on netlink
[1], [2], [3]. Add support for it to the platform cache.

The advantage of using netlink is that querying sysctl bypasses the
order of events of the netlink socket. It is out of sync and racy. For
example, platform cache might still think that a tun device exists, but
a subsequent lookup at sysfs might fail because the device was deleted
in the meantime. Another point is, that we don't get change
notifications via sysctl and that it requires various extra syscalls
to read the device information. If the tun information is present on
netlink, put it into the cache. This bypasses checking sysctl while
we keep looking at sysctl for backward compatibility until we require
support from kernel.

Notes:

- we had two link types NM_LINK_TYPE_TAP and NM_LINK_TYPE_TUN. This
  deviates from the model of how kernel treats TUN/TAP devices, which
  makes it more complicated. The link type of a NMPlatformLink instance
  should match what kernel thinks about the device. Case in point,
  when parsing RTM_NETLINK messages, we very early need to determine
  the link type (_linktype_get_type()). However, to determine the
  type of a TUN/TAP at that point, we need to look into nested
  netlink attributes which in turn depend on the type (IFLA_INFO_KIND
  and IFLA_INFO_DATA), or even worse, we would need to look into
  sysctl for older kernel versions. Now, the TUN/TAP type is a property
  of the link type NM_LINK_TYPE_TUN, instead of determining two
  different link types.

- various parts of the API (both kernel's sysctl vs. netlink) and
  NMDeviceTun vs. NMSettingTun disagree whether the PI is positive
  (NM_SETTING_TUN_PI, IFLA_TUN_PI, NMPlatformLnkTun.pi) or inverted
  (NM_DEVICE_TUN_NO_PI, IFF_NO_PI). There is no consistent way,
  but prefer the positive form for internal API at NMPlatformLnkTun.pi.

- previously NMDeviceTun.mode could not change after initializing
  the object. Allow for that to happen, because forcing some properties
  that are reported by kernel to not change is wrong, in case they
  might change. Of course, in practice kernel doesn't allow the device
  to ever change its type, but the type property of the NMDeviceTun
  should not make that assumption, because, if it actually changes, what
  would it mean?

- note that as of now, new netlink API is not yet merged to mainline Linus
  tree. Shortcut _parse_lnk_tun() to not accidentally use unstable API
  for now.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1277457
[2] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=1ec010e705934c8acbe7dbf31afc81e60e3d828b
[3] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=118eda77d6602616bc523a17ee45171e879d1818

https://bugzilla.redhat.com/show_bug.cgi?id=1547213
https://github.com/NetworkManager/NetworkManager/pull/77
2018-03-20 11:59:52 +01:00
Thomas Haller
f0442a47ed all: avoid calling g_free on a const pointer with g_clear_pointer()
With g_clear_pointer(pptr, g_free), pptr is cast to a non-const pointer,
and hence there is no compiler warning about calling g_free() on a const
pointer. However, it still feels ugly to pass a const pointer to
g_clear_pointer(). We should either add an explicit cast, or just
make the pointer non-const.

I guess part of the problem is that C's "const char *" means that the
string itself is immutable, but also that it cannot be freed. We most
often want a different semantic: the string itself is immutable after
being initialized once, but the memory itself can and will be freed.
Such a notion of immutable C strings cannot be expressed.

For that, just remove the "const" from the declarations. Although we
don't want to modify the (content of the) string, it's still more a
mutable string.

Only in _vt_cmd_obj_copy_lnk_vlan(), add an explicit cast but keep the
type as const. The reason is, that we really want that NMPObject
instances are immutable (in the sense that they don't modify while
existing), but that doesn't mean the memory cannot be freed.
2018-03-19 15:45:46 +01:00
Thomas Haller
9545a8bc34 all: don't explicitly cast destroy function for g_clear_pointer()
The g_clear_pointer() macro already contains a cast to GDestroyNotify. No
need to do it ourself. In fact, with the cast, this only works with the
particular g_clear_pointer() implementation, that first assigns the
destroy function to a local variable.

See-also: https://bugzilla.gnome.org/show_bug.cgi?id=674634#c52
2018-03-19 15:27:08 +01:00
Thomas Haller
b680d118ee connectivity: fix integer type for signal-id NMDevicePrivate.concheck_periodic_id 2018-03-19 14:39:09 +01:00
Thomas Haller
059d34a27f arping/tests: better handle wait timeout for test IPv4 DAD
The test tries to do IPv4 DAD. That necessarily involves waiting
for a timeout. Since the NMArpingManager spawns arping processes,
the precise timings depend on the load of the machine and may be
large in some cases.

Usually, our test would run fast to successful completion.
However, sometimes, it can take several hundred milliseconds.

Instead of increasing the timeout to a large value (which would
needlessly extend the run time of our tests in the common cases),
try first with a reasonably short timeout. A timeout which commonly
results in success. If the test with the short timeout fails, just
try again with an excessively large timeout.

This saves about 400 msec for the common case, but extends the
races that we saw where not even 250 msec of wait time were
sufficient.
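The two-step strategy (a short timeout first, one retry with a generous timeout) can be sketched as follows. DemoOp simulates an operation that needs a fixed amount of time; it and all function names are hypothetical and unrelated to the actual arping test code.

```c
#include <assert.h>
#include <stdbool.h>

/* An attempt succeeds or fails within the given timeout. */
typedef bool (*DemoAttemptFunc)(unsigned timeout_msec, void *user_data);

/* Try with the short timeout; only on failure pay for the long one. */
static bool
demo_run_with_retry(DemoAttemptFunc attempt, void *user_data,
                    unsigned short_msec, unsigned long_msec)
{
    if (attempt(short_msec, user_data))
        return true;
    return attempt(long_msec, user_data);
}

/* Simulated operation that needs @needed_msec to complete. */
typedef struct {
    unsigned needed_msec;
    unsigned attempts;
    unsigned waited_msec; /* total simulated wait time */
} DemoOp;

static bool
demo_op_attempt(unsigned timeout_msec, void *user_data)
{
    DemoOp *op = user_data;

    op->attempts++;
    if (op->needed_msec <= timeout_msec) {
        op->waited_msec += op->needed_msec;
        return true;
    }
    op->waited_msec += timeout_msec; /* ran into the timeout */
    return false;
}
```

In the common case only the short attempt runs; the long timeout is paid only on a loaded machine.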
2018-03-15 11:24:08 +01:00
Beniamino Galvani
17009ed91d dhcp: handle expiry by letting the client continue for some time
Previously we would kill the client when the lease expired and we
restarted it 3 times at 2 minutes intervals before failing the
connection. If the client is killed after it received a NACK from the
server, it doesn't have the chance to delete the lease file and the
next time it is started it will request the same lease again.

Also, the previous restart logic is a bit convoluted.

Since clients already know how to deal with NACKs, let them continue
for a grace period after the expiry. When the grace period ends, we
fail the method and this can either fail the whole connection or keep
it active depending on the may-fail configuration.

https://bugzilla.gnome.org/show_bug.cgi?id=783391
2018-03-13 15:11:08 +01:00
Thomas Haller
57ab9fd60f core/dbus: rework creating numbered D-Bus export path by putting counter into class
I dislike the static hash table to cache the integer counter for
numbered paths. Let's instead cache the counter at the class instance
itself -- since the class contains the information how the export
path should be exported.

However, we cannot use a plain integer field inside the class structure,
because the class is copied between derived classes. For example,
NMDeviceEthernet and NMDeviceBridge both get a copy of the NMDeviceClass
instance. Hence, the class doesn't contain the counter directly, but
a pointer to one counter that can be shared between sibling classes.
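Why a pointer rather than a plain integer: copying the class struct for a derived class would also copy an embedded counter, and sibling classes would then hand out the same numbers. A minimal sketch with hypothetical names (DemoClass, demo_class_derive(), the "/demo/Devices" prefix), not the NMDBusObjectClass layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical class struct. It is copied for every derived class, so
 * the counter is referenced by pointer and thereby stays shared. */
typedef struct {
    const char *export_path_prefix;
    unsigned long *export_counter;
} DemoClass;

static unsigned long demo_device_counter;

static const DemoClass demo_device_class = {
    "/demo/Devices",
    &demo_device_counter,
};

/* Deriving copies the parent's class struct, including the pointer. */
static DemoClass
demo_class_derive(const DemoClass *parent)
{
    return *parent;
}

/* Write the next numbered export path into @buf; returns its length. */
static int
demo_export_path(const DemoClass *klass, char *buf, size_t len)
{
    return snprintf(buf, len, "%s/%lu",
                    klass->export_path_prefix,
                    ++(*klass->export_counter));
}
```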
2018-03-13 11:29:18 +01:00
Thomas Haller
297d4985ab core/dbus: rework D-Bus implementation to use lower layer GDBusConnection API
Previously, we used the generated GDBusInterfaceSkeleton types and glued
them via the NMExportedObject base class to our NM types. We also used
GDBusObjectManagerServer.

Don't do that anymore. The resulting code was more complicated despite (or
because?) using generated classes. It was hard to understand, complex, had
ordering-issues, and had a runtime and memory overhead.

This patch refactors this entirely and uses the lower layer API GDBusConnection
directly. It replaces the generated code, GDBusInterfaceSkeleton, and
GDBusObjectManagerServer. All this is now done by NMDbusObject and NMDBusManager
and static descriptor instances of type GDBusInterfaceInfo.

This adds a net plus of more than 1300 lines of hand-written code. I claim
that this implementation is easier to understand. Note that previously we
also required extensive and complex glue code to bind our objects to the
generated skeleton objects. Instead, we now glue our objects directly to
GDBusConnection. The result is more immediate and gets rid of layers of
code in between.
Now that the D-Bus glue is more under our control, we can address issues and
bottlenecks better, instead of adding code to bend the generated skeletons
to our needs.

Note that the current implementation now only supports one D-Bus connection.
That was effectively the case already, although there were places (and still are)
where the code pretends it could also support connections from a private socket.
We dropped private socket support mainly because it was unused, untested and
buggy, but also because GDBusObjectManagerServer could not export the same
objects on multiple connections. Now, it would be rather straightforward to
fix that and re-introduce ObjectManager on each private connection. But this
commit doesn't do that yet, and the new code intentionally supports only one
D-Bus connection.
Also, the D-Bus startup was simplified. There is no retry, either nm_dbus_manager_start()
succeeds, or it detects the initrd case. In the initrd case, bus manager never tries to
connect to D-Bus. Since the initrd scenario is not yet used/tested, this is good enough
for the moment. It could be easily extended later, for example with polling whether the
system bus appears (like was done previously). Also, restart of D-Bus daemon isn't
supported either -- just like before.

Note how NMDBusManager now implements the ObjectManager D-Bus interface
directly.

Also, this fixes race issues in the server, by no longer delaying
PropertiesChanged signals. NMExportedObject would collect changed
properties and send the signal out in idle_emit_properties_changed()
on idle. This messes up the ordering of change events w.r.t. other
signals and events on the bus. Note that not only NMExportedObject
messed up the ordering. Also the generated code would hook into
notify() and process change events in an idle handler, exhibiting the
same ordering issue too.
No longer do that. PropertiesChanged signals will be sent right away
by hooking into dispatch_properties_changed(). This means, changing
a property in quick succession will no longer be combined and is
guaranteed to emit signals for each individual state. Quite possibly
we now emit more PropertiesChanged signals than before.
However, we are now able to group a set of changes by using standard
g_object_freeze_notify()/g_object_thaw_notify(). We probably should
make more use of that.

Also, now that our signals are all handled in the right order, we
might find places where we still emit them in the wrong order. But that
is then due to the order in which our GObjects emit signals, not due
to an ill behavior of the D-Bus glue. Possibly we need to identify
such ordering issues and fix them.

Numbers (for contrib/rpm --without debug on x86_64):

- the patch changes the code size of NetworkManager by
  - 2809360 bytes
  + 2537528 bytes (-9.7%)

- Runtime measurements are harder because there is a large variance
  during testing. In other words, the numbers are not reproducible.
  Currently, the implementation performs no caching of GVariants at all,
  but it would be rather simple to add it, if that turns out to be
  useful.
  Anyway, without making a strong claim, it seems that the new form tends to
  perform slightly better. That would be no surprise.

  $ time (for i in {1..1000}; do nmcli >/dev/null || break; echo -n .;  done)
  - real    1m39.355s
  + real    1m37.432s

  $ time (for i in {1..2000}; do busctl call org.freedesktop.NetworkManager /org/freedesktop org.freedesktop.DBus.ObjectManager GetManagedObjects > /dev/null || break; echo -n .; done)
  - real    0m26.843s
  + real    0m25.281s

- Regarding RSS size, just looking at the processes under similar
  conditions doesn't show a large difference. On my system they
  consume about 19MB RSS. It seems that the new version has a
  slightly smaller RSS size.
  - 19356 RSS
  + 18660 RSS
2018-03-12 18:37:08 +01:00
Thomas Haller
8b75f10ebe device: set properties before emitting the change notification
The change doesn't really make a difference. I thought it would, so I
did it. But it turns out (as the code correctly assumes) that, while the
notifications are frozen, it's OK to leave the property in an
inconsistent state while emitting the notify signal.

Still, it feels slightly more correct this way, so keep the change.
2018-03-12 18:03:07 +01:00
Thomas Haller
34493c5134 device/veth: don't use notify() signal to bind changes for "peer" property
The notify() signal is not emitted while the object properties are
blocked via g_object_freeze_notify(). That makes it unsuitable to
emit a notification for the "peer" property whenever the device's "parent"
property changes. Because especially with freeze/thaw, we want to emit
both signals in the same batch, not first emit change signals for "parent",
and then in a second run the signals for "peer".

Use the existing parent_changed_notify() virtual function instead.
2018-03-12 18:03:07 +01:00
Thomas Haller
d76cfa3814 device: rework checking for bluetooth NAP connection in nm_device_update_metered()
NAP connections are a bit special, in that they also have a [bridge]
setting, but their connection.type is "bluetooth".

The canonical way to check whether a bluetooth connection is of NAP type
is by calling _nm_connection_get_setting_bluetooth_for_nap().

So, instead of checking for bluetooth.type "pan" or "dun", check the
opposite and whether the connection is of NAP type. In practice it's the
same, but let's check for NAP consistently via get_setting_bluetooth_for_nap().
2018-03-08 14:49:58 +01:00
Philip Withnall
599da6fd02 devices: Set NM_METERED_GUESS_YES for Bluetooth PANU/DUN connections
Bluetooth tethering using DUN or PANU is a common way to expose a
metered 3G or 4G connection from a phone to a laptop. We deliberately
ignore NAP connections, which is where we’re sharing internet from the
laptop to another device.

We could also set GUESS_YES for WiMAX connections, but NetworkManager
doesn’t support them any more. Add a comment about that.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

https://bugzilla.gnome.org/show_bug.cgi?id=794120
2018-03-08 13:35:21 +01:00
Andrew Zaborowski
29e9d206aa iwd: don't call nm_wifi_ap_set_ssid for empty SSID
If SSID is an empty string there's no need to call nm_wifi_ap_set_ssid
as it won't do anything.  It also has an assert checking that NULL is
passed for an empty SSID and we were passing a non-NULL pointer.
2018-03-05 00:46:00 +01:00
Andrew Zaborowski
8435aa8b31 iwd: fix device-added signal handler signature
This bug was not causing a crash for me because of the !IS_NM_DEVICE_IWD
check and because my glib version probably had the assertion within
NM_IWD_MANAGER_GET_PRIVATE disabled.

While there, change the g_signal_connect line to use the macro for the
signal name.
2018-03-05 00:35:01 +01:00
Andrew Zaborowski
6571b576c4 iwd: set Device.Powered during set_enable
Make sure .set_enabled uses the Device.Powered property to basically
bring the netdev UP and DOWN as I understand is expected by the
nm_device logic.

Device.Powered should generally reflect the UP state immediately, but
just to avoid possible race conditions, .is_available() will now return
a value that is an AND of the local "enabled" state and IWD's Powered
property.
2018-03-05 00:34:43 +01:00