For NMPlatform instances we had an error reporting mechanism
which stores the last error reason in a private field. Later we
would check it via nm_platform_get_error().
Remove this. It was not used much, and it is not a great way
to report errors.
One problem is that at the point where the error happens, you don't
know whether anybody cares about an error code. So, you add code to set
the error reason because somebody *might* need it (but in realitiy, almost
no caller cares).
Also, we tested this functionality which is hardly used in non-testing code.
While this was a burden to maintain in the tests, it was likely still buggy
because there were no real use-cases, beside the tests.
Then, sometimes platform functions call each other which might overwrite the
error reason. So, every function must be cautious to preserve/set
the error reason according to it's own meaning. This can involve storing
the error code, calling another function, and restoring it afterwards.
This is harder to get right compared to a "return-error-code" pattern, where
every function manages its error code independently.
It is better to return the error reason whenever due. For that we already
have our common glib patterns
(1) gboolean fcn (...);
(2) gboolean fcn (..., GError **error);
In few cases, we need more details then a #gboolean, but don't want
to bother constructing a #GError. Then we should do instead:
(3) NMPlatformError fcn (...);
Later remove nm_platform_get_error() and signal errors via return
error codes.
Also, fix nm_platform_infiniband_partition_add() and
nm_platform_vlan_add() to check the type of the existing link
and fail with WRONG_TYPE otherwise.
- rename "NONE" to "SUCCESS", what it really is.
- change the to-string result not to contain spaces
and being closer the name of the enum value.
- add new error reasons "UNSPECIFIED" and "BUG".
- remove the code comments around the enum definition.
They add no further description about why this error
happens and only paraphrase the name of the enum.
- reserve negative integers for 'errno'. This is neat
because if we get a system error we can pass on the
underlying errno as cause.
The @udi field is not a static string, so any user of a NMPlatformLink
instance must make sure not to use the field beyond the lifetime of the
NMPlatformLink instance.
As we pass on the platform link instance during platform changed events,
this is hard to ensure for the subscriber of the signal -- because a
call back into platform could invalidate/modify the object.
Just not expose this field as part of the link instance. The few callers
who actually needed it should instead call nm_platform_get_uid(). With
that, the lifetime of the returned 'const char *' pointer is clearly
defined.
Use the event socket to request object via NLM_F_DUMP.
No longer use 'priv->nlh' socket to fetch objects.
Instead fetch them via the priv->nlh_event socket that also
provides asynchronous events when objects change.
That way, the events are in sync with our explicit requests
and we can directly use the events. Previously, the events were
only used to indicate that a refetch must happen, so that every
event triggered a complete dump of all addresses/routes.
We still use 'priv->nlh' to make synchronous requests such as
adding/changing/deleting objects. That means, after we send a
request, we must make sure that the result manifested itself
at 'nlh_event' socket and the platform cache.
That's why we sometimes still must force a dump to sync changes.
That could be improved by using only one netlink socket so that
we would wait for the ACK of our request.
While not yet perfect, this already significantly reduces the number of
fetches. Additionally, before, whenever requesting a dump of addresses
or routes (which we did much more often, search for "get_kernel_object for type"
log lines), we always dumped IPv4 and IPv6 together. Now only request
the addr-family in question.
https://bugzilla.gnome.org/show_bug.cgi?id=747985https://bugzilla.redhat.com/show_bug.cgi?id=1211133
Add a construct-only property NM_PLATFORM_REGISTER_SINGLETON to NMPlatform.
When set to TRUE, the constructor will self-register to nm_platform_setup().
The reason for this is that the _LOG() macro in NMLinuxPlatform logs the
self pointer if the instance is not the singleton instance.
During construction, we already have many log lines due to initialization
of the instance. These lines all end up qualified with the self pointer.
By earlier self-registering, printing the pointer value is omitted.
Yes, this patch is really just to prettify logging.
Switch platform caching implementation. Instead of caching libnl
objects, cache our own types.
Don't remove yet the now obsolete functions.
Advantage:
* Performance
- as we now cache our native NMPlatformObject instances, we no longer
have to convert libnl objects every time we access the platform
cache.
- for most cases, access is now O(1) because we can lookup the object
in a hash table. Note that ip4_address_get_all() still has to
create a copy of the result (O(n)), but as the caller is about to
use those elements, he cannot do better then O(n) anyway.
* We cache our own native types and have full control over them. We
cannot extend the libnl objects, which has many short-commings:
- _rtnl_addr_hack_lifetimes_rel_to_abs() to convert the timestamps
to absolute values (and back).
- hack_empty_master_iff_lower_up() would modify the internal flag,
but it looses the original value. That means, we can only hack
the state before putting a link into the cache, but we cannot revert
that change, when a slave in the cache changes state.
That was previously solved by always refetching the master when
a slave changed. Now we can re-evaluate the connected state
(DELAYED_ACTION_TYPE_MASTER_CONNECTED).
- we implement functions like equality, to-string as most suitable
for us. Before we needed hacks like nm_nl_object_diff(),
nm_nl_cache_search(), route_search_cache().
- we can extend our objects with exactly those properties we care,
and possibly additional properties that are not representable in
the libnl objects.
- we no longer cache RTM_F_CLONED routes and they get rejected early
on as we receive them.
- In the future, maybe it'd be interesting the make platform objects
immutable (and ref-counted) and expose them directly.
* Previous implementation did not order the refresh of objects but
called check_cache_items(). Now, those actions are delayed and
combined in an attempt to reduce the overall number of reloads.
Realize how expensive a check_cache_items() for addresses and routes
was: it would iterate all addresses/routes and call refresh_object().
The latter obtains a full dump of *all* objects again, and ignores
all but the needle.
Note that we probably still schedule some delayed actions that
are not needed.
Later we can optimize that further (related bug bgo #747985).
While some of these points could also have been implemented with
caching of libnl objects, that would have become hard to maintain.
https://bugzilla.gnome.org/show_bug.cgi?id=747981
NMPObject is a simple "object" implemenation around NMPlatformObject.
They are ref-counted and have a class-pointer. Several basic functions
like equality, hash, to-string are implemented.
NMPCache is can be used to store the NMPObject. Objects are indexed
via their primary id, but there is also multi-lookup via NMCacheId
and NMMultiIndex.
Part of the implementation is inside "nm-linux-platform.c",
because it depends on utility functions from there.
Cache the scope as part of the NMPlatformIP4Route and
no longer read it from libnl object when needed. Later
there will be no more libnl objects around, and we need
to scope when deleting an IPv4 route.
warning: function declaration isn’t a prototype [-Wstrict-prototypes]
In C function() and function(void) are two different prototypes (as opposed to
C++).
function() accepts an arbitrary number of arguments
function(void) accepts zero arguments