The parent of a link (IFLA_LINK) can be in another network namespace and
thus invisible to NM.
This requires the netlink attribute IFLA_LINK_NETNSID which is supported
by recent versions of kernel and libnl.
In this case, set the parent field to NM_PLATFORM_LINK_OTHER_NETNS
and properly handle this special case.
(cherry picked from commit 790a0713d2)
It can happen on a regular basis when many events get raised.
It is probalby not avoidable and most likely not an issue, so
downgrade the warning to info level.
(cherry picked from commit 7f9cb13057)
The logging macros _LOGD(), etc. are specific to each
file as they format the message according to their context.
Still, they were cumbersome to define and their implementation
was repeated over and over (slightly different at times).
Move the declaration of these macros to "nm-logging.h".
The source file now only needs to define _NMLOG(), and either
_NMLOG_ENABLED() or _NMLOG_DOMAIN.
This reduces code duplication and encourages a common implementation
and usage of these macros.
(cherry picked from commit ad7cdfc766)
For delete-events, we only need a shallow object with the key fields
set. That sufficies to lookup in the cache and find the object to
delete.
One other issue is that _nmp_vt_cmd_plobj_init_from_nl_link() and
link_extract_type() might call to ethtool for the already deleted
instance. Just avoid that.
https://bugzilla.redhat.com/show_bug.cgi?id=1247156
(cherry picked from commit a4c7176fe9)
Coverity detected that it was always-true:
src/platform/nm-linux-platform.c:4035: dead_error_line: Execution cannot reach the expression "preferred != 0U" inside this statement: "if (lifetime != 0U || lifet...".
(cherry picked from commit da612acc6a)
When adding an IPv4 address, kernel will also add a device-route.
We don't want that route because it has the wrong metric. Instead,
we add our own route (with a different metric) and remove the
kernel-added one.
This could be avoided if kernel would support an IPv4 address flag
IFA_F_NOPREFIXROUTE like it does for IPv6 (see related bug rh#1221311).
One important thing is, that we want don't want to manage the
device-route on assumed devices. Note that this is correct behavior
if "assumed" means "do-not-touch".
If "assumed" means "seamlessly-takeover", then this is wrong.
Imagine we get a new DHCP address. In this case, we would not manage
the device-route on the assumed device. This cannot be fixed without
splitting unmanaged/assumed with related bug bgo 746440.
This is no regression as we would also not manage device-routes
for assumed devices previously.
We also don't want to remove the device-route if the user added
it externally. Note that here we behave wrongly too, because we
don't record externally added kernel routes in update_ip_config().
This still needs fixing.
Let IPv4 device-routes also be managed by NMRouteManager. NMRouteManager
has a list of all routes and can properly add, remove, and restore
the device route as needed.
One problem is, that the device-route does not get added immediately
with the address. It only appears some time later. This is solved
by NMRouteManager watching platform and if a matchin device-route shows up
within a short time after configuring addresses, remove it.
If the route appears after the short timeout, assume they were added for
other reasons (e.g. by the user) and don't remove them.
https://bugzilla.gnome.org/show_bug.cgi?id=751264https://bugzilla.redhat.com/show_bug.cgi?id=1211287
(cherry picked from commit 5f54a323d1)
Also expose routes with "proto kernel". But add a flag
to nm_platform_ipx_route_get_all() to hide them by default.
(cherry picked from commit 42664e8752)
build_rtnl_addr() has two parameters "lifetime" and "preferred". Both
count from *now*.
Fix nmp_object_to_nl() to properly set these timestamps. This bug had
not real consequences, because the only place where we use
nmp_object_to_nl() the arguments are 0.
(cherry picked from commit d0ed4de104)
Change nm_platform_link_get() to return the cached NMPlatformLink
instance. Now what all our implementations (fake and linux) have such a
cache internal object, let's just expose it directly.
Note that the lifetime of the exposed link object is possibly quite
short. A caller must copy the returned value if he intends to preserve
it for later.
Also add nm_platform_link_get_by_ifname() and modify nm_platform_link_get_by_address()
to return the instance.
Certain functions, such as nm_platform_link_get_name(),
nm_platform_link_get_ifindex(), etc. are solely implemented based
on looking at the returned NMPlatformLink object. No longer implement
them as virtual functions but instead implement them in the base class
(nm-platform.c).
This removes code and eliminates the redundancy of the exposed
NMPlatformLink instance and the nm_platform_link_get_*() accessors.
Thereby also fix a bug in NMFakePlatform that tracked the link address
in a separate "address" field, instead of using "link.addr". That was
a case where the redundancy actually led to a bug in fake platform.
Also remove some stub implementations in NMFakePlatform that just
bail out. Instead allow for a missing virtual functions and perform
the "default" action in the accessor.
An example for that is nm_platform_link_get_permanent_address().
(cherry picked from commit e8e455817b)
We must check if a conflicting route/address is configured on
*any* interface, not only on the target ifindex.
This at least restores the previous behavior. Note that this whole
check_reinstall_device_route() still has problems and it might
be better to move it all to NMRouteManager.
Fixes: 470bcefa5f
(cherry picked from commit 1bfade833a)
After doing all the refactoring, rename the ObjectType enum to NMPObjectType.
I didn't do this earlier to avoid problems with rebasing. But since I will
backport the platform changes to nm-1-0, this is the right time to get
the name right.
(cherry picked from commit 518cf76de7)
After refactoring platform, nm_platform_ipx_route_get_all() and
nm_platform_ipx_address_get_all() was broken for calling with a
non-posititive ifindex (which has the meaning: ignore ifindex).
While fixing that, also refactor the NMPCacheIdType so that it matches
better the supported id-types.
Fixes: 470bcefa5f
(cherry picked from commit 8f9dac01ac)
For NMPlatform instances we had an error reporting mechanism
which stores the last error reason in a private field. Later we
would check it via nm_platform_get_error().
Remove this. It was not used much, and it is not a great way
to report errors.
One problem is that at the point where the error happens, you don't
know whether anybody cares about an error code. So, you add code to set
the error reason because somebody *might* need it (but in realitiy, almost
no caller cares).
Also, we tested this functionality which is hardly used in non-testing code.
While this was a burden to maintain in the tests, it was likely still buggy
because there were no real use-cases, beside the tests.
Then, sometimes platform functions call each other which might overwrite the
error reason. So, every function must be cautious to preserve/set
the error reason according to it's own meaning. This can involve storing
the error code, calling another function, and restoring it afterwards.
This is harder to get right compared to a "return-error-code" pattern, where
every function manages its error code independently.
It is better to return the error reason whenever due. For that we already
have our common glib patterns
(1) gboolean fcn (...);
(2) gboolean fcn (..., GError **error);
In few cases, we need more details then a #gboolean, but don't want
to bother constructing a #GError. Then we should do instead:
(3) NMPlatformError fcn (...);
(cherry picked from commit 68a4ffb4e2)
Don't expose @obj directly but clone the public fields. A signal
handler might call back into NMPlatform which could invalidate (or modify)
@obj.
(cherry picked from commit e7ee2fc139)
The @udi field is not a static string, so any user of a NMPlatformLink
instance must make sure not to use the field beyond the lifetime of the
NMPlatformLink instance.
As we pass on the platform link instance during platform changed events,
this is hard to ensure for the subscriber of the signal -- because a
call back into platform could invalidate/modify the object.
Just not expose this field as part of the link instance. The few callers
who actually needed it should instead call nm_platform_get_uid(). With
that, the lifetime of the returned 'const char *' pointer is clearly
defined.
(cherry picked from commit 1b2b988ea9)
Use the event socket to request object via NLM_F_DUMP.
No longer use 'priv->nlh' socket to fetch objects.
Instead fetch them via the priv->nlh_event socket that also
provides asynchronous events when objects change.
That way, the events are in sync with our explicit requests
and we can directly use the events. Previously, the events were
only used to indicate that a refetch must happen, so that every
event triggered a complete dump of all addresses/routes.
We still use 'priv->nlh' to make synchronous requests such as
adding/changing/deleting objects. That means, after we send a
request, we must make sure that the result manifested itself
at 'nlh_event' socket and the platform cache.
That's why we sometimes still must force a dump to sync changes.
That could be improved by using only one netlink socket so that
we would wait for the ACK of our request.
While not yet perfect, this already significantly reduces the number of
fetches. Additionally, before, whenever requesting a dump of addresses
or routes (which we did much more often, search for "get_kernel_object for type"
log lines), we always dumped IPv4 and IPv6 together. Now only request
the addr-family in question.
https://bugzilla.gnome.org/show_bug.cgi?id=747985https://bugzilla.redhat.com/show_bug.cgi?id=1211133
(cherry picked from commit 051cf8bbde)
Add a construct-only property NM_PLATFORM_REGISTER_SINGLETON to NMPlatform.
When set to TRUE, the constructor will self-register to nm_platform_setup().
The reason for this is that the _LOG() macro in NMLinuxPlatform logs the
self pointer if the instance is not the singleton instance.
During construction, we already have many log lines due to initialization
of the instance. These lines all end up qualified with the self pointer.
By earlier self-registering, printing the pointer value is omitted.
Yes, this patch is really just to prettify logging.
(cherry picked from commit 56b07b1a3f)
Switch platform caching implementation. Instead of caching libnl
objects, cache our own types.
Don't remove yet the now obsolete functions.
Advantage:
* Performance
- as we now cache our native NMPlatformObject instances, we no longer
have to convert libnl objects every time we access the platform
cache.
- for most cases, access is now O(1) because we can lookup the object
in a hash table. Note that ip4_address_get_all() still has to
create a copy of the result (O(n)), but as the caller is about to
use those elements, he cannot do better then O(n) anyway.
* We cache our own native types and have full control over them. We
cannot extend the libnl objects, which has many short-commings:
- _rtnl_addr_hack_lifetimes_rel_to_abs() to convert the timestamps
to absolute values (and back).
- hack_empty_master_iff_lower_up() would modify the internal flag,
but it looses the original value. That means, we can only hack
the state before putting a link into the cache, but we cannot revert
that change, when a slave in the cache changes state.
That was previously solved by always refetching the master when
a slave changed. Now we can re-evaluate the connected state
(DELAYED_ACTION_TYPE_MASTER_CONNECTED).
- we implement functions like equality, to-string as most suitable
for us. Before we needed hacks like nm_nl_object_diff(),
nm_nl_cache_search(), route_search_cache().
- we can extend our objects with exactly those properties we care,
and possibly additional properties that are not representable in
the libnl objects.
- we no longer cache RTM_F_CLONED routes and they get rejected early
on as we receive them.
- In the future, maybe it'd be interesting the make platform objects
immutable (and ref-counted) and expose them directly.
* Previous implementation did not order the refresh of objects but
called check_cache_items(). Now, those actions are delayed and
combined in an attempt to reduce the overall number of reloads.
Realize how expensive a check_cache_items() for addresses and routes
was: it would iterate all addresses/routes and call refresh_object().
The latter obtains a full dump of *all* objects again, and ignores
all but the needle.
Note that we probably still schedule some delayed actions that
are not needed.
Later we can optimize that further (related bug bgo #747985).
While some of these points could also have been implemented with
caching of libnl objects, that would have become hard to maintain.
https://bugzilla.gnome.org/show_bug.cgi?id=747981
(cherry picked from commit 470bcefa5f)