Unslaving from a bridge causes a wrong RTM_DELLINK event for
the former slave.
# ip link add dummy0 type dummy
# ip link add bridge0 type bridge
# ip link set bridge0 up
# ip link set dummy0 master bridge0
# ip monitor link &
# ip link set dummy0 nomaster
18: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master bridge0 state DOWN group default
link/ether 76:44:5f:b9:38:02 brd ff:ff:ff:ff:ff:ff
18: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether 76:44:5f:b9:38:02
Deleted 18: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether 76:44:5f:b9:38:02
18: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 76:44:5f:b9:38:02 brd ff:ff:ff:ff:ff:ff
19: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
19: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
Previously, during do_request_link() we would remember the link that is
about to be requested (delayed_deletion) and delay processing a new
RTM_DELLINK message until the end of do_request_link() -- and possibly
forget about about the deletion, if RTM_DELLINK was followed by a
RTM_NEWLINK.
However, this hack does not catch the case where an external command
unslaves the link.
Instead just accept the wrong event and raise a "removed" signal right
away. This brings the cache in an externally visible, wrong state that
will be fixed by a following "added" signal.
Still do that because working around the kernel bug is complicated. Also,
we already might emit wrong "added" signals for devices that are already
removed. As a consequence, a user should not consider the platform signals
until all events are processed.
Listeners to that signal should accept that added/removed link changes
can be wrong and should preferably handle them idly, when the events
have settled.
It can even be worse, that a RTM_DELLINK is not fixed by a following
RTM_NEWLINK:
...
# ip link set dummy0 nomaster
36: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be brd ff:ff:ff:ff:ff:ff
36: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be
Deleted 36: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be
37: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
37: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
So, when a slave is deleted, we have to refetch it too.
https://bugzilla.redhat.com/show_bug.cgi?id=1285719
(cherry picked from commit 8a87a91813)
Conflicts:
src/platform/nm-linux-platform.c
src/platform/tests/test-link.c
"platform" implements a iproute2 like command-line
tool based on NMPlatform.
It is badly maintained and mostly unused. If we want
to test something, we should write tests that are run
automatically during `make check`. Manual tests just
don't fly.
(cherry picked from commit f122879c83)
The program ran over the platform links and printed them.
Our to-string methods of platform objects are already supposed
to print all fields. So this only duplicates code to print a link.
If you want to see what links were picked up by platfrom run:
NMTST_DEBUG=log-level=TRACE ./src/platform/tests/monitor
or just
./src/platform/tests/monitor
Yes, this has less the iproute2 feeling, but it gives you a more
native access to the platform objects -- which is what you want
for debugging platform.
(cherry picked from commit d13d17f84a)
This gives us a way to externally configure the logging level like:
NMTST_DEBUG=log-level=TRACE ./src/platform/tests/monitor
(cherry picked from commit ca8e40e1dc)
For libnm library, "nm-dbus-interface.h" contains defines like the D-Bus
paths of NetworkManager. It is desirable to have this header usable without
having a dependency on "glib.h", for example for a QT application. For that,
commit c0852964a8 removed that dependancy.
For libnm-glib library, the analog to "nm-dbus-interface.h" is
"NetworkManager.h", and the same applies there. Commit
159e827a72 removed that include.
However, that broke build on PackageKit [1] which expected to get the
version macros by including "NetworkManager.h". So at least for libnm-glib,
we need to preserve old behavior so that a user including
"NetworkManager.h" gets the version macros, but not "glib.h".
Extract the version macros to a new header file "nm-version-macros.h".
This header doesn't include "glib.h" and can be included from
"NetworkManager.h". This gives as previous behavior and a glib-free
include.
For libnm we still don't include "nm-version-macros.h" to "nm-dbus-interface.h".
Very few users will actually need the version macros, but not using
libnm.
Users that use libnm, should just include (libnm's) "NetworkManager.h" to
get all headers.
As a special case, a user who doesn't want to use glib/libnm, but still
needs both "nm-dbus-interface.h" and "nm-version-macros.h", can include
them both separately.
[1] https://github.com/hughsie/PackageKit/issues/85
Fixes: 4545a7fe96
(cherry picked from commit 7bf10a75db)
There seems to be an issue with glib/ffi that causes failures
to pass enum-typed arguments to signals (related bug rh#1260577).
Add a test for platform signals which, beside NM_CONFIG_SIGNAL_CONFIG_CHANGED,
is the only place where we use enum-typed arguments for signals.
Strangely, this test doesn't cause the failure, so it's unclear why
the workaround was necessary for "config-changed" signal (commit
e7d66f1df6).
(cherry picked from commit 52cd5ee612)
Seems that team changed to now also raise two change signals.
Relax the assertion that broke tests on Fedora 22.
(cherry picked from commit 1c2883c940)
Change nm_platform_link_get() to return the cached NMPlatformLink
instance. Now what all our implementations (fake and linux) have such a
cache internal object, let's just expose it directly.
Note that the lifetime of the exposed link object is possibly quite
short. A caller must copy the returned value if he intends to preserve
it for later.
Also add nm_platform_link_get_by_ifname() and modify nm_platform_link_get_by_address()
to return the instance.
Certain functions, such as nm_platform_link_get_name(),
nm_platform_link_get_ifindex(), etc. are solely implemented based
on looking at the returned NMPlatformLink object. No longer implement
them as virtual functions but instead implement them in the base class
(nm-platform.c).
This removes code and eliminates the redundancy of the exposed
NMPlatformLink instance and the nm_platform_link_get_*() accessors.
Thereby also fix a bug in NMFakePlatform that tracked the link address
in a separate "address" field, instead of using "link.addr". That was
a case where the redundancy actually led to a bug in fake platform.
Also remove some stub implementations in NMFakePlatform that just
bail out. Instead allow for a missing virtual functions and perform
the "default" action in the accessor.
An example for that is nm_platform_link_get_permanent_address().
(cherry picked from commit e8e455817b)
After doing all the refactoring, rename the ObjectType enum to NMPObjectType.
I didn't do this earlier to avoid problems with rebasing. But since I will
backport the platform changes to nm-1-0, this is the right time to get
the name right.
(cherry picked from commit 518cf76de7)
warning: function declaration isn’t a prototype [-Wstrict-prototypes]
In C function() and function(void) are two different prototypes (as opposed to
C++).
function() accepts an arbitrary number of arguments
function(void) accepts zero arguments
(cherry picked from commit 2dc27a99d7)
After refactoring platform, nm_platform_ipx_route_get_all() and
nm_platform_ipx_address_get_all() was broken for calling with a
non-posititive ifindex (which has the meaning: ignore ifindex).
While fixing that, also refactor the NMPCacheIdType so that it matches
better the supported id-types.
Fixes: 470bcefa5f
(cherry picked from commit 8f9dac01ac)
For NMPlatform instances we had an error reporting mechanism
which stores the last error reason in a private field. Later we
would check it via nm_platform_get_error().
Remove this. It was not used much, and it is not a great way
to report errors.
One problem is that at the point where the error happens, you don't
know whether anybody cares about an error code. So, you add code to set
the error reason because somebody *might* need it (but in realitiy, almost
no caller cares).
Also, we tested this functionality which is hardly used in non-testing code.
While this was a burden to maintain in the tests, it was likely still buggy
because there were no real use-cases, beside the tests.
Then, sometimes platform functions call each other which might overwrite the
error reason. So, every function must be cautious to preserve/set
the error reason according to it's own meaning. This can involve storing
the error code, calling another function, and restoring it afterwards.
This is harder to get right compared to a "return-error-code" pattern, where
every function manages its error code independently.
It is better to return the error reason whenever due. For that we already
have our common glib patterns
(1) gboolean fcn (...);
(2) gboolean fcn (..., GError **error);
In few cases, we need more details then a #gboolean, but don't want
to bother constructing a #GError. Then we should do instead:
(3) NMPlatformError fcn (...);
(cherry picked from commit 68a4ffb4e2)
Later remove nm_platform_get_error() and signal errors via return
error codes.
Also, fix nm_platform_infiniband_partition_add() and
nm_platform_vlan_add() to check the type of the existing link
and fail with WRONG_TYPE otherwise.
(cherry picked from commit d7fe907c32)
- rename "NONE" to "SUCCESS", what it really is.
- change the to-string result not to contain spaces
and being closer the name of the enum value.
- add new error reasons "UNSPECIFIED" and "BUG".
- remove the code comments around the enum definition.
They add no further description about why this error
happens and only paraphrase the name of the enum.
- reserve negative integers for 'errno'. This is neat
because if we get a system error we can pass on the
underlying errno as cause.
(cherry picked from commit f7fb68755c)
The @udi field is not a static string, so any user of a NMPlatformLink
instance must make sure not to use the field beyond the lifetime of the
NMPlatformLink instance.
As we pass on the platform link instance during platform changed events,
this is hard to ensure for the subscriber of the signal -- because a
call back into platform could invalidate/modify the object.
Just not expose this field as part of the link instance. The few callers
who actually needed it should instead call nm_platform_get_uid(). With
that, the lifetime of the returned 'const char *' pointer is clearly
defined.
(cherry picked from commit 1b2b988ea9)
Use the event socket to request object via NLM_F_DUMP.
No longer use 'priv->nlh' socket to fetch objects.
Instead fetch them via the priv->nlh_event socket that also
provides asynchronous events when objects change.
That way, the events are in sync with our explicit requests
and we can directly use the events. Previously, the events were
only used to indicate that a refetch must happen, so that every
event triggered a complete dump of all addresses/routes.
We still use 'priv->nlh' to make synchronous requests such as
adding/changing/deleting objects. That means, after we send a
request, we must make sure that the result manifested itself
at 'nlh_event' socket and the platform cache.
That's why we sometimes still must force a dump to sync changes.
That could be improved by using only one netlink socket so that
we would wait for the ACK of our request.
While not yet perfect, this already significantly reduces the number of
fetches. Additionally, before, whenever requesting a dump of addresses
or routes (which we did much more often, search for "get_kernel_object for type"
log lines), we always dumped IPv4 and IPv6 together. Now only request
the addr-family in question.
https://bugzilla.gnome.org/show_bug.cgi?id=747985https://bugzilla.redhat.com/show_bug.cgi?id=1211133
(cherry picked from commit 051cf8bbde)
Just create a NMLinuxPlatform instance and unref it again.
This already connects to netlink and fetches all objects.
(cherry picked from commit 977626d942)
Switch platform caching implementation. Instead of caching libnl
objects, cache our own types.
Don't remove yet the now obsolete functions.
Advantage:
* Performance
- as we now cache our native NMPlatformObject instances, we no longer
have to convert libnl objects every time we access the platform
cache.
- for most cases, access is now O(1) because we can lookup the object
in a hash table. Note that ip4_address_get_all() still has to
create a copy of the result (O(n)), but as the caller is about to
use those elements, he cannot do better then O(n) anyway.
* We cache our own native types and have full control over them. We
cannot extend the libnl objects, which has many short-commings:
- _rtnl_addr_hack_lifetimes_rel_to_abs() to convert the timestamps
to absolute values (and back).
- hack_empty_master_iff_lower_up() would modify the internal flag,
but it looses the original value. That means, we can only hack
the state before putting a link into the cache, but we cannot revert
that change, when a slave in the cache changes state.
That was previously solved by always refetching the master when
a slave changed. Now we can re-evaluate the connected state
(DELAYED_ACTION_TYPE_MASTER_CONNECTED).
- we implement functions like equality, to-string as most suitable
for us. Before we needed hacks like nm_nl_object_diff(),
nm_nl_cache_search(), route_search_cache().
- we can extend our objects with exactly those properties we care,
and possibly additional properties that are not representable in
the libnl objects.
- we no longer cache RTM_F_CLONED routes and they get rejected early
on as we receive them.
- In the future, maybe it'd be interesting the make platform objects
immutable (and ref-counted) and expose them directly.
* Previous implementation did not order the refresh of objects but
called check_cache_items(). Now, those actions are delayed and
combined in an attempt to reduce the overall number of reloads.
Realize how expensive a check_cache_items() for addresses and routes
was: it would iterate all addresses/routes and call refresh_object().
The latter obtains a full dump of *all* objects again, and ignores
all but the needle.
Note that we probably still schedule some delayed actions that
are not needed.
Later we can optimize that further (related bug bgo #747985).
While some of these points could also have been implemented with
caching of libnl objects, that would have become hard to maintain.
https://bugzilla.gnome.org/show_bug.cgi?id=747981
(cherry picked from commit 470bcefa5f)
NMPObject is a simple "object" implemenation around NMPlatformObject.
They are ref-counted and have a class-pointer. Several basic functions
like equality, hash, to-string are implemented.
NMPCache is can be used to store the NMPObject. Objects are indexed
via their primary id, but there is also multi-lookup via NMCacheId
and NMMultiIndex.
Part of the implementation is inside "nm-linux-platform.c",
because it depends on utility functions from there.
(cherry picked from commit 53f98e7f9e)
Cache the scope as part of the NMPlatformIP4Route and
no longer read it from libnl object when needed. Later
there will be no more libnl objects around, and we need
to scope when deleting an IPv4 route.
(cherry picked from commit 619f660a3e)
warning: function declaration isn’t a prototype [-Wstrict-prototypes]
In C function() and function(void) are two different prototypes (as opposed to
C++).
function() accepts an arbitrary number of arguments
function(void) accepts zero arguments
(cherry picked from commit 94a393e9ed)
There is no general purpose file for platform utilities.
We only have nm-platform.h, which contains (mostly) functions
that operate on a NMPlatform instance (and that can be mocked
using NMFakePlatform).
Add a new file for independent utility functions. nm-platform-utils.c
should not call into functions having a NMPlatform instance, to
have them independent from platform caching and the platform
singleton.
(cherry picked from commit ce700d94f5)
==21573== Syscall param mount(type) points to unaddressable byte(s)
==21573== at 0x854B9BA: mount (syscall-template.S:81)
==21573== by 0x158922: main (test-common.c:295)
==21573== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==21573==
{
<insert_a_suppression_name_here>
Memcheck:Param
mount(type)
fun:mount
fun:main
}
Fixes: d6aef9c188
(cherry picked from commit 0d7012faab)
Mount a private sysfs instance. Otherwise gudev sees the devices from the
parent netns as opposed to our netns.
We do, however need a writable /sys/devices subtree for testing the bridge
code. There doesn't seem to be any other way to get a writable subtree of a
read-only filesystem than remounting it with no parameters after the initial
mount. We use this to get a writable sysfs instance and then bindmount it so
that it fits properly in the sysfs hierarchy.
Co-Authored-By: Thomas Haller <thaller@redhat.com>
(cherry picked from commit d6aef9c188)
The link test was disabled in commit 67ad3fcb5b.
The previous issues are not fixed, but apparently disabling the test doesn't
help to get it fixed.
Re-enable it and if it fails we have a better reason to fix it.
Or maybe it works now (?). Didn't fail for me...
(cherry picked from commit f614ebe6f5)
link_extract_type() would return the NMLinkType and a
@type_name string. If the type was unknown, this string
was rtnl_link_get_type() (IFLA_INFO_KIND).
Split up this behavior and treat those values independently.
link_extract_type() now only detects the NMLinkType. Most users
don't care about unknown types and can just use nm_link_type_to_string()
to get a string represenation.
Only nm_platform_link_get_type_name() (and NMDeviceGeneric:type_description)
cared about a more descriptive type. For that, modify link_get_type_name()
to return nm_link_type_to_string() if NMLinkType could be detected.
As fallback, return rtnl_link_get_type().
Also, rename the field NMPlatformLink:link_type to "kind". For now this
field is mostly unused. It will be used later when refactoring platform
caching.
(cherry picked from commit e2c742c77b)
Instead of having a nm_platform_free() function, use NM_DEFINE_SINGLETON_WEAK_REF()
and register a weak reference. That way, users who want to free the platform
instance can just unref it.
(cherry picked from commit 04ed48e5a0)
Most nm_platform_*() functions operate on the platform
singleton nm_platform_get(). That made sense because the
NMPlatform instance was mainly to hook fake platform for
testing.
While the implicit argument saved some typing, I think explicit is
better. Especially, because NMPlatform could become a more usable
object then just a hook for testing.
With this change, NMPlatform instances can be used individually, not
only as a singleton instance.
Before this change, the constructor of NMLinuxPlatform could not
call any nm_platform_*() functions because the singleton was not
yet initialized. We could only instantiate an incomplete instance,
register it via nm_platform_setup(), and then complete initialization
via singleton->setup().
With this change, we can create and fully initialize NMPlatform instances
before/without setting them up them as singleton.
Also, currently there is no clear distinction between functions
that operate on the NMPlatform instance, and functions that can
be used stand-alone (e.g. nm_platform_ip4_address_to_string()).
The latter can not be mocked for testing. With this change, the
distinction becomes obvious. That is also useful because it becomes
clearer which functions make use of the platform cache and which not.
Inside nm-linux-platform.c, continue the pattern that the
self instance is named @platform. That makes sense because
its type is NMPlatform, and not NMLinuxPlatform what we
would expect from a paramter named @self.
This is a major diff that causes some pain when rebasing. Try
to rebase to the parent commit of this commit as a first step.
Then rebase on top of this commit using merge-strategy "ours".
(cherry picked from commit c6529a9d74)
There's no udev running in containers, it only starts if /sys is writable. If a
hardware device is added to the container's namespace NM would not announce it.
This also removes the software link special case -- the software links will now
wait for udev initialization (in case udev is there) as well. There's no reason
to treat them differently anymore. This makes it possible to use udev properties
of the software links.
https://bugzilla.gnome.org/show_bug.cgi?id=740526
(cherry picked from commit 4a05869557)
Hard to debug failures, if we don't print where the failure
happens.
(cherry picked from commit 500cbcba21)
Conflicts:
src/platform/tests/test-common.c
Support accepting more then one signal at a time.
It is to be expected, that one change in platform raises
several signals. Extend the assertion helpers to express
that.
(cherry picked from commit 050c644cce)