The return value for the delete methods checks whether the object
is actually deleted. That is questionable behavior, because if the netlink
request succeeds, there is little point in checking with the platform cache.
As it is, it is racy.
Anyway, the previous value was totally wrong.
But it also uncovers another platform bug, which currently breaks
route tests. Will be fixed next.
and refactor NMFakePlatform to also track links via NMPCache.
For one, now NMFakePlatform also tests NMPCache, increasing the
coverage of what we care about.
Also, all our NMPlatform implementations now use NMPObject and NMPCache.
That means, we can expose those as part of the public API. Which is
great, because callers can keep a reference to the NMPObject object
and make use of generic functions like nmp_object_to_string().
And move some code from NMLinuxPlatform to NMPlatform, where it belongs.
The advantage is that we reuse (and test!) the NMPCache implementation for
tracking addresses.
Also, we now always expose proper NMPObjects from both linux and fake
platform.
For example,
obj = NMP_OBJECT_UP_CAST (nm_platform_ip4_address_get (...));
will work as expected. Also, the caller is now by NMPlatform API
allowed to take and keep a reference to the returned objects.
Routes and addresses don't implement cmd_obj_is_visible(),
hence they are always visible, and NMP_CACHE_ID_TYPE_OBJECT_TYPE_VISIBLE_ONLY
is identical to NMP_CACHE_ID_TYPE_OBJECT_TYPE.
Only link objects can be alive but invisible. Still, drop the index
for looking up visible links entirely. Let callers do the filtering,
if they care.
Implement the reference counting of NMPObject as part of
NMDedupMultiObj and get rid of NMDedupMultiBox.
With this change, the NMPObject is aware in which NMDedupMultiIndex
instance it is tracked.
- this saves an additional GSlice allocation for the NMDedupMultiBox.
- it is immediately known, whether an NMPObject is tracked by a
certain NMDedupMultiIndex or not. This saves an additional hash
lookup.
- previously, when all idx-types cease to reference an NMDedupMultiObj
instance, it was removed. Now, a tracked objects stays in the
NMDedupMultiIndex until it's last reference is deleted. This possibly
extends the lifetime of the object and we may reuse it better.
- it is no longer possible to add one object to more then one
NMDedupMultiIndex instance. As we anyway want to have only one
instance to deduplicate the objects, this is fine.
- the ref-counting implementation is now part of NMDedupMultiObj.
Previously, NMDedupMultiIndex could also track objects that were
not ref-counted. Hoever, the object anyway *must* implement the
NMDedupMultiObj API, so this flexibility is unneeded and was not
used.
- a downside is, that NMPObject grows by one pointer size, even if
it isn't tracked in the NMDedupMultiIndex. But we really want to
put all objects into the index for sharing and deduplication. So
this downside should be acceptable. Still, code like
nmp_object_stackinit*() needs to handle a larger object.
NMPlatform's cache should be directly accessible to the users,
at least the NMPLookup part and the fact that the cache contains
ref-counted, immutable NMPObjects.
This allows users to inspect the cache with zero overhead. Meaning,
they can obtain an NMDedupMultiHeadEntry and iterate the objects
themself. It also means, the are free to take and keep references
of the NMPObject instances (of course, without modifying them!).
NMFakePlatform will use the very same cache. The fake platform should
only differ when modifying the objects.
Another reason why this makes sense is because NMFakePlatform is for one
a test-stup but also tests behavior of platform itself. Using a separate
internal implementation for the caching is a pointless excecise, because
only the real NMPCache's implementation really matters for production.
So, either NMFakePlatform behaves idential, or it is buggy. Reuse it.
Port fake platform's tracking of routes to NMPCache and move duplicate
code from NMLinuxPlatform to the base class.
This commit only ports IP routes, eventually also addresses and links
should be tracked via the NMPCache instance.
We want to expose the NMPLookup and NMDedupMultiHeadEntry to the users
of NMPlatform, so that they can iterate the cache directly.
That means, NMPCache becames an integral part of NMPlatform's API
and must also be implemented by NMFakePlatform.
We want to move the multi_idx from NMLinuxPlatform to NMPlatform,
so that it can be used by NMFakePlatform as well. For that, we need
to know whether NMPlatform will use udev or not. Add a constrctor
property.
Rework platform object cache to use NMDedupMultiIndex.
Already previously, NMPCache used NMMultiIndex and had thus
O(1) for most operations. What is new is:
- Contrary to NMMultiIndex, NMDedupMultiIndex preserves the order of
the cached items. That is crucial to handle routes properly as kernel
will replace the first matching route based on network/plen/metric
properties. See related bug rh#1337855.
Without tracking the order of routes as they are exposed
by kernel, we cannot properly maintain the route cache.
- All NMPObject instances are now treated immutable, refcounted
and get de-duplicated via NMDedupMultiIndex. This allows
to have a global NMDedupMultiIndex that can be shared with
NMIP4Config and NMRouteManager. It also allows to share the
objects themselves.
Immutable objects are so much nicer. We can get rid of the
update pre-hook callback, which was required previously because
we would mutate the object inplace. Now, we can just update
the cache, and compare obj_old and obj_new after the fact.
- NMMultiIndex was treated as an internal of NMPCache. On the other
hand, NMDedupMultiIndex exposes NMDedupMultiHeadEntry, which is
basically an object that allows to iterate over all related
objects. That means, we can now lookup objects in the cache
and give the NMDedupMultiHeadEntry instance to the caller,
which then can iterate the list on it's own -- without need
for copying anything.
Currently, at various places we still create copies of lookup
results. That can be improved later.
The ability to share NMPObject instances should enable us to
significantly improve performance and scale with large number
of routes.
Of course there is a memory overhead of having an index for each list
entry. Each NMPObject may also require an NMDedupMultiEntry,
NMDedupMultiHeadEntry, and NMDedupMultiBox item, which are tracked
in a GHashTable. Optimally, one NMDedupMultiHeadEntry is the head
for multiple objects, and NMDedupMultiBox is able to deduplicate several
NMPObjects, so that there is a net saving.
Also, each object type has several indexes of type NMPCacheIdType.
So, worst case an NMPlatformIP4Route in the platform cache is tracked
by 8 NMPCacheIdType indexes, for each we require a NMDedupMultiEntry,
plus the shared NMDedupMultiHeadEntry. The NMDedupMultiBox instance
is shared between the 8 indexes (and possibly other).
We listen to all RTM_GETLINK messages to get updates on interfaces statuses.
Unfortunately wireless code in the kernel sends those messages with wireless information included
and all other information excluded. When we receive such message we wipe out our valid cached entry
with new object that is almost empty because netlink message didn't contain any information.
Solution to this is to check that incoming message contains MTU field: this field is always
set for complete messages about interfaces and is not set by wireless code.
Signed-off-by: Nikolay Martynov <mar.kolya@gmail.com>
https://github.com/NetworkManager/NetworkManager/pull/17
NMPlatform, NMRouteManager and NMDefaultRouteManager are singletons
instances. Users of those are for example NMDevice, which registers
to GObject signals of both NMPlatform and NMRouteManager.
Hence, as NMDevice:dispose() disconnects the signal handlers, it must
ensure that those singleton instances live longer then the NMDevice
instance. That is usually accomplished by having users of singleton
instances own a reference to those instances.
For NMDevice that effectively means that it shall own a reference to
several singletons.
NMPlatform, NMRouteManager, and NMDefaultRouteManager are all
per-namespace. In general it doesn't make sense to have more then
one instances of these per name space. Nnote that currently we don't
support multiple namespaces yet. If we will ever support multiple
namespaces, then a NMDevice would have a reference to all of these
manager instances. Hence, introduce a new class NMNetns which bundles
them together.
(cherry picked from commit 0af2f5c28b)
Platform's add/remove operations accept a "network" argument.
Kernel requires that the host part (based on plen) is all zero.
For NetworkManager we are more resilient to user configuration.
Cleanup the input argument already before calling _nl_msg_new_route().
Note that we use the same "network" argument to construct a obj_id
instance and to find the route in the cache (do_add_addrroute()).
Without cleaning the host part, the added object cannot be found
and the add-route command seemingly fails.
(cherry picked from commit 11d8c41898)
The function is supposed to set *unamanged to NM_UNMANAGED's and indicate
whether NM_UNMANAGED was present in the return value.
Fixes: e32839838e
(cherry picked from commit b7b0227935)
For nl80211, we don't care about the interface name and only use it
when formatting error messages. For wext, an up-to-date interface name
should be obtained every time to minimize the chance of race
conditions when the interface is renamed.
We should implement all our private-getters with the very same pattern
(i.e. their type structure contains a field "_priv" and nm_assert()
with a GObject type check).
NM_LINUX_PLATFORM_GET_PRIVATE() was already doing all of that. Now just
use the _NM_GET_PRIVATE_VOID() macro which formally follows the
intended pattern.
Also, as time goes by it is less likely to encounter a user
where the kernel has no support. The most likely reason nowadays
is that the user booted with "ipv6.disabled=1".
https://bugzilla.redhat.com/show_bug.cgi?id=1421019
- use gs_free attribute
- move printing the logging cache warning inside the place
where we actuall add a new item to the cache.
It's really a minor cleanup of stuff that come to my mind reviewing the
function.
We call nmp_utils_ethtool_get_driver_info() twice when receiving a
netlink message, but we don't need a clone of the string values.
Instead, expose a data structure that should be stack allocated
by the caller.
The ioctl APIs ethtool/mii require an interface ifname. That is inherrently
racy as interfaces can be renamed. This cannot be fixed, we can only
minimize the time between verifying the ifname and calling ioctl.
We already had problems with that when ethtool would access an interface
by name that didn't exists. See commit ab41c13b06 .
Checking for an existing interface only helps avoiding races when an interface
gets deleted. It does not help against renaming.
Go one step further, and instead of checking whether such an ifname
exists, try to get the ifname based on the ifindex immediately before
we need it.
This brings an additional overhead for each ethtool access.
Since function nmp_utils_open_sysctl() can avoid race condition, use it
in wifi_utils_is_wifi() to open sysfs and correctly check if it's a wifi
device.
https://bugzilla.gnome.org/show_bug.cgi?id=775613
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Since commit 9fafb382db, we would
explicitly set libnl's socket buffer size to 4*getpagesize().
That is also the default of libnl itself. Additionally, we would
workaround too small buffers by increasing the buffer size up to 512K.
A too small buffer causes messages to be lost. Usually, that only
results in a cache-resync, which isn't too bad. Lost messages are however
a problem if the lost message was an ACK that we were waiting for.
However, it is rather unlikely to happen, because it's expected that
the buffer size gets adjusted already when the cache is filled initially,
before any other requests are pending.
Still, let's increase the default buffer size to 32K, hoping that this
initial value is already large enough to avoid the problem altogether.
Note that iproute2 also uses a buffer size of 32K [1] [2].
Alternatively, we could use MSG_PEEK like systemd does [3]. However,
that requires two syscalls per message.
[1] https://patchwork.ozlabs.org/patch/592178/
[2] https://git.kernel.org/cgit/linux/kernel/git/shemminger/iproute2.git/tree/lib/libnetlink.c?id=f5f760b81250630da23a4021c30e802695be79d2#n274
[3] cd66af2274/src/libsystemd/sd-netlink/netlink-socket.c (L323)
We don't want to enable MSG_PEEK due to the overhead. But when we detect
that we just lost a message due to MSG_TRUNC, increase the buffer size and
retry.
See-also: 55ea6e6b6c