Glib is not out of memory safe, meaning it always aborts the program
when an allocation fails. It is not possible to meaningfully handle
out of memory when using glib.
Replace all allocation functions for netlink message with their glib
counter part and remove the NULL checks.
Originally, these were error numbers from libnl3. These error numbers
are separate from errno, which is unfortunate, because sometimes we
care about the native errno returned from kernel.
Now, refactor them so that the error numbers are in the shared realm
of errno, but failures from kernel or underlying API are still returned
via their native errno.
- NLE_INVAL doesn't exist anymore. Passing invalid arguments to a function
is commonly a bug. g_return_*(NLE_BUG) is the right answer to that.
- NLE_NOMEM and NLE_AGAIN is replaced by their errno counterparts.
- drop several error numbers. If nobody cares about these numbers,
there is no reason to have a specific error number for them.
NLE_UNSPEC is sufficient.
From libnl3, we only used the helper function to parse/generate netlink
messages and the socket functions to send/receive messages. We don't
need an external dependency to do that, it is simple enough.
Drop the libnl3 dependency, and replace all missing code by directly
copying it from libnl3 sources. At this point, I mostly tried to
import the required bits to make it working with few modifications.
Note that this increases the binary size of NetworkManager by 4736 bytes
for contrib/rpm build on x86_64. In the future, we can simplify the code
further.
A few modifications from libnl3 are:
- netlink errors NLE_* are now in the domain or regular errno.
The distinction of having to bother with two kinds of error
number domains was annoying.
- parts of the callback handling is copied partially and unused parts
are dropped. Especially, the verbose/debug handlers are not used.
In following commits, the callback handling will be significantly
simplified.
- the complex handling of seleting ports was simplified. We now always
let kernel choose the right port automatically.
Add a function that allows to re-request all objects of a certain type.
Usually, the cache is supposed to keep itself in a consistent state and
this function is not useful.
It is however useful during testing and debugging to explicitly reload
an object type.
If you ever think to need this function in non-testing code, then
something else is probably wrong with the cache implementation.
Add a WifiDataClass struct, that is immutable and contains all the
function pointers that were previously embedded in WifiData directly.
They are not ever modified after creation, hence this allows to have
a "static const" allocated instance of the VTable.
Also rename wifi_data_deinit() to wifi_data_unref(). It does not only
deinitialize the instance, instead it also frees it. Hence, rename it
to "unref()".
Kernel (as of 4.14) merely ACKs our RTM_DELQDISC and RTM_DELTFILTER, not
bothering to signal the full RTM_DEL* message unless the removal is
external to NetworkManager.
https://bugzilla.redhat.com/show_bug.cgi?id=1527197
NM_FLAGS_HAS() uses a static-assert that the second argument is a
single flag (power of two). With a single flag, NM_FLAGS_HAS(),
NM_FLAGS_ANY() and NM_FLAGS_ALL() are all identical.
The second argument must be a compile time constant, and if that is
not the case, one must not use NM_FLAGS_HAS().
Use NM_FLAGS_ANY() in these cases.
We're going to need that one for TC filter & action support.
<linux/tc_act/tc_defact.h> was moved to user-space API only in 2013
by commit 5bc3db5c9ca8407f52918b6504d3b27230defedc. Our travis CI currently
fails to build due to that.
Re-implement the header.
It only makes sense to call delete() with NMPObjects that
we obtained from the platform cache. Otherwise, if we didn't
get it from the cache in the first place, we wouldn't know
what to delete.
Hence, the input argument is (almost) always an NMPObject
in the first place. That is different from add(), where
we might create a new specific NMPlatform* instance on the
stack. For add() it makes slightly more sense to have different
functions depending on the type. For delete(), it doesn't.
We also do this for libnm, where it causes visible changes
in behavior. But if somebody would rely on the hashing implementation
for hash tables, it would be seriously flawed.
The file descriptor is owned by the netlink socket instance,
which we close in finalize. We most not close it when destroying
the IO channel, otherwise the file descriptor gets closed twice.
Closing an invalid file descriptor (or a descriptor that is already closed)
is a serious bug, because the integer values are re-used, so there is a race
that the close might affect an innocent file descriptor instead of just
failing with EBADF.
The "onlink" flag for IPv4 routes is part of the route ID.
Consider it in nm_platform_ip4_route_cmp().
Also, allow configuring the flag when adding a route.
Note that for IPv6, the onlink flag is still ignored.
Pretty much like kernel does.
We need to pass more alias-types. Instead of having numbered
versions, use variadic number of macro arguments.
Also, fix build failure with old compiler:
In file included from src/nm-ip6-config.c:24:
./src/nm-ip6-config.h:44:29: error: controlling expression type 'typeof (ipconf_iter->current->obj)' (aka 'const void *const') not compatible with any generic association type
*out_address = has_next ? NMP_OBJECT_CAST_IP6_ADDRESS (ipconf_iter->current->obj) : NULL;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: b1810d7a68
_NM_GET_PRIVATE() used typeof() to propagate constness of the @self
pointer. However, that means, it could only be used with a self pointer
of the exact type. That means, you explicitly had to cast from (GObject *)
or from (void *).
The requirement is cumbersome, and often led us to either create @self
pointer we didn't need:
NMDeviceVlan *self = NM_DEVICE_VLAN (device);
NMDeviceVlanPrivate *priv = NM_DEVICE_VLAN_GET_PRIVATE (self);
or casting:
NMDeviceVlanPrivate *priv = NM_DEVICE_VLAN_GET_PRIVATE ((NMDevice *) device);
In both cases we forcefully cast the source variable, loosing help from
the compiler to detect a bug.
For "nm-linux-platform.c", instead we commonly have a pointer of type
NMPlatform. Hence, we always forcefully cast the type via _NM_GET_PRIVATE_VOID().
Rework the macro to use _Generic(). If compiler supports _Generic(), then we
will get all compile time checks as desired. If the compiler doesn't support
_Generic(), it will still work. You don't get the compile-time checking of course,
but you'd notice that something is wrong once you build with a suitable
compiler.
30. NetworkManager-1.9.2/src/settings/plugins/keyfile/nms-keyfile-writer.c:218:
check_return: Calling "g_mkdir_with_parents" without checking return
value (as is done elsewhere 4 out of 5
times).
25. NetworkManager-1.9.2/src/platform/nm-linux-platform.c:3969:
check_return: Calling "_nl_send_nlmsg" without checking return value (as
is done elsewhere 4 out of 5 times).
34. NetworkManager-1.9.2/src/nm-core-utils.c:2843:
negative_returns: "fd2" is passed to a parameter that cannot be negative.
26. NetworkManager-1.9.2/src/devices/wwan/nm-modem-broadband.c:897:
check_return: Calling "nm_utils_parse_inaddr_bin" without checking
return value (as is done elsewhere 4 out of 5 times).
3. NetworkManager-1.9.2/src/devices/bluetooth/nm-bluez5-manager.c:386:
check_return: Calling "g_variant_lookup" without checking return value
(as is done elsewhere 79 out of 83 times).
16. NetworkManager-1.9.2/libnm-util/nm-setting.c:405:
check_return: Calling "nm_g_object_set_property" without checking return
value (as is done elsewhere 4 out of 5 times).
Setting the MTU failes under regular conditions, for example when
setting the MTU of a master larger then the MTU of the slaves.
Logging a warning it too alarming.
When comparing an unsigned and a signed integer, the signed integer
is promoted to unsigned, resulting in a very large number.
See the checks "nwrote < len - 1", where nwrote might be -1
to indicate failure. The condition would not be TRUE due to
promoting -1 to the max int value.
Hence, sysctl_set() was rather wrong.
Replace the usage of g_str_hash() with our own nm_str_hash().
GLib's g_str_hash() uses djb2 hashing function, just like we
do at the moment. The only difference is, that we use a diffrent
seed value.
Note, that we initialize the hash seed with random data (by calling
getrandom() or reading /dev/urandom). That is a change compared to
before.
This change of the hashing function and accessing the random pool
might be undesired for libnm/libnm-core. Hence, the change is not
done there as it possibly changes behavior for public API. Maybe
we should do that later though.
At this point, there isn't much of a change. This patch becomes
interesting, if we decide to use a different hashing algorithm.
These static variables really never be modified.
Mark them as const, which allows the linker to mark them as
read-only.
The problem is libnl3's API, which has these parameters
not as const. Add a workaround for that. Clearly libnl3 is
not gonna modify the policy, that the API was fixed too [1]
[1] b4802a17a7
Kernel does not allow to add a route with table 0 (RT_TABLE_UNSPEC). It
effectively is an alias for the main table. We must consider that when
comparing routes sementically.
No need for duplicate log lines
<debug> [1506146476.8462] platform: link: adding tap tap0 owner 107 group -1
<debug> [1506146476.8462] platform-linux: link: add tap tap0 owner 107 group -1
Merge them.
Also, for consistency change the logging output for adding generic
interfaces in nm_platform_link_add().
After commit 5a69b27a64 ("platform: let platform operations only
consider kernel response") the platform only relies on kernel messages
and doesn't check if a deleted object is gone from the cache. For IPv6
addresses it can happen that the RTM_DELADDR comes after the ack, and
this causes random failures in test /address/ipv6/general-2:
[10.8009] platform: address: deleting IPv6 address 2001:db8:a🅱️1:2:3:4/64, ifindex 12 dev nm-test-device
[10.8009] platform-linux: delayed-action: schedule wait-for-nl-response (seq 55, timeout in 0.199999680, response-type 0)
[10.8009] platform-linux: delayed-action: handle wait-for-nl-response (any)
[10.8009] platform-linux: netlink: recvmsg: new message (2), flags 0x0100, seq 55
[10.8009] platform-linux: delayed-action: complete wait-for-nl-response (seq 55, timeout in 0.199980533, response-type 0, success)
[10.8009] platform-linux: do-delete-ip6-address[12: 2001:db8:a🅱️1:2:3:4]: success
**
NetworkManager:ERROR:src/platform/tests/test-common.c:1127:_ip_address_del: assertion failed: (external_command)
Use the same workaround in place for the addition of IPv6 addresses,
i.e. refetch the object if the address is still present after the ack.