There is no principle problem with returning zero has hash
value. But just don't do it. Our hash functions should not
return zero. Instead, return nm_hash_static(). This is why
the function exists.
We also do this for libnm, where it causes visible changes
in behavior. But if somebody would rely on the hashing implementation
for hash tables, it would be seriously flawed.
GHashTable optimizes a NULL equality function to use direct pointer
comparison. That saves the overhead of calling g_direct_equal().
This is also documented behavior for g_hash_table_new().
While at it, also don't pass g_direct_hash() but use the default
of %NULL. The behavior is the same, but consistently don't use
g_direct_hash().
The file descriptor is owned by the netlink socket instance,
which we close in finalize. We most not close it when destroying
the IO channel, otherwise the file descriptor gets closed twice.
Closing an invalid file descriptor (or a descriptor that is already closed)
is a serious bug, because the integer values are re-used, so there is a race
that the close might affect an innocent file descriptor instead of just
failing with EBADF.
The "onlink" flag for IPv4 routes is part of the route ID.
Consider it in nm_platform_ip4_route_cmp().
Also, allow configuring the flag when adding a route.
Note that for IPv6, the onlink flag is still ignored.
Pretty much like kernel does.
We need to pass more alias-types. Instead of having numbered
versions, use variadic number of macro arguments.
Also, fix build failure with old compiler:
In file included from src/nm-ip6-config.c:24:
./src/nm-ip6-config.h:44:29: error: controlling expression type 'typeof (ipconf_iter->current->obj)' (aka 'const void *const') not compatible with any generic association type
*out_address = has_next ? NMP_OBJECT_CAST_IP6_ADDRESS (ipconf_iter->current->obj) : NULL;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: b1810d7a68
_NM_GET_PRIVATE() used typeof() to propagate constness of the @self
pointer. However, that means, it could only be used with a self pointer
of the exact type. That means, you explicitly had to cast from (GObject *)
or from (void *).
The requirement is cumbersome, and often led us to either create @self
pointer we didn't need:
NMDeviceVlan *self = NM_DEVICE_VLAN (device);
NMDeviceVlanPrivate *priv = NM_DEVICE_VLAN_GET_PRIVATE (self);
or casting:
NMDeviceVlanPrivate *priv = NM_DEVICE_VLAN_GET_PRIVATE ((NMDevice *) device);
In both cases we forcefully cast the source variable, loosing help from
the compiler to detect a bug.
For "nm-linux-platform.c", instead we commonly have a pointer of type
NMPlatform. Hence, we always forcefully cast the type via _NM_GET_PRIVATE_VOID().
Rework the macro to use _Generic(). If compiler supports _Generic(), then we
will get all compile time checks as desired. If the compiler doesn't support
_Generic(), it will still work. You don't get the compile-time checking of course,
but you'd notice that something is wrong once you build with a suitable
compiler.
If @ifname is set, we above lookup by name (nmp_lookup_init_link_by_ifname)
and set ifname to NULL. Hence, inside the loop, the check for ifname is
never true.
30. NetworkManager-1.9.2/src/settings/plugins/keyfile/nms-keyfile-writer.c:218:
check_return: Calling "g_mkdir_with_parents" without checking return
value (as is done elsewhere 4 out of 5
times).
25. NetworkManager-1.9.2/src/platform/nm-linux-platform.c:3969:
check_return: Calling "_nl_send_nlmsg" without checking return value (as
is done elsewhere 4 out of 5 times).
34. NetworkManager-1.9.2/src/nm-core-utils.c:2843:
negative_returns: "fd2" is passed to a parameter that cannot be negative.
26. NetworkManager-1.9.2/src/devices/wwan/nm-modem-broadband.c:897:
check_return: Calling "nm_utils_parse_inaddr_bin" without checking
return value (as is done elsewhere 4 out of 5 times).
3. NetworkManager-1.9.2/src/devices/bluetooth/nm-bluez5-manager.c:386:
check_return: Calling "g_variant_lookup" without checking return value
(as is done elsewhere 79 out of 83 times).
16. NetworkManager-1.9.2/libnm-util/nm-setting.c:405:
check_return: Calling "nm_g_object_set_property" without checking return
value (as is done elsewhere 4 out of 5 times).
# random seed: R02S4ca8cfc3dace399c0f15b42411e45d2e
1..48
# Start of link tests
ok 1 /link/bogus
PASS: src/platform/tests/test-link-linux 1 /link/bogus
ok 2 /link/loopback
PASS: src/platform/tests/test-link-linux 2 /link/loopback
nmtst: initialize nmtst_get_rand() with NMTST_SEED_RAND=2697682474
ok 3 /link/internal
PASS: src/platform/tests/test-link-linux 3 /link/internal
ok 4 /link/external
PASS: src/platform/tests/test-link-linux 4 /link/external
# Start of software tests
./tools/run-nm-test.sh: line 193: 7589 Trace/breakpoint trap (core dumped) "${NMTST_DBUS_RUN_SESSION[@]}" "$TEST" "$@"
NMPlatformSignalAssert: src/platform/tests/test-link.c:298, test_slave(): failure to accept signal 0 times: 'link-changed-changed' ifindex 9 (1 times received)
ERROR: src/platform/tests/test-link-linux - too few tests run (expected 48, got 4)
ERROR: src/platform/tests/test-link-linux - exited with status 133 (terminated by signal 5?)
and nm_utils_ip6_property_path(). The API with static buffers
looks a bit nicer. But I think they are dangerous, because
we tend to pass the buffer down several layers of the stack, and
it's not immediately clear, that we don't overwrite the static
buffer again (which we probably did not, but it's hard to verify
that there is no bug there).
Setting the MTU failes under regular conditions, for example when
setting the MTU of a master larger then the MTU of the slaves.
Logging a warning it too alarming.
When comparing an unsigned and a signed integer, the signed integer
is promoted to unsigned, resulting in a very large number.
See the checks "nwrote < len - 1", where nwrote might be -1
to indicate failure. The condition would not be TRUE due to
promoting -1 to the max int value.
Hence, sysctl_set() was rather wrong.
We don't need this extra distinguisher. It makes no sense to ever
compare two routes with a different compare-type.
Also, the number of fields that is hashed already differs between each
compare type. If we have a good hashing algorithm, this already suffices
that the hash value looks largely different.
We often want to cascade hashing, meaning, to combine the
outcome of various hash functions in a larger hash.
Instead of having each hash function return a guint hash value,
accept a hash state argument. This saves the overhead of initializing
and completing the intermediate hash states.
It also avoids loosing entropy when we reduce the larger hash state
into the intermediate guint hash value.
By using a macro, we don't cast all the types to guint. Instead,
we use their native types directly. Hence, we don't need
nm_hash_update_uint64() nor nm_hash_update_ptr().
Also, for types smaller then guint like char, we save hashing
the all zero bytes.
siphash24() is wildly used by projects nowadays.
It's certainly slower then our djb hashing that we used before.
But quite likely it's fast enough for us, given how wildly it is
used. I think it would be hard to profile NetworkManager to show
that the performance of hash tables is the issue, be it with
djb or siphash24.
Certainly with siphash24() it's much harder to exploit the hashing
algorithm to cause worst case hash operations (provided that the
seed is kept private). Does this better resistance against a denial
of service matter for us? Probably not, but let's better be safe then
sorry.
Note that systemd's implementation uses a different seed for each hash
table (at least, after the hash table grows to a certain size).
We don't do that and use only one global seed.
Replace the usage of g_str_hash() with our own nm_str_hash().
GLib's g_str_hash() uses djb2 hashing function, just like we
do at the moment. The only difference is, that we use a diffrent
seed value.
Note, that we initialize the hash seed with random data (by calling
getrandom() or reading /dev/urandom). That is a change compared to
before.
This change of the hashing function and accessing the random pool
might be undesired for libnm/libnm-core. Hence, the change is not
done there as it possibly changes behavior for public API. Maybe
we should do that later though.
At this point, there isn't much of a change. This patch becomes
interesting, if we decide to use a different hashing algorithm.
The privious NM_HASH_* macros directly operated on a guint value
and were thus close to the actual implementation.
Replace them by adding a NMHashState struct and accessors to
update the hash state. This hides the implementation better
and would allow us to carry more state. For example, we could
switch to siphash24() transparently.
For now, we still do a form basically djb2 hashing, albeit with
differing start seed.
Also add nm_hash_str() and nm_str_hash():
- nm_hash_str() is our own string hashing implementation
- nm_str_hash() is our own string implementation, but with a
GHashFunc signature, suitable to pass it to g_hash_table_new().
Also, it has this name in order to remind you of g_str_hash(),
which it is replacing.
Introduce a NM_HASH_INIT() function. It makes the places
where we initialize a hash with a certain seed visually clear.
Also, move them from "shared/nm-utils/nm-shared-utils.h" to
"shared/nm-utils/nm-macros-internal.h". We might want to
have NM_HASH_INIT() non-inline (hence, define it in the
source file).
We added "ipv4.route-table-sync" and "ipv6.route-table-sync" to not change
behavior for users that configured policy routing outside of NetworkManager,
for example, via a dispatcher script. Users had to explicitly opt-in
for NetworkManager to fully manage all routing tables.
These settings were awkward. Replace them with new settings "ipv4.route-table"
and "ipv6.route-table". Note that this commit breaks API/ABI on the unstable
development branch by removing recently added API.
As before, a connection will have no route-table set by default. This
has the meaning that policy-routing is not enabled and only the main table
will be fully synced. Once the user sets a table, we recognize that and
NetworkManager manages all routing tables.
The new route-table setting has other important uses: analog to
"ipv4.route-metric", it is the default that applies to all routes.
Currently it only works for static routes, not DHCP, SLAAC,
default-route, etc. That will be implemented later.
For static routes, each route still can explicitly set a table, and
overwrite the per-connection setting in "ipv4.route-table" and
"ipv6.route-table".
These static variables really never be modified.
Mark them as const, which allows the linker to mark them as
read-only.
The problem is libnl3's API, which has these parameters
not as const. Add a workaround for that. Clearly libnl3 is
not gonna modify the policy, that the API was fixed too [1]
[1] b4802a17a7
gcc doesn't consider variables with cleanup attribute as unused.
clang does, and warns about them.
In one case, clang is right, in the other one the warning is bogus.
Fix both.
- use nm_utils_addr_family_to_char(). It asserts that the input argument
is either AF_INET or AF_INET6.
- rename variable @family to @addr_family for consistency.
- when logging addr_family for activation-stage, use v4 or v6 instead
of numeric AF_INET/AF_INET6.
Whenever we call a platform operation that reads or writes the netlink
socket, there is the possibility that the cache gets updated, as we
receive netlink events.
It is thus racy, if nm_platform_ip_route_sync() *first* adds routes, and
then obtains a list of routes to delete. The correct approach is to
determine which routes to delete first (and keep it in a list
@routes_prune), and pass that list down to nm_platform_ip_route_sync().
Arguably, this doesn't yet solve every race. For example, NMDevice
calls update_ext_ip_config() during ip4_config_merge_and_apply().
That is good, as it resyncs with platform. However, before calling
nm_ip4_config_commit() it calls other platform operations, like
_commit_mtu(). So, the race is still there.
Kernel does not allow to add a route with table 0 (RT_TABLE_UNSPEC). It
effectively is an alias for the main table. We must consider that when
comparing routes sementically.