Leak detection adds unhelpful messages to the stderr of nmcli, making
tests fail. For example:
=================================================================
==17156==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 256 byte(s) in 2 object(s) allocated from:
#0 0x7f08c7e27c88 in realloc (/lib64/libasan.so.5+0xefc88)
#1 0x7f08c7546e7d in g_realloc (/lib64/libglib-2.0.so.0+0x54e7d)
(cherry picked from commit 2fe9141960)
- make contrib/rpm similar to master.
- make .gitlab-ci.yml similar to master.
- disable eBPF everywhere. Now it must be explicitly disabled.
It seems to break unit tests on gitlab-ci, with something that
looks like a kernel issue. Or maybe ulimit -l is too small?
Anyway, disable it for now as there are problems with it.
We have random failures to build on gitlab-ci. Something is wrong,
at least, eBPF is not working reliably. Disable it for now.
(cherry picked from commit 0d16b037f5)
For better or worse, our release builds commonly do not disable assertions.
That means,
- NDEBUG is not set, and assert() is in effect
- G_DISABLE_ASSERT is not set, and g_assert() is in effect
- G_DISABLE_CHECKS is not set, and g_return*() is in effect.
On the other hand, NM_MORE_ASSERTS is not enabled by default and nm_assert()
is stripped away. That is the actual purpose of nm_assert(): it is
commonly disabled on release builds, while all other assertions are
enabled.
Note that it is fully supported to build NetworkManager with all kinds of
assertions disabled. However, such a configuration is not well tested
and I would not recommend it for that reason.
%meson expands to
$ /usr/bin/meson --buildtype=plain --prefix=/usr --libdir=/usr/lib64 --libexecdir=/usr/libexec --bindir=/usr/bin --sbindir=/usr/sbin --includedir=/usr/include --datadir=/usr/share --mandir=/usr/share/man --infodir=/usr/share/info --localedir=/usr/share/locale --sysconfdir=/etc --localstatedir=/var --sharedstatedir=/var/lib --wrap-mode=nodownload --auto-features=enabled -Db_ndebug=true . x86_64-redhat-linux-gnu $OTHER_ARGS
thus passing -DNDEBUG to the meson build. Override that.
(cherry picked from commit ef338667f8)
We have random failures to build on gitlab-ci. Something is wrong,
at least, eBPF is not working reliably. Disable it for now.
(cherry picked from commit 52ea426b81)
Enabling eBPF causes src/devices/tests/test-acd to fail:
strace: bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=8, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0}, 112) = -1 EPERM (Operation not permitted)
NetworkManager-Message: 10:07:04.404: <warn> [1554631624.4046] acd[0xa2b400,10]: couldn't init ACD for announcing addresses on interface 'nm-test-veth0': Operation not permitted
Interestingly, it does not always fail. It seems to depend on the kernel
used in the containerized test environments of gitlab-ci.
For now, just disable eBPF and use the fallback implementation.
(cherry picked from commit a5869d1b35)
And no longer use "fedora:latest". While "fedora:rawhide" names the very
latest branch (and we want to test that), for all proper releases we want
to name them explicitly.
(cherry picked from commit 2955d5e69a)
If NM fails to connect to teamd, it currently just sets the device
state to FAILED and waits for deactivate() to be called later. However,
the 5-second timeout on the teamd process start can fire in the meantime,
which trips the assertion "nm_device_is_activating (device)".
Clean up the device state when the connection to teamd fails.
https://bugzilla.redhat.com/show_bug.cgi?id=1697900
(cherry picked from commit c48698d747)
When we delete the runner.name property, the runner object itself gets
deleted if that was the only property, and @runner becomes invalid.
==13818== Invalid read of size 1
==13818== at 0x55EAF4: nm_streq (nm-macros-internal.h:869)
==13818== by 0x55EAF4: _json_team_normalize_defaults (nm-utils.c:5573)
==13818== by 0x566C89: _nm_utils_team_config_set (nm-utils.c:6057)
==13818== by 0x5498A6: _nm_utils_json_append_gvalue (nm-utils-private.h:228)
==13818== by 0x5498A6: set_property (nm-setting-team.c:1622)
==13818== Address 0x182a9330 is 0 bytes inside a block of size 13 free'd
==13818== at 0x4839A0C: free (vg_replace_malloc.c:530)
==13818== by 0x4857868: json_delete_string (value.c:763)
==13818== by 0x4857868: json_delete (value.c:975)
==13818== by 0x4851FA1: UnknownInlinedFun (jansson.h:129)
==13818== by 0x4851FA1: hashtable_do_del (hashtable.c:131)
==13818== by 0x4851FA1: hashtable_del (hashtable.c:289)
==13818== by 0x55DFDD: _json_del_object (nm-utils.c:5384)
==13818== by 0x55EA70: _json_delete_object_on_string_match (nm-utils.c:5532)
==13818== by 0x55EADB: _json_team_normalize_defaults (nm-utils.c:5549)
==13818== by 0x566C89: _nm_utils_team_config_set (nm-utils.c:6057)
==13818== by 0x5498A6: _nm_utils_json_append_gvalue (nm-utils-private.h:228)
==13818== by 0x5498A6: set_property (nm-setting-team.c:1622)
==13818== Block was alloc'd at
==13818== at 0x483880B: malloc (vg_replace_malloc.c:299)
==13818== by 0x4852E8C: lex_scan_string (load.c:389)
==13818== by 0x4852E8C: lex_scan (load.c:620)
==13818== by 0x4853458: parse_object (load.c:738)
==13818== by 0x4853458: parse_value (load.c:862)
==13818== by 0x4853466: parse_object (load.c:739)
==13818== by 0x4853466: parse_value (load.c:862)
==13818== by 0x4853655: parse_json.constprop.7 (load.c:899)
==13818== by 0x48537CF: json_loads (load.c:959)
==13818== by 0x566780: _nm_utils_team_config_set (nm-utils.c:5961)
==13818== by 0x5498A6: _nm_utils_json_append_gvalue (nm-utils-private.h:228)
==13818== by 0x5498A6: set_property (nm-setting-team.c:1622)
Fixes: a5642fd93a ('libnm-core: team: rework defaults management on runner properties')
(cherry picked from commit 80a3031a7c)
When nmcli needs secrets for a connection, it asks for them for every
known setting. nmtui is a bit smarter and asks only for settings that
actually exist in the connection. Go a step further and let clients
ask for secrets only for settings that exist *and* have a secret
property. This decreases the number of D-Bus calls when editing or
showing a connection with secrets.
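As a rough illustration of that filter, here is a minimal Python
sketch. The Setting class and settings_needing_secrets() are
hypothetical names made up for this example and are not part of nmcli
or libnm; only the setting and property names are real.

```python
from dataclasses import dataclass


@dataclass
class Setting:
    """Hypothetical stand-in for one setting of a connection profile."""
    name: str
    secret_properties: tuple = ()


def settings_needing_secrets(connection_settings):
    """Return the names of settings that exist in the connection
    and declare at least one secret property."""
    return [s.name for s in connection_settings if s.secret_properties]


# Only settings present in the profile are listed here, so the
# "exists" half of the condition is implied by the input.
connection = [
    Setting("connection"),
    Setting("802-11-wireless"),
    Setting("802-11-wireless-security", ("psk", "wep-key0")),
    Setting("ipv4"),
]

requests = settings_needing_secrets(connection)
print(requests)
```

In this model a client would issue one secrets request instead of
four for the profile above.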
https://bugzilla.redhat.com/show_bug.cgi?id=1506536
https://github.com/NetworkManager/NetworkManager/pull/327
(cherry picked from commit 5b5a768b69)
The 4th argument of AC_SEARCH_LIBS is a list of additional libraries,
not the name of the variable that holds the result, which is always
ac_cv_search_$function. Also, we should ignore the result when it is
"none required".
Fixes: 1f2eeb85d8 ('build: rename $(LIBDL) to $(DL_LIBS) and modify detection')
(cherry picked from commit bd4957fcd7)
Go straight to unmanaged. That's what all the other devices do when
their backing resources vanish. If the device reached disconnected
state, an autoconnect check would try to connect it back, in vain.
https://github.com/NetworkManager/NetworkManager/pull/324
(cherry picked from commit 045b88a5b5)
Open vSwitch is the special kid on the block -- it likes to be in charge of
the link lifetime and so we shouldn't be. This means that we shouldn't be
attempting to remove the link: we'd just (gracefully) fail anyways.
More importantly, this also means that we shouldn't care if we see the link
go away. Once the device reaches DISCONNECTED state, its configuration is
cleaned up and we may already be activating another connection. We shouldn't
alter the device state when Open vSwitch decides to drop the old link.
https://bugzilla.redhat.com/show_bug.cgi?id=1543557
https://github.com/NetworkManager/NetworkManager/pull/324
(cherry picked from commit 3a55ec63e1)
Fixes a crash on failed AddAndActivate:
$ ip link set eth0 down
$ nmcli d conn eth0
Error: Failed to add/activate new connection: Connection 'eth0' is not available on device eth0 because device has no carrier
<NetworkManager crashes>
#3 0x000055555558b6c5 in _nm_g_return_if_fail_warning
#4 0x00005555557008c7 in nm_settings_has_connection
#5 0x0000555555700e5f in pk_add_cb
#6 0x0000555555726e30 in pk_call_cb
#7 0x0000555555726e30 in pk_call_cb
#8 0x0000555555726e30 in pk_call_cb
#9 0x00005555555aaea8 in _call_id_invoke_callback
#10 0x00005555555ab2e8 in _call_on_idle
https://github.com/NetworkManager/NetworkManager/pull/325
(cherry picked from commit f034f17ff6)
If we surprise-remove the master, slaves would immediately attempt to bring
things up by autoconnecting. Not cool. Policy, however, blocks
autoconnect if the slaves disconnect due to "dependency-failed", and it
indeed seems to be an appropriate reason here:
$ nmcli c add type bridge
$ nmcli c add type dummy ifname dummy0 master bridge autoconnect yes
$ nmcli c del bridge
$
Before:
(nm-bridge): state change: ip-config -> deactivating (reason 'connection-removed')
(nm-bridge): state change: deactivating -> disconnected (reason 'connection-removed')
(nm-bridge): detached bridge port dummy0
(dummy0): state change: activated -> disconnected (reason 'connection-removed')
(nm-bridge): state change: disconnected -> unmanaged (reason 'user-requested')
(dummy0): state change: disconnected -> unmanaged (reason 'user-requested')
policy: auto-activating connection 'bridge-slave-dummy0'
After:
(nm-bridge): state change: ip-config -> deactivating (reason 'connection-removed')
(nm-bridge): state change: deactivating -> disconnected (reason 'connection-removed')
(nm-bridge): detached bridge port dummy0
(dummy0): state change: activated -> deactivating (reason 'dependency-failed')
(nm-bridge): state change: disconnected -> unmanaged (reason 'user-requested')
(dummy0): state change: deactivating -> disconnected (reason 'dependency-failed')
(dummy0): state change: disconnected -> unmanaged (reason 'user-requested')
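The policy rule relied on here can be sketched as follows. This is a
simplified Python model, not the daemon's code: may_autoconnect() and
BLOCKING_REASONS are hypothetical names, and the set only covers the
reasons that appear in the logs above.

```python
# Disconnect reasons after which policy does not retry autoconnect.
# Assumption: only "dependency-failed" is modeled, as described above.
BLOCKING_REASONS = {"dependency-failed"}


def may_autoconnect(disconnect_reason):
    """Return True if a port profile may be auto-activated again
    after its device disconnected for the given reason."""
    return disconnect_reason not in BLOCKING_REASONS


# Before the change the port disconnected with "connection-removed",
# so it was retried; afterwards it uses "dependency-failed".
print(may_autoconnect("connection-removed"))
print(may_autoconnect("dependency-failed"))
```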
https://github.com/NetworkManager/NetworkManager/pull/319
(cherry picked from commit 8f2a8a52f0)
The capture variables, $1, etc, are not valid unless the match
succeeded, and they're not cleared, either.
$ git checkout -B C origin/master && \
echo XXXXX > f.txt && \
git add f.txt && \
git commit -m 'this commit does something()'
Branch 'C' set up to track remote branch 'master' from 'origin'.
Reset branch 'C'
Your branch is up to date with 'origin/master'.
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `git log --abbrev=12 --pretty=format:"%h ('%s')" -1 does something() 2>/dev/null'
>>> VALIDATE "a169a98e14 this commit does something()"
(commit message):4: Commit 'does something()' does not seem to exist:
> Subject: [PATCH] this commit does something()
(commit message):4: Refer to the commit id properly: :
> Subject: [PATCH] this commit does something()
The patch does not validate.
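The underlying rule is to read a capture group only after checking
that the match succeeded. A minimal sketch of that guard in Python
follows; COMMIT_REF and referenced_commit() are hypothetical and do
not reproduce the patterns used by the actual checker script.

```python
import re

# Hypothetical pattern: a "commit <id>" reference inside a message line.
COMMIT_REF = re.compile(r"\bcommit ([0-9a-f]{7,40})\b")


def referenced_commit(line):
    """Return the commit id referenced in line, or None.

    The capture group is read only when the match succeeded, so a
    value left over from an earlier match can never leak through.
    """
    match = COMMIT_REF.search(line)
    if match is None:
        return None
    return match.group(1)


print(referenced_commit("(cherry picked from commit d66a1ace23)"))
print(referenced_commit("this commit does something()"))
```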
(cherry picked from commit d66a1ace23)
On Fedora rawhide we get the following build failure:
In file included from shared/systemd/src/basic/alloc-util.c:3:
./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:114:21: error: static declaration of 'gettid' follows non-static declaration
114 | static inline pid_t gettid(void) {
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from /usr/include/glib-2.0/gio/gcredentials.h:32,
from /usr/include/glib-2.0/gio/gio.h:46,
from ./shared/nm-utils/nm-macros-internal.h:31,
from ./shared/nm-default.h:293,
from ./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:22,
from shared/systemd/src/basic/alloc-util.c:3:
/usr/include/bits/unistd_ext.h:34:16: note: previous declaration of 'gettid' was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
glibc now provides a gettid() wrapper ([1]), which conflicts with our
compat implementation. Rename it.
[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=1d0fc213824eaa2a8f8c4385daaa698ee8fb7c92
(cherry picked from commit 10276322bd)
Using test-networkmanager-service.py, I get the following error when
trying to add a manual configuration with a DNS address:
Error: g-io-error-quark: Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/dbus/service.py", line 707, in _message_cb
retval = candidate_method(self, *args, **keywords)
File "tools/test-networkmanager-service.py", line 1727, in AddConnection
return self.add_connection(con_hash)
File "tools/test-networkmanager-service.py", line 1731, in add_connection
con_inst = Connection(self.c_counter, con_hash, do_verify_strict)
File "tools/test-networkmanager-service.py", line 1601, in __init__
NmUtil.con_hash_verify(con_hash, do_verify_strict=do_verify_strict)
File "tools/test-networkmanager-service.py", line 497, in con_hash_verify
BusErr.raise_nmerror(e)
File "tools/test-networkmanager-service.py", line 419, in raise_nmerror
raise e
Exception: Unsupported value ipv4.dns = dbus.Array([dbus.UInt32(168430090L), dbus.UInt32(218893066L)], signature=dbus.Signature('u'), variant_level=1) (Cannot convert array element to type 'u': Must be number, not Variant)
https://mail.gnome.org/archives/networkmanager-list/2019-March/msg00013.html
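For reference, the two UInt32 values in the exception decode to
ordinary IPv4 addresses. The sketch below assumes that each entry
holds the address bytes in network order, as stored in a uint32 by a
little-endian sender; the helper names are made up for this example.

```python
import socket
import struct


def dns_u32_to_str(value, byteorder="<"):
    """Decode one ipv4.dns entry into dotted-quad notation.

    Assumption: the sender is little-endian ("<"), so packing the
    integer little-endian recovers the network-order address bytes.
    """
    return socket.inet_ntoa(struct.pack(byteorder + "I", value))


def str_to_dns_u32(address, byteorder="<"):
    """Inverse of dns_u32_to_str() under the same assumption."""
    return struct.unpack(byteorder + "I", socket.inet_aton(address))[0]


decoded = [dns_u32_to_str(v) for v in (168430090, 218893066)]
print(decoded)
```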
(cherry picked from commit 9a71d7d273)
When the link goes down the kernel removes IPv6 addresses from the
interface. In update_ext_ip_config() we detect that addresses were
removed externally and drop them from various internal
configurations. Don't do that if the link is down so that those
addresses will be restored again on link up.
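The intended behavior can be modeled roughly like this. It is a
simplified Python sketch, not the logic of update_ext_ip_config():
update_tracked_addresses() and its parameters are hypothetical.

```python
def update_tracked_addresses(tracked, present_on_link, link_is_up):
    """Return the addresses to keep in the internal configuration.

    While the link is down, removals are attributed to the kernel
    flushing IPv6 addresses, so nothing is dropped. While the link
    is up, a missing address counts as removed externally.
    """
    if not link_is_up:
        return list(tracked)
    return [addr for addr in tracked if addr in present_on_link]


tracked = ["fe80::1/64", "2001:db8::1/64"]

# Link went down and the kernel removed everything: keep both.
kept_down = update_tracked_addresses(tracked, [], link_is_up=False)

# Link is up and one address is gone: drop only that one.
kept_up = update_tracked_addresses(
    tracked, ["2001:db8::1/64"], link_is_up=True)
print(kept_down, kept_up)
```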
(cherry picked from commit 505d2adbc2)
Add a new argument to nm_ip_config_* helpers to also ignore addresses
similarly to what we already do for routes. This will be used in the
next commit; no change in behavior here.
(cherry picked from commit 39b7257208)
We can detect false DAD failures if the link goes down. Don't try to
prevent them, but just reset the counter if the link goes down.
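A minimal sketch of that approach, assuming a simple failure counter
with a give-up threshold. DadTracker, its method names and the
max_failures value are hypothetical and only illustrate the reset.

```python
class DadTracker:
    """Hypothetical counter for duplicate address detection failures."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    def on_dad_failed(self):
        """Record a failure; return True once the limit is reached."""
        self.failures += 1
        return self.failures >= self.max_failures

    def on_link_changed(self, link_is_up):
        """Forget earlier failures when the link goes down, since
        they may have been caused by the link change itself."""
        if not link_is_up:
            self.failures = 0


tracker = DadTracker(max_failures=2)
tracker.on_dad_failed()                    # possibly a false failure
tracker.on_link_changed(link_is_up=False)  # counter is reset here
gave_up = tracker.on_dad_failed()          # counted from zero again
print(gave_up, tracker.failures)
```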
(cherry picked from commit 056470a4ba)
When the interface is down, DAD failures become irrelevant and we
shouldn't try to add a link-local address even if the configuration
contains other IPv6 addresses.
(cherry picked from commit 72385f363c)