Infiniband profiles can have a p-key set. Both in kernel API
("create_child" sysctl) and in NetworkManager API, that key can range
from 0x0001 to 0xFFFF (0x8000 excluded). NetworkManager does not support
renaming the interface, so kernel always assigns the interface name
"$PHYSDEV.$PKEY_ID" (with $PKEY_ID as 4 character hex digits).
Note that the highest bit in the p-key (0x8000) is the full-membership
flag. Internally, kernel only supports full-membership so when we create
for example "ib0.00c1" and "ib0.80c1" interfaces, their actually used
p-key is in both cases 0x80c1 and you can see it with `ip -d link`.
Nonetheless, kernel and NetworkManager allow to configure the p-key
without the highest bit set, and the result differs in the interface
name.
Note that initscripts' ifup-ib0 would always internally coerce the
PKEY_ID variable to have the high bit set ([1]). It also would require
that the `DEVICE=` variable is specified and matches the expected
interface name. So both these configurations are identical and valid:
DEVICE=ib0.80c1
PHYSDEV=ib0
PKEY_ID=0x80c1
and
DEVICE=ib0.80c1
PHYSDEV=ib0
PKEY_ID=0x00c1
Historically, NetworkManager would also implement the same restrictions
([2], [3], [4]). That meant, not all valid NetworkManager infiniband
profiles could be expressed as ifcfg file. For example, NetworkManager
allows to have "connection.interface-name" (`DEVICE=`) unset (which
ifup-ib and ifcfg reader did not allow). Also, NetworkManager would
allow configuring a "infiniband.p-key" without full membership flag, and
the reader would mangle that.
This caused various problems to the point that when you configure an
infiniband.p-key with a non-full-membership key, the ifcfg-rh written by
NetworkManager was invalid. Either, you could leave
"connection.interface-name" unset, but then the reader would complain
about missing `DEVICE=`. Or, we could write `DEVICE=ib0.00c1;
PKEY_ID=0x00c1`, which was invalid as we expected `DEVICE=ib0.80c1`.
This was addressed by rhbz 2122703 ([5]). The fix was to
- not require a `DEVICE=` ([6]).
- don't mangle the `PKEY_ID=` in the reader ([7]).
which happened in 1.41.2 and 1.40.2 (rhel-8.8).
With this change, we could persist any valid infiniband profile to ifcfg
format. We also could read back any valid ifcfg file that NetworkManager
would have written in the past (note that it could not write valid ifcfg
files previously, if the p-key didn't have the full-membership key set).
The problem is, that users were used to edit ifcfg files by hand, and
users would have files with:
DEVICE=ib0.80c1
PHYSDEV=ib0
PKEY_ID=0x00c1
This files had worked before, but now failed to verify as we would
expect `DEVICE=ib0.00c1`. Also, there was a change in behavior that
PKEY_ID is now interpreted without the high bit set. This is reported as
rhbz 2209164 ([8]).
We will do several things to fix that:
1) we now normalize the "connection.interface-name" to be valid. It was
not useful to set it anyway, as it was redundant. Complaining about a
redundant setting, which makes little sense to configure, is not useful.
This is done by [9].
2) we now again treat PKEY_ID= as if it had 0x8000 flag set. This was done by
[10].
With step 1) and 2), we are able to read any existing ifcfg files out
there in the way we did before 1.41.2.
There is however one piece missing. When we now create a profile using
nmcli/libnm/D-Bus, which has a non-full-membership p-key, then the
profile gets mangled in the process.
If the user uses NetworkManager API to configure an interface and
chooses a non-full-membership p-key, then this should work the same as
with keyfile plugin (or on rhel-9, where keyfile is the default). Note
that before 1.41.2 it didn't work at all, when the user used ifcfg-rh
backend. Likely(?) there are no users who rely on creating such a profile
with nmcli/libnm/D-Bus and expect to automatically have the p-key
normalized. That didn't work before 1.41.2 and didn't behave that way
between 1.41.2 and now.
This patch fixes that by introducing a new key PKEY_ID_NM= for holding
the real p-key. Now ifcfg backend is consistent with handling infiniband
profiles, and old, hand-written ifcfg files still work as before.
There is of course change in behavior, that ifcfg files between 1.41.2
and now were interpreted differently. But that is bug 2209164 ([8]) and
what we fix here.
For now strong reasons, we keep writing the PKEY_ID to file too. It's
redundant, but that is what a human might expect there.
[1] 05333c3602/f/rdma.ifup-ib (_75)
[2] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.40.0/src/core/settings/plugins/ifcfg-rh/nms-ifcfg-rh-reader.c#L5386
[3] cb5606cf1c (a7a78fccb2c8c945fd09038656ae734c1b0349ab_3493_3532)
[4] cb5606cf1c (a7a78fccb2c8c945fd09038656ae734c1b0349ab_3493_3506)
[5] https://bugzilla.redhat.com/show_bug.cgi?id=2122703
[6] 4c32dd9d25
[7] a4fe16a426
[8] https://bugzilla.redhat.com/show_bug.cgi?id=2209164
[9] 4610fd67e6
[10] f8e5e07355
(cherry picked from commit 5e3e38f291)
Add per port priority support for bond active port re-selection during
failover. A higher number means a higher priority in selection. The
primary port still has the highest priority. This option is only
compatible with active-backup, balance-tlb and balance-alb modes.
(cherry picked from commit 2f0571f193)
Users can set `no-aaaa` DNS option to suppress AAAA queries made by the
stub resolver, including AAAA lookups triggered by NSS-based interfaces
such as getaddrinfo. Only DNS lookups are affected.
(cherry picked from commit 9d4bbf78f0)
The password currently generated has ~48 bits of entropy; increase the
length from 8 to 12 to get ~70 bits. While at it, exclude characters
that look similar and might be entered wrongly by users.
(cherry picked from commit 231128d28d)
Since commit f18bf17dea ('wifi: cleanup
ensure_hotspot_frequency()'), NetworkManager automatically selects a
stable channel for AP connections that don't specify a fixed one. The
advantage of this approach is that NM can select a channel that works
well in the current regulatory domain.
However, nmcli still sets fixed channels 1 for 2.4GHz and 7 for 5GHz
when using the "device wifi hotspot". In particular, channel 7 on 5GHz
seems a bad choice because according to [1] it is not usable anywhere
in the world.
It seems difficult to select channel that works everywhere in the 5GHz
band, so it's better to not set a channel in the profile and let NM
find a usable one. For consistency, do the same also for the 2.4GHz
band even if the default choice (channel 1) should always work; by
letting NM choose a channel, different hotspot created with nmcli have
the chance of using different bands and not interfere with each other.
[1] https://en.wikipedia.org/wiki/List_of_WLAN_channels
(cherry picked from commit e446d2b632)
The generated password was all non-alphanumeric characters.
Fixes: 6e96d71731 ('all: use nm_random_*() instead of g_random_*()')
(cherry picked from commit ac2fb0e93d)
g_random_*() is based on GRand, which is not a CSPRNG. Instead, rely on
kernel to give us good random numbers, which is what nm_random_*() does.
Note that nm_random_*() calls getrandom() (or reads /dev/urandom), which
most likely is slower than GRand. It doesn't matter for our uses though.
It is cumbersome to review all uses of g_rand_*() whether their usage of
a non-cryptographically secure generator is appropriate. Instead, just
always use an appropriate function, thereby avoiding this question. Even
glib documentation refers to reading "/dev/urandom" as alternative. Which
is what nm_random_*() does. These days, it seems unnecessary to not use
the best random generator available, unless it's not fast enough or you
need a stable/seedable stream of random numbers.
In particular in nmcli, we used g_random_int_range() to generate
passwords. That is not appropriate. Sure, it's *only* for the hotspot,
but still.
(cherry picked from commit 6e96d71731)
This setting allows the user to remove the local route rule that is
autogenerated for both IPv4 and IPv6. By default, NetworkManager won't
touch the local route rule.
(cherry picked from commit d2ca44ffc6)
The main purpose is to simplify printf debugging and manual testing. We
can now trivially patch the code so that all output from nmcli gets
(additionally) written to a file. That is useful when debugging a unit
test in "test-client.py". Thereby we can duplicate all messages via
nm_utils_print(), which is in sync with the debug messages from libnm
and which honors LIBNM_CLIENT_DEBUG_FILE.
(cherry picked from commit b32e4c941a)
These will replace the direct calls to g_print()/g_printerr() in nmcli.
There are two purposes.
1) the new macros embody the concept of "printing something from nmcli".
It means, we can `git grep` for those functions, and find all the
relevant places, without hitting the irrelevant ones (e.g. tests that
also use g_print()).
2) by having one place, we can trivially change it. That is useful for
printf debugging. For example, "test-client.py" runs nmcli and
captures and compares the output. With libnm we can set
LIBNM_CLIENT_DEBUG and LIBNM_CLIENT_DEBUG_FILE to print libnm debug
messages to a file. But we cannot trivially synchronize the messages
from nmcli with that output (also because they are consumed by the test
and not immediately accessible). This would be easy, if we temporarily
could patch nmc_print*() to also log to nm_utils_print(). The new macros
will allow doing that at one place.
For example, patch the "#if 0" and run:
$ LIBNM_CLIENT_DEBUG=trace \
LIBNM_CLIENT_DEBUG_FILE='xxx.%p' \
NMTST_USE_VALGRIND=1 \
LIBTOOL="/bin/sh ./libtool"
./src/tests/client/test-client.sh -- -k monitor
(cherry picked from commit 4cf94f30c7)
nmc_print() will be used for something else. Rename. Also,
nmc_print_table() is the better name anyway because the function does a
lot of formatting and not simple printf().
(cherry picked from commit 4b2ded7a4a)
Since glib 2.45, we are guaranteed that g_free() just calls free(), so
both can be used interchangeably. However, we still only depend on glib
2.40.
In any case, it's ugly to mix the two. Memory allocated by plain
malloc(), should be only freed with free(). The buffer in question comes
from readline, which allocates it using the system allocator.
Fixes: 995229181c ('cli: remove editor thread')
(cherry picked from commit 5dc07174d3)
Such leaks show up in valgrind, and are simply bugs.
Also, various callers (not all of them, which is another bug!) like to
take ownership of the returned string and free it. That means, we leave
a dangling pointer in the global variable, which is very ugly and error
prone.
Also, the callers like to free the string with g_free(), which is not
appropriate for the "rl_string" memory which was allocated by readline.
It must be freed with free(). Avoid that, by cloning the string using
the glib allocator.
Fixes: 995229181c ('cli: remove editor thread')
(cherry picked from commit 89734c7553)
src/nmcli/devices.c:1196: double_free: Calling "_nm_auto_strfreev" frees pointer "arg_arr" which has already been freed.
Fixes: c5d45848dd ('cli: mark argv argument for command line parsing as const')
Introduce a "vlan.protocol" property that specifies the protocol of a
VLAN, which controls the tag (EtherType) used for encapsulation.
Regular VLANs use 802.1Q (tag 0x8100). To implement VLAN stacking it's
sometimes useful to have 802.1ad VLANs with tag 0x88A8.
The property is a string instead of e.g. an enum because this allows
maximum flexibility in the future. For example, it becomes possible to
specify an arbitrary number in case if the kernel ever allows it.
The warning "-Wcast-align=strict" seems useful and will be enabled
next. Fix places that currently cause the warning by using the
new macro NM_CAST_ALIGN(). This macro also nm_assert()s that the alignment
is correct.
GArray.data is a char pointer. Most of the time we track other data in
a GArray. Casting that pointer can trigger "-Wcast-align=strict"
warnings.
Avoid them. Most of the time, instead use the nm_g_array*() helpers,
which also assert that the expected element size is correct.
openvswitch accepts "dot1q-tunnel" as vlan mode:
A dot1q-tunnel port is somewhat like an access port. Like an
access port, it carries packets on the single VLAN specified
in the tag column and this VLAN, called the service VLAN,
does not appear in an 802.1Q header for packets that ingress
or egress on the port. The main difference lies in the be‐
havior when packets that include a 802.1Q header ingress on
the port. Whereas an access port drops such packets, a
dot1q-tunnel port treats these as double-tagged with the
outer service VLAN tag and the inner customer VLAN taken
from the 802.1Q header. Correspondingly, to egress on the
port, a packet outer VLAN (or only VLAN) must be tag, which
is removed before egress, which exposes the inner (customer)
VLAN if one is present.
Support this mode.
Add a new "ovs-port.trunks" property that indicates which VLANs are
trunked by the port.
At ovsdb level the property is just an array of integers; on the
command line, ovs-vsctl accepts ranges and expands them.
In NetworkManager the ovs-port setting stores the trunks directly as a
list of ranges.
Support managing the loopback interface through NM as the users want to
set the proper mtu for loopback interface when forwarding the packets.
Additionally, the IP addresses, DNS, route and routing rules are also
allowed to configure for the loopback connection profiles.
https://bugzilla.redhat.com/show_bug.cgi?id=2060905
This is the version shipped in Fedora 37. As Fedora 37 is now out, the
core developers switch to it. Our gitlab-ci will also use that as base
image for the check-{patch.tree} tests and to generate the pages. There
is a need that everybody agrees on which clang-format version to use,
and that version should be the one of the currently used Fedora release.
Also update the used Fedora image in "contrib/scripts/nm-code-format-container.sh"
script.
The gitlab-ci still needs update in the following commit. The change
in isolation will break the "check-tree" test.
Previously we'd note if NM is stopped, but not if it's running.
I suppose it's nice for the user to know that the monitor started
running, but, it's also important for the monitor to be testable (so
that we know that we are ready to start adding mock objects, etc.)
This also gets rids of some duplication at expense of a little less
nuanced message.
This is the better name, becuse this is not in particular about "docs".
It's about generating an XML with the information from the settings
meta data for nmcli.
We will do something similar with the libnm-core meta data.
It just feels nicer to be explicit about the filenames and
not rely on a specific naming.
Also, in meson we can directly pass the target as argument, which
expands to the filename but also adds a dependency.
It seems really ugly, to pass a callback function of wrong
signature. Granted, it probably works due to the C calling
convention, but it seems odd.
Use callbacks of the proper type instead. Then we also don'
need g_signal_connect_swapped().
While at it, rename. "connected_state_cb()" seems a bad name.
Unfortunately, for this we require SetLinkDNSEx() API from v246.
That adds extra complexity.
If the configuration contains no server name, we continue using
SetLinkDNS(). Otherwise, at first we try using SetLinkDNSEx().
We will notice if that method is unsupported, reconfigure with
SetLinkDNS(), and set a flag to not try that again.
The pager_fallback() runs in the forked child process.
As such, it can only use functions from `man signal-safety`
or that are explicitly allowed.
We are mostly good, but g_printerr() is not allowed. It can deadlock.
Just avoid it. It's not very to print those error messages anyway.
setenv() cannot be called after fork, because it might allocate memory,
which can deadlock.
Instead, prepare the environment and use execvpe().
`man 2 fork` says:
After a fork() in a multithreaded program, the child can safely call
only async-signal-safe functions (see signal-safety(7)) until such time
as it calls execve(2).
This means, we are quite strongly limited what can be done in the child
process, before exec. setenv() is not listed as async-signal-safe, obviously
because it allocates memory, and malloc() isn't async-signal-safe either.
See also glib's documentation of GSpawnChildSetupFunc ([1]) about what
can be done in the child process.
[1] 08cb200aec/glib/gspawn.h (L124)
Fix the following crash:
$ nmcli device monitor a
Error: Device 'a' not found.
Segmentation fault (core dumped)
Found by coverity:
1. NetworkManager-1.41.3/src/nmcli/devices.c:0: scope_hint: In function 'do_devices_monitor'
2. NetworkManager-1.41.3/src/nmcli/devices.c:2932:28: warning[-Wanalyzer-null-dereference]: dereference of NULL 'devices'
2930| }
2931|
2932|-> for (i = 0; i < devices->len; i++)
2933| device_watch(nmc, g_ptr_array_index(devices, i));
2934|
Fixes: 2074b28976 ('nmcli/devices: return GPtrArray instead of GSList from get_device_list()')