The function n_dhcp4_c_connection_send_request does not release or take
ownership of its request argument. Because of that, setting it to NULL
in the caller prevents the auto-cleanup of the variable to be executed,
causing a resource leak. Fix it.
Fixes: e23b3c9c3a ('Squashed 'shared/n-dhcp4/' content from commit fb1d43449')
Fixes: 243cc433fb ('n-dhcp4: add new client probe function to send RELEASE message')
Using `n_dhcp4_c_connection_start_request()` will cause staying in
`connection->request`, as a result, it will cause the resending of
DHCPRELEASE and DHCPDECLINE message, thus, use
`n_dhcp4_c_connection_send_request()` directly instead to avoid
unnecessary retransmission timeout, as suggested by
f030927a54 (r1531834009).
We should not send the DHCP release message when udp socket is still in
the PACKET state, this state is typically used during the discovery and
offer phases, where the client broadcasts DHCP packets like DHCPDISCOVER
and receives responses like DHCPOFFER. At this point, the client has no
lease because it has not yet completed the DHCP handshake.
Section 4.9 of RFC 4594 specifies that DHCP should use the standard
(CS0 = 0) service class. Section 3.2 says that class CS6 is for
"transmitting packets between network devices (routers) that require
control (routing) information to be exchanged between nodes", listing
"OSPF, BGP, ISIS, RIP" as examples of such traffic. Furthermore, it
says that:
User traffic is not allowed to use this service class. By user
traffic, we mean packet flows that originate from user-controlled
end points that are connected to the network.
Indeed, we got reports of some Cisco switches dropping DHCP packets
because of the CS6 marking.
For these reasons, change the default value to the recommended one,
CS0.
(cherry picked from commit d8b33e2a97)
The client currently always sets the DSCP value in the DS field
(formerly known as "TOS") to CS6. Some network equipment drops packets
with such DSCP value; provide a way to change it.
(cherry picked from commit 2f543f1154)
Sending the client-identifier (DHCP Option 61) is not mandatory,
although it's recommended, and there are some weird cases where
clients need not to send it.
Allow not to send it by leaving client_id unset.
Usually we want no difference between the upstream project that we fork
via git-subtree, and our copy. However, for the subprojects, we need to
patch them. Do it.
If you know a better way, that allows to overwriting the subprojects
please send a patch.
Previously, during decline we would clear probe->current_lease,
however leave the state at GRANTED.
That is a wrong state, and can easily lead to a crash later.
For example, on the next timeout we will end up at
n_dhcp4_client_dispatch_timer(), then current-lease gets
accessed unconditionally:
case N_DHCP4_CLIENT_PROBE_STATE_GRANTED:
if (ns_now >= probe->current_lease->lifetime) {
Instead, return to INIT state and schedule a timer. As suggested
by RFC 2131, section 3.1, 5) ([1]).
[1] https://datatracker.ietf.org/doc/html/rfc2131#section-3.1
The lease list and the probe's state are strongly related. That is
evidenced by the fact that sometimes we check the state and then
access probe->current_lease without further checking.
The code in "n-dhcp4-c-probe.c" (select_lease, accept, decline) already
changes and maintains the state, it should also maintain the lease list.
Move the code.
The caller is supposed to call accept/decline/select with the lease that
was just announced. Calling it in the wrong state or with the wrong
lease is a user error.
Return an error when called in the wrong state, so that the user
notices they did something wrong.
If we have a lease and we get a NAK renewing/rebinding it, the lease
is lost.
Without this, probe->current_lease remains set and after the next
DISCOVER/OFFER round, any call to n_dhcp4_client_lease_select() will
fail at:
if (lease->probe->current_lease)
return -ENOTRECOVERABLE;
As in:
[5325.1313] dhcp4 (veth0): send REQUEST of 172.25.1.200 to 255.255.255.255
[5325.1434] dhcp4 (veth0): received NACK from 172.25.1.1
[5325.1435] dhcp4 (veth0): client event 3 (RETRACTED)
[5325.1436] dhcp4 (veth0): send DISCOVER to 255.255.255.255
[5325.1641] dhcp4 (veth0): received OFFER of 172.25.1.200 from 172.25.1.1
[5325.1641] dhcp4 (veth0): client event (OFFER)
[5325.1641] dhcp4 (veth0): selecting lease failed: -131 (ENOTRECOVERABLE)
Upstream: https://github.com/nettools/n-dhcp4/pull/33
Upstream: e4af93228ehttps://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/993e43b1791a3 ('Merge commit 'e23b3c9c3ac86b065eef002fa5c4321cc4a87df2' as 'shared/n-dhcp4'')
Each connection object includes a 64KiB scratch buffer used for
receiving packets. When many instances of the client are created,
those buffers use a significant amount of memory. For example, 500
clients take ~30MiB of memory constantly reserved only for those
buffers.
Since the buffer is used only in the function and is never passed
outside, a stack allocation would suffice; however, it's not wise to
do such large allocations on the stack; dynamically allocate it.
https://github.com/nettools/n-dhcp4/issues/26https://github.com/nettools/n-dhcp4/pull/2764513e31c0
I got a report of a scenario where multiple servers reply to a REQUEST
in SELECTING, and all servers send NAKs except the one which sent the
offer, which replies with a ACK. In that scenario, n-dhcp4 is not able
to obtain a lease because it restarts from INIT as soon as the first
NAK is received. For comparison, dhclient can get a lease because it
ignores all NAKs in SELECTING.
Arguably, the network is misconfigured there, but it would be great if
n-dhcp4 could still work in such scenario.
According to RFC 2131, ACK and NAK messages from server must contain a
server-id option. The RFC doesn't explicitly say that the client
should check the option, but I think it's a reasonable thing to do, at
least for NAKs.
This patch stores the server-id of the REQUEST in SELECTING, and
compares it with the server-id from NAKs, to discard other servers'
replies.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1144
On RHEL-8.5, s390x with gcc-8.5.0-2.el8, we get a compiler warning:
$ CFLAGS='-O2 -Werror=maybe-uninitialized' meson build
...
cc -Isrc/libndhcp4-private.a.p -Isrc -I../src -Isubprojects/c-list/src -I../subprojects/c-list/src -Isubprojects/c-siphash/src -I../subprojects/c-siphash/src -Isubprojects/c-stdaux/src -I../subprojects/c-stdaux/src -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c11 -g -D_GNU_SOURCE -O2 -Werror=maybe-uninitialized -fPIC -fvisibility=hidden -fno-common -MD -MQ src/libndhcp4-private.a.p/n-dhcp4-c-connection.c.o -MF src/libndhcp4-private.a.p/n-dhcp4-c-connection.c.o.d -o src/libndhcp4-private.a.p/n-dhcp4-c-connection.c.o -c ../src/n-dhcp4-c-connection.c
../src/n-dhcp4-c-connection.c: In function ‘n_dhcp4_c_connection_dispatch_io’:
../src/n-dhcp4-c-connection.c:1151:17: error: ‘type’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
uint8_t type;
^~~~
https://github.com/nettools/n-dhcp4/pull/24