Commit graph

350 commits

Author SHA1 Message Date
Thomas Haller
eca8ebef18
platform: get extack_msg innm_platform_ip_route_sync()
Request the extack_msg for nm_platform_ip_route_add() call. Note that we (currently)
don't do anything with it, however requesting it has no downsides. That is, the
message already is heap allocated in the lower layers, so this only affects whether
it will be returned up to nm_platform_ip_route_sync().
2023-02-28 12:12:08 +01:00
Thomas Haller
d755b50808
platform: return extack message from add address/route operations 2023-02-28 12:08:07 +01:00
Thomas Haller
61388fd9c7
platform: drop logging for unexpected sequence number
It is not clear how that information is relevant. Since it is also
only logged when building with a non-default configure option, this
doesn't seem useful. Drop it.
2023-02-28 12:08:07 +01:00
Thomas Haller
bb9894abec
platform: minor cleanup of event_seq_check()
- unindent the code by "continue" the loop for the irrelevant case.
- fix indentation of comments.
- avoid unnecessary g_strdup() call if the extack message is NULL.
2023-02-28 12:08:06 +01:00
Thomas Haller
1d69b41db9
platform: log extack warning messages for netlink requests
The extack can also be returned on success. In that case,
they are warnings. Log them, it might be useful.
2023-02-28 12:08:06 +01:00
Thomas Haller
6ca537fa6a
platform: rename variables for extack message
Consistently name those variables and parameters "extack_msg".
The previous term "errmsg"/"msg" was not used consistently, and it
is also not clear what message this really is. For netlink, it
is well understood what Extended ACK means.
2023-02-28 12:08:06 +01:00
Thomas Haller
6f854ecaeb
platform/netlink: cleanup nla_strlcpy() to not wipe remaining buffer
strlcpy()/g_strlcpy() has a well understood behavior. nla_strlcpy()
did not behave like that. Instead, it also used to always wipe the
remainder of the string, similar to what strncpy() would do.

True, if we do

  nla_strlcpy(obj->link.name, tb[IFLA_IFNAME], IFNAMSIZ);

then we might want to clear the remainder and don't care about the
overhead of writing up to 14 bytes unnecessarily... However, actually
all callers of nla_strlcpy() either operate on a buffer that is already
pre-inialized with zero, or they really don't care about the
uninitialized memory after the string. So this was nowhere the desired
behavior.

Change nla_strlcpy() to not wipe the remainder of the buffer, so it behaves
mostly like strlcpy()/g_strlcpy() and as one would expect.

Add nla_strlcpy_wipe(), which on top of it also clears the remaining
buffer. In that aspect, it bears some similarities with strncpy(), but it
differs in other regards from strncpy (always NUL terminating and
returning the srclen). Yes, the name nla_strlcpy_wipe() is maybe
unfamiliar to the user, but it really is like nla_strlcpy() with the
addition to clear the buffer. That seems simple enough to understand
based on the name.

Note that all existing callers of nla_strlcpy() do not care about
clearing the memory, and the change in behavior is fine for them.
2023-02-28 12:08:06 +01:00
Thomas Haller
d73a5d692b
platform/netlink: assert for valid string in nla_get_string() 2023-02-28 12:08:06 +01:00
Etienne Champetier
0decc027ba
platform/trivial: fix route type name (unavailable -> unreachable)
Fixes: 766349879e ('platform/trivial: add code comments for NMPGlobalTracker')
2023-02-28 11:29:59 +01:00
Thomas Haller
6dafe78088
platform: ensure ext-data is of expected type
We just lookup the link info by ifindex. There is no guarantee that that
ifindex is of the expected type, to have a suitable ext-data. Check for
that.

Fixes: a7d2cad67e ('platform/linux: add support for WPAN links')
2023-02-24 10:16:08 +01:00
Fernando Fernandez Mancera
79611e4fcc platform: introduce function to globally track local route rule
The new function tracks local route rule in the GlobalTracker properly.
It also allow the developer to specify the untrack user tag.
2023-02-21 15:36:38 +01:00
Thomas Haller
4ccca2b5bd
platform,core: better handle onlink flag for ECMP routes
The onlink flag is part of each next hop.

When NetworkManager configures ECMP routes, we won't support that. All
next hops of an ECMP route must share the same onlink flag. That is fine
and fixed by this commit.

What is not fine, is that we don't track the rtnh_flags flags in
NMPlatformIP4RtNextHop, and consequently our nmp_object_id_cmp() is
wrong.

Fixes: 5b5ce42682 ('nm-netns: track ECMP routes')
(cherry picked from commit 6ed966258c)
2023-02-07 14:26:45 +01:00
Thomas Haller
09d5c4e22e
platform: fix handling the onlink route attribute for routes without gateway
For IPv6, kernel doesn't care. If the gateway is ::, you may or may
not set the onlink attribute. But for IPv4 routes, that gets rejected:

  # ip route add 1.2.3.4/32 dev v onlink
  Error: Invalid flags for nexthop - PERVASIVE and ONLINK can not be set.

Silently suppress setting the flag in that case and ignore the user
request. After all, the effect is probably the same (that is, the route
is onlink anyway).

(cherry picked from commit 8b14849877)
2023-02-07 14:26:44 +01:00
Thomas Haller
ae906e42da
platform: detect EINVAL as failure to set the MTU
Some drivers will reject an invalid MTU size with EINVAL.

Quote from [1]:

  While investigating, I did notice that do_change_link in
  nm-linux-platform.c really ought to count -EINVAL as an MTU out-of-range
  error and not just -ERANGE. Even if the hardware supports a large MTU,
  if the transmit FIFO is set too small, stmmac_change_mtu [2] will return
  -EINVAL. For example, on my device, the maxmtu is 9000 but in practice I
  can't set an MTU larger than 4096 unless I first run ethtool
  --set-channels eno1 tx 3.

[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1198#note_1738311
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c?h=v6.1#n5577

(cherry picked from commit 621b41ebfa)
2023-02-01 10:50:11 +01:00
Thomas Haller
5579fca916
platform: allow setting multi_idx instance for NMPlatform
The major point of NMDedupMultiIndex is that it can de-duplicate
the objects. It thus makes sense the everybody is using the same
instance. Make the multi-idx instance of NMPlatform configurable.

This is not used outside of unit tests, because the daemon currently
always creates one platform instance and everybody then re-uses the
instance of the platform.

While this is (currently) only used by tests, and that the performance
optimization of de-duplicating is irrelevant for tests, this is still
useful. The test can then check whether two separate NMPlatform objects
shared the same instance and whether it was de-duplicated.
2023-01-19 08:56:21 +01:00
Thomas Haller
2c22c96235
platform: add NMP_OBJECT_TYPE_NAME() macro 2023-01-19 08:56:21 +01:00
Thomas Haller
7752b2e059
platform: abort handling routes in _rtnl_handle_msg() when resync is required
There really is nothing left to do. Skip the rest and do a resync.
2023-01-19 08:56:21 +01:00
Thomas Haller
6fc0dc3fcb
platform: resync route cache upon NLM_F_REPLACE flag
There really is no way around this. As we don't cache all the routes
(e.g. ignored based on rtm_protocol or rtm_type), we cannot know which
route was replaced, when we get a NLM_F_REPLACE message.

We need to request a new dump in that case, which can be expensive, if
there are a lot of routes or if replace happens frequently.

The only possible solutions would be:

1) NetworkManager caches all routes, but it also needs to make sure to
  get *everything* right. In particular, to understand every relevant
  route attribute (including those added in the future, which is
  impossible).

2) kernel provides a reasonable API (rhbz#1337855, rhbz#1337860) that
  allows to sufficiently understand what is going on based on the
  netlink notifications.
2023-01-19 08:56:21 +01:00
Thomas Haller
4ec2123aa2
platform: parse routes of any type to handle replace
When you issue

  ip route replace broadcast 1.2.3.4/32 dev eth0

then this route may well replace a (unicast) route that we have in
the cache.

Previously, we would right away ignore such messages in
_new_from_nl_route(), which means we miss the fact that a route gets
replaced.

Instead, we need to parse the message at least so far, that we can
detect and handle the replace.
2023-01-19 08:56:21 +01:00
Thomas Haller
854f2cc1fc
platform: better handle ip route replace for ignored routes
We don't cache certain routes, for example based on the protocol. This is
a performance optimization to ignore routes that we usually don't care
about.

Still, if the user does `ip route replace` with such a route, then we
need to pass it to nmp_cache_update_netlink_route(), so that we can
properly remove the replaced route.

Knowing which route was replaces might be impossible, as our cache does
not contain all routes. Likely all that nmp_cache_update_netlink_route()
can to is to set "resync_required" for NLM_F_REPLACE. But for that it
should see the object first.

This also means, if we ever write a BPF filter to filter out messages
that contain NLM_F_REPLACE, because that would lead to cache inconsistencies.
2023-01-19 08:56:21 +01:00
Thomas Haller
c64053e6e6
platform: minor cleanup in nmp_cache_update_netlink_route()
It reads nicer. It will also work better with the change that follows.
2023-01-19 08:56:21 +01:00
Thomas Haller
a3cea7f6fb
platform: fix nmp_lookup_init_route_by_weak_id() to honor the route-table
The route table is part of the weak-id. You can see that with:

  ip route replace unicast 1.2.3.4/32 dev eth0 table 57
  ip route replace unicast 1.2.3.4/32 dev eth0 table 58

afterwards, `ip route show table all` will list both routes. The replace
operation is only per-table. Note that NMP_CACHE_ID_TYPE_ROUTES_BY_WEAK_ID
already got this right.

Fixes: 10ac675299 ('platform: add support for routing tables to platform cache')
2023-01-19 08:56:21 +01:00
Thomas Haller
0d458dbf07
platform: avoid printing raw pointer values in log 2023-01-19 08:56:21 +01:00
Lubomir Rintel
38d3834e2c merge: branch 'lr/nl-retry'
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1501
2023-01-17 19:25:51 +01:00
Thomas Haller
3cd02b6ed6
libnm,platform: fix range for "weight" property of next hops for routes
In kernel, the valid range for the weight is 1-256 (on netlink this is
expressed as u8 in rtnh_hops, ranging 0-255).

We need an additional value, to represent

- unset weight, for non-ECMP routes in kernel.

- in libnm API, to express routes that should not be merged as ECMP
  routes (the default).

Extend the type in NMPlatformIP4Route.weight to u16, and fix the code
for the special handling of the numeric range.

Also the libnm API needs to change. Modify the type of the attribute on
D-Bus from "b" to "u", to use a 32 bit integer. We use 32 bit, because
we already have common code to handle 32 bit unsigned integers, despite
only requiring 257 values. It seems better to stick to a few data types
(u32) instead of introducing more, only because the range is limited.

Co-Authored-By: Fernando Fernandez Mancera <ffmancera@riseup.net>

Fixes: 1bbdecf5e1 ('platform: manage ECMP routes')
2023-01-17 14:05:13 +01:00
Lubomir Rintel
b8738002ed platform: retry link change on RESULT_FAILED_RESYNC
Sometimes the buffer space of the netlink socket runs out and we lose
the response to our link change:

  <info>  [1670321010.2952] platform-linux: netlink[rtnl]: read: too many netlink events. Need to resynchronize platform cache
  <warn>  [1670321010.3467] platform-linux: do-change-link[2]: failure changing link: internal failure 3

With 3 above being WAIT_FOR_NL_RESPONSE_RESULT_FAILED_RESYNC.

Let's try harder.

https://bugzilla.redhat.com/show_bug.cgi?id=2154350
2023-01-16 12:52:40 +01:00
Lubomir Rintel
1e6fd1288d platform: log something nice about RESULT_FAILED_RESYNC
This is not nice:

  <warn>  [1670321010.3467] platform-linux: do-change-link[2]: failure changing link: internal failure 3

Let's explain what "internal failure 3" is.
2023-01-16 08:30:35 +01:00
Lubomir Rintel
ad659de3ba platform: remove log_result from do_change_link()
It conveys no useful information beyond what
wait_for_nl_response_to_string() returns.
2023-01-16 08:30:35 +01:00
Lubomir Rintel
3f6d040274 platform: don't negate lefthand argument in set comparison
This 1.) was ugly, 2.) makes it cumbersome to check for both positive
and negative elements in one go.
2023-01-16 08:30:35 +01:00
Beniamino Galvani
2883203df4 platform: fix NULL pointer dereference
src/libnm-platform/nmp-object.c:930: var_deref_op: Dereferencing null pointer "klass->cmd_plobj_to_string_id".

Fixes: 8feeb199ad ('platform: drop redundant hook implementations from NMPObject classes')
2022-12-22 11:34:09 +01:00
Beniamino Galvani
115102efe9 platform: fix build failures due to missing VTI definitions
Older kernel headers don't ship definitions for IFLA_VTI_*, redefine
them.

Fixes: 1cf8df2f35 ('platform: support VTI tunnels')
Fixes: b669a3ae46 ('platform: support VTI6 tunnels')
2022-12-22 09:57:28 +01:00
Beniamino Galvani
b669a3ae46 platform: support VTI6 tunnels 2022-12-21 14:04:44 +01:00
Beniamino Galvani
1cf8df2f35 platform: support VTI tunnels 2022-12-21 14:04:43 +01:00
Thomas Haller
2191e739ae
platform: fix "-Wcast-align" warning for NMPlatformQdisc cast 2022-12-16 10:55:04 +01:00
Thomas Haller
0b1177cb18
all: use _NM_G_TYPE_CHECK_INSTANCE_CAST() for internal uses
G_TYPE_CHECK_INSTANCE_CAST() can trigger a "-Wcast-align":

    src/core/devices/nm-device-macvlan.c: In function 'parent_changed_notify':
    /usr/include/glib-2.0/gobject/gtype.h:2421:42: error: cast increases required alignment of target type [-Werror=cast-align]
     2421 | #  define _G_TYPE_CIC(ip, gt, ct)       ((ct*) ip)
          |                                          ^
    /usr/include/glib-2.0/gobject/gtype.h:501:66: note: in expansion of macro '_G_TYPE_CIC'
      501 | #define G_TYPE_CHECK_INSTANCE_CAST(instance, g_type, c_type)    (_G_TYPE_CIC ((instance), (g_type), c_type))
          |                                                                  ^~~~~~~~~~~
    src/core/devices/nm-device-macvlan.h:13:6: note: in expansion of macro 'G_TYPE_CHECK_INSTANCE_CAST'
       13 |     (G_TYPE_CHECK_INSTANCE_CAST((obj), NM_TYPE_DEVICE_MACVLAN, NMDeviceMacvlan))
          |      ^~~~~~~~~~~~~~~~~~~~~~~~~~

Avoid that by using _NM_G_TYPE_CHECK_INSTANCE_CAST().

This can only be done for our internal usages. The public headers
of libnm are not changed.
2022-12-16 10:55:03 +01:00
Beniamino Galvani
cf11884a85 macsec: fix tracking of parent ifindex
For MACsec interfaces, kernel announces the parent ifindex in the
generic IFLA_LINK netlink attribute, which we save in
NMPlatformLink.parent. There is no need to have a dedicate member in
NMPlatformLnkMacsec.

The dedicate member was never set and during a restart of
NetworkManager the parent of the MACsec device could be unset leading
to a failed assertion:

  act_stage2_config: assertion 'parent' failed

Fixes: 85103656e9 ('platform: add support for macsec links')

https://bugzilla.redhat.com/show_bug.cgi?id=2122564
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1481
2022-12-15 16:30:29 +01:00
Beniamino Galvani
bd24e0b274 platform: support VLAN protocol
Add support for the "protocol" attribute of VLAN links.
2022-12-14 11:33:03 +01:00
Thomas Haller
052ed480a6
platform: fix "-Wcast-align" warning on i686 in nmp_object_ref()
With gcc-12.2.1-4.fc37 on i686 we get:

  ./src/libnm-platform/nmp-object.h: In function 'nmp_object_ref':
  ./src/libnm-platform/nmp-object.h:626:12: error: cast increases required alignment of target type [-Werror=cast-align]
    626 |     return (const NMPObject *) nm_dedup_multi_obj_ref((const NMDedupMultiObj *) obj);
        |            ^
  cc1: all warnings being treated as errors

Work around that be increasing the alignment of NMDedupMultiObj.
It has no downsides, because we usually put a NMDedupMultiObj in heap
allocated memory, which is already suitably aligned. Or we put it on
the stack, where wasting a few bytes for the alignment doesn't matter.
We basically never embed NMDedupMultiObj in an array where the increase
of alignment would waste additional space.
2022-12-14 09:46:33 +01:00
Thomas Haller
36f8de25c4
all: fix various "-Wcast-align=strict" warnings
The warning "-Wcast-align=strict" seems useful and will be enabled
next. Fix places that currently cause the warning by using the
new macro NM_CAST_ALIGN(). This macro also nm_assert()s that the alignment
is correct.
2022-12-09 09:15:56 +01:00
Thomas Haller
6996fa64b6
platform: ensure all NMPlatform* structs have same alignment
We put all these structs inside the tagged union NMPObject.
Also, in a sense NMPlatformObject is the base "type" of all
these structs, meaning, it should be able to up and downcast.

Ensure the alignment matches.

This helps to avoid "-Wcast-align" warnings when trying to cast
a (NMPlatformObject*) to another (NMPlatformXXX *) type. Something
we commonly do.
2022-12-09 09:15:54 +01:00
Thomas Haller
4ae5f7f76b
platform: move "struct _NMPlatformObject" to "nmp-plobj.h"
All our platform structs should move there. For now, just move
struct _NMPlatformObject because it will be needed there.
2022-12-09 09:15:54 +01:00
Wen Liang
e8618f03d7
support loopback interface
Support managing the loopback interface through NM as the users want to
set the proper mtu for loopback interface when forwarding the packets.
Additionally, the IP addresses, DNS, route and routing rules are also
allowed to configure for the loopback connection profiles.

https://bugzilla.redhat.com/show_bug.cgi?id=2060905
2022-11-23 20:51:22 +01:00
Thomas Haller
2afadee27f
platform: workaround build error in nm_platform_ip4_route_hash_update() with old clang
clang-3.4.2-9.el7 does not like nesting NM_MAX() macro inside nm_hash_update_vals() macro.
Workaround by using MAX() instead. NM_MAX() uses an expression statement and NM_UNIQ()
to evaluate the arguments only once. We don't need that here and glib's MAX() suffices.

    CC       src/libnm-platform/src_libnm_platform_libnm_platform_la-nm-platform.lo
  ../src/libnm-platform/nm-platform.c:8247:53: error: in-class initializer for static data member is not a constant expression
                                      (guint8) NM_MAX(obj->weight, 1u));
                                                      ^
  ../src/libnm-std-aux/nm-std-aux.h:399:40: note: expanded from macro 'NM_MAX'
  #define NM_MAX(a, b) __NM_MAX(NM_UNIQ, a, NM_UNIQ, b)
                                         ^
  ../src/libnm-std-aux/nm-std-aux.h:402:39: note: expanded from macro '__NM_MAX'
          typeof(a) NM_UNIQ_T(A, aq) = (a);                                              \
                                        ^
  ../src/libnm-glib-aux/nm-hash-utils.h:124:36: note: expanded from macro 'nm_hash_update_vals'
          NM_HASH_COMBINE_VALS(_val, __VA_ARGS__);      \
                                     ^

Fixes: 8cc41d41fe ('platform: add NM_PLATFORM_IP_ROUTE_CMP_TYPE_ECMP_ID for comparing ECMP base route')
2022-11-23 16:28:34 +01:00
Thomas Haller
3fb8c0f614
clang-format: reformat code with clang-format 15.0.4-1.fc37
This is the version shipped in Fedora 37. As Fedora 37 is now out, the
core developers switch to it. Our gitlab-ci will also use that as base
image for the check-{patch.tree} tests and to generate the pages. There
is a need that everybody agrees on which clang-format version to use,
and that version should be the one of the currently used Fedora release.

Also update the used Fedora image in "contrib/scripts/nm-code-format-container.sh"
script.

The gitlab-ci still needs update in the following commit. The change
in isolation will break the "check-tree" test.
2022-11-23 09:17:21 +01:00
Thomas Haller
48d7d1d78e
platform: drop inline cmp() wrappers around "full" versions
We sometimes have functions foo() and foo_full(), in which case
foo() has fewer arguments and just calls foo_full(). The "full"
function here is the more powerful one, and foo() is implemented
in terms of the former.

nm_platform_ip4_route_cmp_full() and m_platform_ip4_route_cmp() inverted
that pattern. The "_full" there stands for the full comparison, to not
allowing to select the comparison type.

That inconsistency is ugly. Also, these wrappers were used at only few
places. Let's drop them.

While at it, also drop nm_platform_qdisc_cmp() and rename
nm_platform_qdisc_cmp_full(). Here cmp()/cmp_full() followed the common
pattern foo()/foo_full(), but it's still hardly used and unnecessary.
2022-11-21 17:56:48 +01:00
Thomas Haller
8cc41d41fe platform: add NM_PLATFORM_IP_ROUTE_CMP_TYPE_ECMP_ID for comparing ECMP base route 2022-11-21 17:46:34 +01:00
Thomas Haller
9270bf611f platform: add nm_platform_ip4_route_hash() helper 2022-11-21 11:19:39 +01:00
Fernando Fernandez Mancera
151b2bed36 platform: pass extra_hops to ip_route_add function
When adding a new route we need to consider it contains extra nexthops
i.e it is a ECMP route. As we cannot modify the NMPObject once created,
we need to pass the extra nexthops as an argument.

We cannot use the original NMPObject because normalization is happening
during when adding the route.
2022-11-21 11:19:19 +01:00
Fernando Fernandez Mancera
1bbdecf5e1 platform: manage ECMP routes
When reading from netlink an ECMP IPv4 route, we need to parse the
multiple nexthops. In order to do that, we are introducing
NMPlatformIP4RtNextHop struct.

The first nexthop information will be kept at the original
NMPlatformIP4Route and the new property n_nexthops will indicate how
many nexthops we need to consider.
2022-11-21 11:18:03 +01:00
Thomas Haller
57b23c12cc
platform: only initialize actual data for stackinit NMPObject
The NMPObject is a tagged union. There is no need to initialize anything
after the size of the actually used union field. Change this, so maybe we
get a valgrind warning about uninitialized memory if we wrongly try to
access it.

On the other hand, the object really is supposed to be a full
NMPObject. Previously, we would get a valgrind warning, if we tried to
pass fewer data there.

It really doesn't matter much, but all other functions don't assume
that there is any important data after the size indicated by the class.
2022-11-08 12:57:24 +01:00