Commit graph

386 commits

Author SHA1 Message Date
Thomas Haller
ad06cc78dc platform: make nm_platform_kernel_support_get() macro an inline function
clang (3.4.2-9.el7) on CentOS 7.6 fails related to nm_hash_update_vals().

I am not even quoting the error message, it's totally non-understandable.

nm_hash_update_vals() uses typeof(), and in some obscure cases, clang dislikes
when the argument itself is some complex macro. I didn't fully understand why,
but this works around it.

I would prefer to fix nm_hash_update_vals() to not have this limitation.
But I don't know how.

There is probably no downside to have this an inline function instead of
a macro.
2019-05-29 09:42:40 +02:00
Beniamino Galvani
121c58f0c4 core: set number of SR-IOV VFs asynchronously
When changing the number of VFs the kernel can block for very long
time in the write() to sysfs, especially if autoprobe-drivers is
enabled. Turn the nm_platform_link_set_sriov_params() into an
asynchronous function.
2019-05-28 10:35:04 +02:00
Beniamino Galvani
abec66762a platform: add async sysctl set function
Add a function to asynchronously set sysctl values.
2019-05-28 10:34:53 +02:00
Beniamino Galvani
3ed23d405e platform: use 'self' argument name for platform functions
Uniform all functions to use 'self' as first argument.
2019-05-28 10:34:53 +02:00
Thomas Haller
f2ae994b23 device/trivial: add comment about lifetime of "kind" in tc_commit()
In general, all fields of public NMPlatform* structs must be
plain/simple. Meaning: copying the struct must be possible without
caring about cloning/duplicating memory.
In other words, if there are fields which lifetime is limited,
then these fields cannot be inside the public part NMPlatform*.

That is why

  - "NMPlatformLink.kind", "NMPlatformQdisc.kind", "NMPlatformTfilter.kind"
    are set by platform code to an interned string (g_intern_string())
    that has a static lifetime.

  - the "ingress_qos_map" field is inside the ref-counted struct NMPObjectLnkVlan
    and not NMPlatformLnkVlan. This field requires managing the lifetime
    of the array and NMPlatformLnkVlan cannot provide that.

See also for example NMPClass.cmd_obj_copy() which can deep-copy an object.
But this is only suitable for fields in NMPObject*. The purpose of this
rule is that you always can safely copy a NMPlatform* struct without
worrying about the ownership and lifetime of the fields (the field's
lifetime is unlimited).

This rule and managing of resource lifetime is the main reason for the
NMPlatform*/NMPObject* split. NMPlatform* structs simply have no mechanism
for copying/releasing fields, that is why the NMPObject* counterpart exists
(which is ref-counted and has a copy and destructor function).

This is violated in tc_commit() for the "kind" strings. The lifetime
of these strings is tied to the setting instance.

We cannot intern the strings (because these are arbitrary strings
and interned strings are leaked indefinitely). We also cannot g_strdup()
the strings, because NMPlatform* is not supposed to own strings.

So, just add comments that warn about this ugliness.

The more correct solution would be to move the "kind" fields inside
NMPObjectQdisc and NMPObjectTfilter, but that is a lot of extra effort.
2019-05-07 21:05:12 +02:00
Thomas Haller
36d6aa3bcd platform: use bool bitfields in NMPlatformActionMirred structure
Arguably, the structure is used inside a union with another (larger)
struct, hence no memory is saved.

In fact, it may well be slower performance wise to access a boolean bitfield
than a gboolean (int).

Still, boolean fields in structures should be bool:1 bitfields for
consistency.
2019-05-07 20:58:17 +02:00
Thomas Haller
666d58802b libnm: rename "memory" parameter of fq_codel QDisc to "memory_limit"
Kernel calls the netlink attribute TCA_FQ_CODEL_MEMORY_LIMIT. Likewise,
iproute2 calls this "memory_limit".

Rename because TC parameters are inherrently tied to the kernel
implementation and we should use the familiar name.
2019-05-07 20:58:17 +02:00
Thomas Haller
973db2d41b platform: fix handling of default value for TCA_FQ_CODEL_CE_THRESHOLD
iproute2 uses the special value ~0u to indicate not to set
TCA_FQ_CODEL_CE_THRESHOLD in RTM_NEWQDISC. When not explicitly
setting the value, kernel treats the threshold as disabled.

However note that 0xFFFFFFFFu is not an invalid threshold (as far as
kernel is concerned). Thus, we should not use that as value to indicate
that the value is unset. Note that iproute2 uses the special value ~0u
only internally thereby making it impossible to set the threshold to
0xFFFFFFFFu). But kernel does not have this limitation.

Maybe the cleanest way would be to add another field to NMPlatformQDisc:

    guint32 ce_threshold;
    bool ce_threshold_set:1;

that indicates whether the threshold is enable or not.
But note that kernel does:

    static void codel_params_init(struct codel_params *params)
    {
    ...
            params->ce_threshold = CODEL_DISABLED_THRESHOLD;

    static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt,
                               struct netlink_ext_ack *extack)
    {
    ...
            if (tb[TCA_FQ_CODEL_CE_THRESHOLD]) {
                    u64 val = nla_get_u32(tb[TCA_FQ_CODEL_CE_THRESHOLD]);

                    q->cparams.ce_threshold = (val * NSEC_PER_USEC) >> CODEL_SHIFT;
            }

    static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
    {
    ...
            if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD &&
                nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD,
                            codel_time_to_us(q->cparams.ce_threshold)))
                    goto nla_put_failure;

This means, kernel internally uses the special value 0x83126E97u to indicate
that the threshold is disabled (WTF). That is because

  (((guint64) 0x83126E97u) * NSEC_PER_USEC) >> CODEL_SHIFT == CODEL_DISABLED_THRESHOLD

So in kernel API this value is reserved (and has a special meaning
to indicate that the threshold is disabled). So, instead of adding a
ce_threshold_set flag, use the same value that kernel anyway uses.
2019-05-07 20:58:17 +02:00
Thomas Haller
46a904389b platform: fix handling of fq_codel's memory limit default value
The memory-limit is an unsigned integer. It is ugly (if not wrong) to compare unsigned
values with "-1". When comparing with the default value we must also use an u32 type.
Instead add a define NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET.

Note that like iproute2 we treat NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET
to indicate to not set TCA_FQ_CODEL_MEMORY_LIMIT in RTM_NEWQDISC. This
special value is entirely internal to NetworkManager (or iproute2) and
kernel will then choose a default memory limit (of 32MB). So setting
NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET means to leave it to kernel to
choose a value (which then chooses 32MB).

See kernel's net/sched/sch_fq_codel.c:

    static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt,
                             struct netlink_ext_ack *extack)
    {
    ...
            q->memory_limit = 32 << 20; /* 32 MBytes */

    static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt,
                               struct netlink_ext_ack *extack)
    ...
            if (tb[TCA_FQ_CODEL_MEMORY_LIMIT])
                    q->memory_limit = min(1U << 31, nla_get_u32(tb[TCA_FQ_CODEL_MEMORY_LIMIT]));

Note that not having zero as default value is problematic. In fields like
"NMPlatformIP4Route.table_coerced" and "NMPlatformRoutingRule.suppress_prefixlen_inverse"
we avoid this problem by storing a coerced value in the structure so that zero is still
the default. We don't do that here for memory-limit, so the caller must always explicitly
set the value.
2019-05-07 20:58:17 +02:00
Lubomir Rintel
900292147d tc/tfilter: add mirred action 2019-04-30 15:59:41 +02:00
Lubomir Rintel
1efe982e39 tc/qdisc: add support for fq_codel attributes 2019-04-30 15:59:41 +02:00
Thomas Haller
a9cf54c4ce platform: detect kernel support for FRA_L3MDEV
(cherry picked from commit eba4fd56f5)
2019-04-18 11:19:26 +02:00
Thomas Haller
ff686dd6c1 platform: detect kernel support for FRA_UID_RANGE
(cherry picked from commit 1dd1dcb81e)
2019-04-18 11:19:26 +02:00
Thomas Haller
4127583189 platform: detect kernel support for FRA_IP_PROTO, FRA_SPORT_RANGE, FRA_DPORT_RANGE
(cherry picked from commit 91252bb2fb)
2019-04-18 11:19:26 +02:00
Thomas Haller
6bfce3587e platform: detect kernel support for FRA_PROTOCOL
(cherry picked from commit cd62d43963)
2019-04-18 11:19:26 +02:00
Thomas Haller
bf36fa11d2 platform: refactor detecting kernel features
Next we will need to detect more kernel features. First refactor the
handling of these to require less code changes and be more efficient.
A plain nm_platform_kernel_support_get() only reqiures to access an
array in the common case.

The other important change is that the function no longer requires a
NMPlatform instance. This allows us to check kernel support from
anywhere. The only thing is that we require kernel support to be
initialized before calling this function. That means, an NMPlatform
instance must have detected support before.

(cherry picked from commit ee269b318e)
2019-04-18 11:19:26 +02:00
Beniamino Galvani
da204257b1 all: support bridge vlan ranges
In some cases it is convenient to specify ranges of bridge vlans, as
already supported by iproute2 and natively by kernel. With this commit
it becomes possible to add a range in this way:

 nmcli connection modify eth0-slave +bridge-port.vlans "100-200 untagged"

vlan ranges can't be PVIDs because only one PVID vlan can exist.

https://bugzilla.redhat.com/show_bug.cgi?id=1652910
(cherry picked from commit 7093515777)
2019-04-18 09:53:18 +02:00
Thomas Haller
f5e8bbc8e0 libnm,core: enable "onlink" flags also for IPv6 routes
Previously, onlink (RTNH_F_ONLINK) did not work for IPv6.
In the meantime, this works in kernel ([1], [2]). Enable it also
in NetworkManager.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fc1e64e1092f62290d59151d16f9de0210e303c8
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=68e2ffdeb5dbf54bc3a0684aa4e73c6db8675eed

https://github.com/NetworkManager/NetworkManager/pull/337
2019-04-10 09:02:35 +02:00
Beniamino Galvani
fafde171ea platform: add support for bridge vlans 2019-03-26 17:19:39 +01:00
Thomas Haller
9992ac1cf8 platform: add routing-rule add/delete netlink functions 2019-03-13 09:03:59 +01:00
Thomas Haller
9934a6a0e3 platform: add support for routing-rule objects and cache them in platform
Add and implement NMPlatformRoutingRule types and let the platform cache
handle rules.

Rules are special in two ways:

- they don't have an ifindex. That makes them different from all other
  currently existing NMPlatform* types, which have an "ifindex" field and
  "implement" NMPlatformObjWithIfindex.

- they have an address family, but contrary to addresses and routes, there
  is only one NMPlatformRoutingRule object to handle both address
  families.

Both of these points require some special considerations.

Kernel treats routing-rules quite similar to routes. That is, kernel
allows to add different rules/routes, as long as they differ in certain
fields. These "fields" make up the identity of the rules/routes. But
in practice, it's not defined which fields contribute to the identity
of these objects. That makes using the netlink API very hard. For
example, when kernel gains support for a new attribute which
NetworkManager does not know yet, then users can add two rules/routes
that look the same to NetworkManager. That can easily result in cache
inconsistencies.

Another problem is, that older kernel versions may not yet support all
fields, which NetworkManager (and newer kernels) considers for identity.
The older kernel will not simply reject netlink messages with these unknown
keys, instead it will proceed adding the route/rule without it. That means,
the added route/rule will have a different identity than what NetworkManager
intended to add.
2019-03-13 09:03:59 +01:00
Thomas Haller
7c5ad2d910 platform: drop unused nm_platform_refresh_all()
The function is unused. It would require redesign to work with
future changes, and since it's unused, just drop it.

The long reasoning is:

    Currently, a refresh-all is tied to an NMPObjectType. However, with
    NMPObjectRoutingRule (for policy-routing-rules) that will no longer
    be the case.

    That is because NMPObjectRoutingRule will be one object type for
    AF_INET and AF_INET6. Contrary to IPv4 addresses and routes, where
    there are two sets of NMPObject types.

    The reason is, that it's preferable to treat IPv4 and IPv6 objects
    similarly, that is: as the same type with an address family property.

    That also follows netlink, which uses RTM_GET* messages for both
    address families, and the address family is expressed inside the
    message.

    But then an API like nm_platform_refresh_all() makes little sense,
    it would require at least an addr_family argument. But since the
    API is unused, just drop it.
2019-03-13 09:03:59 +01:00
Thomas Haller
ac4a1deba0 platform: add NMPlatformObjWithIfindex helper structure for handling NMPObject types
Until now, all implemented NMPObject types have an ifindex field (from
links, addresses, routes, qdisc to tfilter).

The NMPObject structure contains a union of all available types, that
makes it easier to down-case from an NMPObject pointer to the actual
content.

The "object" field of NMPObject of type NMPlatformObject is the lowest
common denominator.

We will add NMPlatformRoutingRules (for policy routing rules). That type
won't have an ifindex field.

Hence, drop the "ifindex" field from NMPlatformObject type. But also add
a new type NMPlatformObjWithIfindex, that can represent all types that
have an ifindex.
2019-03-13 09:03:59 +01:00
Thomas Haller
7259195a91 platform: unify IPv4 and IPv6 variables for NMPlatformVTableRoute 2019-03-13 09:03:59 +01:00
Thomas Haller
153b41fa97 platform: add peer_flags argument to nm_platform_link_wireguard_change() 2019-02-14 08:00:29 +01:00
Thomas Haller
1e1b03c089 platform: add flags for setting individual WireGuard options of link 2019-02-14 08:00:29 +01:00
Thomas Haller
2ed01e2e34 platform: add change-flags argument to platform's link_wireguard_change()
We will need more flags.

WireGuard internal tools solve this by embedding the change flags inside
the structure that corresponds to NMPlatformLnkWireGuard. We don't do
that, NMPlatformLnkWireGuard is only for containing the information about
the link.
2019-02-14 08:00:29 +01:00
Thomas Haller
737ab51472 all: include "nm-utils/nm-errno.h" via "nm-default.h" 2019-02-12 08:50:28 +01:00
Thomas Haller
6f8c7b580d platform: add @replace_peers argument to nm_platform_link_wireguard_change()
The caller may not wish to replace existing peers, but only update/add
the peers explicitly passed to nm_platform_link_wireguard_change().

I think that is in particular interesting, because for the most part
NetworkManager will configure the same set of peers over and over again
(whenever we resolve the DNS name of an IP endpoint of the WireGuard
peer).

At that point, it seems disruptive to drop all peers and re-add them
again. Setting @replace_peers to %FALSE allows to only update/add.
2019-01-22 16:30:23 +01:00
Thomas Haller
3263cab596 all: add static assertion for maximumg alloca() allocated buffer
Add a compile time check that the buffer that we allocate on the stack
is reasonably small.
2019-01-15 09:52:01 +01:00
Thomas Haller
a5c894c35f platform: create wireguard netdev interface
The netlink code for WG_CMD_SET_DEVICE is strongly inspired by
WireGuard ([1]) and systemd ([2]).

Currently, nm_platform_link_wireguard_change() always aims to reset
all peers and allowed-ips settings. I think that should be improved
in the future, to support only partial updates.

[1] https://git.zx2c4.com/WireGuard/tree/contrib/examples/embeddable-wg-library/wireguard.c?id=5e99a6d43fe2351adf36c786f5ea2086a8fe7ab8#n1073
[2] 04ca4d191b/src/network/netdev/wireguard.c (L48)
2019-01-09 16:46:41 +01:00
Thomas Haller
691e5d5cc9 platform: return platform-error from link-add function
We need more information what failed. Don't only return success/failure,
but an error number.

Note that we still don't actually return an error number. Only
the link_add() function is changed to return an nm-error integer.
2018-12-27 21:33:59 +01:00
Thomas Haller
d18f40320d platform: merge NMPlatformError with nm-error
Platform had it's own scheme for reporting errors: NMPlatformError.
Before, NMPlatformError indicated success via zero, negative integer
values are numbers from <errno.h>, and positive integer values are
platform specific codes. This changes now according to nm-error:
success is still zero. Negative values indicate a failure, where the
numeric value is either from <errno.h> or one of our error codes.
The meaning of positive values depends on the functions. Most functions
can only report an error reason (negative) and success (zero). For such
functions, positive values should never be returned (but the caller
should anticipate them).
For some functions, positive values could mean additional information
(but still success). That depends.

This is also what systemd does, except that systemd only returns
(negative) integers from <errno.h>, while we merge our own error codes
into the range of <errno.h>.

The advantage is to get rid of one way how to signal errors. The other
advantage is, that these error codes are compatible with all other
nm-errno values. For example, previously negative values indicated error
codes from <errno.h>, but it did not entail error codes from netlink.
2018-12-27 21:33:59 +01:00
Thomas Haller
91b5babff2 core/trivial: rename nm_platform_sysctl_set_ip6_hop_limit_safe()
Now that we have other helper function on platfrom for setting
IP configuration sysctls, rename the function to set the hop-limit
to match the pattern.
2018-12-19 09:05:12 +01:00
Thomas Haller
7fa398d596 platform: add nm_platform_sysctl_ip_conf_*() wrappers 2018-12-19 09:05:12 +01:00
Thomas Haller
67f02b2a14 platform: assert length of stack allocation in NMP_SYSCTL_PATHID_NETDIR_unsafe()
NMP_SYSCTL_PATHID_NETDIR_unsafe() uses alloca() to allocate the string.
Assert that the "path" argument is reasonably short.

In practice, that is of course the case, because there are only 2 callers
which take care not to pass an untrusted, unbounded path argument.
2018-12-19 08:56:51 +01:00
Thomas Haller
b445b1f8fe platform: add nm_platform_link_get_ifi_flags() helper
Add helper nm_platform_link_get_ifi_flags() to access the
ifi-flags.

This replaces the internal API _link_get_flags() and makes it public.
However, the return value also allows to distinguish between errors
and valid flags.

Also, consider non-visible links. These are links that are in netlink,
but not visible in udev. The ifi-flags are inherrently netlink specific,
so it seems wrong to pretend that the link doesn't exist.
2018-11-29 13:50:10 +01:00
Thomas Haller
37e47fbdab build: avoid header conflict for <linux/if.h> and <net/if.h> with "nm-platform.h"
In the past, the headers "linux/if.h" and "net/if.h" were incompatible.
That means, we can either include one or the other, but not both.
This is fixed in the meantime, however the issue still exists when
building against older kernel/glibc.

That means, including one of these headers from a header file
is problematic. In particular if it's a header like "nm-platform.h",
which itself is dragged in by many other headers.

Avoid that by not including these headers from "platform.h", but instead
from the source files where needed (or possibly from less popular header
files).

Currently there is no problem. However, this allows an unknowing user to
include <net/if.h> at the same time with "nm-platform.h", which is easy
to get wrong.
2018-11-12 16:02:35 +01:00
Lubomir Rintel
0573656eeb platform/wpan: allow setting channel 2018-10-07 15:46:02 +02:00
luz.paz
f985b6944a docs: misc. typos
Found via `codespell -q 3 --skip="*.po"`

https://github.com/NetworkManager/NetworkManager/pull/203
2018-09-15 09:08:03 +02:00
Thomas Haller
62d14e1884 platform/wireguard: rework parsing wireguard links in platform
- previously, parsing wireguard genl data resulted in memory corruption:

  - _wireguard_update_from_allowedips_nla() takes pointers to

      allowedip = &g_array_index (buf->allowedips, NMWireGuardAllowedIP, buf->allowedips->len - 1);

    but resizing the GArray will invalidate this pointer. This happens
    when there are multiple allowed-ips to parse.

  - there was some confusion who owned the allowedips pointers.
    _wireguard_peers_cpy() and _vt_cmd_obj_dispose_lnk_wireguard()
    assumed each peer owned their own chunk, but _wireguard_get_link_properties()
    would not duplicate the memory properly.

- rework memory handling for allowed_ips. Now, the NMPObjectLnkWireGuard
  keeps a pointer _allowed_ips_buf. This buffer contains the instances for
  all peers.
  The parsing of the netlink message is the complicated part, because
  we don't know upfront how many peers/allowed-ips we receive. During
  construction, the tracking of peers/allowed-ips is complicated,
  via a CList/GArray. At the end of that, we prettify the data
  representation and put everything into two buffers. That is more
  efficient and simpler for user afterwards. This moves complexity
  to the way how the object is created, vs. how it is used later.

- ensure that we nm_explicit_bzero() private-key and preshared-key. However,
  that only works to a certain point, because our netlink library does not
  ensure that no data is leaked.

- don't use a "struct sockaddr" union for the peer's endpoint. Instead,
  use a combintation of endpoint_family, endpoint_port, and
  endpoint_addr.

- a lot of refactoring.
2018-09-07 11:24:17 +02:00
Thomas Haller
39efc65096 platform: drop unused virtual function NMPlatformClass.wifi_get_ssid() 2018-08-22 10:49:34 +02:00
Thomas Haller
c085b6e3a7 platform/ethtool: add code to get/set offload features via ethtool
Also, add two more features "tx-tcp-segmentation" and
"tx-tcp6-segmentation". There are two reasons for that:

 - systemd-networkd supports setting these two features,
   so lets support them too (apparently they are important
   enough for networkd).

 - these two features are already implicitly covered by "tso".
   Like for the "ethtool" program, "tso" is an alias for several
   actual features. By adding two features that are already
   also covered by an alias (which sets multiple kernel names
   at once), we showcase how aliases for the same feature can
   coexist. In particular, note how setting
   "tso on tx-tcp6-segmentation off" will behave as one would
   expect: all 4 tso features covered by the alias are enabled,
   except that particular one.
2018-08-10 10:38:19 +02:00
Javier Arteaga
edd5cf1a3c platform: rename instances of Wireguard to WireGuard
Respect WireGuard canonical capitalization on identifiers.
As per discussion on:
https://github.com/NetworkManager/NetworkManager/pull/162
2018-08-06 08:34:27 +02:00
Beniamino Galvani
8720dd3df1 platform: add support for changing VF attributes 2018-07-11 16:16:22 +02:00
Beniamino Galvani
7df3333879 platform: allow setting drivers-autoprobe on SR-IOV PFs
It is possible to tell kernel not to automatically autoprobe drivers
for VFs. This is useful, for example, if the VF must be used by a VM.
2018-07-11 16:16:22 +02:00
Lubomir Rintel
79ddef403c merge: branch 'wireguard-platform' of https://github.com/jbeta/NetworkManager
https://github.com/NetworkManager/NetworkManager/pull/143
2018-07-09 11:08:12 +02:00
Beniamino Galvani
09a868a24e platform: add ip6gre/ip6gretap tunnels support
Add platform support for IP6GRE and IP6GRETAP tunnels. The former is a
virtual tunnel interface for GRE over IPv6 and the latter is the L2
variant.

The platform code internally reuses and extends the same structure
used by IPv6 tunnels.
2018-07-02 17:55:14 +02:00
Beniamino Galvani
4c2862b958 platform: add gretap tunnels support
Add platform support for GRETAP tunnels (Virtual L2 tunnel interface
GRE over IPv4) partially reusing the existing GRE code.
2018-07-02 17:55:14 +02:00
Javier Arteaga
0827d4c2e4 platform: add support for WireGuard links
Add support for a new wireguard link type to the platform code. For now
this only covers querying existing links via genetlink and parsing them
into platform objects.
2018-07-01 14:52:46 +02:00