All callers pass a C string literal as the user-data tag of NMAuthChain.
It makes little sense otherwise because you usually know which user data
you need in advance.
So don't bother with copying the string. Just reference the defacto
static string.
Rename nm_auth_chain_set_data() to nm_auth_chain_set_data_unsafe() to indicate
that the lifetime of the tag string is now the caller's responsibility.
The nm_auth_chain_set_data() macro now ensures that the tag argument is
a C string literal. In fact, all callers that we had did that already.
There are two similar cases where the caller attaches the requested
permission to the auth-chain. There the tag "perm" is used.
Rename for consistency.
Out of the 33 callers of nm_auth_chain_add_call(), the permission
argument is:
- 29 times a C string literal like NM_AUTH_PERMISSION_NETWORK_CONTROL.
- 3 times assign a string that is in fact a static string (it's just
not a string literal)
- only NMManager's device_auth_request_cb() passes a permission of
(possibly) non static origin. But it already duplicates the string
for it's own purposes and attaches it as user-data to the
NMAuthChain.
There really is no need to duplicate the string.
Replace nm_auth_chain_add_call() by a macro that ensures that the
permission string is a C literal.
Rename nm_auth_chain_add_call() to nm_auth_chain_add_call_unsafe() to
indicate that the lifetime of the permission argument is now the
responsibility of the caller.
We track the user-data in a linked list. Hence, when setting
a user data we would need to search the list whether the
tag already exists.
This has an overhead and makes set-data() O(n). But we really don't need
this. The NMAuthChain allows a simple way to attach user-data to the
request. We don't need a full-blown g_object_set_data() (which btw is
also just implemented as a linked list).
NMAuthChain tracks arbitrary user-data pointers for convenience of the
caller.
Previously, it would also track the permission request result as such
user-data. That is not optimal.
- for one, we should keep the namespaces for user data (tags) and
permissions separate. When the user requests a permission result with
nm_auth_chain_get_result() then clearly no user-data is requested.
- we already track permissions in a separate linked list. Previously, when the
auth-call completes, we would destroy the entry in the linked list for
requests and create an entry in the linked list for user-data.
Possibly this was done in the past, because the user-data list was
indexed by a hash table. So, in that case, we only would need to
lookup the permission results in the indexed user-data hash, but never
do any search of a linked list. That is no longer the case, because
in practice the number of tracked user-data/permissions is fixed and
small (at which point it's just more efficient to search a short
linked list).
- There is already distrinct API for getting user-data and permissions.
By keeping the lists separate we don't need to search the list for
entries of the wrong type (it's faster).
- also assert that nm_auth_chain_get_result() indeed asks for a result
that was previously requested. It makes no sense otherwise.
NMAuthChain usually requests several permissions at once. Hence, an error
argument in the overall callback does not make sense, because you
wouldn't know which request failed.
If at all, it could only mean that the overall request failed (like an
D-Bus failure communicating to D-Bus *for all permisssions*),
but we don't need to handle that specially. In fact, we don't really care
why permission was not granted, whether it's due to an error or legitimate
reasons.
The error in the callback was always set to %NULL. Remove it.
The number of user-datas attached to the NMAuthChain is generally fixed
and small.
For example, in current code impl_manager_get_permissions() will be the
instance that ends up with the largest number of data items. It
performs zero calls of nm_auth_chain_set_data() but 16 times
nm_auth_chain_add_call(). So currently the maximum number is 16.
With such a fixed, small number of elements it is expected to be more
efficient to just track the elements in a CList instead of a GHashTable.
- consistently name the ChainData variable "chain_data"
- return the ChainData element from _get_data(). This way it
also can be used by nm_auth_chain_steal_data(), which needs
the ChainData element.
NMAuthChain is not ref-counted.
You may call nm_auth_chain_destroy() once before the callback
gets invoked. This destroys the auth-chain instance right away.
You may call nm_auth_chain_destroy() once from inside the callback.
This basically has no effect but is allowed for convenince.
All this does is remembering that destroy was called and asserts that
destroy gets called at most once.
After the callback returns, the auth-chain will always be destroyed.
That means, generally there is no need to call nm_auth_chain_destroy()
from inside the callback.
Remove that code, and refactor some code to return early (where it makes
sense).
NMAuthChain is not really ref-counted, there is no need for that additional
complexity.
But it is graceful towards calling nm_auth_chain_destroy() from inside the
callback. The caller may do so.
But we don't need a "ref_count" to track that. Two flags suffice: one to
say whether destroy was called and one to indicate that we are in the
process of finishing (to delay deallocating the NMAuthChain struct).
We already had the "done" flag that we used to indicate that the chain
is finished. So, we just need one more flag instead.
This function was left as a reminder now for 9 years. Get rid of it.
Related: 8310593ce4 ('core: ignore authorization for sleep/wake requests (but restrict to root) (rh #638640)')
The boolean value is intended to indicate success. It would indicated
failure due to a bug.
Fixes: 297d4985ab ('core/dbus: rework D-Bus implementation to use lower layer GDBusConnection API'):
Consider the situation in which ipv4.method=auto and there is an
address configured. Also, the DHCP timeout is long and there is no
DHCP server. If the link is brought down temporarily, the prefix route
for the static address is lost and not restored by NM because we
reapply the IP configuration only when the IP state is DONE.
The same can happen also for IPv6, but in that case also static IPv6
addresses are lost.
We should always reapply the IP configuration when the link goes up.
In general, all fields of public NMPlatform* structs must be
plain/simple. Meaning: copying the struct must be possible without
caring about cloning/duplicating memory.
In other words, if there are fields which lifetime is limited,
then these fields cannot be inside the public part NMPlatform*.
That is why
- "NMPlatformLink.kind", "NMPlatformQdisc.kind", "NMPlatformTfilter.kind"
are set by platform code to an interned string (g_intern_string())
that has a static lifetime.
- the "ingress_qos_map" field is inside the ref-counted struct NMPObjectLnkVlan
and not NMPlatformLnkVlan. This field requires managing the lifetime
of the array and NMPlatformLnkVlan cannot provide that.
See also for example NMPClass.cmd_obj_copy() which can deep-copy an object.
But this is only suitable for fields in NMPObject*. The purpose of this
rule is that you always can safely copy a NMPlatform* struct without
worrying about the ownership and lifetime of the fields (the field's
lifetime is unlimited).
This rule and managing of resource lifetime is the main reason for the
NMPlatform*/NMPObject* split. NMPlatform* structs simply have no mechanism
for copying/releasing fields, that is why the NMPObject* counterpart exists
(which is ref-counted and has a copy and destructor function).
This is violated in tc_commit() for the "kind" strings. The lifetime
of these strings is tied to the setting instance.
We cannot intern the strings (because these are arbitrary strings
and interned strings are leaked indefinitely). We also cannot g_strdup()
the strings, because NMPlatform* is not supposed to own strings.
So, just add comments that warn about this ugliness.
The more correct solution would be to move the "kind" fields inside
NMPObjectQdisc and NMPObjectTfilter, but that is a lot of extra effort.
While nm_platform_link_get_ifindex() is documented to return 0 if the device
is not found, don't rely on it. Instead, check that a valid(!) ifindex was
returned, and only then set the ifindex. Otherwise leave it at zero. There
is of course no difference in practice, but we generally treat invalid ifindexes
as <= 0, so it's not immediately clear what nm_platform_link_get_ifindex()
returns to signal no device.
There is only one caller, hence it's simpler to see it all in one place.
I prefer this, because then I can read the code top to bottom and
see what's happening, without following helper functions.
Also, this way we can "reuse" the nla_put_failure label and assertion. Previously,
if the assertion was hit we would not rewind the buffer but continue
constructing the message (which is already borked). Not that it matters
too much, because this was on an "failed-assertion" code path.
Arguably, the structure is used inside a union with another (larger)
struct, hence no memory is saved.
In fact, it may well be slower performance wise to access a boolean bitfield
than a gboolean (int).
Still, boolean fields in structures should be bool:1 bitfields for
consistency.
Kernel calls the netlink attribute TCA_FQ_CODEL_MEMORY_LIMIT. Likewise,
iproute2 calls this "memory_limit".
Rename because TC parameters are inherrently tied to the kernel
implementation and we should use the familiar name.
iproute2 uses the special value ~0u to indicate not to set
TCA_FQ_CODEL_CE_THRESHOLD in RTM_NEWQDISC. When not explicitly
setting the value, kernel treats the threshold as disabled.
However note that 0xFFFFFFFFu is not an invalid threshold (as far as
kernel is concerned). Thus, we should not use that as value to indicate
that the value is unset. Note that iproute2 uses the special value ~0u
only internally thereby making it impossible to set the threshold to
0xFFFFFFFFu). But kernel does not have this limitation.
Maybe the cleanest way would be to add another field to NMPlatformQDisc:
guint32 ce_threshold;
bool ce_threshold_set:1;
that indicates whether the threshold is enable or not.
But note that kernel does:
static void codel_params_init(struct codel_params *params)
{
...
params->ce_threshold = CODEL_DISABLED_THRESHOLD;
static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt,
struct netlink_ext_ack *extack)
{
...
if (tb[TCA_FQ_CODEL_CE_THRESHOLD]) {
u64 val = nla_get_u32(tb[TCA_FQ_CODEL_CE_THRESHOLD]);
q->cparams.ce_threshold = (val * NSEC_PER_USEC) >> CODEL_SHIFT;
}
static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
{
...
if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD &&
nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD,
codel_time_to_us(q->cparams.ce_threshold)))
goto nla_put_failure;
This means, kernel internally uses the special value 0x83126E97u to indicate
that the threshold is disabled (WTF). That is because
(((guint64) 0x83126E97u) * NSEC_PER_USEC) >> CODEL_SHIFT == CODEL_DISABLED_THRESHOLD
So in kernel API this value is reserved (and has a special meaning
to indicate that the threshold is disabled). So, instead of adding a
ce_threshold_set flag, use the same value that kernel anyway uses.
The memory-limit is an unsigned integer. It is ugly (if not wrong) to compare unsigned
values with "-1". When comparing with the default value we must also use an u32 type.
Instead add a define NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET.
Note that like iproute2 we treat NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET
to indicate to not set TCA_FQ_CODEL_MEMORY_LIMIT in RTM_NEWQDISC. This
special value is entirely internal to NetworkManager (or iproute2) and
kernel will then choose a default memory limit (of 32MB). So setting
NM_PLATFORM_FQ_CODEL_MEMORY_LIMIT_UNSET means to leave it to kernel to
choose a value (which then chooses 32MB).
See kernel's net/sched/sch_fq_codel.c:
static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt,
struct netlink_ext_ack *extack)
{
...
q->memory_limit = 32 << 20; /* 32 MBytes */
static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt,
struct netlink_ext_ack *extack)
...
if (tb[TCA_FQ_CODEL_MEMORY_LIMIT])
q->memory_limit = min(1U << 31, nla_get_u32(tb[TCA_FQ_CODEL_MEMORY_LIMIT]));
Note that not having zero as default value is problematic. In fields like
"NMPlatformIP4Route.table_coerced" and "NMPlatformRoutingRule.suppress_prefixlen_inverse"
we avoid this problem by storing a coerced value in the structure so that zero is still
the default. We don't do that here for memory-limit, so the caller must always explicitly
set the value.
When using nm_utils_strbuf_*() API, the buffer gets always moved to the current
end. We must thus remember and return the original start of the buffer.
In practice, there is no difference when representing 0 or 1 as signed/unsigned 32
bit integer. But still use the correct type that also kernel uses.
Also, the implicit conversation from uint32 to bool was correct already.
Still, explicitly convert the uint32 value to boolean in _new_from_nl_qdisc().
It's no change in behavior.
"NM_CMP_FIELD (a, b, fq_codel.ecn == TRUE)" is quite a hack as it relies on
the implementation of the macro in a particular way. The problem is, that
NM_CMP_FIELD() uses typeof() which cannot be used with bitfields. So, the
nicer solution is to use NM_CMP_FIELD_UNSAFE() which exists exactly for bitfields
(it's "unsafe", because it evaluates arguments more than once as it avoids
the temporary variable with typeof()).
Same with nm_hash_update_vals() which uses typeof() to avoid evaluating
arguments more than once. But that again does not work with bitfields.
The "proper" way is to use NM_HASH_COMBINE_BOOLS().
I think initializing structs should (almost) be always done with designated
initializers, because otherwise it's easy to get the order wrong. The
problem is that otherwise the order of fields gets additional meaning
not only for the memory layout, but also for the code that initialize
the structs.
Add a macro NM_VARIANT_ATTRIBUTE_SPEC_DEFINE() that replaces the other
(duplicate) macros. This macro also gets it right to mark the struct as
const.
This actually allows the compiler/linker to mark the memory as read-only and any
modification will cause a segmentation fault.
I would also think that it allows the compiler to put the structure directly
beside the outer constant array (in which this pointer is embedded). That is good
locality-wise.
- g_ascii_strtoll() accepts leading spaces, but it leaves
the end pointer at the first space after the digit. That means,
we accepted "1: 0" but not "1 :0". We should either consistently
accept spaces around the digits/colon or reject it.
- g_ascii_strtoll() accepts "\v" as a space (just like `man 3 isspace`
comments that "\v" is a space in C and POSIX locale.
For some reasons (unknown to me) g_ascii_isspace() does not treat
"\v" as space. And neither does NM_ASCII_SPACES and
nm_str_skip_leading_spaces().
We should be consistent about what we consider spaces and what not.
It's already odd to accept '\n' as spaces here, but well, lets do
it for the sake of consistency (so that it matches with our
understanding of ASCII spaces, albeit not POSIX's).
- don't use bogus error domains in "g_set_error (error, 1, 0, ..."
That is a bug and we have NM_UTILS_ERROR exactly for error instances
with unspecified domain and code.
- as before, accept a trailing ":" with omitted minor number.
- reject all unexpected characters. strtoll() accepts '+' / '-'
and a "0x" prefix of the numbers (and leading POSIX spaces). Be
strict here and only accepts NM_ASCII_SPACES, ':', and hexdigits.
In particular, don't accept the "0x" prefix.
This parsing would be significantly simpler to implement, if we could
just strdup() the string, split the string at the colon delimiter and
use _nm_utils_ascii_str_to_int64() which gets leading/trailing spaces
right. But let's save the "overhead" of an additional alloc.
Only read the keyfile databases once and cache them for the remainder of
the program.
- this avoids the overhead of opening the file over and over again.
- it also avoids the data changing without us expecting it. The state
files are internal and we don't support changing it outside of
NetworkManager. So in the base case we read the same data over
and over. In the worst case, we read different data but are not
interested in handling the changes.
- only write the file when the content changes or before exiting
(normally).
- better log what is happening.
- our state files tend to grow as we don't garbage collect old entries.
Keeping this all in memory might be problematic. However, the right
solution for this is that we come up with some form of garbage
collection so that the state files are reaonsably small to begin with.
It will be used for "/var/lib/NetworkManager/seen-bssids" and
"/var/lib/NetworkManager/timestamps" which currently is implemented
in NMSettingConnection.
ethtool/mii API is based on the ifname. As an interface can be renamed,
this API is inherently racy. We would prefer to use the ifindex instead.
The ifindex of a device cannot change (altough it can repeat, which opens a
different race *sigh*).
Anyway, we were already trying to minimize the race be resolving the
name from ifindex immediately before the call to ethtool/mii.
Do better than that. Now resolve the name before and after the call. If
the name changed in the meantime, we have an indication that a race
might have happend (but we cannot be sure).
Note that this can not catch every possible kind of rename race. If you are very
unlucky a swapping of names cannot be detected.
For getters this is relatively straight forward. Just retry when we
have an indication to fall victim to a race (up to a few times). Yes, we
still cannot be 100% sure, but this should be very reliable in practice.
For setters (that modify the device) we also retry. We do so under the
assumption that setting the same options multiple times has no bad effect.
Note that for setters the race of swapping interface names is particularly
bad. If we hit a very unlucky race condition, we might set the setting on
the wrong interface and there is nothing we can do about it. The retry only
ensures that eventually we will set it on the right interface.
Note that this involves one more if_indextoname() call for each operation (in
the common case when there is no renaming race). In cases where we make
multiple ioctl calls, we cache and reuse the information though. So, for such
calls the overhead is even smaller.
When we set the MTU on the link we remember its previous source
(ip-config, parent-device or connection profile) and don't change it
again afterwards to avoid interfering with user's manual changes. The
only exceptions when we change it again are (1) if the parent device
MTU changes and (2) if the new MTU has higher priority than the one
previously set.
To allow a live reapply of the MTU property we also need to clear the
saved source, or the checks described above will prevent setting the
new value.
Fixes: 2f8917237f ('device: rework mtu priority handling')
https://bugzilla.redhat.com/show_bug.cgi?id=1702657