NMDevice's act_stage1_prepare() now does nothing. Calling it is not
useful and has no effect.
In general, when a subclass overwrites a virtual function, it must be
defined whether the subclass must, may or must-not call the parents
implementation. Likewise, it must be clear when the parents
implementation should be chained: first, as last, does it matter?
In any case, that very much depends on how the parent is implemented
and this can only be solved by documentation and common conventions.
It's a forgiving approach to have a parents implementation do nothing,
then the subclass may call it at any time (or not call it at all).
This is especially useful if classes don't know their parent class well.
But in NetworkManager code the relationship between classes are known
at compile time, so every of these classes knows it derives directly
from NMDevice.
This forgingin approach was what NMDevice's act_stage1_prepare() was doing.
However, it also adds lines of code resulting in a different kind of complexity.
So, it's not clear that this forgiving approach is really better. Note
that it also has a (tiny) runtime and code-size overhead.
Change the expectation of how NMDevice's act_stage1_prepare() should be
called: it is no longer implemented, and subclasses *MUST* not chain up.
Note that all subclasses of NMDevice that implement act_stage1_prepare(), call
the parents implementation as very first thing.
Previously, NMDevice's act_stage1_prepare() was setting up SR-IOV. But that is
problemantic. Note that it may have returned NM_ACT_STAGE_RETURN_POSTPONE, in which
case subclasses would just skip their action and return to the caller. Later,
sriov_params_cb() would directly call nm_device_activate_schedule_stage2_device_config(),
and thus act_stage1_prepare() was never executed for the subclass. That
is wrong.
First, I don't think it is good to let subclasses decide whether to call a
parents implementation (and when). It just causes ambiguity. In
the best case we do it always in the same order, in the worst case we
call the parent at the wrong time or never. Instead, we want to initialize
SR-IOV always and early in stage1, so we should just do it directly from
activate_stage1_device_prepare(). Now NMDevice's act_stage1_prepare() does
nothing.
The bigger problem is that when a device wants to resume a stage that
it previously postponed, that it would schedule the next stage!
Instead, it should schedule the same stage again instead. That allows
to postpone the completion of a stage for multiple reasons, and each
call to a certain stage merely notifies that something changed and
we need to check whether we can now complete the stage.
For that to work, stages must become re-entrant. That means we need to
remember whether an action that we care about needs to be started, is pending
or already completed.
Compare this to nm_device_activate_schedule_stage3_ip_config_start(),
which checks whether firewall is configured. That is likewise the wrong
approach. Callers that were in stage2 and postponed stage2, and later would
schedule stage3 when they are ready.
Then nm_device_activate_schedule_stage3_ip_config_start() would check whether
firewall is also ready, and do nothing if that's not the case (relying
that when the firewall code completes to call
nm_device_activate_schedule_stage3_ip_config_start().
- use a [2] array for IPv4/IPv6 variants and a IS_IPv4 variable,
like we do for other places that have similar implementations for
both address families.
- drop ActivationHandleData and use the fields directly. Also drop
activation_source_get_by_family().
- rename "act_handle*" field to "activation_source_*", to follow the
naming of the related accessor functions.
- downgrade the severity of some logging messages.
The only change in behavior is in act_stage1_prepare().
That changes compared to before that we also set the specific
object path if it was already set (and we looked up the AP by
specific object to start with).
Also, for existing APs that we found with nm_wifi_aps_find_first_compatible(),
it changes the order of calling set_current_ap() before nm_active_connection_set_specific_object().
That should not make a different though. I anyway wonder why we even bother to
set the specific object on the AC. Maybe that should be revisited.
act_stage1_prepare() should become re-entrant. That means, we should not clear the state
there. Instead, we clear it where necessary or on deactivate (which we do already).
NM_DEVICE_MODEM_GET_PRIVATE() is based on _NM_GET_PRIVATE(), which has
some smarts to check the pointer type, but is fine with well-known parent
pointer types like "NMDevice *".
If the @append_force argument is set and the object is already in the
list, it must be moved at the end.
Fixes: 22edeb5b69 ('core: track addresses for NMIP4Config/NMIP6Config via NMDedupMultiIndex')
Add test to show a wrong result of ip_ipX_config_replace() due to a
bug in _nm_ip_config_add_obj(). When an address is added to the tail
of the index and another address with the same id already exists, the
existing object is left at the same place, breaking the order of
addresses.
Nowadays, we should prefer "/run" over "/var/run". When not specifying
during ./configure, autotools however still defaults to "/var/run".
This default is also visible in the pre-generated documenation, for
example `man NetworkManager.conf` says
Unless the symlink points to the internal file /run/NetworkManager/resolv.conf,
in which case the ...
It's important whether a setting is present or not. Keyfile writer
omits properties that have a default value, that means, if the setting
has all-default values, it would be dropped. For [proxy] that doesn't
really matter, because we tend to normalize it back. For some settings
it matters:
$ nmcli connection add type bluetooth con-name bt autoconnect no bluetooth.type dun bluetooth.bdaddr aa:bb:cc:dd:ee:ff gsm.apn a
Connection 'bt' (652cabd8-d350-4246-a6f3-3dc17eeb028f) successfully added.
$ nmcli connection modify bt gsm.apn ''
When storing this to keyfile, the [gsm] section was dropped
(server-side) and we fail an nm_assert() (omitted from the example
output below).
<error> [1566732645.9845] BUG: failure to normalized profile that we just wrote to disk: bluetooth: 'dun' connection requires 'gsm' or 'cdma' setting
<trace> [1566732645.9846] keyfile: commit: "/etc/NetworkManager/system-connections/bt.nmconnection": profile 652cabd8-d350-4246-a6f3-3dc17eeb028f (bt) written
<trace> [1566732645.9846] settings: update[652cabd8-d350-4246-a6f3-3dc17eeb028f]: update-from-dbus: update profile "bt"
<trace> [1566732645.9849] settings: storage[652cabd8-d350-4246-a6f3-3dc17eeb028f,3e504752a4a78fb3/keyfile]: change event with connection "bt" (file "/etc/NetworkManager/system-connections/>
<trace> [1566732645.9849] settings: update[652cabd8-d350-4246-a6f3-3dc17eeb028f]: updating connection "bt" (3e504752a4a78fb3/keyfile)
<debug> [1566732645.9857] ++ connection 'update connection' (0x7f7918003340/NMSimpleConnection/"bluetooth" < 0x55e1c52480e0/NMSimpleConnection/"bluetooth") [/org/freedesktop/NetworkManager>
<debug> [1566732645.9857] ++ gsm [ 0x55e1c5276f80 < 0x55e1c53205f0 ]
<debug> [1566732645.9858] ++ gsm.apn < 'a'
Of course, after reload the connection on disk is no loner valid.
Keyfile writer wrote an invalid setting.
# nmcli connection reload
Logfile:
<warn> [1566732775.4920] keyfile: load: "/etc/NetworkManager/system-connections/bt.nmconnection": failed to load connection: invalid connection: bluetooth: 'dun' connection requires 'gsm' or 'cdma' setting
...
<trace> [1566732775.5432] settings: update[652cabd8-d350-4246-a6f3-3dc17eeb028f]: delete connection "bt" (3e504752a4a78fb3/keyfile)
<debug> [1566732775.5434] Deleting secrets for connection /org/freedesktop/NetworkManager/Settings (bt)
<trace> [1566732775.5436] dbus-object[9a402fbe14c8d975]: unexport: "/org/freedesktop/NetworkManager/Settings/55"
I thought I would need this, but ended up not using it.
Anyway, it makes sense in general that the function can lookup
all relevant information, so merge it.
First of all, keyfile writer (and reader) are supposed to be able to store
every profile to disk and re-read a valid profile back. Note that the profile
might be modified in the process, for example, blob certificates are written
to a file. So, the result might no be exactly the same, but it must still be
valid (and should only diverge in expected ways from the original, like mangled
certificates).
Previously, we would re-read the profile after writing to disk. If that failed,
we would only fail an assertion but otherwise proceeed. It is a bug
after all. However, it's bad to check only after writing to file,
because it results in a unreadable profile on disk, and in the first
moment it appears that noting went wrong. Instead, we should fail early.
Note that nms_keyfile_reader_from_keyfile() must entirely operate on the in-memory
representation of the keyfile. It must not actually access any files on disk. Hence,
moving this check before writing the profile must work. Otherwise, that would be
a separate bug. Actually, keyfile reader and writer violate this. I
added FIXME comments for that. But it doesn't interfere with this
patch.
NM didn't support wpa-none for years because kernel drivers used to be
broken. Note that it wasn't even possible to *add* a connection with
wpa-none because it was rejected in nm_settings_add_connection_dbus().
Given that wpa-none is also deprecated in wpa_supplicant and is
considered insecure, drop altogether any reference to it.
This file causes a crash [1], add it to the tests.
Note that the test only check parsing the file and the
crash happens in the "upper" layers. So, it's not really
a test for the crash. But at least have such a file in
our repository.
[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/235
There is really no reason for this to be a macro. Our hash-related
helpers (like nm_hash_update_val()) are macros because they do some
shenigans to accept arguments of different (compile-time) types. But
the arguments for nm_hash_obfuscate_ptr() are well known and expected
of a certain form.
Note that with "-O2" some quick testing shows that the compiler no
longer inlines the function. But I guess that's fine, probably the
compiler knows best anyway.
Logging pointer values might reveal information that can be used to defeat
ASLR. We should avoid that.
On the other hand, it's useful to tag a logging message with the pointer
value of the "source" of the message. It helps to correlate messages and
search for relevant messages in the log.
As a compromise, use NM_HASH_OBFUSCATE_PTR(), like we do at several places
already. For example, we also log
<debug> [1566550899.7901] setup NMPlatform singleton (29a6af9867f2e5d0)
This obfuscated value is a 64 bit unsigned integer with the siphash24
hash of the raw value with a randomized seed. Of course, contrary to the
pointer value, there is a tiny chance that two different pointers hash
to the same identifier. However, that seems unlikely enough to be of no
concern. Note that this pointer value is only logged to aid debugging.
It is sufficiently unlikely that this causes confusion.
One other downside of printed the obfuscated value, is that you can no
longer read the pointer from the log and use it in gdb directly. That
might be sometimes convenient, but making this impossible is kinda the
purpose of this change.
As such, nm_log_ptr() becomes a bit of a misnomer. But not too bad, it
still is a good name. For example, if we wanted we could redefine the
NM_HASH_OBFUSCATE_PTR* macros when building "--with-more-asserts".