Create a NMRouteManager singleton.
Refactor, no functional changes apart from change of log domain from
LOGD_PLATFORM to LOGD_CORE.
Subsequent commit will keep track of the conflicting routes, avoid overwriting
older ones with newer ones and apply the new ones when the old ones go away.
==1345== Invalid read of size 1
==1345== at 0x827DC15: vfprintf (vfprintf.c:1642)
==1345== by 0x8345D04: __vasprintf_chk (vasprintf_chk.c:66)
==1345== by 0x7F882DB: vasprintf (stdio2.h:210)
==1345== by 0x7F882DB: g_vasprintf (gprintf.c:316)
==1345== by 0x7F6319C: g_strdup_vprintf (gstrfuncs.c:507)
==1345== by 0x7F63258: g_strdup_printf (gstrfuncs.c:533)
==1345== by 0x472833: nm_platform_link_to_string (nm-platform.c:2337)
==1345== by 0x472A05: log_link (nm-platform.c:2754)
==1345== by 0x9DC5D5F: ffi_call_unix64 (unix64.S:76)
==1345== by 0x9DC57D0: ffi_call (ffi64.c:525)
==1345== by 0x7CBA553: g_cclosure_marshal_generic (gclosure.c:1448)
==1345== by 0x7CB9D34: g_closure_invoke (gclosure.c:768)
==1345== by 0x7CCB34B: signal_emit_unlocked_R (gsignal.c:3483)
==1345== Address 0xa91b5a0 is 0 bytes inside a block of size 5 free'd
==1345== at 0x4C2ACE9: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1345== by 0x68E7D6D: link_free_data (link.c:223)
==1345== by 0x6D47B1F: nl_object_free (object.c:186)
==1345== by 0x46C31C: put_nl_object (nm-linux-platform.c:222)
==1345== by 0x46C31C: link_change (nm-linux-platform.c:2354)
==1345== by 0x46C87F: link_set_user_ipv6ll_enabled (nm-linux-platform.c:2583)
==1345== by 0x4476C4: set_nm_ipv6ll (nm-device.c:4418)
==1345== by 0x4476C4: ip6_managed_setup (nm-device.c:7515)
==1345== by 0x453F12: _set_state_full (nm-device.c:7665)
==1345== by 0x4B6609: add_device (nm-manager.c:1885)
==1345== by 0x4B6880: system_create_virtual_device (nm-manager.c:1126)
==1345== by 0x4B6B40: system_create_virtual_devices (nm-manager.c:1163)
==1345== by 0x4B6E00: platform_link_added (nm-manager.c:2213)
==1345== by 0x4B6E00: platform_link_cb (nm-manager.c:2228)
==1345== by 0x9DC5D5F: ffi_call_unix64 (unix64.S:76)
Testing WWAN connections through a Nokia Series 40 phone, addresses of family
AF_PHONET end up triggering an assert() in object_has_ifindex(), just because
object_type_from_nl_object() only handles AF_INET and AF_INET6 address.
In order to avoid this kind of problems, we'll try to make sure that the object
caches kept by NM only store known object types.
(fixup by dcbw to use cached passed to cache_remove_unknown())
https://bugzilla.gnome.org/show_bug.cgi?id=742928
Connect: ppp0 <--> /dev/ttyACM0
nm-pppd-plugin-Message: nm-ppp-plugin: (nm_phasechange): status 5 / phase 'establish'
NetworkManager[27434]: <info> (ppp0): new Generic device (driver: 'unknown' ifindex: 12)
NetworkManager[27434]: <info> (ppp0): exported as /org/freedesktop/NetworkManager/Devices/4
[Thread 0x7ffff1ecf700 (LWP 27439) exited]
NetworkManager[27434]: <info> (ttyACM0): device state change: ip-config -> deactivating (reason 'user-requested') [70 110 39]
Terminating on signal 15
nm-pppd-plugin-Message: nm-ppp-plugin: (nm_phasechange): status 10 / phase 'terminate'
**
NetworkManager:ERROR:platform/nm-linux-platform.c:1534:object_has_ifindex: code should not be reached
Program received signal SIGABRT, Aborted.
0x00007ffff4692a97 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff4692a97 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff4693e6a in abort () from /usr/lib/libc.so.6
#2 0x00007ffff4c8d7f5 in g_assertion_message () from /usr/lib/libglib-2.0.so.0
#3 0x00007ffff4c8d88a in g_assertion_message_expr () from /usr/lib/libglib-2.0.so.0
#4 0x0000000000472b91 in object_has_ifindex (object=0x8a8320, ifindex=12) at platform/nm-linux-platform.c:1534
#5 0x0000000000472bec in check_cache_items (platform=0x7fe8a0, cache=0x7fda30, ifindex=12) at platform/nm-linux-platform.c:1549
#6 0x0000000000472de3 in announce_object (platform=0x7fe8a0, object=0x8a8c30, change_type=NM_PLATFORM_SIGNAL_REMOVED, reason=NM_PLATFORM_REASON_EXTERNAL) at platform/nm-linux-platform.c:1617
#7 0x0000000000473dd2 in event_notification (msg=0x8a7970, user_data=0x7fe8a0) at platform/nm-linux-platform.c:1992
#8 0x00007ffff5ee14de in nl_recvmsgs_report () from /usr/lib/libnl-3.so.200
#9 0x00007ffff5ee1849 in nl_recvmsgs () from /usr/lib/libnl-3.so.200
#10 0x00000000004794df in event_handler (channel=0x7fc930, io_condition=G_IO_IN, user_data=0x7fe8a0) at platform/nm-linux-platform.c:4152
#11 0x00007ffff4c6791d in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#12 0x00007ffff4c67cf8 in ?? () from /usr/lib/libglib-2.0.so.0
#13 0x00007ffff4c68022 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#14 0x00000000004477ee in main (argc=1, argv=0x7fffffffeaa8) at main.c:447
(gdb) fr 4
#4 0x0000000000472b91 in object_has_ifindex (object=0x8a8320, ifindex=12) at platform/nm-linux-platform.c:1534
1534 g_assert_not_reached ();
The address might be zero-size, and therefore nl_addr_get_binary_addr()
returns a pointer to a zero-size array. We don't want to read past the
end of that array. Since zero-size addresses really mean an address
of all zeros, just make that happen.
As an additional optimization, if the prefix length is zero, the whole
address is host bits and should be cleared.
==30286== Invalid read of size 4
==30286== at 0x478090: clear_host_address (nm-linux-platform.c:3786)
==30286== by 0x4784D4: route_search_cache (nm-linux-platform.c:3883)
==30286== by 0x4785A1: refresh_route (nm-linux-platform.c:3901)
==30286== by 0x4787B6: ip4_route_delete (nm-linux-platform.c:3978)
==30286== by 0x47F674: nm_platform_ip4_route_delete (nm-platform.c:1980)
==30286== by 0x4B279D: _v4_platform_route_delete_default (nm-default-route-manager.c:1122)
==30286== by 0x4AEF03: _platform_route_sync_flush (nm-default-route-manager.c:320)
==30286== by 0x4B043E: _resync_all (nm-default-route-manager.c:574)
==30286== by 0x4B0CA7: _entry_at_idx_remove (nm-default-route-manager.c:631)
==30286== by 0x4B1A66: _ipx_update_default_route (nm-default-route-manager.c:806)
==30286== by 0x4B1A9C: nm_default_route_manager_ip4_update_default_route (nm-default-route-manager.c:813)
==30286== by 0x45C3BC: _cleanup_generic_post (nm-device.c:7143)
==30286== Address 0xee33514 is 0 bytes after a block of size 20 alloc'd
==30286== at 0x4C2C080: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30286== by 0x6B2B0B1: nl_addr_alloc (in /usr/lib/libnl-3.so.200.20.0)
==30286== by 0x6B2B0E3: nl_addr_build (in /usr/lib/libnl-3.so.200.20.0)
==30286== by 0x6B2B181: nl_addr_clone (in /usr/lib/libnl-3.so.200.20.0)
==30286== by 0x66DB0D7: ??? (in /usr/lib/libnl-route-3.so.200.20.0)
==30286== by 0x6B33CE6: nl_object_clone (in /usr/lib/libnl-3.so.200.20.0)
==30286== by 0x6B2D303: nl_cache_add (in /usr/lib/libnl-3.so.200.20.0)
==30286== by 0x472E55: refresh_object (nm-linux-platform.c:1735)
==30286== by 0x473137: add_object (nm-linux-platform.c:1795)
==30286== by 0x478373: ip4_route_add (nm-linux-platform.c:3846)
==30286== by 0x47F375: nm_platform_ip4_route_add (nm-platform.c:1939)
==30286== by 0x4AEC06: _platform_route_sync_add (nm-default-route-manager.c:254)
https://bugzilla.gnome.org/show_bug.cgi?id=742937
Testing WWAN connections through a Nokia Series 40 phone, addresses of family
AF_PHONET end up triggering an assert() in object_has_ifindex(), just because
object_type_from_nl_object() only handles AF_INET and AF_INET6 address.
In order to avoid this kind of problems, we'll try to make sure that the object
caches kept by NM only store known object types.
https://bugzilla.gnome.org/show_bug.cgi?id=742928
Connect: ppp0 <--> /dev/ttyACM0
nm-pppd-plugin-Message: nm-ppp-plugin: (nm_phasechange): status 5 / phase 'establish'
NetworkManager[27434]: <info> (ppp0): new Generic device (driver: 'unknown' ifindex: 12)
NetworkManager[27434]: <info> (ppp0): exported as /org/freedesktop/NetworkManager/Devices/4
[Thread 0x7ffff1ecf700 (LWP 27439) exited]
NetworkManager[27434]: <info> (ttyACM0): device state change: ip-config -> deactivating (reason 'user-requested') [70 110 39]
Terminating on signal 15
nm-pppd-plugin-Message: nm-ppp-plugin: (nm_phasechange): status 10 / phase 'terminate'
**
NetworkManager:ERROR:platform/nm-linux-platform.c:1534:object_has_ifindex: code should not be reached
Program received signal SIGABRT, Aborted.
0x00007ffff4692a97 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff4692a97 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff4693e6a in abort () from /usr/lib/libc.so.6
#2 0x00007ffff4c8d7f5 in g_assertion_message () from /usr/lib/libglib-2.0.so.0
#3 0x00007ffff4c8d88a in g_assertion_message_expr () from /usr/lib/libglib-2.0.so.0
#4 0x0000000000472b91 in object_has_ifindex (object=0x8a8320, ifindex=12) at platform/nm-linux-platform.c:1534
#5 0x0000000000472bec in check_cache_items (platform=0x7fe8a0, cache=0x7fda30, ifindex=12) at platform/nm-linux-platform.c:1549
#6 0x0000000000472de3 in announce_object (platform=0x7fe8a0, object=0x8a8c30, change_type=NM_PLATFORM_SIGNAL_REMOVED, reason=NM_PLATFORM_REASON_EXTERNAL) at platform/nm-linux-platform.c:1617
#7 0x0000000000473dd2 in event_notification (msg=0x8a7970, user_data=0x7fe8a0) at platform/nm-linux-platform.c:1992
#8 0x00007ffff5ee14de in nl_recvmsgs_report () from /usr/lib/libnl-3.so.200
#9 0x00007ffff5ee1849 in nl_recvmsgs () from /usr/lib/libnl-3.so.200
#10 0x00000000004794df in event_handler (channel=0x7fc930, io_condition=G_IO_IN, user_data=0x7fe8a0) at platform/nm-linux-platform.c:4152
#11 0x00007ffff4c6791d in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#12 0x00007ffff4c67cf8 in ?? () from /usr/lib/libglib-2.0.so.0
#13 0x00007ffff4c68022 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#14 0x00000000004477ee in main (argc=1, argv=0x7fffffffeaa8) at main.c:447
(gdb) fr 4
#4 0x0000000000472b91 in object_has_ifindex (object=0x8a8320, ifindex=12) at platform/nm-linux-platform.c:1534
1534 g_assert_not_reached ();
refresh_object() raised a spurious change event for the route we
are about to delete. Suppress that by adding an internal reason flag.
Fixes: 41e6c4fac1
Deleting routes with metric 0 might end up deleting other
routes with a different metric.
Workaround this in platform to only delete a route with
metric 0 if such a route can be found prior to deletion.
Don't only look into the cache (which might be out of date).
Instead refetch the route we are about to delete to be sure.
There is still a race that we might end up deleting the wrong
route.
https://bugzilla.gnome.org/show_bug.cgi?id=741871https://bugzilla.redhat.com/show_bug.cgi?id=1172780
This match-any behavior ignoring metric is nowhere used. And even if we
would need such a behavior, using 0 is wrong because IPv4 routes can
have a metric of zero.
Handling a route with metric 0 effectively means
a metric of 1024 (user default). Adjust the add(),
delete() and exist() functions to consider routes
with metric 0 as 1024.
If we have NM running, adding a route with metric 20 might conflict
and cause NM to remove the route.
Choose a different (higher) metric that is less likely to cause a
conflict.
platform/nm-linux-platform.c: In function 'setup':
platform/nm-linux-platform.c:4364:2: error: 'object' undeclared (first use in this function)
object = nl_cache_get_first (priv->link_cache);
^
Fixes 2b8060b9b3
This reverts commit efd09845c4.
It turns out that the socket space might not be the only buffer that may get
too full. 128K ought to be enough for it and we should resynchronize with the
kernel now if needed.
Kernel can return ENOBUFS in variety of reasons. If that happens, we know we've
lost events and should pick up kernel state.
Simple reproducer that triggers an ENOBUFS condition no matter how big our
netlink socket buffer is:
ip link add bridge0 type bridge
for i in seq $(0 1023); do ip link add dummy$i type dummy; \
ip link set dummy$i master bridge0; done
ip link del bridge0
We assume that in nm_nl_cache_search() and correctly set that in
get_kernel_object(), but we rtnl_link_alloc_cache() can initialize the cache
with devices of other families.
The consequence is that we don't notify when the bridge changes to IFF_UP as we
fail to match and remove the old downed object from the cache:
nm_device_bring_up(): [0xf506c0] (bridge0): bringing up device.
nm_platform_link_set_up(): link: setting up 'bridge0' (12)
link_change_flags(): link: change 12: flags set 'up' (1)
get_kernel_object(): get_kernel_object for link: bridge0 (12, family 7)
log_link(): signal: link added: 12: bridge0 <UP> mtu 1500 bridge driver 'bridge' udi '/sys/devices/virtual/net/bridge0'
get_kernel_object(): get_kernel_object for link: bridge0 (12, family 7)
log_link(): signal: link changed: 12: bridge0 <UP> mtu 1500 bridge driver 'bridge' udi '/sys/devices/virtual/net/bridge0'
log_link(): signal: link changed: 12: bridge0 <UP> mtu 1500 bridge driver 'bridge' udi '/sys/devices/virtual/net/bridge0'
(bridge0): device not up after timeout!
(bridge0): preparing device
Since f32075d2fc, we remove the kernel
added IPv4 device route, and re-add it with appropriate metric.
This could potentially replace existing, conflicting routes. Be more
careful and only take any action when we don't have a conflicting
route and when we add the address for the first time.
The motivation for this was libreswan which might install a VPN route
for a subnet that we also have configured on an interface. But the route
conflict could happen easily for other reasons, for example if you
configure a conflicting route manually.
Don't replace the device route if we have any indication that
a conflict could arise.
https://bugzilla.gnome.org/show_bug.cgi?id=723178
For IPv4, iproute for example defaults to a metric of 0.
Hence, the name NM_PLATFORM_ROUTE_METRIC_DEFAULT was misleading.
Also add a NM_PLATFORM_ROUTE_METRIC_DEFAULT_IP4 define for completeness.
https://bugzilla.gnome.org/show_bug.cgi?id=740780
For IPv4 addresses, the kernel automatically adds a route when
configuring an IP address. Unfortunately, there is no way to control
this behavior or to set the route metric.
Fix this, by adding our own route and removing the kernel provided
one.
Note that this adds a major change in that we no longer call
nm_ip4_config_commit() for assumed devices.
https://bugzilla.gnome.org/show_bug.cgi?id=723178
Signed-off-by: Thomas Haller <thaller@redhat.com>
The upstream kernel added module aliases for nl80211 in commit
fb4e156886ce6e8309e912d8b370d192330d19d3, so querying nl80211
now auto-loads the module. Previously NM was doing this to
determine whether an ethernet-like device was a Wi-Fi device
that supported nl80211, but this leads to the nl80211 loading
on platforms that will never have or use Wi-Fi.
Since every nl80211-capable device will already have
DEVTYPE=wlan set (from /sys/class/net/wlan0/uevent), we can use
that as an indicator that the ethernet-like device is WiFi
instead of asking nl80211.
https://bugzilla.gnome.org/show_bug.cgi?id=740131
config.h should be included from every .c file, and it should be
included before any other include. Fix that.
(As a side effect of how I did this, this also changes us to
consistently use "config.h" rather than <config.h>. To the extent that
it matters [which is not much], quotes are more correct anyway, since
we're talking about a file in our own build tree, not a system
include.)
These lines are part of NM for a very long time.
I think they are wrong, because the default route is not
added to the NMIP4Config/NMIP6Config objects.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Add a new enum NMPlatformGetRouteMode. This extends the existing
functions nm_platform_ip4_route_get_all() and nm_platform_ip6_route_get_all()
to return default routes only.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Kernel, netlink an NMPlatformRoute treat route metrics as
uint32. Fix several places to use the exact type.
Signed-off-by: Thomas Haller <thaller@redhat.com>
As we use NMLinkType in NetworkManagerUtils.h, we cannot use
the utils header without nm-platform.h. That is clearly wrong.
Apparently NMLinkType has a wider use outside of platform (and
its name is not prefixed with 'platform' either).
Move the enum definition to nm-types.h.
Signed-off-by: Thomas Haller <thaller@redhat.com>