subnet flooded with lots of ADD_EDGE requests
Vitaly Gorodetsky
vgorodetsky at augury.com
Tue Dec 11 15:34:19 CET 2018
We have a similar network topology, with 6 hub nodes and 30+ leaf nodes.
We are also suffering from periodic network blockage and very high network
usage of ~300 MB per node per day (50 MB upload and 250 MB download).
This high traffic occurs even when the network is idle.
Our nodes run version 1.0.34.
On Tue, Dec 11, 2018 at 12:52 PM Amit Lianson <lthmod at gmail.com> wrote:
> Hello,
> We're suffering from sporadic network blockage (read: unable to ping
> other nodes) with 1.1-pre17. Before upgrading to the 1.1-pre release,
> the same network blockage also manifested itself in a pure 1.0.33
> network.
>
> The log shows a lot of "Got ADD_EDGE from nodeX
> (192.168.0.1 port 655) which does not match existing entry" messages, and
> it turns out that the mismatches were caused by different weights received
> by add_edge_h().
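> 
> For reference, here is a tiny toy model of the feedback loop we suspect.
> This is not tinc's actual code, and the struct and function names are made
> up for illustration: if two nodes hold different weights for the same edge
> and each answers a mismatching ADD_EDGE by re-announcing its own version,
> they never converge and keep flooding each other.
> 
> #include <stdio.h>
> 
> struct node { const char *name; int weight; };
> 
> /* On a mismatching ADD_EDGE, log it and re-announce our own weight. */
> static int handle_add_edge(struct node *self, int received_weight) {
>     if (received_weight != self->weight) {
>         printf("%s: Got ADD_EDGE with weight %d which does not match "
>                "existing entry (%d)\n", self->name, received_weight,
>                self->weight);
>         return 1;   /* send our own ADD_EDGE back */
>     }
>     return 0;       /* views agree, nothing to send */
> }
> 
> int main(void) {
>     struct node a = { "hub1", 10 }, b = { "hub2", 20 };
>     int messages = 0;
>     for (int round = 0; round < 5; round++) {
>         if (handle_add_edge(&b, a.weight)) messages++;
>         if (handle_add_edge(&a, b.weight)) messages++;
>     }
>     printf("exchanged %d ADD_EDGEs without converging\n", messages);
>     return 0;
> }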
>
> This network consists of ~4 hub nodes and 50+ leaf nodes. Sample
> hub config:
> Name = hub1
> ConnectTo = hub2
> ConnectTo = hub3
> ConnectTo = hub4
>
> Leaf looks like:
> Name = node1
> ConnectTo = hub1
> ConnectTo = hub2
> ConnectTo = hub3
> ConnectTo = hub4
>
> Back in the days of pure 1.0.33 nodes, if the network suddenly
> failed (users would see tincd CPU usage go above 50% and get no ping
> response from the other nodes), we could simply shut down the hub nodes,
> wait for a few minutes and then restart the hub nodes to get the
> network back to normal; however, the 1.1-pre release seems to autoconnect
> to non-hub hosts based on the information found in /etc/tinc/hosts, which
> means that the hub-restarting trick won't work. Additionally, apart
> from high CPU usage, the 1.1-pre tincd also starts hogging memory until
> the Linux OOM killer kills the process (a memory leak perhaps?).
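> 
> If the autoconnect behaviour is what defeats the hub-restarting trick, I
> suppose disabling it on the leaves might restore the old behaviour. This is
> an untested guess, assuming 1.1-pre still honours the AutoConnect option in
> tinc.conf, along these lines:
> 
> # tinc.conf on a leaf node (untested sketch)
> Name = node1
> AutoConnect = no    # only use the explicit ConnectTo hubs below
> ConnectTo = hub1
> ConnectTo = hub2
> ConnectTo = hub3
> ConnectTo = hub4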
>
> Given that many of our leaf nodes are behind NAT and there's no
> direct connection to them except the tinc tunnel, I'm wondering whether
> there's any way to bring the network back up without shutting
> down all nodes. Moreover, is there a better way to pinpoint the
> offending nodes that introduced this symptom?
>
> Thanks,
> A.
--
*Vitaly Gorodetsky*
Infrastructure Lead
Mobile: +972-52-6420530
vgorodetsky at augury.com
39 Haatzmaut St., 1st Floor,
Haifa, 3303320, Israel