Mode: switch and DHCP problems on network with many nodes
Anton Avramov
lukav at lukav.com
Fri Feb 8 17:11:18 CET 2019
Hi All,
I currently have the following setup.
One central node called Backbone with the following conf:
Name = Backbone
Mode = switch
AddressFamily = ipv4
ReplayWindow=64
Compression=10
I also have approximately 440 nodes connected to this node with the
following setup:
Name = xxxxxx
Mode = switch
ConnectTo = Backbone
Compression = 10
dnsmasq runs on Backbone and serves IP addresses to the nodes based on their
dhcp-client-identifier, which is unique for each node.
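For readers unfamiliar with this kind of mapping, here is a minimal sketch of how per-node entries keyed on the client identifier could be generated. It assumes dnsmasq's `dhcp-host=id:<client-id>,<ip>` form is used. The identifiers and addresses are hypothetical and are not taken from the actual setup.

```python
def dhcp_host_line(client_id, ip):
    """Render one dnsmasq dhcp-host entry keyed on a DHCP client identifier."""
    return f"dhcp-host=id:{client_id},{ip}"


# Hypothetical identifiers and addresses, for illustration only.
ASSIGNMENTS = {
    "node-0001": "10.0.0.11",
    "node-0002": "10.0.0.12",
}


def render(assignments):
    """Render all entries, sorted by client identifier, one per line."""
    return "\n".join(
        dhcp_host_line(cid, ip) for cid, ip in sorted(assignments.items())
    )


if __name__ == "__main__":
    print(render(ASSIGNMENTS))
```

The output can be written to a file that dnsmasq reads, so that each node keeps the same address regardless of which central server it connects through.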
The setup has worked perfectly for years now across multiple versions of
Debian. Currently all nodes are on Debian 8 stretch and use the tinc
package from the official repository.
Now my problem is that Backbone is a single point of failure, and I
want to add backup servers for it.
I've created 2 more "central servers" with the conf:
Name = Server1
Mode = switch
Interface = support
ConnectTo = Backbone
ConnectTo = Server2
Compression = 10
ReplayWindow=64
The servers synchronize the hosts directory, which holds the keys of all
the nodes, between each other.
With this setup, if I set some of the nodes to connect only to Server1,
for example, it works. Once I open port TCP/655 on Server1, the node gets
an IP address and everything is fine.
However a few minutes after opening port TCP/655 on Server1 all goes to
hell.
The tinc processes on Backbone and Server1 go to 100% CPU
utilization. The nodes stop renewing their IP addresses and their
dhclient instances keep trying to get a new address.
Even if I set the IP address statically, the server cannot be pinged.
Using dhcpdump I see that there is DHCP traffic to Backbone, but I
suspect there is some loop going on that is causing the tinc daemon to choke.
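To make the loop suspicion concrete, the ConnectTo relationships can be modelled as an undirected graph and checked for cycles. This is a minimal sketch and not a diagnosis. The entries for Backbone and Server1 follow the configs above. The entry for Server2 is an assumption (a mirror of Server1), since its conf is not shown, and `node1` is a hypothetical node pointed at Server1 only.

```python
from collections import defaultdict

# ConnectTo lists per tinc node name.
# Server2's list is an assumption; node1 is a hypothetical example node.
CONNECT_TO = {
    "Backbone": [],
    "Server1": ["Backbone", "Server2"],
    "Server2": ["Backbone", "Server1"],  # assumed, not shown in the post
    "node1": ["Server1"],
}


def build_graph(connect_to):
    """Turn ConnectTo lists into an undirected adjacency map."""
    graph = defaultdict(set)
    for name, peers in connect_to.items():
        graph[name]  # make sure nodes without ConnectTo lines appear too
        for peer in peers:
            graph[name].add(peer)
            graph[peer].add(name)
    return dict(graph)


def has_cycle(graph):
    """Return True if the undirected graph contains at least one cycle."""
    seen = set()
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, None)]
        while stack:
            node, parent = stack.pop()
            for neighbour in graph[node]:
                if neighbour == parent:
                    continue  # the edge we arrived on
                if neighbour in seen:
                    return True  # a second path to an already visited node
                seen.add(neighbour)
                stack.append((neighbour, node))
    return False


if __name__ == "__main__":
    print("cycle in meta-connection graph:", has_cycle(build_graph(CONNECT_TO)))
```

Under these assumptions, Backbone, Server1 and Server2 form a triangle, so the check reports a cycle, while the original star around Backbone does not. Whether such a cycle is what actually makes tincd spin is a separate question that this sketch does not answer.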
I understand that my setup is somewhat unusual, and if you have any
questions I'll be glad to provide more information.
Can someone suggest what the problem is and how to overcome it?
Any help will be greatly appreciated :)
P.S. I've tried to recreate the setup in a testing environment with
nspawn containers, but unfortunately the problem doesn't manifest there.
Best regards