www.zeroshell.org Forum Index www.zeroshell.org
Linux Distribution for server and embedded devices
 

Very bad bonding performance

 
www.zeroshell.org Forum Index -> Linux and Networking
BuddyButterfly



Joined: 17 Aug 2008
Posts: 2

Posted: Sun Aug 17, 2008 9:47 pm    Post subject: Very bad bonding performance

I have set up bonding of two ADSL lines to one root server.
Both ADSL lines are 16Mb/s. Measured downloads on the single lines
were 1.7MB/s and 1.5MB/s, so I expected the bonded lines to reach
about 2.5-2.8 MB/s (factoring in the overhead of VPN, NAT, etc.).
It turns out that TCP performance over the bonded lines is no better
than over a single line, and most of the time it is even worse.
It regularly goes up to 1.1MB/s and down to 200kB/s. Sometimes the download
speed is as expected, about 2.6MB/s, for the first 50MB-60MB of a download
before it drops again. The worst thing is that multiple connections do not
add up to the full bandwidth: when one download is running at 1MB/s and another
is started, together they still total 1MB/s, each one dropping to 500kB/s.
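
The expectation above can be written out as a quick calculation. This is only a sketch: the two line rates come from the post, while the overhead fractions are assumed values chosen to illustrate how a 2.5-2.8 MB/s estimate could arise, not measured figures.

```python
# Single-line download rates reported in the post, in MB/s.
line_rates = [1.7, 1.5]

# Assumed (hypothetical) fraction of throughput lost to VPN and NAT overhead.
overhead_low, overhead_high = 0.12, 0.22

ideal = sum(line_rates)                      # perfect aggregation, no overhead
expected_high = ideal * (1 - overhead_low)   # optimistic estimate
expected_low = ideal * (1 - overhead_high)   # pessimistic estimate

print(f"ideal sum: {ideal:.1f} MB/s")
print(f"expected after overhead: {expected_low:.1f}-{expected_high:.1f} MB/s")
```

The observed 200kB/s to 1.1MB/s is well below even the pessimistic figure, which suggests the loss is not explained by encapsulation overhead alone.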

I have tried lots of things to get rid of this problem:

1. Setting tcp reordering to 127
2. Setting tcpqueuelength to 1000
3. Setting tcp window sizes etc.

I might also have done something wrong in the configuration. That is
why I am asking here in the forum. The topology is:

1. One linux server running debian. This linux box runs 2 openvpn servers.

2. 2 standard ADSL lines each with a router doing NAT which are connected
via LAN to two interfaces on the debian server. The routers are registered
dynamically with dyndns.

3. A root server running debian connected with 100Mb/s to the internet.
The root server runs 2 openvpn clients which connect, via the registered
dyndns entries, to the two openvpn servers on the linux server.

4. Openvpn servers are configured to have 10.20 and 10.21 net (with 255.255.0.0 mask).

5. Bond interfaces on both sides are configured like this:

linux server: bond0: 10.10.0.1 /16
root server: bond0: 10.10.0.2 /16

6. The root server runs shorewall which does masquerading for the bond0 interface.

7. Default route on the linux server is 10.4.0.2


Bond configuration on root server is as follows:


Ethernet Channel Bonding Driver: v3.0.3 (March 23, 2006)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.10.0.1

Slave Interface: tap0
MII Status: up
Link Failure Count: 2
Permanent HW addr: bc:cd:8c:18:bc:7c

Slave Interface: tap1
MII Status: up
Link Failure Count: 1
Permanent HW addr: d2:d5:e2:63:3e:1a



On the linux server:

Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.10.0.2

Slave Interface: tap0
MII Status: up
Link Failure Count: 2
Permanent HW addr: 14:be:f8:eb:f7:b1

Slave Interface: tap1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 32:cd:7c:d0:7a:d6


Failover is working flawlessly.

My questions:

1. Is something wrong with the setup?

2. Should the bond interfaces on both sides be in different nets? Or is the config ok?

3. Could the version difference of the bonding driver be the reason?

4. I noticed with wireshark that there are lots of out-of-order
packets and fast retransmits happening. Out-of-order packets seem to occur
inherently in bonding. Is there any chance at all to get good performance
with tcp via bond?
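
Why round-robin bonding tends to reorder packets can be illustrated with a toy model. The sketch below is not a measurement of this setup: the function name, the fixed one-way delays (20 ms and 35 ms) and the 1 ms send interval are all assumptions. It alternates packets over two links and counts how many arrive after a packet with a higher sequence number.

```python
def count_out_of_order(num_packets, delays_ms, send_interval_ms=1.0):
    """Send packets round-robin over links with fixed one-way delays and
    count how many arrive after a packet with a higher sequence number."""
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(delays_ms)            # round-robin slave selection
        sent_at = seq * send_interval_ms
        arrivals.append((sent_at + delays_ms[link], seq))
    arrivals.sort()                            # order seen by the receiver

    highest_seen = -1
    out_of_order = 0
    for _, seq in arrivals:
        if seq < highest_seen:
            out_of_order += 1                  # a later packet overtook this one
        else:
            highest_seen = seq
    return out_of_order


print("equal delays:", count_out_of_order(1000, [20.0, 20.0]))
print("15 ms skew:  ", count_out_of_order(1000, [20.0, 35.0]))  # assumed skew
```

In this model, equal delays produce no reordering, while any delay difference larger than the send interval makes packets on the slower link arrive late. A TCP receiver answers such gaps with duplicate ACKs, which the sender can interpret as loss and respond to with fast retransmits.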
fulvio
Site Admin


Joined: 01 Nov 2006
Posts: 1048

Posted: Mon Aug 18, 2008 8:02 am

I am waiting to try OpenVPN Bonding with the next release of Zeroshell, which will include the net balancer. With it, it will be easier to set up a load balancing and failover VPN. When I know more details about the VPN aggregation I will post here.

Regards
Fulvio
BuddyButterfly



Joined: 17 Aug 2008
Posts: 2

Posted: Mon Aug 18, 2008 11:53 am

Hi fulvio,

thanks for the information.
The information in item 7 of the setup is wrong. The default route points to 10.10.0.2, of course. Otherwise it wouldn't work.

Do you have experience with the current bonding in zeroshell regarding
performance of tcp connections?

Thanks a lot.
fulvio
Site Admin


Joined: 01 Nov 2006
Posts: 1048

Posted: Mon Aug 18, 2008 3:09 pm

I know that other people have had the same performance issues. I have not investigated yet.

Regards
Fulvio
martinhill



Joined: 12 Sep 2008
Posts: 2

Posted: Mon Sep 15, 2008 8:48 am    Post subject: depends on what you want

It depends on what you want, but it is not possible to say load balancing is better than bonding.

You may need to watch out for certain devices that claim to bond but actually load balance. They're both easy to set up. Bonding is the best choice for just improving upload and download bandwidth capacity. Load balancing will be better if you are using VoIP. Failover is still available in both technologies.

See the Mushroom and XRIO UBM devices. These will do both; the less expensive of the two is XRIO.

Hope it helps
satir



Joined: 29 Apr 2009
Posts: 2

Posted: Wed Apr 29, 2009 11:57 am

Is there any update on this? Did you solve your problem BuddyButterfly?

I am facing the same issue: There are two (or more) ADSL lines, each with a theoretical bandwidth of 16 Mb/s down and 1 Mb/s up. Over each ADSL line an OpenVPN connection is established to a server in a datacenter with a 100Mb/s link. This server acts as a gateway. The OpenVPN connections are bonded together with the linux bonding module in mode 0 (round-robin), so that every connection can consume the whole bandwidth.

I have been struggling with this for weeks without finding any viable solution, so maybe someone with more experience can shed some light on this.

First I'm going to provide some speed measurements, which were made downloading a single file over FTP. The file was transferred from the server in the datacenter to the machine on the other side of the VPN, so there were no other stations except Internet routers involved.

  • Over an OpenVPN connection without any bonding: 13.3 Mb/s download with 0.28 Mb/s upload inside the tunnel
  • Over a bond which consists of 1 single VPN connection (pointless but just for testing): 13.2 Mb/s download with 0.28 Mb/s upload inside the tunnel
  • Over a bond which consists of 2 VPN connections: 15 Mb/s download with 0.87 Mb/s upload inside the tunnel

As you can see, the upload of the machine which was receiving the file increases tremendously as soon as a second connection is added to the bond. Why is this happening? Or actually: How can this be solved?

The above numbers are all taken inside the tunnel. Since OpenVPN adds about 30 bytes per packet(!), and the replies are small and numerous, this increases the load on the upload by about 30% to 50%. The ADSL lines only have 1 Mb/s upload, so I think that my download speed with the bond is not higher because the replies of the receiving machine get dropped since the upload is saturated.
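
A fixed per-packet cost hits small packets hardest, which is one way to arrive at a 30% to 50% figure. The sketch below uses the 30-byte overhead quoted in the post; the packet sizes (60, 100 and 1500 bytes) are assumptions picked to stand for small ACK-like packets and a full-size data packet, and the function name is illustrative.

```python
OPENVPN_OVERHEAD_BYTES = 30  # per-packet figure quoted in the post


def overhead_percent(packet_bytes, overhead_bytes=OPENVPN_OVERHEAD_BYTES):
    """Extra bytes on the wire, as a percentage of the original packet size."""
    return 100.0 * overhead_bytes / packet_bytes


# Assumed sizes: two small ACK-like packets and one full-size data packet.
for size in (60, 100, 1500):
    print(f"{size:5d} bytes -> +{overhead_percent(size):.0f}%")
```

Under these assumptions the same 30 bytes costs a full-size download packet about 2%, but a small reply between 30% and 50%, so the thin upstream direction carries most of the relative penalty.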

Also the speed of the bonded connection with 2 VPN slaves is very unsteady. Most of the time it's about 11-14 Mb/s, sometimes it goes up to 25 Mb/s for a few seconds just to fall down to 2 Mb/s shortly after. I assume that at 25 Mb/s download speed the machine can't get its replies out fast enough, because the upload is full, and then the sender decreases the rate again. Downloading multiple files at once doesn't make a difference either... Uploading files on the other hand works fine with about 1.6 Mb/s over two lines.

I already tried tinkering with tcp_reordering, OpenVPN queue length and buffer sizes, all to no avail. OpenVPN is running with "proto udp, auth none, no-iv, no-replay" and without tls-cipher or pings to minimize its overhead. Can it be lowered even more or are there other tunnel solutions which provide a smaller overhead?
satir



Joined: 29 Apr 2009
Posts: 2

Posted: Thu Apr 30, 2009 2:40 pm

An update:
I disabled TCP timestamps and selective ACKs on the side with the two ADSL links (net.ipv4.tcp_sack=0, net.ipv4.tcp_timestamps=0). That way the ACK packets became smaller (84 bytes per packet with OpenVPN overhead compared to 124 bytes before), and the FTP download became faster (20 Mb/s). The upload while downloading at 20 Mb/s was 0.76 Mb/s, which is much better than the 0.87 Mb/s upload at 15 Mb/s download from my last posting.
However, 0.76 Mb/s upload for 20 Mb/s download over the two bonded VPN connections is still twice as much as I get when I "bond" only one connection. Tcpdump shows that the machine with the two ADSL links is sending every ACK over both(!) connections, i.e. over both TAP devices. Every ACK therefore gets transmitted twice. Why is that?!
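
How ACK size and duplicated ACKs translate into upstream load can be estimated roughly. The sketch below is a simplified model, not a reproduction of the measurements: the 84- and 124-byte ACK sizes come from the post, while the function name, the 1500-byte segment size and the one-ACK-per-two-segments ratio are assumptions. Reordering usually triggers extra immediate ACKs, so real figures can be higher.

```python
def ack_upload_mbps(download_mbps, ack_bytes, segment_bytes=1500,
                    segments_per_ack=2, copies=1):
    """Estimate the upstream bandwidth (Mb/s) consumed by ACKs.

    segment_bytes and segments_per_ack are assumed values; copies is the
    number of links each ACK is transmitted on.
    """
    segments_per_s = download_mbps * 1e6 / 8 / segment_bytes
    acks_per_s = segments_per_s / segments_per_ack * copies
    return acks_per_s * ack_bytes * 8 / 1e6


for ack_bytes in (124, 84):          # with and without timestamps/SACK
    for copies in (1, 2):            # ACK sent once, or over both TAP devices
        rate = ack_upload_mbps(20, ack_bytes, copies=copies)
        print(f"ACK {ack_bytes} bytes x{copies}: {rate:.2f} Mb/s upstream")
```

In this model the upstream cost scales linearly with both ACK size and the number of copies, so sending every ACK over both TAP devices doubles the load on 1 Mb/s uplinks regardless of how small the ACKs are made.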

Anyway, I'm probably going to give up on this, since I have already spent more than 100 hours on it. Even if I can solve the problem with the duplicate ACKs, I'm still facing the problem that the overhead of the multiple VPN connections is eating up half of the upload of my lines... I guess bonding is only feasible with symmetrical lines that all have the same round trip times.
ninnic



Joined: 10 Nov 2009
Posts: 7

Posted: Thu Nov 12, 2009 10:37 am

Hi,

I have the same problem, but with UMTS modems. I think the problem is that one side has only one IP; I am waiting to get another IP from my provider so I can try.

In the meantime, let me know if there is any news about this.

bye