Dynamic load balancing scheme for a massive file transfer system

In this paper, a dynamic load balancing scheme for a massive file transfer system is proposed. The scheme is designed to load balance an FTP server cluster. Instead of recording connection counts, runtime load information is periodically collected from each server and combined with static performance parameters, collected at server startup, to calculate server weights. The scheme adopts an Improved Weighted Round-Robin algorithm: the weight of each server is initialized from its static performance parameters and dynamically adjusted according to its runtime load. An Apache Zookeeper cluster receives all of this information and notifies the director of runtime load variation and of any server going offline. To evaluate the scheme, a contrast experiment against LVS is also conducted.


Introduction
The rapid growth of the Internet has rapidly increased the number of accesses to multimedia network servers. A server must serve a large number of concurrent accesses, and its processing and I/O capacity has become the bottleneck of service provision. One solution is an expensive high-performance server or SMP machine; the other is connecting multiple servers into a cluster so that performance can be improved through parallel processing and high-speed information exchange among them. The latter solution, with high overall performance (e.g., response time and throughput), high scalability, high availability, and a better performance/price ratio, has become the principal way to build high-performance information servers.
Load balancing is at the core of a cluster system's normal operation. Its main purpose is to distribute tasks reasonably across all nodes in the cluster, keep the whole system in a balanced state, and ensure the system's processing capacity and quality of service. Load balancing can be implemented directly in hardware or in software [1]. The load balancer sits between the servers and the external network. Hardware load balancers are expensive and inflexible, and they cannot support more advanced load balancing strategies or more complex application protocols.
In software schemes, load balancing algorithms can be classified as static or dynamic. Studies have shown that both static and dynamic load balancing scheduling can improve the performance of a cluster system, but in practice both still have limitations.
Static scheduling adapts poorly: its directing scheme is fixed, so it never takes the cluster's runtime state into consideration. Meanwhile, the parameters that traditional dynamic scheduling uses, such as the number of connections, cannot precisely reflect the load condition [2].
The dynamic load balancing scheme proposed in this paper balances the nodes in a cluster on the basis of runtime load information while also taking static hardware performance into consideration. It is highly adaptable, works well in heterogeneous clusters, and adjusts itself promptly when the cluster's load condition changes.

System model
The cluster system consists of a director and several file servers connected by a high-speed network.
1) A load collector runs on every file server; it collects static performance parameters at server startup and periodically collects runtime load information.
2) An Apache Zookeeper cluster works as the manager of load information. Every load collector registers a corresponding znode in Zookeeper and uploads load information to it.
3) The director registers a child-watcher on the servers' root znode and a data-watcher on every server's znode that stores load data, so it can monitor the state of all servers.
When it is time to refresh the load, each load collector gathers its load information and updates the corresponding Zookeeper znode. The director is then notified and refreshes its load table, and the servers' weights are recalculated on the basis of that table.

Znode structure
As a commonly used distributed coordination service, Apache Zookeeper can work as a data manager, and it guarantees data consistency [3]. Zookeeper's namespace is organized as a hierarchical tree, and every node in the tree is called a znode.
There are two types of znode a client can create [4]. Persistent znodes are created and deleted explicitly by clients. Ephemeral znodes are created by a client and are either deleted explicitly or removed automatically by the system when the session that created them terminates.
Based on these znode features, static performance parameters and runtime load information are stored in different znodes, as figure 2 shows.

Static performance parameters
Static parameters are used to measure the carrying capability of a server and to initialize its weight. They include the number of CPU cores, disk I/O speed, memory capacity, and bandwidth. At server startup, the load collector gathers these parameters and stores them in a serialized data structure. It then contacts the Zookeeper cluster to check whether the "Parameters" znode exists under the "Static" znode. If it does not, the Zookeeper client on this server creates it with the server's static performance parameters and records the server's weight as 1. Otherwise, the Zookeeper client reads the data recorded in the "Parameters" znode, calculates the current server's weight against it, and records the weight locally.
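As a concrete sketch (not the paper's actual code), the collector's startup payload can be serialized to JSON before being written to the "Parameters" znode. Only the core count below is read from the OS; the other probe values are placeholders standing in for whatever benchmarks the real collector runs:

```python
import json
import os

def static_parameters():
    """Gather the static parameters the load collector uploads at startup.

    The CPU core count comes from the OS; disk I/O speed, memory and
    bandwidth are placeholder values here, standing in for the real
    collector's probes/benchmarks.
    """
    return {
        "cores": os.cpu_count(),
        "disk_io_mb_s": 200,    # assumed probe result
        "memory_gb": 8,         # assumed probe result
        "bandwidth_mb_s": 100,  # assumed probe result
    }

# The serialized form is what the first server to start would write
# to the /Load/Static/Parameters znode.
payload = json.dumps(static_parameters()).encode("utf-8")
```

Later servers would read this payload back, deserialize it, and compute their weight relative to it.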

Runtime Load Information
The znodes that store runtime load information are ephemeral. A server going online or offline therefore creates or removes the corresponding child znode under the "Dynamic" znode. The director registers a child listener on the "Dynamic" znode to be informed of online/offline events, and a data listener on every server's znode to be informed of load changes [5].
Runtime load information is used to dynamically adjust a server's weight. It includes the initial weight and the usage percentages of CPU, disk I/O, memory, and bandwidth. This information is periodically collected and uploaded to the corresponding child znode under "Dynamic". Every time the runtime load information is updated, the director is notified.

Basic algorithm
In this scheme, the servers' runtime usage percentages of hardware resources, rather than connection counts, are used to measure load. The Improved Weighted Round-Robin algorithm is therefore the most suitable choice.
Assume there are three file servers in the cluster with weights W_1, W_2, and W_3. The implementation of Improved WRR needs a data structure "Node" consisting of a server's id, weight, and current weight. The calculation of the server allocation list is described in pseudocode as follows. Compared with traditional WRR, Improved WRR is better suited as the basis of a dynamic load balancing scheme. Traditional WRR does not scatter high-weight servers across the allocation list, so it puts great pressure on the servers with higher weights. Improved WRR, by contrast, scatters the entries so that servers are selected alternately while their weights are still respected [6]. This mechanism yields a better balancing effect.
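The paper's pseudocode is not reproduced above, but the "Node" structure and the scattering behavior it describes match the smooth weighted round-robin algorithm (the variant used by nginx, for example). A minimal sketch under that assumption:

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Server entry: fixed weight plus a running current weight."""
    id: str
    weight: int
    current: int = 0

def select(nodes):
    """One pick of smooth (improved) weighted round-robin.

    Every node's current weight grows by its fixed weight; the node
    with the largest current weight is chosen, and its current weight
    is then reduced by the total weight. High-weight nodes are thus
    spread across the sequence instead of being picked back-to-back.
    """
    total = sum(n.weight for n in nodes)
    for n in nodes:
        n.current += n.weight
    best = max(nodes, key=lambda n: n.current)
    best.current -= total
    return best.id

# With weights 3:1:1, five consecutive picks interleave server "a"
# rather than returning it three times in a row.
cluster = [Node("a", 3), Node("b", 1), Node("c", 1)]
sequence = [select(cluster) for _ in range(5)]  # a, b, a, c, a
```

Traditional WRR would emit a, a, a, b, c for the same weights, which is exactly the clustering of high-weight picks the paper argues against.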

Weight initialization
At server start-up, every server collects its own static performance parameters and downloads the standard parameters stored in the "/Load/Static/Parameters" znode if that znode exists. A weighted summation is then used to calculate the initial weight. All parameter definitions concerning initial weight calculation are presented in table 2.
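The exact coefficients of table 2 are not reproduced here, but one form consistent with the text (the first-registered server records weight 1) computes each server's weight as a coefficient-weighted sum of its parameters' ratios to the standard parameters. A sketch under that assumption, with illustrative coefficient values:

```python
def initial_weight(params, standard, coeffs):
    """Weighted summation of a server's static parameters relative to
    the standard (first-registered) server's parameters.

    params / standard: dicts of cores, disk_io, memory, bandwidth.
    coeffs: per-parameter coefficients summing to 1, so a server
    identical to the standard one gets initial weight 1.0.
    """
    return sum(coeffs[k] * params[k] / standard[k] for k in coeffs)

# Illustrative values, not the paper's table 2.
standard = {"cores": 4, "disk_io": 200, "memory": 8, "bandwidth": 100}
coeffs = {"cores": 0.25, "disk_io": 0.25, "memory": 0.25, "bandwidth": 0.25}

# A server with twice the standard resources gets initial weight 2.0.
big = {"cores": 8, "disk_io": 400, "memory": 16, "bandwidth": 200}
w = initial_weight(big, standard, coeffs)
```

The ratio form keeps the scheme meaningful in a heterogeneous cluster: weights express capability relative to a common baseline rather than in mixed absolute units.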

Calculation of load factor
The load collector running on each server periodically collects runtime load information. The information is uploaded to the corresponding znode, and the director uses it to calculate the load factor F. The load factor is computed with a weighted formula; all related parameters are defined in table 3. The calculation of the dynamic load factor is stated as follows.
This formula is chosen over a weighted summation because it detects heavy load on a single hardware resource more effectively. If the usage of one resource is much higher than the others, the load factor should be close to 1 to warn that the server is very busy, which a weighted summation cannot express. This formula suits the requirement better.
The weighting coefficients of the factor can be adjusted to meet special requirements, such as capping the usage of a particular resource so the server can always handle other tasks. In the experimental environment there is no such requirement, so [W_c, W_d, W_m, W_b] is set to [1, 1, 1, 1].
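The exact formula and table 3 are not reproduced here, but a weighted maximum is one formula with the stated property: F approaches 1 as soon as any single resource saturates, which a weighted sum averages away. A sketch under that assumption:

```python
def load_factor(usage, weights):
    """Weighted maximum over per-resource usage fractions (0..1).

    Order of both lists: [CPU, disk I/O, memory, bandwidth].
    Unlike a weighted sum, the maximum goes to ~1 as soon as any one
    resource saturates, flagging the server as busy even when the
    other resources are idle.
    """
    return max(w * u for w, u in zip(weights, usage))

# [W_c, W_d, W_m, W_b] = [1, 1, 1, 1] as in the experiment.
weights = [1, 1, 1, 1]

# CPU at 95% while everything else is idle: a plain average would
# report about 0.26, but the maximum reports 0.95.
f = load_factor([0.95, 0.05, 0.03, 0.02], weights)
```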

Weight modification
Every time the load factor is updated, the server's weight is recalculated. Define the modified weight as W_m; the modification is then W_m = E × (1 − F) × W_i. E in this equation amplifies the weight: both (1 − F) and W_i are floats smaller than 1, while the weight used by Improved WRR must be an integer, so an amplification factor is needed. In the experiment, E = 10, and the final result is converted to an integer.
In case all servers in the cluster are busy and the result works out to 0 or below, W_m is set to 1. This strategy guarantees that there is always an available server to allocate.
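Putting the amplification and the floor together, the weight update described above can be sketched as:

```python
def modified_weight(initial_weight, load_factor, E=10):
    """W_m = E * (1 - F) * W_i, truncated to an integer.

    E amplifies the product of two sub-unit floats into the integer
    range that Improved WRR expects; the floor of 1 keeps every live
    server allocatable even when the whole cluster is busy.
    """
    w = int(E * (1 - load_factor) * initial_weight)
    return max(w, 1)

# Moderately loaded server (F = 0.35) with initial weight 1.0 -> 6;
# nearly saturated server (F = 0.99) still keeps the minimum weight 1.
w_moderate = modified_weight(1.0, 0.35)
w_busy = modified_weight(1.0, 0.99)
```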

Re-allocate interval
The FTP cluster serves a large number of FTP clients. If a client requested a server allocation every time it established a connection, the pressure on the director would grow rapidly. In this load balancing scheme, an FTP client instead requests re-allocation at set intervals. The director returns the interval along with the allocated server address, and the interval is adjusted according to the balance condition. Assume the load factors of the servers in the cluster are [F_1, F_2, F_3] and the ceiling, floor, and initial intervals are [I_c, I_f, I_i]. The calculation logic of the interval is stated in pseudocode as follows. The overall logic is to increase the interval when the balance condition is good and decrease it otherwise. This balances the load in time while reducing the pressure on the director.
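The paper's pseudocode is not reproduced above; one concrete rule implementing "increase when balanced, decrease otherwise" measures balance as the spread of the load factors and doubles or halves the interval within [I_f, I_c]. The threshold and step values below are assumptions for illustration:

```python
def next_interval(interval, factors, floor, ceiling, threshold=0.1, step=2):
    """Adjust the client re-allocation interval from the spread of
    the servers' load factors [F_1, F_2, F_3].

    A small spread means the cluster is balanced, so clients may keep
    their current server longer; a large spread shortens the interval
    so re-allocation corrects the imbalance sooner. The result is
    clamped to [floor, ceiling] (the paper's [I_f, I_c]).
    """
    spread = max(factors) - min(factors)
    if spread <= threshold:
        interval *= step   # balanced: back off
    else:
        interval //= step  # imbalanced: re-allocate sooner
    return max(floor, min(interval, ceiling))

# Balanced cluster: the interval grows (here 30 s -> 60 s).
i_balanced = next_interval(30, [0.40, 0.42, 0.41], floor=10, ceiling=120)
# Imbalanced cluster: the interval shrinks (here 30 s -> 15 s).
i_imbalanced = next_interval(30, [0.20, 0.80, 0.35], floor=10, ceiling=120)
```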

Contrast experiment
To test the performance of the dynamic load balancing scheme, a contrast experiment against Linux Virtual Server (LVS) is conducted.

Dynamic load balancing
The experiment is conducted in a cluster of 7 servers: one works as the director, three work as FTP servers, and the rest work as FTP clients. All servers have the same hardware configuration. During the experiment, the load information uploaded by the collectors is also recorded to files. Compared with connection counts and response delay, this load information is a better basis for evaluating the balance effect: the gap between different servers' usage percentages represents it directly.
Each server acting as an FTP client launches multiple threads and starts an FTP client in every thread to simulate a large number of clients.

LVS
For comparison, the experimental environment is set up similarly to the one described above. LVS is already integrated into the Linux kernel, so only the administration program, ipvsadm, has to be run on the director [7]. The commonly used SED (Shortest Expected Delay) scheduling algorithm is chosen for the experiment.
File transfer must be performed in FTP PASV mode. Only in this mode does the director avoid forwarding all packets, so that its performance does not become the bottleneck of the cluster. Iptables rules are also added on every file server to resolve the address conflict in PASV mode [8].

Experimental scheme
The main experimental scheme is to test the balance effect under different levels of load pressure. Since the maximum upload speed of a single client is fixed by configuration, the number of clients is proportional to the load pressure, so the experiment's goal can be achieved by adjusting the number of concurrent clients.
Extra load is also placed on the servers by running additional tasks on them. Experimental observation shows that bandwidth usage is much higher than the usage of the other resources. Taking bandwidth usage as the main observation target, additional file transfers over another protocol are the most direct way to add extra load. In the experiment, multithreaded SCP clients are launched [9] on one of the servers running FTP clients, transferring files to one particular FTP server to supply the extra load.

Experimental result analysis
The line chart of bandwidth usage is chosen for analysis because bandwidth usage is much higher than that of the other hardware resources.
In the contrast experiment, each run lasts about 10 minutes, with the first 3-4 minutes used to test the FTP balance effect. After that, additional load is added and the load changes are observed. After tuning the weighting coefficients over several runs, the results show that the trend of the FTP servers' bandwidth occupancy on each platform is the same under different numbers of FTP clients; the results for 90 concurrent clients transmitting files are analyzed here. As the line charts show, in the first 3-4 minutes the bandwidth usage curves of the different servers under the dynamic LB scheme almost coincide. This trend is identical with LVS in SED mode, proving that the dynamic scheme achieves a balance effect similar to the traditional scheme.
After the extra load is added, the bandwidth usage trends diverge. Under LVS, the number of connections is the only indicator used to balance the cluster, so the server carrying the additional load is still allocated roughly the same number of FTP requests as the other servers [10]. As a consequence, the server bearing the additional load becomes much busier than the others.
Under the dynamic scheme proposed in this paper, however, the cluster self-adjusts back to a balanced state. Figure 4 shows that, about 1 minute after the extra load is added, the other two servers have taken over more of the FTP traffic and the bandwidth usage curves converge again.

Conclusion
In practice, clusters are not responsible for only one task; file servers sometimes also handle tasks such as file integration or file pre-processing. Traditional load balancing schemes such as LVS tend to balance only one particular aspect of the business and cannot adjust to the actual situation when a server's other workloads are heavy.
The dynamic load balancing scheme proposed in this paper is shown to be more aware of the cluster's actual load condition and effective at self-adjustment, achieving a better balance effect than LVS in practice. Because load information collection and weight calculation are lightweight tasks, the additional load they introduce is negligible. This scheme is therefore well suited to massive file transfer systems.