Task assignment policies in distributed server systems: A survey




Journal of Network and Computer Applications 34 (2011) 1123–1130



Review

Fouzi Semchedine, Louiza Bouallouche-Medjkoune, Djamil Aïssani

LAMOS, Laboratory of Modeling and Optimization of Systems and Doctoral School in Computer Science (Networking and Distributed Systems), University of Béjaïa, 06000, Algeria

Article info

Article history: Received 13 September 2010; received in revised form 7 December 2010; accepted 21 January 2011; available online 31 January 2011.

Abstract

Data intensive computing, in the Web environment, motivates the distributed design of Web server systems (Web clusters), because of their scalability and cost-effectiveness, over a single high-performance Web server. The task assignment policy, in such systems, concerns the manner of assigning the tasks that reach these systems (e.g. the intensive requests that reach distributed Web server systems) in order to minimize the response time and thus improve the performance. These tasks generally follow a "heavy-tailed" distribution, which has the property that a tiny fraction (about 3%) of large tasks makes up half (50%) of the total load. Several policies have been proposed in the literature to deal with the nature of this Web traffic. This paper presents a state of the art of the existing task assignment policies. We classify these policies into two classes: policies which assume that the task size is known a priori, and policies which assume that the task size is not known a priori (like TAGS, TAPTF and TAPTF-WC). The first class regroups policies which consider no knowledge of load information at the servers when assigning the incoming tasks, known as static policies (like Random, Round Robin, etc.), and policies, known as dynamic policies (like the Central Queue Policy, Least Loaded First "LLF", etc.), which use some load information (e.g. the processing capacity, the queue load, etc.) to make the assignment decision. © 2011 Elsevier Ltd. All rights reserved.

Keywords: Data intensive computing; Task assignment; Distributed server systems; Heavy-tailed distribution; Load balancing; Load sharing; Performance

Contents

1. Introduction
2. Challenges to the design of the distributed server systems
   2.1. Task size distribution (heavy-tailed distribution)
   2.2. Performance metrics
3. Policies for task assignment in distributed server systems
   3.1. Policies assuming a priori knowledge of task size
        3.1.1. Static policies
        3.1.2. Dynamic policies
   3.2. Policies assuming no a priori knowledge of task size
4. Conclusion
Acknowledgments
References

1. Introduction

The explosion of the Internet and the World Wide Web has sharply increased the amount of online information and services available to Web users. The growth of information and service demands has placed dramatic pressure on the Internet infrastructure through the need for advanced Web server systems capable of serving a large number of Web requests. Hence, distributed designs of Web server systems (Web clusters) have appeared because of their scalability and cost-effectiveness, instead of a single high-performance Web server such as a mainframe.

⁎ Corresponding author at: LAMOS Laboratory, University of Béjaïa, 06000, Algeria. Tel.: +213 550 493 611; fax: +213 34 21 51 88. E-mail addresses: [email protected], [email protected] (F. Semchedine).

1084-8045/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.jnca.2011.01.011

Figure 1 illustrates a typical architecture of a distributed server model. In this architecture, one virtual IP address (VIP), which is the IP address of the dispatcher, is assigned to the Web cluster. The dispatcher identifies each server in the cluster through a private address and distributes the load among the Web servers according to the mechanism used to route the requests. The selected Web server then sends the response packets to the client. The tasks (we refer to tasks as requests for Web files) arrive at the distributed server system following a Poisson process with rate λ (Nozaki and Ress, 1978; Williams et al., 2005) and must be



The paper is organized as follows: Section 2 presents the main challenges in the design of distributed server systems. A classification and a detailed description of the existing task assignment policies in distributed server systems are given in Section 3, and we conclude the paper in Section 4.

2. Challenges to the design of the distributed server systems

Fig. 1. Distributed server model.

assigned, by the dispatcher, to one of the servers to be processed. Task assignment, in such a system, focuses on the policy that assigns these tasks so as to minimize their response time and thus improve the performance (Crovella et al., 1998; Harchol-Balter, 2002; Gilly et al., 2008). The waiting time is defined as the delay between a task's arrival and the beginning of its service. The slowdown of a task is its waiting time divided by its processing time. This metric is important because minimizing it reduces the impact of the variation in task sizes, as shown in Harchol-Balter (2002), Riska et al. (2002), Zhou et al. (2004) and Wei et al. (2005). The flow time is defined as the sum of the waiting time and the processing time; it is also known as the time from task arrival to task departure. The response time of a task consists of the flow time plus the time taken, by the policy, to assign the task. So, improving the response time means minimizing the waiting time, the slowdown and the flow time.

Recent observations show that the task size (service demand), in distributed server systems, often exhibits a heavy-tailed distribution, where a small fraction (about 3%) of large tasks makes up half (50%) of the total load (Arlitt and Williamson, 1996; Crovella and Bestavros, 1997; Crovella et al., 1999; Williams et al., 2005). This property, known as the heavy-tailed property, raises additional problems: the large tasks make the load very difficult to balance among the servers, and the small tasks are delayed by the large tasks when they share the same queue (since the service time of a task is proportional to its size, small tasks have much shorter service times than large ones, and it is not fair that they wait so long to be processed).

This paper attempts to briefly and comprehensively classify and discuss the large body of published policies for task assignment in distributed server systems.
We classify these policies based on whether the task size can be known in advance. In fact, for Web traffic, the sizes of static Web files can be known in advance; however, the sizes of dynamic Web files cannot (e.g. cgi programs, jsp, servlets, etc.). Thus, the first class is the class of policies which assume that the task size is known a priori. This class regroups static policies, which consider no knowledge of load information at the servers when assigning the tasks (like Random, Round Robin, etc.), and dynamic policies, which use some load information (e.g. the processing capacity of the server, the queue load, etc.) when selecting the server that must process the task (like the Central Queue Policy, Least Loaded First "LLF", etc.). The second class is the class of policies which assume that the task size is not known a priori (like TAGS, TAPTF and TAPTF-WC). These policies assign a task to a first server, which processes it up to a time limit; if the task has not finished by then, it is stopped and restarted at the next server. We also analyze the efficiency and limitations of the different policies with the aim of identifying the characteristics of each policy and their impact on performance.
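As a concrete illustration of the metrics defined above (waiting time, slowdown and flow time), the following minimal sketch computes them for a single FCFS queue; the function name, arrival times and task sizes are illustrative assumptions, not taken from the surveyed papers:

```python
def fcfs_metrics(arrivals, sizes):
    """Waiting time, slowdown and flow time for each task in one FCFS queue.

    arrivals: non-decreasing task arrival times; sizes: processing times.
    """
    free_at = 0.0                   # instant at which the server next frees up
    metrics = []
    for arrive, size in zip(arrivals, sizes):
        start = max(free_at, arrive)
        wait = start - arrive       # delay between arrival and service start
        slowdown = wait / size      # waiting time divided by processing time
        flow = wait + size          # time from task arrival to task departure
        metrics.append((wait, slowdown, flow))
        free_at = start + size
    return metrics

# A large task (size 10) arriving first delays the two unit-size tasks queued
# behind it: each small task ends up with a slowdown of 9, against 0 for the
# large one, which is exactly the unfairness discussed in the text.
print(fcfs_metrics([0.0, 1.0, 2.0], [10.0, 1.0, 1.0]))
```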

The distributed server systems are recently motivated by the need for powerful systems that can satisfy a large number of client requests. Web researchers have developed approaches to tackle two main challenges: the distribution of Web traffic and the response time of the systems. Many of the core challenges lie at the boundary between the nature of the Web traffic and the performance metrics, including the following:

2.1. Task size distribution (heavy-tailed distribution)

Heavy-tailed distributions have been observed in many natural phenomena, such as the distribution of people around the world, where most places are empty or sparsely populated while a small number of places are densely populated. Recently, this distribution has been observed in computing systems. The authors of Arlitt and Williamson (1996), Broberg et al. (2004), Crovella et al. (1999) and Williams et al. (2005) have shown that task sizes exhibit this distribution, where a very small fraction (<3%) of the very largest tasks makes up a large fraction (half) of the total workload. This property is known as the heavy-tailed property. Contrast this with the exponential distribution, where the largest 3% of tasks make up only about 5% of the total workload. The heavy-tailed property becomes more pronounced as the variability increases.

2.2. Performance metrics

The aim in designing an optimal distributed Web server system is to minimize the response time. Web researchers consider the response time of the system to be the sum of the flow time and the time taken by the policy to assign a task. Thus, to evaluate the performance of such systems, the waiting time, the slowdown and the flow time are mainly considered.

Waiting time: The waiting time is the time from when the task arrives at the system until it begins its service. In other terms, the waiting time of a task is the total processing time of all the tasks that are ahead of it in the queue.
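The mass concentration described in Section 2.1 can be checked numerically. The sketch below samples task sizes from a Bounded Pareto distribution, a standard heavy-tailed workload model in this literature; the parameter values (alpha = 1.1 and the bounds k and p) are illustrative assumptions, not values taken from the cited studies:

```python
import random

def bounded_pareto(alpha, k, p):
    # Inverse-transform sample from BoundedPareto(alpha, k, p), whose CDF is
    # F(x) = (1 - (k/x)**alpha) / (1 - (k/p)**alpha) for k <= x <= p.
    u = random.random()
    return (k**-alpha - u * (k**-alpha - p**-alpha)) ** (-1.0 / alpha)

random.seed(1)
sizes = sorted((bounded_pareto(1.1, 512.0, 1e8) for _ in range(200_000)),
               reverse=True)
top = sizes[: len(sizes) * 3 // 100]        # the largest 3% of the tasks
print(f"load carried by the largest 3% of tasks: {sum(top) / sum(sizes):.0%}")
```

Running this shows the largest few percent of tasks carrying a disproportionate share of the total work, whereas an exponential sample concentrates almost no extra mass in its largest tasks.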
Reducing the waiting time improves the latency perceived by clients.

Slowdown: The slowdown of a task is its waiting time divided by its processing time. The slowdown has been studied in several contexts in the past: load sharing in distributed server systems (Gilly et al., 2008; Nozaki and Ress, 1978), QoS on the Internet (Zhou et al., 2004; Wei et al., 2005), etc. This metric is important because minimizing it tends to reduce the disparity between small and large tasks: small tasks are not queued behind large tasks and are therefore not unduly delayed.

Flow time: The flow time is defined as the sum of the waiting time and the processing time; it is also known as the time from task arrival to task departure. Minimizing the flow time directly improves the mean response time of the system.

3. Policies for task assignment in distributed server systems

Figure 2 illustrates our classification of policies for task assignment in distributed server systems. Two classes of policies are considered: policies assuming no a priori knowledge of task sizes


Fig. 2. Task assignment policies classification.

and policies assuming a priori knowledge of task sizes. Policies assuming no a priori knowledge of task sizes are aimed at dynamic tasks (the case of dynamic content such as cgi, jsp, servlets, etc.), whereas policies assuming a priori knowledge of task sizes are aimed at static tasks. In practice, the dispatcher can inspect the service time of each request (i.e. the task size), as shown in CiscoSystems (1998), Fielding et al. (1999) and Hunt et al. (1997). On the other hand, focusing on static requests, studies from 1997 to 2001 suggest that, while Web sites increasingly generate dynamic content, requests at Web sites are still dominated by static requests (Feldmann, 1999; Krishnamurthy and Rexford, 2001; Manley and Seltzer, 1997). In recent logs (from 2004) from proxy servers, 67-73% of requests are for static content (IRCacheHome, 2004). Serving static requests quickly is the focus of many companies, e.g. Akamai Technologies, CDNetworks, Certeon, Crescendo Networks, F5 Networks, Keynote Systems, Radware and Riverbed. Moreover, popular Web sites often convert their dynamic content into static content when the Web site is overloaded, which makes static requests especially important (LeFebvre, 2002).

3.1. Policies assuming a priori knowledge of task size

This class regroups the static and the dynamic policies. The static policies (such as Random, Round Robin, SITA-E, etc.) assume no knowledge of load information at the servers when assigning the tasks, whereas the dynamic policies use some load information (e.g. the processing capacity, the queue length, etc.) when choosing the server that will process the task (such as the Central Queue Policy, Least Loaded First "LLF", etc.). This class includes:

3.1.1. Static policies

These policies assume no knowledge of load information at the servers when they assign the tasks.
The tasks are assigned randomly (as in Random), in a cyclical fashion (as in Round Robin) or based on size ranges (as in SITA-E, SITA-V, etc.).

Random (uniform random): Random is a naïve policy in which tasks are assigned randomly to the servers. Each server i among N receives a task with probability 1/N.

Round Robin: In Round Robin, tasks are assigned to the servers in a cyclical fashion: task i is assigned to server i mod N. This policy equalizes the expected number of tasks at each server.

Random and Round Robin are simple to implement, but their limitation is that their performance deteriorates when the task size distribution is heavy-tailed, as shown by Harchol-Balter et al. (1999). These policies aim to equalize the number of tasks at each server without considering the task size variation


(i.e. these policies perform well when the task size distribution is exponential, since all the tasks then have nearly the same size).

Size-based policies: In these policies, size ranges are assigned to the servers, which allows the dispatcher to assign each task based on its size. This kind of policy includes:

SITA-E (Size Interval Task Assignment with Equal load): The key idea of SITA-E is to ensure that the total work assigned to each server in a cluster is equal. Each server has a size interval and accepts only the tasks whose size falls into this interval. These intervals are chosen so that all the servers receive the same load (Harchol-Balter et al., 1999). By using size intervals, SITA-E is shown to be efficient at reducing the variance in task sizes compared to Random, Round Robin and dynamic policies (which use knowledge of the servers' state to assign the tasks). In fact, small tasks are queued with other small tasks falling in the same size range. Thus, large tasks are never queued with small tasks and, consequently, small tasks are never delayed. SITA-E is easy to implement and, unlike dynamic policies, it requires no information about the load of the servers.

One of the limitations of SITA-E is its static size intervals, which becomes a problem when the task size distribution changes. In Harchol-Balter et al. (1999), simulation results show that SITA-E performs poorly when the variability increases. This is due to the high variability in task sizes: a large number of small tasks is assigned to the first server (because of its size interval), which becomes overloaded, resulting in bad performance for the whole system.

SITA-V (Size Interval Task Assignment with Variable load): SITA-V (Crovella et al., 1998) is based on the heavy-tailed property, in which there is a large number of small tasks and a small number of large tasks.
In SITA-V, when the tasks arrive, the dispatcher inspects them and assigns all the small tasks to the servers loaded under the average system load, and all the large tasks to the servers loaded over the average system load. The main idea of SITA-V is to reduce the mean slowdown by having all the small tasks processed on servers with a low load. This accelerates the processing of the small tasks, so their mean slowdown is reduced. Note that this technique does not help when the distribution is exponential: in that case, the tasks have nearly the same size, so a server that holds many tasks is heavily loaded while a server that holds few tasks is lightly loaded.

In Crovella et al. (1998), simulation results show that SITA-V, compared to EQ-LOAD (a special case of SITA-V where the load on the servers is equal), performs poorly when the variability is between 0.9 and 1. This is due to the high variability in task sizes: the sizes of the "small" tasks increase, and SITA-V has difficulty separating the small tasks from the large ones. SITA-V is also shown to perform poorly when the load is high, because it then has less flexibility for shifting tasks around. On the other hand, SITA-V does not fully utilize the processing capacity of the servers when no very large tasks reach the system (i.e. the servers loaded over the average system load remain idle). This policy is shown by Crovella et al. (1998) to be attractive for geographically distributed Web servers (i.e. DNS-based architectures) because it has a simple implementation and does not require communication between the servers and the dispatcher.

SITA-U (Size Interval Task Assignment with Unbalanced load): SITA-U (Schroeder and Harchol-Balter, 2004) uses cutoffs that the dispatcher considers when assigning the tasks.
So, each server has a cutoff and accepts only the tasks that fall within its cutoff. These cutoffs are chosen so as to minimize the mean slowdown.



In Schroeder and Harchol-Balter (2004), simulation results for the mean slowdown show an important improvement of SITA-U over SITA-E, by orders of magnitude, under high variability of task sizes. In fact, for SITA-E, when the variability in task sizes grows, a large number of small tasks is assigned to the first server due to its size interval; this server becomes overloaded, which results in bad performance for the system. SITA-U, by unbalancing the load, tries to reduce the number of tasks redirected to the first server, which improves the slowdown.

A recent analysis (Harchol-Balter et al., 2009a, 2009b) proved that when the variability in task sizes grows, the Size Interval Task Assignment (SITA) mechanism performs worse than the naïve Least Work Remaining mechanism (see Section 3.1.2) under different classes of distributions (Modal, Hyperexponential, Bounded Pareto, and Pareto) and under a wide variety of system loads. This is due, as mentioned above, to the use of size ranges: when the variability increases, most tasks are small, which overloads the servers with the small size ranges while those with large size ranges remain idle.

EQUILOAD: EQUILOAD (Ciardo et al., 2001) was inspired by previous work on size-based policies that use size ranges to assign the tasks to the servers (Harchol-Balter et al., 1999). EQUILOAD partitions the task sizes into N intervals, [s_0 = 0, s_1), [s_1, s_2), ..., [s_{N-1}, s_N = ∞), in which server i is responsible for processing the tasks whose size falls into the interval [s_{i-1}, s_i). The characteristic of EQUILOAD is that it uses a double allocation policy. First, the dispatcher assigns each incoming task very quickly to one of the N servers by using a simple policy such as Random or Round Robin.
Then, when server i receives a task from the dispatcher, it inspects its size s: if s_{i-1} < s < s_i, it puts the task in its queue; otherwise it reallocates the task to the server j that satisfies s_{j-1} < s < s_j. In Ciardo et al. (2001), performance tests show an important improvement of EQUILOAD over the SRPT (Shortest Remaining Processing Time) and Join Shortest Queue policies, but one of the limitations of this policy is the static partition of the task size ranges allocated to the servers, which becomes a problem when the workload changes.

ADAPTLOAD: ADAPTLOAD (Riska et al., 2002) is an online version of EQUILOAD that addresses the limitation of the static partition of the task size ranges. ADAPTLOAD provides an online mechanism to adapt the ranges when the workload changes. The basic idea of ADAPTLOAD is that the boundaries of the size intervals are adjusted based on the last K tasks that arrived at the system. So, if the workload changes, the ranges also change.

A limitation of the size-based policies, with respect to the use of size ranges, is that the servers' resources are not fully utilized, mostly in the case where many consecutive small tasks arrive with no large tasks. Then, the servers with high size ranges remain idle.
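The size-interval mechanism shared by the SITA policies and EQUILOAD can be sketched in a few lines. Computing the boundaries empirically from a sample of past task sizes, as below, is an illustrative simplification (SITA-E derives them analytically from the size distribution), and all names here are this sketch's own:

```python
import bisect

def equal_load_cutoffs(sample_sizes, n_servers):
    # Choose boundaries s_1 <= ... <= s_(N-1) so that each server's size
    # interval carries roughly the same total work over the sample.
    sizes = sorted(sample_sizes)
    target = sum(sizes) / n_servers
    cutoffs, acc = [], 0.0
    for s in sizes:
        acc += s
        if acc >= target * (len(cutoffs) + 1) and len(cutoffs) < n_servers - 1:
            cutoffs.append(s)
    return cutoffs

def dispatch(task_size, cutoffs):
    # Server i serves the tasks whose size falls in (s_i, s_(i+1)],
    # with s_0 = 0 and s_N = infinity.
    return bisect.bisect_left(cutoffs, task_size)

# Nine unit-size tasks carry as much total work as one task of size 9, so with
# two servers the boundary falls at size 1: the small tasks go to server 0,
# the large task to server 1, and each server receives a total load of 9.
cutoffs = equal_load_cutoffs([1] * 9 + [9], 2)
print(cutoffs, dispatch(1, cutoffs), dispatch(9, cutoffs))
```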

3.1.2. Dynamic policies

Unlike static policies, the dynamic policies attempt to balance the load based on some load information at the servers (such as the processing capacity, the queue length, etc.). We consider:

LLF (Least Loaded First): Least Loaded First is a mechanism which assigns each task to a server based on its load. Shortest Queue and Least Work Remaining are examples of this mechanism. In Shortest Queue, the dispatcher assigns incoming tasks based on the queue contents: the server that has the least number of tasks in its queue is designated to process the next task. Least Work Remaining uses the remaining work at the

server to dispatch the tasks: it selects the server that has the least remaining work in terms of task sizes. In Harchol-Balter et al. (1999), simulation results show that basing the assignment decision only on the load of the servers does not improve the performance, since the most important goal is to reduce the variability in task sizes when the workload follows a heavy-tailed distribution (these mechanisms perform well when the distribution is exponential; Harchol-Balter et al., 1999; Winston, 1977). Harchol-Balter et al. (1999) have shown that the Least Loaded First policy is equivalent to the Central Queue policy.

Central Queue policies: In these policies, the dispatcher is assumed to have an FCFS queue in which the tasks are held until one of the servers becomes free. Although this policy performs well under the exponential distribution, where there is low variability, the authors of Harchol-Balter et al. (1999) and Schroeder and Harchol-Balter (2004) have shown poor performance under a realistic workload, where there is high variability in task sizes (i.e. a heavy-tailed distribution).

Cycle Stealing with Central Queue: Cycle Stealing with Central Queue (Harchol-Balter et al., 2003) is a variant of the Central Queue policy. The main idea of this policy is that the servers are divided into two distinct groups: the first, "the group of short servers", is dedicated to processing the small tasks, while the second, "the group of long servers", processes the large tasks. Like the Central Queue policy, the tasks are held in the queue of the dispatcher. When one of the short servers becomes free, it picks the first small task in the queue for processing. Similarly, when one of the long servers becomes free, it picks the first large task in the queue. Nevertheless, if no large task reaches the system, the long servers process the small tasks.
In a similar manner, when there are no small tasks, the short servers process the large tasks. Hence, the long servers become short servers and vice versa.

SRPT (Shortest Remaining Processing Time): SRPT is a mechanism used for task assignment in distributed server systems (Bansal and Harchol-Balter, 2001). The idea is to give priority to the tasks that are quick or do not have high processing requirements. So, when a server is processing a large task and a small task arrives, the large task is pre-empted and the server switches to processing the small one. This mechanism has some limitations. The first is that a large task is always pre-empted when small ones arrive, which generates additional delay in the processing of the large tasks. The second is the problem of starvation of large tasks: the large tasks are penalized by the favor given to the small tasks, especially when the latter are numerous. In Bansal and Harchol-Balter (2001), the performance testing was carried out under the assumption that all the requests are for small Web files. In addition, an implementation of SRPT in the kernel of the Apache Web server was performed, and experiments were carried out in LAN and WAN environments with static requests (Harchol-Balter et al., 2003). Under the hypothesis that all the requests are static and for small Web files, SRPT is shown to perform better than the standard FAIR scheduling (a time-shared mechanism in which a Web server proportions its resources fairly among the requests needing service) implemented on Linux; but this is not the general case in Web server environments.

Global Class Policy: Awerbuch et al. (2002) have proposed another algorithm for task assignment in distributed server systems. The basic idea of this algorithm is to create classes of tasks. The algorithm creates these classes based on the remaining processing time of the tasks.
Then, if the remaining processing time of a task is in the interval [2^k, 2^(k+1)), the task will be in


the class k. Therefore, when a task arrives at the system, the algorithm searches for a server that is idle or at which the new task has a lower class than that of the task currently being processed by this server. The new task is then pushed into the queue of this server, which begins its processing. Each time a server finishes processing a task, the algorithm compares the class of the task placed in the pool of the dispatcher with the class of the task at the top of the server's queue; the task with the minimum class is processed. One of the limitations of this policy is the problem of starvation of large tasks. Since there is class-based scheduling between the small tasks and the large tasks, the small tasks always have a lower class than the large ones. This means that the large tasks always have the lowest priority to be processed.

Task Grouping Policy: The Task Grouping Policy (Tan and Tari, 2002) aims to reach the same optimal flow time at each server queue in order to reduce the system response time. When there is a group of tasks in the queue of the dispatcher, it takes a subset of servers and tries to assign these tasks to this subset of servers based on their load. The algorithm calculates the load of each server in the subset and, when it finds a group of tasks whose total size is closest to the computed load of one server, it assigns the group of tasks to this server. Simulation results in Tan and Tari (2002) show that the Task Grouping Policy performs poorly when the variability increases. In that case, the algorithm can group a large number of small tasks in the same queue. This increases the waiting time and the slowdown of the small tasks, which is not fair, since the performance metrics of small tasks should stay low in proportion to their small sizes.
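The class computation of the Global Class Policy described above reduces to taking the floor of the base-2 logarithm of the remaining processing time. A minimal sketch (the function name is this sketch's own):

```python
import math

def task_class(remaining_time):
    # A task with remaining processing time in [2**k, 2**(k+1)) is in class k,
    # i.e. k = floor(log2(remaining_time)); valid for remaining_time >= 1.
    return math.floor(math.log2(remaining_time))

# Small tasks always fall into lower classes than large ones, which is the
# source of the starvation problem noted above.
print([task_class(t) for t in (1, 3, 4, 1000)])   # prints [0, 1, 2, 9]
```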
LFF (Least Flow-time First): The LFF policy assigns each task to the fittest server, i.e. the one with a lighter load and a higher processing capacity. In other terms, LFF chooses the server with the minimum flow time. Some variants of LFF have been proposed (Fu and Tari, 2003; Fu et al., 2003; Tari et al., 2005) to address two limitations of LLF: (1) LLF does not consider the order of processing, and (2) when it assigns a task, it does not consider the processing capacity of the servers.

The authors of Fu and Tari (2003) and Fu et al. (2003) proposed LFF-PRIORITY, an extension of LFF that deals with task priorities. LFF-PRIORITY considers two kinds of task priorities: the task size priority and the task deadline priority (the deadline being the time by which the task should finish its processing). In LFF-PRIORITY, the dispatcher collects the load information of the servers and calculates the flow time of the tasks at each server. Each server receiving a task from the dispatcher places it in the dedicated queue based on its deadline and its size. So, the tasks are put in priority queues and processed in priority order. The priority of a task is defined as the sum of its size priority and its deadline priority. In Fu and Tari (2003) and Fu et al. (2003), performance tests show that LFF-PRIORITY performs better, under different system load scenarios (0.4, 0.6, 0.8, 1.0 and 1.3), than the Random, SITA-E and LLF policies.

The last variant of LFF, called LFF-SIZE, was proposed by Tari et al. (2005). LFF-SIZE dynamically assigns each task to the fittest server and uses a multi-section queue to reduce the delay of tasks caused by the size variation. The section of large tasks is separated from the section of small tasks, and the latter has the higher priority to be processed. This reduces the delay caused by the large tasks. In Tari et al.
(2005), performance tests show an important improvement of LFF-SIZE under a high system load over existing policies such as Random, LLF and SITA-Large2Low (a SITA-E based policy that assigns the large tasks to the first server, which has a low processing capacity).


One limitation of the LFF variants is the static partition of the size ranges in the priority and multi-section queues. This has a bad impact when the task size distribution changes, mostly in the case of multi-service systems which offer different services, each of which follows a heavy-tailed distribution. DAS (Deferred Assignment Scheduling): DAS (Ungureanu et al., 2006), like the Shortest Queue mechanism, considers that the dispatcher must hold the incoming tasks before sending them to the servers. The dispatcher uses a buffer in which it keeps the tasks that reach the system, and a queue in which it records any server whose number of in-process tasks has dropped below a threshold parameter (used by the dispatcher to decide when a task must be assigned to the servers). Thus, the dispatcher monitors the number of tasks being processed by each server. When the number of tasks processed by a server drops below the threshold and the buffer is not empty, the dispatcher assigns to this server the smallest task in the buffer. Otherwise (the buffer is empty), the dispatcher places the designated server in the queue. When a new task arrives and the queue is not empty, the dispatcher directly assigns it to the server at the head of the queue. In Ungureanu et al. (2006), experimental performance tests show that DAS outperforms Round Robin and slightly outperforms Join Shortest Queue (JSQ) (i.e. the Shortest Queue policy), thanks to the priority given to the small tasks. As mentioned above, when priority is given to the small tasks, the large ones suffer from the problem of starvation. The authors in Ungureanu et al. (2006) address this problem by introducing a mechanism that estimates the service of a task by normalizing its size by its waiting time, which gives the large tasks a chance to be chosen by the dispatcher.
This mechanism seems inefficient with respect to the nature of the traffic, characterized by a heavy-tailed distribution in which the vast majority of tasks are small. Thus, the small tasks will almost always have priority to be processed. On the other hand, choosing the optimal threshold parameter is one of the limitations of DAS: the smaller the threshold, the more severe the bottleneck problem at the dispatcher. The common limitation of these policies is the bottleneck problem. When tasks reach the system, the dispatcher groups them in its queue to construct the task set or to assign them to the servers. Hence, the dispatcher becomes a bottleneck. Dynamic policies also need more feedback communications than static policies. Thus, dynamic policies are not attractive in situations where the dispatcher might be located far away from the servers, e.g. across the Internet.
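The deferred-assignment loop described above can be sketched as follows. This is an illustrative sketch only: the class names, method signatures and threshold semantics are assumptions for the purpose of illustration, not taken from the original DAS paper.

```python
import heapq

class Server:
    """Trivial stand-in for a back-end server (illustration only)."""
    def __init__(self):
        self.tasks = []
    def assign(self, task):
        self.tasks.append(task)

class DASDispatcher:
    def __init__(self, threshold):
        self.threshold = threshold
        self.buffer = []  # min-heap keyed on size: smallest buffered task first
        self.ready = []   # FIFO of servers that reported light load while the buffer was empty

    def on_task_arrival(self, task, size):
        if self.ready:
            # a lightly loaded server is already waiting: assign the task directly
            self.ready.pop(0).assign(task)
        else:
            heapq.heappush(self.buffer, (size, task))  # defer the task

    def on_server_report(self, server, in_process):
        # the dispatcher monitors the number of tasks each server is processing
        if in_process < self.threshold:
            if self.buffer:
                _, task = heapq.heappop(self.buffer)  # smallest task first
                server.assign(task)
            else:
                self.ready.append(server)  # remember the server for the next arrival

dispatcher = DASDispatcher(threshold=2)
server = Server()
dispatcher.on_task_arrival("big", size=10)
dispatcher.on_task_arrival("small", size=1)
dispatcher.on_server_report(server, in_process=1)  # below threshold: gets "small"
```

The min-heap makes the small-task preference explicit: the "small" task overtakes the earlier-arriving "big" one as soon as a server drops below the threshold, which is precisely the behavior that causes the starvation concern discussed above.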

3.2. Policies assuming no a priori knowledge of task size

These kinds of policies consider that the task size is not known a priori, based on the assumption that the dispatcher cannot compute the size of a Web file in advance, as is the case for dynamic Web content: CGI scripts, JSP, servlets, etc. This class includes: TAGS (Task Assignment by Guessing Size): The TAGS policy (Harchol-Balter, 2002) associates a time limit with each server. When a task arrives, the dispatcher assigns it to the first server, which processes it until the end of the time limit. If the task has not completed its processing, it is killed and restarted at the next server. TAGS reduces the variation of task sizes, compared to the Random and Least-Work-Remaining policies, by using a time limit at each server, and it deliberately unbalances the load by running the small tasks on the first servers and the large tasks on the last servers. In Harchol-Balter (2002), analytical results show that the performance of TAGS is poor under high load. In fact, when the load is high, TAGS struggles to unbalance the load among the servers because the entire load passes through the first server, which risks being overloaded. Another limitation of TAGS is that it penalizes tasks by killing and restarting them, specifically the large tasks, since they are the ones that are, generally, restarted. TAPTF (Task Assignment based on Prioritizing Traffic Flows): The TAPTF policy (Broberg et al., 2004) was proposed to address two limitations of TAGS: (1) reducing the variation of the tasks that share the same queue, by grouping together tasks that have nearly the same size and, (2) reducing the excess load on the back-end servers caused by the tasks whose processing time does not fall within the time limit and which are restarted at the next server. TAPTF uses two kinds of queues, with cutoffs, for each server. The first queue, called the ordinary queue, receives tasks directly from the dispatcher, whereas the second queue, called the restart queue, receives tasks from the server just above it. When a task reaches the system, the dispatcher assigns it to one server based on some probability and places it in the ordinary queue of that server, which processes it until the end of the time limit. If the task has not completed its processing, it is stopped and moved to the restart queue of the next server to be restarted. In Broberg et al. (2004), analytical comparisons between TAPTF, TAGS and Random were performed. The results show that TAPTF outperforms TAGS in the case of low variability and high system load. TAPTF also reduces the migration problem. TAPTF-WC (Task Assignment based on Prioritizing Traffic Flows with Work-Conserving migration): A variant of TAPTF, called TAPTF-WC, was proposed in Broberg et al. (2006). TAPTF-WC addresses the problem of restarting tasks from scratch at the next server, which adds delay to the processing of the tasks.
Like TAPTF, TAPTF-WC uses two kinds of queues, with cutoffs, for each server. The first queue, called the ordinary queue, receives tasks directly from the dispatcher, whereas the second queue, called the restart queue, receives tasks from the server above it. In TAPTF-WC, when a task reaches the system, the dispatcher assigns it to one server based on some probability and places it in the ordinary queue of that server, which processes it until the end of the time limit. If the task has not completed its processing, it is moved to the restart queue of the next server and, unlike in TAPTF, the computational work already done at the server is conserved. In Broberg et al. (2006), analytical comparisons between TAPTF, TAGS, TAGS-WC and TAPTF-WC were performed. The results show that TAPTF-WC outperforms TAPTF, TAGS and TAGS-WC over a wide range of scenarios and reduces the migration time caused by the stop/restart process. MTTMEL (Multi-Tier Task assignment policy with Minimum Excess Load): MTTMEL (Jayasinghe et al., 2010) consists of a dispatcher and multiple tiers of servers. Each tier has one or more servers associated with the tier's time limit, so the servers of the same tier process tasks until the end of that time limit. In MTTMEL, when a task arrives, the dispatcher assigns it to one server of the first tier, which processes it until the end of the designated time limit. If the task has not completed its processing, it is destroyed and restarted at one server of the next tier. This process continues until the task finishes its processing and quits the system. MTTMEL assumes that the number of servers in one tier is greater than the number of servers in the next tier. In Jayasinghe et al. (2010), a performance analysis between MTTMEL, TAGS and Random shows that, with a high number of servers, MTTMEL outperforms TAGS and Random in the case of low variability and high system load. However, when the number of servers is reduced, the performance of MTTMEL deteriorates under empirical values of the variability. This is because the number of tiers is also reduced and, consequently, the policy struggles to balance the load among the tiers.
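The cost difference between the kill-and-restart approach (TAGS, MTTMEL) and work-conserving migration (TAGS-WC, TAPTF-WC) can be made concrete with a small sketch. The per-server time limits, the unit service rate (so time equals work) and the function name are illustrative assumptions, not values from the cited papers.

```python
def busy_time(task_size, limits, work_conserving):
    """Total server busy time spent on one task pushed down the chain of
    servers; `limits` holds the per-server time limits, and a unit service
    rate is assumed so that processing time equals task size."""
    total, done = 0.0, 0.0
    for limit in limits:
        remaining = task_size - (done if work_conserving else 0.0)
        if remaining <= limit:
            return total + remaining  # the task finishes at this server
        total += limit                # the time limit expires
        if work_conserving:
            done += limit             # progress is kept (TAPTF-WC style)
        # otherwise the task is killed and restarted from scratch (TAGS style)
    raise RuntimeError("task exceeds the time limit of the last server")
```

With limits [2, 4, 8], a task of size 7 consumes 2 + 4 + 7 = 13 time units of server capacity under kill-and-restart, but only 7 when the work already done is conserved; small tasks that finish at the first server are unaffected, which is why the restart penalty falls chiefly on the large tasks.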

4. Conclusion

Distributed server systems offer benefits in terms of scalability and cost effectiveness. However, data intensive computing in such systems hinges on choosing an efficient task assignment policy in order to balance the load of the tasks that reach the system and improve the response time. The nature of the traffic of tasks (i.e. requests for Web files), which follows a heavy-tailed distribution and makes the load difficult to share among the servers, has motivated the proposal of a number of task assignment policies in the literature to deal with realistic Web traffic. In this paper, we have explored and analyzed these policies and classified them according to the basis of their algorithms. We showed that these policies can be classified into two classes, based on whether or not the task size can be known in advance. Policies which assume that the task size is known a priori regroup the static and the dynamic policies. The static policies proved simple to implement. However, Random and Round Robin perform poorly when the distribution is not exponential and, for the size-based policies, the use of static size ranges means that the servers will not be fully utilized and that performance deteriorates when the distribution changes. On the other hand, the dynamic policies, by using some load information from the servers to assign the tasks (such as the processing capacity, the queue length, etc.), attempt to balance the load dynamically and address the limitations of the static policies. However, the dynamic policies require more feedback communication between the servers and the dispatcher than the static policies, which matters where the dispatcher might be located far away from the servers, e.g. across the Internet.
Although the policies which assume that the task size is not known a priori deal with realistic Web traffic by addressing the problem of dynamic Web content, they incur additional delay when killing and restarting tasks, which affects the response time of the system. The majority of task assignment policies suffer under realistic Web traffic (heavy-tailed distribution) due to the property of this distribution that a tiny fraction of large tasks makes up half of the total load. This property raises additional problems: the large tasks make the load difficult to balance among the servers, and the small tasks are delayed by the large ones when they share the same queue. These problems are most pronounced under a high traffic demand and a high variability of task sizes. One research path is in the direction of combining the technologies that are still seen as efficient to improve the performance of distributed server systems. A step further is to propose mechanisms and policies that guarantee the following main points:

• Reduce the variation of tasks by using mechanisms which ensure that the small tasks are always processed ahead of the large tasks and which reduce the delay caused by task size variation. In addition, an efficient task assignment policy must be suitable for multi-service distributed server systems, reducing the variation while allowing fair processing of the different services.
• Balance the large tasks in order to avoid overloading the system and to accelerate their processing, mainly for the Web sites that are based on this kind of task, such as large media outlets (CNN, New York Times, Reuters, Libération, …), large computer companies (Microsoft, Apple, Adobe, Logitech, …), e-commerce Web sites (Voyages-sncf.com, Galerieslafayette.com, …), or television networks (Fox, MTV, France2, …).
• As scale-up architectures (single node) keep growing (hardware scale-up and software scale-up), an efficient task assignment policy must consider the computing and storage improvements of the servers in order to minimize the number of servers used in the cluster. It must also integrate technologies that can support dynamically evolving systems (an important issue in distributed systems that undergo changes without being shut down) by using software engineering methodologies such as Object-Oriented Programming (OOP) (polymorphism, Servlets, Beans, plug-ins, dynamic loading in Java, …).
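The heavy-tailed property that motivates these points can be illustrated with a short simulation. This is a rough sketch: the Pareto tail exponent and the sample size are illustrative choices, not values fitted to real Web traces, and the exact share obtained varies with both.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible
alpha = 1.1     # tail exponent in the range typically reported for Web file sizes
sizes = sorted((random.paretovariate(alpha) for _ in range(100_000)), reverse=True)

top_one_percent = sizes[: len(sizes) // 100]  # the largest 1% of tasks
share = sum(top_one_percent) / sum(sizes)
print(f"largest 1% of tasks carry {share:.0%} of the total load")
```

Even this crude experiment shows the largest one percent of tasks carrying a load share orders of magnitude above their count, which is exactly why size-oblivious policies such as Random and Round Robin degrade under such traffic.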

Furthermore, a more challenging goal is to combine security, fault tolerance and accessibility from different client devices; these issues are still addressed separately in distributed server systems. A policy that merely distributes the load among the servers is therefore not sufficient to support end-to-end quality of service. Another research path is in the direction of adapting the existing task assignment policies to wireless sensor networks. These sensors combine sensing and wireless communication devices in a single circuit, with a system-on-chip design and cost-effectiveness. Nevertheless, one of the major problems in this type of network is energy consumption: the emission and reception of packets during communication is a costly process in terms of energy. Thus, introducing a task assignment policy in such a network, to select the best sensor to route the data, balances the energy of the sensors and improves the network lifetime. Finally, we can observe that all the analyzed solutions change completely or partially (e.g. SITA-V) when the servers are geographically distributed across the Internet (i.e. DNS-based architectures). In this architecture, the DNS server maps the URL to one of the Internet Protocol (IP) addresses of the Web servers. Thus, the DNS server is well placed to route the client requests to the least loaded Web server (i.e. to act as a dispatcher). However, when the DNS replies to the mapping query, it sends the IP address of the selected Web server with a Time-To-Live (TTL) period for which the translation is valid. All the client requests falling within this TTL period will therefore be directed to the same Web server, which risks being overloaded; consequently, the DNS has less control over the client requests. A way to address this drawback is for the policy to adapt the TTL value with respect to client and server information (i.e. the number of client requests, the geographical location of the clients, the load of the Web servers, the processing capabilities of the Web servers, …) when balancing the load of the requests among the Web servers.

Acknowledgments

The authors would like to thank the editor in chief, the editor and the anonymous reviewers for their valuable comments and suggestions that have improved the presentation and correctness of this survey.

References

Akamai Technologies. <http://www.akamai.com>.
Arlitt MF, Williamson CL. Web server workload characterization: the search for invariants. ACM SIGMETRICS Performance Evaluation Review 1996;24:126–37.
Awerbuch B, Azar Y, Leonardi S, Regev O. Minimizing the flow time without migration. SIAM Journal on Computing 2002;31:1370–82.


Bansal N, Harchol-Balter M. Analysis of SRPT scheduling: investigating unfairness. ACM SIGMETRICS Performance Evaluation Review 2001;29:279–90.
Broberg J, Tari Z, Zeephongsekul P. Task assignment based on prioritising traffic flows. In: Proceedings of the eighth international conference on principles of distributed systems, Grenoble, France, 15–17 December; 2004. p. 415–30.
Broberg J, Tari Z, Zeephongsekul P. Task assignment with work-conserving migration. Journal of Parallel Computing 2006;32:808–30.
CDNetworks. <http://www.us.cdnetworks.com>.
Certeon Technologies. <http://www.certeon.com>.
Ciardo G, Riska A, Smirni E. Equiload: a load balancing policy for clustered web servers. Performance Evaluation 2001;46:101–24.
Cisco Systems. LocalDirector. Documentation available at <http://www.cisco.com/en/US/products/hw/contnetw/ps1894/index.html>; 1998.
Crescendo Networks. <http://www.crescendonetworks.com>.
Crovella M, Bestavros A. Self-similarity in world wide web traffic: evidence and possible causes. IEEE/ACM Transactions on Networking 1997;5:835–46.
Crovella M, Harchol-Balter M, Murta C. Task assignment in a distributed system: improving performance by unbalancing load. ACM SIGMETRICS Performance Evaluation Review 1998;26:268–9.
Crovella M, Taqqu MS, Bestavros A. Heavy-tailed probability distributions in the world wide web. London: Chapman and Hall; 1999.
Feldmann A. Web performance characteristics. IETF plenary, Nov. 1999. <http://www.research.att.com/anja/feldmann/papers.html>; 1999.
Fielding R, Gettys J, Mogul J, Frystyk H, Berners-Lee T. Hypertext transfer protocol—HTTP/1.1. RFC 2616; 1999.
Fu B, Broberg J, Tari Z. Task assignment strategy for overloaded systems. In: Proceedings of the eighth IEEE international symposium on computers and communication, Kemer-Antalya, Turkey, 30 June–03 July. IEEE; 2003. p. 1119–25.
Fu B, Tari Z. A dynamic load distribution strategy for systems under high task variation and heavy traffic. In: Proceedings of the ACM symposium on applied computing, Melbourne, Florida, USA, 9–12 March. ACM; 2003. p. 1031–7.
F5 Networks. <http://www.f5.com>.
Gilly K, Thomas N, Juiz C, Puigjaner R. Scalable QoS content-aware load balancing algorithm for a web switch based on classical policies. In: Proceedings of the 22nd international conference on advanced information networking and applications (AINA 2008), Ginowan, Okinawa, Japan, 25–28 March. IEEE; 2008. p. 934–41.
Harchol-Balter M. Task assignment with unknown duration. Journal of the ACM 2002;49:260–88.
Harchol-Balter M, Crovella M, Murta C. On choosing a task assignment policy for a distributed server system. Journal of Parallel and Distributed Computing 1999;59:204–28.
Harchol-Balter M, Li C, Osogami T, Scheller-Wolf A, Squillante M. Analysis of task assignment with cycle stealing under central queue. In: Proceedings of the 23rd international conference on distributed computing systems (ICDCS '03), Providence, Rhode Island, 19–22 May. IEEE; 2003. p. 628–37.
Harchol-Balter M, Scheller-Wolf A, Young A. Why segregating short jobs from long jobs under high variability is not always a win. In: Proceedings of the 47th annual Allerton conference, Allerton House, UIUC, Illinois, USA, 30 September–02 October. IEEE; 2009. p. 121–7.
Harchol-Balter M, Schroeder B, Bansal N, Agrawal M. Size-based scheduling to improve web performance. ACM Transactions on Computer Systems 2003;21:207–33.
Harchol-Balter M, Scheller-Wolf A, Young A. Surprising results on task assignment in server farms with high-variability workloads. In: Proceedings of the ACM SIGMETRICS 2009 conference on measurement and modeling of computer systems, Seattle, USA, 15–19 June. ACM; 2009. p. 287–98.
Hunt G, Nahum E, Tracey J. Enabling content-based load distribution for scalable services. Technical Report, IBM T.J. Watson Research Center; 1997.
IRCache Home. The trace files. <http://www.ircache.net/Traces/>; 2004.
Jayasinghe M, Tari Z, Zeephongsekul P. A scalable multi-tier task assignment policy with minimum excess load. In: Proceedings of the IEEE symposium on computers and communications, Riccione, Italy, 22–25 June. IEEE; 2010. p. 913–8.
Keynote Systems. <http://www.keynote.com>.
Krishnamurthy B, Rexford J. Web protocols and practice: HTTP/1.1, networking protocols, caching and traffic measurement. Addison-Wesley; 2001.
LeFebvre W. Cnn.com: facing a world crisis. Invited talk at the USENIX technical conference; 2002.
Manley S, Seltzer M. Web facts and fantasy. In: Proceedings of the USENIX symposium on internet technologies and systems, Monterey, CA, 8–11 December. USENIX Association; 1997. p. 125–34.
Nozaki S, Ress S. Approximations in finite-capacity multiserver queues with Poisson arrivals. Journal of Applied Probability 1978;15:826–34.
Radware Technologies. <http://www.radware.com>.
Riska A, Sun W, Smirni E, Ciardo G. Adaptload: effective balancing in clustered web servers under transient load conditions. In: Proceedings of the 22nd international conference on distributed computing systems, Vienna, Austria, 02–05 July. IEEE; 2002. p. 104.
Riverbed Technologies. <http://www.riverbed.com>.
Schroeder B, Harchol-Balter M. Evaluation of task assignment policies for supercomputing servers: the case for load unbalancing and fairness. Cluster Computing 2004;7:151–61.
Tan L, Tari Z. Dynamic task assignment in server farms: better performance by task grouping. In: Proceedings of the seventh international symposium on computers and communications, Ramada Hotel, Taormina-Giardini Naxos, Italy, 01–04 July. IEEE; 2002. p. 175–80.



Tari Z, Broberg J, Zomaya AY, Baldoni R. A least flow-time first load sharing approach for distributed server farm. Journal of Parallel and Distributed Computing 2005;65:832–42.
Ungureanu V, Melamed B, Katehakis M, Bradford PG. Deferred assignment scheduling in cluster-based servers. Journal of Cluster Computing 2006;9:57–65.
Wei J, Zhou X, Xu C. Robust processing rate allocation for proportional slowdown differentiation on internet servers. IEEE Transactions on Computers 2005;54:964–77.

Williams A, Arlitt M, Williamson C, Barker K. Web workload characterization: ten years later. In: Web content delivery. Springer; 2005.
Winston W. Optimality of the shortest line discipline. Journal of Applied Probability 1977;14:181–9.
Zhou X, Wei J, Xu C. Processing rate allocation for proportional slowdown differentiation on internet servers. In: Proceedings of the IEEE 18th international parallel and distributed processing symposium (IPDPS), Santa Fe, New Mexico, USA, 26–30 April. IEEE; 2004. p. 88–97.
