A hierarchical disk scheduler for multimedia systems

Share Embed


Descrição do Produto

Future Generation Computer Systems 19 (2003) 23–35

A hierarchical disk scheduler for multimedia systems Jesús Carretero a,∗ , Javier Fernández a , Félix García a , Alok Choudhary b b

a Universidad Carlos III de Madrid, Dpto. de Informática, Avda. Universidad 30, Leganés, 28911 Madrid, Spain Robert McCormick Technological Institute, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA

Received 17 October 2000; accepted 19 April 2002

Abstract An integrated storage platform for open systems should be able to meet the requirements of deterministic applications, multimedia systems, and traditional best-effort applications. It should also provide a scheduling mechanism fitting all those types of applications. In this paper, we propose a two-level hierarchical disk scheduling scheme, named 2-Q, which can guarantee deterministic deadlines, maximize the number of statistic real-time streams processed by the disk system, and minimize the average latency for best-effort requests. The upper level of the scheduling architecture, server level, is divided into three queues: deterministic, statistic, and best-effort requests. Each server may have its own scheduling algorithm. The lower level, disk driver, chooses the ready streams using its own scheduling criteria. We also propose an adaptive admission control algorithm relying on worst and average values of disk server utilization. Only streams satisfying the admission algorithm criteria are accepted for further processing by the disk server. The solution is extended to a parallel disk system by using a third hierarchical level, named meta-scheduler, briefly described in the paper. The performance evaluations demonstrate that our scheduling architecture is adequated for handling stream sets with different deterministic, statistic, or best-effort requirements. © 2002 Elsevier Science B.V. All rights reserved. Keywords: Multimedia; Parallel I/O; Clusters; Disk scheduling

1. Introduction Over the last few years, there has been a great interest in the scheduling of I/O devices, usually disks, in computer systems [1,2]. However, the requirements and the platforms for both multimedia and general systems seemed to be so different [3] that people developed specialized systems. Thus, disk scheduling algorithms for general purpose systems tried to reduce the access time, while multimedia systems tended to satisfy the real-time constraints for cyclical streams even losing performance. With the multimedia applications increasing, some authors [4] have ∗

Corresponding author. E-mail address: [email protected] (J. Carretero).

proposed the design of a new kind of system, named integrated, including facilities to support heterogeneous multimedia and general purpose information, allowing the existence of deterministic, statistic, and best-effort requests. In an integrated system, the user may request the start of a new I/O request during run-time at random. The system must determine, following some admission criteria, whether the stream is schedulable, and admit or reject it. Primarily, three kinds of I/O streams can be found in such a system: periodic real-time requests (usually related with retrieval of continuous media files and whose arrival and execution times are predictable), sporadic real-time request (usually related with critical write operations and whose arrival and execution time cannot be predicted), and best-effort requests (usually related with

0167-739X/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S 0 1 6 7 - 7 3 9 X ( 0 2 ) 0 0 0 9 4 - 8

24

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

textual streams, that do not have any deadlines). The main problem with most of those systems is that they do not provide schedulability guarantees for deterministic applications. Deng [5] has proposed a CPU scheduling architecture for deterministic streams in an open environment that could solve this problem. In this paper we propose a hierarchical two-level disk scheduling scheme, called 2-Q, that is being used in MiPFS [6,7], a multimedia integrated parallel file system. First, we show how to schedule the I/O streams for a single device in an open and integrated system, and then we extend the former solution to schedule multiple devices in parallel. Because of the features of multimedia I/O, many real-time streams are time-driven, non-predictable, and non-preemptive in our system. Thus, in spite of the nature of the streams (deterministic, statistic, and best effort), our scheme assumes the following features for them: non-preemptable, global shared resource usage, periodic or sporadic streams, and execution time and deadline of sporadic streams are only known when they are requested. Moreover, while geometry is not relevant when scheduling a CPU, it becomes very important in disk scheduling. Thus, our scheduling algorithm includes geometry considerations to optimize the utilization of resources. Major motivations of our design, and some related work, are presented in Section 2. Section 3 describes the multi-level architecture of our scheduling scheme, defining a sufficient condition for the schedulability of deterministic streams. Section 4 presents the admission criteria used in our system to schedule different kinds of streams. Section 5 briefly describes the extension of the scheduling scheme to a parallel disk system. Section 6 presents performance evaluation results for several scheduling algorithms, comparing them with 2-Q. Finally, Section 7 shows some concluding remarks and future work.

2. Related work There are well known techniques for scheduling deterministic real-time tasks [8,9]. However, most of them are oriented to fixed priority scheduling or dynamic priority scheduling in very specific situations. However, the timing requirements of many I/O streams are not known in advance, thus a global scheduling analysis cannot be done in advance [10]. The prior-

ity of the streams must be dynamically computed using parameters as time, disk geometry, and quality of service (QoS) granted to the application releasing the stream. Several algorithms have been proposed to satisfy timing constraints in multimedia systems; EDF gives the highest priority to I/O streams with the nearest deadline [11], CSCAN [12,13] orders the streams using an ascending order, SCAN-EDF orders the I/O streams by deadline and the streams with the same deadline by ascending order [1], and SCAN-RT [14] orders the streams using an ascending order but taking into consideration the deadlines. In addition to the former scheduling strategies, some other scheduling algorithms, like SCANRT-RW [14] and TDP [15], have been proposed recently. However, none of them address the problem of integrated environments, where multiple kind of streams must be scheduled. An integrated scheduling algorithm should try to reduce the answer time of best-effort requests, but it should also provide a time guided disk scheduler giving more priority to the real-time stream with the nearest deadline. Because most of the disk scheduling schemes proposed up to now can hardly make a good trade-off between the former aspects, some multi-level hierarchical scheduling architectures have been proposed for deterministic tasks in an open environment [5] and for integrated multimedia systems [4,16]. In essence, those schemes create a slower virtual device for each stream or each class of streams. As an example, Symphony uses three kind of servers: best effort, statistic and deterministic streams. Each server i is given a budget βi = αi Φ, being αi a fraction of the total server size Φ, so that α1 +α2 +α3 = 1. A server i is ready for execution only when βi > 0. If all servers have pending streams, then each one is allocated exactly βi. The main goal of this scheduling algorithm is to provide fairness in device usage. A stream j is admitted in the server i if, and only if, (βi > 0) ∧ (βi ≥ xj ), xj being the service time of stream j, which depends on the seek time, rotational time, and transfer time of the device. This scheme has serious limitations to ensure the schedulability of deterministic streams, whose execution time and relative deadline are only known when the streams are released. Our algorithm, shown in the next section, provides to each server with an initial budget for all streams. A deterministic stream is admitted if, and only if, its server has remaining budget assuming worst case

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

conditions. If the budget required by the request is higher than the available, the server with the lowest priority is chosen as a victim and some of its budget is transferred to the deterministic server. The priority is related to the time requirements of the streams and the QoS committed to them.

3. Disk scheduling system architecture Fig. 1 shows the architecture of our hierarchical disk scheduling system. Each disk scheduler consists of two levels. An upper level with three stream server queues: S0 for deterministic (HRT), S1 for statistic (SRT), and S2 for best-effort (NRT) streams, each one with its own scheduling algorithm. A lower level including a disk driver scheduler D, a ready queue R, and a disk. Thus, each scheduling decision involves two steps. First, each server scheduler Si , receives

Fig. 1. Disk scheduler architecture.

25

streams r, and, based on some scheduling algorithm Ai , inserts them into corresponding place in its queue. Secondly, when the disk is free, D chooses one stream among the upper level queues, using its own scheduling algorithm, and put it into R. Statistic real-time streams allow a certain percentage of deadline misses, but as far as the QoS of the client can be met the services are deemed OK. Deterministic real-time streams do not allow any misses. In our prototype, the disk scheduling algorithms used at server level are: EDF for deterministic, SCAN-EDF for statistic, and CSCAN for best-effort streams. The disk driver scheduler D chooses the stream to be executed next and maintains the budget of the servers. In our hierarchical system, each server queue has a different priority that is used by D as a criteria to choose ready jobs. However, using only this priority criteria will be unfair for best-effort and statistic applications, leading to a high percentage of deadline misses. Next section will show how our 2-Q scheduling algorithm solves this problem. Let us define the sporadic and periodic streams in our system as follows. A sporadic stream, i, is defined as Pi (ai , bi , di ), where ai is the arrival time, bi the amount of data required, and di its hard deadline. A periodic stream, j, is defined as Pj (ai , bi , pi ), where ai is the activation time, bi the amount of data required, and pi its period. Best-effort streams can be considered sporadic streams without a deadline. The operations of the disk scheduler on those streams are the following. First, when the system starts, the schedulers for the disk driver D and the three servers are set up. Upon the arrival of a new stream, the server type is chosen depending on the stream QoS features; if QoS is 100%, then the stream is a deterministic one and if QoS is 0%, it is a best-effort request; in any other case, the statistic scheduler is chosen. The server size si for a stream is defined in Section 4. The best-effort requests are included in S2 without any budget. They use the budget remaining on each round R. For real-time requests, an admission test is executed (see Section 4). The budget for the service is granted only whether the test is passed, being denied in other cases. Once admitted, it is inserted in a server queue Si using the server’s disk scheduling algorithm Ai with the scheduling deadline l and the geometry as leading criteria. The budget of the server is increased in si . When the disk is available, D chooses a job from some server queue Si using algorithm in Fig. 3. When

26

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

a stream finishes, the budget of its server is diminished by si . 3.1. 2-Q scheduling algorithm To insert the streams into the server queues, two major parameters are used in our scheduling scheme: deadline and service time. The service time is totally application dependent because it depends on the track number, while the deadline of a stream may be modified depending on the stream properties. Two kinds of deadlines are considered here: application deadline, d, of a specific stream is the one set by the application through the driver interface or other operations, such as QoS negotiation; scheduling deadline, l, that is a virtual deadline internally used by the disk scheduler and computed by the server’s scheduler, so that l ≤ d. The computation of the virtual deadline l is different for each kind of stream. For a best-effort request, l is initially set to a very large value that can be dynamically modified. For a statistic stream, l is the same as its actual deadline d. The deadline for a deterministic request r, is computed using the algorithm in Fig. 2. The complexity of the scheduling deadline computation is O(n), where n is the length of the deterministic server queue. For

example, let us assume five deterministic streams in S0 with the values of l and d as shown in Table 1. Let us also assume that the largest serving time for all of them is ts = 0.5 s. Each li is calculated as di − 0.5. Since the l3 of Request 3 cannot be satisfied by l2 (5.5 + 0.5 > 5.5), l2 is decreased by subtracting ts so that the scheduling deadline of Request 3 can be satisfied (5.0 + 0.5 ≤ 5.5). For the same reason, l4 must also be decreased to satisfy l5 . The final scheduling deadlines are also shown in Table 1. Whenever the disk server is free, the driver scheduler D selects a stream from the three server queues and inserts it into the ready queue. The algorithm used by D to choose a ready stream from the server queues, at any time t, is shown in Fig. 3. One potential problem of deadline modification is fairness. As the deterministic deadlines are shortened, more statistic streams may potentially miss their deadlines. The fairness of 2-Q is evaluated in Section 6. The intuition after our policy is that if density of real-time streams is high, more disk serving time should go towards real-time streams. Otherwise, besteffort requests could be served, because the deadlines of the currently most urgent real-time streams are not likely to be missed even if the disk server turns to serve some best-effort requests first. As the results

Fig. 2. Computing the scheduling deadline of a deterministic request.

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

27

Table 1 Deadlines of deterministic requests

Real deadline Original scheduling deadline Final scheduling deadline

Request 1

Request 2

Request 3

Request 4

Request 5

5.0 4.5 4.5

6.0 5.5 5.0

6.0 5.5 5.5

7.0 6.5 6.0

7.0 6.5 6.5

shown in Section 6 corroborate, our scheduling architecture has two major features related to other disk scheduling schemes. First, the deterministic stream deadlines can always be met for streams admitted using the algorithm in Section 4. Secondly, the average waiting time for best-effort requests is small. An analytical model of the average waiting time for best-effort requests in the 2-Q algorithm is given below. For simplicity, let us assume that the round R is 1 s, that all the streams have the same service time x, that the deadline of periodic streams is the same as their period R, that the best-effort requests arrive at a rate of λ/s, and the sporadic streams arrive at an average rate of µ/s. For a specific best-effort request, if it arrives αxs before the end of that round, where xs is the service time and α a slack factor used to ensure the existence of some security margin, it would become the next ready stream. In this scenario (denoted as

case 1), the average waiting time would be x1 = 0.5xs . Otherwise, it should wait at least until the end of that cycle. In this scenario (denoted as case 2), the waiting time would be x2 = αxs + (0.5αxs λ + αts µ)xs . The average waiting time twait for a best-effort request is twait = Prob{case1}x1 + Prob{case2}x2 = (1 − αx)0.5x + αx(αx + (0.5αxλ + αxµ)x = (0.5λ + µ)α2 x3 + (α2 − 0.5α)x2 + 0.5x. (1) Assume the case that x = 40 ms, λ = 10, µ = 10, and α = 4, the average waiting time for a best-effort request is 57 ms. The correctness of this estimation will be proven in Section 6. Obviously, scheduling deterministic streams with higher priority means that some statistic streams would miss their deadlines. Even when the solution proposed in Fig. 3 leads to consider those streams failed, an alternative approach can be used to reduce the number

Fig. 3. Algorithm to choose a ready task from the server queues.

28

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

of missed deadlines: temporal degradation of the QoS of some streams. Some statistic applications, such as video-conferencing, may degrade their QoS temporarily to benefit other deterministic applications, such as telesurgery. To use this scheme, not explained in this paper, each stream should specify the average QoS required and the percentage of temporal degradation during a maximum time. This solution can satisfy most of statistic deadlines misses shown in Section 6.

4. Admission control algorithms The service time x depends on the amount of data requested by each stream and its placement in the disk. It is very difficult to make schedulability plans if all the streams requesting service are admitted into the disk server. Without any restrictions, most of the rounds R would be overflowed and it would be impossible to satisfy many deadlines. An admission control algorithm limits the number of streams that can be admitted into the server, executing acceptance tests, and allocates new budget to the servers. Our system uses two approaches in the admission control: predictable and adaptive. Predictable admission control is used for deterministic streams, requiring deterministic scheduling guarantees. This admission algorithm uses a worst case assumption to compute the maximum service time [17] for each track, the minimum data unit that can be retrieved from without executing a new seek or waiting for the rotational latency. The maximum service time can be computed as max max track xmax = tseek + tlat + ttran ,

(2)

max is the time to move the disk head through where tseek max the maximum rotational all the tracks of the disk, tlat track latency time, and ttran the time to transfer a track. The former value does not depend on the application, but only on the track size and disk geometry. Assuming also that, in the worst case, each stream can request data from a different track on each round, a new deterministic stream n + 1 will be admitted into a disk server already serving to n deterministic streams if

(n + 1)

xmax xmax ≤1− , R R

(3)

where xmax /R is the maximum possible blocking time of the new stream and R = min(di − ai ) ∀i ∈ n. The maximum number of streams that can be admitted into the system is n≤

R xmax

− 2.

(4)

For a disk like the one described in Section 6 and being R = 0.333, the value of n is 4. For R = 1 it would be 18. A stream requesting q tracks can be considered as q independent streams. The former values are very conservative. In practice, they can be relaxed by using average bandwidth, also called effective bandwidth, and adaptive algorithms [18] that rely on the average seek time and access size values. The average service time can be computed as avg

avg

avg

xavg = tseek + tlat + ttran ,

(5)

avg

where tseek is the average seek time measured in the avg

system, tlat the average rotational latency time, and avg ttran the time to transfer the average access unit from the disk. The admission conditions for a real-time stream on a disk server with n streams, k of them being deterministic streams, are different for a deterministic or statistic stream (see algorithm in Fig. 4). If the new stream is deterministic, the admission conditions are computed using Eq. (4), but now the maximum blocking time must also consider the statistic streams, as all of them are non-preemptable:    xj xmax (k + 1) , ≤ 1 − max maxj=1,k R dj − a j   xj maxj=k+1,n . (6) pj However, if it is a statistic stream, the admission conditions are computed using average values, as stated in the following equation: k

xavg xmax + (n − k + 1) R Ravg    xj , ≤ 1 − max maxj=1,k dj − a j   xj maxj=k+1,n , pj

(7)

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

29

Fig. 4. Admission control algorithm for a single disk server.

where Ravg = avg(pi ) ∀i ∈ soft real-time stream. xavg depends on the disk parameters, device allocation policy, and the access size of the applications. The larger is the application’s access size, the lower is xavg whenever the access units are located contiguously on the disk. For a disk like the one described in Sections 5 and 6, deterministic streams per second, and several statistic streams requesting 64 KB with Ravg = 0.333 (which is a typical 1.5 Mbit/s), the value of n is 16. These results match those shown in Section 6.

5. Parallel disk scheduler architecture We have scaled up the former hierarchical scheduling architecture to schedule a parallel disk server. Fig. 5 shows the architecture of our parallel disk system scheduler. It mainly consists of two parts: system meta-scheduler and integrated disk servers. The meta-scheduler is a coordinator, with three major functions: decomposing the incoming streams and dispatching them to the corresponding disk servers, gathering portions of data from different disk servers and sending them back to the applications, and coordinating the service to a specific stream among different disk servers. The first two functions are included into the parallel disk system interface, while the third one acts as an intelligent inspector among different disk servers. When a new stream arrives, it is decomposed into sub-streams by the meta-scheduler and each sub-

stream is sent to the appropriated disk server. Each disk server runs the internal admission algorithm shown in Fig. 4 returning the result to the meta-scheduler. If it is successful, the resources are committed temporarily. The meta-scheduler gathers the information from the servers, returning to the application the status of the admission tests: successful if all disk servers admitted the new stream, failed in other cases. When successful, the meta-scheduler asks the disk servers to commit the resources. A problem incurred with stream distribution is that some sub-streams could be successful while others could fail. The meta-scheduler gathers the status of the sub-streams and, if there are some deadline misses and the QoS is below the required, notifies a failure to the application. Moreover, it notifies to the remaining involved servers to abort the stream and release the budget allocated to this stream. To accomplish it, each stream is assigned a unique ID number, which is shared by all of its sub-streams, and inserted in a stream dispatching table. Whenever a disk server fails to serve a sub-stream, the meta-scheduler is informed. According to the unique ID number, the meta-scheduler changes the status of all the sub-stream corresponding to this stream to failed, informing other disk servers about the error. As a result, all the remaining sub-streams of the stream are deleted from the queue on each disk server and the resources are freed. This policy avoids wasting resources on failed streams, transferring those resources to other successful streams.

30

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

Fig. 5. Parallel disk scheduling system architecture.

6. Performance evaluation The performance results presented in this section were collected on a silicon graphics origin machine located at the CPDC of Northwestern University. All the results shown in this paper have been collected using a simulated disk, whose parameters are shown in Table 2. Its implementation is based on earlier implementations [19,20] and architectural studies [21]. Table 2 Parameters of the simulated disk Disk capacity (GB) Cylinders Tracks per cylinder Sectors per track Average seek time (ms) Minimum seek time (ms) Maximum seek time (ms) Average rotate latency (ms) Data transfer speed (MB/s)

2.8 2627 21 99 11 1.7 22.5 8 4.5

To test the features of the disk server scheduler, a heterogeneous input load, similar to the one expected in a real integrated file system, was generated: several applications requesting data with periodic statistic, sporadic deterministic, and best-effort features during 300 s. Periodic streams had a playback rate of 1.5 Mbit/s and an access unit of 64 KB. The round chosen was 1 s, with new streams arriving at random to the I/O system into the round. Each new periodic stream reads a continuous media file whose first block was allocated randomly. Next stream refer to subsequent blocks, thus reflecting the effect of storage preallocation for the data. Sporadic streams came with a specified average interval, parameterized in the experiments, an average request size of 8 KB, and a average deadline of 200 ms after the arrival time. Best-effort streams came randomly with a parameterized average interval and the same request size of the sporadic streams, as well as a random first block number. In our simulation, besides the 2-Q algorithm proposed

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

31

in this paper, three other popular disk scheduling algorithms, CSCAN, EDF, and SCAN-EDF, were also evaluated. The same workload was used on the four algorithms for the purpose of comparison. Moreover, without any lose of generality, the sporadic streams were considered as deterministic, as the periodic streams were considered always as statistic. 6.1. Evaluation of a single disk server scheduler Three main experiments were performed to evaluate the features of the 2-Q scheduling architecture on a single disk server. First, the percentages of deterministic and statistic missed deadlines were evaluated. Secondly, the average waiting time for best-effort requests was measured. And finally, we studied the average seek time. Notice that the admission algorithms were not applied in those experiments as we wanted to measure the peak performance of the system. However, the results obtained from the experiments should also be consistent with the admission test criteria. Fig. 6 shows the percentage of deterministic deadlines missed by the scheduling algorithms when the number of deterministic streams was increased, using a round of 1 s. The number of statistic periodic streams was 20 in all tests. CSCAN is not shown here because the deterministic misses are higher than 25% in all cases, rising almost to 100%.The reason is that temporal restrictions are not used in this algorithm. EDF and SCAN-EDF have very low percents of determin-

Fig. 6. Percent of hard deadlines missed.

Fig. 7. Percent of soft deadlines missed.

istic misses, but they are still higher than the 2-Q rate which is zero when the number of sporadic streams is below 35. Note that the worst value predicted by the admission algorithm is 18, which was very conservative. When this threshold is over-passed, the slope of miss percentage is smaller for 2-Q, which means that the reliability of 2-Q to satisfy the deadlines is higher than that of EDF or SCAN-EDF. When the number of sporadic streams is increased to 100, the percentage of misses raises almost to 25% in all cases, even when 2-Q is slightly better. Fig. 7 shows the percentage of statistic deadlines missed by the four scheduling algorithms when the number of statistic periodic clients was increased. The round was also 1 s. The average number of deterministic streams per round was 5. 2-Q missed less statistic deadlines than EDF and almost the same as SCAN-EDF and CSCAN. Notice that the number of statistic misses of 2-Q is 0 for 10 periodic streams, which is consistent with the predictions of the admission algorithm (11), but increasing of statistic deadlines for the other algorithms was very fast when the round was overloaded with more streams. See that CSCAN performance is similar to 2-Q for statistic requests, but it fails to satisfy the deterministic deadlines earlier than 2-Q. Correlating both figures, we can conclude that 2-Q may serve as many statistic streams as the other algorithms, while serving more deterministic requests without missing deadlines. Fig. 8 shows that the response time for best-effort requests is better for 2-Q than for the other algorithms.

32

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

high priority of the deterministic streams has negative effects on the geometrical ordering of the disk accesses. This effect is higher in EDF, whose average seek time is the worst. 6.2. Evaluation of a parallel disk server scheduler

Specifically, the results of 2-Q are much better than those of EDF and SCAN-EDF, typically used in continuous media systems. These results can be explained because of the opportunistic behavior of our disk driver scheduler, that serves best-effort requests in CSCAN order whenever it has some free time in a round. It not only reduces the answer time but also minimizes the average seek time for best-effort requests. Those features are not present in EDF or SCAN-EDF. Fig. 9 shows the average seek time for several scheduling algorithms under the same load conditions. As expected, 2-Q average seek time is higher than CSCAN and SCAN-EDF because the

Two kinds of experiments were performed to obtain the results shown in this section. First, an increasing workload was applied to a parallel disk server whose number of disk servers was varied from 1 to 16. We wanted to measure the maximum number of statistic streams served before having a deterministic deadline miss. Secondly, three different workloads were executed using a parallel disk system with eight disks servers, while varying the QoS of the video streams, to measure the effect on the number of periodic streams served. The first workload included only periodic streams, the second was a mix of periodic statistic and aperiodic deterministic streams, and the third included also best-effort streams. The former experiments were chosen because they show the scalability and the behavior of the parallel system when related to QoS. Fig. 10 shows the results of the first experiment. The workload was composed of deterministic sporadic streams, statistic periodic streams, and best-effort requests. As can be seen, the 2-Q algorithm always provides the same or better results than the others. That means, that the results observed in Figs. 6 and 7 for a

Fig. 9. Average seek time.

Fig. 10. Number of periodical clients served before a deterministic deadline miss.

Fig. 8. Response time for best-effort requests.

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

Fig. 11. QoS influence on the number of periodical clients served.

single disk server can be also corroborated when using 2-Q on a parallel disk server; it can serve more statistic clients before having a deterministic deadline missed. Moreover, 2-Q scalability is better than, or similar to, other disk scheduling algorithms. Fig. 11 shows the results of the second experiment for the 2-Q algorithm. The number of statistic periodic streams scheduled decreased as the QoS was increased for every load. As expected, the maximum number of streams served decreased when the input load included sporadic and best-effort requests. The introduction of sporadic streams into the workload was the main cause for the decrease. Anyway, the good behavior of the 2-Q algorithm in presence of a mixed load makes it very suitable for an integrated parallel disk system. The experiment was also executed for the other scheduling algorithms, but the results obtained were worst than for those of 2-Q. CSCAN, for example, only could serve six periodic streams for a 90% QoS.

7. Summary and concluding remarks In this paper, a solution for the scheduling problem in an integrated storage platform which is capable of meeting the requirements of real-time applications, multimedia systems, and traditional best-effort applications has been presented. First, we have motivated the need of such a solution, then the architecture and algorithms used in our 2-Q disk scheduling archi-

33

tecture have been described. 2-Q has a hierarchical two-level architecture which the upper level, or server level, divided into three queues for deterministic, statistic and best-effort requests, each one using a scheduling algorithm specific for that server. The solution proposed for a single disk served was generalized for a parallel disk server by using a meta-scheduler to control the achievement of the deadlines of parallel streams. The meta-scheduler uses a generalized admission algorithm to accept or reject new streams. A parallel stream is admitted if, and only if, each individual disk server involved in the stream execution admits its sub-stream. When an admitted statistic streams fails, its resources are released and allocated to new streams. Our performance evaluation demonstrates the feasibility of our scheduling architecture for handling stream sets with different deterministic, statistic, or best-effort requirements. It also demonstrates that 2-Q can guarantee the deterministic deadlines of the streams admitted into the disk server. Moreover, 2-Q maximizes the number of statistic streams processed by the disk and it minimizes the average answer time for best-effort requests. The results for the evaluation of the parallel disk scheduling architecture also demonstrate that 2-Q can scale when several disks are used. Actually, as the number of disks increases to 16 our solution provides better results than SCAN-EDF for the number of periodic streams admitted into the system. Moreover, our system can manage mixed loads with a small reduction (10%) of the number of real-time streams processed by the disk server. A prototype of 2-Q has been developed to serve a parallel file system, named MiPFS. The prototype has been included into the LINUX kernel as a loadable module, so that the scheduling policies can be changed easily. Specifically, we have replaced the disk driver scheduler, actually SCAN-EDF, by our scheduler. It is running on Pentium III biprocessor machines. Currently, we are developing a version for a cluster of workstations. On the theoretical side, work is going on to make the 2-Q algorithm optimal.

Acknowledgements This work has been supported in part by an NSF Young Investigator Award CCR-9357840, NSF

34

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35

CCR-9509143, by the Scalable I/O Initiative, contract number DABT63-94-C-0049 from the Defense Advanced Research Projects Agency (DARPA), by the Spanish “Comisión Interministerial de Ciencia y Tecnología” under the project TIC97-0955. References [1] A. Reddy, J. Wyllie, I/O issues in a multimedia system, IEEE Comput. 27 (1994) 69–74. [2] S. Ghandeharizadeh, S. Kim, C. Shahabi, On configuring a single disk continuous media server, in: Proceedings of the Conference, ACM SIGMETRICS’95, 1995, pp. 37–46. [3] D.M.G. Neufeld, N. Hutchinson, Design of a variable bit-rate continuous media server for an ATM network, in: Proceedings of the IST/SPIE Multimedia Computing and Networking, San Jose, CA, USA, 1996. [4] S.R. Prashant Shenoy, P. Goyal, H. Vin, Symphony: an integrated multimedia file system, in: Proceedings of the SPIE/ACM Conference on Multimedia Computing and Networking (MMCN’98), San Jose, CA, USA, 1998. [5] Z. Deng, J.W.-S. Liu, Scheduling real-time applications in an open environment, in: Proceedings of the IEEE Real-time Systems Symposium, San Francisco, CA, 1997, pp. 308–319. [6] J. Carretero, W. Zhu, X. Shen, A. Choudhary, MiPFS: a multimedia integrated parallel file system, in: Proceedings of the Fourth International Conference on Computer Science and Informatics, Vol. III, Research Triangle, Raleigh, NC, USA, 1998. [7] J. Carretero, W. Zhu, A. Choudhary, Design and evaluation of a multimedia integrated parallel file system, in: Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS’99), Florence, Italy, 1999. [8] C. Liu, J. Layland, Scheduling algorithms for multiprogramming in a hard real time environment, J. ACM 20 (1) (1973) 46–61. [9] J. Lehoczky, L. Sha, Y. Ding, The rate monotonic scheduling algorithm—exact characterization and average case behavior, in: Proceedings of the IEEE Real-time System Symposium, 1989, pp. 166–171. [10] H. Kaneko, J. Stankovic, S. Sen, K. Ramamritham, Integrated scheduling of multimedia and hard real-time tasks, Technical Report UM-CS-1996-045.ps, Computer Science Department, University of Massachusetts, August, 1996. [11] M. Sohn, G. Kim, Earliest-deadline-first scheduling on nonpreemptive real-time threads for continuous media server, in: Proceedings of the Conference on High-performance Computing and Networking’97, 1997. [12] E.G. Coffman Jr., C.J.M. Turnbull, A note on the relative performance of two disk scanning policies, J. Inform. Process. Lett. 2 (1) (1973) 15–17. [13] E.G. Coffman, M. Hofri, On the expected performance of scanning disks, SIAM J. Comput. 10 (1) (1982) 60–70. [14] I. Kamel, Y. Ito, Disk bandwidth study for video servers, Technical Report 148-95, Matsusita Information Technology Laboratory, April 1996.

[15] S. Chaudhry, A. Choudhary, Scheduling algorithms for guaranteed service, Technical Report, CASE Research Center, Syracuse University, August 1996. [16] P. Shenoy, H. Vin, Cello: a disk scheduling framework for next-generation operating systems, in: Proceedings of the International Conference on Measurement and Modeling of Computer Systems (ACM SIGMETRICS’98), Madison, WI, USA, 1998. [17] J. Gemmel, Multimedia network file servers: multi-channel delay sensitive data retrieval, in: ACM (Ed.), Proceedings of the Conference, ACM Multimedia’93, 1993, pp. 243– 250. [18] H. Vin, P. Goyal, A. Goyal, A statistical admission control algorithm for multimedia servers, in: ACM (Ed.), Proceedings of the Conference, ACM Multimedia’94, San Francisco, 1994, pp. 33–40. [19] C. Ruemmler, J. Wilkes, Multimedia storage servers: a tutorial, IEEE Comput. 27 (3) (1994) 17–28. [20] F. Rosales, J. Carretero, F. Pérez, P. de Miguel, F. Garc´ıa, L. Alonso, CDS design: a parallel disk server for multicomputers, Technical Report FIM/83.1/DATSI/94, Universidad Politecnic Madrid, Madrid, Spain, 1994. http://www.laurel. datsi.fi.upm.es/gp/publications/datsi83.1.ps. [21] J. Hennesy, D. Patterson, Computer Architecture: A Quantitative Approach, 2nd Edition, Morgan Kaufmann, Los Altos, CA, 1995.

Jesús Carretero is a Full Professor in the Department of Computer Science at the Carlos III University (Madrid, Spain). Before moving to Carlos III University in 2000, he was an Associated Professor in the Department of Architecture and Technology of Computers at the Polytechnic University of Madrid (Spain). His research interests include operating systems, parallel and distributed systems, and real-time systems. He received his Bachelor of Science degree in Computer Science in 1989 at the Polytechnic University of Madrid (Spain), and his PhD in 1995 at the same university. During 1997 and 1998 he was visiting Northwestern University (Chicago, USA), collaborating with Alok Choudhary in the ECE department.

Javier Fernández is an Assistant Professor in the Department of Computer Science at the Carlos III University of Madrid, Spain. His research interests include operating systems, file systems, multimedia storage systems and parallel and distributed systems. He received his MS in Computer Science in 1999, and is actually in process of obtaining his PhD grade in this university.

J. Carretero et al. / Future Generation Computer Systems 19 (2003) 23–35 Félix García is an Associate Professor in the Department of Computer Science at the Carlos III University (Madrid, Spain). Before moving to Carlos III University in 2000, he was an Associated Professor in the Department of Architecture and Technology of Computers at the Polytechnic University of Madrid (Spain). His research interests include operating systems and parallel and distributed systems. He received his Bachelor of Science degree in Computer Science in 1993 at the Polytechnic University of Madrid (Spain), and his PhD in 1996 at the same university.

35

Alok Choudhary is a Full Professor at the ECE Department of the Northwestern University (Chicago, USA). He got his PhD at the University of Illinois, Urbana-Champaign, in 1989. He got several awards, such as the NSF Young Investigator Award (1993), IBM Faculty Development Award (1994), Intel Faculty Research Award (1993) and the IEEE Engineering Foundation Award (1990). His major research interests are compilers and run-time systems for high-performance, embedded and adaptive computing systems and power-aware systems, high-performance databases, OLAP and datamining, and parallel and distributed input-output for scientific and information processing applications. He is also the Area Editor of the Journal of Parallel and Distributed Computing and he has chaired several conferences, such as ICPP’93, IOPADS’96 and IOPADS’97.

Lihat lebih banyak...

Comentários

Copyright © 2017 DADOSPDF Inc.