
Video Quality-of-Service for Consumer Terminals - A Novel System for Programmable Components

Christian Hentschel, Member, IEEE, Reinder J. Bril, Yingwei Chen, Ralph Braspenning, and Tse-Hua Lan

Manuscript received July 4, 2003. C. Hentschel is with Philips Research Laboratories, Eindhoven, The Netherlands (phone: +31 40 2743091; fax: +31 40 2742630; e-mail: [email protected]). R. J. Bril is with Philips Research Laboratories, Eindhoven, The Netherlands (e-mail: [email protected]). Y. Chen is with Philips Research Laboratories, Briarcliff Manor, USA (e-mail: [email protected]). R. Braspenning is with Philips Research Laboratories, Eindhoven, The Netherlands (e-mail: [email protected]). T.-H. Lan was with Philips Research Laboratories, Briarcliff Manor, USA.

¹ We used a TM1300 @ 180 MHz, a member of the TriMedia™ Technologies Inc.® family of very long instruction word (VLIW) processors; see [1].

Abstract—Future consumer terminals will be more and more based on programmable platforms instead of only dedicated hardware. Novel ‘scalable video algorithm’ (SVA) software modules trade off resource usage against the quality of the output signal. SVAs, together with a strategy manager and a Quality-of-Service resource manager (QoS-RM), aim for flexible, robust, and cost-effective media processing in software on programmable architectures. We developed several SVAs, including MPEG-2 decoding and image enhancement algorithms, and the basic QoS control software to illustrate the power of the overall system. The resource-limited hardware is a currently available DSP board with a single dedicated media processor¹. The system supports two inputs and several modes for main and picture-in-picture (PiP) window applications, along with background recording.

Index Terms—Consumer terminals, scalable algorithms, QoS control, dynamic resource management.

I. INTRODUCTION

Future consumer terminals such as digital TV sets, set-top boxes (STBs) and displays combine high-quality digital audio and video with applications from the mainstream multimedia domain as found on PCs (e.g. video phone, streaming audio and video via the internet). They will be based on programmable platforms, allowing media processing functions to be implemented in software and enabling these systems to become open and flexible. Expected advantages over dedicated single-function hardware solutions include versatile, future-proof, upgradable products, reduced time-to-market for new features, and reuse of hardware and software modules to support product families.

High-volume electronics (HVE) consumer products are heavily resource constrained, with high pressure on silicon cost and power consumption. To compete with dedicated hardware solutions, these programmable components will have to be used very cost-effectively. Novel scalable algorithms aim for a more effective use of system resources by trading resource usage for output quality.

Fig. 1 shows the range of a programmable product family versus algorithm requirements. Programmable platforms with different resources (Fig. 1a) will exist in parallel to suit different markets. Current media processing algorithms are designed for the highest quality at given resources. In Fig. 1b, the height of the algorithms illustrates the resources needed for operation. The resource usage and output quality are usually not scalable, meaning that the number of algorithms allowed to run in parallel is platform dependent and very limited. A way of getting beyond these limitations is to design SVAs (Fig. 1c). These may have a kernel which is not scalable (dark areas) and a part which is scalable to increase quality (light areas).

[Fig. 1: (a) resources of a programmable product family, ranging from low-end to high-end; (b) fixed algorithms 1-4 stacked against the platform resources; (c) scalable algorithms 1-4, each with a non-scalable kernel and a scalable part.]
Fig. 1 Programmable product family and fixed or scalable software algorithms.

Traditional systems are designed for a specific target functionality at high quality (Fig. 2). Going even slightly beyond that target functionality increases the costs significantly. The scalable approach has no such limit; instead, the quality depends on the functionality used at a given time. Even upgrades of existing applications, or applications installed later, may run at very high quality. However, when multiple or complex applications run concurrently, the quality may drop, depending on the available resources.




[Fig. 2: two panels, cost versus functionality and quality versus functionality, comparing traditional systems (which hit a target limit) with the scalable approach.]
Fig. 2 Expected cost-effectiveness and quality trade-off of traditional systems compared with the scalable approach.

A number of business benefits and opportunities result from the flexibility of the scalable approach. A scalable approach does not come for free, however. Traditional systems support neither dynamic control of resources nor changes to the quality levels of an algorithm. Moreover, software solutions must preserve typical qualities of HVE consumer products, such as robustness and stability, and meet the stringent timing requirements imposed by high-quality digital audio and video processing. To meet the new challenges (openness and flexibility) as well as the existing ones (cost-effectiveness and robustness), future architectures must provide explicitly designed dynamic behaviour, explicit management of the available resources, and stability under stress and fault conditions.

Our work on scalable algorithms is embedded in a Quality-of-Service approach, in which the overall perceptual quality is optimized at run-time and in which seamless switching between different modes of operation (with different functionalities) is supported. Together with a strategy manager, a QoS resource manager (QoS-RM) enables SVAs to run in parallel by allocating the available resources and performing run-time optimization of the system. SVAs together with dynamic resource management support open architectures in a fast-changing multimedia environment. See [2]-[9] for more details.

In summary, consumer terminals using scalable media processing with dynamic resource management on programmable platforms aim at the following benefits:
• versatile, future-proof, upgradable products,
• fast time-to-market,
• reuse of modules to support product families, and
• cost-effectiveness, while maintaining existing qualities of HVE consumer terminals such as robustness, and meeting the stringent timing requirements imposed by high-quality digital audio and video processing.

In this paper, we focus on the first operational real-time system based on these ideas. The results presented are intended as a proof of concept and feasibility. The following section compares the quality-of-service approaches used in a network environment with the novel quality-of-service approach for consumer terminals. Section III describes the basic properties of SVAs. SVAs can instantly replace traditional video processing algorithms by applying the appropriate (fixed) quality-resource settings; since they do not necessarily require an additional framework, they are described first. In the longer term, with an increasing number of media algorithms running in software, SVAs become part of the overall system architecture, including the control framework. A conceptual description of the overall system architecture is given in Section IV, followed by system software and application aspects in Section V. Section VI elaborates on the prototype implementation and the results achieved.

II. TERMINAL QOS VERSUS NETWORK QOS

High-quality video processing in CTs has a number of distinctive characteristics when compared to mainstream multimedia processing in, for example, a (networked) workstation environment [10]. Consumer terminals need to connect to various input sources, and are increasingly being integrated in wired and wireless network environments. The transmission of various data streams, including graphics, audio and video, over networks started in the workstation and PC domains. From a single user's point of view, data transmission requests are often seen as point-to-point transmissions. Network requests from other users are independent, and these additional activities are noticed by the single user only through long delays or even network access denials. The most limiting resource is network bandwidth, which has to be shared by all current users. To solve these transmission problems, QoS has been introduced to optimise the service between different users. Data streams may get priorities and a specific portion of the network bandwidth. Especially in wireless networks, bandwidth cannot be guaranteed due to possible interference. Typical QoS parameters for streaming video over networks are image resolution, image size (window), frame rate, color depth, bit rate and compression quality, all aimed at lowering the transmission bandwidth. In summary, network QoS trades bandwidth resources to optimise the overall quality.

[Fig. 3: two consumer terminals and a server connected via a network. Each terminal contains a decoder, an encoder and media processing, with outputs to a display, speakers and storage. Terminal QoS trades processing resources against quality; network QoS trades bandwidth against quality; processing resources and bandwidth form the third side of the triangle.]
Fig. 3 Differences between terminal QoS and network QoS.

Fig. 3 shows an example of two future consumer terminals and a server in a network environment. These terminals provide decoders and encoders, which are parts of the media processing, and network interfaces. Output devices may be displays, speakers and storage devices. The different applications (e.g. viewing a movie, accessing the internet, playing games, etc.) require different media processing algorithms, which also depend on the input data (resolution, frame rate, quality, etc.). Typically, the processing resources are the limiting factor in consumer terminals, not the bandwidth resources. Therefore, QoS in consumer terminals differs from QoS in networks. Multimedia consumer terminals usually have a fixed resolution. The image size is determined by the display or chosen by the end-user, but not by the system. Consumer terminals have real-time requirements and do not allow frame-rate fluctuations or audio interruptions. As depicted in Fig. 3, terminal QoS trades processing resources against quality. The triangle also connects processing resources with bandwidth. An example of the validity of this triangle is an MPEG transmission: at a given transmission bandwidth, the data quality can also be influenced by the encoder processing resources. With more processing resources, a higher quality can be achieved at the same transmission bandwidth. CTs such as TV sets and STBs are currently receivers in a broadcast environment, and therefore do not have the option to negotiate compression quality and bit-rate, although that may change in the future for CTs in an in-home digital network.

III. SCALABLE VIDEO ALGORITHMS

A resource-quality scalable algorithm is an algorithm that
• allows the dynamic adaptation of output quality versus resource usage on a given platform,
• supports different platforms/product families for media processing, and
• is easily controllable by a control device via several predefined settings.

An example of a resource-quality scalable video algorithm is illustrated in Fig. 4 for sharpness enhancement. Sharpness enhancement is a special case, because no basic functionality is required to produce output images; scalable sharpness enhancement therefore does not necessarily require resources at the lowest quality level, as shown in Fig. 4. The next quality level, 'Basic with Artifacts', can include 1D and 2D filters for detail extraction, whose output is added to the original image to yield a sharper image. Adding information to a signal which already occupies the entire luminance quantization range can cause clipping artifacts and an increase in noise visibility. Higher quality levels may therefore include protection against these artifacts and adaptation to the noise level of the input signal. With higher quality levels the resource needs increase, but nothing can be said about the exact relation between quality and resources. They are unlikely to follow a simple law.

[Fig. 4: quality (lower to higher) versus resources (0 to higher) for scalable sharpness enhancement, with the levels 'None', 'Basic with Artifacts', 'Protection against Artifacts' and 'Protection + Noise Adaptive'.]
Fig. 4 Quality versus resources of a scalable algorithm.

Fig. 5 shows the basic structure of a scalable algorithm. A common interface via a quality control block is essential to communicate the possible quality levels and the required resources to the QoS environment. An algorithm can be split into a set of specific functions (Fig. 5), some of which are scalable to provide different quality levels. The properties of the active algorithm depend on the appropriate combination of the quality levels of the functions. These combinations may vary, but only a few may provide acceptable quality levels for the SVA (Fig. 6). The quality control block contains this information and the appropriate settings for the functions. The optimal quality-resource combinations are connected by the curve with maximum quality at the lowest resources.

[Fig. 5: an algorithm for media processing consisting of functions 1-4 between signal in and signal out; a quality control block receives external resource-quality control and sets the functions.]
Fig. 5 Basic structure of a scalable algorithm.

[Fig. 6: quality versus resources; the quality-resource combinations of the functions are scattered points, and the best choices are connected by a curve.]
Fig. 6 Best choices of quality-resource combinations for functions of the entire scalable algorithm.
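To make this common interface concrete, the following minimal sketch shows what the control interface of an SVA could look like in software. It is illustrative only and written under our own assumptions; the names (QualityLevel, set_quality_level, etc.) are not those of the actual system.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class QualityLevel:
    """One operating point of an SVA: a quality index plus its resource estimate."""
    level: int
    mips: float  # estimated CPU load at this level

class ScalableVideoAlgorithm:
    """Hypothetical common control interface an SVA exposes to the QoS environment."""

    def quality_levels(self) -> List[QualityLevel]:
        """Advertise the supported quality-resource operating points
        (the 'best choices' curve of Fig. 6)."""
        raise NotImplementedError

    def set_quality_level(self, level: int) -> None:
        """Map the external quality level onto the settings of the internal
        scalable functions; must be safe to call while the SVA is running."""
        raise NotImplementedError
```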

Clearly, a small range of resource usage, combined with its control overhead, does not gain significant advantages over traditional algorithms with fixed quality. We therefore developed several SVAs to investigate the range of resources required to perform the functionality at different quality levels.

A. MPEG-2 Decoder

MPEG-2 decoding complexity scalability can be achieved through both SNR quality degradation and spatial resolution reduction. We refer to the first mode of complexity scalability as graceful degradation (GD) and to the latter as embedded scaling (ES).

Computational graceful degradation was first introduced by Mattavelli et al. [11] in 1998, and further studied by S. Peng [4]. In this type of reduced-complexity decoding, the resolution at which decoding is performed remains the same as dictated by the input bitstream. Known complexity reduction mechanisms include DCT coefficient masking and reduced-complexity motion compensation. Both techniques are used in our scalable MPEG-2 decoder, shown in Fig. 7.

[Fig. 7: bitstream → variable length decoder → inverse scan / inverse quantization → IDCT → video out, with motion compensation and frame memory in the prediction loop; a quality control block receives external resource-quality control. The IDCT and motion compensation are the scalable functions.]
Fig. 7 Block diagram of the scalable MPEG-2 decoder. The scalable functions are marked with arrows.

In DCT masking, selected high-frequency DCT coefficients are masked to zero, and a reduced-size IDCT, such as 8x4, 4x4, or 8x2, is correspondingly performed to save computation. The DCT masking is performed independently of the signal; that is, the same DCT mask is applied to all DCT blocks in a video frame. For most natural video areas with smooth transitions, there is relatively little energy in the high-frequency DCT coefficients, and masking those coefficients to zero therefore leads to only a small loss in quality. Even for textured video areas with large high-frequency DCT coefficients, masking out these coefficients causes a loss of spatial detail that may still be tolerable. However, for areas of interlaced video coded with frame DCT, vertical high-frequency DCT coefficients deserve special treatment, as they can also result from inter-field differences, commonly seen in interlaced video with horizontal motion. Proper interlacing of the two fields is lost if these high-frequency DCT coefficients are discarded, and visible artifacts such as motion jitter result.

In addition to the reduced-complexity IDCT, the complexity of motion compensation (MC) can be reduced through coarser-precision interpolation, e.g. full-pel MC for all macroblocks. However, our experiments show that reduced-precision MC leads to very annoying, geometrically distorted images with shifted edges. In the following, we describe the technologies we used to improve the complexity-quality performance.
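As a concrete illustration of DCT masking with the interlace protection just described, consider the following sketch. It is a simplified model under our own assumptions, not the decoder's actual code: the mask shape and energy threshold are hypothetical, and a full-size IDCT is used for clarity where a real decoder would use a pruned (e.g. 8x4 or 4x4) IDCT to actually save cycles.

```python
import numpy as np
from scipy.fft import idctn

def select_mask(coeffs: np.ndarray, frame_dct: bool,
                v_energy_thresh: float = 1e4) -> np.ndarray:
    """Choose a coefficient mask for one 8x8 DCT block (illustrative).

    Default: keep only the 4x4 low-frequency corner. For frame-DCT coded
    blocks whose vertical high frequencies carry significant energy
    (typically inter-field differences in interlaced material), retain
    those coefficients to avoid motion-jitter artifacts.
    """
    mask = np.zeros((8, 8), dtype=bool)
    mask[:4, :4] = True
    v_hi = coeffs[4:, :4].astype(np.float64)  # vertical high, horizontal low
    if frame_dct and np.sum(v_hi * v_hi) > v_energy_thresh:
        mask[4:, :4] = True
    return mask

def masked_idct(coeffs: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero the masked coefficients and transform back to the pixel domain."""
    return idctn(np.where(mask, coeffs, 0.0), norm='ortho')
```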

1) Signal Adaptive Processing

Prior work on reduced-complexity decoders uses the same technique independent of the local data being processed. For example, a fixed 4x4 DCT coefficient mask may be used on all DCT blocks to remove the high-frequency coefficients, thereby reducing the complexity of the IDCT and subsequent decoding operations. In our work, we found several problems with this signal-independent approach. First, some DCT blocks coded with frame-type DCT contain large vertical high-frequency DCT coefficients which, when removed, result in motion jitter artifacts. To tackle this problem, we developed an algorithm to detect such DCT blocks and selectively retain some vertical high-frequency coefficients for GD decoders. Second, coarser-resolution motion compensation in ES decoders may lead to a serious mismatch between the motion prediction signals on the encoder and decoder sides, and result in totally wrong decoded blocks [12]. Our reduced-resolution motion compensation incorporates a module to detect such macroblocks and use the correct reference signal for motion compensation. In both cases, the signal-dependent approach eliminates the visible artifacts caused by signal-independent processing, with only a slight increase in complexity due to the detection process.

2) Picture-Type Dependent Processing

Prediction drift is a main contributor to the quality loss of complexity-scalable decoding, and it is therefore critical that the amount of error propagation be tightly controlled. Because I and P pictures serve as reference pictures while B pictures do not, I and P pictures should be decoded at higher quality, while B pictures can be decoded at lower quality. Given a fixed amount of total computation resources for a group of reference and non-reference pictures, this unequal treatment of reference and non-reference pictures yields better overall video quality than treating all picture types equally. Due to the flexibility of a software implementation, it is feasible to vary the processing algorithms for different types of pictures. In picture-type dependent complexity-scalable decoding, anchor pictures are always decoded with the high-complexity algorithm and predicted from other good-quality anchor pictures. B pictures are always decoded with the low-complexity algorithm, but predicted from good-quality anchor pictures. This ensures that the output video maintains good quality even towards the end of a GOP, which is not possible with picture-type independent algorithms, where quality degrades progressively due to prediction drift throughout a GOP until the next intra picture.
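The policy can be stated in a few lines. The sketch below assumes a decoder whose quality levels are numbered from 0 (full complexity) upwards; the constants are illustrative, not taken from the implementation.

```python
HIGH_QUALITY = 0  # full-complexity decoding (assumed level numbering)
LOW_QUALITY = 3   # strongest DCT masking / cheapest motion compensation

def decoding_level(picture_type: str) -> int:
    """I and P pictures are anchors: decode them at full complexity so that
    prediction drift cannot build up within a GOP. B pictures are never
    referenced, so their quality loss does not propagate."""
    return HIGH_QUALITY if picture_type in ('I', 'P') else LOW_QUALITY
```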

The quality-resource usage of the scalable MPEG-2 GD decoder is illustrated in Fig. 8. The scalable functions are limited to the IDCT and motion compensation blocks. The resources shown are measured results for the VLIW CPU. The scalable MPEG-2 decoder needs between about 78 and 101 MIPS. These numbers are not fixed even at the same quality level; they vary due to data dependencies. The scalability ranges from about 77-100 %, corresponding to a resource range of about 23 MIPS. The quality estimate is a coarse 'thumbnail' estimate made by a few experts.

[Fig. 8: quality estimate versus resources (0-120 MIPS) for the scalable MPEG-2 GD decoder.]
Fig. 8 Best choices of quality-resource combinations for functions of the scalable MPEG-2 decoder.

B. Sharpness Enhancement

The second scalable video algorithm performs sharpness enhancement. The block diagram is depicted in Fig. 9. The detail filter extracts the information at higher frequencies, which is present at e.g. edges and textures. This detail information is further processed by the amplitude control function and added to the original incoming signal. The output signal appears visually sharper.

[Fig. 9: video in → detail filter → amplitude control → addition to the original signal → video out; a quality control block receives external resource-quality control.]
Fig. 9 Block diagram of scalable sharpness enhancement. The scalable functions are marked with arrows.

[Fig. 10, kernels as reconstructed from the source:
  1D:  1/4 · ( -1  0  2  0  -1 )
  2D:  1/8 · ( -1  0  0  0  -1
                0  0  0  0   0
                0  0  4  0   0
                0  0  0  0   0
               -1  0  0  0  -1 )]
Fig. 10 Detail filter: the top shows the horizontal 1-dimensional filter kernel, and the bottom the non-separable 2-dimensional filter kernel.
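Detail extraction with these kernels can be sketched as follows; the coefficients are taken from the reconstruction of Fig. 10 above and should be treated as illustrative.

```python
import numpy as np
from scipy.ndimage import convolve

KERNEL_1D = np.array([[-1, 0, 2, 0, -1]], dtype=np.float64) / 4.0
KERNEL_2D = np.array([[-1, 0, 0, 0, -1],
                      [ 0, 0, 0, 0,  0],
                      [ 0, 0, 4, 0,  0],
                      [ 0, 0, 0, 0,  0],
                      [-1, 0, 0, 0, -1]], dtype=np.float64) / 8.0

def detail_signal(luma: np.ndarray, two_dimensional: bool) -> np.ndarray:
    """Extract the high-frequency detail that is later added back to the
    input; zero in flat areas, larger for steeper edges."""
    kernel = KERNEL_2D if two_dimensional else KERNEL_1D
    return convolve(luma.astype(np.float64), kernel, mode='nearest')
```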

The detail filter can be scaled in various ways: by choosing 1D or 2D filters, the number of filter coefficients, quantization, and others. We chose just the two simple filters shown in Fig. 10. More information can be found in [7].

One problem of adding detail information to the original signal is that of clipping artifacts. The amplitude control block in Fig. 9 is a non-linear function to reduce these artifacts. It can be implemented with a look-up table (LUT). In unstructured areas the output of the detail filter is zero. This changes when details like textures or transitions are present: the steeper the edges or the higher the pixel differences (contrast), the higher the filter output amplitudes become. The non-linear processing is approximated by, e.g., a sinusoidal function between input and output,

    f(x) = (a / π) · sin(π · x / a),    (1)

with a = 128, as shown in Fig. 11.

[Fig. 11: plot of f(x) for a = 128; the input ranges over about ±100 and the output over about ±40.]
Fig. 11 Amplitude control function.

The non-linear function ensures a high gain in low-contrast areas. High-contrast areas need less or no additional sharpness enhancement; if the contrast is already high, the additional detail signal can cause the unwanted clipping artifacts. The non-linear function ensures that the detail signal is significantly reduced for high input signal amplitudes. With two detail filters and the optional non-linear function, four quality levels are already possible. In the special case that no processing is done, the output image equals the input image, which is also a valid option. The sharpness enhancement algorithm operates between 0 and 29 MIPS and is thus 100 % scalable. The quality levels, resource usage and settings are listed in Table I and shown in Fig. 12.

TABLE I
QUALITY LEVELS AND FUNCTIONAL DETAILS OF THE SCALABLE SHARPNESS ENHANCEMENT ALGORITHM

Quality Level | Resource Usage [MIPS] | Detail Filter | Amplitude Control
------------- | --------------------- | ------------- | -----------------
0             | 0                     | off           | off
1             | 16.5                  | 1-dimensional | off
2             | 19.3                  | 1-dimensional | on
3             | 25.4                  | 2-dimensional | off
4             | 29.3                  | 2-dimensional | on
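A sketch of how the quality control block could combine the settings of Table I with the amplitude control of eq. (1), implemented as a LUT; the LUT range and the clamping behaviour are our assumptions.

```python
import numpy as np

A = 128  # parameter of eq. (1)

# Tabulate eq. (1) once over the detail-signal range [-255, 255]; inputs
# beyond +/-A are clamped to the zero crossings of the sine.
_x = np.arange(-255, 256)
AMPLITUDE_LUT = np.round((A / np.pi) * np.sin(np.pi * np.clip(_x, -A, A) / A))

def enhance(luma: np.ndarray, quality_level: int, detail: np.ndarray) -> np.ndarray:
    """Add the (optionally amplitude-controlled) detail signal to the input.
    Per Table I, amplitude control is active at quality levels 2 and 4."""
    d = detail.astype(np.int64)
    if quality_level in (2, 4):
        d = AMPLITUDE_LUT[np.clip(d, -255, 255) + 255].astype(np.int64)
    return np.clip(luma.astype(np.int64) + d, 0, 255).astype(np.uint8)
```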

[Fig. 12: quality estimate versus resources (0-30 MIPS) for scalable sharpness enhancement.]
Fig. 12 Best choices of quality-resource combinations for functions of the scalable sharpness enhancement algorithm.

Fig. 13 shows the result of three quality levels (QL0, QL1, QL4) on a small part of an image from a soccer sequence. Frames consisting of the two fields of the interlaced video are shown, so that the full vertical resolution is visible. The 'egg-slice' artifacts at the borders are created by temporal differences between the two fields and are not visible in the moving, real-time video. QL0 depicts the original image and is the lowest quality level, requiring zero resources. QL1 uses the horizontal detail filter and shows an increase in sharpness, especially visible in the gate and the letters. QL4 uses the two-dimensional detail filter and the amplitude control function. It also provides additional sharpness in the vertical direction, visible at the horizontal lines on the soccer field.

[Fig. 13: the same image region at QL0 (0 MIPS), QL1 (16.5 MIPS) and QL4 (29.3 MIPS).]
Fig. 13 Scalable sharpness enhancement algorithm at different quality levels (QL).

C. Down-Scaler

The block diagram of the scalable down-scaler is shown in Fig. 14. PiP applications require only simple down-scaling by natural factors, so the algorithm requires only a low-pass filter followed by down-sampling. In our case, we restrict the down-scaling to a factor of four and make the low-pass filter scalable. Of course, only output pixels need to be calculated, which reduces the processing resources significantly. The scalable down-scaler with a decimation factor of 4 in both the horizontal and vertical directions requires between 4 and 14 MIPS (Fig. 15). At the lowest quality level, QL0, sub-sampling alone is performed, without any pre-filtering. Quality level QL1 uses an average filter over 4x4 pixels for the luminance, while QL3 uses the same filter for the chrominance too. Separable 5-tap filters (1, 4, 6, 4, 1) for the luminance (QL2) and for both luminance and chrominance (QL4) complete the scalability range. The scalability ranges from 29-100 %, corresponding to a resource range of 10 MIPS.
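A sketch of the 4:1 down-scaler is given below. The quality-level numbering is simplified with respect to QL0-QL4 above (which additionally distinguishes luminance from chrominance), and the full plane is filtered for brevity, whereas a real implementation would evaluate the filters at output positions only.

```python
import numpy as np
from scipy.ndimage import convolve1d

BOX_4 = np.ones(4) / 4.0                                       # 4x4 average, as two passes
FIVE_TAP = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0  # separable 5-tap

def downscale_4to1(plane: np.ndarray, level: int) -> np.ndarray:
    """level 0: plain sub-sampling; 1: 4x4 average; 2: 5-tap pre-filter."""
    p = plane.astype(np.float64)
    if level == 1:
        p = convolve1d(convolve1d(p, BOX_4, axis=0), BOX_4, axis=1)
    elif level == 2:
        p = convolve1d(convolve1d(p, FIVE_TAP, axis=0), FIVE_TAP, axis=1)
    return p[::4, ::4].astype(plane.dtype)
```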

[Fig. 14: video in → 2D low-pass filter → down-sampling by a factor M → video out; a quality control block receives external resource-quality control.]
Fig. 14 Basic structure of the scalable image down-scaler.

[Fig. 15: quality estimate versus resources (0-15 MIPS) for the 4:1 down-scaler.]
Fig. 15 Best choices of quality-resource combinations for functions of the down-scaler for PiP applications.

D. Scalability Performance

If all three SVAs are active, the total resource requirements range from about 82 to 144 MIPS, leaving 62 MIPS for additional functionality when requested. This resource range corresponds to an overall scalability range of about 57-100 %. The range (in percentage) would be significantly larger without the MPEG-2 decoder: only two functions of the MPEG-2 decoder were made scalable, and we learned that some functions are hard to scale. All scalable algorithms have a control overhead of less than 1 %. With this large range of scalability, the SVAs developed achieve the desired flexibility.

IV. OVERALL SYSTEM ARCHITECTURE

The system architecture is the result of a close cooperation between experts from the high-quality video domain and system software specialists, with input from experts in yet other application domains (e.g. 3D graphics; see [13]). Such cooperation is not only desirable but essential for building consumer terminals that support cost-effective media processing in software. In our approach to QoS for consumer terminals, media processing applications and system software have separate but complementary responsibilities. Moreover, our approach is based on two related but distinct concepts: quality levels and resource budgets. Quality levels are provided by the applications, whereas resource budgets are guaranteed (and enforced) by the system software. Note that the notion of resource budgets (or reservations [14]) is a proven concept to provide robustness between applications. For further details see also [15].

Fig. 16 presents the overall system architecture for high-quality video and illustrates the QoS environment for SVAs. The application part consists of a strategy manager (SM) and a number of resource consuming entities (RCEs). RCEs are composed of one or more media processing components, in our case SVAs. Hence, SVAs are encapsulated (or clustered) into RCEs in our architecture. A modular set of SVAs may perform the different video functions needed in a digital TV set, STB, or multimedia PC. The system part is composed of a resource manager (RM) and a quality manager (QM).

[Fig. 16: the user interface talks to the strategy manager, which controls the SVAs (application part); the quality manager and resource manager (system part) sit on top of the platform and operating system.]
Fig. 16 Overall system architecture including scalable video algorithms and the supporting Quality-of-Service environment.

The SM and the QM cooperate to determine the preferred quality settings and the budgets for the RCEs. The SM collects information about the resource requirements for all quality levels and provides a strategy for the overall quality optimization of a running application; it deals with application-domain semantics. The QM, on the other hand, works with a general (semantically neutral) notion of utility. The QM optimizes the system utility based on the utilities of the individual applications, the weights of these applications and their resource estimates. It also starts and stops applications. The resource manager (RM) provides the functionality of a resource kernel (similar to [16]), controlling the admission and scheduling of RCEs and their run-time execution budgets. Together, the QM and the RM constitute the QoS-RM. Both the SM and the QoS-RM receive inputs from the user interface so that they can react to commands.
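The QM's run-time optimization can be sketched as a constrained maximization of weighted utility over the quality levels of the admitted applications. The exhaustive search below is for illustration only (a real system needs something cheaper), and the data layout is our assumption.

```python
from itertools import product

def optimize_system_utility(apps, total_mips):
    """apps: list of {'weight': w, 'levels': [(mips, utility), ...]}.
    Return the per-application quality-level indices that maximize the
    weighted sum of utilities without exceeding the resource constraint,
    or None if no combination fits."""
    best, best_utility = None, float('-inf')
    for choice in product(*(range(len(a['levels'])) for a in apps)):
        mips = sum(a['levels'][i][0] for a, i in zip(apps, choice))
        if mips > total_mips:
            continue
        utility = sum(a['weight'] * a['levels'][i][1] for a, i in zip(apps, choice))
        if utility > best_utility:
            best, best_utility = choice, utility
    return best
```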

V. MAINTAINING SYSTEM QUALITY

In HVE consumer terminals, software media processing is done using dedicated media processors that are expensive compared to dedicated hardware solutions, both in cost and in power consumption. Therefore, cost-effectiveness is a major issue. Cost-effectiveness requires high average resource utilization. This requirement conflicts with the hard real-time requirements of high-quality video, which are traditionally met by worst-case resource allocation.

Many media processing functions, like audio and video encoding and decoding, VRML and motion estimation, have resource demands that vary strongly over time, due to the varying complexity of the media data to be processed. Moreover, the actual number of cycles available to an application in a certain interval is unpredictable. This unpredictability is due to cache and bus interference between media processing functions, and to overhead involved in control and interrupt processing. Hence, there is no fixed relationship between CPU budget and effective CPU cycles, and guaranteed yet realistic worst-case allocation of CPU budgets is currently not possible. For these two reasons, we are forced to opt for well below worst-case resource allocation. Without precautionary measures, such a resource allocation may jeopardise system qualities such as robustness and stability.

A. Robustness

The RM contributes to robustness by resolving the temporal interference between applications, which is a major threat for open systems. The cost-effectiveness of HVE consumer terminals gives rise to an additional robustness problem within applications. These latter problems have to be resolved by the applications themselves; stated differently, applications have to get by with their budgets. This is illustrated by means of an example.

Fig. 17 shows imaginary CPU demands of two applications over time. The height of the figure represents the total available CPU time. The CPU demands of applications 1 and 2 are drawn from the bottom upwards and from the top downwards, respectively. A resource budget below the worst-case resource demand is allocated to each of the applications. For ease of presentation, the sum of these budgets equals the total amount of CPU time.

[Fig. 17: CPU time of application 1 (drawn from the bottom) and application 2 (drawn from the top) over time, with budget 1 and budget 2; shaded areas mark overloads.]
Fig. 17 Resource budgets and overloads.

In Fig. 17, application 1 exceeds its budget in four places. Within the limited time frame shown, application 2 does not exceed its budget. In two of those four situations, application 1 may run on the 'slack' (i.e. the time allocated to, but not used by) of application 2. In the other two situations, marked by the shaded areas in the figure, application 1 has to face the overload itself. Note that such a temporal problem within an application does not hamper other applications.

The MPEG-2 decoder described above is an example of a video algorithm providing adaptive control. Using an MPEG-2 decoding complexity estimation model, the required computational load can be predicted, and the computation is subsequently scaled such that it will not exceed its resource budget.
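A minimal sketch of such budget-driven regulation, assuming the complexity model yields one load prediction per quality level for the upcoming picture (names and units are ours):

```python
def regulate(predicted_mips, budget_mips):
    """predicted_mips: per-quality-level load predictions for the next
    picture, index 0 = highest quality. Return the highest quality level
    whose predicted load still fits the budget."""
    for level, mips in enumerate(predicted_mips):
        if mips <= budget_mips:
            return level
    return len(predicted_mips) - 1  # overload: fall back to the cheapest level
```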

B. Stability During Mode Changes

A mode change occurs when a user initiates an additional application, e.g., opens a picture-in-picture (PiP) window. First, the system determines the optimal quality levels at which the new mix will run, i.e., the maximum system utility that can be obtained with the available resources. Next, the new quality levels have to be implemented. The applications are responsible for providing smooth transitions from the old quality levels to the new ones. Quality-level reductions must precede the start of new applications to prevent overload during these transitions.

VI. DEMONSTRATOR

We have built a demonstrator to show the feasibility of the approach and to test the concept. In the following, we first describe the goals of the demonstrator, then the applications provided, the implementation, and finally the results achieved. Further information on system aspects of the demonstrator can be found in [17].

A. Goals

The demonstrator has three main goals. First, it should show that an application can be added to a fully loaded terminal, illustrating the benefits of the scalable approach compared to traditional systems. The other two goals are related to the preservation of typical qualities of HVE consumer terminals, in particular robustness and stability: the demonstrator should show robustness upon load increases as a second goal, and basic stability during mode changes as a third.

B. Description

Our first system with dynamic resource management is shown in Fig. 18. Two inputs are supported, an MPEG-2 stream and an analog video input, both with standard resolution (SD, i.e. 720 pixels by 576 lines). The outputs, video and audio, are connected to a TV. The system supports the following algorithms:
• scalable algorithms (MPEG-2 decoder, sharpness enhancement, down-scaler (QL0…QL3) for picture-in-picture applications),
• non-scalable algorithms (demultiplexer, audio decoder, SW mixer, HW scaler, HW mixer, MPEG-1 encoder), and
• control modules (strategy manager, QoS-RM, graphics overlay with performance measurements).

Our system demonstrator provides four main applications:
a) Main: MPEG-2 stream (DVD output) viewing in the main window;
b) PiP: analog video (camera output) viewing in a PiP window;
c) Rec: analog video recording to disk (in MPEG-1 format);
d) Back: video playback from disk.

[Fig. 18: the DVD MPEG stream is demultiplexed into audio (audio decoder) and video (MPEG GD/AFD decoder followed by sharpness enhancement, SD); the analog video input (SD) is scaled by the HW scaler to CIF for the MPEG-1 encoder and by the SW scaler to QCIF for the PiP picture; a SW mixer combines the SD and QCIF pictures, and a HW mixer adds the graphics overlay (user interface) before video and audio output.]
Fig. 18 Block diagram of the system implementation on a multimedia processor. The strategy manager, quality manager, resource manager, and platform and operating system are not shown.

The system implements four basic modes of operation, selectable by an IR remote control, in which different sets of these applications are provided:

Mode 1 (Main only): The MPEG-2 stream is demultiplexed into an audio and a video stream and subsequently decoded. Video is decoded in standard resolution, and the sharpness is enhanced. The video output is shown in the main window.

Mode 2 (Main + PiP): This mode extends mode 1 with the PiP application. The SW scaler is used to scale the analog video input down to QCIF format, and a SW mixer combines the SD output and the QCIF output.

Mode 3 (Main + PiP + Rec): This mode extends mode 2 with the Rec application. The analog video input is scaled down from SD format to CIF format, encoded with an MPEG-1 encoder, and subsequently recorded to disk.

Mode 4 (Back): The recorded material is played using MPEG decoding and sharpness enhancement. This mode basically illustrates that the recording in mode 3 actually took place.

The graphics overlay is active in all four modes and shows the selected quality levels, assigned resources and resource usage of the three scalable algorithms. An additional bar displays the overall resource usage of the VLIW CPU. An additional fifth mode has been implemented to illustrate the use of the MPEG-2 decoder with embedded scaling. In this mode, the decoder provides down-conversion from SD (MPEG stream from DVD) to CIF resolution. The output is centered in the main window.

C. Implementation

The system was built using the streaming architecture and a commercial off-the-shelf (COTS) real-time operating system (RTOS) provided with the DSP board. The RM is implemented on top of this COTS RTOS and provides the periodic budgets that support applications from the high-quality video domain. All signal processing is done on the DSP board, which is hosted in a PC; the TV is used as a monitor only.

Only basic means for QoS control have been implemented in the demonstrator. The optimization of the system utility by the QM is implemented as a simple function using the quality settings and processor requirements of the RCEs. The SM assumes a fixed set of applications and contains a look-up table that determines the quality settings of the RCEs for each mode.

In both mode 1 and mode 2, all algorithms are able to run at their highest quality most of the time. In mode 1, the CPU is occupied approximately 60-80 % on average. The resource requirements depend on the input streaming data and change dynamically with the image content. In mode 2, the total load of the applications is about 90 % of the processor capacity. Depending on the input data, regulation to a lower quality level in at least one application may be appropriate to avoid interference from peak-load overheads. Mode 3 is the most interesting, because the three applications Main, PiP and Rec provided in this mode create a severe load on the system. Only when all three SVAs are scaled down to their lowest quality levels are there sufficient resources to provide smooth output signals for all three applications most of the time. Mode 4 is the playback mode for the recorded material, with no severe demands on resources.

The total overhead of the dynamically scalable algorithms and the basic QoS control was estimated to be below 3 %. Compared to the system benefits (scalability ranges from about 57-100 % when all three SVAs are active), this overhead is negligible.

D. Results

All three goals were met. Mode 3 illustrates that an application (background recording) can be added to a consumer terminal that is already fully loaded (mode 2). The stability of the system was illustrated not only during mode changes but also during normal operation: none of the applications crashed or had to be terminated. Even a manual override of selected quality levels of the scalable algorithms in mode 3 to higher levels does not lead to instability. The manual override also illustrates the robustness of the system. The corresponding application faced the overload with no means to lower its quality level further; still, the application was able to provide output. For video, this was realized by basic degradation techniques such as frame skipping; for audio, interruptions of the sound were clearly noticeable. We therefore conclude that our demonstrator meets its expectations by illustrating the benefits of a scalable approach while still meeting specific consumer terminal requirements such as robustness, real-time constraints, and stability.

VII. CONCLUSION

We have demonstrated that scalable algorithms in combination with dynamic resource management can reduce the current limitations of programmable platforms for consumer terminals. Several SVAs are ready and embedded in a basic system with dynamic resource management. The proof of concept and feasibility was successful, confirming the major advantages of this novel technology.

ACKNOWLEDGMENT

The authors thank all project members in Eindhoven (NL) and Briarcliff Manor (USA) for their valuable contributions to the project and their support in making it happen in general, and Maria Gabrani, Iulian Nitescu, Clara M. Otero Pérez, and Zhun Zhong in particular.

REFERENCES

[1] F. Sijstermans, G. Slavenburg, "Providing the processing power for consumer multimedia", IEEE Int. Conf. on Consumer Electronics (ICCE), Digest of Technical Papers, 1997, pp. 156-157.
[2] C. Hentschel, R. J. Bril, M. Gabrani, L. Steffens, K. van Zon, S. van Loo, "Scalable video algorithms and dynamic resource management for consumer terminals", Int. Conf. on Media Futures (ICMF), Proceedings, May 2001, pp. 193-196.
[3] R. J. Bril, M. Gabrani, C. Hentschel, S. van Loo, L. Steffens, "QoS for consumer terminals and its support for product families", Int. Conf. on Media Futures (ICMF), Proceedings, May 2001, pp. 299-302.
[4] S. Peng, "Complexity scalable video decoding via IDCT data pruning", IEEE Int. Conf. on Consumer Electronics (ICCE), Digest of Technical Papers, June 2001, pp. 74-75.
[5] Z. Zhong, Y. Chen, "Scaling in MPEG-2 decoding loop with mixed processing", IEEE Int. Conf. on Consumer Electronics (ICCE), Digest of Technical Papers, June 2001, pp. 76-77.
[6] R. J. Bril, L. Steffens, S. van Loo, M. Gabrani, C. Hentschel, "Dynamic behavior of consumer multimedia terminals: System aspects", IEEE Int. Conf. on Multimedia and Expo (ICME), Proceedings, August 2001, CD-ROM, ISBN 0-7695-1198-8.
[7] T.-H. Lan, Y. Chen, Z. Zhong, "MPEG-2 decoding complexity regulation for a media processor", IEEE Multimedia and Signal Processing Workshop (MMSP), October 2001, pp. 193-198.
[8] C. Hentschel, R. Braspenning, M. Gabrani, "Scalable algorithms for media processing", Int. Conf. on Image Processing (ICIP), Proceedings, October 2001, pp. 342-345.
[9] R. J. Bril, C. Hentschel, E. F. M. Steffens, M. Gabrani, G. C. van Loo, J. H. A. Gelissen, "Multimedia QoS in consumer terminals" (invited lecture), IEEE Workshop on Signal Processing Systems (SIPS), Proceedings, September 2001, pp. 332-343.
[10] K. Nahrstedt, H. Chu, S. Narayan, "QoS-aware resource management for distributed multimedia applications", Journal on High-Speed Networking, Special Issue on Multimedia Networking, IOS Press, Vol. 8, No. 3-4, pp. 227-255, 1998.
[11] M. Mattavelli, S. Brunetton, "Implementing real-time video decoding on multimedia processors by complexity prediction techniques", IEEE Transactions on Consumer Electronics, Vol. 44, No. 3, August 1998, pp. 760-767.
[12] Z. Zhong, Y. Chen, T.-H. Lan, "Signal adaptive processing in MPEG-2 decoders with embedded resizing for interlaced video", Visual Communications and Image Processing (VCIP), Proceedings, January 2002, pp. 434-441.
[13] W. van Raemdonck, G. Lafruit, E. F. M. Steffens, C. M. Otero Pérez, R. J. Bril, "Scalable 3D graphics processing in consumer terminals", IEEE Int. Conf. on Multimedia and Expo (ICME), Proceedings, August 2002, CD-ROM, ISBN 0-7803-7305-7.
[14] C. W. Mercer, S. Savage, H. Tokuda, "Processor capacity reserves: Operating system support for multimedia applications", Int. Conf. on Multimedia Computing and Systems (ICMCS), Proceedings, May 1994, pp. 90-99.
[15] R. J. Bril, E. F. M. Steffens, "User focus in consumer terminals and conditionally guaranteed budgets", 9th International Workshop on Quality of Service (IWQoS), Proceedings, Lecture Notes in Computer Science (LNCS) 2092 (Eds.: L. Wolf, D. Hutchison, R. Steinmetz), Springer-Verlag, June 2001, pp. 107-120.
[16] R. Rajkumar, K. Juvva, A. Molano, S. Oikawa, "Resource kernels: A resource-centric approach to real-time and multimedia systems", SPIE/ACM Conference on Multimedia Computing and Networking, Proceedings, January 1998.
[17] C. M. Otero Pérez, I. Nitescu, "Quality of Service resource management for consumer terminals: Demonstrating the concepts", Work in Progress Session of the 14th Euromicro Conf. on Real-Time Systems (ECRTS), Research report 36/2002, Vienna University of Technology, June 2002, pp. 29-32.

BIOGRAPHIES

Christian Hentschel (M'99) received his Dr.-Ing. (Ph.D.) in 1989 and his Dr.-Ing. habil. in 1996 from the Technical University of Braunschweig, Germany. He worked on digital video signal processing with a focus on quality improvement. In 1995, he joined Philips Research in Briarcliff Manor, USA, where he headed a research project on moiré analysis and suppression for CRT-based displays. In 1997, he moved to Philips Research in Eindhoven, The Netherlands, leading a cluster for programmable video architectures. He currently holds the position of Principal Scientist and coordinates a project on scalable media processing with dynamic resource control between different research laboratories. He is a member of the Technical Committee of the International Conference on Consumer Electronics (IEEE) and a member of the FKTG in Germany.


Reinder J. Bril received a B.Sc. and an M.Sc. (both with honours) from the Department of Electrical Engineering of the University of Twente, The Netherlands. Since 1985, he has been with Philips. He has worked in both Philips Research and Philips' Business Units on various topics, including fault tolerance, formal specifications and software architecture analysis, and in different application domains. He is currently working at Philips Research Laboratories Eindhoven (PRLE), The Netherlands, in the area of Quality of Service (QoS) for consumer devices, with a focus on dynamic resource management in receivers in broadcast environments (such as digital TV sets and STBs).

Yingwei Chen received her B.E. from Tsinghua University, Beijing, China in 1992 (Summa Cum Laude) and her M.S. from Rensselaer Polytechnic Institute, Troy, New York in 1995, both in Electrical Engineering. Since 1996, she has been with Philips Research in Briarcliff Manor, New York. Her research interests include video compression, processing and transmission.


Ralph Braspenning was born in Zundert, The Netherlands, in 1976. He received his M.Sc. from the Eindhoven University of Technology in 2000. In the same year he joined Philips Research Eindhoven, The Netherlands, where he is a Research Scientist in the Video Processing and Visual Perception group. He is currently working on complexity scalable video algorithms, in particular scan-rate up-conversion algorithms. His particular interests are video and image analysis and processing.

Tse-Hua Lan (S’94-M’00) was born in Taiwan, R.O.C. He received a B.Sc. in electrical engineering from the University of Costa Rica, San Jose, Costa Rica, and M.S. and Ph.D. in electrical engineering from the University of Minnesota, Minneapolis, MN, in 1995 and 2000 respectively. From 1995 to 1999, he was a graduate research assistant at the Multiscale Multimedia signal processing group, University of Minnesota. From 1999 to 2002 he was with Philips Research USA, Briarcliff Manor, NY, as a senior member of research staff. Since 2002, he has been with EG Technologies in Atlanta, Georgia. His research interests include low power wireless multimedia communications, image and video coding, digital image watermarking, complexity scalable system design, and video processing for quality enhancement.
