Marker-less systems for tracking working postures—results from two experiments


Applied Ergonomics 32 (2001) 461–471

S. Pinzke a,*, L. Kopp b

a Division of Work Science, Department of Agricultural Biosystems and Technology, Swedish University of Agricultural Sciences, PO Box 88, SE-230 53 Alnarp, Sweden
b Cognitive Science, Lund University, Kungshuset, Lundagård, SE-222 22 Lund, Sweden

Received 22 April 1999; accepted 10 April 2001

Abstract

Two experiments were performed to examine the usability of different marker-less approaches in image analysis and computer vision for automatic registration of OWAS (Ovako working posture analysing system) postures from video film. In experiment 1, a parametric method based on image analysis routines was developed both for separating the subject from its background and for relating the shapes of the extracted subject to OWAS postures. All 12 analysed images were correctly classified by the method. In experiment 2, a computer neural network was taught to relate postures of a subject to OWAS postures. When the network had been trained with 53 images, the rest of the set of 138 images was correctly classified. The experiments described in this paper show promising results regarding the use of image analysis and computer vision for tracking and assessing working postures. However, further research is needed, including tests of different human models, neural networks, and template matching, to make the OWAS method more useful in identifying and evaluating potentially harmful working postures. © 2001 Elsevier Science Ltd. All rights reserved.

1. Introduction

Many occupational tasks are still associated with strenuous working postures and movements. Combined with a heavy physical workload, they result in a high frequency of work-related musculoskeletal disorders (WMSDs). Several physical risk factors for WMSDs can be identified in working life, such as postures, manual handling, high peak loads, static loads, vibration, repetitive work, contact stress, speed or acceleration of movements, and demands for precision. These risk factors interact, often in multiple ways; they are not clearly defined and they overlap (Kilbom, 1994; Kilbom, 1995). It is documented that there is a relationship between awkward postures and pain, symptoms and injuries in the musculoskeletal system (Grandjean and Hünting, 1977; Corlett and Manenica, 1980). An awkward posture means a considerable deviation from the neutral position of one joint, or a combination of joints. These postures typically include reaching behind, twisting, working overhead, wrist bending, kneeling, stooping, forward and backward bending, and squatting. Such postures are related to injuries that are incurred during tasks that are static in nature and relatively long-lasting, and during tasks that demand exertion of force (Westgaard and Aarås, 1984; Haslegrave, 1994).

*Corresponding author. E-mail addresses: Ste[email protected] (S. Pinzke), (L. Kopp).

To assess possible health risk factors connected with body postures it is necessary to determine the actual postures adopted while performing a certain task (Vedder, 1998). Various methods have been developed to track postures and measure postural loads in order to assess potential risk factors for WMSDs in industry. The methods can be divided into (1) direct measurements, (2) observational methods, and (3) self-report techniques. They require equipment ranging from the highly technical and computer-based down to simple paper-and-pencil techniques (Nordin, 1982; Atha, 1984; Colombini et al., 1985; Kilbom, 1994; Hagberg et al., 1995; Pinzke, 1996; Li and Buckle, 1999). Cost and accuracy are factors that decrease in the order 1–3, while capacity, versatility and generality are factors that increase in that order (Winkel and Mathiassen, 1994). Direct measurements include instrumentation and techniques such as electromyography, goniometers,

0003-6870/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0003-6870(01)00023-0



biomechanical analyses, and optical methods that give detailed information about muscle activity, angle deviation, forces, and body movements, respectively. Direct methods also include measures of metabolic and respiratory loading based on, e.g., heart rate and oxygen uptake. The direct methods are quantitative and highly accurate, but also expensive and time-consuming to use. These limitations mean that only a limited number of body regions and a small number of subjects can be assessed (Kilbom, 1994). The observational methods are contactless (as opposed to direct methods, where the devices are attached to the body), indirect measurements that depend on the analyst's judgement to identify the various body postures (Genaidy et al., 1993). The third group of measurements is the use of self-report techniques for collecting data from the workers' own experiences of their work environment. These methods include questionnaires, interviews, diaries, checklists, ranking and rating scales. Self-reports seem to be the most appropriate and practical instrument in large-scale studies (Winkel and Mathiassen, 1994). However, their usefulness in studying cumulative exposure over time, and their reliability and validity for assessment of postural loads, are not very high compared with observational and direct methods (Burdorf and Laan, 1991; Kilbom, 1994). Thus, observational methods may offer a compromise between the high cost of direct methods and the low validity and subjectivity of self-report techniques (Kilbom, 1994). There is a need to establish practical methods for early identification and quantification of postures associated with WMSDs in order to form a basis for appropriate preventive and intervention measures. Easy-to-use field methods are required first of all for screening studies generating information that will stimulate in-depth research on particular problems (Pinzke, 1997). The aim of this paper is to contribute to the development of systems for tracking and assessing working

postures which combine some of the advantages of the direct and observational methods. Two experiments will be presented for illustrating the use of image analysis and computer vision in building such systems. As a reader service, the main features of different methods and concepts referred to in the paper are described in some detail.

2. Methods

2.1. OWAS: an observational method

One of the most widely used methods of observation in working posture studies is the Ovako working posture analysis system (OWAS) (Karhu et al., 1977). It is used to identify and evaluate harmful working postures. The OWAS method is based on sampling from typical working postures for the whole body. Table 1 shows the OWAS codes which are used to make up the 84 different posture combinations: four back postures, three arm postures and seven leg postures (three additional leg postures are included in the extended OWAS, but are not used here). The use of strength or the weight of loads handled is classified on a three-class scale. Taking these three load levels into account, the basic OWAS has 252 (4 × 3 × 7 × 3) posture and load combinations. The method produces the frequency and relative proportion of time spent in the individual positions, and an assessment on a four-grade scale of the harmfulness of the postures as well as the urgency to correct them. From the beginning, the OWAS method was manual and the registrations and calculations were performed on special pre-printed forms. Nowadays, several semi-computerised systems based on the OWAS method have been developed to make the work less demanding for the operator by automating the input and output procedures (Kant et al., 1992; Long, 1992; Leskinen and Tönnes, 1994; Pinzke, 1994; Vedder,

Table 1
OWAS posture code definitions

Back:
1 = straight
2 = bent
3 = twisted
4 = bent and twisted

Arms:
1 = both arms below shoulder level
2 = one arm at or above shoulder level
3 = both arms at or above shoulder level

Legs:
1 = sitting
2 = standing on both legs straight
3 = standing on one straight leg
4 = standing on both legs bent
5 = standing on one bent leg
6 = kneeling on one or both legs
7 = walking

Additional leg postures (extended OWAS):
8 = squatting
9 = the legs cannot be leaned on
0 = crawling
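The coding scheme of Table 1 can be sketched in a few lines. The dictionary names below are illustrative, not part of any OWAS software; the snippet just confirms the 84 posture combinations and 252 posture-and-load combinations stated in the text.

```python
# Hypothetical lookup tables transcribing Table 1 (names are illustrative).
BACK = {1: "straight", 2: "bent", 3: "twisted", 4: "bent and twisted"}
ARMS = {1: "both arms below shoulder level",
        2: "one arm at or above shoulder level",
        3: "both arms at or above shoulder level"}
LEGS = {1: "sitting", 2: "standing on both legs straight",
        3: "standing on one straight leg", 4: "standing on both legs bent",
        5: "standing on one bent leg", 6: "kneeling on one or both legs",
        7: "walking"}
LOAD_CLASSES = 3  # three-class scale for strength / weight of loads

def describe(code: str) -> str:
    """Expand a three-digit OWAS code, e.g. '112', into plain text."""
    b, a, l = (int(c) for c in code)
    return f"Back {BACK[b]}; arms {ARMS[a]}; legs {LEGS[l]}"

# 84 posture combinations; 252 when the three load classes are included.
combinations = len(BACK) * len(ARMS) * len(LEGS)
print(combinations, combinations * LOAD_CLASSES)  # 84 252
print(describe("112"))
```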


1998). However, these systems still require an operator to classify the performed working postures and record them on a form or at the computer keyboard. This is a time-consuming and tedious task. This paper demonstrates possibilities for automatically recording and assessing working postures as specified in the OWAS method using video and computer techniques. In addition, a fully computerised system can be achieved if the presented routines for tracking and classifying working postures are combined with existing semi-automatic OWAS methods. Fig. 1 shows the different components of such a system. This paper places special emphasis on the use of image analysis and computer vision as a part of this future system. The working postures are recorded with a video camera (a). The film is automatically analysed by connecting a VCR (b) to a computer (c) that grabs the video images with a ‘‘frame grabber’’ card at chosen time intervals (d). The computer processes the images by image analysis (e) and background elimination (f). The remaining object is classified and compared with the OWAS postures (g). Each posture in the computer has a code (h). The distribution of the different postures is


calculated and presented in a form (i) that also contains recommendations for correction measures. The equipment for steps (a–d) is available on the market at a relatively low cost. The existing semi-computerised OWAS systems also include steps (h and i). The computerisation and automation of the system (e–g) will be discussed in this paper, and some preliminary tests are also reported.

2.2. Direct methods

Several direct methods have been developed for analysing human motion and postures. Even though the image analysis and computer vision methods of this paper are related only to the optoelectronic techniques taken up below, the other direct methods are all briefly mentioned. The category of direct methods includes devices attached to the body for the measurements: electrogoniometers (Penny and Giles Biometrics, Blackwood, Gwent, UK); electromagnetic methods such as Flock of Birds (Ascension Technology, Burlington, VT) and Fastrak (Polhemus, Colchester, VT); ultrasound emitters (Fleischer and Lange, 1983; Hsiao and Keyserling,

Fig. 1. The components of a fully computerised OWAS system.
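The tallying step (i) of Fig. 1 reduces to counting classified posture codes and reporting their relative proportions. A minimal sketch, using invented sample codes rather than results from the paper:

```python
from collections import Counter

# Step (i) in miniature: turn a sequence of classified OWAS codes into
# frequencies and relative proportions. Sample data is invented.
codes = ["112", "112", "122", "212", "112", "122"]

counts = Counter(codes)
total = len(codes)
proportions = {code: n / total for code, n in counts.items()}

print(counts["112"], round(proportions["112"], 2))  # 3 0.5
```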



1990); accelerometer-based devices (Hansson et al., 1992); and the optoelectronic techniques, e.g. the SELSPOT system (Selcom Selective Electronic, Valdese, USA), the MacReflex system (Qualisys, Glastonbury, USA), and the Peak Motus measurement system (Peak Performance Technologies). The optoelectronic techniques use active or passive markers attached at joints and other points of interest for tracking and quantifying body movements. Several authors (Pinzke, 1996; Li and Buckle, 1999) have reviewed the usefulness and the limitations of the direct methods. Some important advantages and drawbacks are mentioned here. Electro-goniometers are light, flexible and practical for use in field studies. The recordings are also sufficiently accurate and reliable for epidemiological studies. However, mounting, aligning and calibrating the instruments are tedious tasks that require extra care to minimise errors. The recordings by some of the direct methods can be influenced by the environmental conditions at work sites: e.g., the magnetic field produced by surrounding equipment may affect the data received by the electromagnetic systems; the sonic systems are sensitive to noise, temperature, humidity, etc.; and the accelerometer-based systems are sensitive to damping, temperature, etc. Limitations of the optoelectronic systems are the problems of obscured points of interest, crossover conflicts between the markers, the time for calibration, and tedious analysis. These last-named systems are also relatively expensive. Furthermore, the electromagnetic, sonic and optical systems can only be applied where the study subject is in a restricted area with free sight, and are therefore mostly limited to use in laboratory settings.

2.3. Image analysis and computer vision

An alternative to the marker-driven methods is to analyse a videotaped worker, without any markers attached to the joints, using image analysis.
The explosive development of personal computers in recent years with regard to memory, computation speed and graphics has made it possible to perform advanced image analysis even on an ordinary personal computer. There are two main difficulties in using computer vision to extract and interpret working postures from video sequences: the segmentation problem, i.e. how to separate human postures from the background, and the problem of interpreting and classifying the extracted objects. Both will be dealt with later in this paper.

2.4. Marker and marker-less methods

Motion, detection of isolated points, and detection of lines and edges in images are powerful cues to improve the performance of segmentation algorithms (Gonzalez

and Woods, 1993). Moving edge detection, i.e. both motion and edge information tracked from an image sequence, has been used to generate the outline of a moving object (Leung and Yang, 1987a; Leung and Yang, 1987b; Leung and Yang, 1995). Motion of objects in a sequence has been determined by following single points (Johansson, 1964). A number of papers have been published based on Johansson's work with Moving Light Display images and on techniques that fix markers to human body joints (Rashid, 1980; Webb and Aggarwal, 1982; Goddard, 1989) to help the systems find the body segments in an image sequence. However, these methods have problems with missed markers because of occlusion by other segments or skin deformations. Therefore, marker-less methods have recently been developed to match models of the human body segments onto edges or regions in image sequences (Kunii and Sun, 1990; Fujii and Moriwaki, 1991; Mochimaru and Yamazaki, 1994; Persson, 1996) or to the skeleton of the observed person's silhouette (Guo et al., 1994). The idea of experiment 1 follows Guo's approach.
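The motion cue underlying the moving-edge methods cited above can be illustrated with simple frame differencing: pixels whose intensity changes between consecutive frames mark the moving object. This is a pure-Python toy stand-in on small integer grids, not the cited algorithms, which operate on full grey-scale video:

```python
# Motion as a segmentation cue: threshold the absolute difference
# between two consecutive frames to get a binary mask of changed pixels.
def motion_mask(prev, curr, thresh=10):
    """Binary mask of pixels that changed by more than `thresh`."""
    return [[1 if abs(c - p) > thresh else 0
             for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

frame_a = [[0, 0, 0, 0],
           [0, 200, 0, 0],
           [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0],
           [0, 0, 200, 0],   # the bright "object" moved one pixel right
           [0, 0, 0, 0]]

mask = motion_mask(frame_a, frame_b)
print(mask[1])  # [0, 1, 1, 0]
```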

3. Experiments

Two experiments were performed to examine whether automatic registration of OWAS postures from video film is possible using different marker-less approaches in image analysis and computer vision. A parametric method is used in experiment 1. Image analysis routines are used both for segmentation, i.e. separating the subject from the background, and for skeletonising, i.e. thinning the shape of the studied subject to its medial axis. Length and angle calculations on different segments of the skeleton figure are then executed to determine the corresponding OWAS posture. In experiment 2, the user teaches a computer neural network to recognise the different postures. The network uses edge detection and the spatial arrangement of features for the segmentation process, and feature matching to recognise and classify the object into OWAS postures.

3.1. Materials and procedure of experiment 1

In this experiment, only OWAS postures for the arms (codes 1 and 2, Table 1) were tested, i.e. whether the upper arm of the person studied was above or below shoulder height. The OWAS method is not very precise in the definition of the arm positions, especially when the back is bent. In this experiment the arm is defined to be above or below shoulder level if the angle between the upper arm and the vertical line through the shoulder joint is greater or less than 90°, respectively, regardless of


the bend of the back (Fig. 2). This definition can easily be changed so that the arm is instead classified relative to the trunk line through the shoulder joint. A person was videotaped in the sagittal plane assuming different postures in front of a light background. The videotape was digitised into a Power Macintosh computer with one-second time sampling. Twelve consecutive frames from the whole scene were

Fig. 2. Angle definition (θ) between the vertical line through the shoulder joint and the upper arm.
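The angle test of Fig. 2 can be sketched in a few lines. The coordinates below are invented for illustration (image y grows downward), and this is a minimal stand-in for the paper's routine, which derives the joint positions from the skeletonised figure:

```python
import math

# Angle between the downward vertical through the shoulder joint and the
# upper arm; above shoulder level means an angle greater than 90 degrees.
def arm_angle(shoulder, elbow):
    dx = elbow[0] - shoulder[0]
    dy = elbow[1] - shoulder[1]          # positive = elbow below shoulder
    return math.degrees(math.atan2(abs(dx), dy))

def above_shoulder(shoulder, elbow):
    return arm_angle(shoulder, elbow) > 90

print(round(arm_angle((50, 40), (50, 70))))   # 0, arm hanging straight down
print(round(arm_angle((50, 40), (80, 40))))   # 90, arm horizontal
print(above_shoulder((50, 40), (70, 20)))     # True, elbow above shoulder
```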


selected for the analysis (Fig. 3). Each frame consists of 136 × 189 grey scale (0–255) pixels. For the analysis, the public domain NIH Image program (developed at the US National Institutes of Health) was used. The segmentation procedure is shown in Fig. 4 (frames 1–10). Frame 1 shows one of the time-sampled images (frame number 11 in Fig. 3). This original grey scale image was first median filtered to reduce noise (frame 2). Smooth continuous backgrounds can be subtracted with a ‘‘2D rolling ball’’ algorithm in the NIH Image program (frame 3). After automatic thresholding, the image becomes binary. The object with pixel value 255 was extracted and all the holes were filled (frame 4). As can be seen in frame 4, the shadow below the stomach of the person studied was wrongly classified in the thresholding step as pixels belonging to the person. For that reason a ‘‘Mexican hat’’ filter (Marr and Hildreth, 1980), which does both smoothing and edge detection in one operation, was applied to frame 1. The result is shown in frame 5. Frame 4 is then multiplied by frame 5, holes are filled, and maximum and minimum filters are used to obtain a completely filled object (frames 6–9). The shaded object is skeletonised (frame 10), which is the input frame to the classifying process.
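On a binary skeleton like frame 10, finding endpoints and intersection points reduces to counting 8-connected neighbours, which is what the classifying routine described below relies on. A minimal sketch on a toy Y-shaped skeleton (the grid is invented, not taken from the paper's data):

```python
# A skeleton pixel with exactly one 8-connected neighbour is an endpoint;
# one with three or more neighbours is an intersection (branch) point.
SKEL = [
    "1...1",
    ".1.1.",
    "..1..",
    "..1..",
    "..1..",
]
grid = [[1 if ch == "1" else 0 for ch in row] for row in SKEL]
H, W = len(grid), len(grid[0])

def neighbours(y, x):
    """Count 8-connected skeleton neighbours of pixel (y, x)."""
    return sum(grid[j][i]
               for j in range(max(0, y - 1), min(H, y + 2))
               for i in range(max(0, x - 1), min(W, x + 2))
               if (j, i) != (y, x))

endpoints = [(y, x) for y in range(H) for x in range(W)
             if grid[y][x] and neighbours(y, x) == 1]
junctions = [(y, x) for y in range(H) for x in range(W)
             if grid[y][x] and neighbours(y, x) >= 3]

print(endpoints)  # the three tips of the Y
print(junctions)  # the single branch point
```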

Fig. 3. Selected frames from the original video image sequence.



Fig. 4. The segmentation and classification process.

The classifying routine first finds the endpoints (defined by having one neighbour pixel) and the intersection points (≥3 neighbour pixels). All small line segments (<20 pixels) are considered as noise and are therefore removed (frame 11). The number of endpoints, and the positions and lengths of segments from the endpoints to the intersection points, decide which segments are the legs, arms, trunk and head. The angle between the vertical line through the shoulder joint and the arm that has the highest position is calculated. If the angle is less than 90° the working position is classified as below shoulder level (frame 12).

3.2. Results of and comments on experiment 1

All 12 images in the sequence studied were correctly classified by the parametric method (Fig. 5). The program uses some segment length criteria for classifying the skeleton figures. For example, the arm segment must be longer than the head segment if angles are to be calculated. Otherwise the assumption is that the upper arm is below shoulder height (frames 2 and 9). The calculated arm angles for frames 1 and 7 are greater

than the actual angles because the intersection points between the arm and trunk do not correspond to the shoulder joints. This can be adjusted if the skeletonised object is matched to a stick figure model with known line segment relations (Guo et al., 1994). Such a model will be included in future work. The system needed about nine seconds of computer time per frame for the segmentation and skeletonisation routine, and about 20 seconds per frame for the classification process. The system used is programmed in a macro language, which is computationally expensive. It is possible to translate the macro code into a compiled language, e.g. Pascal, which would considerably reduce the computer time for the NIH Image program.

3.3. Materials and procedure of experiment 2

The aim of the second experiment was to examine whether it is possible to automatically classify postures for the whole body into OWAS classes by using feature matching. For this experiment, a computer program based on a neural network architecture for vision (Kopp, 1998) was used. The developed system, called



Fig. 5. The results of the classification process.

Expectation based Elastic Template Matching Network (XETM network), is taught by the user, who shows the network where interesting features are located on the studied subject. The user indicates a region of interest (ROI) with a window tool on the image of the subject. It is possible to indicate the ROI on the subject image in four spatial scales, where the first scale has the most detailed representation of the subject and the fourth the coarsest. The resolutions of the four scales are 256 × 256, 128 × 128, 64 × 64, and 32 × 32 pixels, respectively. The ROI is matched to the internal feature memory of the network system. If the feature is novel to the network, i.e. if the coefficient of correlation for the feature matching is below a specified threshold, it is learned; otherwise, it is adapted to one of the learned categories. Every feature selected by the user is located relative to the other selected features belonging to the same subject. The co-ordinates of every feature that belongs to the subject are stored in a spatial memory. The network builds models of the studied subject depending on the selected features and their spatial arrangement. The models are also stored in a separate memory.
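The novelty test described above, learn a feature if its best correlation falls below the threshold, otherwise assign it to the best-matching stored feature, can be sketched with a plain Pearson correlation on flattened patches. This is a toy stand-in, not the XETM internals, and the sample patches are invented:

```python
# Correlation-threshold feature matching in miniature.
def correlate(a, b):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)   # assumes non-constant patches

def match_or_learn(roi, memory, threshold=0.85):
    """Learn `roi` as a new feature, or return the best stored match."""
    scores = [correlate(roi, feat) for feat in memory]
    if not scores or max(scores) < threshold:
        memory.append(roi)               # novel feature: learn it
        return len(memory) - 1, None
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

memory = []
idx, score = match_or_learn([10, 50, 90, 50], memory)   # novel: learned
print(idx, score)                                       # 0 None
idx, score = match_or_learn([12, 52, 88, 49], memory)   # close: matched
print(idx, round(score, 2))
```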

The same videotape as in experiment 1 was used for experiment 2. A 69 s image sequence (138 grey scale images, each 250 × 250 pixels) was digitised into a personal computer (Pentium II, 300 MHz) with 0.5-second time sampling. Every third image (46 images) from the whole sequence was selected to teach the XETM network and to build models for classifying the whole image sequence. Three tests were performed to examine how the network reacted to the models when different threshold values were used. The threshold value should be chosen so that the network system can correctly classify as many images as possible. If too high a value is chosen, the system may not classify all the images. On the other hand, if too low a threshold value is chosen, the network may classify images wrongly. An optimal value is difficult to specify because it differs for different image sequences. However, a threshold value of 0.85 may be a good value to start with, based on the work at hand and earlier studies (Kopp, 1998). This was the value chosen for the first test. To investigate how the system reacted to both a minor and a major change of the threshold value, a value of 0.84



was chosen for the second test, and a value of 0.65 for the third test. In a study of similarity and dissimilarity between squared objects it was found that objects became more similar to other objects in the coarse XETM scales (Kopp, 1998). As the aim of experiment 2 was to classify similar postures into the same OWAS class, it was decided to indicate the ROI features of the subject image in the third scale.

3.4. Results of and comments on experiment 2

The 46 images were classified into seven different OWAS postures (Table 2). The network needed 20 taught models, when using a correlation threshold of 0.85, to classify the first image sequence. This meant that the system could correctly identify 57% of the OWAS postures in the first sequence. The 20 models were then applied to the total 138-image sequence. In the first test, with a threshold value of 0.85, the network was not able to identify 18 of the 138 images, and 13 were incorrectly identified. First, the network was taught 18 new models with the same threshold of 0.85. Then the 13 wrongly classified images were corrected by increasing the correlation threshold to 0.95 when re-teaching the network for just these 13 postures. Going back to a threshold value of 0.85, the procedure now

incorrectly classified two other postures. Finally, these two postures were retaught, once again with the correlation threshold increased to 0.95. Fifty-three models (20 models from the first sequence + 13 errors + 18 unclassified images + 2 new errors) were consequently needed to classify the whole sequence into ten OWAS classes with the same correlation threshold of 0.85. Thus, the network had the ability to correctly identify 62% of the images in the total sequence (Table 2). The sequence consisted mostly, i.e. 67 of 138 images, of straight postures for the back and legs with both arms below shoulder level (OWAS code 112). These postures were also the easiest for the network to identify (93%). Postures with a straight back and legs and one arm above shoulder level (OWAS code 122), and postures with a bent back and straight legs and both arms below shoulder level (OWAS code 212), also occurred often (26 and 27 images, respectively). These two postures had a more complex structure than the first one and were consequently more difficult for the network to identify (35% and 30%, respectively). In the second test, when a threshold value of 0.84 was used, 15 images became unclassified and 15 images were wrongly classified. In the third test, with a threshold value of 0.65, one image became unclassified and 21 were incorrectly identified. This means that, for the image sequence used, the number of unclassified images

Table 2
Results of the first test of experiment 2

                 First image sequence b               Total image sequence c
OWAS      Images d  Models e  Identified f    Images g  Errors h  Unident. i  New errors j  Models k  Identified l
posture a                     (%)                                                                      (%)
112       26        2         92              67        2         1           0             5         93
114       1         1         0               5         2         1           0             4         20
118       1         1         0               3         0         0           0             1         67
122       7         7         0               26        0         10          0             17        35
124       1         1         0               2        0         1           0             2         0
128       1         1         0               3         0         0           0             1         67
132       0         0         –               2         0         2           0             2         0
212       9         7         22              27        9         2           1             19        30
222       0         0         –               1         0         1           0             1         0
214       0         0         –               2         0         0           1             1         50
Total     46        20        57              138       13        18          2             53        62

a The OWAS posture codes are explained in Table 1; e.g. the code 112 means Back (1) straight, Arms (1) both arms below shoulder level and Legs (2) standing on both legs straight.
b 46 images, coefficient of correlation 0.85.
c 138 images, coefficient of correlation 0.85.
d Number of images in the first sequence.
e Number of models taught by the operator for the first sequence.
f Percentage of postures correctly identified by the computer for the first sequence.
g Number of images in the total sequence.
h Number of images incorrectly identified by the computer for the total sequence.
i Number of images unidentified by the computer for the total sequence.
j Number of new incorrectly identified images after the incorrectly identified and unidentified images of the first run were corrected by the operator.
k Number of models taught by the operator for the total sequence.
l Percentage of postures correctly identified by the computer for the total sequence.


decreases but the number of incorrectly identified images increases if the coefficient of correlation for the feature matching is chosen below the original threshold value. Fifty-three of the images in the total sequence had to be taught the right OWAS code. This number can probably be reduced if ROI features are taken from different XETM scales of the subject at the same time. The network is then taught to see both the similarity and the dissimilarity between the images in the sequence. The network system used in this experiment has the ability to relate a posture of a subject to an OWAS posture. However, this remark is based only on the image sequence tested in the present study. More tests of different task sequences, and test–retest evidence on the same set of images, will be required to show the reliability of the system. It is worth mentioning that it was a human expert in the OWAS method who taught the network the OWAS postures from the beginning, and who decided whether the network had identified the images correctly. Thus, there may be some human error to consider when evaluating the reliability of the system.
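The identification rates reported above follow directly from the model counts: every image the operator had to teach counts as not identified by the network itself. A quick check of the arithmetic:

```python
# How the 53 models and the 62% figure fit together.
first_models = 20        # taught on the 46-image first sequence
errors = 13              # wrongly classified at 0.85, retaught at 0.95
not_classified = 18      # unidentified at 0.85, taught as new models
new_errors = 2           # introduced by re-teaching, fixed the same way

total_models = first_models + errors + not_classified + new_errors
total_images = 138
identified = (total_images - total_models) / total_images

print(total_models, round(100 * identified))  # 53 62
```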

4. Discussion

The experiments described in this paper have shown promising results regarding the use of image analysis for tracking and assessing working postures. All 12 images in experiment 1 were correctly classified by the parametric method, and the computer neural network used in experiment 2 had the ability to correctly identify 62% of the studied images. There are, however, still a number of problems left for further research. The systems presented are based on the analysis of video recordings from a workplace. It is believed that video-based methods are practical and easy-to-use techniques for an observer. The video user need not be as well-trained an analyst as an observer who performs the observations directly on-site. An operator with a video camera needs only to concentrate on optimal camera positions relative to the recorded worker, and can leave the analysis to the computer afterwards. There are both other advantages and drawbacks to consider when using video for recording postures compared to direct observations. Video recording reduces interference with the workers to a minimum and holds the possibility of a more detailed analysis while reducing observation errors. A disadvantage is the limited area of observation. A person working periodically outside the area observed by the camera cannot be analysed (Vedder, 1998). Kilbom (1994) stated that postures are likely to be recorded more accurately by direct observation, since human vision is three-dimensional whereas a video recording is reduced to a two-dimensional image. Moreover, it is easier for an observer to secure optimal


viewing angles without a video camera than with one. On the other hand, an observer can only register and assess a few variables and the work pace must be slow, while a video recording of many posture variables and tasks done at a high work pace can be analysed repeatedly and in slow motion. The systems used in the two experiments have been limited to the examination of postures parallel to the imaging plane. Thus, twisted postures, postures performed in the other co-ordinate planes, and postures that contain occlusions between the body parts have not been possible to examine. To overcome these problems, two or more cameras would be needed for registering the postures in three dimensions (Kakadiaris and Metaxas, 1995), but this would make the systems more impractical to use in the workplace. A better solution is to match the projection of a three-dimensional body model, e.g. to edge data in images (Gavrila and Davis, 1996) or to optical flow data (Pentland and Horowitz, 1991). Consequently, the analysis will be more complicated. Another simplification made in the experiments is that the postures are performed in front of a light homogeneous background. Only the subject is moving in the image sequence. In real-world scenes the conditions are usually different. The background is not always fixed, the light conditions change in the sequence, and the subject can sometimes be in a hidden position. Furthermore, the postures in the experiments are recorded with a fixed stationary camera. If the systems are to be used for analysing postures in more than static workstations, then the video recorders need to be mobile. The OWAS method was developed for, and is most suitable for, targeting and identifying hazardous working postures in dynamic occupations where the workers move around their workstation. The systems presented must then be further developed to handle both mobile workers and mobile cameras.
More cues, such as motion, velocity, colours, and the history of the image sequence, are needed to help solve the problems just described. The two experiments show approaches that both have their merits and limitations. The parametric method used in experiment 1 is a static or fixed system that calculates angles to decide the postures. The accuracy of the method is determined once and for all by the rules that decide which of the line segments of the skeleton figure are the head, arms, trunk, and legs, and by the internal algorithms for calculating the angles. The appearance of the skeletonised figure depends on the segmentation process, i.e. how well the program can distinguish the subject from its background as well as separate the body segments from each other. The accuracy can be increased if the skeletonised figure is matched to a stick figure model with known line segment relations and more precise anthropometric measures. One advantage of the parametric method is that the determined angles can be used as input in


S. Pinzke, L. Kopp / Applied Ergonomics 32 (2001) 461–471

biomechanical programs for calculating forces and load moments, such as the two-dimensional or three-dimensional Static Strength Prediction Program (University of Michigan, 1993). The elastic template system used in experiment 2 has the ability to handle small changes in rotation (751) and deformation without losing correlation performance (Kopp, 1998). This property contributes to the robustness and ability to interpolate between other stored views of the subject. While the calculations for the posture determinations in the parametric approach of experiment 1 are fixed, the posture recognition process in the network method of experiment 2 depends on the different models taught. The more models the network system is taught, e.g. models of different subject shapes, from different views, and under different light conditions, the better the working postures in general are recognised. In future studies, there will be an implementation of different kinds of filters in the vision model, and colour and texture will be integrated as additional properties in the network for increasing the robustness in the segmentation and recognition process. The OWAS method is a time-lapse sampling method, where the observations are performed at regular intervals. The higher the sampling frequency, the more accurate the observation study will be. The OWAS method recommends 30 s time-lapse sampling for direct observations in the field, while from video film shorter observation intervals can be used, e.g. 15 s or 5 s in the cases when the study work task has a short time span (Mattila et al., 1993). A major advantage in using image analysis for recording working postures is the possibility of using much higher time sampling frequencies. In the two experiments, a sampling of 1 s and 0.5 s was used, respectively, even if the computer hardware for digitisation allows a frequency up to 1/30 s. 
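The sampling arithmetic can be sketched as follows (the function name and the 30 frames/s rate are assumptions for illustration); it converts a chosen observation interval into the frame indices of a digitised sequence:

```python
def sample_frame_indices(duration_s, interval_s, fps=30):
    """Return the frame indices to analyse when observations are taken
    every `interval_s` seconds from a video digitised at `fps` frames/s."""
    step = max(1, round(interval_s * fps))   # frames between observations
    total_frames = int(duration_s * fps)
    return list(range(0, total_frames, step))

# A 60 s task sampled every 0.5 s at 30 frames/s gives 120 observations.
indices = sample_frame_indices(60, 0.5)
```

At the hardware limit of one observation per frame (interval 1/30 s), every frame index is returned.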
Thus, with computer techniques it is possible to digitise a large number of images as a basis for accurate analysis. Access to an image sequence with a high frequency between successive frames also makes it possible to use motion as a cue in the segmentation process.

The user has less influence on the performance of the parametric method than on the neural network system. If the parametric program could be made robust enough in segmenting the subject and the body parts, which has been shown to be a main problem in computer vision, the user would only need to start the program and, after a while, collect the result of the analysis. For the elastic template system, the user has to spend time teaching the system the different postures and then, depending on the accuracy of the system, correcting wrongly classified postures. Both approaches depend on further improvements of the segmentation and recognition routines to become robust systems for tracking and classifying working postures. A further development could be to combine the merits of the two approaches, i.e. use the network for the segmentation process to identify the different body parts, and use the parametric method to measure angles for the classification process.

At the Division of Work Science in Alnarp, in cooperation with the Department of Cognitive Science in Lund, further research is being carried out in the field of computer vision. The research includes tests of different human models, neural networks, and template matching for automatically tracking working postures. Tests on real-world image sequences will also be conducted. This research aims to make the OWAS method more useful for identifying and evaluating harmful working postures.
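The angle-measuring half of such a combined pipeline can be sketched as below: a trunk inclination angle is computed from two image-plane joint positions and mapped to a simplified posture code. The joint coordinates, the 20° threshold, and the two-level coding are assumptions for illustration, not the published OWAS definitions:

```python
import math

def trunk_inclination(hip, shoulder):
    """Angle (degrees) between the hip-shoulder line and the vertical,
    in the image plane. Image coordinates: y grows downwards."""
    dx = shoulder[0] - hip[0]
    dy = hip[1] - shoulder[1]   # positive when the shoulder is above the hip
    return abs(math.degrees(math.atan2(dx, dy)))

def trunk_code(angle_deg, threshold=20.0):
    """Simplified two-level coding: 1 = straight, 2 = bent.
    Twisting cannot be judged from a single view and is not handled."""
    return 1 if angle_deg < threshold else 2

# Near-upright subject: shoulder almost directly above the hip.
code = trunk_code(trunk_inclination(hip=(100, 200), shoulder=(103, 120)))
```

Here the joint positions would be supplied by the network-based segmentation stage, and the resulting codes fed into the usual OWAS action-category analysis.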

References

Atha, J., 1984. Current techniques for measuring motion. Appl. Ergon. 15 (4), 245–257.
Burdorf, A., Laan, J., 1991. Comparison of methods for the assessment of postural load on the back. Scand. J. Work Environ. Health 17 (6), 425–429.
Colombini, D., Occhipinti, E., Molteni, G., Grieco, A., Pedotti, A., Boccardi, S., Frigo, C., Menoni, O., 1985. Posture analysis. Ergonomics 28 (1), 275–284.
Corlett, E.N., Manenica, I., 1980. The effects and measurement of working postures. Appl. Ergon. 11 (1), 7–16.
Fleischer, A.G., Lange, W., 1983. Analysis of hand movements during the performance of positioning tasks. Ergonomics 26 (6), 555–564.
Fujii, N., Moriwaki, T., 1991. A study on motion measurement of fingers based on image analysis. The Jpn. J. Ergon. 27, 151–157.
Gavrila, D.M., Davis, L.S., 1996. 3-D model-based tracking of humans in action: a multi-view approach. IEEE Computer Vision and Pattern Recognition, San Francisco, USA.
Genaidy, A.M., Simmons, R.J., Guo, L., Hidalgo, J.A., 1993. Can visual perception be used to estimate body part angles? Ergonomics 36 (4), 323–329.
Goddard, N.H., 1989. The interpretation of visual motion: recognizing moving light displays. Workshop on Visual Motion, Irvine, California, March 1989, pp. 212–220.
Gonzalez, R.C., Woods, R.E., 1993. Digital Image Processing. Addison-Wesley Publishing Company, Reading, Massachusetts, USA.
Grandjean, E., Hünting, W., 1977. Ergonomics of posture – review of various problems of standing and sitting posture. Appl. Ergon. 8 (3), 135–140.
Guo, Y., Xu, G., Tsuji, S., 1994. Tracking human body motion based on a stick figure model. J. Visual Commun. Image Representation 5 (1), 1–9.
Hagberg, M., Silverstein, B., Wells, R., Smith, M.J., Hendrick, H.W., Carayon, P., Pérusse, M., 1995. In: Kuorinka, I., Forcier, L. (Eds.), Work Related Musculoskeletal Disorders (WMSDs): A Reference Book for Prevention. Taylor & Francis, London.
Hansson, G.-Å., Björn, F., Carlsson, P., 1992. A new triaxial accelerometer and its application as an advanced inclinometer. Abstracts of the Ninth International Congress of ISEK, 28 June–2 July, Florence, Italy.
Haslegrave, C.M., 1994. What do we mean by a working posture? Ergonomics 37 (4), 781–799.
Hsiao, H., Keyserling, M., 1990. A three-dimensional ultrasonic system for posture measurement. Ergonomics 33 (9), 1089–1114.
Johansson, G., 1964. Perception of motion and changing form. Scand. J. Psychol. 5, 181–208.

Kakadiaris, I.A., Metaxas, D., 1995. 3-D human body model acquisition from multiple views. Proc. of the Fifth Intl. Conf. on Computer Vision, pp. 618–623.
Kant, I., de Jong, L.C.G.M., van Rijssen-Moll, M., Borm, P.J.A., 1992. A survey of static and dynamic work postures of operating room staff. Int. Arch. Occup. Environ. Health 63 (3), 423–428.
Karhu, O., Kansi, P., Kuorinka, I., 1977. Correcting working postures in industry: a practical method for analysis. Appl. Ergon. 8 (4), 199–201.
Kilbom, Å., 1994. Assessment of physical exposure in relation to work-related musculoskeletal disorders – what information can be obtained from systematic observations? Scand. J. Work Environ. Health 20 (special issue), 30–45.
Kilbom, Å., 1995. Prevention of musculoskeletal disorders through standards and guidelines; possibilities and limitations. Proceedings of the International Symposium From Research to Prevention, 20–23 March, 1995, Helsinki, Finland, pp. 178–185.
Kopp, L., 1998. A neural network architecture for vision: introducing the XETM-network. Lund University Cognitive Science, Lund, Sweden.
Kunii, T.L., Sun, L., 1990. Dynamic analysis-based human animation. C.G. Int. 90 (Sapporo, Japan), 3–15.
Leskinen, T., Tönnes, M., 1994. Utilization of a video-computer system for analyzing postural load – evaluation of observation. Proceedings of the 12th Triennial Congress of the International Ergonomics Association (IEA '94), Toronto, Canada, pp. 383–385.
Leung, M.K., Yang, Y-H., 1987a. Human body motion segmentation in a complex scene. Pattern Recognition 20, 55–64.
Leung, M.K., Yang, Y-H., 1987b. A region based approach for human body motion analysis. Pattern Recognition 20 (3), 321–339.
Leung, M.K., Yang, Y-H., 1995. First sight: a human body outline labeling system. IEEE Trans. Pattern Anal. Mach. Intel. 17 (4), 359–377.
Li, G., Buckle, P., 1999. Current techniques for assessing physical exposure to work-related musculoskeletal risks, with emphasis on posture-based methods. Ergonomics 42 (5), 674–695.
Long, A.F., 1992. A computerised system for OWAS field collection and analysis. Proceedings of the International Conference on Computer-Aided Ergonomics and Safety '92 – CAES 1992, Tampere, Finland, pp. 353–358.


Marr, D., Hildreth, E.C., 1980. Theory of edge detection. Proc. Royal Soc. London, Series B 207, 187–217.
Mattila, M., Karwowski, W., Vilkki, M., 1993. Analysis of working postures in hammering tasks on building construction sites using the computerized OWAS method. Appl. Ergon. 26 (6), 405–412.
Mochimaru, M., Yamazaki, N., 1994. The three-dimensional measurement of unconstrained motion using a model-matching method. Ergonomics 37 (3), 493–510.
Nordin, M., 1982. Methods for studying work load with special reference to the lumbar spine. Doctoral Thesis, Department of Orthopaedic Surgery, University of Göteborg, Göteborg, Sweden.
Pentland, A., Horowitz, B., 1991. Recovery of nonrigid motion and structure. IEEE Trans. Pattern Anal. Mach. Intel. 13 (7), 730–742.
Persson, T., 1996. Analysis of joint loads and balance – methods intended for routine use in clinical settings. Licentiate Thesis, Teknikum, Institute of Technology, Uppsala University, Uppsala.
Pinzke, S., 1994. A computerised system for analysing working postures in agriculture. Int. J. Ind. Ergon. 13, 307–315.
Pinzke, S., 1996. Musculoskeletal disorders and methods for studying working postures in agriculture. Licentiate Thesis, Report 107, Department of Agricultural Biosystems and Technology, Swedish University of Agricultural Sciences, Lund.
Pinzke, S., 1997. Observational methods for analyzing working postures in agriculture. J. Agric. Safety Health 3 (3), 169–194.
Rashid, R.F., 1980. LIGHTS: a system for interpretation of moving light displays. Doctoral Thesis, Computer Science Dept., University of Rochester, Rochester, NY, USA.
University of Michigan, 1993. 3-D Static Strength Prediction Program, Version 2.0, User's Manual. The University of Michigan, Center for Ergonomics, Michigan, USA.
Vedder, J., 1998. Identifying postural hazards with a video-based occurrence sampling method. Int. J. Ind. Ergon. 22, 373–380.
Webb, J.A., Aggarwal, J.K., 1982. Structure from motion of rigid and jointed objects. Artif. Intel. 19, 107–130.
Westgaard, R.H., Aarås, A., 1984. Postural muscle strain as a causal factor in the development of musculo-skeletal illnesses. Appl. Ergon. 15 (3), 162–174.
Winkel, J., Mathiassen, S.E., 1994. Assessment of physical work load in epidemiologic studies: concepts, issues and operational considerations. Ergonomics 37 (6), 979–988.
