Development of Emergent Processing Loops as a System of Systems Concept




James Gainey [email protected]
Erik Blasch [email protected]

US Air Force Research Lab, Sensors Directorate, 2241 Avionics Circle, Wright-Patterson AFB, OH 45433-7318

ABSTRACT

This paper describes an engineering approach toward implementing the current neuroscientific understanding of how the primate brain fuses, or integrates, "information" in the decision-making process. We describe a System of Systems (SoS) design for improving the overall performance, capabilities, operational robustness, and user confidence in Identification (ID) systems and show how it could be applied to biometrics security. We use the physio-associative temporal sensor integration algorithm (PATSIA), which is motivated by observed functions and interactions of the thalamus, hippocampus, and cortical structures in the brain. PATSIA utilizes signal theory mathematics to model how the human efficiently perceives and uses information from the environment. The hybrid architecture implements a possible SoS-level description of the Joint Directors of US Laboratories (JDL) Fusion Working Group's functional description involving five levels of fusion (i.e., preprocessing, kinematic, situation, threat, and process refinement) and their associated definitions. This SoS architecture proposes dynamic sensor and knowledge-source integration by implementing multiple Emergent Processing Loops (EPL) for Predicting, Extracting, Matching, and Searching both static and dynamic databases, like MSTAR's PEMS loops. Biologically, this effort demonstrates these objectives by modeling similar processes from the eyes, ears, and somatosensory channels, through the thalamus, and to the cortices as appropriate, while using the hippocampus for short-term memory search and storage as necessary. The particular approach demonstrated incorporates commercially available speaker verification (simulating 1D audio/signal inputs) and face recognition (simulating 2D video/image inputs) software and hardware to collect data and extract features for the PATSIA. The PATSIA maximizes the confidence levels for target identification or verification in dynamic situations using a belief filter. The proof of concept described here is easily adaptable and scalable to other military and nonmilitary sensor fusion applications.

1. OVERVIEW

One of the main attributes of the human process for recognition is the ability to use multi-dimensional, multi-modal sensory information to identify objects or events of interest. From the identification process, "information" is extracted from raw sensory data. Sensory data exists in various modes and types. For the real-world recognition task, various sensors make data available, but human integration is still required today to extract and use information from more than one type. Current military systems for automatic identification of targets usually rely on single-sensor results presented to the operator (e.g., EW, RESM for Air/Air; SAR, FLIR, HRR for Air/Ground). Additionally, 1-D high range resolution (HRR) and 2-D synthetic aperture radar (SAR) information is given to the image analyst, who is in turn responsible for assessment or decision making. For the identification of people, many security systems use a single source to identify a person. One such case is a video camera (2D) surveillance system to detect people crossing the field of view. While the surveillance system works well in a constrained environment, its capabilities are limited in real-world, dynamic environments. For both examples, we want to aid human operators with a physiologically-motivated system-of-systems (SoS) sensor integration design [1]. Our approach shows promise for improving overall performance and utility for security and combat identification (CID) applications.

In order to improve system robustness, we employ a sensor integration system in which auditory-like data is coupled with visual-like data to aid either sensor in the detection, recognition, and identification of targets. For this presentation, we focus on access security, where people are the targets of interest and data often includes both desirable and undesirable individuals. The system must be capable of handling known (familiar) and unknown (or unfamiliar) persons. The aspect of unknown-person identification is key to a robust system, and a belief filtering approach [2,3] is used to deal with unknowns, including extended operating conditions (EOC) beyond those specifically trained. Other unique attributes for improving robustness involve the implementation of emergent processing loops and feature fusion [1,2] in a SoS design that maps directly to the Joint Directors of US Laboratories (JDL) Fusion Working Group's functional Fusion Tree model [4]. The physiologically-motivated design is developed from recent neuroscience research and builds on proven engineering concepts. As automatic systems for target recognition and identification overcome processing barriers, users are accepting them. The flexibility of multi-dimension and multi-mode sensor integration processing leads us closer to the realization of multisensor-multitarget recognition and identification. The paper develops a biological approach to sensor fusion in the case of identifying people for access security and describes an extension to CID technology. Section 2 details some of the current development efforts of the DoD and introduces our implementation towards a security access application. Section 3 details a biological solution to the security problem, and Section 4 shows results. Finally, Section 5 presents some discussion of future research in automatic target ID and our conclusions.

2. ACCESS SECURITY PROBLEM

In order to enhance an automatic access security system, we look to biology for clues to perceiving the person as fast as possible. Humans are excellent at perceiving and distinguishing different people. Sometimes the recognition of people is so good that people who have not been seen for years are identified quickly once observed. What are the mechanisms by which people are identified? And what are the associations of these features that are incorporated? We understand that person ID employs recognition of a combination of features. Usually the person is even better identified when both visual and auditory information is available to the observer. By listening to the person's speech and combining that information with the visual information, uncertainties are resolved and the person is identified. So, in our security system, we use features from auditory and visual data to identify the person. The extension to our combat identification problem likens the auditory information to Doppler processing [5], while the visual task is analogous to perceiving features in a scene formed by SAR imaging sensors [6].

2.1. Automatic Target Recognition Development

One algorithm, currently under development for ATR, is the PEMS loop, in which targets are identified using a predict, extract, match, and search method, as shown in Figure 1. The PEMS loop is similar to a biological system that resides in the prefrontal cortex, in which data is associated with learned hypotheses of the collected information. One deviation we make from the PEMS loops is that belief hypotheses are determined a priori. In this case, the a priori information is the stored person's uniquely recognizable features in a database.

This is similar to the case in which the people with access clearances are known; however, an unknown person must be identified from additional information. Thus, the goal is to determine if the person being tracked and observed is a known or an unidentified person.

Figure 1. PEMS Loop.

The search mechanism looks for facial features in the image and the auditory characteristics from the person's speech. The predict mechanism utilizes the information stored to generate beliefs as to the person in question. ID features are extracted from the image sequences and voice-tracks of subjects. One characteristic of the prediction mechanism is that additional faces and auditory tracks are generated, so that a set of plausible information is available. Simultaneously, the raw data is processed to determine what auditory and visual features are extractable. The match mechanism then compares the extracted information to that of the predicted information. If a match is found, the person is identified. If a match is not found, then the comparison seeks to reduce the plausible set of people to a smaller set. When all features are processed and the set of plausible people is reduced, a higher confidence in the person ID is achieved.
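To make the predict-extract-match-search control flow concrete, the following is a minimal sketch of one way a PEMS cycle over a plausible set could be organized. It is an illustration, not the MSTAR or PATSIA implementation: the feature dictionaries, the pass-through `extract_features` stub, and the relative matching tolerance are all assumptions.

```python
# Minimal PEMS (Predict, Extract, Match, Search) sketch. The database
# entries, feature extractor, and tolerance are illustrative assumptions.

def search(database, plausible_ids):
    """Search: retrieve stored feature templates for the plausible set."""
    return {pid: database[pid] for pid in plausible_ids}

def predict(templates):
    """Predict: generate expected feature values (here, the stored
    templates themselves stand in for generated faces/voice tracks)."""
    return templates

def extract_features(raw_data):
    """Extract: reduce raw sensor data to a feature vector (stub)."""
    return raw_data  # assume the data already arrives as a feature dict

def match(extracted, predicted, tol=0.1):
    """Match: keep identities whose predicted features agree with the
    extracted ones within a relative tolerance."""
    survivors = []
    for pid, feats in predicted.items():
        if all(abs(extracted[k] - v) <= tol * max(abs(v), 1.0)
               for k, v in feats.items()):
            survivors.append(pid)
    return survivors

def pems_loop(raw_data, database):
    plausible = set(database)            # a priori: the enrolled people
    extracted = extract_features(raw_data)
    while len(plausible) > 1:
        predicted = predict(search(database, plausible))
        survivors = match(extracted, predicted)
        if not survivors or set(survivors) == plausible:
            return None                  # unknown person, no unique match
        plausible = set(survivors)       # reduced plausible set
    return plausible.pop() if plausible else None

db = {"alice": {"eye_dist": 0.42, "pitch": 210.0},
      "bob":   {"eye_dist": 0.48, "pitch": 120.0}}
print(pems_loop({"eye_dist": 0.41, "pitch": 205.0}, db))  # -> alice
```

Note how the plausible set shrinks on each pass, mirroring the reduction of the plausible set of people described above.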



2.2. Acquisition and Tracking

Acquisition and tracking are necessary parts of any ID problem. The ability to relate sensor(s) to the target(s) of interest is not always straightforward. While the sensors might be displaced a certain distance, finer resolution information is often gathered at the expense of a measurement system where the sensor must rotate or move. Also, if the target is moving, the sensor must follow the target to determine the best angle to collect useful data about the target. It becomes significant at this point to completely understand sensor parameters and capabilities. The sensor data must be transformed into relevant target and event information.

In radar sensors, depression angle and azimuth information is important to determine the relative pose of the target. Likewise, in a biological system, we move our head so as to acquire information with our eyes. When the head is moving, the system is tracking coarse features in the image. When the head correctly positions itself on the target, then the eyes can focus or saccade to resolve the target of interest.

The head often moves in response to an interrupt vector from auditory cues. The head rotates to the perceived location of the sound. Once there, the eyes are used to detect images of the person, which are converted to learned personal attributes, so as to identify them. From Figure 2, it is clear that the auditory information can cue the visual system in order to orient the visual sensor for better person recognition.

Figure 2. Person Identification.

So far we have established the relative orientation between sensor and target, and the PEMS loop is chosen to identify the target from each sensor. We can now implement higher-level information processing for a security system in which the known people are identified and those unknown are tracked until a determination is made that these people are not to be granted access. If the person is deemed to be unrecognizable, the algorithm reports a high confidence that security is at risk. The extension to our more general ID problem provides the ability to identify a hostile target creating a threatening situation.
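The cue-then-identify behavior described in this section can be expressed as a short control sequence: estimate a bearing from an auditory cue, slew the visual sensor, then hand the captured image to the identification routine. The microphone-pair geometry and the `Camera` interface below are hypothetical stand-ins, not part of the system described in the paper.

```python
import math

# Hypothetical auditory-cueing sketch: a sound bearing is estimated from
# a microphone-pair time delay, the camera slews to that bearing, and
# only then does fine visual identification run.

SPEED_OF_SOUND = 343.0   # m/s
MIC_SPACING = 0.2        # assumed spacing between the two microphones, m

def bearing_from_delay(delay_s):
    """Coarse auditory cue: bearing (rad) from inter-microphone delay."""
    x = max(-1.0, min(1.0, SPEED_OF_SOUND * delay_s / MIC_SPACING))
    return math.asin(x)

class Camera:
    """Stand-in for a pan-tilt camera (hypothetical interface)."""
    def __init__(self):
        self.bearing = 0.0
    def slew_to(self, bearing):
        self.bearing = bearing              # coarse: "turn the head"
    def capture(self):
        return {"bearing": self.bearing}    # fine: "eyes on target"

def orient_and_identify(delay_s, camera, identify):
    camera.slew_to(bearing_from_delay(delay_s))
    return identify(camera.capture())       # e.g., hand off to a PEMS loop

cam = Camera()
print(orient_and_identify(2.9e-4, cam,
                          lambda img: f"ID at {img['bearing']:.2f} rad"))
```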

3. BIOLOGICALLY MOTIVATED SOS ARCHITECTURE

The PATSIA algorithm shown in Figure 3 implements a biological data association algorithm (BDAA). It is important to point out the visual (2D) and auditory (1D) pathways. Once each sensor is processed, the data must be associated across time, space, and the sensor modalities. In the event that each sensor correctly identifies the person, the association of information to obtain an ID is simple. If, however, one sensor reports a different person, then the algorithm must resolve the conflicts between the sensors. Here is the need for the cortical loops: to resolve the set of recognized people from each sensor.

Fitts and Posner presented a way for humans to learn new tasks [7]. They presented three stages of development: cognitive, association, and automatic. In the case in which a human is presented with a new and complex problem, they first use declarative knowledge in acquiring new facts to understand the cognitive problem. In the association stage, evidence is accumulated to prune or eliminate extraneous facts. Additionally, in this stage of conflict resolution, facts are matched in order to develop relationships between the targets. Finally, there is a third stage in which the association rules are used to automatically perform the task. The automatic process occurs once a skill is obtained, where the targets and events of interest are familiar.

Like Fitts and Posner, we chose to employ these stages. Our modifications for the ID problem include the following. First, processing the incoming data is actually in the automatic stage, since raw information gathered by the sensors is converted to facts or features based on learned rules and phenomenology. The second difference is that the association of data is resolved into information components. Finally, a cognitive stage is used to identify unknown or unfamiliar target types. In this research, identifying an unknown person lies in the cognitive processing loop.

3.1. Emergent Processing Loops (EPL)

Emergent processing includes a recursive method for updating perceptions, confirming beliefs, and thus maintaining a consistent, stable world model. Two types of emergent processing loops (EPL), Thalamo-Cortical Loops (TCL) and Cortico-Cortical Loops (CCL), are used, and the details are presented in previous papers [8]. Our premise is that fusion occurs simultaneously at different levels of vigilance. We implement three levels within the BDAA: time-space event, feature association, and decision fusion, as shaded in Figure 3. Consider each level as an EPL, which utilizes combinations of TCLs and CCLs as necessary for implementing basic functions toward effective recognition and ID tasks. These EPLs connect inputs to outputs as well as across sensor modalities. The human brain is not well understood, even just for specific recognition tasks, but these EPLs within the overall architecture provide an initial engineering design.
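One way to read this three-level structure in software terms is as nested update loops, each level consuming the output of the level below. The skeleton below is only an organizational sketch under that reading; the class names and stubbed update methods are assumptions, not PATSIA code.

```python
# Organizational skeleton for the three BDAA levels (assumed names):
# time-space event EPLs per sensor, feature-association EPLs per
# sensor, and one decision-fusion EPL (the oEPL) across sensors.

class EventEPL:
    """Sensor-specific time-space event loop (one per channel)."""
    def update(self, raw):
        return {"events": raw}           # stub: register data in time/space

class FeatureEPL:
    """Sensor-specific feature-association loop."""
    def update(self, events):
        return {"features": events}      # stub: abstract belief features

class DecisionFusionEPL:
    """Cross-sensor decision-fusion loop (the oEPL)."""
    def update(self, per_sensor_features):
        return {"belief": per_sensor_features}  # stub: combine and resolve

class BDAA:
    def __init__(self, sensor_names):
        self.event = {s: EventEPL() for s in sensor_names}
        self.feature = {s: FeatureEPL() for s in sensor_names}
        self.fusion = DecisionFusionEPL()

    def step(self, raw_by_sensor):
        feats = {s: self.feature[s].update(self.event[s].update(r))
                 for s, r in raw_by_sensor.items()}
        return self.fusion.update(feats)

bdaa = BDAA(["audio", "video"])
print(bdaa.step({"audio": [0.1], "video": [[0, 1], [1, 0]]}))
```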

Figure 3. PATSIA SoS Architecture (user interface over the cognitive, association, and automatic levels).

The overall EPL is itself a loop which we train to accomplish the specific task; see Figure 4. Decision fusion occurs within this overall EPL, or oEPL. Decisions are made after combining all feature/ID information from the various sensors for a particular event. If all the sensors agree, a fully constructive pattern match occurs and the process completes. However, if the sensors do not agree, conflict resolution initiates. That is where the oEPL resolves the difference. A belief, with an associated confidence level, is generated and compared to the next instantiations until the desired confidence is attained to complete the ID process. Besides combined feature/ID association and conflict resolution, the oEPL involves resource management and prediction. The highest-confidence beliefs are compared against learned associations in memory to derive an error. Error is accumulated over time and distributed to the various sensor channels. The sensor managers manipulate, or reposition, individual sensors to obtain different data or refine database searches prior to the next time step. At this point in time, a prediction of all state information in the next time step is generated for comparison to the next actual input from each sensor. The process continues until the desired confidence is attained. No overall output occurs until the threshold confidence is reached.

Figure 4. Architecture Overview.

The time from the initial input to final output is what we call a perceptual time window. In real time, this oEPL can take anywhere from a few hundred milliseconds to days, depending on the familiarity or unfamiliarity of a given object or event. Another critical factor is the level of threat. High-threat situations tend to reduce the desired confidence in favor of making a decision within the limited amount of time available. Such a scenario is extremely well suited for illustrating the usefulness of this BDAA for intelligence, surveillance, and reconnaissance (ISR) as well as tactical military applications.
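The oEPL control flow described above (combine per-sensor evidence for an event, compare the leading belief to a threshold, and withhold output until the threshold closes the perceptual window) can be sketched as follows. The naive-Bayes product fusion and the 0.9 threshold are assumptions for illustration, not the paper's belief filter.

```python
# Sketch of the oEPL control flow: fuse per-sensor likelihoods for an
# event into a running belief, and emit no output until a confidence
# threshold is met. Fusion rule and threshold are illustrative.

def fuse_step(belief, sensor_likelihoods):
    """Combine one time step of per-sensor likelihoods into the running
    belief over candidate identities (naive-Bayes style product)."""
    for likes in sensor_likelihoods:
        for person in belief:
            belief[person] *= likes.get(person, 1e-3)
    total = sum(belief.values())
    return {p: v / total for p, v in belief.items()}

def oepl(measurement_stream, candidates, threshold=0.9):
    belief = {p: 1.0 / len(candidates) for p in candidates}
    step = 0
    for step, sensor_likelihoods in enumerate(measurement_stream, 1):
        belief = fuse_step(belief, sensor_likelihoods)
        best = max(belief, key=belief.get)
        if belief[best] >= threshold:        # perceptual window closes
            return best, belief[best], step
    return None, max(belief.values()), step  # no decision yet

stream = [[{"alice": 0.8, "bob": 0.4}, {"alice": 0.7, "bob": 0.6}],
          [{"alice": 0.9, "bob": 0.2}, {"alice": 0.8, "bob": 0.3}]]
print(oepl(stream, ["alice", "bob"]))  # -> ('alice', ~0.97, 2)
```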

The perceptual time window concept has actually been measured and documented by Libet [9]. Figure 5 illustrates the concept he calls "backwards referral in time." In Libet's experiment, performed on a human subject, a stimulus is applied to the skin at time 0. Processing occurs for approximately 500 msec to achieve neuronal adequacy, which makes the neuron fire. A perception cannot be complete before neuronal adequacy. However, the subject reported "experiencing" the stimulus within 100 msec after the actual stimulus. Libet's work demonstrates the necessity of prediction (especially within the perceptual time window) for maintaining a consistent, stable world model. Note that sensory inputs that occur after some other input, but still within the time window of a perception, could actually affect the outcome of the perception. We call this concept apparent non-causality. Over time, the brain processes sensor-specific inputs in parallel, but the information is constantly passed back and forth among the sensor modalities to keep each channel updated. Within a single perceptual time window, sensors are updating each other's perceptions of the environment, with an output only occurring at the end of the window. For instance, an oboe, playing a bar of music in an orchestra, follows the clarinet in time; yet the two instruments are perceived as if in harmony. The perceived music is better than the simple sum of its parts. Sensor information fusion occurs due to our concept of processing within the EPLs.

Figure 5. Retroactive Referral of Subjective Sensory Experience.

With an understanding of the oEPL, we can now discuss the sensor-specific event-level fusion and feature association. Information initially enters the SoS architecture, which is already trained with some level of SA and with certain beliefs, through the thalamus to accomplish a pre-determined task. Each sensor modality and area of the cortex functions with active participation of the thalamus, which is reciprocally and topographically connected. The thalamus, being in a central location with access to the multiple sources and modes of information, instigates information fusion. Association links are formed between the thalamus and the sensor modalities by building a common reference in time and space across all the sensor channels. Attributes, or features, are extracted from the data, providing information that is analyzed and re-analyzed as if it comes from some new, sophisticated sensor. Sensor-specific event-level EPLs occur simultaneously in all the sensor channels, and each channel feeds features to a sensor-specific feature-association EPL. The goal of the learned mechanism for feature association is to further abstract belief features consistent with previous perceptions of the world model. At the feature-fusion level, the thalamus is indirectly connected and the CCLs are the main action sources. That is, most of the necessary computations take place between pairs of cortical areas.

The CCLs are similar to the PEMS algorithms of the Moving and Stationary Target Acquisition and Recognition (MSTAR) program (Figure 1). Search scans the scene, Extract pulls out the salient information in the image, Predict is the ability to hypothesize what features are available, and Match is the association of the predicted feature to that of the measured feature. The MSTAR PEMS loop is similar to the mechanisms by which the brain integrates video and audio information. These CCLs are replicated for each modality, with modifications only as necessary to deal with the dimensionality and unique characteristics of each sensor within the System of Systems.

3.2. Cortical Loops for Feature Extraction

Feature extraction is a major part of object tracking, identification, and classification systems [3]. For tracking, image content and registration are important for time and location referencing. Additionally, ATR algorithms are subject to uncertainty measurements and capacity constraints. For instance, if the information is passed through a communication channel, then the desired output is to maximize the information available to the pilot, given bandwidth and time constraints.

Consider Figure 2 as an environment that the security analyst is monitoring. We will assume single sensors can detect targets like moving people. Assume that the region of interest in a 1- or 2-D frame is composed of f features, which compose a single target. Any feature in a frame can be measured independently of the others, and the outcome of each measurement is a random variable indicating the magnitude of the feature. The probability density of each feature measurement depends on whether the target feature is actually present or not. Further assume that a fixed number of M measurements will be taken from an observation using entropy metrics. Combining these entropy metrics allows a confidence-level decision to be rendered as to which orientation the target is in and its ID. The assumption is that the target type, e.g. person, is known a priori, and the orientation information will further help reference target features for identification. Learned-observation information metrics are stored in memory, and the ATR algorithm is to compare the return image-feature measurements to a known database.

The person-ID problem is to determine which person in the sequence is correct, as well as the minimal necessary number and type of features to measure. These actions should provide the highest probability that the target orientation and ID will be isolated. After M measurements and comparisons to O observations, a measure of mutual information will determine the information-theoretic feature content. If a threshold is achieved, a preliminary orientation is determined, which allows the ID routine to extract features for initial target-person matching. The primary feature extracted is mutual information on target orientation. The motivation for mutual information is that 1) it utilizes the measured probability density function of features, 2) it can easily be adapted for learning to remove uncertainties in the problem space, and 3) orientation information can be used for the image identification process. The mathematical model for target detection, feature association, and orientation determination is presented below.
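As a concrete instance of an information-theoretic feature metric, the sketch below estimates mutual information from co-occurrence counts of discretized measurements. This is a generic histogram estimator for illustration only; the formulation in Section 3.3 is stated over continuous densities with differential entropy.

```python
import math
from collections import Counter

# Histogram-based estimate of the mutual information I(T; O) between a
# discretized target feature T and observation O. A generic estimator
# for illustration; the paper works with differential entropy of
# continuous densities (Section 3.3).

def entropy(counts, n):
    """Shannon entropy (bits) of an empirical distribution."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(t_samples, o_samples):
    n = len(t_samples)
    h_t = entropy(Counter(t_samples), n)
    h_o = entropy(Counter(o_samples), n)
    h_to = entropy(Counter(zip(t_samples, o_samples)), n)
    return h_t + h_o - h_to          # I(T;O) = h(T) + h(O) - h(T,O)

t = [0, 0, 1, 1, 0, 1, 0, 1]         # discretized target feature
o = [0, 0, 1, 1, 0, 1, 1, 0]         # discretized observation
print(mutual_information(t, o))      # large when functionally related
```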

3.3. Target Feature Identification

The identification of a target can be achieved by maximization of mutual information. The goal is to obtain a learned estimate of the association, A, that associates the target-measured features, T = f(m), with the detected target observation O by maximizing their mutual information over an association estimate:

\hat{A} = \arg\max_A I\{ T(f); O(A(f)) \}    (1)

where f is a random feature variable that ranges over the visual image. Mutual information, defined using entropy, is

I\{ T(f); O(A(f)) \} = h(T(f)) + h(O(A(f))) - h(T(f), O(A(f)))    (2)

where h(·) is the differential entropy of a continuous random variable, defined as:

h(f) = -\int p_f(l) \log(p_f(l)) \, dl    (3)

Given the random variable measurements in an image or audio track, information on referenced features f_1 and f_2 can be used as a feature of orientation, or independently as length and width. The joint entropy of two random variables f_1 and f_2 is

h(f_1, f_2) = -\int\!\int p(f_1, f_2) \log(p(f_1, f_2)) \, df_1 \, df_2    (4)

Mutual information can also be expressed as:

I\{ T; O(A(f)) \} = h(O(A(f))) - h(O(A(f)) \mid T(f))    (5)

and h(f_i \mid f_j) is the conditional entropy, which is interpreted as a measure of uncertainty, variability, or complexity. Information, in the association problem, is divided into three characteristic functions: 1) the entropy of the target, independent of A; 2) the entropy of the image with which the target is associated; and 3) the negative joint entropy of the observation O with the target T.

A large negative value exists when the target and the observation are functionally related. Basically, the algorithm learns associations where the observation O identifies the target T above a desired threshold. Hence, functions (2) and (3) are learned associations for complexity reduction. The sensor-target identification problem can be formulated as a belief filtering problem. The mathematical algorithm for measurement processing is similar to a system with independent hypotheses. Each individual belief test, denoted B_k, is referenced to the Oth image observation and simply states, "the observation contains a feature F on which we concentrate identification in this target feature." Beliefs are postulated, one for each feature f = 1, ..., F in the analysis. In (I - 1) images, B_k is false; and in one image B_k is true. Let k denote the stage of the detection, where k = 0, 1, ..., K. At every stage k > 0, a sensor makes a feature measurement and compares it to an observation in image I. By convention, let the range of the feature measurement be such that 1) outcome y(t) = 1 denotes a perfect feature correlation and 2) y(t) = 0 denotes no available feature. Measurements, which are independent from stage to stage, have a probability density that is conditioned on the presence or absence of the feature and depends on the probabilities of false alarm and missed detection. Let I(t) = {(i(s), y(s)), s = 0, ..., k} be the total information available at stage k, consisting of (i, y) measurement pairs, i(s) being the sample feature and y(s) the realized measurement at each epoch through stage k for the feature f. Now let Bel(t) = [Bel_k(t)] = [f_1(t), f_2(t), ..., f_K(t)]^T denote the vector of conditional probabilities of feature estimates for the combination of cells in the frame. The summed element of Bel_k(t) is the total conditional probability that B_k is true given the accumulated feature measurements in cells k through stage K, i.e., Bel_k(t) = P(B_k \mid I(t)). Denote Bel(0) = [F_k(0)] as the vector of initial probabilities. Assuming that feature hypotheses are independent across images, values measured in image k affect that image's ID belief and no other. The independence assumption allows multiple beliefs to be true at the same time.

Focusing on feature f, two cases present themselves, corresponding to whether the measurement of feature f+1 is useful for classification or not. Bayes' rule governs the assimilation of the measurements, where Bel(t) is our estimate for the conditional probability of feature f before the measurement in f is processed:

Detection:

Bel_O(f+1) = P(B_O(f) \mid I(f+1))
           = P(B_O(f) \mid i(f+1) = O, y(f+1) = \text{detection}, I(f))
           = P(\text{feature in } O \mid \text{detection of } O, I(f))
           = \frac{P(\text{detection of } O \mid \text{feature in } O) \, P(\text{feature in } O \mid I(f))}{P(\text{detection of } O \mid \text{feature in } O) \, P(\text{feature in } O \mid I(f)) + P(\text{detection of } O \mid \text{no feature in } O) \, P(\text{no feature in } O \mid I(f))}    (6)

By analogy, Bayes' update of Bel_O(f) for the case when the sensor does not report a feature is:

No detection:

Bel_O(f+1) = P(\text{feature in } O \mid \text{no detection of } O, I(f))    (7)

Note that, in general, the sum of Bel_k(f) values across all I images is not unity.
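A direct transcription of the update in Eqs. (6) and (7) is sketched below. The detection probability Pd and false-alarm probability Pfa are illustrative values, not measured sensor characteristics.

```python
# Bayes assimilation of one feature measurement into an image's belief,
# following Eqs. (6) and (7). Pd and Pfa are assumed for illustration.

def update_belief(bel, detected, pd=0.9, pfa=0.1):
    """bel = P(feature in O | I(f)); returns P(feature in O | I(f+1))."""
    if detected:                     # Eq. (6): sensor reports the feature
        num = pd * bel
        den = pd * bel + pfa * (1.0 - bel)
    else:                            # Eq. (7): sensor reports no feature
        num = (1.0 - pd) * bel
        den = (1.0 - pd) * bel + (1.0 - pfa) * (1.0 - bel)
    return num / den

bel = 0.5
for detected in [True, True, False, True]:
    bel = update_belief(bel, detected)
    print(round(bel, 3))             # belief rises with consistent hits
```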

3.4. Sensor Fusion by Information Association

Direct detection is an uninformed method, which devotes equal attention to every image in a database. The procedure is to choose a starting image f*, advance through the image analyzing one noisy feature per image, processing it to update Bel_k(t) for that image, and then advance to the next feature in the predetermined sequence. When the frame is completed, the pattern is repeated, starting over with the next image. A run is completed when O observations have been compared and processed. In order to ensure equal numbers of observations in each image, T is chosen to be a multiple of K so that only complete frames of measurements occur. Direct detection implements more simply than other detection methods because its cyclic, predetermined detection pattern obviates decisions about how to advance through the frame. Even though the individual measurements are processed using Bayes' rule, it should not be expected to perform well. It is included here to provide an uninformed baseline from which to draw comparisons; a minimal sketch of this baseline appears below. We note that many fielded systems use direct detection as their default mode.

Figure 6. Feature Recognition for Person Identification.
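Under the stated assumptions (one noisy feature measurement per image, Bayes assimilation per Eqs. (6)-(7)), the direct-detection baseline could be sketched as follows; the Pd/Pfa values and the image count are illustrative.

```python
import random

# Sketch of the uninformed direct-detection baseline: cycle through the
# images in a fixed order, take one noisy feature measurement per image,
# and update that image's belief via Bayes' rule. Pd/Pfa values and the
# image count are illustrative assumptions.

PD, PFA = 0.9, 0.1

def bayes(bel, hit):
    pt = (PD if hit else 1 - PD) * bel
    pf = (PFA if hit else 1 - PFA) * (1 - bel)
    return pt / (pt + pf)

def direct_detection(image_ids, measure, n_frames=10):
    beliefs = {i: 0.5 for i in image_ids}
    for _ in range(n_frames):            # complete frames only
        for img in image_ids:            # fixed cyclic scan order
            beliefs[img] = bayes(beliefs[img], measure(img))
    return max(beliefs, key=beliefs.get), beliefs

random.seed(1)
truth = "img2"
measure = lambda img: random.random() < (PD if img == truth else PFA)
print(direct_detection([f"img{i}" for i in range(4)], measure)[0])
```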



The second detection method, the learned-association rule, attempts to shorten the time required to determine the target by following an "informed" policy. The association rule's detection policy is to compare only to the most likely feature, that is, the feature with the highest probability, f*, for target/person identification, as shown in Figure 6. Since all images, containing a known entropy, are equally likely at inception, the detection procedure begins by choosing a feature f* at random and comparing the mutual information to update Bel_k(f*) using belief-filtering fusion. If f* is a non-target image feature and the measurement indicates no target, Bel_k(f*) immediately falls below the Bel_k(f) of every other image through a conflict function, thereby allowing the identification to advance to an image that now has the largest Bel_k(f). If multiple images have equally large Bel_k(f)'s, as they will at the beginning of a run, random choice is again used to break the ties. If f* is a non-target image feature and the measurement indicates a target (a false alarm occurs), Bel_k(f*) will increase, leaving it somewhat larger than the other Bel_k(f). As more feature measurements are added in the images f*, Bel_k(f*) will eventually fall below the other Bel_k(f) and thereby allow another image to attain the highest probability of mutual information. Analogous arguments apply when f* is a target image feature. The final target belief is weighted by its confidence for parameter fusion.

With the association-rule procedure, it is likely that some features will not be measured during an entire run, since belief associations will exceed identification thresholds. The measurements not utilized in these images are expended in images where false alarms occur and in the target image, which tends to absorb every measurement once its Bel(t) value gets large enough to surpass a confidence level. The ID information can be resolved after the sensors have been cued by the auditory system. In Figure 6, this is represented by the fact that the original auditory signal can be resolved spatially into a position signal, x, and then the visual system will search through the features, f, in order to determine the person's identification information. What is key is that only the features necessary to identify the person are used. In the case in which the first feature available identifies the person, the algorithm terminates, since a high confidence is established. If, however, the person is not identified, further evidential features are used to distinguish the person.
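For contrast with the direct-detection baseline, a sketch of the informed association-rule policy follows: measure the image currently holding the largest belief, break ties at random, and stop once a confidence level is surpassed. The 0.95 stopping level and the Pd/Pfa values are assumptions for illustration.

```python
import random

# Sketch of the informed association-rule policy. At each stage the
# image with the largest current belief is measured; ties broken at
# random; the run stops early when a confidence level is surpassed.

PD, PFA = 0.9, 0.1

def bayes(bel, hit):
    pt = (PD if hit else 1 - PD) * bel
    pf = (PFA if hit else 1 - PFA) * (1 - bel)
    return pt / (pt + pf)

def association_rule(image_ids, measure, confidence=0.95, max_steps=200):
    beliefs = {i: 0.5 for i in image_ids}   # equally likely at inception
    for step in range(1, max_steps + 1):
        top = max(beliefs.values())
        f_star = random.choice([i for i, b in beliefs.items() if b == top])
        beliefs[f_star] = bayes(beliefs[f_star], measure(f_star))
        if beliefs[f_star] >= confidence:
            return f_star, step             # ID resolved early
    return None, max_steps

random.seed(2)
truth = "img2"
measure = lambda img: random.random() < (PD if img == truth else PFA)
print(association_rule([f"img{i}" for i in range(4)], measure))
```

Note how a false alarm temporarily absorbs measurements until the belief falls back below the others, as described above.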

4. BIOMETRICS VERIFICATION APPLICATION

To assess the validity of the method, we need only look to the human recognition system for implementation. However, we wish to automate the system and implement it in hardware. In order to fuse the audio and visual information, we collected data samples on people in a room. The audio information, sampled at 50 windows (or frames) per second, updates the belief in the person. A voice feature model created from a speech pattern, as typically shown in Figure 7, was used to verify the belief against time-sequenced speech inputs using a commercial-off-the-shelf (COTS) speaker verification program, SpeakerKey. For the audio information, a set of keywords spoken by the person was used to characterize their speech pattern. Another COTS software program, TrueFace, creates a facial feature file and then analyzes an image of a person, providing an ID and confidence, as illustrated in Figure 8. By processing these results over time, such as block processing for a source of data, the fusion of 1D and 2D information occurred and a higher confidence in the correct person ID resulted.

Figure 7. Audio Track.

The evidential belief was established as the new data was extracted from a sequence of images and audio tracks. Sequenced tracks of audio information were processed together with their corresponding image sequence.

As one of a number of trials where the data was loaded in from each sensor, the desired person was identified using the belief filter for analysis, as displayed in Figure 9. Figure 9 shows that over time, the correct person was identified with increasing confidence. Yet, for an individual trial, a high confidence could not always be obtained. Typically, the algorithm could not match an unknown person with anything stored in the data set, and a plausibility analysis was performed to rule out a priori learned people. Only in the cases in which a decision was rendered for the desired person did the confidence level remain high. A lower confidence was experienced in cases in which the unknown person looked very similar to someone else. The P5 decision was selected since we required a confidence above 90% for this person-ID task. Automatic biometric verification for security access control can be significantly improved by processing time-sequenced inputs and using more than a single sensor modality.

Figure 8. Visual Track.
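The time-sequenced fusion reported here can be illustrated with a running combination of per-frame (identity, confidence) reports. The streams below are fabricated stand-ins for SpeakerKey and TrueFace outputs, and the log-odds accumulation is an assumed combination rule, not the belief filter actually used.

```python
import math

# Illustration of fusing time-sequenced audio and video ID reports.
# The (identity, confidence) streams are fabricated stand-ins for
# SpeakerKey and TrueFace outputs; log-odds accumulation is an assumed
# combination rule for the sketch.

def logit(p):
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))

def fuse_streams(audio_reports, video_reports, people):
    score = {p: 0.0 for p in people}
    best = None
    for (a_id, a_conf), (v_id, v_conf) in zip(audio_reports, video_reports):
        score[a_id] += logit(a_conf)     # audio evidence for its pick
        score[v_id] += logit(v_conf)     # video evidence for its pick
        best = max(score, key=score.get)
        conf = 1 / (1 + math.exp(-score[best]))
        print(f"best={best}  fused confidence={conf:.3f}")
    return best

audio = [("alice", 0.7), ("alice", 0.8), ("bob", 0.55), ("alice", 0.9)]
video = [("alice", 0.6), ("bob", 0.6), ("alice", 0.8), ("alice", 0.85)]
fuse_streams(audio, video, ["alice", "bob"])
```

As in the trials described above, the fused confidence in the correct identity grows over time even when an individual frame disagrees.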


Figure 9. Belief Analysis (belief P(Pi) versus measurement number).

5. FUTURE RESEARCH AND CONCLUSIONS

5.1. Fusion Within the Combat Target ID Problem

Target identification is the process of obtaining an accurate characterization of entities in a region of interest to the extent that high-confidence, real-time application of options and resources can occur. Most recently, fusion algorithms are being examined for their role in automatic target ID. In this paper, we focus on an extension of the physiologically-motivated information fusion architecture towards the common combat ID solution. The success attributable to any automatic ID system is rooted in the ability to exploit unique target phenomenology and the entire electromagnetic spectrum available to accomplish that goal.

From a user perspective, receiving ID information from various sources can be confusing. This is especially true if the sources of information conflict with each other. For example, if the pilot's onboard non-cooperative ID system has identified the target as a hostile tank, and an off-board source has identified a target in the same area as a friendly tank, then how does the pilot resolve this conflict? Are the onboard and off-board systems referring to the same target? Did one of the ID systems make an erroneous decision? Does the pilot launch a missile? The technology solution to this dilemma is fusion. A fusion algorithm will assist the pilot in resolving multiple ID inputs into a single decision for each target.

A robust ID capability is vital to the warfighter. However, major deficiencies exist in current perceptions and capabilities. Virtually no capability, other than the human eye, exists for identifying ground targets. A limited capability for identifying moving air targets exists. As confidence builds by using improved fusion algorithms to combine off-board ID information with the onboard cooperative and non-cooperative ID systems, warfighters will begin to utilize all of the pertinent information available to determine the identity of targets. Compatibility with all of the Services and interoperability with allied forces are required. Furthermore, the cost of ID solutions is a major driver. The most likely solutions will include retrofits to the current sensors. Thus, a joint, system-of-systems approach is the answer. We develop a human identification system, which is similar to identifying targets in a battlefield.

5.2. CONCLUSIONS

Our system-of-systems design worked well, and we are continuing to enhance the fusion of information. The work merits review in that security and CID are maintained at a high confidence level at decision time. The resulting decision allowed for the automatic processing of personnel based on the EPL as a System of Systems concept. Since known people are identified, over iterations the confidence level approaches unity. For the cases in which the unknown person looked like someone else and the audio track was similar, the decision had a lower confidence. The key attribute of the architecture is that, using the belief filter within the overall Emergent Processing Loop, confidence that a given person possesses security access (or does not) is improved significantly. The PATSIA SoS architecture proves useful for fusion of 1D and 2D sensor data. Continued research in this area is needed, as it could be applied to fusing multi-dimensional information from military sensors to aid the warfighter in Combat Identification and other recognition tasks.



6. REFERENCES

1. E. Blasch and J. Gainey, Jr., "Physiologically-Motivated Fusion Architecture," NSSDF98 Open Session, Atlanta, GA, 29 April 1998, pp. 137-150.
2. E. Blasch and J. Gainey, Jr., "Physio-Associative Temporal Sensor Integration," SPIE Int. Symp. on Aerospace/Defense Simulation and Control, Applications and Science of Computational Intelligence, Orlando, FL, 13-17 April 1998, pp. 440-450.
3. E. Blasch and L. Hong, "Simultaneous Tracking and Identification," Conference on Decision and Control, Tampa, FL, December 1998, pp. 249-256.
4. Steinberg, Sensor and Data Fusion Workshop, WPAFB, 18 December 1997.
5. E. Blasch, "Sensor Fusion Cognition using Belief Filtering for Target Tracking and Identification," to appear in SPIE Int. Symp. on Aerospace/Defense Simulation and Control, Sensor Fusion, Orlando, FL, April 1999.
6. E. Blasch and M. Bryant, "Information Assessment of SAR Data for ATR," Proceedings of the IEEE National Aerospace and Electronics Conference, Dayton, OH, July 1998, pp. 414-419.
7. P. Fitts and M. Posner, Human Performance, Brooks/Cole, Belmont, CA, 1967.
8. J. Gainey, Jr. and E. Blasch, "Biological Data Association for Combat Target Identification," NSSDF98, 31 March - 3 April 1998, Atlanta, GA.
9. B. Libet, Neurophysiology of Consciousness, Boston: Birkhauser, 1993.
10. F. Crick, "Function of the Thalamic Reticular Complex: the Searchlight Hypothesis," Proc. of the National Academy of Sciences, Vol. 81, 1984, pp. 4586-4590.
11. D. Hall, Mathematical Techniques in Multisensor Data Fusion, Artech House, Boston, 1992.
12. E. Waltz and J. Llinas, Multisensor Data Fusion, Artech House, Boston, MA, 1990.


