A Signal Processing System Based upon Monolithic Neural Coprocessors
J.G. Guimarães, A.R.S. Romariz, P.U.A. Ferreira, J.V. Campêlo Jr., M.L. Graciano Jr., O.R. Maia Jr., J. Zancanaro, J.C. da Costa
Laboratório de Projeto de Circuitos Integrados - LPCI
Departamento de Engenharia Elétrica - Universidade de Brasília
Cx. Postal 4386 - Brasília - DF - 70919-970 - Brasil
e-mail: [email protected] Tel: 55-61-2735977 Fax: 55-61-2746651
Abstract
A signal processing system based upon a custom-made neural coprocessor integrated circuit is presented. The system comprises a dedicated PCB containing up to four coprocessors, which works in association with a microcomputer. All control and communication software, including a program for neural network training, was also developed. A speech recognition application and a performance evaluation are also presented.
1. Introduction
Neural network based signal processing systems are becoming increasingly popular owing to the intrinsic pattern recognition capabilities of artificial neural networks. Most implementations, however, employ conventional von Neumann-like processors to emulate the desired topologies, neglecting the speed advantages provided by the parallel distributed processing character of such networks. Another approach is to realize the entire neural network in a monolithic I.C., which usually requires a non-negligible chip area and yields a somewhat limited network dimension. The most relevant limitations of that approach are related to storage capability and complexity. In this work a third approach was considered: the realization of a neural network processing system based upon a dedicated hardware accelerator coupled to a conventional microcomputer. A set of monolithic neural I.C. coprocessors, operating on a dedicated circuit board, carries out the basic classifying tasks, while the microcomputer provides the permanent memory, learning and simulation facilities. In the following sections, the hardware and software implementations that constitute the system, as well as an application example and a performance estimation, are presented.
2. System description
The signal processing system consists of dedicated hardware (a printed circuit board containing up to four mixed-mode neural coprocessor chips [1] [2], a local control processor, a local memory module, and data conversion and I/O interfaces) attached to a microcomputer. This hybrid structure is used to realize various artificial neural network processor configurations, mostly based upon the perceptron paradigm. The basic functions of a neural network (such as synaptic and/or neuron operations) are performed at board level, while the learning algorithm, the permanent storage of the network weights and the system management tasks remain in the microcomputer. The data flow between the microcomputer and the coprocessors is based on a pipeline strategy [3]. With this arrangement, the microcomputer's performance in typical signal processing tasks improves significantly, owing to the parallel, distributed character of the neural coprocessor architecture and to its analog processing units. To fully implement the system, basic software that carries out control and communication tasks between the PCB and the microcomputer was developed. Firmware, resident in the PCB's microcontroller and supervising all steps of its operation, was also developed. Finally, dedicated simulation and training software that takes advantage of the system's features was implemented. The system's structure is shown schematically in Fig. 1.
Figure 1: System structure
The fast (8 MIPS) microcontroller available on the card runs basic routines that control communication with the PC and supervise the coprocessor state machine. The proposed firmware contains the following routines [4] (Fig. 2): a setup routine, triggered automatically at power-up; command interpretation and execution (detailed below); a refresh procedure for the synaptic weight capacitors; and finally a reset routine, started once a full training or execution cycle has ended. Routines are triggered by one of three interrupts: the timer interrupt (which indicates that a capacitor refresh is needed), the Reg_In interrupt (which signals that a command has been issued by the PC) and the end-of-cycle interrupt. The available instructions deal mainly with the exchange of execution parameters between the PC and the coprocessor card. Specific instructions are available for programming the network topology (2x1, 2x2 or 3x1 configuration), starting processing and requesting a functional test. Through the output register, the card reports execution progress or the value of a requested parameter.
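The interrupt-driven structure described above can be pictured as a small state machine. The Python sketch below is only an illustration of that idea: the actual firmware runs on the board's microcontroller, and the names `State`, `Interrupt`, `ROUTINE_FOR` and `trace_states`, as well as the assumption that every routine returns to an idle state, are hypothetical and inferred from the text and Fig. 2.

```python
from enum import Enum, auto


class State(Enum):
    SETUP = auto()      # runs once after power-up
    IDLE = auto()       # waiting for an interrupt
    REFRESH = auto()    # refresh of the synaptic weight capacitors
    COMMAND = auto()    # command interpretation and execution
    RESET = auto()      # reset after a full training or execution cycle


class Interrupt(Enum):
    TIMER = auto()          # capacitor refresh is due
    REG_IN = auto()         # the PC has issued a command
    END_OF_CYCLE = auto()   # a training or execution cycle has ended


# Assumed mapping from interrupt source to the routine it triggers.
ROUTINE_FOR = {
    Interrupt.TIMER: State.REFRESH,
    Interrupt.REG_IN: State.COMMAND,
    Interrupt.END_OF_CYCLE: State.RESET,
}


def trace_states(interrupts):
    """Return the list of states visited for a given sequence of interrupts."""
    visited = [State.SETUP, State.IDLE]   # power-up: setup, then wait
    for irq in interrupts:
        visited.append(ROUTINE_FOR[irq])  # service the interrupt
        visited.append(State.IDLE)        # assumed: every routine returns to idle
    return visited
```

For example, `trace_states([Interrupt.TIMER, Interrupt.REG_IN])` visits setup, idle, refresh, idle, command and idle, in that order.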
3. Dedicated hardware
An ISA board hosts up to four mixed-mode coprocessors [4] [5], which operate at 12 MHz. Its basic function is to ensure communication between the microcomputer and the coprocessors. The board contains D/A (input) and A/D (output) converters and a dual-port 16 K SRAM, adopted to improve data transfer between the microcomputer and the coprocessors. To enhance overall system performance, a local microcontroller (Dallas DS80C320) carries out all management operations at board level, through firmware resident in a 32 kB EPROM module. Three different coprocessor configurations can be selected on the board: 1 coprocessor (single layer, up to 15 physical synapses and 15 outputs); 2x2 coprocessors (double layer, up to 30 physical synapses and 30 outputs); and 3x1 coprocessors (double layer, up to 45 physical synapses and 15 outputs). Other topologies can be implemented through suitable software management of input data, weight values and output data. An example of this feature can be seen in section 7. A block diagram of the board is presented in Fig. 3. A picture of the implemented prototype board is shown in Fig. 8.
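One way to picture how software management of inputs and weights could extend a network beyond the 15 physical synapses of a single coprocessor is to split a long weighted sum into successive passes. The sketch below is an assumption made for illustration, not the scheme documented for the board: the function name `weighted_sum_in_passes` is hypothetical, and only the figure of 15 synapses comes from the text.

```python
SYNAPSES_PER_PASS = 15  # physical synapses of one coprocessor (from the text)


def weighted_sum_in_passes(inputs, weights, width=SYNAPSES_PER_PASS):
    """Accumulate a long dot product as a series of partial sums.

    Each pass handles at most `width` input/weight pairs, mimicking a
    hypothetical time-multiplexed use of a fixed number of synapses.
    Returns the total and the number of passes needed.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = 0.0
    passes = 0
    for start in range(0, len(inputs), width):
        chunk_x = inputs[start:start + width]
        chunk_w = weights[start:start + width]
        total += sum(x * w for x, w in zip(chunk_x, chunk_w))
        passes += 1
    return total, passes
```

Under this assumption, a 256-element input vector such as the one used in section 7 would need 18 passes per neuron (17 full passes of 15 inputs plus one pass for the remaining input).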
4. Firmware
[Figure 2 diagram labels: Power Up, Setup, Idle, Routine execution, Refresh, Command interpreter, End of cycle, Timer Interrupt, Reg_In Interrupt]
Figure 2: Firmware basic state diagram
Figure 3: PCB block diagram
5. Basic Operating Software
Given the system's complexity, a Basic Operating Software is necessary so that application development is not unduly penalized. This software exploits the characteristics of the neural coprocessor card and gives the application program control of the board's basic functions. One of its main functions is to adapt the data generated by the application software for processing on the dedicated board. The resource management and data adaptation performed by the Basic Operating Software simplify the development of applications involving the neural coprocessor card; in addition, hardware changes or new implementations will not necessarily require changes to the applications. Some basic features were considered while developing this system:
• use of well-known technologies such as IBM-PC, ISA-AT, DOS and Windows;
• possibility of simultaneous use of 4 coprocessors, enabling various neural net topologies;
• system and coprocessor test capability [9].
The Basic Operating Software, using resources available from DOS or Windows, interacts with the application, sending data and commands. On the other side, using the bus and interrupts available in the PC, it exchanges information with the board (Fig. 4).
Figure 4: Functional Division [6]
Communication between the board and the Basic Operating Software has two modes: Control/Supervision and In/Out Data Exchange. The Control/Supervision scheme is based on messages, commands and answers. Messages are sent from the board to the Basic Operating Software, in one direction only. Commands originate in the PC and are sent to the board via the Basic Operating Software. Answers are generated by the receiving unit: the board when it receives a command, or the PC when it receives a message (Fig. 5). Communication between the application and the Basic Operating Software is carried out by two groups of procedures. The first group, the Operation Procedures, is related to the neural processing resources; the second, the Control/Supervision Procedures, is related to control and supervision and is not manipulated by the application (Fig. 5).
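The Control/Supervision exchange between the PC and the board (messages, commands and answers) can be modelled with a few lines of code. The following Python sketch is a hypothetical model written only to make the direction rules explicit: the names `Unit`, `Kind`, `Packet`, `make_command`, `make_message` and `make_answer`, and the string payloads, are assumptions and do not reflect the actual interface of the Basic Operating Software.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    PC = "PC"
    BOARD = "Board"


class Kind(Enum):
    MESSAGE = "message"   # board -> PC only
    COMMAND = "command"   # PC -> board only
    ANSWER = "answer"     # sent back by whichever unit received the packet


@dataclass(frozen=True)
class Packet:
    kind: Kind
    sender: Unit
    receiver: Unit
    payload: str          # hypothetical payload format


def make_command(payload):
    """Commands always travel from the PC to the board."""
    return Packet(Kind.COMMAND, Unit.PC, Unit.BOARD, payload)


def make_message(payload):
    """Messages always travel from the board to the PC."""
    return Packet(Kind.MESSAGE, Unit.BOARD, Unit.PC, payload)


def make_answer(received, payload):
    """The unit that received a command or message answers its sender."""
    if received.kind is Kind.ANSWER:
        raise ValueError("answers are not answered")
    return Packet(Kind.ANSWER, received.receiver, received.sender, payload)
```

In this model an answer to a command is always sent by the board, and an answer to a message is always sent by the PC, which matches the roles described above.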
Figure 5 – Communications Ways [6]
The Basic Operating Software was implemented in two forms: integrated and dynamic link libraries (DLLs). In the integrated form, the Basic Operating Software is a set of routines ready to be linked with the application. The same resources are available as DLLs, which are a recompilation of the integrated form and are suitable for Windows [6].
6. Simulation and training programs
The User Interface, Simulation and Training Program (UISTP) was developed in Borland C++, version 4.02, using object-oriented programming, and runs under Windows. The computer must be IBM-PC compatible, with a processor compatible with the Intel 80486 instruction set. It should have at least 8 Mbytes of RAM, 270 Mbytes of hard disk space and a color monitor [7]. This program simulates and trains neural nets and allows a neural net application to be implemented on the neural coprocessor board. It was designed to comply with the architecture of the neural processing system, and the data exchange between this program and the coprocessors follows the scheme presented in Fig. 1. Essentially, it has five parts, as shown in Fig. 6.
Figure 6 – UISTP program structure
a) Read and Save Net Configuration: this routine reads the net configuration (number of layers, number of neurons, activation function type), input vectors, desired outputs and weights, and also saves weights and calculated outputs.
b) Net Initialization: this routine initializes the net weights and also handles the input and desired output vectors, shuffling them for training.
c) Net Training: this routine implements one of nine user-selectable training strategies for Perceptron-based neural nets. Some examples are backpropagation strategies, including total square error, partial square error, square error with momentum, and cost analysis with neuron suppression. In addition, training strategies based on random search of weights are available.
d) Net Application: this routine allows visualisation, selection and editing of input training patterns.
e) Help: access to an extensive help file is provided in each operating window of the training software.
Off-line and on-line training strategies are available. In the off-line strategy, all weight changes are computed by software and only the final values are presented to the board. In the on-line strategy, weight updates are computed by the training software using the coprocessor's response to input signals.
7. System application
The neural coprocessor board can implement a large variety of applications. As a first example, a speech recognition case was chosen. The goal is to make computer operation easier and faster by executing basic functions through single voice commands. To train this system, a voice database was first generated, containing 15 key-words, including numerical digits and basic commands.
Each word was repeated 70 times by two male and two female speakers and recorded using an 8 kHz sampling frequency and a resolution of 16 bits per sample [8]. Before a speech signal is presented to the neural net, however, it must be parameterized, extracting relevant information and eliminating redundancy. Cepstral analysis was chosen for this purpose, since it represents the signal in a compact way, making its processing easier [8]. After parameterization, a time adjustment is needed, because the patterns presented to a neural classifier must have a fixed size matching the number of inputs [8]. As a result, each word is pre-processed into a 256-element vector before being presented to the net, as shown in Fig. 7.
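The pre-processing chain just described (cepstral parameterization followed by a time adjustment to a fixed-size vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [8]: it uses an FFT-style real cepstrum computed with a naive DFT, linear interpolation for the time adjustment, and a hypothetical split of the 256 elements into 16 frames of 16 coefficients. The frame length of 64 samples and all function names are likewise assumptions.

```python
import cmath
import math


def real_cepstrum(frame, n_coeffs):
    """First n_coeffs real-cepstrum coefficients of one frame (naive DFT)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(max(abs(s), 1e-12)))  # floor avoids log(0)
    # Inverse DFT of the log-magnitude spectrum; its real part is the cepstrum.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n) for k in range(n)).real / n
        for q in range(n_coeffs)
    ]


def fix_length(frames, n_frames):
    """Linearly interpolate a sequence of vectors to exactly n_frames (> 1) vectors."""
    if len(frames) == 1:
        return [list(frames[0]) for _ in range(n_frames)]
    out = []
    for i in range(n_frames):
        pos = i * (len(frames) - 1) / (n_frames - 1)
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def word_to_vector(samples, frame_len=64, n_coeffs=16, n_frames=16):
    """Turn a variable-length word into a fixed n_frames * n_coeffs vector."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    ceps = [real_cepstrum(f, n_coeffs) for f in frames]
    fixed = fix_length(ceps, n_frames)
    return [c for vec in fixed for c in vec]
```

With these default values, words of different durations all map to vectors of 16 x 16 = 256 elements, which is the property the neural classifier requires.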
The signal processing system can perform on-line recognition about 100 times faster than a software emulation of the neural network (running on a 133 MHz Pentium PC-AT).
8. Conclusions
A hybrid signal processing system based upon monolithic mixed-mode neural coprocessors was developed. The development covered all hardware and software aspects and was satisfactorily completed. The architecture employs a dedicated board that works as a hardware accelerator for its host microcomputer, allowing a performance enhancement of two orders of magnitude for a speech recognition task. Further developments include the final board fabrication and the implementation of other applications using the system's full capabilities, including on-line training strategies.
Figure 7 – Pre-processing Scheme
Just one speaker was used to train a single-layer Perceptron with 15 neurons, each neuron corresponding to one key-word, which characterizes speaker-dependent isolated word recognition. 675 patterns were used for training and another 375 distinct patterns were used for testing, reaching an identification rate of 98.2%. A speaker-independent isolated word recognition task is also being tested, with single-layer and multilayer Perceptrons, reaching an identification rate of 93.6% so far. After the training process, the goal is to run the system on-line, so that recognition is done by the net as fast as possible. In this mode of operation, the neural net acts purely as a classifier, and the weights are accessed only to classify the spoken word. Table 1 compares the performance of the developed signal processing system with that of an optimized PC program [8] that simulates the neural classifier, when both identify the same input voice signal using the cepstral parameterization described previously.
Table 1 – Classification Processing Time Comparison
Classifying Environment                      Processing Time (ms)
Neural Classifier Simulation (Software)      50
Neural Coprocessor Board (Estimated)         0.46
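The speed-up quoted in the text follows directly from the two processing times in Table 1. The short calculation below reproduces it. It also checks that the training and test set sizes are consistent with the database description, under the assumption that each speaker recorded 70 repetitions of each of the 15 key-words. The variable names are illustrative only.

```python
import math

software_ms = 50.0   # neural classifier simulation (software), Table 1
board_ms = 0.46      # neural coprocessor board (estimated), Table 1

speedup = software_ms / board_ms
orders_of_magnitude = math.log10(speedup)

# Data set bookkeeping for the speaker-dependent experiment
# (assumes 70 repetitions per speaker for each of the 15 key-words).
patterns_per_speaker = 15 * 70
train_patterns, test_patterns = 675, 375

print(f"speed-up: {speedup:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
print("split matches database:", patterns_per_speaker == train_patterns + test_patterns)
```

The ratio is slightly above 100, so the estimated board time is consistent with the "two orders of magnitude" stated in the conclusions.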
Figure 8 – Prototype dedicated board photo
References
[1] A.R.S. Romariz, P.U.A. Ferreira, J.V. Campelo Jr., M.L. Graciano Jr., J.C. da Costa. Design of a Hybrid Digital-Analog Neural Co-processor for Signal Processing. Proceedings of the 22nd EUROMICRO Conference, Prague, Czech Republic, September 1996.
[2] A.R.S. Romariz, P.U.A. Ferreira, J.V. Campelo Jr., M.L. Graciano Jr., J.C. da Costa. Realization of a mixed-mode neural coprocessor for signal processing. Submitted to this symposium.
[3] E.B.F. Lima. Nova arquitetura para implementação de Perceptron multicamada para ligação em circuitos com DSP (in Portuguese). Unpublished.
[4] O.R. de Maia Jr. Master degree dissertation. Departamento de Engenharia Elétrica – Universidade de Brasília, Brasília, Brasil, 1997.
[5] J. Zancanaro. Technical Report. Departamento de Engenharia Elétrica – Universidade de Brasília, Brasília, Brasil, 1997.
[6] P.U.A. Ferreira. Master degree dissertation. Departamento de Engenharia Elétrica – Universidade de Brasília, Brasília, Brasil, 1997.
[7] J.V. Campelo Jr. Master degree dissertation. Departamento de Engenharia Elétrica – Universidade de Brasília, Brasília, Brasil, 1996.
[8] J.G. Guimarães, J.M.P. Diniz, A.R.S. Romariz, J.C. da Costa, L.M. da Silva. Sistema de Reconhecimento de Palavras Isoladas Baseado em Redes Perceptron e Análise Cepstral. Anais do IV Simpósio Brasileiro de Redes Neurais, Goiânia, Brasil, 1997.
[9] M.L. Graciano Jr. Master degree dissertation. Departamento de Engenharia Elétrica – Universidade de Brasília, Brasília, Brasil, 1996.