Quantum-Dot Cellular Automata Serial Comparator

June 3, 2017 | Autor: Iztok Lebar Bajec | Categoria: Cellular Automata, Digital design, Data Flow Diagram, Quantum-Dot Cellular Automata
Share Embed


Descrição do Produto

11th EUROMICRO CONFERENCE on DIGITAL SYSTEM DESIGN Architectures, Methods and Tools

Quantum-Dot Cellular Automata Serial Comparator Blaz Lampreht, Luka Stepancic, Igor Vizec, Bostjan Zankar, Miha Mraz, Iztok Lebar Bajec and Primoz Pecar University of Ljubljana, Faculty of Computer and Information Science Trzaska 25, 1000 Ljubljana, Slovenia [email protected]

Abstract

Dot Cellular Automata (QCA) platform. The novel computing paradigm set by C. S. Lent at the start of the last decade [9, 7, 14] has soon attracted computer designers. This resulted in intensive research in the field of combinatorial as well as sequential QCA devices. Despite of the simplicity of primitives (wire, inverter, AND/OR gates, etc.) the design of complex logic devices (adder, memory, etc.) has proved to be more complicated. This is, apart from the obvious implementation issues, mainly due to relying on current optimized CMOS designs, which usually can not be simply mapped to QCA. We believe that in order to exploit the full potential of the novel platform every device should be designed from scratch. Having the above in mind and the initial work regarding comparators done by Q. Ke-ming [5] and J. R. Janulis [4] we decided to focus our research on the design of an n bit serial comparator. The proposed device shall not show merely a match/mismatch between two n bit numbers, but also their relation (less than, equal to, greater than). The CMOS serial comparator circuit spends n + 1 clock cycles for comparison of two n bit numbers. We will show that one can come near this efficiency also in the QCA domain. In section 2 we will present an overview of the QCA. In section 3 we will describe the design of the QCA serial comparator. Section 4 concludes with the analysis of its behaviour using the QCADesigner simulation tool.

The Quantum-dot Cellular Automata (QCA) are one of the few alternative computing platforms that meet most of the criteria desired for computing platforms of the future. One of the basic concepts that has popularized the QCA platform to computer designers is adiabatic pipelining, which implicitly assures the correct data flow and in this view simplifies digital design. When designing QCA devices this should be taken advantage of instead of copying digital device designs that were optimized for the CMOS technology. One of the basic digital devices and probably a vital part of many modern computing platforms is the comparator. We here present a novel approach to comparator design, which takes advantage of adiabatic pipelining. Our analysis was based on the QCADesigner tool.

1. Introduction Comparing two binary words is an operation that is commonly used in computer and communication systems, especially in the realm of image processing, 3D graphics, data mining and sorting. Therefore, a digital comparator is one of the fundamental components that have been and still are extensively researched and optimized in the domain of CMOS technology [6]. Nevertheless, with the continual downscaling of device dimensions CMOS technology is reaching its limits, which will sooner or later hold further improvement. One of the most evident causes is the avoidance of quantum effects that dominate the nano scale region [2]. Nanotechnology, a science that engages in the study and manipulation of material in the respective scale, offers an alternative approach. Instead of annihilating or avoiding quantum effects it takes advantage of them. Following this approach many new alternative processing platforms have emerged. One of the most promising, one that some day might replace current CMOS computing devices, is the Quantum-

978-0-7695-3277-6/08 $25.00 © 2008 IEEE DOI 10.1109/DSD.2008.49

2. QCA overview A Quantum-dot Cellular Automaton (QCA) is a planar array of, so called, quantum-dot cells (QCA cells). Each QCA cell contains two mobile electrons, which can reside only at designated locations, the quantum dots. They are able to tunnel between adjacent quantum dots, but tunnelling outside of a cell is impossible. The Coulomb interaction between the electrons causes them to occupy opposite corners of the cell. The two diagonal arrangements correspond to energetic minimal states, namely ground states. Following the principle of ground state computing, which

447

asserts that from a computing point of view the only acceptable state of a QCA cell is its ground state, the two states can be interpreted as binary values 0 and 1. When both arrangements are energetically equal it is said that the cell is in neutral state (see Fig. 1). The presence of ex-

Figure 2. The cyclic signal controls the rate at which electrons are able to tunnel between the dots in the cell and therefore to switch from one state to another. When the barriers are low, the electrons are equally spread out over the cell and the cell is in neutral state. When the barriers are high, the electrons are diagonally localized in the cell’s quantum dots so as to minimize the Coulombic interaction with the electrons in the neighbouring cells.

Figure 1. The two ground states that are mapped to binary values 0 and 1 and the representation of a cell in neutral state. ternal influences, i.e. placing one or more QCA cells in the observed cell’s neighbourhood, usually causes one of the arrangements to become the favoured ground state. The cell to cell interaction is strictly Coulombic and involves only rearrangements of electrons within individual cells, thus it enables computation. This means that with specific planar arrays (arrangements) of QCA cells it is possible to mimic the behaviour of interconnecting wires as well as logic gates and by interconnecting these more complex devices capable of processing can be constructed [14, 10]. The reliability of the QCA device behaviour depends foremost on the reliability of the switching process, i.e. the transition of a cell’s state that corresponds to one logic value to a state that corresponds its inverse value and vice versa. The desired stability is achieved by means of the adiabatic switching concept, where a cyclic signal, namely adiabatic clock, controls the cells’ switching dynamic [13, 8]. The signal comprises four phases (see Fig. 2). The switch phase (S) serves the cells’ gradual update of the state with respect to their neighbours. The hold phase (H) is intended for the stabilisation of the cells’ states when they are to be transmitted to the neighbours that are in the switch phase (i.e. the cells act as a fixated input for all other cells). The release (R) phase and the relax (L) phase support the cells’ gradual preparation for a new switch (i.e. the states of the cells gradually transit to a neutral state). Adiabatic switching introduces one of the most effective architectures in modern digital design, the pipeline architecture. The basic idea is to divide a computational problem into sub problems, which can be solved concurrently and independently. Using pipelining the computing performance is improved by the increase of throughput of the system when processing a stream of data. This is achieved by a set of processing elements, namely stages, interconnected so that the output of one stage is the input of the next one. Each stage is dedicated to solving a particular sub problem

so that these can be executed in parallel. To overcome the synchronization problems the execution of the stages is usually controlled by one or more clock signals. Since the adiabatic clock signal is composed of four phases, the QCA based structures can be decomposed to stages or subsystems controlled by four phase shifted adiabatic clock signals (four clocking zones). The obtained structure then shows behaviour similar to a pipeline. Indeed each subsystem controlled by an independent clock signal acts as one pipeline stage. This allows the decomposition of the computation problem to a number of sub problems and each subsystem can be designated to solve only one. The phase shifted nature of the controlling signals allows the blocks that are in the hold phase to act as inputs for blocks that are in the switch phase. Therefore a subsystem after performing the computation can be designed to lock its state and act as the input for another subsystem. As the transaction is finished the second subsystem can start processing while the first subsystem is ready for processing on new inputs [8]. With the correct assignment of different cells to different clocking zones, the data flow direction can be controlled. Large regions of nearby cells are usually assigned to the same clocking zone in order to eliminate the challenges that would be caused by attempting to deliver a separate clock signal to every cell. The latency of a QCA circuit is determined by the number of clocking zones along its critical path. A sequence of four clocking zones causes the delay of one clock cycle. Consequently minimizing the number of clocking zones leads to a better QCA design [10, 12, 11]. The design of complex QCA devices is based on three primitives: the QCA wire, the QCA inverter and the QCA majority gate. A sequence of QCA cells that enables prop-

448

agation of data from input to output is called the QCA wire. There exist two different designs that differ in their behaviour. The 90-degree QCA wire (see Fig. 3a) can be

Another way of obtaining the inverse is by connecting a 90-degree wire at the proper locations of a 45-degree wire. This way a complemented and uncomplemented signal value can be concurrently extracted from a 45-degree wire (see Fig. 5). The performed function can be described

Figure 3. Two designs of QCA wire (a,b) and co-planar crossover of two QCA wires (c).

described as a processing element performing the function: y = w90 (x) = x,

(1) Figure 5. Extracting an uncomplemented and complemented value from a 45-degree wire.

where x is the binary value that corresponds to the state of cell X and y is the binary value that corresponds to the state of cell Y. The 45-degree QCA wire (see Fig. 3b), also called inversion chain, propagates data in an alternating fashion, performing the function:  x, if the number of cells is odd y = w45 (x) = x, if the number of cells is even, (2) where x is the binary value that corresponds to the state of cell X and y is the binary value that corresponds to the state of cell Y. These two designs make single plane crossovers possible, as shown in Fig 3c. The data propagating along the 45-degree wire does not interact with the data on the 90-degree wire. The inverter has many different implementations, which mostly serve to increase its robustness [14]. We here use the basic implementation presented in Fig. 4. The given

with the following logic expressions: y1 = x, y2 = x,

(4)

where x is the binary value that corresponds to the state of cell X and y1 and y2 are the binary values that correspond to the state of cells Y1 and Y2 , respectively. The described property diminishes the number of inverters in the design, thus the 45-degree wire is usually used as an input line. The QCA majority gate is the fundamental logic gate in QCA design. It is constructed as a crossing of three QCA wires, as shown in Fig. 6. The structure has three input

Figure 6. The QCA majority gate and the corresponding schematic symbol.

Figure 4. The QCA inverter. structure evaluates the classical equation y = i(x) = x,

cells denoted X1 , X2 and X3 , a device cell in the centre and an output cell Y. It acts as majority voting logic, thus the device cell assumes the state that equals to the majority of states present in the three input cells and propagates it to the output cell. The behaviour can be described as a threshold

(3)

where x is the binary value that corresponds to the state of cell X and y is the binary value that corresponds to the state of cell Y.

449

function: y = m(x1 , x2 , x3 ) = x1 x2 ∨ x2 x3 ∨ x1 x3 ,

be implemented as an SR flop-flop with the characteristic equation (7) D1 q = rq ∨ s, rs = 0,

(5)

where x1 , x2 , x3 are binary values that correspond to the states of input cells X1 , X2 , X3 and y is the binary value that corresponds to the state of the output cell Y. Probably the most important feature of the majority gate is that it can be used to implement the AND and OR logic functions. The AND function can be implemented by fixing one input logic value to 0. Similarly, the OR function can be implemented by fixing one input logic value to 1. The QCA inverter and the QCA majority gate represent the functionally complete set of boolean logic functions, thus any arbitrary boolean logic function can be constructed using these two structure interconnected by QCA wires.

where s and r are binary input values and q is the binary output value of the register. Using the QCA platform the basic SR flip-flop is implemented as a delay loop with two majority gates acting as AND and OR gates (see Figs. 7 and 8). The delay loop is implemented using the adiabatic pipelin-

Figure 7. The QCA SR flip-flop schematics. The symbol D denotes where the actual delay is achieved.

3. Design of the QCA serial comparator The n-bit serial comparator is a sequential device that performs comparison of two n-bit numbers using only one comparing module instead of n-modules, replacing them by two 1-bit registers. The numbers are delivered sequentially, one bit per clock cycle, least significant bit (LSB) first and compared bitwise. The comparing module that returns the relation (less than, equal to, greater than) between two bits is in CMOS a combinatorial device. There are two inputs, namely a and b, i.e. the two compared bits and three outputs corresponding to relations a < b, a = b and a > b. The equations of the outputs are: f (a, b)< f (a, b)> f (a, b)=

= = =

ab, ab, (f (a, b)< ∨ f (a, b)> ).

ing concept [1, 3], which constantly moves the stored data in the memorizing cell, as shown in Fig. 8. It can be noticed

(6) Figure 8. The QCA SR flip-flop layout.

As evident from equation (6) the comparison is performed in two steps. In the first step the relations < and > are resolved using two AND gates with inverted a and b inputs and in the second step relation = is resolved using a NOR gate with the results of the first step as inputs. In order to serially compare two n-bit numbers a device has to perform n bitwise comparisons and memorize the relation between them throughout the comparison. The relations , = can be coded with two bits, namely q1 and q2 as shown in table 1. The two bits are stored in two 1-

that, as a contrast to CMOS sequential circuits, there is no need for an extra clock signal, as it is implicitly used for pipeline control. From equations (6) and (7) and table 1 follows that q1 is forced to 1 in case of relation and preserves the value in case of relation =. Similarly, q2 is forced to 1 in case of relation >, to 0 in case of relation < and preserves the value in case of relation =. This can be achieved by executing the comparison in phases. The first evaluates for relations < and > and sets the appropriate values to registers Q1 and Q2 . These are then used to evaluate the intermediate relation. Recalling our discussion on the bit comparison module, the two registers can be inserted between the two steps of the comparison process. When dealing with data streams a device has to know when one stream ends and the next begins. Let c be a reset signal, which forces both registers to 0 when low. Thus, a

Table 1. Coded relations . Relation q2 q1 = 0 0 < 0 1 > 1 0 bit registers denoted as Q1 and Q2 . The 1-bit register can

450

successful reset results in the output f (a, b)= being 1. The equations of the described comparator can be written as: D1 q1 D1 q2 f (a, b)< f (a, b)> f (a, b)=

= = = = =

q1 (ab ∨ c) ∨ (abc), q2 (ab ∨ c) ∨ (abc), q1 , q2 , (f (a, b)< ∨ f (a, b)> ),

(8)

where q1 and q2 are outputs of registers Q1 and Q2 respectively, a and b are input values and f (a, b)< , f (a, b)> , f (a, b)= are output functions. The proposed design based on the inverter, AND and OR gates can be implemented using the QCA platform, but it is, if we take into consideration space complexity, inefficient. Using the inverter, AND and OR gates as a functionally complete set is fine for CMOS technology, but it is suboptimal for the QCA platform. As already stated in QCA the basic logic devices are the inverter and the majority gate. Reducing the functionality of the majority gates to simple AND and OR gates typically results in an increase of space complexity. In order to alleviate this deficiency we adapted the current design using minimization based on majority gates [16]. Starting from equation (8) and disregarding the reset signal c the outputs of the SR memorizing cells can be written as:

Figure 9. The schematics of the serial comparator, implemented using the majority gate and inverter as basic building blocks.

D1 q1 = q1 (ab) ∨ ab = q1 (a ∨ b) ∨ ab = q1 a ∨ q1 b ∨ ab, D1 q2 = q2 (ab) ∨ ab = q2 (a ∨ b) ∨ ab = q2 a ∨ q2 b ∨ ab. (9) Comparing the obtained equations with equation (5) it can be noticed that they are practically the same. Thus the SR memorizing cell can be described with a single majority gate as: D1 q1 = m(q1 , a, b), (10) D1 q2 = m(q2 , a, b).

Figure 10. The layout of the QCA serial comparator, implemented using the majority gate and inverter as basic building blocks.

To incorporate also the reset signal c into the design another AND gate has to be added. As stated in section 2 this can be easily achieved using the majority gate with one input forced to 0. Bearing these optimizations in mind the QCA serial comparator can be described as: D1 q1 D1 q2 f (a, b)< f (a, b)> f (a, b)=

noticed that we have used co-planar wire crossings and 45degree input lines. Comparing the layout to the one based on the inverter, AND and OR functionally complete set the number of majority gates has reduced from 13 to 5.

= = = = =

m(q1 , a, b)c = m(m(q1 , a, b), 0, c), m(q2 , a, b)c = m(m(q2 , a, b), 0, c), q1 , q2 , m(f (a, b)< , f (a, b)> , 1). (11) The corresponding schematics is shown in Fig. 9 and can be directly implemented using the QCA platform and the concept of adiabatic pipelining. From the layout of the proposed QCA serial comparator (see Fig. 10) it can be

4. Analysis of the QCA serial comparator The analysis was carried out using the QCADesigner as design and simulation tool [15]. The simulation results shown in Fig. 11 were obtained while comparing numbers a = 114d = 1110010b and b = 78d = 1001110b using the bistable approximation simulation engine. The first three curves represent the input waveforms of the c, a and b signals. The next three represent the outputs f (a, b)< ,

451

parators.

References [1] D. Berzon and T. Fountain. Computer memory structures using QCA. Technical report, University College London, 1998. [2] T. Cole and J. Lusth. Quantum-dot cellular automata. Progress in Quantum Electronic, 25(4):165–189, 2001. [3] S. Frost, A. Rodrigues, A. Janiszewski, R. Raush, and P. Kogge. Memory in motion: A study of storage structures in QCA. In 8th International Symposium on High Performance Computer Architecture (HPCA–8), First Workshop on Non-Silicon Computation (NSC–1), Boston, Massachusetts, 2002. [4] J. R. Janulis, P. D. Tougaw, S. C. Henderson, and E. W. Johnson. Serial bit-stream analysis using quantum-dot cellular automata. IEEE Transactions on nanotechnology, 3(1):158– 164, 2004. [5] Q. Ke-ming and X. Yin-shui. Quantum-dots cellular automata comparator. In 7th International Conference on ASIC, ASICON ’07., pages 1297–1300, 2007. [6] J.-Y. Kim and H.-J. Yoo. Bitwise competition logic for compact digital comparator. In ASSCC ’07 IEEE Asian SolidState Circuits Conference, pages 59–62, 2007. [7] C. Lent and P. Tougaw. Lines of interacting quantumdot cells: A binary wire. Journal of Applied Physics, 74(10):6227–6233, 1993. [8] C. Lent and P. Tougaw. A device architecture for computing with quantum dots. Proceedings of the IEEE, 85(4):541– 557, 1997. [9] C. Lent, P. Tougaw, W. Porod, and G. Bernstein. Quantum cellular automata. Nanotechnology, 4:49–57, 1993. [10] M. T. Niemier and P. Kogge. Logic in wire: Using quantum dots to implement a microprocessor. In GLS ’99: Proceeding of the Ninth Great Lakes Symposium on VLSI, page 118, Washington, DC, 1999. IEEE Computer Society. [11] M. T. Niemier and P. M. Kogge. Exploring and exploiting wire-level pipelining in emerging technologies. In Proceedings 28th Annual International Symposium on Computer Architecture, pages 166–177, 2001. [12] M. T. Niemier and P. M. Kogge. Problems in designing with QCAs: Layout = timing. International Journal of Circuit Theory and Applications, 29:49–62, 2001. [13] P. Tougaw and C. Lent. Dynamic behaviour of quantum cellular automata. Journal of Applied Physics, 80(8):4722– 4736, 1996. [14] P. D. Tougaw and C. S. Lent. Logical devices implemented using quantum cellular automata. Journal of Applied Physics, 75(3):1818–1825, 1994. [15] K. Walus, T. Dysart, G. Jullien, and R. Budiman. QCADesigner: A rapid design and simulation tool for quantum-dot cellular automata. IEEE Transactions on Nanotechnology, 3(1):26–31, 2004. [16] R. Zhang, K. Walus, W. Wang, and G. A. Jullien. A method of majority logic reduction for quantum cellular automata. IEEE Transactions on Nanotechnology, 3(4):443– 450, 2004.

Figure 11. The QCADesigner simulation results for the serial comparator, implemented using majority gate and inverter as basic building blocks.

f (a, b)= and f (a, b)> . The last curve corresponds to the clock signal C0, which controls input and output cells. According to Fig. 2 the hold phase, i.e. the low state, of signal C0 determines the validity of data at inputs and outputs. The rest of the clock signals (C1, C2, C3) control the internal cells and are as such not relevant for the given analysis, hence not shown. To understand the behaviour presented in Fig. 11 a starting latency of 2 clock cycles has to be taken into account. The latency is determined by 8 clocking zones along the critical path from the inputs to the outputs. The comparison starts with an active low signal c when clock C0 is low. A successful reset is indicated by a high signal f (a, b)= and low signals f (a, b)< and f (a, b)> all delayed by 2 clock cycles from the start. Afterwards every hold phase of the C0 clock signal updates the state of the signals f (a, b)< , f (a, b)= and f (a, b)> indicating the intermediate relation between the numbers a and b. For example the least significant bits of both numbers are 0, which results as high f (a, b)= and low f (a, b)< and f (a, b)> . The end of the stream is determined by a low signal c. The complete comparison of two 7-bit numbers takes 9 clock cycles or 10 if we take reset into account. Thus for the comparison of two arbitrary n-bit numbers the QCA serial comparator needs n + 2 clock cycles or n + 3 with reset included, respectively.

5. Conclusion The paper presents a novel design of a QCA serial comparator that is able to compare two n-bit numbers. The proposed device is based inverters and majority gates. Employing them helped to reduce the number of logic elements and consequently the number of clocking zones along the critical path. This has resulted in a starting latency of two clock cycles, thus comparing two n-bit numbers takes n + 2 clock cycles, reaching almost a theoretical limit for serial com-

452

Lihat lebih banyak...

Comentários

Copyright © 2017 DADOSPDF Inc.