

# **CENICS 2013**

The Sixth International Conference on Advances in Circuits, Electronics and Microelectronics

ISBN: 978-1-61208-302-5

August 25-31, 2013

Barcelona, Spain

## **CENICS 2013 Editors**

Vladimir Privman, Clarkson University - Potsdam, USA

## **CENICS 2013**

## Foreword

The Sixth International Conference on Advances in Circuits, Electronics and Microelectronics [CENICS 2013], held between August 25-31, 2013 in Barcelona, Spain, continued a series of events initiated in 2008, capturing the advances on special circuits, electronics, and micro-electronics on both theory and practice, from fabrication to applications using these special circuits and systems. The topics cover fundamentals of design and implementation, techniques for deployment in various applications, and advances in signal processing.

Innovations in special circuits, electronics and micro-electronics are the key support for a large spectrum of applications. The conference focuses on several complementary aspects and targets the advances in each of them: signal processing and electronics for high speed processing, micro- and nano-electronics, special electronics for implantable and wearable devices, sensor related electronics focusing on low energy consumption, and special application domains of telemedicine and ehealth, bio-systems, navigation systems, automotive systems, home-oriented electronics, etc. These applications led to special design and implementation techniques, reconfigurable and self-reconfigurable devices, and require particular methodologies to be integrated on already existing Internet-based communications and applications. Special care is required for particular devices intended to work directly with the human body (implantable, wearable, eHealth), or in a human-close environment (telemedicine, house-oriented, navigation, automotive). The mini-size required by such devices confronted the scientists with special signal processing requirements.

We take here the opportunity to warmly thank all the members of the CENICS 2013 Technical Program Committee, as well as the numerous reviewers. The creation of such a high quality conference program would not have been possible without their involvement. We also kindly thank all the authors who dedicated much of their time and efforts to contribute to CENICS 2013. We truly believe that, thanks to all these efforts, the final conference program consisted of top quality contributions.

Also, this event could not have been a reality without the support of many individuals, organizations, and sponsors. We are grateful to the members of the CENICS 2013 organizing committee for their help in handling the logistics and for their work to make this professional meeting a success.

We hope that CENICS 2013 was a successful international forum for the exchange of ideas and results between academia and industry and for the promotion of progress in the field of circuits, electronics and micro-electronics.

We hope Barcelona provided a pleasant environment during the conference and everyone saved some time for exploring this beautiful city.

## **CENICS 2013 Chairs:**

Vladimir Privman, Clarkson University - Potsdam, USA

Sergey Y. Yurish, Technical University of Catalonia (UPC-Barcelona), Spain

Martin Horauer, University of Applied Sciences Technikum Wien, Austria

Adrian Muscat, University of Malta, Malta

## **CENICS 2013 Research/Industry Chairs**

Ravi M. Yadahalli, PES Institute of Technology & Management - Karnataka, India

## **CENICS 2013 Industry Liaison Chairs**

Falk Salewski, Lacroix Electronics, Germany

## **CENICS 2013 Publicity Chair**

Sandra Sendra Compte, Universidad Politécnica de Valencia, Spain

## **CENICS 2013 Special Area Chairs**

### Formalisms

Peeter Ellervee, Tallinn University of Technology, Estonia

### **Application-oriented**

Josu Etxaniz Marañon, University of the Basque Country / Universidad del País Vasco / Euskal Herriko Unibertsitatea - Bilbao, Spain

### Sensors

Yulong Zhao, Xi'an Jiaotong University, China

## **CENICS 2013**

## Committee

### **CENICS Advisory Chairs**

Vladimir Privman, Clarkson University - Potsdam, USA

Sergey Y. Yurish, Technical University of Catalonia (UPC-Barcelona), Spain

Martin Horauer, University of Applied Sciences Technikum Wien, Austria

Adrian Muscat, University of Malta, Malta

## **CENICS 2013 Research/Industry Chairs**

Ravi M. Yadahalli, PES Institute of Technology & Management - Karnataka, India

## **CENICS 2013 Industry Liaison Chairs**

Falk Salewski, Lacroix Electronics, Germany

## **CENICS 2013 Publicity Chair**

Sandra Sendra Compte, Universidad Politécnica de Valencia, Spain

## **CENICS 2013 Special Area Chairs**

### Formalisms

Peeter Ellervee, Tallinn University of Technology, Estonia

### **Application-oriented**

Josu Etxaniz Marañon, University of the Basque Country / Universidad del País Vasco / Euskal Herriko Unibertsitatea - Bilbao, Spain

### Sensors

Yulong Zhao, Xi'an Jiaotong University, China

## **CENICS 2013 Technical Program Committee**

Amir Shah Abdul Aziz, TM Research & Development, Malaysia

Said Al-Sarawi, The University of Adelaide, Australia

Mohammad Amin Amiri, Iran University of Science and Technology, Iran

Lotfi Bendaouia, ETIS-ENSEA, France

Yngvar Berg, Vestfold University College, Norway

Madhu Bhaskaran, RMIT University, Australia

Manuel José Cabral dos Santos Reis, University of Trás-os-Montes e Alto Douro, Portugal

Javier Calpe, University of Valencia, Spain

Jose Carlos Meireles Monteiro Metrolho, Polytechnic Institute of Castelo Branco, Portugal

David Cordeau, LAII-IUT Angoulême, France

Marc Daumas, Université de Perpignan, France

Javier Diaz-Carmona, Technological Institute of Celaya, Mexico

Gordana Jovanovic Dolecek, Institute INAOE - Puebla, Mexico

Peeter Ellervee, Tallinn University of Technology, Estonia

Ykhlef Fayçal, Centre de Développement des Technologies Avancées, Algeria

Sérgio Adriano Fernandes Lopes, Universidade do Minho, Portugal

Francisco V. Fernández, IMSE, CSIC and University of Sevilla, Spain

Joaquim Filipe, EST Setubal, Portugal

Julian Gardner, University of Warwick, U.K.

Luis Gomes, Universidade Nova de Lisboa, Portugal

David Greenhalgh, University of Strathclyde, U.K.

Petr Hanáček, Brno University of Technology, Czech Republic

Martin Horauer, University of Applied Sciences Technikum Wien, Austria

Chun-Hsi Huang, University of Connecticut, U.S.A.

Wen-Jyi Hwang, National Taiwan Normal University, Taiwan

Emilio Jiménez Macías, University of La Rioja, Spain

Anastasia N. Kastania, Athens University of Economics and Business, Greece

Kenneth Blair Kent, University of New Brunswick, Canada

Tomas Krilavicius, Vytautas Magnus University - Kaunas & Baltic Institute of Advanced Technologies - Vilnius, Lithuania

Junghee Lee, Georgia Institute of Technology, USA

Kevin Lee, Murdoch University, Australia

Alie Eldin Mady, University College Cork (UCC) - Cork, Ireland

Cesare Malagu', University of Ferrara and Istituto di acustica e sensoristica Orso Maria Corbino CNR-IDASC, Italy

José Carlos Metrôlho, Instituto Politécnico de Castelo Branco, Portugal

Bartolomeo Montrucchio, Politecnico di Torino, Italy

Adrian Muscat, University of Malta, Malta

Arnaldo Oliveira, Universidade de Aveiro, Portugal

Adam Pawlak, Silesian University of Technology - Gliwice, Poland

Angkoon Phinyomark, Prince of Songkla University, Thailand

Eduardo Correia Pinheiro, Instituto de Telecomunicações - Lisboa, Portugal

Anton Satria Prabuwono, Universiti Kebangsaan Malaysia, Malaysia

Vladimir Privman, Clarkson University - Potsdam, USA

Càndid Reig, University of Valencia, Spain

Marcos Rodrigues, Sheffield Hallam University, U.K.

Falk Salewski, Lacroix Electronics, Germany

Arvind K. Srivastava, NanoSonix Inc., USA

Ivo Stachiv, Institute of Physics, Academia Sinica - Taipei, Taiwan

Ephraim Suhir, University of California – Santa Cruz, USA

Ivo Stachiv, Institute of Physics - Academia Sinica, Taiwan

Felix Toran, European Space Agency, Germany

Francisco Torrens, Institut Universitari de Ciencia Molecular / Universitat de Valencia, Spain

Manuela Vieira, UNINOVA/ISEL, Portugal

Chin-Long Wey, National Central University, Taiwan

Ravi M. Yadahalli, PES Institute of Technology & Management - Karnataka, India

Jianhua (Joshua) Yang, Hewlett Packard Laboratories - Palo Alto, USA

Sergey Y. Yurish, IFSA, Spain

David Zammit-Mangion, University of Malta – Msida, Malta

## **Copyright Information**

For your reference, this is the text governing the copyright release for material published by IARIA.

The copyright release is a transfer of publication rights, which allows IARIA and its partners to drive the dissemination of the published material. This allows IARIA to give articles increased visibility via distribution, inclusion in libraries, and arrangements for submission to indexes.

I, the undersigned, declare that the article is original, and that I represent the authors of this article in the copyright release matters. If this work has been done as work-for-hire, I have obtained all necessary clearances to execute a copyright release. I hereby irrevocably transfer exclusive copyright for this material to IARIA. I give IARIA permission to reproduce the work in any media format such as, but not limited to, print, digital, or electronic. I give IARIA permission to distribute the materials without restriction to any institutions or individuals. I give IARIA permission to submit the work for inclusion in article repositories as IARIA sees fit.

I, the undersigned, declare that to the best of my knowledge, the article does not contain libelous or otherwise unlawful content, does not invade the right of privacy, and does not infringe on a proprietary right.

Following the copyright release, any circulated version of the article must bear the copyright notice and any header and footer information that IARIA applies to the published article.

IARIA grants royalty-free permission to the authors to disseminate the work, under the above provisions, for any academic, commercial, or industrial use. IARIA grants royalty-free permission to any individuals or institutions to make the article available electronically, online, or in print.

IARIA acknowledges that rights to any algorithm, process, procedure, apparatus, or articles of manufacture remain with the authors and their employers.

I, the undersigned, understand that IARIA will not be liable, in contract, tort (including, without limitation, negligence), pre-contract or other representations (other than fraudulent misrepresentations) or otherwise in connection with the publication of my work.

Exception to the above is made for work-for-hire performed while employed by the government. In that case, copyright to the material remains with the said government. The rightful owners (authors and government entity) grant unlimited and unrestricted permission to IARIA, IARIA's contractors, and IARIA's partners to further distribute the work.

## **Table of Contents**

| Title | Authors |
|-------|---------|
| High Speed and Ultra Low-Voltage CMOS Carry Propagation Chain using Floating-Gate Transistors | Yngvar Berg |
| Robustness of the Ultra Low-Voltage Domino Gates CMOS | Omid Mirmotahari and Yngvar Berg |
| High-frequency One-port Colpitts SAW Oscillator for Chemical Sensing | Sanju Thomas, Zoltan Racz, Marina Cole, and Julian Gardner |
| FPGA Implementation of Disparity Estimation Processing Architecture for Stereo Camera System | Hi-Seok Kim, Young-Hwan Kim, Sea-Ho Kim, and Choong-Mo Youn |

## High Speed and Ultra Low-Voltage CMOS Carry Propagation Chain using Floating-Gate Transistors

Yngvar Berg, Department of Informatics, University of Oslo, Oslo, Norway. Email: yngvarb@ifi.uio.no

Abstract: Ultra low-voltage (ULV) CMOS logic for high-performance applications is presented. By applying floating capacitors we can increase the current level of MOS transistors for supply voltages below 500mV. The current level of the transistors may be increased by a factor of 40 for supply voltages below 0.3V. Simple NAND gates are presented using different topologies. The NAND gates are exploited to provide a high-speed and ultra low-voltage serial carry chain. Compared to conventional serial CMOS carry gates the delay is reduced by a factor of 10 or more. Simulated data are based on the *SpectreS* simulator provided by *Cadence* and are valid for a 90nm TSMC CMOS process.

*Keywords*-CMOS; Low-Voltage; Carry; High-Speed; Floating-Gate; Pass Transistors.

#### I. INTRODUCTION

The demand for high-speed digital circuits is ever increasing. At the same time the supply voltages in modern CMOS processes are reduced in order to prevent transistor failure due to short channel effects. The low supply voltage is a challenge for high-speed circuit design. With the emergence of sensor and biomedical applications that require ultra low energy [1], [2], we have to adequately address design optimization near the minimum-energy point. It has long been established that for most logic families the minimum-energy point occurs in the subthreshold operational region of the MOS transistors and that the value of the minimum energy is set by leakage current.

Energy efficiency is one of the most required features for modern electronic systems designed for high-performance and/or portable applications. On the one hand, the ever increasing market segment of portable electronic devices demands the availability of low-power building blocks that enable the implementation of long-lasting battery-operated systems. On the other hand, the general trend of increasing operating frequencies and circuit complexity, in order to cope with the throughput needed in modern high-performance processing applications, requires the design of high-speed circuits. The supply voltage region addressed in this paper is between 200mV and 400mV. In this region traditional CMOS logic design is hampered by subthreshold currents, which reduce both the circuit speed and the robustness. In the approach presented in this paper the main transistors operate above the threshold voltage even for ultra low supply voltages. For a supply voltage equal to 300mV the delay of the proposed carry gates is reduced to less than 10% of that of a conventional CMOS carry gate, while the noise margin is increased due to an enhanced current level. The delay variation is reduced accordingly due to large driving currents. The energy required to drive the gates is comparable to conventional CMOS [3].

Addition is a fundamental arithmetic operation that is broadly used in many VLSI systems, such as application-specific digital signal processing (DSP) architectures and microprocessors. This module is the core of many arithmetic operations such as addition/subtraction, multiplication, division and address generation. Thus, the design of a full-adder having low power consumption and low propagation delay is of great interest for the implementation of modern digital systems. Arithmetic circuits, like adders and multipliers, are also among the basic components in the design of communication circuits.

The ultra low-voltage circuits presented in this paper are derived from digital floating-gate circuits [4]. In Section II, we introduce ultra low-voltage semi-floating-gate logic and compare the performance to conventional, i.e., complementary CMOS. ULV NP domino inverters are described and the potential delay reduction and increased performance in terms of minimum-energy-point (MEP) are presented. In Section III, we present novel ULV NAND and AND gates using enhanced pass transistors. Different ultra low-voltage carry gates are presented in Section IV. The high-speed ULV carry gates can be used in simple serial adders to reduce the complexity of parallel conventional CMOS adders for supply voltages below 500mV.

#### II. ULTRA-LOW-VOLTAGE SEMI-FLOATING-GATE LOGIC

The ULV logic styles presented in this paper are related to the ULV domino logic style presented in [5]. The main purpose of the ULV logic style is to increase the current level for low supply voltages without increasing the transistor widths. We may increase the current level compared to complementary CMOS using different initialization voltages to the gates and applying capacitive inputs. The extra load represented by the floating capacitors is less than the extra load given by increased transistor widths. The capacitive inputs lower the delay through increased transconductance while increased transistor widths only reduce parasitic delay. Furthermore, the accuracy of the capacitor values is not critical and the process variations and temperature will only have a significant impact on the transistor currents through the high relative transconductance in the subthreshold region. The ULV logic styles may be used in critical subcircuits where high speed and low supply voltage are required. The ULV logic styles may be used together with more conventional CMOS logic and will require a specialized CMOS process. The floating capacitors can be either poly-poly, metal-metal or MOS parasitics.

Figure 1: ULV domino inverters.

The simple dynamic edge and level ULV inverters [5] are shown in Figure 1. Apply a clock signal to power the inverter, i.e., either  $\phi$  to  $E_n$  and  $V_{DD}$  to  $E_p$, or  $\overline{\phi}$  to  $E_p$  and GND to  $E_n$, and precharge to 1 or 0 respectively. The gate resembles NP domino logic. In order to hold the precharged value until an input transition arrives, the E transistor connected to a supply voltage is made stronger than the other E transistor. The function of the inverter can be described as  $\overline{O} = \overline{AD}$.

The different ULV logic styles are defined by the applied terminal inputs as shown in Table I.  $\Delta V$  is the output voltage swing. The simple model for the noise margin NM' is given by the ratio of the ON current and the OFF current. The capacitive division factor  $\frac{C_{in}}{C_T}$, where  $C_T$  is the total capacitance seen by a floating gate, is assumed to be 0.5. The delay is relative to a standard complementary CMOS inverter. The ON and OFF currents of a complementary CMOS inverter are given by the effective gate-source voltages  $V_{DD}$  and 0V respectively. With  $\frac{C_{in}}{C_T} = 0.5$  we may estimate the delay, dynamic and static power and noise margins of the different ULV logic styles relative to a complementary CMOS inverter.
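The effective gate-source voltages listed in Table I can be reproduced with a small sketch. It assumes, as an interpretation that happens to match the table rather than a model stated in the paper, that the effective gate-source voltage is a baseline of $V_{DD}$ shifted up or down by the input swing scaled with the capacitive division factor. The function name `effective_gate_voltages` and the normalisation to $V_{DD}$ are illustrative choices.

```python
from fractions import Fraction


def effective_gate_voltages(swing, k=Fraction(1, 2)):
    """Return (V_gs ON, V_gs OFF, NM') in units of V_DD.

    swing: magnitude of the voltage swing |dV| in units of V_DD.
    k:     capacitive division factor C_in / C_T (0.5 in the text).
    Assumption: the baseline effective gate-source voltage is V_DD
    and the swing is coupled onto the floating gate through C_in.
    """
    v_on = 1 + k * swing
    v_off = 1 - k * swing
    return v_on, v_off, v_on - v_off


# Swing magnitudes taken from the first column of Table I.
styles = {"DU": Fraction(1, 2), "DNU": Fraction(1), "DPU": Fraction(1)}
table = {name: effective_gate_voltages(s) for name, s in styles.items()}

for name, (v_on, v_off, nm) in table.items():
    print(f"{name}: V_gs(ON) = {v_on} V_DD, "
          f"V_gs(OFF) = {v_off} V_DD, NM' = {nm} V_DD")
```

Under this reading, the DU style gets a boost of only $V_{DD}/4$ because its swing is half the supply, whereas the precharged DNU and DPU styles gain $V_{DD}/2$.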



Figure 2: Performance.

The ULV domino inverters can be exploited to improve speed for ultra low supply voltages. Relative delay, Power-Delay-Product (PDP) and Energy-Delay-Product (EDP) for the ULV inverter compared to complementary CMOS are shown in Figure 2. The delay for supply voltages below 0.4V can be reduced to less than 5%.



Figure 3: Energy delay trade-off, complementary CMOS and ULV.

Table I: ULV LOGIC STYLES

| $\Delta V$             | $E_p$             | $E_n$  | $V_{gs}$ for $I_{ON}$ | $V_{gs}$ for $I_{OFF}$ | NM'                | Relative delay | Style | Comment          |
|------------------------|-------------------|--------|-----------------------|------------------------|--------------------|----------------|-------|------------------|
| $\pm \frac{V_{DD}}{2}$ | $\overline{\phi}$ | $\phi$ | $\frac{5V_{DD}}{4}$   | $\frac{3V_{DD}}{4}$    | $\frac{V_{DD}}{2}$ | $\approx 10\%$ | DU    | Dynamic          |
| $V_{DD}$               | $\overline{\phi}$ | GND    | $\frac{3V_{DD}}{2}$   | $\frac{V_{DD}}{2}$     | $V_{DD}$           | $\approx 5\%$  | DNU   | Dynamic prech. 0 |
| $-V_{DD}$              | $V_{DD}$          | $\phi$ | $\frac{3V_{DD}}{2}$   | $\frac{V_{DD}}{2}$     | $V_{DD}$           | $\approx 5\%$  | DPU   | Dynamic prech. 1 |

The optimum energy-delay trade-off in standard CMOS logic, shown in Figure 3, is traditionally close to the minimum-delay point (MDP). The region left of the optimum energy-delay curve is not feasible for conventional or standard CMOS logic. By applying a current boost provided by the ULV logic, the optimum energy-delay curve is moved to the left in Figure 3, which yields a significant reduction in delay for low supply voltages or a specific energy level. The improvement of the current boost can be exploited in different ways. Firstly, we can obtain a significant speed improvement for a specific ultra-low supply voltage. For example, assuming a supply voltage equal to 330mV, the delay can be reduced to approximately 5% of that of standard CMOS at the same energy level. Secondly, if a system is required to operate at a certain speed, we can meet the speed requirement at a lower supply voltage with the ULV logic than with standard CMOS. Assuming a gate delay equal to 40ps, the required supply voltage for standard CMOS is 500mV whereas the required supply voltage for the ULV logic is 250mV. The energy consumed by the ULV logic will be approximately 25% of that consumed by standard CMOS. Hence, the ULV transistors can be exploited for ultra low-voltage and high-speed design and for ultra low-voltage low-energy design.
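The 25% energy figure quoted for the 40ps example is consistent with a simple quadratic scaling of switching energy with supply voltage. The sketch below assumes $E \propto C V_{DD}^2$ with the same switched capacitance for both logic styles; the paper does not state this model explicitly, and the extra load of the floating capacitors is ignored. The function name is hypothetical.

```python
def relative_switching_energy(v_ulv_mv, v_cmos_mv):
    """Ratio of switching energies, assuming E ~ C * V^2
    and equal switched capacitance in both logic styles."""
    return (v_ulv_mv / v_cmos_mv) ** 2


# Supply voltages quoted in the text for a 40 ps gate delay.
ratio = relative_switching_energy(250, 500)
print(f"ULV energy relative to standard CMOS: {ratio:.0%}")
```

Halving the supply voltage at a fixed delay target therefore accounts for the factor of four in energy on its own under this assumption.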

#### III. ULV AND AND NAND GATES

Different applications of the ULV domino inverter are shown in Figure 4. The inverters can be used to implement AND2 and NAND2 functions by using one of the inputs to set the precharge level. The gates in Figure 4 can be described as a pass transistor with an increased current level. The delay of the gates is dependent on the input delay. If we consider the gate in a) we observe that the evaluate transistor  $E_n$  acts as a pass transistor for the input  $\overline{B}$  when  $\overline{B}$  switches from 1 to 0 and the other input A switches from 0 to 1. The delay of the AND2 and NAND2 gates is less than 8% of that of a standard complementary NAND gate for supply voltages below 500mV. For a supply voltage equal to 300mV the delay of a NAND2 gate is 310ps whereas the delay for a complementary NAND2 gate is more than 6ns.

The different configurations of the AND/NAND ULV gates are given in Table II. Consider the NAND gate in Figure 5 a). The response of the gate depends on the clock signals and inputs:

- Figure 5 b)  $\phi = 1$, precharge. The output is precharged to 1 and the gates of the evaluate transistors  $E_n$  and  $E_p$  are recharged to 0 and 1 respectively. The input A is precharged to 0 and the input  $\overline{B}$  is precharged to 1.
- Figure 5 c)  $\phi = 1$  and A = 0 (no change) and  $\overline{B} = 1$  (no change). The gates of the evaluate transistors are not changed and the output remains high.
- Figure 5 d)  $\phi = 1$  and A = 0 (no change) and  $\overline{B} = 0$  (transition). The output remains high due to a strong pMOS evaluate transistor  $E_p$.
- Figure 5 e)  $\phi = 1$  and A = 1 (transition) and  $\overline{B} = 1$  (no change). The nMOS evaluate transistor  $E_n$  is enhanced and the pMOS evaluate transistor  $E_p$  is weakened. The output will not be affected by this condition.
- Figure 5 f)  $\phi = 1$  and A = 1 (transition) and  $\overline{B} = 0$  (transition). The nMOS evaluate transistor  $E_n$  is enhanced and the pMOS evaluate transistor  $E_p$  is weakened, and the output will be pulled to GND (0) by the enhanced nMOS evaluate transistor.

Figure 4: ULV domino AND and NAND gates.
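The behaviour described in these cases, and the four configurations of Table II, can be summarised in a purely logical sketch. It assumes that the output leaves its precharge level only when the rising input has switched to 1 and the falling input has switched to 0; timing, drive strength and leakage are not modelled. The names `ulv_gate`, `CONFIGS` and `evaluate` are illustrative and do not come from the paper.

```python
from itertools import product


def ulv_gate(precharge, rising_input, falling_input):
    """Behavioural model: the output flips away from its precharge
    level only if the rising input is 1 and the falling input is 0."""
    fired = rising_input == 1 and falling_input == 0
    return 1 - precharge if fired else precharge


# Configuration -> (output precharge level, wiring of A and B).
# Each wiring returns (rising input, falling input) for logical A, B.
CONFIGS = {
    "a": (1, lambda a, b: (a, 1 - b)),  # inputs A and not-B, NAND
    "b": (1, lambda a, b: (b, 1 - a)),  # inputs B and not-A, NAND
    "c": (0, lambda a, b: (a, 1 - b)),  # inputs A and not-B, AND
    "d": (0, lambda a, b: (b, 1 - a)),  # inputs B and not-A, AND
}


def evaluate(config, a, b):
    precharge, wiring = CONFIGS[config]
    rising, falling = wiring(a, b)
    return ulv_gate(precharge, rising, falling)


# Outputs for (A, B) = (0,0), (0,1), (1,0), (1,1).
results = {
    c: [evaluate(c, a, b) for a, b in product((0, 1), repeat=2)]
    for c in CONFIGS
}
for config, outputs in results.items():
    print(config, outputs)
```

In this model the precharge level alone decides whether the same transistor arrangement realises NAND (precharge to 1) or AND (precharge to 0).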

Table II: DIFFERENT CONFIGURATIONS

| Configuration | Input 1                       | Input 2                       | Output $O$ | Function |
|---------------|-------------------------------|-------------------------------|------------|----------|
| a)            | $A$ = 0                       | $\overline{B}$ = 0↓           | 1          | NAND     |
| a)            | $A$ = 0                       | $\overline{B}$ = 1            | 1          | NAND     |
| a)            | $A$ = 1↑                      | $\overline{B}$ = 0↓           | 0↓         | NAND     |
| a)            | $A$ = 1↑                      | $\overline{B}$ = 1            | 1          | NAND     |
| b)            | $\overline{A}$ = 0↓           | $B$ = 0                       | 1          | NAND     |
| b)            | $\overline{A}$ = 0↓           | $B$ = 1↑                      | 0↓         | NAND     |
| b)            | $\overline{A}$ = 1            | $B$ = 0                       | 1          | NAND     |
| b)            | $\overline{A}$ = 1            | $B$ = 1↑                      | 1          | NAND     |
| c)            | $A$ = 0                       | $\overline{B}$ = 0↓           | 0          | AND      |
| c)            | $A$ = 0                       | $\overline{B}$ = 1            | 0          | AND      |
| c)            | $A$ = 1↑                      | $\overline{B}$ = 0↓           | 1↑         | AND      |
| c)            | $A$ = 1↑                      | $\overline{B}$ = 1            | 0          | AND      |
| d)            | $\overline{A}$ = 0↓           | $B$ = 0                       | 0          | AND      |
| d)            | $\overline{A}$ = 0↓           | $B$ = 1↑                      | 1↑         | AND      |
| d)            | $\overline{A}$ = 1            | $B$ = 0                       | 0          | AND      |
| d)            | $\overline{A}$ = 1            | $B$ = 1↑                      | 0          | AND      |

An alternative, simplified NAND gate using a ULV pass transistor is shown in Figure 6 a). The NAND gate in recharge/precharge mode, i.e., output precharged to 1 and inputs B and  $\overline{A}$  precharged to 0 and 1, is shown in b), and the simplified circuit equivalents for different inputs are shown in Figure 6 c) to f). For input  $\overline{A} = 1$  (no change), shown in c) and d), the output will remain high. Details of operation are:

- Figure 6 c) B = 0 (no change) and  $\overline{A} = 1$  (no change). The output remains high unless some other circuitry pulls the output down to 0 by an enhanced nMOS transistor. If so, the gate of the evaluate transistor will be pulled to 0 and turn off the evaluate transistor E in order to prevent the output from affecting the input  $\overline{A}$.
- Figure 6 d) B = 1 (transition) and  $\overline{A} = 1$  (no change). The evaluate transistor E is enhanced and the output will remain high unless some other circuitry pulls the output down to 0 by an enhanced nMOS transistor. If so, the gate of the enhanced evaluate transistor will be pulled to 0 and turn off the evaluate transistor E in order to prevent the output from affecting the input  $\overline{A}$.
- Figure 6 e) B = 0 (no change) and  $\overline{A} = 0$  (transition). The evaluate transistor E is not enhanced and the output may be pulled to 0 or remain high depending on other circuitry. If there is no other transistor driving the output, the output will be pulled slowly towards 0. The time constant for this pull-down is, however, much higher than the time constant for an active pull-down by an enhanced evaluate transistor.
- Figure 6 f) B = 1 (transition) and  $\overline{A} = 0$  (transition). The output is pulled to 0 by the enhanced evaluate transistor E.
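The four cases for the pass transistor gate can be captured in a short lookup-style sketch. It is a qualitative reading of the list for Figure 6 c) to f), assuming that no other circuitry drives the output; the labels `"held"`, `"slow"` and `"fast"` and the function name `pass_gate_output` are hypothetical and only describe the relative pull-down strength.

```python
def pass_gate_output(b, a_bar):
    """Qualitative output of the ULV pass transistor NAND gate.

    b:     input B after evaluation (precharged to 0).
    a_bar: input not-A after evaluation (precharged to 1).
    Returns (logic level the output tends to, drive description),
    assuming the output was precharged to 1 and is not driven
    by any other circuitry.
    """
    enhanced = b == 1  # a rising B boosts the evaluate transistor E
    if a_bar == 1:
        return 1, "held"  # cases c) and d): output stays high
    if enhanced:
        return 0, "fast"  # case f): active pull-down through E
    return 0, "slow"      # case e): weak pull-down, long time constant


cases = {
    "c": pass_gate_output(0, 1),
    "d": pass_gate_output(1, 1),
    "e": pass_gate_output(0, 0),
    "f": pass_gate_output(1, 0),
}
for name, (level, drive) in cases.items():
    print(f"Figure 6 {name}): output -> {level} ({drive})")
```

Case e) is the one that relies on timing: logically the NAND output should stay at 1, so the slow discharge must be much slower than the evaluation phase, which is what the time-constant remark in the list expresses.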

#### IV. ULV CARRY GATES

We can use the NAND gates shown in Figure 4 to implement a high-speed ULV carry function. By combining three NAND2 gates we obtain the carry defined by  $\overline{C_{out}} = \overline{AB + AC_{in} + BC_{in}}$. Furthermore, we may assume that the inputs A and B will arrive prior to the carry input signal in a serial carry propagation chain. By applying the A and B inputs to the source of the evaluate transistors we minimize the worst case propagation delay. If  $A \neq B$  the carry output is only dependent on the carry input, which is the worst case scenario for a serial adder.

Figure 5: Simplified circuit equivalents for the NAND gate.

Table III: TRUTH TABLE FOR CARRY GATE

| $C_{in}$ | $A$ | $B$ | $\overline{A}$ | $\overline{B}$ | $\overline{C_{out}}$ | $C_{out}$ |
|----------|-----|-----|----------------|----------------|----------------------|-----------|
| 0        | 0   | 0   | 1              | 1              | 1                    | 0         |
| 0        | 0   | 1↑  | 1              | 0↓             | 1                    | 0         |
| 0        | 1↑  | 0   | 0↓             | 1              | 1                    | 0         |
| 0        | 1↑  | 1↑  | 0↓             | 0↓             | 0↓                   | 1↑        |
| 1↑       | 0   | 0   | 1              | 1              | 1                    | 0         |
| 1↑       | 0   | 1↑  | 1              | 0↓             | 0↓                   | 1↑        |
| 1↑       | 1↑  | 0   | 0↓             | 1              | 0↓                   | 1↑        |
| 1↑       | 1↑  | 1↑  | 0↓             | 0↓             | 0↓                   | 1↑        |
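The carry function and Table III can be checked exhaustively with a small logical sketch. It assumes that the three NAND2 gates share the precharged $\overline{C_{out}}$ node, so that any gate pulling low pulls the node low (a wired-AND of the three NAND outputs); this wiring is an assumption made for the illustration, and the helper names are hypothetical.

```python
from itertools import product


def nand2(x, y):
    return 1 - (x & y)


def carry_out_bar(a, b, c_in):
    """Inverted carry: the shared node stays high only if none of
    the three NAND2 gates pulls it low (assumed wired-AND)."""
    return nand2(a, b) & nand2(a, c_in) & nand2(b, c_in)


def majority(a, b, c_in):
    """Reference carry of a full adder."""
    return int(a + b + c_in >= 2)


rows = []
for c_in, a, b in product((0, 1), repeat=3):
    c_out = 1 - carry_out_bar(a, b, c_in)
    rows.append((c_in, a, b, c_out))
    print(f"C_in={c_in} A={a} B={b} -> C_out={c_out}")

# The gate realises the full-adder carry for all eight inputs.
matches_majority = all(c == majority(a, b, ci) for ci, a, b, c in rows)
# When A != B the carry output simply follows the carry input.
propagates = all(c == ci for ci, a, b, c in rows if a != b)
print(matches_majority, propagates)
```

The second check corresponds to the worst case named in the text: with $A \neq B$ the output cannot be resolved until the carry input arrives, so the chain delay is set by the carry path.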

A ULV carry gate is shown in Figure 7. By using three ULV NAND gates from Figure 4 we can implement the carry function.



Figure 6: NAND ULV pass transistor gates.



Figure 7:  $\overline{C_{out}} = \overline{AB + AC_{in} + BC_{in}}$ .



Preliminary simulation data for the ULV carry chain using the carry gates in Figures 7 and 8 are shown in Figure 9. The clock signals  $\phi$  and  $\overline{\phi}$  are provided by large complementary inverters, and the large fall and rise times of these signals compared to those of the ULV carry gate are evident. The delay through a complementary inverter is 2.2ns and the delay for a complementary carry gate is more than 5ns for a supply voltage equal to 300mV. The delay for the ULV carry chain is 175ps for bit 1, 501ps for bit 2 and approximately 650ps for the following bits.

Figure 8:  $C_{out} = AB + AC_{in} + BC_{in}$.

Figure 9: Preliminary simulation data for the ULV carry chain.

Figure 10: Alternative carry gate using pass transistor.

An alternative carry gate exploiting the NAND gates in Figure 6 is shown in Figure 10. This gate has fewer transistors and less capacitive input load than the carry gate in Figure 7. The carry input signal is used to initiate the precharge of the output, hence the output will be precharged to 1 until the input carry signal arrives unless the carry output is pulled to 0 by A = B = 1. Note that the carry signal is only applied to recharge transistors and input capacitors and not as a pass input. The carry input is precharged to 0, which turns on the recharge transistors. In the same phase, the output is precharged to 1 while the inputs  $\overline{A}$  and  $\overline{B}$  are precharged to 1.

#### V. CONCLUSION

The potential of ultra low-voltage domino and pass transistor CMOS has been presented. Different NAND and AND gates have been presented and applied in ULV carry chains. Preliminary simulation results show a potential speed improvement of the proposed carry gates compared to complementary CMOS by a factor of 10 or more. Simulated data are based on the *SpectreS* simulator provided by *Cadence* and are valid for a 90nm TSMC CMOS process.

#### REFERENCES

- [1] A.P. Chandrakasan, S. Sheng and R.W. Brodersen, "Low-power CMOS digital design", IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992.
- [2] N. Verma, J. Kwong and A.P. Chandrakasan, "Nanometer MOSFET Variation in Minimum Energy Subthreshold Circuits", IEEE Transactions on Electron Devices, vol. 55, no. 1, pp. 163-174, January 2008.
- [3] Y. Berg and O. Mirmotahari, "Novel high-speed and ultra-low-voltage CMOS NAND and NOR domino gates", in CENICS 2012: The Fifth International Conference on Advances in Circuits, Electronics and Microelectronics, International Academy, Research and Industry Association (IARIA), ISBN 978-1-61208-213-4, pp. 5-10.
- [4] Y. Berg, D.T. Wisland and T.S. Lande, "Ultra Low-Voltage/Low-Power Digital Floating-Gate Circuits", IEEE Transactions on Circuits and Systems, vol. 46, no. 7, pp. 930-936, July 1999.
- [5] Y. Berg, "Novel Ultra Low-Voltage and High Speed Domino CMOS Logic", in Proc. IEEE/IFIP International Conference on VLSI and System-on-Chip (VLSI-SoC), Madrid, 27-29 September 2010.

## **Robustness of the Ultra Low-Voltage Domino Gates CMOS**

Omid Mirmotahari and Yngvar Berg, Nanoelectronics System Group, Department of Informatics, University of Oslo, Oslo, Norway. Email: omidmi@ifi.uio.no, yngvarb@ifi.uio.no

Abstract—In this paper, we elaborate on the dimensioning of the ultra low voltage gate with keeper. We compare the gate configuration to ULV5 and demonstrate the potential and weaknesses of the new gate configuration with the keeper. We also pinpoint the crucial signal paths (mainly regarding the clock drivers) while also providing an overview of the propagation through a chain of gates.

Keywords-NP domino logic; ultra low voltage; floating-gate; CMOS, high-speed; clock drivers; 90 nm process.

#### I. INTRODUCTION

For decades, technology has been driven by scaling the size of the transistor. We have evolved a fabrication process to reduce the size from several micrometres ( $\mu$ m) to tens of nanometres (nm). The heart of technology is the transistor and it has been one of the key components that has allowed the plethora of portable electronic gadgets that enrich our everyday lives. Unfortunately, millions of transistor chips fabricated using modern processes suffer from very low yields (<50%) [1]. Meanwhile, the consumer market has dramatically increased demand for sophisticated portable electronics, such as handheld computers and smart-phones.

Portable electronics drive the need for low power and low voltage due to a limited budget set by a fixed maximum battery mass. Soon, we will see research toward embedded circuits in human bodies and the need to harvest energy will become more evident. Several approaches exist to lower the energy consumption. One of the most fundamental and effective approaches is to lower the supply voltage [2], [3], [4]. When the supply voltage is reduced to hundreds of millivolts, it is known as Ultra Low Voltage (ULV) [5], [6]. However, scaling of the supply-voltage has an adverse effect on the speed of operation of the design. The main challenge is to obtain high speed at supply-voltages that are as low as possible. To maintain good response times at ultra low supply voltages, the threshold voltages of the transistors must also be reduced [7]. Unfortunately, this requires a change to the CMOS fabrication process. The multiple- $V_{dd}$  technique has been proposed for low voltage high performance circuit designs [8] without the need to change the fabrication process. Floating-Gates (FG) have also been proposed for ULV and Low Power (LP) logic [9]. Unfortunately, modern processes face significant gate



Figure 1. Low power ULV inverter based on domino logic with floating-gate. This gate configuration is known as ULV5 (the fifth modification). The symbols are recharge transistor ( $R_p$  and  $R_n$ ), evaluation transistor ( $E_p$  and  $E_n$ ), keeper transistor ( $K_p$  and  $K_n$ ), clock signal ( $\phi$ ).

leakage due to the thin oxide. A ULV floating-gate inverter employing a frequent recharge technique has shown good properties for achieving high speed at ultra low voltages [10]. Even though the ULV gate has shown good performance, it also has limitations due to the leakage at the semi-floating-gates (SFG). A differential ULV gate has been proposed which includes a keeper function [11]; it is argued to have the speed of a ULV gate but the stability of a standard CMOS gate.

In this paper, we elaborate more on the attributes of the ULV gate and its modification to resemble a precharge logic. Furthermore, we discuss the inclusion of a keeper transistor that enables reduction of the static power consumption. The main aim of this paper is to evaluate the advantages and reliability that the modification allows in terms of delay response, transistor matching and power consumption. We also discuss secondary effects such as clock driver dimensioning.

The structure of this paper is as follows: in Section II, the ULV5 and the keeper transistor modification of the ULV structure are presented. In Section III, a discussion of the results achieved is given. Finally, the paper concludes by



Figure 2. The ULV5 gate is modified by adding a keeper transistor ( $KE_n$  and  $KE_p$ ) to the evaluation transistors, highlighted in the gray box.

highlighting the optimal design parameters. The simulation results demonstrated throughout this paper were obtained using a simulation produced in a TSMC 90 nm process environment provided by Cadence.

#### II. ULTRA LOW VOLTAGE GATE WITH KEEPER

The ULV gates have been presented in five evolution steps, from ULV1 to ULV5 [12], [13]. The most recent modification of ULV5 is presented in [12], with an additional improvement made by adding a keeper transistor to the evaluation transistor in the N- and P-domino logic gates. The ULV5 without the keeper transistor is illustrated in Figure 1, while modification of the gate with the keeper transistor is illustrated in Figure 2. The keeper transistors ( $KE_n$  and  $KE_p$ ) are highlighted in gray. Considering Figure 2(a), the transistor  $KE_n$  would contribute to weaken the pull-down transistor ( $E_n$ ) when the output ( $V_{out}$ ) is to be kept high ( $V_{dd}$ ). The signal flow for a precharge 1 domino inverter with the keeper transistors would be as follows, for input with:

- (a) **Non-transition** The output is to be kept high. The KE<sub>n</sub> would be turned off, the gate of P<sub>p</sub> would be low (0) and hence V<sub>out</sub> is held high through P<sub>p</sub> to V<sub>dd</sub>. The feedback from the keeper transistor KE<sub>n</sub> would pull the floating-gate at the input of E<sub>n</sub> to the source of the KE<sub>n</sub>, which, at the time, would be 0 (φ = 0). Given time, the keeper transistor KE<sub>n</sub> would turn the E<sub>n</sub> completely off and thereby significantly lower the static power consumption during the evaluation period.
- (b) **Positive transition** The output would, as a result of the input transition, be pulled towards 0 through the  $E_n$ -transistor. As the output  $V_{out}$  decreases, the feedback keeper  $KP_p$ -transistor would increasingly turn on and contribute to the shut-down of the pull-up transistor  $P_p$ .



Figure 3. Simulation of a ULV5 with keeper. The plot shows the current dissipation through the  $E_n$  for the input signal with no transition (presented in Section II(a)).

In turn, this would increase the speed of pulling  $V_{out}$  to 0. The more  $V_{out}$  is lowered, the weaker the keeper transistor  $KE_n$  contribution to the floating-gate at the  $E_n$ .

The main improvement made by adding the keeper transistor is the significant decrease of static power consumption. This is primarily due to the shut-down of the evaluation transistor, which competes with the precharge transistor during the evaluation period for a non-transition input. The current consumption during a clock cycle for a nontransition input for both the ULV5 and the ULV5 with keeper configurations is shown in Figure 3. The simulation results show that the keeper configuration has a current dissipation factor approximately 10,000 times lower through the  $E_n$ .



Figure 4. The simulation of a ULV5 with keeper. The plot shows the current dissipation through the  $E_n$  for the input signal with a positive transition (Section II(b)).



Figure 5. Parametric simulation with changes based on the width of the precharge transistors. The plot shows the evaluation period for a ULV5 (without keeper), and its ability to hold the output ( $V_{out}$ ) for the case of a non-transition at the input (Section II(a)).

During the precharge period, the current dissipations are different due to the starting point of the floating-gates and the specific DC voltage at the output  $V_{out}$. For a positive input transition the current dissipation is equal, hence the high speed of the gate is ensured. The simulation result for the gate with a positive transition is shown in Figure 4. The ULV5 with keeper configuration is therefore a great improvement in terms of static power, while all other beneficial attributes are maintained compared with standard CMOS. In the following section, we elaborate on the details of the different elements within the gate. We aim to analyse which dimensions give the best overall effect.

Table I The trade-off for delay and load for the ULV5 with keeper.

| $C_i$ (µm) | In 50% (ns) | Out 50% (ns) | Delay (ns) |
|------------|-------------|--------------|------------|
| 0.5        | -           | -            | -          |
| 1.0        | 5.81        | 7.30         | 1.49       |
| 1.5        | 5.58        | 5.96         | 0.38       |
| 2.0        | 5.47        | 5.73         | 0.26       |
| 2.5        | 5.41        | 5.62         | 0.21       |
| 3.0        | 5.38        | 5.56         | 0.18       |
| 3.5        | 5.35        | 5.51         | 0.16       |
| 4.0        | 5.33        | 5.48         | 0.15       |
| 4.5        | 5.32        | 5.46         | 0.14       |
| 5.0        | 5.31        | 5.44         | 0.13       |
| 5.5        | 5.30        | 5.43         | 0.13       |
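As a consistency check on Table I, the delay column can be recomputed as the difference between the output and input 50% crossing times. The following is only an illustrative Python sketch, not part of the original simulation flow; the names `TABLE_I`, `gate_delay` and `check_table` are introduced here for illustration.

```python
# Rows of Table I: (C_i, input 50% crossing [ns], output 50% crossing [ns], reported delay [ns]).
# The C_i = 0.5 row is omitted because the table reports no values for it.
TABLE_I = [
    (1.0, 5.81, 7.30, 1.49),
    (1.5, 5.58, 5.96, 0.38),
    (2.0, 5.47, 5.73, 0.26),
    (2.5, 5.41, 5.62, 0.21),
    (3.0, 5.38, 5.56, 0.18),
    (3.5, 5.35, 5.51, 0.16),
    (4.0, 5.33, 5.48, 0.15),
    (4.5, 5.32, 5.46, 0.14),
    (5.0, 5.31, 5.44, 0.13),
    (5.5, 5.30, 5.43, 0.13),
]


def gate_delay(t_in_50, t_out_50):
    """Propagation delay as the difference of the 50% crossing times (ns)."""
    return round(t_out_50 - t_in_50, 2)


def check_table(rows):
    """Return the rows whose reported delay differs from out - in."""
    return [row for row in rows if abs(gate_delay(row[1], row[2]) - row[3]) > 1e-9]


mismatches = check_table(TABLE_I)
print("rows inconsistent with delay = out - in:", mismatches)
```

The table also shows diminishing returns: beyond roughly the 2.5 row, each further step in  $C_i$  changes the delay by only a few hundredths of a nanosecond.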



Figure 6. Parametric simulation for the ULV5 with keeper. The width of the precharge transistors are parameterised for the case of a non-transition at the input during an evaluation period. This plot is based on the same simulation environment as for the ULV5 without the keeper, shown in Figure 5.

#### III. SIMULATION RESULTS AND DISCUSSION

We start by elaborating on the effect of the capacitive input to the gates. The dimension of the input capacitance  $(C_i)$  directly affects the gate by attenuating the transition (the input voltage swing) and thus increasing the delay. The lower  $C_i$  is, the more significant the role the parasitic gate capacitances play. The higher  $C_i$  is, the higher the load  $(C_{\rm L})$  that burdens the previous gate. Therefore, there is a trade-off between the amount of parasitic capacitance and the load. Our parametric simulation for the gate with keeper configuration, with  $C_i$  as the parameter, is shown in Table I, with focus on the gate delay. From the table, we have chosen to use a 2.5 fF input capacitance. The N- and P-domino gates have different input transistors, nMOS and pMOS respectively; therefore considerations must be made regarding matching of the nMOS and pMOS transistors for both evaluation and precharge. At these low supply voltages  $(V_{dd})$, the nMOS and pMOS transistors have different carrier mobilities. Matching of the nMOS and pMOS for a standard CMOS inverter is shown in Table II. The dimensions of the evaluation transistors are preferably kept at a minimum, especially concerning matching of the  $C_i$. This leads to the dimensioning of the precharge transistors, since these directly affect the matching of the evaluation transistors and the precharge through relative values. The other side-effect of changing the precharge transistors is the fact that the



Figure 7. Simulation plot regarding the delay of the clock drivers dependency on the width of the nMOS of the clock driver.

precharge delay would be either longer or shorter. The delay in precharge (isolated) is of no concern because we can use a skew clocking strategy, and the only aspect of importance is the DC value of the level, either  $V_{\rm dd}$  or ground. The other consideration when dimensioning the precharge transistors is their ability to hold the signal. This is of particular importance for the ULV5 configuration. As shown in Figure 5, the larger the width of the precharge transistor, the better and the longer it holds the output value in the case of a non-transition. For the ULV5 with the keeper configuration, shown in Figure 6, we see that holding of the value is of no concern. We would particularly like to stress that a ULV5 with keeper can have a precharge transistor width as small as 100 nm. In this case, we are able to reduce the overall area consumption. One potential issue which needs to be addressed is that, contrary to

| pMOS width ($\mu$m) | nMOS width ($\mu$m) |
|---------------------|---------------------|
| 0.50                | 0.57                |
| 1.00                | 1.15                |
| 2.00                | 2.95                |
| 3.00                | 4.85                |
| 4.00                | 6.80                |
| 5.00                | 8.80                |
| 6.00                | 11.00               |

Table II Matching of the nMOS and pMOS for a standard CMOS inverter gate, simulated within the same conditions as the rest of the gates in this paper. The supply voltage is set to 300 mV.
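To make the trend in Table II easier to read, the nMOS/pMOS width ratio can be computed for each matched pair. This is a minimal illustrative sketch in Python; the names `TABLE_II` and `width_ratios` are not from the paper, and the numbers are copied directly from the table.

```python
# Matched widths from Table II as (pMOS width [um], nMOS width [um]).
TABLE_II = [
    (0.50, 0.57),
    (1.00, 1.15),
    (2.00, 2.95),
    (3.00, 4.85),
    (4.00, 6.80),
    (5.00, 8.80),
    (6.00, 11.00),
]


def width_ratios(rows):
    """Return the nMOS/pMOS width ratio for each matched pair."""
    return [n_width / p_width for p_width, n_width in rows]


ratios = width_ratios(TABLE_II)
for (p_width, n_width), ratio in zip(TABLE_II, ratios):
    print(f"pMOS {p_width:.2f} um, nMOS {n_width:.2f} um, nMOS/pMOS = {ratio:.2f}")
```

The ratio is not constant: it grows with the pMOS width across the tabulated range, so a single fixed scaling factor would not reproduce the matching reported in the table.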



Figure 8. Simulation plot for the power dissipation through the clock driver for the specified gate based on the parameterised width of the nMOS.

CMOS domino logic, the clock signals (drivers) play a more dominant role. The clock signals are not only connected to the gate node of the transistors, but also to the drain/source of the evaluation transistors, specifically to the  $E_n$,  $E_p$,  $KE_n$  and  $KE_p$  transistors. Hence the clock drivers must be strong enough not to become a bottleneck for the evaluation. The most crucial path is for the  $E_{\rm n}$  and  $E_{\rm p}$. In Figure 9, the critical path is shown in red. In the case shown in Figure 9(a), an evaluation for a positive transition at the input, the evaluation transistor  $E_n$  must pull the output  $V_{out}$  down to ground. The critical path shows that the nMOS of the clock driver is a potential bottleneck; it must be large enough to sink all the current and at the same time not increase the overall power dissipation. Our simulation results indicate that the optimal size for the nMOS width of the clock driver is 4.0 $\mu$m, with a length of 200 nm. The pMOS is kept minimum, thus the pull-up and pull-down of the clock driver are skew matched. Figure 7 shows the effect on the gate delay of increasing the width, while Figure 8 illustrates the power consumption through the same node. Figure 10 and Figure 11 show the energy (PDP) and the EDP, respectively.
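The two figures of merit used above can be sketched as follows, assuming the conventional definitions PDP = P · t_d (energy per operation) and EDP = PDP · t_d; the paper itself does not spell these out. The sweep values in the example are made-up placeholders chosen only to exercise the functions. They are not the simulation results behind Figures 7, 8, 10 and 11, and the names `pdp`, `edp` and `best_width` are hypothetical.

```python
def pdp(power_w, delay_s):
    """Power-delay product (energy per operation), in joules."""
    return power_w * delay_s


def edp(power_w, delay_s):
    """Energy-delay product, in joule-seconds."""
    return pdp(power_w, delay_s) * delay_s


def best_width(sweep, metric):
    """Return the width whose (power, delay) pair minimises the given metric."""
    return min(sweep, key=lambda row: metric(row[1], row[2]))[0]


# Placeholder sweep: (width [arbitrary units], power [W], delay [s]).
# These numbers are NOT taken from the paper.
placeholder_sweep = [
    (1.0, 3.0e-6, 2.0e-9),
    (2.0, 4.0e-6, 1.0e-9),
    (3.0, 9.0e-6, 0.9e-9),
]

print("width minimising PDP:", best_width(placeholder_sweep, pdp))
print("width minimising EDP:", best_width(placeholder_sweep, edp))
```

Because EDP weights the delay twice, it can favour a wider (faster but more power-hungry) driver than PDP does, which is why both metrics are worth inspecting when sizing the clock driver.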

#### **IV. CONCLUSION**

In this paper, we have elaborated on the matching and dimensioning of different elements of the ULV gate. We have demonstrated the potential improvement concerning static power dissipation of the ULV5 with keeper configuration compared with the unmodified ULV5. Furthermore, we have discussed the importance of the clock drivers dimensions. In particular, we have considered the issue of bottlenecks for



a) Low power precharge to 1 domino inverter

b) Low power precharge to 0 domino inverter

Figure 9. The ULV5 with keeper domino gate, with single clock driver. The highlighted critical path is shown in red.

Figure 10. Simulation plot for the calculated energy (PDP) dissipation for the clock driver based on the parameterised width of the nMOS.

Figure 11. Calculated values for the EDP for different widths of the clock driver's nMOS.

the evaluation transistors. The optimal values obtained in this work are  $C_i$ = 2.5 fF, and a clock driver (crucial path) width of 4.0 $\mu$m with a length of 200 nm. The supply voltage was 300 mV and all transistors were kept to a minimum size except for the precharge transistor. The optimal dimensions for the precharge depend on the delay and the length of the chain used for each design. For the ULV5 without the keeper, we have simulated the relationship between the number of



Figure 12. The ULV5 without the keeper is simulated to plot the relation between number of bits in a chain and the width of the precharge transistor. The delay is given in ns. The evaluation transistor size is kept minimal.

bits in a chain and the width of the precharge transistor. The delay is plotted in Figure 12, and the corresponding hold time (the time-slot which holds the output value valid) is shown in Figure 13. Finally, Figure 14 shows the valid bits for the same configurations. There was a significant improvement for the ULV5 with keeper, especially in holding a value valid, which allows a longer bit chain.



Figure 13. The ULV5 without the keeper is simulated to plot the relation between number of bits in a chain and the width of the precharge transistor. The hold time (the time-slot over which the output is kept valid) is given in ns.



Figure 14. The simulation plot shows the development of the valid bit through the chain for different widths of the precharge transistor.

#### REFERENCES

- [1] Closing the nanometer yield chasm, 2001. Cadence Design Systems, White paper, http://w2.cadence.com/whitepapers/ClosingNanometer061301.pdf.
- [2] N. Verma, J. Kwong, and A. Chandrakasan. Nanometer MOSFET variation in minimum energy subthreshold circuits. *IEEE Transactions on Electron Devices*, 55(1):847–854, August 1995.

- [3] A.P. Chandrakasan, S. Sheng, and R.W. Brodersen. Low-power CMOS digital design. *IEEE Journal of Solid-State Circuits*, 27(4):473–484, April 1992.
- [4] M. Alioto. Ultra-low power VLSI circuit design demystified and explained: A tutorial. *IEEE Transactions on Circuits and Systems I: Regular Papers*, 59(1):3–29, January 2012.
- [5] J.B. Burr and A.M. Peterson. Ultra low power CMOS technology. In *NASA VLSI Design Symposium*, pages 4.2.1–4.2.13, 1991.
- [6] J.B. Burr and J. Shott. A 200mV self-testing encoder/decoder using Stanford ultra low-power CMOS. In *International Solid-State Circuits Conference (ISSCC)*, pages 84–85. IEEE, 1994.
- [7] R. Lashevsky, K. Takaara, and M. Souma. Neuron MOSFET as a way to design a threshold gates with the threshold and input weights alterable in real time. In *Asia Pacific Conference on Circuits and Systems (APCCAS)*, pages 263–266. IEEE, 1998.
- [8] K. Usami and M. Horowitz. Clustered voltage scaling technique for low-power design. In *International Symposium on Low Power Electronics and Design*, pages 3–8. IEEE, October 1995.
- [9] Y. Berg, D. T. Wisland, and T. S. Lande. Ultra low-voltage/low-power digital floating-gate circuits. *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, 46(7):930–936, July 1999.
- [10] Y. Berg, O. Mirmotahari, P.A. Norseng, and S. Aunet. Ultra low voltage CMOS gate. In *International Conference on Electronics, Circuits and Systems (ICECS)*, pages 818–821. IEEE, December 2006.
- [11] O. Mirmotahari and Y. Berg. Digital ultra low voltage high speed logic. Accepted at *International MultiConference of Engineers and Computer Scientists (IMECS)*, pages 1–1, March 2009.
- [12] Y. Berg and O. Mirmotahari. Ultra low-voltage and high speed dynamic and static CMOS precharge logic. In *FTFC 2012: The 11th International Conference of Faible Tension Faible Consommation (FTFC)*, pages 1–4. IARIA, May 2012.
- [13] Y. Berg and O. Mirmotahari. Novel high-speed and ultra-low-voltage CMOS NAND and NOR domino gates. In *CENICS 2012: The Fifth International Conference on Advances in Circuits, Electronics and Microelectronics*, pages 5–10. IARIA, August 2012.

## High-frequency One-port Colpitts SAW Oscillator for Chemical Sensing

S. Thomas, Z. Rácz, M. Cole and J.W. Gardner Microsensors and Bioelectronics Laboratory University of Warwick Coventry, UK J.W.Gardner@warwick.ac.uk

Abstract—This paper reports upon the design and development of a low cost, high sensitivity, high frequency surface acoustic wave resonator (SAWR) based system for gas sensing applications. The 262 MHz one-port SAWR operates in a grounded-base Colpitts oscillator arrangement that was developed based on an equivalent circuit model. Electrical characteristics of the fabricated SAWR show good agreement with its equivalent device model at the resonant frequencies, and it was found to have good stability and sensitivity with a Q-factor in air of about 2,870 at its fundamental resonant frequency. The sensor system is designed to operate in a dual configuration in which one resonator is coated with a gas-sensitive polymer (polyethylene) coating, whilst the second one is used as a reference channel, thereby eliminating common mode interferences on the baseline signal. Mass sensitivity was found to be ca. 1 Hz/ng, which corresponds to sub-ppm sensitivity to gas/odour concentration.

#### Keywords-acoustic waves; one-port; Colpitts oscillator; BVD model; SAW resonator; polymer coating

#### I. INTRODUCTION

Both bulk acoustic and surface acoustic wave (SAW) based sensor systems have been reported in chemical sensing applications over the past few decades [1–4]. Due to their high sensitivity and simple drive/readout circuitry, more recent focus has been on surface acoustic wave based devices where a SAW device forms the frequency selective component within an oscillator circuit. Polymer-coated SAW based chemical sensors impart high sensitivity and selectivity to specific volatile compounds. The absorption of the ligand molecule onto the polymer changes the physicochemical and electrical behavior of the SAW device resulting in a change in its oscillation frequency.

Common methods to implement SAW oscillator circuits are typically based on the feedback loop method or the negative resistance method [5], [6]. The frequency stability and vapor sensitivity of the SAW sensor system depend directly on the type of oscillator circuit employed. Nimal *et al.* [7] have recently reported that one-port Colpitts oscillators are more sensitive, but less stable, than two-port Pierce oscillators. The sensitivity can also be improved by tuning the phase point set within the SAWR in the pass band, thereby improving the noise performance of the oscillator circuit [8].

In this study, we present one-port polymer-coated Rayleigh wave based SAW resonators, fabricated on an ST-cut quartz wafer, for application in low-cost chemical sensors. An investigation into different equivalent circuit models is also presented, which leads to the conclusion that the most suitable oscillator circuit for one-port SAWR sensors is a Colpitts oscillator configuration.

#### II. ONE PORT SAW RESONATOR

SAW resonators are commonly available as one-port and two-port devices employing either delay line or resonator configurations. Because of the potential for high Q-value, low noise level and higher stability, we have selected a one-port resonator structure. These resonators are designed to operate at a baseline frequency of 262 MHz in a dual configuration to obtain differential measurements (Fig. 1).

The design and modeling of surface acoustic devices are normally carried out using the well-established Coupling of Modes (COM) theory [9], [10]. Although a COM model allows for an accurate description of the SAW resonator by simulating the admittance behaviour, the formulas are somewhat cumbersome and are not very informative - as far as circuit analysis and simulation is concerned. In addition, the accuracy of this model is limited to a narrow frequency band around the resonance region. Hence, the COM theory must assume *near-resonance* frequencies in order to derive a simplified electrical model of the SAW resonator [11].

The Butterworth Van Dyke (BVD) model, as a simple electrical equivalent circuit model, is more suitable for circuit designers. Morgan [12] established that the electrical acoustic impedance behavior of a SAW device, obtained using a lumped-element equivalent circuit model, is in good agreement with conventional COM analysis. This equivalent circuit model conveniently relates the acoustic perturbations due to surface mass loading in a SAW device to its electrical behavior.

A BVD model [11] was developed for the 262 MHz one-port SAWR, shown in Fig. 2, allowing quick simulation and design of the associated oscillator circuitry. The motional and static arm parameters were extracted using the transmission parameters of the SAWR. As shown in Fig. 2, the electrical components R, L and C are the motional resistance, inductance and capacitance respectively, which form the *motional arm* producing the resonant frequency, while the capacitor  $C_o$  forms the *static arm* providing the anti-resonant frequency. The motional arm signifies the electro-acoustic properties [13] of the piezoelectric material and it models the vibration of the crystal. R represents the acoustic attenuation in the resonator and  $C_o$  the capacitance of the piezoelectric crystal.



Fig.1. Optical micrograph of the 262 MHz one-port dual SAW resonator sensor, fabricated in aluminium on a ST-cut quartz wafer substrate. The top resonator is coated with a chemically-sensitive non-conductive polymer (polyethylene) and the bottom resonator is uncoated thus acting as a reference channel.

The designed 1-port SAWR comprises 60.25 finger pairs with 3  $\mu$m finger width forming the inter-digitated transducer (IDT), and 500 reflectors on each side to create a standing wave pattern, with an overall die size of 7.4 mm × 2.4 mm. The dual resonator configuration [14] with a reference channel eliminates common mode interferences on the baseline signal, such as changes in ambient temperature or pressure. The SAWRs were fabricated on an ST-cut quartz substrate with aluminum IDTs using UV lithography (PacTech, Germany).



Fig.2. Illustration of the BVD equivalent circuit lumped element model of a one-port SAW resonator.

In addition to the fundamental mode of operation, the SAWR exhibits several overtone frequencies, which can be modelled by adding additional series-resonant branches to the BVD model. For operation around a certain resonant frequency, the crystal can be modelled by the circuit with a single motional arm. The impedance of this modelled circuit is given by [15]:

$$Z(s) = \frac{s^2 + \left(\frac{R}{L}\right)s + \omega_s^2}{sC_o \left[s^2 + \left(\frac{R}{L}\right)s + \left(1 + \frac{C}{C_o}\right)\omega_s^2\right]} \tag{1}$$

where

$$\omega_{\rm s} = 2\pi f_{\rm s} = \frac{1}{\sqrt{LC}},\tag{2}$$

$$f_s = \frac{1}{2\pi\sqrt{LC}} \tag{3}$$

Here,  $f_s$  is the series resonance frequency of the SAWR, modeled by the motional arm. The unloaded quality factor of a SAWR is given by

$$Q_{u} = \frac{\omega_{s}L}{R} \tag{4}$$

Due to the high Q-factor of a SAWR, R can be neglected. Thus (1) becomes,

$$Z(s) = \frac{s^2 + \omega_s^2}{sC_o \left[s^2 + \left(1 + \frac{C}{C_o}\right)\omega_s^2\right]} \tag{5}$$

This shows that the resonator exhibits a parallel resonance at:

$$f_a = \frac{1}{2\pi \sqrt{LC_T}} \tag{6}$$

where

$$C_{\rm T} = \frac{CC_{\rm o}}{C + C_{\rm o}} \tag{7}$$
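The impedance expression (1) and the two resonance frequencies (3) and (6) can be checked numerically. The sketch below is illustrative Python: it evaluates (1) at  $s = j2\pi f$  and compares it with the impedance obtained directly from the circuit (the static capacitor  $C_o$  in parallel with the series R-L-C motional arm). The component values are placeholders picked only to put the resonance in the few-hundred-MHz range; they are not the parameters extracted for the fabricated SAWR, and all function names are introduced here for illustration.

```python
import math


def bvd_impedance(f_hz, R, L, C, Co):
    """Evaluate Eq. (1) at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * f_hz
    ws2 = 1.0 / (L * C)  # omega_s squared, from Eq. (2)
    num = s * s + (R / L) * s + ws2
    den = s * Co * (s * s + (R / L) * s + (1.0 + C / Co) * ws2)
    return num / den


def bvd_impedance_circuit(f_hz, R, L, C, Co):
    """Same impedance computed from the circuit: Co in parallel with series R-L-C."""
    s = 1j * 2.0 * math.pi * f_hz
    z_motional = R + s * L + 1.0 / (s * C)
    z_static = 1.0 / (s * Co)
    return z_motional * z_static / (z_motional + z_static)


def series_resonance(L, C):
    """Eq. (3): series resonance frequency in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(L * C))


def parallel_resonance(L, C, Co):
    """Eqs. (6) and (7): parallel (anti-) resonance frequency in Hz."""
    c_t = C * Co / (C + Co)
    return 1.0 / (2.0 * math.pi * math.sqrt(L * c_t))


# Placeholder component values (assumptions, not measured data).
R, L, C, Co = 20.0, 1.0e-4, 3.7e-15, 2.0e-12

f_s = series_resonance(L, C)
f_a = parallel_resonance(L, C, Co)
print(f"f_s = {f_s / 1e6:.3f} MHz, f_a = {f_a / 1e6:.3f} MHz")
print(f"|Z(f_s)| = {abs(bvd_impedance(f_s, R, L, C, Co)):.1f} ohm")
print(f"|Z(f_a)| = {abs(bvd_impedance(f_a, R, L, C, Co)):.1f} ohm")
```

With these placeholder values the magnitude of the impedance is small at  $f_s$  and large at  $f_a$, which is the qualitative behaviour described for the measured device.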



Fig.3. Real (Solid line) and imaginary parts (dotted line) of the impedance presented by 262 MHz one-port SAWR. The series, parallel and center frequencies are marked on the diagram.

The real and imaginary parts of the one-port SAWR impedance, obtained by an RF network analyzer (E5071B, Agilent Technologies), are shown in Fig. 3; they exhibit a minimum resistance at the resonance frequency and a maximum resistance at the anti-resonance frequency. The series resonance frequency,  $f_s$, is 261.91 MHz, the parallel resonance,  $f_a$, is 261.94 MHz and the center frequency,  $f_c$, is 261.92 MHz. This also demonstrates that the SAWR center frequency lies as expected between the series and parallel resonant frequencies. The phase curve in Fig. 3 also shows that the Barkhausen criterion of  $0^{\circ}$  phase condition for oscillation is satisfied at the center frequency of the SAWR.
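Combining (3), (6) and (7) gives  $(f_a/f_s)^2 = C/C_T = 1 + C/C_o$, so the ratio of motional to static capacitance can be estimated from the two measured resonance frequencies alone. The following short Python sketch performs that arithmetic with the values quoted above; the function name `capacitance_ratio` is introduced here for illustration, and the result is only as precise as the two-decimal frequencies allow.

```python
def capacitance_ratio(f_s_hz, f_a_hz):
    """Estimate C / C_o from (f_a / f_s)^2 = 1 + C / C_o."""
    return (f_a_hz / f_s_hz) ** 2 - 1.0


f_s_measured = 261.91e6  # series resonance quoted in the text [Hz]
f_a_measured = 261.94e6  # parallel resonance quoted in the text [Hz]

ratio = capacitance_ratio(f_s_measured, f_a_measured)
print(f"C / C_o = {ratio:.3e}")
```

The ratio comes out on the order of  $2 \times 10^{-4}$, i.e. the motional capacitance is several thousand times smaller than the static capacitance, which is why the two resonances lie only about 30 kHz apart.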

#### III. COLPITTS OSCILLATOR DESIGN

Several similar transistor-based circuit configurations are available for the realisation of SAW oscillators, such as Pierce, Colpitts, and Clapp, with the main difference lying in the transistor grounding options. The performance of the three configurations varies with the difference in the position of the biasing resistors and capacitances. The most desirable option is the Pierce configuration due to its simplicity, robustness and ability to work at higher frequencies (> 500 MHz) because it is arguably the least affected by stray capacitances [15]. However, the Pierce oscillator can only work with a two-port SAW resonator within a feedback loop to attain the required 180° phase shift.

The Colpitts oscillator, however, allows the SAWR to operate in a 1-port configuration [7], and therefore was selected for this work. The schematic of the Colpitts SAW oscillator circuit with a grounded base configuration, where the SAWR input is connected to the transistor's base and the output port is connected to the ground, is shown in Fig. 4.



Fig. 4. Simplified schematic of the Colpitts oscillator circuitry used to drive the 1-port SAW resonator sensor.

The Colpitts oscillator offers good stability at higher frequencies, lower harmonics, lower component count and hence lower cost than other types, including feedback-based oscillators. The transition frequency of the transistor,  $f_T$, limits its frequency of operation when the capacitors needed for obtaining the oscillation frequency are comparable to the transistor's terminal capacitances. This may be avoided by using a high  $f_T$  value (a few gigahertz) BJT in the oscillator circuit or by using the crystal in a series resonance configuration [15]. The use of an RF transistor (BFR92P, Infineon) rather than an operational amplifier also reduces parasitic capacitances, allowing radio frequency (RF) oscillator operation. To obtain the tuned oscillation frequency close to the SAWR Q-factor, tight tolerance components were selected for the capacitor and the inductor values.

In this configuration, the resonator shows an inductive behavior between the series  $(f_s)$  and parallel  $(f_a)$  resonances. The transistor along with the feedback capacitors  $C_1$  and  $C_2$  provides the negative resistance to compensate for resistive losses in the resonator. The major limitation of such an oscillator circuit is that the parasitic capacitances begin to affect the effective operation of the circuit at frequencies above 500 MHz.

#### IV. CHEMICAL DETECTION SYSTEM SETUP

A robust, high-sensitivity chemical detection system based on polymer-coated one-port SAW sensors has been designed and implemented. The SAW oscillator has been realized by interfacing the dual SAW resonators to Colpitts oscillator circuitry. A two-layer Printed Circuit Board (PCB) has been designed using Altium Designer software. Figure 5 shows a photograph of the dual Colpitts SAW oscillator based chemical sensor system. The PCB ensures minimal cross-talk associated with high frequency signals. The phase shifts linked with the RF signal paths to the resonators have also been taken into account during the PCB design.



Fig.5. Photograph of the SAW resonator with associated Colpitts oscillator circuit on the backside of a custom PCB.

The experimental arrangement demonstrating chemical detection using the SAWR oscillator consists of a  $14 \times 14 \times 40$  cm<sup>3</sup> gas/odor chamber (a photograph of the setup is shown in Fig. 6) to which a neMESYS multi-channel syringe pump (Cetoni GmbH, Germany) is attached. The microliter precision syringe delivers the chemicals into the chamber via capillary lines, where they are vaporised. The SAW sensors are arranged in a dual configuration, where one is coated with the sensing polymer *polyethylene* and the other forms the reference channel, and are attached to the far end of the chamber. A commercial FQ4 interface instrument (JLM Innovation, Tübingen, Germany) was connected to the oscillator output for frequency measurement. The oscillation frequencies of the individual sensors were monitored to obtain the SAWR differential signal.
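The benefit of the differential readout can be illustrated with a small Python sketch: any drift that shifts both oscillators by the same amount cancels when the reference channel is subtracted from the coated channel. The function name and all numbers below are hypothetical and chosen only to illustrate the arithmetic; they are not measured data.

```python
def differential_shift(f_sense, f_ref, f_sense_baseline, f_ref_baseline):
    """Shift of the coated channel minus the shift of the reference channel (Hz)."""
    return (f_sense - f_sense_baseline) - (f_ref - f_ref_baseline)


# Hypothetical example values in Hz (assumptions, not measurements).
baseline = 261_900_000.0   # both channels start near the nominal frequency
common_drift = -150.0      # e.g. a temperature change affecting both channels
vapour_response = -6000.0  # response of the coated channel only

f_ref_now = baseline + common_drift
f_sense_now = baseline + common_drift + vapour_response

delta = differential_shift(f_sense_now, f_ref_now, baseline, baseline)
print("differential shift [Hz]:", delta)
```

In this toy example the common drift drops out and only the response of the coated resonator remains in the differential signal.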



Fig.6. Experimental arrangement for chemical detection sensor system consisting of an odour chamber, a venting pump, syringe pump, and SAW sensors with the Colpitts circuitry.

#### V. EXPERIMENTAL RESULTS

Figure 7 shows the oscillator's resonant frequency output obtained by an RF oscilloscope (LeCroy LT342 Waverunner). The measured frequency value is in good agreement with the theoretically modeled value. The load sensitivity is significantly less for this oscillator circuit. The output of the SAWR oscillator is practically noise and distortion free.



Fig.7. Photograph of the baseline frequency (261.9 MHz with amplitude of 4.2 V) of a Colpitts SAW oscillator sensor system shown at the channel 2 of an RF Oscilloscope.

The typical frequency shifts of a dual SAWR sensor after the detection of a volatile chemical compound (here an insect sex pheromone) show that the one-port SAW oscillator provides a highly sensitive system for chemical detection. The response has a low level of noise, as shown in Fig. 8. On the introduction of 10  $\mu$l of the insect pheromone Z9-14:OAc into the odor chamber, a differential frequency shift of about 6 kHz was measured at the SAW output, which shows that the average sensor response to the pheromone compound is about 0.6 *Hz/nl* of liquid, i.e., sub-ppm levels of pheromone in air.
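The quoted average response follows from dividing the measured differential shift by the injected liquid volume. A minimal Python sketch of that unit conversion is given below; the function name `response_per_nl` is introduced here for illustration.

```python
def response_per_nl(shift_hz, volume_ul):
    """Average frequency response per nanolitre of injected liquid (Hz/nl)."""
    volume_nl = volume_ul * 1000.0  # 1 microlitre = 1000 nanolitres
    return shift_hz / volume_nl


sensitivity = response_per_nl(shift_hz=6.0e3, volume_ul=10.0)
print(f"average response = {sensitivity:.1f} Hz/nl")
```

This reproduces the figure of about 0.6 Hz/nl stated in the text for a 6 kHz shift and a 10 µl injection.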

The response time of the system is relatively slow ( $\sim 100$  s) and it is associated with the evaporation and diffusion of the volatile compound inside the chamber. However, the actual response time of the SAWR itself is below one second.



Fig.8. Differential frequency response of polymer-coated SAWR sensor to pheromone Z9-14:OAc demonstrating the high sensitivity of the polymer-coated SAWR sensor.

### VI. CONCLUSION

A high-frequency one-port Colpitts SAWR oscillator has been designed and fabricated for application in a low-cost, low-power gas sensor. An equivalent model has been developed, which formed the basis of an oscillator circuit design for a highly sensitive chemical sensor. The SAWR exhibits a high quality factor of 2,870 and has an estimated mass sensitivity of 0.5 Hz/ng after coating with a thin, gas-sensitive, non-conducting polymer film.

Further studies are being carried out on the detection of specific blends (i.e., mixtures) of chemical compounds. In addition, technological developments of this work include the creation of a smart, low-cost, low-power chemical sensor on a chip by integrating the SAWR sensor with full-custom CMOS oscillator circuitry, resulting in an application-specific integrated circuit (ASIC) BioMEMS chip.

### ACKNOWLEDGMENT

The authors wish to thank first Mr. Frank Courtney (Warwick University, UK) for his assistance in all mechanical and technical matters and secondly Mr. Ian Griffith (Warwick University, UK) for his help in the manufacturing of the oscillator PCB.

#### References

- I. Avramov, "Design of Rayleigh SAW Resonators for Applications as Gas Sensors in Highly Reactive Chemical Environments," 2006 IEEE International Frequency Control Symposium and Exposition, Jun. 2006, pp. 381–388, doi: 10.1109/FREQ.2006.275415.
- [2] K. Mitsakakis, A. Tserepi, and E. Gizeli, "SAW device integrated with microfluidics for array-type biosensing," Microelectronic Engineering, vol. 86, no. 4–6, Apr. 2009, pp. 1416–1418, doi: 10.1016/j.mee.2008.12.063.
- [3] V. M. Yantchev, S. Member, V. L. Strashilov, M. Rapp, U. Stahl, and I. D. Avramov, "Theoretical and Experimental Mass-Sensitivity Analysis of Polymer-Coated SAW and STW Resonators for Gas Sensing Applications," IEEE Sensors Journal, vol. 2, no. 4, Aug. 2002, pp. 307–313, doi: 10.1109/JSEN.2002.804039.
- [4] Z. Rácz, M. Cole, J. W. Gardner, S. Pathak, M. D. Jordan, and R. A. J. Challiss, "Cell-Based Surface Acoustic Wave Resonant Microsensor for Biomolecular Agent Detection," 16th International Conference on Solid-State Sensors, Actuators and Microsystems, June 2011, pp. 2168-2171, doi: 10.1109/TRANSDUCERS.2011. 5969348.
- [5] M.-I. Rocha-Gaso, C. March-Iborra, Á. Montoya-Baides, and A. Arnau-Vives, "Surface Generated Acoustic Wave Biosensors for the Detection of Pathogens: A Review," Sensors, vol. 9, no. 12, Jul. 2009, pp. 5740–5769, doi: 10.3390/s90705740.
- [6] M. ElBarkouky, P. Wambacq, and Y. Rolain, "A low-power 6.3 GHz FBAR overtone-based oscillator in 90 nm CMOS technology," Research in Microelectronics and Electronics Conference, July 2007, pp.61-64, doi: 10.1109/RME.2007.4401811.
- [7] A. T. Nimal, M. Singh, U. Mittal, and R. D. S. Yadava, "A comparative analysis of one-port Colpitt and two-port Pierce SAW oscillators for DMMP vapor sensing," Sensors and Actuators B: Chemical, vol. 114, no. 1, Mar. 2006, pp. 316– 325, doi: 10.1016/j.snb.2005.05.021.
- [8] S. Stier, A. Voigt, M. Rapp, F. K. Gmbh, P. O. Box, and D.-Karlsruhe, "Influence of Phase Position on the Performance of Chemical Sensors Based on SAW Device oscillators," Analytical Chemistry, vol. 70, no. 24, 1998, pp. 5190–5197, doi: 10.1021/ac9805504.
- [9] P. V. Wright, "Analysis and design of low-loss SAW devices with internal reflections using coupling-of-modes theory," Proc. IEEE Ultrasonics Symp., 1989, pp. 141–152, doi: 10.1109/ULTSYM.1989.66974.
- [10] C. S. Hartmann, D. P. Chen, and J. Heighway, "Modeling of SAW transversely coupled resonators filters using couplingof-modes modeling technique" in Ultrasonics Symposium, 1992, pp. 39–44, doi: 10.1109/ULTSYM.1992.276067.
- [11] R. Kshetrimayum, R. D. S. Yadava, and R. P. Tandon, "Modeling electrical response of polymer-coated SAW resonators by equivalent circuit representation.," Ultrasonics, vol. 51, no. 5, Jul. 2011, pp. 547–553, doi: 10.1016/j.ultras.2010.12.006.
- [12] D. P. Morgan, "Simplified analysis of surface acoustic wave one-port resonators," Electronics Letters, vol. 39, no. 18, Sept. 2003, pp. 4–5, doi: 10.1049/el.
- [13] C. K. Campbell, Surface Acoustic Wave Devices for Mobile and Wireless Communications, vol. 9, no. 8. Academic Press, San Diego., 1998.
- [14] S. Thomas, S. L. T. Leong, Z. Rácz, M. Cole, and J. W. Gardner, "Design and Implementation of a High-Frequency Surface Acoustic Wave Sensor Array for Pheromone Detection in an Insect-inspired Infochemical Communication System," 14th International Meeting on Chemical Sensors, 2012, pp. 11–14, doi: 10.5162/IMCS2012/P2.2.12.

[15] G. Gonzalez, Foundations of Oscillator Circuit Design, Artech House, 2007.

## FPGA Implementation of Disparity Estimation Processing Architecture for Stereo Camera System

Hi-Seok Kim, Dept. of Electronics Engineering, Cheongju University, Cheongju, Korea, khs8391@cju.ac.kr

Young-Hwan Kim, Dept. of Electronics Engineering, Pohang University, Pohang, Korea, youngk@postech.ac.kr

Sea-Ho Kim, Dept. of Electronics Engineering, Cheongju University, Cheongju, Korea, kensean@cju.ac.kr

Choong-Mo Youn, Dept. of Electronics Engineering, Seoil University, Seoul, Korea, 5420chong@seoil.ac.kr

Abstract - With advances in image processing and computer vision, stereo vision systems with two cameras have become a research topic of interest in many areas because their ability to recover depth information is similar to that of human vision. A depth-map algorithm allows a camera system to estimate depth. It is computation-intensive, but its inherent parallelism allows a high-speed hardware implementation. In this paper, by analyzing digital image stabilization (DIS) algorithms, we propose an efficient disparity estimation architecture that combines gray-scale projection with an Affine transformation model. We develop the architecture by describing the computation units in a hardware description language (Verilog) and synthesizing the design for an FPGA. The synthesis and experimental results for three video test images show that the proposed hardwired architecture achieves a higher frame rate (frames/sec) than the traditional sum of absolute differences (SAD) architecture, which is based on a block matching algorithm, while maintaining competitive PSNR results.

Keywords - gray-scale projection; stereoscopic; architecture; 3D

#### I. INTRODUCTION

Recently, industrial demand for and interest in stereoscopic image systems have increased due to 3D movies and HDTV. Depth information is the main element of 3D image systems. Stereo matching, or disparity estimation, is used to obtain depth information from stereoscopic images. The goals of this paper are to propose a real-time processing architecture for disparity estimation and to show application systems based on a real-time stereo camera system. The proposed system operates on an FPGA board with a stereo camera.

Such a system has potential uses in robotic navigation, 3D imaging, camera surveillance, and object recognition. A typical depth estimation system consists of two cameras with overlapping fields of view and a processing unit. To estimate depth, several depth-map algorithms have been developed [1]. The idea is to find the displacement between the two projections of the same object in the two images. From that displacement, the distance is calculated based on the relative position of the two cameras and other parameters such as the focal length and the angle between the optical axes. The depth value is represented in the result image as pixel intensity. The depth-map algorithm is also called the disparity algorithm. The idea of a depth-map system is quite simple. However, the depth-map algorithm is computation- and data-intensive [2] because it has to perform an identical procedure on millions of pixels. Because of this computational complexity, several approaches to video 3D tracking have been developed in recent years [3], and several implementations have been attempted [3-5], including systems implemented on a personal computer, digital signal processor (DSP), field programmable gate array (FPGA), and application specific integrated circuit (ASIC). One of these attempts addressed 3D feature tracking and localization using a stereo vision system. The objective of feature localization is to localize the corresponding feature point in the right video sequence. Because the motion of a given feature point is similar in the left and right video sequences, this system obtains the motion vectors from the acquired image frames of the left video sequence and estimates the corresponding motion vectors for the feature in the right video sequence.

Various algorithms, such as projection algorithm (PA), bit-plane matching (BPM), and others, have been developed to estimate the motion vectors. In general, the gray scale projection algorithm can greatly reduce the complexity of computation in comparison with the other methods. In this paper, our focus is to develop an efficient architecture based on the motion vectors of the gray scale projection and Affine transformation model for practical implementation of 3D image processing.

This paper is organized as follows. Section II describes the gray-scale projection algorithm and the estimation of motion vectors. Section III describes the design of the proposed architecture. Experimental results are shown in Section IV. Finally, the conclusion is given in Section V.

#### II. GRAY SCALE PROJECTION

Gray-scale projection estimates the motion vectors between the current and reference frames from the total gray-scale changes along the coordinates of an image. By correlating the gray-scale projections of the two frames, we can determine the motion vector. A small amount of computation is one of the valuable characteristics of this algorithm. Normally, the gray-scale projection algorithm is divided into three steps: image projection, projection filtering, and correlation of the image projections.

Gray-scale projection uses the gray-level information of the image to compute the correlation between two frames. The projection maps each frame onto two one-dimensional curves, described as follows:

$$P_{Vk}(i) = \sum_{j=1}^{N} Y(i, j)$$

$$P_{Hk}(j) = \sum_{i=1}^{M} Y(i, j) \tag{1}$$

Here, $Y(i, j)$ is the gray value of the point $(i, j)$ in image frame k, $P_{Vk}(i)$ is the vertical projection at row i of frame k, $P_{Hk}(j)$ is the horizontal projection at column j of frame k, and the size of the frame is M×N.



Figure 1. Results of horizontal and vertical gray projection curve at frame k.

Figure 1 shows results of the horizontal and vertical projection at frame k. S denotes the invalid width of search between the current frame and the reference frame.
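The two projections of equation (1) can be sketched in a few lines of Python. This is an illustrative software model, not the hardware description: the function name `gray_projections` and the toy frame are assumptions, and the frame is taken to be a list of M rows with N gray values each, so that $P_V(i)$ sums row i over its N columns and $P_H(j)$ sums column j over its M rows.

```python
def gray_projections(frame):
    """Return (P_V, P_H) for a frame given as M rows of N gray values."""
    m = len(frame)
    n = len(frame[0])
    # P_V(i): sum over the columns j of row i (one value per row).
    p_v = [sum(frame[i][j] for j in range(n)) for i in range(m)]
    # P_H(j): sum over the rows i of column j (one value per column).
    p_h = [sum(frame[i][j] for i in range(m)) for j in range(n)]
    return p_v, p_h


# Toy 3 x 4 frame used only to exercise the function.
frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
p_v, p_h = gray_projections(frame)
print(p_v)  # [100, 260, 420]
print(p_h)  # [150, 180, 210, 240]
```

Each frame of M×N pixels is thus reduced to M + N accumulated values, which is where the low computational cost of the method comes from.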

#### A. Correlation calculation

In order to estimate the motion vectors, we correlate the vertical gray-scale projection curves [4] of the left and right image frames, and likewise their horizontal gray-scale projection curves. This yields two cross-correlation curves. From the position of the minimum of each curve, we obtain the motion vector between the left image and the right image [5]. The correlation formulas are as follows:

$$C_V(w) = \sum_{i=1}^{M-2S+1} |P_V^{I_1}(i+w-1) - P_V^{I_2}(i+S)|, \quad 1 \le w \le 2S+1 \tag{2}$$

$$C_H(w) = \sum_{j=1}^{N-2S+1} |P_H^{I_1}(j+w-1) - P_H^{I_2}(j+S)|, \quad 1 \le w \le 2S+1 \tag{3}$$

Here, $P_V^{I_1}$ and $P_H^{I_1}$ are the vertical and horizontal projection curves of the left frame, and $P_V^{I_2}$ and $P_H^{I_2}$ are the vertical and horizontal projection curves of the right frame. M and N are the numbers of rows and columns of the left frame ($I_1$) and the right frame ($I_2$). $C_V(w)$ attains its minimum at $w = W_V^{MIN}$ and $C_H(w)$ attains its minimum at $w = W_H^{MIN}$. Then, we can get the motion vectors from formulas (4) and (5):

$$T_x = S + 1 - W_H^{MIN} \tag{4}$$

$$T_y = S + 1 - W_V^{MIN} \tag{5}$$

$T_x$ and $T_y$ are the translation (motion) vectors, denoted as offset-x and offset-y. Consequently, we can perform a compensation operation on the right frame using the two offset values ($T_x$ and $T_y$).
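The correlation search and the offset formulas can be illustrated with a short Python sketch. The function names and the test curves are hypothetical. One assumption is made loudly: the sum runs over L - 2S terms (L being the curve length) rather than the printed L - 2S + 1, because with the printed limit the index i + w - 1 would reach one sample past the end of the curve when w = 2S + 1.

```python
def correlation_curve(p1, p2, s):
    """C(w) for w = 1..2S+1: sum of |p1(i+w-1) - p2(i+S)| with 1-based i."""
    length = len(p1)
    terms = length - 2 * s          # assumption: keeps i+w-1 inside the curve
    curve = []
    for w in range(1, 2 * s + 2):
        total = 0
        for i in range(1, terms + 1):
            # Convert the 1-based indices of the formula to 0-based lists.
            total += abs(p1[i + w - 2] - p2[i + s - 1])
        curve.append(total)
    return curve


def offset_from_curve(curve, s):
    """Offset = S + 1 - w_min, where w_min (1-based) minimises the curve."""
    w_min = curve.index(min(curve)) + 1
    return s + 1 - w_min


# Hypothetical projection curves: p1 is p2 displaced by one sample.
p2 = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
p1 = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
S = 2
curve = correlation_curve(p1, p2, S)
print(curve)                        # [48, 0, 60, 132, 216]
print(offset_from_curve(curve, S))  # 1
```

Applying the same two functions to the horizontal projection curves gives $T_x$, and applying them to the vertical projection curves gives $T_y$. A search position of w = S + 1 corresponds to zero displacement.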

#### B. Affine transformation model

The change between video images can be divided into translation and rotation, so we choose the Affine transformation model and consider both rotation and translation in this paper. The angle $\theta$ is the rotation parameter of the Affine model. For the given left and right image frames, we adopt the Sobel gradient operator [6] to compute the angle $\theta$ at each pixel Y(i,j) of the left image frame and of the right image frame. The Sobel gradient operator uses two 3x3 templates, and every gray value in the image frame is convolved with both. One template has its maximum response on vertical edges and the other has its maximum response on horizontal edges, so the Sobel operator yields the vertical and horizontal edge responses. The angle orientation follows from the formula:

$$\theta = \tan^{-1}(\frac{H_V}{H_H}) \tag{6}$$

Here, $H_V$ and $H_H$ are the vertical and horizontal edge responses of the Sobel operator. The size of the image frame is M x N.
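A per-pixel angle computation in the spirit of equation (6) might look as follows in Python. The two templates are the standard 3x3 Sobel kernels; which of them corresponds to $H_V$ and which to $H_H$ is an assumption here, as are the function names. `math.atan2` is used in place of a plain arctangent of the ratio so that a zero horizontal response does not cause a division by zero, and the templates are applied as a correlation (no kernel flip).

```python
import math

# Standard 3x3 Sobel templates.
SOBEL_V = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # peaks on vertical edges
SOBEL_H = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # peaks on horizontal edges


def template_response(frame, i, j, template):
    """Apply one 3x3 template centred on interior pixel (i, j)."""
    return sum(
        template[a][b] * frame[i - 1 + a][j - 1 + b]
        for a in range(3)
        for b in range(3)
    )


def edge_angle(frame, i, j):
    """Angle of equation (6) in degrees, computed with atan2(H_V, H_H)."""
    h_v = template_response(frame, i, j, SOBEL_V)
    h_h = template_response(frame, i, j, SOBEL_H)
    return math.degrees(math.atan2(h_v, h_h))


# Toy 3x3 patches: a vertical step edge and a horizontal step edge.
vertical_edge = [[0, 0, 255], [0, 0, 255], [0, 0, 255]]
horizontal_edge = [[0, 0, 0], [0, 0, 0], [255, 255, 255]]
print(edge_angle(vertical_edge, 1, 1))
print(edge_angle(horizontal_edge, 1, 1))
```

With this convention the vertical step edge gives an angle of 90 degrees and the horizontal step edge gives 0 degrees.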

#### III. ARCHITECTURE AND DESIGN

In this section, we develop an efficient hardware architecture based on the gray-scale projection algorithm and the Affine transformation model. In particular, we focus on developing the architecture for stereoscopic image processing. The flow diagram of the computation for the luminance (Y) component is shown in Fig. 2. The image processing system is designed to take 8-bit gray-scale left and right input images. We note that the proposed stereoscopic image processing system can be applied equally well to 8-bit images by simply performing the luminance (Y) operation on the image while skipping the operations on the other color components.

In Figure 2, we need to compute the edge orientation of each pixel Y(i,j) of the left and right images. We set up a table in RAM whose dimensions equal those of the input images. Each table entry contains the computed angle orientation of the corresponding pixel. From this table, we calculate the histogram of the angle orientations and then find the dominant angle orientation. Let $H_{max}$ denote the number of pixels that have the dominant angle orientation in the left and right image frames. If $H_{max}$ exceeds a threshold, the current frame is accepted and the orientation at which $H_{max}$ occurs is set as its dominant angle orientation. After fixing the dominant angle orientation in the left and right current frames, we obtain the difference angle orientation as follows:

$$\Delta \theta(m,n) = |\theta_p(m) - \theta_p(n)| \tag{7}$$

Here, $\theta_p(m)$ and $\theta_p(n)$ are the dominant angle orientations of the current left and right frames, where m and n index the left and right frame sequences (m, n = 0, 1, ..., N).
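The histogram step and equation (7) can be sketched in Python as follows. The sketch assumes that the per-pixel angles have already been quantised to integer degrees, and the function name, the sample angle lists, and the threshold value are all hypothetical.

```python
from collections import Counter


def dominant_orientation(angles, threshold):
    """Return (angle, H_max) for the most frequent angle, or None if
    its pixel count H_max does not exceed the threshold."""
    histogram = Counter(angles)
    angle, h_max = histogram.most_common(1)[0]
    if h_max > threshold:
        return angle, h_max
    return None


# Hypothetical quantised angle tables for a left and a right frame.
left_angles = [10, 10, 10, 45, 90]
right_angles = [13, 13, 13, 13, 45]

left = dominant_orientation(left_angles, threshold=2)
right = dominant_orientation(right_angles, threshold=2)
print(left, right)            # (10, 3) (13, 4)

# Difference of the dominant orientations, as in equation (7).
delta_theta = abs(left[0] - right[0])
print(delta_theta)            # 3
```

In hardware the same result would come from a bank of counters indexed by the quantised angle; the dictionary-based counter here only models that behaviour.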



Figure 2. Data flow graph of computation required in motion and rotation vector

After  $\Delta \theta(m,n)$  has been computed, we can compute the corresponding difference dominant orientation angle by adopting the CORDIC computing technique [7].

The rotations are computed using an additional angle accumulator, which is given by

$$Z_{i+1} = Z_i - d_i \tan^{-1}(2^{-i}) \tag{8}$$

where $d_i = -1$ if $Z_i < 0$, and $d_i = +1$ otherwise.
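The angle accumulator of equation (8) is one part of a rotation-mode CORDIC iteration. The Python sketch below adds the standard shift-and-add updates for the x and y components and the usual gain pre-scaling, neither of which is spelled out in the text, so they should be read as textbook CORDIC rather than as the authors' exact datapath. The function name and the iteration count are assumptions, and floating-point arithmetic stands in for the fixed-point hardware.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC: returns (cos(theta), sin(theta)), theta in radians.

    Converges for |theta| up to roughly 99.9 degrees.
    """
    # Look-up table with one entry per iteration: atan(2^-i).
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = -1.0 if z < 0 else 1.0
        # Shift-and-add micro-rotation (both updates use the old x and y).
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atan_table[i]      # angle accumulator, equation (8)
    return x, y


cos_30, sin_30 = cordic_sin_cos(math.radians(30.0))
print(round(cos_30, 3), round(sin_30, 3))   # 0.866 0.5
```

After the last iteration the accumulator z is close to zero, and x and y hold the cosine and sine needed for the rotation matrix of the Affine model.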

In particular, a small look-up table (one entry per iteration) containing the values $\tan^{-1}(2^{-i})$ is required for the angle computation. We then obtain the angle orientation $\theta$, which is the rotation parameter of the Affine transformation model described in Section II. The Affine motion model thus has two sets of parameters, rotation and translation, as described in the following formula:

$$\begin{bmatrix} X' \\ Y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} + \begin{bmatrix} T_x \\ T_y \end{bmatrix} \tag{9}$$

Here, X, Y and X', Y' are the pixel points in the left image frame and the right image frame, respectively. $\cos\theta$ and $\sin\theta$ are the rotation parameters. $T_x$ is the final horizontal motion vector and $T_y$ is the final vertical motion vector. After $T_x$ and $T_y$ have been computed, we compensate the current right image frame as shown in Figure 2. This completes the stereoscopic image processing.
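Equation (9) maps a single pixel coordinate as sketched below in Python. The function name and the sample values are illustrative only; the sign convention of the rotation matrix is taken directly from the formula.

```python
import math


def affine_transform(x, y, theta, tx, ty):
    """Map (X, Y) to (X', Y') with the rotation/translation model of (9)."""
    c, s = math.cos(theta), math.sin(theta)
    x_new = c * x + s * y + tx
    y_new = -s * x + c * y + ty
    return x_new, y_new


# Pure translation: theta = 0 leaves only the offsets T_x and T_y.
print(affine_transform(5.0, 7.0, 0.0, 2.0, -1.0))   # (7.0, 6.0)

# Rotation by 90 degrees plus translation (values chosen for illustration).
xr, yr = affine_transform(1.0, 0.0, math.pi / 2, 2.0, 3.0)
print(round(xr, 6), round(yr, 6))                   # 2.0 2.0
```

In the proposed flow, $\cos\theta$ and $\sin\theta$ would come from the CORDIC stage and $T_x$, $T_y$ from the projection correlation, so the compensation itself needs only multiplications and additions per pixel.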

#### IV. EXPERIMENTAL RESULTS

In this paper, we compare our proposed architecture with a traditional software implementation written in C++. The proposed architecture is tested on 6 video sequences. We select one image frame from a video sequence as experimental data, as shown in Figure 3(a) and Figure 3(b).



Similarity can be evaluated by examining the difference between two images. The difference image also lets us remove the background. Where the two frames differ because of movement, the difference image is white; where the two images are the same, the difference image is completely black. Figure 3(d) shows the difference image between the left frame and the right frame obtained with the proposed algorithm written in C++. Similarly, Figure 3(e) shows the difference image between left frame 101 and right frame 101 obtained with the hardware implementation of the proposed algorithm. Comparing Figures 3(d) and 3(e), the results show almost identical performance.

Several blocks of our proposed algorithm are implemented on a Xilinx Spartan-3 FPGA, and their parameters are shown in Table I. We processed four different videos with our new architecture. From Table I, we can see that the proposed architecture achieves real-time processing for videos of various large sizes. Table II compares the proposed architecture with other systems implemented by previous authors [8]. From Table II, we can see that our proposed architecture is faster (frames/s) than the SAD-based block matching architectures. We could not use the same Xilinx FPGA as D. Chaikalis et al. because our proposed system uses the internal memory blocks of the Xilinx FPGA. Table III shows the resource usage of our proposed system.

TABLE I. HARDWARE IMPLEMENTATION DATA OF VIDEO

| FPGA device        | Max frequency (MHz) | Image size | Total clock number | Frames/sec |
|--------------------|---------------------|------------|--------------------|------------|
| Spartan-3 xc3a5000 | 51.287              | 380 x 340  | 386212             | 132        |
| Spartan-3 xc3a5000 | 51.287              | 400 x 400  | 478412             | 106        |
| Spartan-3 xc3a5000 | 51.287              | 640 x 480  | 919532             | 55         |
| Spartan-3 xc3a5000 | 51.287              | 1024 x 768 | 2355980            | 21         |
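The frames/sec column of Table I appears to follow from dividing the maximum clock frequency by the total clock number per frame. The paper does not state this relation explicitly, so the Python sketch below treats it as an assumption; the frequency and clock counts are copied from the table, and the variable names are illustrative.

```python
# Maximum clock frequency from Table I, in Hz.
F_MAX_HZ = 51.287e6

# Total clock number per frame for each image size, from Table I.
clocks_per_frame = {
    "380 x 340": 386212,
    "400 x 400": 478412,
    "640 x 480": 919532,
    "1024 x 768": 2355980,
}

# Assumed relation: frames/sec = f_max / clocks per frame.
estimated_fps = {
    size: F_MAX_HZ / clocks for size, clocks in clocks_per_frame.items()
}

for size, fps in estimated_fps.items():
    print(f"{size}: {fps:.1f} frames/sec")
```

Under this assumption the estimates come out at about 132.8, 107.2, 55.8, and 21.8 frames/sec, close to the tabulated 132, 106, 55, and 21.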

 
TABLE II. COMPARISON BETWEEN OUR ARCHITECTURE AND OTHER SYSTEM

| Author              | Algorithm used                                     | Platform used              | Frame size | Frames/sec |
|---------------------|----------------------------------------------------|----------------------------|------------|------------|
| Proposed system     | Gray-scale projection, Affine transformation model | Xilinx Spartan-3 xc3a5000  | 640 x 480  | 55         |
| Proposed system     | Gray-scale projection, Affine transformation model | Xilinx Spartan-3 xc3a5000  | 1024 x 768 | 39         |
| N.H. Tan et al.     | SAD                                                | Altera DE2-70 Cyclone II   | 640 x 480  | 35         |
| D. Chaikalis et al. | SAD                                                | Xilinx Virtex XCV-2000E    | 1024 x 768 | 31         |

TABLE III. RESOURCE USAGE OF OUR SYSTEM

| Logic Utilization          | Used | Utilization |
|----------------------------|------|-------------|
| Number of Slices           | 769  | 2%          |
| Number of Slice Flip Flops | 691  | 1%          |
| Number of 4 input LUTs     | 1354 | 2%          |
| Number of bonded IOBs      | 157  | 24%         |
| Number of BRAMs            | 6    | 5%          |

From Tables I and III, area/throughput estimates based on the synthesis results of a Verilog description of this architecture are provided to show the feasibility of a single-chip ASIC implementation. The peak signal-to-noise ratio (PSNR) between the stabilized frames is an important criterion for evaluating the fidelity of the DIS. The PSNR relates two frames in terms of their powers. The higher the PSNR, the better the fidelity of the DIS. The PSNR between the left frame and the right frame is defined as in [4].
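As a point of reference, the PSNR between two frames can be computed as sketched below in Python. The sketch assumes the widely used definition for 8-bit images, PSNR = 10 log10(255^2 / MSE) in dB, which may differ in detail from the expression in the cited reference; the function name and the toy frames are hypothetical.

```python
import math


def psnr(frame_a, frame_b, peak=255):
    """PSNR in dB between two equally sized frames (lists of rows)."""
    m, n = len(frame_a), len(frame_a[0])
    # Mean squared error over all M x N pixels.
    mse = sum(
        (frame_a[i][j] - frame_b[i][j]) ** 2
        for i in range(m)
        for j in range(n)
    ) / (m * n)
    if mse == 0:
        return math.inf             # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


left = [[100, 110], [120, 130]]
right = [[101, 109], [121, 129]]    # every pixel differs by 1, so MSE = 1
print(round(psnr(left, right), 2))  # 48.13
```

Identical frames give an infinite PSNR, and larger pixel differences lower the value, which is why a higher PSNR between the compensated frame and its reference indicates a better match.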

TABLE IV. HARDWARE IMPLEMENTATION DATA OF VIDEO

| Image             | Size    | Frames/sec | Compared images                            | PSNR (dB) |
|-------------------|---------|------------|--------------------------------------------|-----------|
| Image 1 (tsukuba) | 640x480 | 55         | Original left image / original right image | 17.25     |
| Image 1 (tsukuba) | 640x480 | 55         | Original left image / S/W process          | 20.43     |
| Image 1 (tsukuba) | 640x480 | 55         | Original left image / H/W process          | 20.41     |
| Image 1 (tsukuba) | 640x480 | 55         | S/W process / H/W process                  | 43.56     |
| Image 2 (venus)   | 640x480 | 55         | Original left image / original right image | 17.39     |
| Image 2 (venus)   | 640x480 | 55         | Original left image / S/W process          | 17.42     |
| Image 2 (venus)   | 640x480 | 55         | Original left image / H/W process          | 17.80     |
| Image 2 (venus)   | 640x480 | 55         | S/W process / H/W process                  | 20.49     |
| Image 3 (teddy)   | 640x480 | 55         | Original left image / original right image | 14.18     |
| Image 3 (teddy)   | 640x480 | 55         | Original left image / S/W process          | 15.97     |
| Image 3 (teddy)   | 640x480 | 55         | Original left image / H/W process          | 16.01     |
| Image 3 (teddy)   | 640x480 | 55         | S/W process / H/W process                  | 19.06     |

Table IV shows the PSNR results for three video test images processed by our hardwired architecture. Comparing the three video images (S/W-processed right image) produced by the DIS algorithm written in C++ with those produced by our hardware architecture, the proposed hardware implementation shows competitive PSNR values with less computation time. The gray-scale projection algorithm is known to be less accurate because its compensation is coarse. However, we adopt this strategy to reduce the computation time, at some cost in accuracy.

### V. CONCLUSION AND FUTURE WORKS

In this paper, we proposed an efficient architecture that combines gray-scale projection and an Affine transformation model. The proposed architecture achieves a real-time processing speed of more than 30 fps. The results show that implementing the stereoscopic image processing system is feasible with the proposed architecture. As future work, we recommend investigating the system further with other algorithms, such as 3D tracking and depth-map algorithms.

#### ACKNOWLEDGMENT

This work was sponsored by ETRI SW-SoC R&BD Center, Human Resource Development Project.

#### REFERENCES

[1] Li-Wei Zheng, Yuan-Hsiang Chang, and Zhen-Zhong Li, "A study of 3D feature tracking and localization using a stereo vision system," Computer Symposium (ICS), Dec. 2010, pp. 402-407.

[2] Zhu Juan-juan, Guo Bao-long, and Feng Zong-zhe, "A digital image stability algorithm based on the Gray-scale projection," Photon Journal, Oct. 2005, pp. 1266-1269.

[3] F. Vella, A. Castorina, M. Mancuso, et al., "Digital image stabilization by adaptive block motion vectors filtering," IEEE Transactions on Consumer Electronics, Aug. 2002, pp. 796-801.

[4] I. Yasri, N. H. Hamid, and V. V. Yap, "Performance analysis of FPGA based Sobel edge detection operator," International Conference on Electronic Design (ICED 2008), Dec. 2008, pp. 1-4.

[5] P. K. Meher and S. Y. Park, "CORDIC Designs for Fixed Angle of Rotation," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Feb. 2012, pp. 1-12.

[6] Ngo Hun Tan, Nor Hisham Hamid, and Patrick Sebastian, "Resource Minimization in a Real-Time Depth-map Processing System on FPGA," TENCON 2011 - 2011 IEEE Region 10 Conference, Nov. 2011, pp. 706-710.

[7] Angelos A. Amanatiadis and Ioannis Andreadis, "Digital Image Stabilization by Independent Component Analysis," IEEE Transactions on Instrumentation and Measurement, vol. 59, no. 7, July 2010, pp. 1755-1763.

[8] Masanori Hariyama and Michitaka Kameyama, "VLSI Processor for Reliable Stereo Matching Based on Window-Parallel Logic-in-Memory Architecture," Digest of Technical Papers, 2004 Symposium on VLSI Circuits, June 2004, pp. 166-16.
