# Cell Library for Speed-Independent VLSI<sup>\*</sup>

Stepchenkov Y.A., Zakharov V.N., Diachenko Y.G., Morozov N.V., Stepchenkov D.Y. Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Moscow, Russian Federation

{YStepchenkov, VZakharov, YDiachenko, NMorozov, DStepchenkov}@ipiran.ru

# Abstract

This paper describes the contents and implementation features of a cell library intended for the design of digital self-timed (speed-independent) circuits. The library contains more than 200 cells. Self-timed triggers with a unary input and triggers with a forced output are presented. The library was certified with a purpose-built characterization tool and was tested in practice in a set of digital signal processing units manufactured in different CMOS processes.

# 1. Introduction

Recent years have seen renewed interest in circuits whose behavior does not depend on delays in the cells only (Speed-Independent, SI) or on delays in both cells and wires (Delay-Insensitive), and in computing systems built on them. This interest is due to their low power consumption and high performance [1]. Such circuits fully meet the requirements that modern data processing systems impose on their component base: robustness to variation or degradation of electrophysical parameters; safe operation and valid data processing results; and the highest performance attainable under the actual operating conditions and for each type of processed data.

Building on Muller's theory [2], V.I. Varshavsky proposed a methodology for designing an SI component base [3]. More recently it has become a research subject at the Institute of Informatics Problems of the Russian Academy of Sciences (IPI RAN) [1, 4]. The wider the component base, the more efficient the solutions available for SI-circuit implementation.

At the same time, SI-circuits require specific logic cells that are missing from standard cell libraries. Developing a schematic and layout basis for a new generation of computing hardware based on SI-circuitry is therefore a pressing problem. This paper describes a library of CMOS SI-cells developed at IPI RAN. The library has been successfully tested in practical designs based on gate arrays and custom integrated circuits.

# 2. Functional features of SI-circuits

An SI-circuit conventionally consists of two parts: a functional part that processes the input data, and an indication part that registers the termination of transients in the SI-circuit as a whole and in each of its parts. Interaction between an SI-circuit and its environment has two special features: a request-acknowledge technique and a two-phase work discipline. Each data processing phase is followed by a pause phase (spacer). The duration of each phase may be arbitrarily long, but finite.
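The two-phase discipline can be illustrated with a small behavioural sketch. It assumes a dual-rail code with a null spacer (both rails at 0 during the pause phase); the names `SPACER`, `encode`, `decode`, and `is_two_phase` are hypothetical and are not taken from the library.

```python
from typing import List, Optional, Tuple

Rail = Tuple[int, int]  # (true rail, false rail)

SPACER: Rail = (0, 0)   # assumption: null spacer, both rails low


def encode(bit: int) -> Rail:
    """Dual-rail working state for one data bit."""
    return (1, 0) if bit else (0, 1)


def decode(rail: Rail) -> Optional[int]:
    """Return the bit for a working state, or None for the spacer."""
    if rail == SPACER:
        return None
    if rail == (1, 0):
        return 1
    if rail == (0, 1):
        return 0
    raise ValueError(f"illegal dual-rail state {rail}")


def is_two_phase(sequence: List[Rail]) -> bool:
    """Check that working states and spacers strictly alternate."""
    phases = [decode(r) is None for r in sequence]
    return all(a != b for a, b in zip(phases, phases[1:]))


# Each working phase is followed by a spacer phase.
stream = [SPACER, encode(1), SPACER, encode(0), SPACER]
```

A sequence that places two working states back to back, without a spacer between them, is rejected by `is_two_phase`.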

The main feature of an SI-circuit is that any transient initiated in any component of the circuit during the current phase of work must terminate. An optimal implementation of SI-circuits is possible only with a library of single-stage SI-cells.

According to Muller's hypothesis [2], a single-stage cell has the following features:

- It has a single output,
- All its inputs and its output take only the logical levels "0" and "1",
- The signal delay is attributed to the output of the cell that drives the signal.

Varshavsky has shown theoretically and practically [3] that a single-stage AND-OR-NOT / OR-AND-NOT logical basis is needed to design valid indicators. Standard cell libraries usually contain such cells already, but the available subset is insufficient for efficient schematic synthesis of SI-circuits. It is therefore reasonable to expand each standard cell library with cells that provide an optimal SI-circuit implementation.

When Muller's theory appeared (in the 1960s), the delays of logic cells in an integrated circuit substantially exceeded the wire delays. Today cell delays have been reduced, and wire delays dominate. Muller's hypothesis holds only within the equichronous area [3], whose size shrinks steadily as the layout feature size decreases.

<sup>\*</sup> The reported study was funded by RFBR according to the research projects (№ 13-07-12062 ofi\_m and № 13-07-12068 ofi\_m)

For Muller's hypothesis to hold in submicron circuits, the wire delays must be taken into account.

# 3. Cell library for SI-circuits

In all standard CMOS cells the number of serially connected transistors is restricted: up to 4 n-MOS transistors and up to 3 p-MOS transistors. In some cases this leads to significant decomposition of the SI-circuit, complicates its indication part, and reduces its performance. It was determined in practice [4] that allowing up to 4 serially connected transistors of both types, n-MOS and p-MOS, yields optimal parameters of SI-circuits. This principle was applied in designing the SI-cell library.
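The effect of the series-transistor limit can be sketched for a static CMOS AND-OR-NOT cell. The model below is a simplified assumption, not the library's design rule checker: each AND term is a series n-MOS chain in the pull-down network, and the terms form series-connected parallel p-MOS groups in the pull-up network. The function names and the 2-2-2-2 example cell are hypothetical.

```python
from typing import Sequence, Tuple


def aoi_stack_depth(term_sizes: Sequence[int]) -> Tuple[int, int]:
    """Series depth (n-MOS, p-MOS) of a static CMOS AND-OR-NOT cell.

    term_sizes lists the number of inputs of each AND term.
    Pull-down: each AND term is a series n-MOS chain, terms in parallel,
    so the n-MOS depth is the size of the largest term.
    Pull-up: each term is a parallel p-MOS group, groups in series,
    so the p-MOS depth is the number of terms.
    """
    return max(term_sizes), len(term_sizes)


def fits(term_sizes: Sequence[int], n_limit: int, p_limit: int) -> bool:
    """True if the cell respects both series-transistor limits."""
    n_depth, p_depth = aoi_stack_depth(term_sizes)
    return n_depth <= n_limit and p_depth <= p_limit


four_terms = (2, 2, 2, 2)  # hypothetical AND-OR-NOT cell with four 2-input terms
standard_ok = fits(four_terms, n_limit=4, p_limit=3)  # standard-cell limits
extended_ok = fits(four_terms, n_limit=4, p_limit=4)  # limits used for the SI library
```

Under these assumptions the four-term cell exceeds the 3-transistor p-MOS limit and would have to be decomposed, while it fits as a single stage when 4 series p-MOS transistors are allowed.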

#### 3.1. Content of the library

The cell library developed at IPI RAN for designing SI-circuits includes more than 200 cells:

- Combinational cells that extend the capabilities of standard cell libraries for efficient design of SI-circuits,
- Dual-rail 2:1 multiplexers,
- Converters of a bi-phase signal (an output of a bistable cell) to a dual-rail signal,
- Digital comparators of dual-rail signals,
- Majority cells for unary and dual-rail signals,
- Indicators, including hysteresis triggers (analogues of the C-element) with 2 to 4 inputs,
- Triggers with a unary input (D-triggers),
- RS-triggers,
- Counter triggers,
- Shift register bits,
- 2-input and 3-input "exclusive OR" cells.
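The behaviour of a hysteresis trigger from the list above can be sketched as a Muller C-element: the output switches only when all inputs agree and holds its value otherwise. This is a generic behavioural model, not the schematic of any library cell; the class name `CElement` is hypothetical.

```python
from typing import Sequence


class CElement:
    """Behavioural model of a Muller C-element (hysteresis trigger)."""

    def __init__(self, n_inputs: int = 2, initial: int = 0) -> None:
        if not 2 <= n_inputs <= 4:
            raise ValueError("this sketch covers 2 to 4 inputs")
        self.n_inputs = n_inputs
        self.output = initial

    def update(self, inputs: Sequence[int]) -> int:
        """Apply a new input vector and return the resulting output."""
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        if all(v == 1 for v in inputs):
            self.output = 1
        elif all(v == 0 for v in inputs):
            self.output = 0
        # Mixed inputs: the output keeps its previous value.
        return self.output


c = CElement(n_inputs=3)
trace = [c.update(v) for v in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 0)]]
```

In the trace, the output rises only after the last input has risen and falls only after the last input has fallen, which is what lets such a cell indicate that all monitored signals have completed their transitions.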

As an example, Figure 1 shows the circuit of one bit of an SI binary counter. Here T is the complementing input, C and P are the clear and preset inputs respectively, Q and QB are the data outputs, OT is the complementing output, and I is the indication output.

Multi-output library cells have an indication output that registers the termination of transients on all inputs, outputs, and internal signals of their circuits. All library cells are protected by Russian and foreign patents [5, 6].

Figure 1. Cell C0CP

The spacer in SI-circuits may be null (0) or unit (1), so most SI-library cells have a dual counterpart. For example, trigger D0SE10 (a latch with spacer 0, asynchronous set, and a write enable input) forms a dual pair with trigger D1SE10 (a latch with spacer 1); logic cell A2O3I is dual to cell O2A3I, and so on. This simplifies the design of practical SI-circuits for two reasons:

1) Combinational SI-circuits are implemented using the dual-rail signal discipline, where each signal is formed by a pair of cells performing dual functions,

2) Using cells with different spacer types simplifies matching the spacers of the circuit's components and decreases hardware complexity.
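The notion of dual functions can be checked exhaustively on small cells. The sketch below uses two hypothetical three-input cells, one AND-OR-NOT and one OR-AND-NOT; they are illustrative shapes only and are not claimed to be the functions of A2O3I or O2A3I.

```python
from itertools import product
from typing import Callable


def and_or_not(a: int, b: int, c: int) -> int:
    """Hypothetical AND-OR-NOT cell: NOT((a AND b) OR c)."""
    return 1 - ((a & b) | c)


def or_and_not(a: int, b: int, c: int) -> int:
    """Hypothetical OR-AND-NOT cell: NOT((a OR b) AND c)."""
    return 1 - ((a | b) & c)


def are_dual(f: Callable[..., int], g: Callable[..., int], arity: int) -> bool:
    """g is dual to f if g(x) == NOT f(NOT x) for every input vector x."""
    return all(
        g(*bits) == 1 - f(*(1 - b for b in bits))
        for bits in product((0, 1), repeat=arity)
    )


dual_pair = are_dual(and_or_not, or_and_not, arity=3)
```

Swapping AND and OR in a cell's structure yields its dual, which is why a library built on an AND-OR-NOT / OR-AND-NOT basis naturally comes in dual pairs.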

Of most interest are the cells that implement an interface between synchronous and SI-circuits or between remote parts of an SI-circuit, as well as triggers with a forced output.

#### 3.2. Interface cells

One feature of SI-circuits is that the number of information signals is doubled compared with a synchronous analogue, because a self-timed code (bi-phase, dual-rail) is used. Multibit signal buses become twice as wide. They increase the VLSI die area and complicate the interface between remote parts of the layout.

The same problem occurs when interfacing an SI-unit with its synchronous environment: the synchronous environment is forced to generate and send bi-phase signals to the SI-circuit instead of unary ones.

The developed variants of a trigger with a unary input, an analogue of the synchronous D-trigger, resolve this problem. Figure 2 shows one such trigger, flip-flop D1RE21 with unit spacer. Here D is the unary data input, E acts as the write enable signal, and R is an asynchronous reset. Q and QB form the bi-phase output, I is the indication output, and EB is the phase output.

A low level on the write enable input E confirms that the value of the information signal D is valid. Regardless of the delay of the inverter INV, the trigger captures the correct information and changes the level of output I only after the transients in all elements within the trigger have terminated.

The single-stage indicator AOAOI7 included in the described library provides correct indication of both phases of the trigger's operation.



Figure 2. Cell D1RE21

Figure 3 shows trigger D0RE21 with null spacer, which is dual to trigger D1RE21.





Cells of the same type as D1RE21 and D0RE21 halve the number of layout wires, reduce power consumption, and improve chip routability.

#### 3.3. Triggers with forced output

Triggers with a forced output are also of great importance for the implementation of SI-circuits. Signals with a large fan-out should be generated by powerful drivers to provide high circuit performance. For control (phase) signals, this problem is resolved by using inverters with high drive strength. Bi-phase signals (trigger outputs), however, cannot be boosted with inverters, because signals generated in this way require additional nontrivial hardware for their indication.

The problem of strengthening an information signal can be resolved by directly increasing the drive strength of the cell that forms the signal, that is, by widening the corresponding CMOS transistors. However, this approach leads to an unreasonable increase in layout area and to higher power consumption of the circuit.

The described library contains cells that resolve this problem. Figure 4 shows the circuit of one such trigger, flip-flop R0C23, while Figure 5 shows its latch analogue, R0C13.



Figure 4. Cell R0C23



Figure 5. Cell R0C13

Here inputs R and S form a dual-rail signal with null spacer; C is the self-timed reset input; Q and QB, generated by inverters with a specified drive strength, form a dual-rail signal whose phase is indicated by output I. Output I also indicates all inputs of the trigger, as well as the outputs of all components within the trigger.

#### 3.4. Characterization of the cell library

All cells of the described library successfully passed the self-timed feature analysis performed by the ASPECT program [7]. However, this is not sufficient to admit the library for practical use.

To integrate the developed libraries into industrial CAD systems, one must prepare standardized files that define the cell models and contain the electrical and timing parameters of the cells. In other words, all library cells must be characterized. Manual characterization is a labor-intensive procedure: characterizing a single cell may take several man-days. Taking into account possible corrections to both the schematic and the layout of a characterized cell, the required labor and time costs may increase many times over.

Various characterization tools for standard libraries are known today, for example [8]. However, they do not take into account the specifics of SI-cell operation and therefore do not produce an adequate functional model of SI-cells.

The authors have developed a software tool, STERH [9], that characterizes SI-cells. The tool automatically calculates the electrical and timing parameters of a CMOS cell from its netlist, including the parasitic capacitors and resistors extracted from the cell layout. As a result, STERH generates output files with cell models both in the LIBERTY format and in the Verilog language. These files are compatible with modern industrial CAD systems.
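One typical timing parameter obtained during characterization is the propagation delay between the 50% crossings of an input and an output waveform. The sketch below shows this generic measurement on sampled waveforms; it is an illustration under assumed data and is not a description of how STERH computes its results. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def crossing_time(times: Sequence[float], volts: Sequence[float], level: float) -> float:
    """First time the sampled waveform crosses `level` (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if (v0 - level) * (v1 - level) <= 0 and v0 != v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the level")


def propagation_delay(times: Sequence[float], v_in: Sequence[float],
                      v_out: Sequence[float], vdd: float) -> float:
    """Delay between the 50% crossings of the input and output waveforms."""
    half = vdd / 2
    return crossing_time(times, v_out, half) - crossing_time(times, v_in, half)


# Hypothetical samples (time in ns, voltage in V): rising input, falling output.
t = [0.0, 0.1, 0.2, 0.3, 0.4]
vin = [0.0, 0.9, 1.8, 1.8, 1.8]
vout = [1.8, 1.8, 1.8, 0.0, 0.0]
delay = propagation_delay(t, vin, vout, vdd=1.8)  # in ns
```

For an SI-cell the same measurement would presumably have to be repeated for the indication output in both the working and spacer phases, since the cell model must reflect completion of each phase.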

#### 3.5. Cell library approbation

The described cell library for designing SI-circuits is implemented in CMOS processes with various feature sizes:

- 1.5 μm, for semicustom circuits on base of gate arrays of 5503 – 5509 series (MIET) [10],
- 0.18 μm, for semicustom circuits on base of gate arrays of 5521 series (MIET),
- 0.18 μm, for custom VLSI,
- 65 nm, for custom VLSI.

The library for designing semicustom circuits based on gate arrays of the 5503 – 5509 series is a part of the CAD system "Kovcheg" (MIET). It was certified on a set of test chips and has been validated in practice in a manufactured SI-microcore [4]. A comparison of the synchronous and SI versions of the microcore, manufactured in the same technology and in a single technology cycle, has demonstrated the advantages of the SI-microcore in both performance and power consumption.

The LIBERTY files that define the models of the SI-cells from the described library, obtained by characterization for the 180 nm and 65 nm technologies, were used in the design of a coprocessor [11] and a fused multiply-add unit [12].

# 4. Conclusion

The SI-cell library developed at IPI RAN for designing SI-circuits includes more than 200 cells. It extends standard cell libraries and reduces hardware costs, while increasing the performance and lowering the power consumption of SI-circuits.

The scientific novelty lies in the design of truly speed-independent solutions for interfacing with a synchronous environment and for driving large loads. The practical value of the work is confirmed by a set of SI-units fabricated in 180 nm and 65 nm standard CMOS processes on the basis of this SI-library.

The developed library meets the criteria for building SI-circuits that are optimal for constructing trusted fail-safe computing hardware and supercomputers.

# 5. References

[1] Y.A. Stepchenkov, Y.G. Diachenko, and G.A. Gorelkin, "Self-Timed Circuits are a Future of Microelectronics", *Radio electronic questions*, CSRI "Electronics", Moscow, 2011, No.2, pp.153-184 (in Russian).

[2] D. Muller, and W. Bartky, "A Theory of Asynchronous Circuits", *Annals of computation laboratory of Harvard University*, V.29, 1959, pp. 204-243.

[3] Varshavsky, V., M. Kishinevsky, V. Marakhovsky et al., *Self-Timed Control of Concurrent Processes*, Kluwer Academic Publishers, Dordrecht, Netherlands, 1990.

[4] Y.A. Stepchenkov, Y.G. Diachenko, and V.S. Petrukhin, "An Experience on Designing Self-Timed Microcontroller Core on Gate Array Chip", *Nano- and micro-system technique*, Moscow, 2006, No.5, pp.29-36 (in Russian).

[5] I.A. Sokolov, Y.A. Stepchenkov, and Y.G. Dyachenko, Self-Timed RS-Trigger With the Enhanced Noise Immunity, US Patent №8232825, 2012.

[6] I.A. Sokolov, Y.A. Stepchenkov, and Y.G. Dyachenko, *Self-Timed Trigger with Single-Rail Data Input*, US Patent №8324938, 2012.

[7] Y.V. Rozhdestvenskij, N.V. Morozov, and A.V. Rozhdestvenskene, "ASPECT: A Subsystem for Event Analysis of Self-Timed Circuits", *Perspective micro- and nanoelectronics systems development problems*, Moscow, 2010, pp. 26–31 (in Russian).

[8] *Encounter Library Characterizer*, Cadence, http://www.cadence.com/rl/Resources/datasheets/ library characterizer ds.pdf (last accessed 21.06.2015).

[9] N.V. Morozov, Y.G. Diachenko, D.Y. Stepchenkov, and Y.A. Stepchenkov, "System of Self-Timed Cells Characterization", *The Systems and Means of Informatics*, Moscow, 2012, V.22, pp. 38-48 (in Russian).

[10] Stepchenkov, Y. A., A.N. Denisov, Y.G. Diachenko, et al., *Cell Library for Designing Self-Timed 5503/5507 and 5508/5509 Gate-Arrays*, Moscow, 2013, ISBN 978-5-91993-027-3 (in Russian).

[11] Y.A. Stepchenkov, Y.G. Diachenko, V.N. Zakharov, Y.V. Rogdestvenski, N.V. Morozov, and D.Y. Stepchenkov, "Quasi-Delay-Insensitive Computing Device: Methodological Aspects and Practical Implementation", *Lecture Notes in Computer Science*, V. 5953, 2010, pp. 276-285.

[12] Y.A. Stepchenkov, Y.G. Diachenko, Y.V. Rogdestvenski, N.V. Morozov, D.Y. Stepchenkov, A.V. Rogdestvenskene, and A.V. Surkov, "Self-Timed Accumulating Multiplier: Implementation Variants", *The Systems and Means of Informatics*, Moscow, 2014, V.24, No.3, pp. 63-77 (in Russian).