Integration, the VLSI Journal

Archives Papers: 544
Exploring N3ASIC technology for microwave imaging architectures
Fabrizio Riente; Andrea Giordano; Marco Vacca; Mariagrazia Graziano;
Abstract: When studying a new technology, a critical issue is to understand its performance with respect to CMOS circuits. In this work, we analyze the performance of Nanoscale Application Specific Integrated Circuits (NASIC) and their CMOS-friendly implementation, the N3ASIC. We have created a circuit model in VHDL. The model includes the estimation of area and power consumption of devices and blocks that can be hierarchically connected to form complex circuits. Using this model, we designed a hardware accelerator for an image reconstruction algorithm for biomedical applications. We synthesized the same architecture with a 45 nm CMOS and a 7 nm FinFET library to make a comparison. The results obtained for the N3ASIC are very interesting, showing a substantial reduction in both circuit power and area.
Adaptive prediction resolution video coding for reduced DRAM bandwidth
L. Pearlstein; S. Maxwell; A. Aved;
Abstract: We present a novel approach to video coding that can dramatically reduce decoder DRAM bandwidth requirements while incurring a minimal reduction in compression efficiency, and which may yield an increase in compression efficiency under certain circumstances. Our approach is based on the principle that areas of video pictures where there is high motion are typically captured with significant blur along the direction of motion. This blur permits the judicious use of reduced resolution reference pictures for prediction without significantly reducing prediction quality. Our approach makes it feasible to limit worst-case DRAM bandwidth through the use of reasonably sized on-chip caches for pixel data, which can lead to provably compliant real-time behavior. The reductions in DRAM bandwidth can be expected to yield commensurate reductions in DRAM power dissipation, and consequently improvements in battery life for mobile devices. Although we concentrate our analysis on decoders, our approach can yield even greater advantages for encoders, which require additional bandwidth for motion estimation. We show that the compression efficiency of our encoder can approach that of a standard reference encoder on natural video sequences, but that it may fall short by a modest amount on pathological synthetic sequences.
Improving performance of FPGA-based SR-latch PUF using Transient Effect Ring Oscillator and programmable delay lines
Amir Ardakani; Shahriar B. Shokouhi; Arash Reyhani-Masoleh;
Abstract: In this paper, we propose a new structure of SR-Latch Physically Unclonable Function (PUF) based on the Transient Effect Ring Oscillator (TERO). Our proposed TERO-based scheme combines the features of two different programmable delay lines (PDLs) and generates the response bits by comparing the number of oscillations of the SR-Latches during the metastable state. The proposed scheme reduces the impact of environmental noise to increase the reliability of the response bits. Also, our proposed area-efficient PUF architecture has low complexity and hence consumes less power than its counterparts. Moreover, we investigate the impact of systematic variation on the uniqueness of the response bits. We have used an optimized placement (in terms of area cost) for the proposed structure and implemented our proposed scheme on Spartan3 FPGA boards. The implemented structure demonstrates strong performance metrics, such as a uniqueness of 49.32%. In addition, the proposed structure provides higher reliability when tuned with PDLs. Hence, the need for complex error correcting codes is reduced. This makes the scheme appropriate for low-cost authentication and cryptographic applications.
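The abstract reports uniqueness as a percentage without giving the formula. A minimal Python sketch follows, assuming the commonly used definition: the average pairwise Hamming distance between the responses of different devices to the same challenge, normalized by response length, with 50% as the ideal. The function names and the example bit strings are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count the bit positions in which two equal-length responses differ."""
    if len(a) != len(b):
        raise ValueError("responses must have equal length")
    return sum(x != y for x, y in zip(a, b))


def uniqueness_percent(responses: list[str]) -> float:
    """Average normalized inter-device Hamming distance, in percent.

    Assumed definition (not stated in the abstract): mean over all device
    pairs of HD(r_i, r_j) / n_bits, multiplied by 100.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        raise ValueError("need at least two responses")
    n_bits = len(responses[0])
    total = sum(hamming_distance(a, b) / n_bits for a, b in pairs)
    return 100.0 * total / len(pairs)


# Made-up 8-bit responses from three hypothetical devices.
example_responses = ["10110010", "01100110", "11011000"]
example_uniqueness = uniqueness_percent(example_responses)
```

Under this definition, a value near 50% means that any two devices disagree on about half of their response bits, which is why a figure such as 49.32% is read as close to ideal.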
Low overhead online periodic testing for GPGPUs
Mohammad Abdel-Majeed; Waleed Dweik;
Abstract: GPGPUs are used to run a wide range of applications due to their high performance. As technology scales down, processing units become more susceptible to different types of faults due to radiation, manufacturing defects, wearout and aging. Some faults are detected and fixed before deployment, while others appear during in-field operation. To address in-field faults, continuous and periodic testing mechanisms are leveraged. In this paper, we propose a GPGPU-specific technique that takes advantage of two GPGPU workload characteristics to reduce the performance overhead of periodic testing. First, we show that many GPGPU workloads exhibit a high probability of input similarity between threads of the same warp. Second, we show that GPGPU workloads have noticeable variation in the thread activity of their warps. Traditional periodic testing mechanisms, when applied to GPGPUs, fail to exploit these observations. Hence, we propose the half-SP deactivation technique to exploit these workload characteristics and reduce the performance overhead of periodic testing in GPGPU platforms. The results show that the proposed technique can reduce the performance overhead of testing from 29% to 8% with less than 1.8% area and power overheads.
A novel design of a ternary coded decimal adder/subtractor using reversible ternary gates
Mohammad Mehdi Panahi; Omid Hashemipour; Keivan Navi;
Abstract: In recent years, considerable interest has been shown in reversible circuits. Their applications in distinct fields, including low-power digital circuit design, computational circuit design for quantum computers, and DNA-based computation, are of high significance. Because of the advantages of ternary circuits over binary circuits, such as reduced interconnect complexity, smaller chip area, and fewer quantum cells in quantum circuits, ternary logic is suggested for constructing new compact circuits. Also, in quantum technology, new circuits can be implemented in ternary logic, without additional restrictions, using the same physical phenomena with which binary circuits are implemented. Circuit design for decimal calculations, which includes addition and subtraction of decimal numbers, has continually been of interest to digital circuit designers. In reversible ternary computation, reversible circuits for decimal calculations have been less studied. In this paper, a reversible ternary adder/subtractor circuit for the addition/subtraction of decimal digits in radix three is proposed. Ternary Coded Decimal (TCD) codes are used to represent decimal inputs and outputs. In the circuit implementation, first, by removing the unused inputs and outputs of the ternary adders required in the TCD Adder, three reversible 3-qutrit ternary adder blocks with quantum costs of 29, 22, and 14 and constant inputs of 0, 1, and 0 were presented; then an optimal circuit for the TCD detector with a quantum cost of 16 was introduced. The proposed TCD detector has a 23% improvement in quantum cost compared to the existing design. By applying these in the design of the reversible TCD Adder, a more optimal reversible TCD Adder circuit than the existing design was presented, resulting in a 31% improvement in quantum cost and a 58% improvement in the number of constant inputs.
Finally, a decimal 9's complement circuit for the subtraction of two decimal numbers was proposed. The proposed TCD Adder/Subtractor uses the newly proposed TCD Adder and decimal 9's complement circuits. To realize all proposed circuits, 1-qutrit shift gates and 2-qutrit Muthukrishnan-Stroud gates, which are realized in ion-trap technology in quantum computers, were used.
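The arithmetic behind a decimal adder/subtractor of this kind can be illustrated in ordinary (non-reversible) software. The Python sketch below is only a functional model under stated assumptions: it assumes each decimal digit is encoded as its plain three-trit base-3 representation (the abstract does not give the exact TCD code), and it subtracts by adding the 9's complement plus one and discarding the final carry. All function names are hypothetical, and nothing here models the reversible gates or quantum costs.

```python
def to_tcd(digit: int) -> tuple[int, int, int]:
    """Encode a decimal digit 0..9 as three trits, most significant first.

    Assumption: plain base-3 encoding, so 9 -> (1, 0, 0).
    """
    if not 0 <= digit <= 9:
        raise ValueError("digit must be in 0..9")
    return (digit // 9, (digit // 3) % 3, digit % 3)


def from_tcd(trits: tuple[int, int, int]) -> int:
    """Decode three trits back to a decimal digit, rejecting unused codes."""
    t2, t1, t0 = trits
    value = 9 * t2 + 3 * t1 + t0
    if value > 9:
        raise ValueError("not a valid decimal digit code")
    return value


def nines_complement(digits: list[int]) -> list[int]:
    """Digit-wise 9's complement of a decimal number."""
    return [9 - d for d in digits]


def add_decimal(a: list[int], b: list[int], carry_in: int = 0):
    """Add two equal-length digit lists (most significant digit first)."""
    result, carry = [], carry_in
    for x, y in zip(reversed(a), reversed(b)):
        s = x + y + carry
        # Over-9 check; assumed here to correspond to the detector's role.
        carry = 1 if s > 9 else 0
        result.append(s - 10 if carry else s)
    return list(reversed(result)), carry


def subtract_decimal(a: list[int], b: list[int]) -> list[int]:
    """Compute a - b (requires a >= b) as a + nines_complement(b) + 1."""
    total, carry = add_decimal(a, nines_complement(b), carry_in=1)
    if carry != 1:
        raise ValueError("this sketch requires a >= b")
    return total  # the final carry is discarded
```

For example, `subtract_decimal([7, 2], [4, 5])` evaluates 72 + 54 + 1 = 127 and drops the leading carry, leaving the digits of 27.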
NV-TCAM: Alternative designs with NVM devices
Ismail Bayram; Yiran Chen;
Abstract: TCAM (ternary content addressable memory) is a special memory type that can compare input search data with stored data and return the location (and sometimes the associated content) of the matched data. TCAM is widely used in microprocessor designs as well as in communication chips, e.g., for IP routing. Following technology advances in emerging nonvolatile memories (eNVM), applying eNVM to TCAM designs has become attractive for achieving high density and low standby power. In this paper, we examine the applications of three promising eNVM technologies, i.e., magnetic tunneling junction (MTJ), memristor, and ferroelectric memory field effect transistor (FeFET), in the design of nonvolatile TCAM cells. All these technologies can achieve close-to-zero standby power, though each of them has very different pros and cons.
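The search behavior described above can be sketched as a small functional model in Python. This is an illustration only: a hardware TCAM compares all entries in parallel, whereas the loop below is sequential, and the table contents, the `X` don't-care symbol, and the function names are assumptions made for the example rather than details from the paper.

```python
def entry_matches(pattern: str, key: str) -> bool:
    """True if every stored symbol is 'X' (don't care) or equals the key bit."""
    return len(pattern) == len(key) and all(
        p in ("X", k) for p, k in zip(pattern, key)
    )


def tcam_search(table: list[str], key: str) -> list[int]:
    """Return the locations (indices) of all stored entries matching the key."""
    return [i for i, pattern in enumerate(table) if entry_matches(pattern, key)]


def tcam_first_match(table: list[str], key: str):
    """Return the lowest matching index, or None when nothing matches.

    Assumption: lower index means higher priority, as is typical when
    longer routing prefixes are stored first.
    """
    hits = tcam_search(table, key)
    return hits[0] if hits else None


# Hypothetical 8-bit routing-style table, longest prefix first.
example_table = ["1010XXXX", "10XXXXXX", "XXXXXXXX"]
```

With this table, the key `10101100` matches all three entries and resolves to index 0, while `10011100` skips the first entry and resolves to index 1.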
Hardware-assisted Verilog simulation system using an application specific microprocessor
Tze Sin Tan; Bakhtiar Affendi Rosdi;
Abstract: Verilog is a Hardware Description Language (HDL) used for VLSI design and modeling. A software-based Verilog simulator running on a general-purpose computer is the dominant simulation platform. However, this platform is throughput-limited when simulating next-generation designs. Commercial hardware-assisted solutions are proprietary and have various limitations. A hardware-assisted platform using an application-specific simulation processor is proposed in this paper. The program flow of this processor is driven by HDL simulation semantics. The microprocessor is customized to support Verilog operations with computation using the language's native data types (0, 1, X, Z) from behavioral to gate-level abstraction, including delay and signal strength modeling. In addition, fine-grained parallel event dispatch and hardware-augmented netlist traversal are acceleration features built into the microprocessor. A prototype was built on an FPGA to demonstrate system viability. Benchmarking against a software-based compiled-code simulator showed up to a 9-fold improvement in simulation time, even though only limited basic speed improvement techniques were implemented. Capacity scalability can be achieved through parallel processing and memory expansion. The system offers a speed improvement over software-based simulators while retaining the same usability. This leaves substantial room for further improvement to meet future simulation needs.
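To make the four-value data type concrete, the Python sketch below models Verilog's bitwise AND, OR, and NOT over the values 0, 1, X (unknown), and Z (high impedance), following the standard truth tables in which a Z input to a gate is treated like X. It is a software illustration only; the function names are hypothetical and it does not reflect how the proposed processor implements these operations, nor does it model delays or signal strengths.

```python
VALID = ("0", "1", "X", "Z")


def _check(*values: str) -> None:
    """Reject anything that is not one of the four logic values."""
    for v in values:
        if v not in VALID:
            raise ValueError(f"not a four-value logic symbol: {v!r}")


def v_and(a: str, b: str) -> str:
    """Four-value AND: a 0 on either input dominates; otherwise unknown unless both are 1."""
    _check(a, b)
    if a == "0" or b == "0":
        return "0"
    if a == "1" and b == "1":
        return "1"
    return "X"


def v_or(a: str, b: str) -> str:
    """Four-value OR: a 1 on either input dominates; otherwise unknown unless both are 0."""
    _check(a, b)
    if a == "1" or b == "1":
        return "1"
    if a == "0" and b == "0":
        return "0"
    return "X"


def v_not(a: str) -> str:
    """Four-value NOT: X and Z both invert to X."""
    _check(a)
    return {"0": "1", "1": "0"}.get(a, "X")
```

Because a controlling value decides the output regardless of the other input, `v_and("0", "Z")` is `"0"` while `v_and("1", "Z")` is `"X"`.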
Aging mitigation of L1 cache by exchanging instruction and data caches
Mohammad Sadeghi; Hooman Nikmehr;
Abstract: As transistor dimensions shrink, some aging effects that were negligible in previous technologies have become the most serious reliability threats. The most important aging process is Bias Temperature Instability (BTI). This phenomenon dissociates the chemical bonds between the silica lattice and hydrogen at the interface between the gate insulator and the transistor channel. SRAM structures are vulnerable to BTI due to the long stress time on their transistors during normal system operation. Regardless of the values stored in the SRAM cells, the cross-coupled structure of the cell leads to continuous stress on at least two of its transistors at any given time. Many aging avoidance and aging mitigation techniques for SRAM structures, such as cache memories, have been proposed in the literature. However, their overheads or their efficiency are not acceptable in many cases. This paper proposes an aging mitigation mechanism for cache memories that periodically exchanges the values stored in the data and instruction caches. The results show that while the area and performance overheads of the proposed mechanism are negligible (less than 1%), the average stress time on the cache memories decreases by 2.39×, which prolongs the Mean Time to Failure (MTTF) of the memory structure by 1.91×.
Offline Testing of Reversible Logic Circuits: An Analysis
Hari Mohan Gaur; Ashutosh Kumar Singh; Umesh Ghanekar;
Abstract: Reversible logic is at the forefront of meeting the ever-changing demands of electronic devices, with applications to quantum computation. The change in technology gives rise to new challenges; consequently, numerous fault models have come into existence, and testing plays a significant role in achieving the desired results. Several testing methodologies have been proposed for identifying different types of fault models in reversible logic circuits, and they are evaluated on various performance parameters. We bring together information on fault models, performance parameters and offline testing approaches from the literature, where the aim is to obtain a near-optimal solution by efficiently exploring the entire space. The paper critically analyses a range of testing strategies reported by researchers, presented in two broad classifications, namely automatic test pattern generation (ATPG) and design for testability (DFT) methodologies. All the methods are explained in detail with a brief illustration. Comparison results are presented in tabular form, highlighting the preeminent methodologies on the basis of performance parameters.
MOSCAP compensation of three-stage operational amplifiers: Sensitivity and robustness, modeling and analysis
Mohammad Danaie; Esmaeel Ranjbar; Mojtaba Ahmadieh Khanesar;
Abstract: Using MOSCAPs for compensation can reduce the area needed for their implementation. However, these capacitors are highly nonlinear and their value changes when the voltage across their terminals is changed. Different compensation topologies do not exhibit equal sensitivity to the time-dependent variation of MOSCAP compensation capacitors. Therefore, the selection of a proper compensation technique is an important step before implementation. In this paper, the goal is to study the existing compensation techniques and their behavior in the presence of compensation capacitor uncertainty and MOSCAP nonlinearity. To this end, opamps are first designed for equal performance metrics such as gain-bandwidth product, phase margin and settling time. Then the sensitivities of the different structures are evaluated. Ultimately, the least sensitive structures can be chosen as the best choice for implementation. To the best of the authors' knowledge, no such thorough analysis has been performed before. A state-space model is suggested to analyze different compensation topologies, calculate the harmonic distortions, and find the most robust structures. These results are compared with circuit-level results. It is shown that the CFCC compensation method yields the least harmonic distortion.