International Workshop on Emerging Memory Solutions
Year of publication
- 2016 (13)
Language
- English (13)
Keywords
- Cache (3)
- SRAM (3)
- DRAM (2)
- PIM (2)
- CAM (1)
- Data retention voltage (DRV) (1)
- Emerging Memories (1)
- Error Correction Codes (1)
- Error correcting coding (ECC) (1)
- Flash Memories (1)
Magnetic spin-based memory technologies are a promising way to overcome the approaching limits of microelectronics. Nevertheless, their long write latency and high write energy compared to SRAM make them difficult to use for fast microprocessor memories such as L1 caches. The recent advent of Spin Orbit Torque (SOT) technology changes this picture: it potentially offers a write speed comparable to SRAM, a much better density than SRAM, and infinite endurance. This paves the way to a new paradigm in processor architectures, introducing non-volatility at all levels of the memory hierarchy and moving towards fully normally-off, instant-on processors. This paper presents a full design flow, from device to system, for evaluating the potential of SOT for microprocessor cache memories, together with very encouraging simulation results obtained with this framework.
Lowering the supply voltage of Static Random-Access Memories (SRAM) is key to reducing power consumption. However, because it degrades circuit performance, it can lead to various forms of loss of functionality. In this work, we present silicon results showing significant yield improvement, achieved with write and read assist techniques on a 6T high-density bitcell manufactured in 40 nm technology. The data are successfully modeled with an original SPICE-based method that reproduces, with high computing efficiency, the effects of static negative-bitline write assist and of static wordline-underdrive read assist. The effects on yield of read-ability losses due to low-voltage operation are not taken into account in the model.
Memory accesses are the bottleneck of modern computer systems in terms of both performance and energy. This barrier, known as "the Memory Wall", can be broken by utilizing memristors. Memristors are novel passive electrical components whose resistance varies with the charge that has passed through the device [1]. In this abstract, the term "memristor" also covers an extension of that definition, memristive devices, which vary their resistance depending on a state variable [2]. While memristors are naturally used as memory cells, they can also be used for other applications, such as logic circuits [3].
We present a novel architecture that redefines the relationship between the memory and the processor by enabling data processing within the memory itself. Our architecture is based on a memristive memory array, in which we perform two basic logic operations: Imply (material implication) [4] and False.
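To illustrate why these two operations suffice as a basis for computation, the following minimal sketch models them as Boolean functions and composes a NAND gate from them. It is a purely logical model, not the authors' circuit: the function names `false_op`, `imply`, and `nand` are hypothetical, and the sketch assumes the usual convention that the result of IMPLY overwrites its second operand.

```python
def false_op() -> bool:
    """FALSE: unconditionally reset a cell to logical 0."""
    return False


def imply(p: bool, q: bool) -> bool:
    """IMPLY (material implication): returns (NOT p) OR q.

    In an in-memory implementation the result would overwrite the cell
    holding q; here it is simply returned (an assumption of this model).
    """
    return (not p) or q


def nand(p: bool, q: bool) -> bool:
    """Build NAND from one FALSE and two IMPLY steps on a work cell s."""
    s = false_op()   # s = 0
    s = imply(q, s)  # s = NOT q
    s = imply(p, s)  # s = (NOT p) OR (NOT q) = NAND(p, q)
    return s


# Full truth table of the composed gate.
truth_table = {(p, q): nand(p, q) for p in (False, True) for q in (False, True)}
```

Because NAND is functionally complete, any Boolean function can in principle be composed from sequences of these two operations, at the cost of additional work cells and steps.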
This paper briefly discusses a new architecture, the Computation-In-Memory (CIM) architecture, which performs “processing-in-memory”. It is based on integrating storage and computation in the same physical location (a crossbar topology) and on using non-volatile resistive-switching technology (memristive devices, or memristors for short) instead of CMOS technology. The architecture has the potential to improve the energy-delay product, computing efficiency, and performance per area by at least two orders of magnitude.
In this paper, we show the feasibility of a low supply voltage for SRAM (Static Random Access Memory) by adding error correction coding (ECC). In SRAM, the memory matrix needs to be powered for data-retentive standby operation, which results in standby leakage current. Particularly for low duty-cycle systems, the energy consumed by standby leakage current can become significant. Lowering the supply voltage (VDD) during standby mode to below the specified data retention voltage (DRV) helps decrease the leakage current. At these VDD levels errors start to appear, which we can remedy by adding ECC. We show in this paper that adding a simple single-error-correcting (SEC) ECC enables us to decrease the leakage current by 45% and the leakage power by 72%. We verify this on a large set of commercially available standard 40 nm SRAMs.
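As a rough illustration of what single-error correction means for a stored word, the sketch below uses a Hamming(7,4) code. The abstract does not state which SEC code or word length is used, so the choice of Hamming(7,4) and the names `encode` and `correct` are assumptions made only for this example.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Bit positions are 1..7; parity bits sit at positions 1, 2 and 4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    # The syndrome is the XOR of the (1-based) positions holding a 1.
    # It is 0 for a valid codeword, else the position of the flipped bit.
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position
    fixed = list(word)
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return [fixed[2], fixed[4], fixed[5], fixed[6]]


# Simulate one retention error, as might occur when VDD drops below DRV.
stored = encode([1, 0, 1, 1])
stored[5] ^= 1  # flip the bit at position 6
recovered = correct(stored)
```

A SEC code of this kind recovers the data as long as no more than one bit per codeword fails, which is why it can tolerate the sparse errors that first appear just below the DRV.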
Emerging Memories (EMs) could benefit from Error Correcting Codes (ECCs) able to correct a few errors in a few nanoseconds. The low latency is necessary to meet the DRAM-like and/or eXecute-in-Place requirements of Storage Class Memory devices. The error correction capability would help manufacturers to cope with unknown failure mechanisms and to fulfill the market demand for a rapid increase in density. This paper shows the design of an ECC decoder for a shortened BCH code with a 256-data-bit page, able to correct three errors in less than 3 ns. The tight latency constraint is met by pre-computing the coefficients of carefully chosen Error Locator Polynomials, by optimizing the operations in the Galois Fields, and by resorting to a fully parallel combinational implementation of the decoder. The latency and the area occupancy are first estimated from the number of elementary gates to traverse and from the total number of elementary gates in the decoder. Finally, implementing the solution with the Synopsys topographical synthesis methodology in a CMOS technology with 54 nm logic gate length gives a latency lower than 3 ns and a total area of less than \(250 \cdot 10^3 \mu m^2\).
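The abstract does not give the field size or codeword length, but they can be estimated under a standard assumption: a binary primitive BCH code over GF(2^m) has length 2^m - 1 and needs at most m·t parity bits to correct t errors. The sketch below, with the hypothetical helper `shortened_bch_parameters`, finds the smallest such field that fits the data and reports the resulting shortened code. The numbers are an estimate under that assumption, not figures taken from the paper.

```python
def shortened_bch_parameters(data_bits: int, t: int):
    """Estimate parameters of a shortened binary BCH code.

    Assumes a primitive BCH code of length 2**m - 1 that uses m * t
    parity bits (the standard upper bound). Returns the field degree m,
    the parity-bit count, and the shortened codeword length.
    """
    m = 2
    # Grow the field until the full-length code can hold all data bits.
    while (2**m - 1) - m * t < data_bits:
        m += 1
    parity_bits = m * t
    return m, parity_bits, data_bits + parity_bits


m, parity_bits, codeword_bits = shortened_bch_parameters(data_bits=256, t=3)
# Under the stated assumption: m = 9, 27 parity bits, 283-bit codeword.
```

With m = 8 the full code would carry only 255 - 24 = 231 data bits, which is why a 256-bit page pushes the design into GF(2^9) arithmetic.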
Multiple-channel die-stacked DRAMs have been used to maximize the performance and minimize the power of memory accesses in 2.5D/3D system chips. Stacked DRAM dies can be used as a cache for the processor die in such chips. Typically, modern processor systems-on-chip (SoCs) have three levels of cache: L1, L2, and L3. Which of these levels could the DRAM cache replace? In this paper, we derive an inequality that helps the designer check whether a given DRAM cache can provide better performance than the L3 cache. Design considerations for making DRAM caches meet the inequality are also discussed. We find that a dilemma between DRAM cache access time and associativity arises when trying to outperform the L3 cache. Organizing multiple channels into one DRAM cache is proposed to cope with this dilemma.
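The abstract does not reproduce the inequality itself. As a stand-in, the sketch below uses the textbook average memory access time (AMAT = hit time + miss rate × miss penalty) to compare a DRAM cache with an L3 cache. This formulation, the function names, and all numeric values are assumptions for illustration and are not the paper's derivation.

```python
def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time for a single cache level."""
    return hit_time_ns + miss_rate * miss_penalty_ns


def dram_cache_beats_l3(dram_hit_ns: float, dram_miss_rate: float,
                        l3_hit_ns: float, l3_miss_rate: float,
                        memory_ns: float) -> bool:
    """True if the DRAM cache gives a lower AMAT than the L3 cache,
    with both backed by the same main memory latency."""
    return (amat(dram_hit_ns, dram_miss_rate, memory_ns)
            < amat(l3_hit_ns, l3_miss_rate, memory_ns))


# Hypothetical numbers: a large DRAM cache misses less but hits slower.
fast_enough = dram_cache_beats_l3(25.0, 0.05, 10.0, 0.30, 100.0)
too_slow = dram_cache_beats_l3(40.0, 0.05, 10.0, 0.30, 100.0)
```

The two calls hint at the dilemma described above: measures that lower the miss rate, such as higher associativity, tend to lengthen the hit time, and past some point the longer hit time cancels the benefit.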
To continue reducing voltage in scaled technologies, both circuit-level and architecture-level resiliency techniques are needed to tolerate process-induced defects, variation, and aging in SRAM cells. Many different resiliency schemes have been proposed and evaluated, but most prior results focus on voltage reduction instead of energy reduction. At the circuit level, device cell architectures and assist techniques have been shown to lower Vmin for SRAM, while at the architecture level, redundancy and cache disable techniques have been used to improve resiliency at low voltages. This paper presents a unified study of error tolerance for both circuit and architecture techniques and estimates their area and energy overheads. Optimal techniques are selected by evaluating both the error-correcting abilities at low supply voltages and the overheads of each technique in a 28 nm technology. The results can be applied to many of the emerging memory technologies.
Three-dimensional (3D) integration using through-silicon vias (TSVs) has been used for memory designs. Content addressable memory (CAM) is an important component in digital systems. In this paper, we propose an evaluation tool for 3D CAMs, which helps the designer explore the delay and power of various partitioning strategies. Delay, power, and energy models of 3D CAMs for different architectures are built as well.
The energy efficiency of today’s microcontrollers is supported by the extensive use of low-power mechanisms. In many cases a full power-down requires a complex, and possibly error-prone, administration scheme, because data from the volatile memory have to be stored in a flash-based back-up memory. New types of non-volatile memory, e.g. in RRAM technology, are faster and consume a fraction of the energy of flash technology. This paper evaluates power gating for wireless sensor networks (WSN) with RRAM as back-up memory.
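Whether power gating pays off can be framed as a simple energy balance: the energy spent on backing up and restoring the volatile state must be smaller than the standby energy saved while the node sleeps. The sketch below computes the resulting break-even sleep time. The function name and all numeric values are hypothetical and are not taken from the paper.

```python
def break_even_sleep_time_s(backup_energy_j: float,
                            restore_energy_j: float,
                            retention_power_w: float,
                            gated_power_w: float = 0.0) -> float:
    """Minimum sleep duration for which power gating saves energy.

    retention_power_w: standby power when the memory stays powered.
    gated_power_w: residual power while power-gated (0 if fully off).
    """
    saved_power_w = retention_power_w - gated_power_w
    if saved_power_w <= 0:
        raise ValueError("power gating saves no standby power")
    return (backup_energy_j + restore_energy_j) / saved_power_w


# Hypothetical figures: a cheaper back-up memory shortens the break-even time.
costly_backup_s = break_even_sleep_time_s(2e-6, 1e-6, 1e-6)
cheap_backup_s = break_even_sleep_time_s(2e-7, 1e-7, 1e-6)
```

In this model a back-up memory with lower write and read energy makes power gating worthwhile for shorter sleep intervals, which matches the motivation for replacing flash with RRAM.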