



# A Timing-Based Split-Path Sensing Circuit for STT-MRAM

Bayartulga Ishdorj<sup>†</sup>, Jeongyeon Kim<sup>†</sup>, Jae Hwan Kim and Taehui Na \*

Department of Electronics Engineering, Incheon National University, Incheon 22012, Korea; tulga5111@gmail.com (B.I.); jyeon117@inu.ac.kr (J.K.); asrgzone@naver.com (J.H.K.)

\* Correspondence: taehui.na@inu.ac.kr; Tel.: +82-32-835-8452

† These authors contributed equally to this work.

**Abstract:** Spin-transfer torque magnetoresistive random access memory (STT-MRAM) has received considerable attention as a candidate for universal memory because it offers a cost advantage comparable to that of a dynamic RAM with fast performance comparable to that of a static RAM, while solving the scaling issues faced by conventional MRAMs. However, owing to the decrease in supply voltage ( $V_{DD}$ ) and increase in process fluctuations, STT-MRAMs require an advanced sensing circuit (SC) to ensure a sufficient read yield in deep submicron technology. In this study, we propose a timing-based split-path SC (TSSC) that achieves a greater read yield than a conventional split-path SC (SPSC) by employing a timing-based dynamic reference voltage technique to minimize the threshold voltage mismatch effects. Monte Carlo simulation results based on industry-compatible 28-nm model parameters reveal that the proposed TSSC obtains a 42% higher read access pass yield at a nominal $V_{DD}$ of 1.0 V compared to the SPSC in terms of iso-area and -power, at the cost of a 1.75× longer sensing time.

**Keywords:** dynamic reference voltage; read disturbance; read yield; sense amplifier; sensing circuit; spin-transfer torque magnetoresistive random access memory (STT-MRAM)



Citation: Ishdorj, B.; Kim, J.; Kim, J.H.; Na, T. A Timing-Based Split-Path Sensing Circuit for STT-MRAM. *Micromachines* **2022**, *13*, 1004. https://doi.org/10.3390/mi13071004

Academic Editor: Zhongrui Wang

Received: 3 June 2022 Accepted: 25 June 2022 Published: 26 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.



**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

# 1. Introduction

To prolong the battery life in lithium-ion battery-powered applications, such as smartphones, wearable devices, and wireless sensor nodes, it is crucial to achieve good performance with low power consumption [1]. Even though a conventional magnetoresistive random access memory (MRAM) has the speed of a static RAM (SRAM) and density of a dynamic RAM (DRAM), it has unique problems, including poor scalability and excessive power consumption owing to large write currents. Therefore, spin-transfer torque MRAM (STT-MRAM) has emerged as the top choice for universal memory applications owing to its short access time, low power consumption, and high density [2]. In addition, STT-MRAM has outstanding scalability, as the critical switching current of the magnetic tunnel junction (MTJ) decreases with device size to overcome the scaling problems faced by conventional memories, such as DRAM, SRAM, and flash memories [3–5]. In other words, STT-MRAM can achieve a higher performance than DRAM and smaller cell size than SRAM, with the nonvolatility of flash memory [6–9].

However, STT-MRAM faces a read yield degradation problem when used in deep submicron technologies because of the large process variations at a low supply voltage ( $V_{DD}$ ) and the small resistance difference between the low-resistance (state 0) and high-resistance (state 1) states of the MTJ [10,11]. The current differences ( $\Delta I_0$ or $\Delta I_1$ ) or voltage differences ( $\Delta V_0$ or $\Delta V_1$ ) generated in a sensing circuit (SC), which are then conveyed to a sense amplifier (SA), can be expressed as:

$$\Delta I_{0,1} = |I_{\text{data0,1}} - I_{\text{ref}}| \tag{1}$$

$$\Delta V_{0,1} = |V_{\text{data0,1}} - V_{\text{ref}}| = \Delta I_{0,1} \cdot r_{\text{O\_PLD}} \tag{2}$$

where $I_{data0}$ ( $I_{data1}$ ) and $I_{ref}$ are the currents flowing through the data cell in state 0 (state 1) and the reference cell, respectively. $V_{data0}$ ( $V_{data1}$ ) and $V_{ref}$ are the output voltages of the data and reference branches in the SC, respectively, and $r_{\text{O\_PLD}}$ is the output resistance of the load PMOS. Assuming a fixed $V_{ref}$, $\Delta V_{0,1}$ is approximately half of the difference between $V_{data0}$ and $V_{data1}$ because $V_{ref}$ is ideally equal to ( $V_{data0} + V_{data1}$ )/2 [10]. To overcome the read yield degradation, a higher $\Delta V_{0,1}$ and a lower impact of process variation are required in deep submicron technology nodes. However, $\Delta V_{0,1}$ is limited by $V_{DD}$ and the current, and the variations in the process parameters inevitably increase as the technology node scales down.
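As a numerical illustration of (1) and (2), the following minimal sketch converts a current difference into a voltage difference through the output resistance of the load PMOS. The current and resistance values are hypothetical and chosen only for illustration; they are not taken from the simulations in this paper.

```python
def delta_i(i_data: float, i_ref: float) -> float:
    """Current difference of (1), in amperes."""
    return abs(i_data - i_ref)


def delta_v(i_data: float, i_ref: float, r_o_pld: float) -> float:
    """Voltage difference of (2): Delta I multiplied by r_O_PLD (ohms)."""
    return delta_i(i_data, i_ref) * r_o_pld


# Hypothetical values (assumptions, not simulation results).
I_DATA0 = 30e-6   # data-cell current at state 0 [A]
I_DATA1 = 20e-6   # data-cell current at state 1 [A]
I_REF = 25e-6     # reference current midway between the two [A]
R_O_PLD = 20e3    # load PMOS output resistance [ohm]

dv0 = delta_v(I_DATA0, I_REF, R_O_PLD)
dv1 = delta_v(I_DATA1, I_REF, R_O_PLD)
print(f"dV0 = {dv0 * 1e3:.0f} mV, dV1 = {dv1 * 1e3:.0f} mV")
```

With a reference current placed midway between the two data currents, both states see the same margin, which is the ideal fixed-reference case described above.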

Furthermore, the offset voltage of the SA ( $V_{\text{SA\_OS}}$ ) must be considered for a successful read operation: $\Delta V_{0,1}$ must be larger than $V_{\text{SA\_OS}}$ for the read access to pass. Both $\Delta V_{0,1}$ and $V_{\text{SA\_OS}}$ can be modeled using Gaussian distributions [11]. The read-access pass yield for a single cell in state 0 or 1 ( $RAPY_{\text{CELL0,1}}$ ) [12] is expressed as

$$RAPY_{\text{CELL0,1}} = \frac{\mu_{\Delta \text{V0,1}} - \mu_{\text{SA\_OS}}}{\sqrt{\sigma_{\Delta \text{V0,1}}^2 + \sigma_{\text{SA\_OS}}^2}} \tag{3}$$

where $\mu_{\Delta V0,1}$ and $\mu_{\text{SA\_OS}}$ are the means of $\Delta V_{0,1}$ and $V_{\text{SA\_OS}}$, respectively, and $\sigma_{\Delta V0,1}$ and $\sigma_{\text{SA\_OS}}$ are the standard deviations of $\Delta V_{0,1}$ and $V_{\text{SA\_OS}}$, respectively. $RAPY_{CELL}$ is defined as the minimum of $RAPY_{CELL0}$ and $RAPY_{CELL1}$.

$$RAPY_{CELL} = \min(RAPY_{CELL0}, RAPY_{CELL1}) \tag{4}$$
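The yield metric can be sketched in a few lines of Python. The sketch treats $\Delta V_{0,1}$ and the SA offset as independent Gaussian variables, so the variance of their difference is the sum of the two variances. The function names and the $\Delta V$ statistics are hypothetical; the default offset statistics (mean of 0 and standard deviation of 20 mV) are the values used later in Section 3.1. The conversion to a pass probability through the standard normal distribution is an added assumption that follows from the Gaussian model and is not a step stated in the text.

```python
import math


def rapy_cell_state(mu_dv, sigma_dv, mu_sa_os=0.0, sigma_sa_os=0.020):
    """RAPY for one state: margin mean over combined standard deviation.

    Delta V and V_SA_OS are modeled as independent Gaussians, so the
    variance of their difference is the sum of the two variances.
    """
    return (mu_dv - mu_sa_os) / math.sqrt(sigma_dv**2 + sigma_sa_os**2)


def rapy_cell(rapy0, rapy1):
    """RAPY_CELL of (4): the worse of the two states."""
    return min(rapy0, rapy1)


def pass_probability(rapy):
    """Standard normal CDF at rapy, i.e. P(Delta V > V_SA_OS)."""
    return 0.5 * (1.0 + math.erf(rapy / math.sqrt(2.0)))


# Hypothetical Delta V statistics in volts (assumptions for illustration).
r0 = rapy_cell_state(mu_dv=0.150, sigma_dv=0.030)
r1 = rapy_cell_state(mu_dv=0.120, sigma_dv=0.025)
r = rapy_cell(r0, r1)
print(f"RAPY_CELL0 = {r0:.2f}, RAPY_CELL1 = {r1:.2f}, RAPY_CELL = {r:.2f}")
print(f"single-cell pass probability = {pass_probability(r):.6f}")
```

Because $RAPY_{CELL}$ takes the minimum over both states, improving only one state does not help once the other state becomes the limiting case.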

In this study, a novel timing-based split-path sensing circuit (TSSC) that is tolerant to process variations and increases  $\Delta V_{0,1}$  value is proposed and compared with various SCs with respect to  $RAPY_{CELL}$ , delay, and power consumption. It improves  $\mu_{\Delta V0,1}$  using the dynamic reference voltage (DRV) technique that modifies  $V_{ref}$  according to the MTJ state. It also reduces  $\sigma_{\Delta V0,1}$  by compensating the transistor mismatch and effectively increasing the sensing current. Even though the split-path sensing circuit (SPSC) [10] is considered to have optimal performance in terms of read yield, Monte Carlo HSPICE simulation results based on industry-compatible 28-nm model parameters reveal that the proposed TSSC achieves a 42% boost in  $RAPY_{CELL}$  at a  $V_{DD}$  of 1.0 V when compared to the SPSC in terms of iso-area and -power. The remainder of this paper is organized as follows: Section 2 describes the operational principles and characteristics of the conventional SCs and proposed TSSC. Section 3 compares the performance of the proposed SC and conventional SCs. Finally, Section 4 presents the conclusions drawn from our study.

# 2. Previous SCs and Proposed TSSC

In this section, the characteristics and operation principles of the existing and proposed SCs are described.

## 2.1. Existing SCs

Three conventional SCs [10,11,13] for STT-MRAM are shown in Figure 1 along with their output voltage distributions. To analyze the voltage distributions, industry-compatible 28-nm model parameter libraries were used. The temperature was set to room temperature (25 °C) and, to compare the SCs in terms of iso-area and -power, their layout areas were set to be identical by varying the transistor sizes. In addition, the gate voltage of the clamp NMOS ( $V_{\text{CLAMP}}$ ) was set individually for each SC so that the sensing current was 20 $\mu$A at state 1.







**Figure 1.** Schematics of conventional SCs with voltage distributions and timing diagrams. (a) SDSC [11] at state 0; (b) SDSC at state 1; (c) HSCC [13] at state 0; (d) HSCC at state 1; (e) SPSC [10] at state 0; (f) SPSC at state 1.

Figure 1a,b illustrate the schematics and output voltage distributions of the source degeneration sensing circuit (SDSC) [11] at states 0 and 1, respectively. In the SDSC, the clamp NMOS (NCD, NCR) is used to generate $\Delta I$, which is then converted to $\Delta V$ via a load PMOS (PLD, PLR) using a current mirror; then, $\Delta V$ is fed to the SA. In addition, the SDSC places a degeneration PMOS (PDD, PDR) between $V_{DD}$ and the load PMOS for source degeneration. The source degeneration effect reduces the variation in the load PMOS and increases $r_{\text{O\_PLD}}$, leading to an improvement in $RAPY_{CELL}$. The I-V curves shown in Figure 2a represent the relationship between the drain voltage of each MOSFET and the current through it in the SDSC. The crossing points of the I-V curves for the load PMOS and clamp NMOS denote the operating points. For example, the crossing points for PLR and NCR, PLD0 and NCD0, and PLD1 and NCD1 are the operating points for $V_{ref}$, $V_{data0}$, and $V_{data1}$, respectively. The PLD0 (PLD1) curve passes through the operating point between PLR and NCR because $V_{ref}$ is the gate voltage of PLD0 (PLD1). The drain voltage distribution of the PLR shown in Figure 1a has a small standard deviation because of the large slope (i.e., small output resistance) of the I-V curve of the diode-connected PLR. When the output resistance is small, the voltage variation is small as well, as expressed in (2). The PLD has a relatively large standard deviation owing to the small slope of the PLD I-V curve and the current mirror variation. Thus, to obtain a proper sensing margin in the SDSC, the PLD variation must be reduced. The SDSC can obtain an adequate read yield in 65-nm process technology because the degeneration effect reduces the process variation in the SC. However, at technology nodes below 65 nm, the SDSC suffers from significant read yield degradation because the fixed $V_{ref}$ limits $\mu_{\Delta V0,1}$ and the increased process variation raises $\sigma_{\Delta V0,1}$.



**Figure 2.** I-V curves of the conventional SCs with both MOSFET and MTJ process variations for analyzing the operating points. (a) SDSC at states 0 and 1; (b) HSCC at states 0 and 1; (c) SPSC at states 0 and 1.

Figure 1c,d illustrate the SC with a highly symmetric cross-coupled current mirror (HSCC) [13] at states 0 and 1, respectively. To address the fixed $V_{ref}$ issue observed in the SDSC, the HSCC uses the DRV technique, which adjusts $V_{ref}$ according to the data state to enhance the sensing margin [14]. Figure 2b shows the I-V curves of the HSCC, where the crossing point of the I-V curves of the POR and NOR transistors represents $V_{ref}$. As demonstrated, $V_{ref}$ decreases at state 1 ( $V_{ref1}$ ) and increases at state 0 ( $V_{ref0}$ ). The DRV technique can almost double $\mu_{\Delta V0,1}$ compared to the fixed $V_{ref}$ approach used in the SDSC. However, in the HSCC, three current mirrors from the PLD to the NCMD are employed to generate $V_{data0,1}$. Consequently, these current mirrors induce a substantial current mismatch, which increases $\sigma_{\Delta V0,1}$. Neglecting the channel-length modulation, the current through the NCD ( $I_{NCD}$ ) for the current mirror is expressed as

$$I_{\rm NCD} = A(V_{\rm GS} - V_{\rm TH})^2 \tag{5}$$

where *A* is the transconductance parameter,  $V_{GS}$  is the gate-to-source voltage of the NCD, and  $V_{TH}$  is the threshold voltage. The current through the PCMD ( $I_{data\_cm0}$ ), including the mismatch effects, is given by

$$I_{\text{data\_cm0}} = (A + \Delta A)(V_{\text{GS}} - V_{\text{TH}} - \Delta V_{\text{TH}})^2 \tag{6}$$

where  $\Delta A$  is the transconductance mismatch that results in a gain error and  $\Delta V_{\text{TH}}$  is the threshold voltage mismatch that results in voltage offsets [15]. The current through the NOD ( $I_{\text{data}\_\text{cm1}}$ ), including the mismatch effects, is given by

$$I_{\text{data\_cm1}} = (A + 2\Delta A)(V_{\text{GS}} - V_{\text{TH}} - 2\Delta V_{\text{TH}})^2 \tag{7}$$

The current through the POD is similar to $I_{data\_cm0}$. Furthermore, PMOS-based circuits are vulnerable to mismatches because of poor gate oxide capacitance matching and high mobility variations [16]. Thus, when numerous current mirrors are used, the current mismatch increases, as expressed in (7). The standard deviations of $V_{data0,1}$ ( $V_{ref0,1}$ ) shown in Figure 1c,d are relatively large because of the many current mirror stages and the small slope of the POD0,1 (POR0,1) I-V curves. Consequently, even though the HSCC doubles $\mu_{\Delta V0,1}$ using the DRV technique, it cannot guarantee a sufficient read yield in deep submicron technology.
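The accumulation of mismatch described by (5) to (7) can be sketched as follows. The sketch assumes, as (6) and (7) do, that each additional mirror stage contributes the same mismatch terms $\Delta A$ and $\Delta V_{\text{TH}}$; the helper name and all device values are hypothetical and serve only as an illustration.

```python
def mirror_current(a, v_gs, v_th, d_a=0.0, d_vth=0.0, stages=0):
    """Square-law current after `stages` mirror stages, following (5)-(7).

    Each stage is assumed to add the same mismatch d_a and d_vth,
    which is how (6) and (7) extend (5).
    """
    return (a + stages * d_a) * (v_gs - v_th - stages * d_vth) ** 2


# Hypothetical device values (assumptions for illustration only).
A = 200e-6      # transconductance parameter [A/V^2]
V_GS = 0.70     # gate-to-source voltage [V]
V_TH = 0.40     # threshold voltage [V]
D_A = 4e-6      # transconductance mismatch per stage [A/V^2]
D_VTH = 0.010   # threshold-voltage mismatch per stage [V]

i_ncd = mirror_current(A, V_GS, V_TH)   # ideal current of (5)
errors = []
for n in range(3):
    i_n = mirror_current(A, V_GS, V_TH, D_A, D_VTH, stages=n)
    err = (i_n - i_ncd) / i_ncd * 100.0
    errors.append(err)
    print(f"stages = {n}: I = {i_n * 1e6:.2f} uA, error = {err:+.1f} %")
```

Because the threshold-voltage term is squared, its contribution grows faster than the gain error as stages are added, which is consistent with the large $\sigma_{\Delta V0,1}$ observed for the HSCC.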

Figure 1e,f show the SPSC [10] at states 0 and 1, respectively. The SPSC uses a split path to employ the DRV technique and uses PMOS degeneration to reduce the variation. Figure 2c shows that the SPSC successfully modifies $V_{ref}$ according to the MTJ state; the resulting $V_{ref0}$ and $V_{ref1}$ values are presented in Figure 1e,f. Furthermore, to overcome the process variation issue faced by the HSCC, the SPSC utilizes a split-path technique instead of current mirrors. Rather than employing current mirrors to transmit the NCD saturation current to the NOD as the HSCC does, the SPSC places the split path at the NCD source and applies the same gate-to-source voltage to both the NCD and NOD. As a result, even without a current mirror, the saturation currents of the NCD and NOD are equalized. The standard deviations of $V_{data0,1}$ ( $V_{ref0,1}$ ) presented in Figure 1e,f are smaller than those of the HSCC because the number of current mirrors is reduced. Therefore, among the existing SCs, the SPSC is superior in terms of $RAPY_{CELL}$. However, as shown in Figure 1e,f, the standard deviations of $V_{ref0}$ and $V_{data1}$ remain large (106 mV and 101 mV, respectively). This is because the data current ( $I_{data}$ ) is split in half by the split-path scheme ( $I_{data} = I_{data\_cm} + I_{data\_sp}$ ), and the lowered current is sensitive to the increased process variations in deep submicron technology.

## 2.2. Proposed TSSC

In the SPSC, the lowered saturation current caused by the split path is vulnerable to process variations. The TSSC maintains the advantages of the SPSC (i.e., the DRV technique, which increases the sensing margin $\Delta V_{0,1}$ ) but overcomes the lowered saturation current problem. Figure 3a shows the schematic and timing diagram of the proposed TSSC. The key circuit design differences between the TSSC and SPSC are the inclusion of two transistors as switches (ST1 and ST2) and the exclusion of NOD and NOR, made possible by a timing-based split-path scheme. To reduce $\sigma_{\Delta V0,1}$ by enhancing the current for each path, the MOSFET operating times are controlled by the PRE signal. Figure 3b shows the TSSC operation in phase 1 (P1) at state 0. In P1, the PRE signal is high; thus, PD0, PD3, ST1, and ST2 are turned on at the beginning. The gate voltage of POR0 (POD0) starts to pre-charge as ST1 and ST2 are turned on. PLD0 and PLR0 reach saturation points after a sufficient pre-charge time if $I_{data0}$ and $I_{ref0}$ are constant in P1, regardless of the process variations in PLD0 (PLR0). At the end of P1, the gate voltage of POR0 (POD0) is kept at the same value as $V_{data0}$ ( $V_{ref0}$ ). However, POR0 and POD0 cannot yet operate because PD1 and PD2, respectively, are turned off. Figure 3c shows the TSSC operation in phase 2 (P2) at state 0. At the beginning of P2, the PRE signal is turned off. Thus, PD0, PD3, ST1, and ST2 are turned off, and PD1 and PD2 are turned on. When PLD0 and PLR0 are cut off from $V_{DD}$, POR0 and POD0 can operate because the gate voltages of POR0 and POD0 were pre-charged to $V_{data0}$ and $V_{ref0}$, respectively, in P1. Thus, the saturation current of POR0 is transmitted from NCD0, as in the SPSC. However, in the TSSC, the saturation current of POR0 is approximately two times larger than that in the SPSC, regardless of process variation. This is because PLD0 is turned off in P2 and there is no current division. In addition, because there is no current division, the TSSC uses fewer clamp NMOS transistors than the SPSC, which is achieved by shifting the split-path position to the drain of the clamp NMOS. Therefore, the TSSC implements the DRV technique while minimizing $\sigma_{\Delta V0,1}$, which is confirmed by the TSSC voltage distributions shown in Figure 3c,e.

The process variation in the TSSC can be further decreased compared to the SPSC because, in terms of iso-area, the size of the MOSFETs in the TSSC can be increased as the number of clamp NMOS transistors is reduced. Figure 3d,e show the TSSC operation at state 1 in P1 and P2, respectively. The TSSC operation at state 1 is nearly identical to that at state 0, with only a change in the MTJ state. Figure 4 shows the TSSC I-V curves, from which $V_{data0,1}$ and $V_{ref0,1}$ can be estimated. The crossing point of the I-V curves for NCD0 (NCD1) and POD0 (POD1) is the operating point for $V_{data0}$ ( $V_{data1}$ ). Furthermore, the crossing point of the I-V curves of NCR0 (NCR1) and POR0 (POR1) is the operating point for $V_{ref0}$ ( $V_{ref1}$ ). $V_{ref}$ is successfully modified according to the MTJ state, and the operating current is nearly doubled compared to that of the SPSC.
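The two-phase control described above can be summarized as a small lookup of which devices conduct in each phase. This is only a bookkeeping sketch of the description in this section, not a circuit simulation; the function names are hypothetical, and the factor of two in the path current follows the approximate statement above rather than a simulated value.

```python
# Devices that are turned on in each phase, as described in this section.
PHASE_ON = {
    "P1": {"PD0", "PD3", "ST1", "ST2"},   # PRE high: pre-charge phase
    "P2": {"PD1", "PD2"},                 # PRE low: sensing phase
}
ALL_SWITCHES = PHASE_ON["P1"] | PHASE_ON["P2"]


def switch_states(phase):
    """Return a {device: is_on} map for phase 'P1' or 'P2'."""
    on = PHASE_ON[phase]
    return {name: name in on for name in sorted(ALL_SWITCHES)}


def active_load(phase):
    """Load PMOS pair that carries the branch currents in the given phase."""
    return ("PLD", "PLR") if phase == "P1" else ("POD", "POR")


for phase in ("P1", "P2"):
    on = [dev for dev, state in switch_states(phase).items() if state]
    print(phase, "on:", on, "active loads:", active_load(phase))

# Path current comparison (hypothetical 20 uA data current).
I_DATA = 20e-6
i_spsc_path = I_DATA / 2   # SPSC: the split path halves the current
i_tssc_path = I_DATA       # TSSC: no current division in P2
print(f"TSSC / SPSC path current = {i_tssc_path / i_spsc_path:.1f}")
```

Splitting the paths in time rather than in current is what allows each path to carry the full data current during its own phase.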






**Figure 3.** (**a**) Schematic and timing diagram of TSSC. Operations with TSSC voltage distributions in (**b**) phase 1 at state 0, (**c**) phase 2 at state 0, (**d**) phase 1 at state 1, and (**e**) phase 2 at state 1.



**Figure 4.** I-V curves of the TSSC with both MOSFET and MTJ process variations for analyzing operating points.

# 3. Simulation Results

## 3.1. Simulation Conditions

All simulation results included in this section were obtained using Monte Carlo HSPICE simulations with industry-compatible 28-nm model parameter libraries. The $\mu_{\text{SA\_OS}}$ and $\sigma_{\text{SA\_OS}}$ values for calculating $RAPY_{CELL}$ were set to 0 and 20 mV, respectively [17]. Furthermore, a standard deviation of 4% was considered for the MTJ variation. The MTJ model used in this study had an $R_1$ ( $R_H$, anti-parallel) of 6 k$\Omega$ and an $R_0$ ( $R_L$, parallel) of 3 k$\Omega$, corresponding to a tunnel magnetoresistance (TMR) ratio of 100%. For a fair comparison between SCs in terms of iso-area and iso-power, all transistor sizes were chosen such that the layout area (the sum of each transistor's width × length) of each SC was 1.76 µm<sup>2</sup>. Moreover, the $V_{CLAMP}$ for each SC was set such that it generated 20 µA at state 1. In addition, the optimal $R_{ref}$ for each circuit was used for architectural analysis.
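The MTJ settings above can be checked with a short sketch. It computes the TMR ratio as $(R_1 - R_0)/R_0$ from the nominal resistances and draws Monte Carlo samples of the MTJ resistance. Treating the 4% variation as a Gaussian spread relative to the nominal resistance is an assumption made for this illustration; the sketch does not reproduce the HSPICE MTJ model used in the simulations.

```python
import random
import statistics

R0_NOMINAL = 3e3   # parallel (state 0) resistance [ohm], from the text
R1_NOMINAL = 6e3   # anti-parallel (state 1) resistance [ohm], from the text
SIGMA_REL = 0.04   # 4 % standard deviation (assumed relative to nominal)


def tmr(r1, r0):
    """TMR ratio (R1 - R0) / R0 as a fraction."""
    return (r1 - r0) / r0


def sample_mtj(nominal, n, rng):
    """Draw n resistances from a Gaussian with relative sigma SIGMA_REL."""
    return [rng.gauss(nominal, SIGMA_REL * nominal) for _ in range(n)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
r0_samples = sample_mtj(R0_NOMINAL, 10_000, rng)
r1_samples = sample_mtj(R1_NOMINAL, 10_000, rng)

print(f"nominal TMR = {tmr(R1_NOMINAL, R0_NOMINAL):.0%}")
print(f"sigma(R0) = {statistics.stdev(r0_samples):.0f} ohm")
print(f"sigma(R1) = {statistics.stdev(r1_samples):.0f} ohm")
```

Under this assumption, the absolute spread of the high-resistance state is twice that of the low-resistance state, since both share the same relative standard deviation.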

## 3.2. Results and Comparison

$RAPY_{CELL0}$ increased and $RAPY_{CELL1}$ decreased as $R_{ref}$ increased. Accordingly, $RAPY_{CELL}$ reached its maximum at the crossing point of $RAPY_{CELL0}$ and $RAPY_{CELL1}$. Figure 5 shows $RAPY_{CELL0,1}$ for the SDSC, HSCC, SPSC, and TSSC with respect to $R_{ref}$ when the temperature was set to room temperature (25 °C). The HSCC achieved the lowest $RAPY_{CELL}$ value owing to the large output variation caused by the current mismatch of the multiple current mirrors. Although the SPSC had a larger $\mu_{\Delta V0,1}$, the SDSC and SPSC exhibited the same $RAPY_{CELL}$ value in terms of iso-area because the SDSC transistor size was twice as large as that of the SPSC. The $RAPY_{CELL}$ value of the TSSC was the largest among all SCs because the DRV technique was successfully implemented and the timing-based split-path scheme overcame the lowered data current problem faced by the SPSC.
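The selection of the optimal $R_{ref}$ can be sketched as a search for the sweep point that maximizes the smaller of the two per-state yields. The sweep values below are hypothetical and only mimic the opposite trends of $RAPY_{CELL0}$ and $RAPY_{CELL1}$; they are not read from Figure 5.

```python
def optimal_r_ref(r_ref_values, rapy0_values, rapy1_values):
    """Return (r_ref, rapy_cell) at the point maximizing min(RAPY0, RAPY1)."""
    best = max(
        zip(r_ref_values, rapy0_values, rapy1_values),
        key=lambda point: min(point[1], point[2]),
    )
    return best[0], min(best[1], best[2])


# Hypothetical sweep: RAPY_CELL0 rises and RAPY_CELL1 falls with R_ref.
r_ref = [3.5e3, 4.0e3, 4.5e3, 5.0e3, 5.5e3]   # [ohm]
rapy0 = [1.0, 2.5, 4.0, 5.5, 7.0]
rapy1 = [7.5, 5.8, 4.1, 2.4, 0.7]

r_opt, rapy_opt = optimal_r_ref(r_ref, rapy0, rapy1)
print(f"optimal R_ref = {r_opt:.0f} ohm, RAPY_CELL = {rapy_opt:.1f}")
```

On a discrete sweep, the selected point is the one closest to the crossing of the two curves; a finer sweep or interpolation would locate the crossing more precisely.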



**Figure 5.** $RAPY_{CELL0}$ and $RAPY_{CELL1}$ of (**a**) SDSC; (**b**) HSCC; (**c**) SPSC; and (**d**) TSSC with respect to $R_{ref}$. (**e**) Optimal $R_{ref}$ and $RAPY_{CELL}$ of SDSC, HSCC, SPSC, and TSSC.

Figure 6 shows $RAPY_{CELL}$ for the SDSC, HSCC, SPSC, and TSSC with respect to $V_{\text{DD}}$ and $V_{\text{CLAMP}}$. To analyze the SCs in terms of iso-power, $V_{\text{CLAMP}}$ for each SC was set such that it generated 20 $\mu$A at state 1. The temperature range was set from −45 to 90 °C, and the worst case was chosen for the $RAPY_{CELL}$ calculation. As shown in Figure 6a, the HSCC exhibited minimal variation in $RAPY_{CELL}$ as $V_{DD}$ increased because $\sigma_{\Delta V0,1}$ was excessively large owing to the number of current mirrors. When $V_{DD}$ was less than 0.9 V, $RAPY_{CELL}$ for the SPSC was greater than that for the TSSC. In the TSSC, $RAPY_{CELL}$ was significantly reduced at low $V_{DD}$ because $\mu_{\Delta V0,1}$ decreased rapidly. When $V_{DD}$ was greater than 0.9 V, the TSSC achieved a larger $RAPY_{CELL}$ because $\mu_{\Delta V0,1}$ increased faster than $\sigma_{\Delta V0,1}$. Figure 6b shows $RAPY_{CELL}$ with respect to $V_{CLAMP}$ when $V_{DD} = 1.0$ V. The dependence of $RAPY_{CELL}$ on $V_{\text{CLAMP}}$ is not monotonic: when $V_{\text{CLAMP}}$ is low, the operating current is insufficient, and when $V_{\text{CLAMP}}$ is high, $\mu_{\Delta V0,1}$ decreases because of the decreased $r_{\text{O\_PLD}}$. Thus, $V_{\text{CLAMP}}$ has an optimal value that maximizes the read yield. The HSCC has a low $RAPY_{CELL}$ and no optimal $V_{CLAMP}$ in the range from 0.55 to 0.8 V. The optimal $V_{CLAMP}$ values for the SDSC and SPSC are 0.65 V and 0.7 V, respectively. However, STT-MRAM requires a low sensing current to prevent read disturbance, that is, an unintentional write operation that occurs during a read operation when the critical MTJ switching current is lower than the read current [11]. In the TSSC, the optimal $V_{\text{CLAMP}}$ value, 0.6 V, is lower than that of the conventional SCs, and the $RAPY_{CELL}$ value is the highest. Therefore, the TSSC is well suited to STT-MRAM, which requires a low sensing current.



**Figure 6.** *RAPY*<sub>CELL</sub> of SDSC, HSCC, SPSC, and TSSC with respect to (**a**) *V*<sub>DD</sub> and (**b**) *V*<sub>CLAMP</sub>.

Figure 7a illustrates the $RAPY_{CELL}$ values of the SDSC, HSCC, SPSC, and proposed TSSC with respect to the TMR of the MTJ when $V_{DD}$ is set to 1 V and $V_{CLAMP}$ is set to 0.6 V. The temperature range was set from −45 to 90 °C, and the worst case was chosen for the $RAPY_{CELL}$ calculation. TMR is calculated as $(R_1 - R_0)/R_0$ and is ideally high because it influences the sensing speed, read margin, and noise margin of the memory cell. However, the SCs for STT-MRAM must be built to tolerate process-related fluctuations in the TMR value. As shown in Figure 7a, for low sensing current applications, even when the TMR value is decreased to 60%, the proposed TSSC maintains a greater $RAPY_{CELL}$ value compared to the existing SCs. Figure 7b plots $RAPY_{CELL}$ as a function of sensing time when $V_{DD}$ was set to 1 V and $V_{CLAMP}$ was set to 0.6 V. Because the TSSC uses a two-phase sensing operation for the timing-split-based DRV technique, it requires sufficient time for charging and discharging. Thus, a sharp increase in $RAPY_{CELL}$ can be observed at approximately 8 ns, and the $RAPY_{CELL}$ value saturates at approximately 14 ns. The existing designs are less sensitive to the sensing time than the proposed TSSC, which provides more accurate sensing at the expense of a longer sensing time.



**Figure 7.** *RAPY*<sub>CELL</sub> of SDSC, HSCC, SPSC, and TSSC with respect to (**a**) TMR of MTJ and (**b**) sensing time.

As mentioned earlier, the TSSC exploits the gate capacitance of the PLD, POR, POD, and PLR transistors to accumulate charge during P1, which raises the question of whether the capacitor-less design is better than one using an explicit capacitor. Figure 8 shows the TSSC *RAPY*<sub>CELL</sub> value with respect to the additional capacitance when two capacitors are added at the gates of the PLD and PLR. The *RAPY*<sub>CELL</sub> value increases until the additional capacitance reaches 2 fF, but the increment is insignificant, and the value starts to decrease as the capacitance increases further because of the limited sensing time. Moreover, if a capacitor is added to the design, the sizes of all transistors need to be reduced to maintain iso-area, which results in a decrease in the gate capacitance and performance degradation. Accordingly, Figure 8 indicates that the gate capacitances of the (PLD, POR) and (POD, PLR) pairs are sufficient for the TSSC to accumulate charge during P1 and mirror the data current during P2.



**Figure 8.** *RAPY*<sub>CELL</sub> of TSSC with respect to capacitance value when an additional capacitor is assumed.

Table 1 summarizes the simulation results and compares the proposed TSSC with the conventional SCs. The average power consumption ( $P_{AVG}$ ) was simulated using the $V_{CLAMP}$ and $R_{ref}$ values presented in Figure 5. In both states 0 and 1, the TSSC achieved the lowest $P_{AVG}$ compared with the conventional SCs. However, the proposed TSSC requires a longer sensing time because of its charging and discharging times. As shown in Table 1, the TSSC achieves a 42% higher $RAPY_{CELL}$ value than the SPSC. Because of read disturbance, sensing circuits are required to work with low sensing currents, and the $RAPY_{CELL}$ value of the proposed TSSC is the highest under low-current sensing. Therefore, the TSSC is suitable for STT-MRAM applications, which require low sensing currents.

| | SDSC [11] | HSCC [13] | SPSC [10] | TSSC (Proposed) |
|---|---|---|---|---|
| Process technology | 28 nm | 28 nm | 28 nm | 28 nm |
| DRV technique | X | O | O | O |
| $I_{\text{CELL}}$ [μA] @ state 1 | 20.0 | 20.0 | 20.0 | 20.0 |
| Layout area of SC [μm<sup>2</sup>] @ iso-area<br>(@ identical Tr. size <sup>(1)</sup>) | 1.76<br>(1.28) | 1.76<br>(3.04) | 1.76<br>(2.56) | 1.76<br>(1.76) |
| $RAPY_{\text{CELL}}$ <sup>(2)</sup> [σ] @ $V_{\text{DD}}$ = 1.0 V, iso-area<br>(@ identical Tr. size) | 3.3<br>(3.12) | 1.1<br>(1.19) | 3.3<br>(3.36) | 4.7<br>(4.7) |
| $RAPY_{\text{CELL}}$ <sup>(3)</sup> [σ] @ $V_{\text{DD}}$ = 1.0 V, iso-area<br>(@ identical Tr. size) | 2.80<br>(2.68) | 0.83<br>(0.89) | 2.78<br>(2.79) | 3.82<br>(3.82) |
| Average sensing time [ns] @ $V_{\text{DD}}$ = 1.0 V, iso-area<br>(@ identical Tr. size) | 7<br>(7) | 8<br>(8) | 8<br>(8) | 14<br>(14) |
| $P_{\text{AVG}}$ [μA] @ state 0, $V_{\text{DD}}$ = 1.0 V, iso-area<br>(@ identical Tr. size) | 50.1<br>(48.1) | 153<br>(213) | 53.0<br>(62.9) | 48.9<br>(48.9) |
| $P_{\text{AVG}}$ [μA] @ state 1, $V_{\text{DD}}$ = 1.0 V, iso-area<br>(@ identical Tr. size) | 46.6<br>(44.7) | 137<br>(188.7) | 47.0<br>(55.8) | 44.0<br>(44.0) |

**Table 1.** Summary of the simulation results in terms of iso-area (identical Tr. size) and iso-power.

(1) For the identical transistor size (width/length), a load PMOS of 2 μm/0.1 μm, a clamp NMOS of 4 μm/0.1 μm, and a switch of 0.3 μm/0.05 μm were used.

(2) Temperature was set to room temperature (25 °C) and $V_{CLAMP}$ was set to the values depicted in Figure 5.

(3) The worst *RAPY*<sub>CELL</sub> value was calculated by comparing the results obtained at −45 °C and 90 °C when $V_{CLAMP}$ was set to 0.6 V.
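The headline comparison against the SPSC follows directly from the iso-area entries of Table 1. The short sketch below recomputes the relative $RAPY_{CELL}$ gain and the sensing-time ratio from those entries; the dictionary and function names are hypothetical.

```python
# Iso-area values at V_DD = 1.0 V, taken from Table 1.
RAPY_ROOM = {"SDSC": 3.3, "HSCC": 1.1, "SPSC": 3.3, "TSSC": 4.7}
RAPY_WORST = {"SDSC": 2.80, "HSCC": 0.83, "SPSC": 2.78, "TSSC": 3.82}
SENSING_TIME_NS = {"SDSC": 7, "HSCC": 8, "SPSC": 8, "TSSC": 14}


def relative_gain(table, proposed="TSSC", baseline="SPSC"):
    """Fractional improvement of `proposed` over `baseline`."""
    return table[proposed] / table[baseline] - 1.0


gain_room = relative_gain(RAPY_ROOM)
gain_worst = relative_gain(RAPY_WORST)
time_ratio = SENSING_TIME_NS["TSSC"] / SENSING_TIME_NS["SPSC"]

print(f"RAPY gain at 25 C: {gain_room:.0%}")
print(f"RAPY gain, worst-case temperature: {gain_worst:.0%}")
print(f"sensing-time ratio: {time_ratio:.2f}x")
```

The room-temperature entries give the 42% improvement and the 1.75× sensing time quoted in the abstract.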

# 4. Conclusions

In this study, we proposed a TSSC using the DRV technique and novel split path over time, which maintains a large $\mu_{\Delta V0,1}$ and reduces $\sigma_{\Delta V0,1}$. The proposed TSSC can reduce the $V_{\text{TH}}$ variation effects by increasing the current and reducing the transistor mismatch. The simulation results indicate that conventional SCs exhibit low read yield because of small $\mu_{\Delta V0,1}$ or large $\sigma_{\Delta V0,1}$ values. In contrast, the TSSC obtains the greatest read yield in the 28-nm process technology.

**Author Contributions:** Conceptualization, T.N.; methodology, T.N.; software, T.N.; validation, B.I., J.K. and J.H.K.; formal analysis, T.N.; investigation, B.I. and J.K.; resources, J.K.; data curation, B.I., J.K. and J.H.K.; writing – original draft preparation, B.I. and J.K.; writing – review and editing, T.N.; visualization, J.K.; supervision, T.N.; project administration, T.N.; funding acquisition, T.N. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by Incheon National University Research Grant in 2021.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article.

**Acknowledgments:** The EDA tool was supported by the IC Design Education Center (IDEC), Korea.

**Conflicts of Interest:** The authors declare no conflict of interest.

# References

1. Huang, W.; Liu, L.; Zhu, Z. A sub-200nW all-in-one bandgap voltage and current reference without amplifiers. *IEEE Trans. Circuits Syst. II Exp. Briefs* **2021**, *1*, 121–125. [CrossRef]
2. Kim, C.; Kwon, K.; Park, C.; Jang, S.; Choi, J. A covalent-bonded cross-coupled current-mode sense amplifier for STT-MRAM with 1T1MTJ common source-line structure array. In Proceedings of the 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3.
3. Hosomi, M.; Yamagishi, H.; Yamamoto, T.; Bessho, K.; Higo, Y.; Yamane, K.; Yamada, H.; Shoji, M.; Hachino, H.; Fukumoto, C.; et al. A novel nonvolatile memory with spin torque transfer magnetization switching: Spin-RAM. In Proceedings of the IEEE International Electron Devices Meeting, 2005. IEDM Technical Digest, Washington, DC, USA, 5 December 2005; pp. 459–462.
4. Ikeda, S.; Miura, K.; Yamamoto, H.; Mizunuma, K.; Gan, H.D.; Endo, M.; Kanai, S.; Hayakawa, J.; Matsukura, F.; Ohno, H. A perpendicular-anisotropy CoFeB-MgO magnetic tunnel junction. *Nat. Mater.* **2010**, *9*, 721–724. [CrossRef] [PubMed]
5. Takemura, R.; Kawahara, T.; Miura, K.; Yamamoto, H.; Hayakawa, J.; Matsuzaki, N.; Ono, K.; Yamanouchi, M.; Ito, K.; Takahashi, H.; et al. A 32-Mb SPRAM with 2T1R memory cell, localized bi-directional write driver and '1'/'0' dual-array equalized reference scheme. *IEEE J. Solid-State Circuits* **2010**, *4*, 869–879. [CrossRef]
6. Tsuchida, K.; Inaba, T.; Fujita, K.; Ueda, Y.; Shimizu, T.; Asao, Y.; Kajiyama, T.; Iwayama, M.; Sugiura, K.; Ikegawa, S.; et al. A 64Mb MRAM with Clamped-Reference and Adequate-Reference Schemes. In Proceedings of the 2010 IEEE International Solid-State Circuits Conference - (ISSCC), San Francisco, CA, USA, 7–11 February 2010; pp. 258–259.
7. Driskill-Smith, A.; Apalkov, D.; Nikitin, V.; Tang, X.; Watts, S.; Lottis, D.; Moon, K.; Khvalkovskiy, A.; Kawakami, R.; Luo, X.; et al. Latest Advances and Roadmap for In-Plane and Perpendicular STT-RAM. In Proceedings of the 2011 3rd IEEE International Memory Workshop (IMW), Monterey, CA, USA, 22–25 May 2011; pp. 1–3.
8. Tabrizi, F. Non-Volatile STT-RAM: A True Universal Memory; Grandis Inc.: Milpitas, CA, USA, 2009.
9. Na, T.; Kim, J.; Kim, J.P.; Kang, S.H.; Jung, S.-O. Reference-scheme study and novel reference scheme for deep submicrometer STT-RAM. *IEEE Trans. Circuits Syst. I Regul. Pap.* **2014**, *61*, 3376–3385. [CrossRef]
10. Kim, J.; Na, T.; Kim, J.P.; Kang, S.H.; Jung, S.-O. A split-path sensing circuit for spin torque transfer MRAM. *IEEE Trans. Circuits Syst. II Exp. Briefs* **2014**, *3*, 193–197. [CrossRef]
11. Kim, J.; Ryu, K.; Kang, S.H.; Jung, S.-O. A novel sensing circuit for deep submicron spin transfer torque MRAM (STT-MRAM). *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* **2012**, *1*, 181–186. [CrossRef]
12. Nho, H.; Yoon, S.; Wong, S.S.; Jung, S. Numerical Estimation of Yield in Sub-100-nm SRAM Design Using Monte Carlo Simulation. *IEEE Trans. Circuits Syst. II Express Briefs* **2008**, *9*, 907–911.
13. Maffitt, T.M.; DeBrosse, J.K.; Gabric, J.A.; Gow, E.T.; Lamorey, M.C.; Parenteau, J.S.; Willmott, D.R.; Wood, M.A.; Gallagher, W.J. Design considerations for MRAM. *IBM J. Res. Develop.* **2012**, *1*, 181–186. [CrossRef]
14. Trinh, Q.; Ruocco, S.; Alioto, M. Dynamic reference voltage sensing scheme for read margin improvement in STT-MRAMs. *IEEE Trans. Circuits Syst. I* **2018**, *65*, 1269–1278. [CrossRef]
15. Datta, T.; Abshire, P. Mismatch compensation of CMOS current mirrors using floating-gate transistors. In Proceedings of the 2009 IEEE International Symposium on Circuits and Systems, Taipei, Taiwan, 24–27 May 2009.
16. Lakshmikumar, K.R.; Hadaway, R.A.; Copeland, M.A. Characterisation and modeling of mismatch in MOS transistors for precision analog design. *IEEE J. Solid-State Circuits* **1986**, *6*, 1057–1066. [CrossRef]
17. Na, T.; Woo, S.-H.; Kim, J.; Jeong, H.; Jung, S.-O. Comparative study of various latch-type sense amplifiers. *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* **2014**, *2*, 425–429. [CrossRef]