Publications
2025
- IEDM 2025
  First Experimental Demonstration of Disturb-Free 3D Vertical 1T-nC-1T Ferroelectric-based KV Cache with Co-Optimization of Hybrid Analog-Digital CIM and Token-Wise Dynamic Pruning for Efficient Long-Context LLM Inference
  Weikai Xu, Danyun Luo, Minyue Deng, Shuzhang Zhong, Shengjie Cao, Meng Li, Qianqian Huang, and Ru Huang
  In 2025 IEEE International Electron Devices Meeting (IEDM), 2025
For the first time, a ferroelectric (FE)-based key-value (KV) cache for large language models (LLMs) is proposed and experimentally demonstrated. Through device-architecture-algorithm co-optimization, a novel 3D vertical 1T-nC-1T FeRAM structure, featuring orthogonally aligned word-lines/bit-lines and shared FE capacitor (FeCap) strings, is presented together with a hybrid analog-digital compute-in-memory (CIM) architecture and a token-wise dynamic pruning algorithm. The designed 1T-nC-based analog CIM with small-signal non-destructive read can evaluate similarity in O(1) time for efficient token selection, and the nC-1T-based digital CIM with robust destructive read supports accurate attention computation on the selected subset of the KV cache, enabling efficient and accurate processing of dynamically sparse KV cache workloads. In addition, optimizing the data mapping scheme enables parallel computation across nC strings, which enhances throughput and avoids write disturbance. Based on the above design, a 3D 3×32×32 FeCap array is experimentally fabricated with 10 ns switching, 10-year retention, 10¹⁶ endurance and good consistency, and the KV cache-based attention is demonstrated with 6×/315× improved performance and energy efficiency over the state-of-the-art designs, along with high accuracy comparable to full attention, showing its great potential for efficient long-context LLM inference.
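The two-stage flow described in this abstract (similarity-based token selection, then attention over the selected subset) can be illustrated in software. The following is only a minimal sketch under stated assumptions: the dot-product similarity, the fixed top-k rule, and the names `select_tokens` and `sparse_attention` are hypothetical choices for illustration and are not taken from the paper, whose selection and attention stages run in analog and digital CIM hardware respectively.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_tokens(query, keys, k):
    """Stage 1 (assumed): score every cached key against the query, keep the top-k indices."""
    scores = [dot(query, key) for key in keys]
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def sparse_attention(query, keys, values, k):
    """Stage 2: exact softmax attention restricted to the selected tokens."""
    idx = select_tokens(query, keys, k)
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in idx])
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, idx)) for d in range(dim)]
    return idx, out

# Toy KV cache with four tokens of dimension 2 (made-up numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
query = [1.0, 0.0]
selected, output = sparse_attention(query, keys, values, k=2)
print(selected, output)
```

In this toy cache only tokens 0 and 2 are kept, so the softmax and the weighted sum touch two values instead of four.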
@inproceedings{xu2025first,
  title = {First Experimental Demonstration of Disturb-Free 3D Vertical 1T-nC-1T Ferroelectric-based KV Cache with Co-Optimization of Hybrid Analog-Digital CIM and Token-Wise Dynamic Pruning for Efficient Long-Context LLM Inference},
  author = {Xu, Weikai and Luo, Danyun and Deng, Minyue and Zhong, Shuzhang and Cao, Shengjie and Li, Meng and Huang, Qianqian and Huang, Ru},
  booktitle = {2025 IEEE International Electron Devices Meeting (IEDM)},
  pages = {1--4},
  year = {2025},
  organization = {IEEE},
  doi = {10.1109/IEDM50572.2025.11353834}
}
- SNW 2025
  A Novel Cross-Point Ferroelectric-Capacitor Array Based Parallel In-Memory Encryption for Energy-Efficient Secure-AI Applications
  Shengjie Cao, Weikai Xu, Zhiyuan Fu, Minyue Deng, Zebin Zhang, Qianqian Huang, and Ru Huang
  In 2025 Silicon Nanoelectronics Workshop (SNW), 2025
A novel parallel in-memory encryption design based on cross-point ferroelectric capacitor (FeCAP) arrays is proposed and experimentally demonstrated, realizing significantly improved energy efficiency. The proposed dual-layer FeCAP arrays with a coupled signal-subtraction method enable parallel computation with a larger sensing margin (∼2×) than bitwise encryption for robust data en/decryption. Benefiting from the ultra-low-power bitwise XOR property of the device, a large energy reduction is achieved compared with conventional encryption schemes. With the great 3D stacking potential of hafnia-based FE, this work provides a promising solution towards 3D FE secure-CIM.
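The bitwise XOR property mentioned in this abstract is what lets one operation serve for both encryption and decryption. Below is a minimal software sketch of that property only; the helper name `xor_bytes` and the repeating two-byte key are assumptions made for illustration, and nothing here models the FeCAP array, its sensing scheme, or the paper's actual encryption design. A short repeating key like this is not secure in practice.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"weights"
key = b"\x5a\xc3"  # hypothetical key, for illustration only

ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)  # applying XOR again undoes it
print(ciphertext.hex(), recovered)
```

Because `x ^ k ^ k == x` for any bits `x` and `k`, the same function decrypts what it encrypted.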
@inproceedings{cao2025novel,
  title = {A Novel Cross-Point Ferroelectric-Capacitor Array Based Parallel In-Memory Encryption for Energy-Efficient Secure-AI Applications},
  author = {Cao, Shengjie and Xu, Weikai and Fu, Zhiyuan and Deng, Minyue and Zhang, Zebin and Huang, Qianqian and Huang, Ru},
  booktitle = {2025 Silicon Nanoelectronics Workshop (SNW)},
  pages = {34--35},
  year = {2025},
  organization = {IEEE},
  doi = {10.23919/SNW65111.2025.11097253}
}
- CSTIC 2025
  A Novel Ambipolar Ferroelectric Tunnel FINFET Based Computing-in-Memory for Quantized Neural Networks with High Area-and Energy-Efficiency
  Runze Han, Jin Luo, Shengjie Cao, Qianqian Huang, and Ru Huang
  In 2025 Conference of Science and Technology of Integrated Circuits (CSTIC), 2025
In this work, for the first time, a novel ambipolar ferroelectric tunnel FET (FeTFET) based computing-in-memory (CIM) scheme is proposed and experimentally demonstrated for quantized neural networks (QNNs) with high area- and energy-efficiency. By leveraging the non-monotonic transfer characteristic of the ambipolar tunnel FET (TFET) for signed multiplication and utilizing the nonvolatile ferroelectric gate modulation for weight storage, the signed CIM cell with weight precision expansion can be implemented by a single FeTFET device on a 14-nm technology node platform. Based on the proposed FeTFET-based CIM design, QNNs for typical pattern recognition tasks are demonstrated with high accuracy and energy efficiency, showing its significant potential for edge AI applications.
@inproceedings{han2025novel,
  title = {A Novel Ambipolar Ferroelectric Tunnel FINFET Based Computing-in-Memory for Quantized Neural Networks with High Area-and Energy-Efficiency},
  author = {Han, Runze and Luo, Jin and Cao, Shengjie and Huang, Qianqian and Huang, Ru},
  booktitle = {2025 Conference of Science and Technology of Integrated Circuits (CSTIC)},
  pages = {1--3},
  year = {2025},
  organization = {IEEE},
  doi = {10.1109/CSTIC64481.2025.11017891}
}
- EDTM 2025
  A Novel Superlattice HfO2-ZrO2 Ferroelectric Tunnel FET for Overall Improvement in Memory Window, EOT and Disturb Immunity
  Shaodi Xu, Zhiyuan Fu, Shengjie Cao, Yue Yu, Hao Zheng, Qianqian Huang, and Ru Huang
  In 2025 9th IEEE Electron Devices Technology & Manufacturing Conference (EDTM), 2025
In this paper, a novel superlattice (SL) HfO2-ZrO2 ferroelectric (FE) junction-modulated tunnel FET (TFET) is proposed and experimentally demonstrated with overall improvement in memory window (MW), equivalent oxide thickness (EOT) and disturb immunity. The gate stack is optimized with an SL FE layer to obtain a larger MW and a smaller EOT simultaneously, two goals that usually conflict in a conventional FE layer. In addition, a more abrupt tunnel junction is introduced by a striped gate stack design, which is found to further increase the MW. The fabricated novel SL FE-JTFET demonstrates a 2.6× improvement in MW along with a 19% reduction in EOT compared to FE-TFET, which is very beneficial for improving the practical read current margin. Moreover, the SL FE layer can also mitigate the disturb issue in memory operations, showing the great potential of the proposed device for practical low-power and highly robust memory applications.
@inproceedings{xu2025novel,
  title = {A Novel Superlattice HfO2-ZrO2 Ferroelectric Tunnel FET for Overall Improvement in Memory Window, EOT and Disturb Immunity},
  author = {Xu, Shaodi and Fu, Zhiyuan and Cao, Shengjie and Yu, Yue and Zheng, Hao and Huang, Qianqian and Huang, Ru},
  booktitle = {2025 9th IEEE Electron Devices Technology \& Manufacturing Conference (EDTM)},
  pages = {1--3},
  year = {2025},
  organization = {IEEE},
  doi = {10.1109/EDTM61175.2025.11040842}
}
- EDTM 2025
  Physics-Based Circuit-Compatible Model of Polycrystalline Hafnia-Based 3D Ferroelectric Capacitor for High-Density Memory Applications
  Minyue Deng, Chang Su, Jiayan Zhu, Liang Chen, Shengjie Cao, Qianqian Huang, and Ru Huang
  In 2025 9th IEEE Electron Devices Technology & Manufacturing Conference (EDTM), 2025
In this work, based on the finite element method, a physics-based equivalent circuit model for polycrystalline hafnia-based 3D ferroelectric capacitors is proposed and developed, which for the first time can capture the cylindrical geometric feature and the impacts of the complex phase distribution in an actual ferroelectric film. The proposed circuit-compatible model shows high accuracy compared with TCAD simulation and significantly improved computation efficiency compared with previous methods. Moreover, two typical types of phase distribution in ferroelectric capacitors are discussed based on the experimental characterization for model modification. Based on the proposed model, it is suggested to improve the ferroelectricity near the inner electrode and reduce the possible interfacial dielectric layer thickness of the 3D ferroelectric capacitor for high-density memory applications.
@inproceedings{deng2025physics,
  title = {Physics-Based Circuit-Compatible Model of Polycrystalline Hafnia-Based 3D Ferroelectric Capacitor for High-Density Memory Applications},
  author = {Deng, Minyue and Su, Chang and Zhu, Jiayan and Chen, Liang and Cao, Shengjie and Huang, Qianqian and Huang, Ru},
  booktitle = {2025 9th IEEE Electron Devices Technology \& Manufacturing Conference (EDTM)},
  pages = {1--3},
  year = {2025},
  organization = {IEEE},
  doi = {10.1109/EDTM61175.2025.11040739}
}
- EDTM 2025
  Hafnia-Based XP-FeRAM: A Novel High-Speed and Low-Power Cross-Point Ferroelectric Memory for Data-Intensive Applications
  Qianqian Huang, Shengjie Cao, Zhiyuan Fu, and Ru Huang
  In 2025 9th IEEE Electron Devices Technology & Manufacturing Conference (EDTM), 2025
Device and circuit co-optimization of a novel hafnia-based cross-point FeRAM (XP-FeRAM) is comprehensively investigated for high-density, high-speed and low-power memory applications. Planar and 3D vertical-stacked XP-FeRAM designs are demonstrated with application-specific device optimization, and outstanding comprehensive performance is achieved with high 2Pr, excellent disturbance immunity under low-voltage operation and fast switching at scaled size, as well as good device reliability. A modified V/2 operation scheme with in-situ write-back is further presented, leading to faster operation and lower power consumption than traditional 1T1C FeRAM. Combined with its excellent scalability, XP-FeRAM shows strong potential as a competitive candidate for future non-volatile memory technology for data-intensive demands.
@inproceedings{huang2025hafnia,
  title = {Hafnia-Based XP-FeRAM: A Novel High-Speed and Low-Power Cross-Point Ferroelectric Memory for Data-Intensive Applications},
  author = {Huang, Qianqian and Cao, Shengjie and Fu, Zhiyuan and Huang, Ru},
  booktitle = {2025 9th IEEE Electron Devices Technology \& Manufacturing Conference (EDTM)},
  pages = {1--3},
  year = {2025},
  organization = {IEEE},
  doi = {10.1109/EDTM61175.2025.11040540}
}
2024
- IEDM 2024
  Comprehensive Performance Re-Assessment of Hafnia-Based Cross-Point FeRAM with Ultra-Fast and Low-Power Operation from Device/Array Perspective
  Shengjie Cao*, Zhiyuan Fu*, Minyue Deng, Hao Zheng, Qianqian Huang, and Ru Huang
  In 2024 IEEE International Electron Devices Meeting (IEDM), 2024
In this work, hafnia-based selector-less cross-point FeRAM (XP-FeRAM) with ultra-fast and low-power operation is experimentally demonstrated from device-level optimization to array-level evaluation for embedded and standalone memories. For device optimization, the impacts of the ferroelectric (FE) layer deposition process sequence are investigated for the first time across different applications, with attention to switching speed. Optimized devices show outstanding comprehensive performance with 1.5 ns switching, 2Pr of 47 μC/cm² and excellent disturbance immunity under 1.6 V low-voltage operation, along with good reliability: extrapolated 10¹⁴ endurance cycles and 10-year data retention in scaled devices. Moreover, from the array perspective, a modified V/2 operation scheme with in-situ write-back is further proposed and experimentally demonstrated in the fabricated XP-FeRAM array, resulting in even faster operation and lower power consumption than 1T1C FeRAM. Additionally, based on the established parasitic circuit model, memory performance and scalability design spaces are robustly evaluated, showing the great potential of XP-FeRAMs for high-speed, high-density and low-power memory applications.
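For readers unfamiliar with V/2 biasing in selector-less cross-point arrays, the sketch below computes the voltage across every cell under the conventional V/2 scheme: the selected word line is driven to V, the selected bit line to 0, and all other lines to V/2. This shows only the textbook scheme, not the modified scheme with in-situ write-back proposed in the paper. The function name `cell_voltages` and the 3×3 array size are assumptions for illustration, and the 1.6 V value is the operating voltage quoted in the abstract.

```python
def cell_voltages(rows, cols, sel_row, sel_col, v_write):
    """Voltage across each cell (word line minus bit line) under conventional V/2 biasing."""
    word_lines = [v_write if r == sel_row else v_write / 2 for r in range(rows)]
    bit_lines = [0.0 if c == sel_col else v_write / 2 for c in range(cols)]
    return [[word_lines[r] - bit_lines[c] for c in range(cols)] for r in range(rows)]

grid = cell_voltages(rows=3, cols=3, sel_row=1, sel_col=1, v_write=1.6)
for row in grid:
    print(row)
```

The selected cell sees the full V, cells sharing its row or column see V/2, and every other cell sees 0 V. The half-selected cells are where disturbance arises, which is why disturbance immunity is a key metric for these arrays.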
@inproceedings{cao2024comprehensive,
  title = {Comprehensive Performance Re-Assessment of Hafnia-Based Cross-Point FeRAM with Ultra-Fast and Low-Power Operation from Device/Array Perspective},
  author = {Cao, Shengjie and Fu, Zhiyuan and Deng, Minyue and Zheng, Hao and Huang, Qianqian and Huang, Ru},
  booktitle = {2024 IEEE International Electron Devices Meeting (IEDM)},
  pages = {1--4},
  year = {2024},
  organization = {IEEE},
  doi = {10.1109/IEDM50854.2024.10873447}
}
- IEEE TED 2024
  Hafnia-Based High-Disturbance-Immune and Selector-Free Cross-Point FeRAM
  Zhiyuan Fu, Shengjie Cao, Hao Zheng, Jin Luo, Qianqian Huang, and Ru Huang
  IEEE Transactions on Electron Devices, 2024
This study presents an experimental demonstration of 3-D-stackable hafnia-based selector-free cross-point FeRAM, with enhanced disturbance immunity achieved through design technology co-optimization (DTCO). Considering ferroelectric (FE) dynamics, the disturbance behavior of FE devices has been systematically and quantitatively examined using the proposed “pulse-disturb” analysis method. Through the optimization of grain uniformity and interfacial layers, the fabricated Hf0.5Zr0.5O2 (HZO) FE capacitor exhibits a large grain size exceeding 30 nm with the record-best disturbance immunity among FE-HZO. It also achieves a significant improvement of MW in selector-free FeRAM operation and enhanced remnant polarization (Pr) of approximately 23 μC/cm², low operation voltage (2.4 V), high endurance (10¹³ cycles), long retention capability (ten years), and excellent potential for 3-D stacking. Moreover, to address the multiple-pulse disturb issue, a novel “disturb-recovery” pulsing method is proposed, showing multidisturb-free operation for practical cross-point array applications. Based on the above strategies, a 1-kbit selector-free cross-point FeRAM array is experimentally demonstrated with successful read/write operation, indicating its great potential for high-density and low-power memory applications.
@article{fu2024hafnia,
  title = {Hafnia-Based High-Disturbance-Immune and Selector-Free Cross-Point FeRAM},
  author = {Fu, Zhiyuan and Cao, Shengjie and Zheng, Hao and Luo, Jin and Huang, Qianqian and Huang, Ru},
  journal = {IEEE Transactions on Electron Devices},
  year = {2024},
  publisher = {IEEE},
  doi = {10.1109/TED.2024.3369569}
}
2023
- IEDM 2023
  First demonstration of hafnia-based selector-free FeRAM with high disturb immunity through design technology co-optimization
  Zhiyuan Fu*, Shengjie Cao*, Hao Zheng, Jin Luo, Qianqian Huang, and Ru Huang
  In 2023 International Electron Devices Meeting (IEDM), 2023
In this work, 3D-stackable hafnia-based selector-free FeRAM is experimentally demonstrated for the first time, showing significantly improved disturb immunity through design technology co-optimization. With ferroelectric (FE) dynamics considered, based on the proposed pulse-disturb analysis method for FE capacitors, the disturb behavior of FE-based cross-point arrays has been systematically and quantitatively investigated. Through grain uniformity and interfacial layer optimization, the fabricated Hf0.5Zr0.5O2 (HZO) FE capacitor shows the record-best disturb immunity among FE-HZO and a 71.3% MW improvement for FeRAM operation due to the large grain size (>30 nm), along with the advantages of enhanced remnant polarization (Pr) (23 μC/cm²), low operation voltage (2.4 V), high endurance (10¹³ cycles), long retention (10 years) and excellent 3D-stackable potential. Moreover, to address the multiple-pulse disturb issue, a new disturb-recovery pulsing method is further proposed, showing multi-disturb-free operation for practical cross-point array applications. Based on the above strategies, the first 1 kbit cross-point array for selector-free FeRAM based on the optimized HZO devices is experimentally demonstrated with successful read/write operation, indicating its great potential for high-density and low-power memory applications.
@inproceedings{fu2023first,
  title = {First demonstration of hafnia-based selector-free FeRAM with high disturb immunity through design technology co-optimization},
  author = {Fu, Zhiyuan and Cao, Shengjie and Zheng, Hao and Luo, Jin and Huang, Qianqian and Huang, Ru},
  booktitle = {2023 International Electron Devices Meeting (IEDM)},
  pages = {1--4},
  year = {2023},
  organization = {IEEE},
  doi = {10.1109/IEDM45741.2023.10413887}
}
2021
- IEEE TVLSI 2021
  DyTAN: Dynamic Ternary Content Addressable Memory Using Nanoelectromechanical Relays
  Hongtao Zhong*, Shengjie Cao*, Li Jiang, Xia An, Vijaykrishnan Narayanan, Yongpan Liu, Huazhong Yang, and Xueqing Li
  IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Nov 2021
Ternary content addressable memory (TCAM) is one type of associative memory and has been widely used in caches, routers, and many other mapping-aware applications. While the conventional SRAM-based TCAM is high-speed but bulky, there have been denser but slower and less reliable nonvolatile TCAMs using nonvolatile memory (NVM) devices. Meanwhile, some CMOS TCAMs using dynamic memories have also been proposed. Although dynamic TCAM could be denser than the 16T SRAM TCAM and more reliable than the nonvolatile TCAMs, CMOS dynamic TCAMs still suffer from the row-by-row refresh energy and time overheads. In this article, we propose dynamic TCAM using nanoelectromechanical (NEM) relays (DyTAN), and utilize one-shot refresh (OSR) to solve the memory refresh problem. By exploiting the unique NEM relay characteristics, DyTAN outperforms the existing works in the balance between density, speed, and power efficiency. Compared with the 16T SRAM-based TCAM, the 5T CMOS dynamic TCAM, the 2T2R TCAM, and the 2FeFET TCAM, evaluations show that the proposed DyTAN reduces the write energy by up to 2.3×, 1.3×, 131×, and 13.5×, and improves the search energy-delay-product (EDP) by up to 12.7×, 1.7×, 1.3×, and 2.8×, respectively.
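As background for the TCAM entries, the sketch below shows the ternary matching rule in software: each stored bit is 0, 1, or a don't-care X, and a search returns every row whose non-X bits equal the search key. The string encoding and the names `matches` and `search` are assumptions for illustration. A hardware TCAM compares all rows in parallel, whereas this loop is sequential, and nothing here models the NEM relay cell or one-shot refresh.

```python
def matches(entry: str, key: str) -> bool:
    """True if every non-'X' bit of the stored entry equals the corresponding key bit."""
    return len(entry) == len(key) and all(e == "X" or e == k for e, k in zip(entry, key))

def search(table, key):
    """Return the indices of all rows that match the key."""
    return [row for row, entry in enumerate(table) if matches(entry, key)]

# Hypothetical 4-bit table; 'X' marks a don't-care bit.
table = ["10X1", "1XX0", "0000", "XXXX"]
hits = search(table, "1011")
print(hits)
```

For the key `1011`, rows 0 and 3 match: row 0 agrees on its three specified bits, and row 3 is all don't-care.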
@article{9570131,
  author = {Zhong, Hongtao and Cao, Shengjie and Jiang, Li and An, Xia and Narayanan, Vijaykrishnan and Liu, Yongpan and Yang, Huazhong and Li, Xueqing},
  journal = {IEEE Transactions on Very Large Scale Integration (VLSI) Systems},
  title = {DyTAN: Dynamic Ternary Content Addressable Memory Using Nanoelectromechanical Relays},
  year = {2021},
  volume = {29},
  number = {11},
  pages = {1981--1993},
  keywords = {Relays;Nanoelectromechanical systems;Nonvolatile memory;Logic gates;FeFETs;Electrodes;Very large scale integration;Dynamic ternary content addressable memory (TCAM);low power;nanoelectromechanical (NEM) relay;TCAM},
  doi = {10.1109/TVLSI.2021.3115622},
  issn = {1557-9999},
  month = nov
}
- DATE 2021
  Dynamic Ternary Content-Addressable Memory Is Indeed Promising: Design and Benchmarking Using Nanoelectromechanical Relays
  Hongtao Zhong*, Shengjie Cao*, Huazhong Yang, and Xueqing Li
  In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Feb 2021
Ternary content addressable memory (TCAM) has been a critical component in caches, routers, etc., in which density, speed, power efficiency, and reliability are the major design targets. Existing options include the conventional low-write-power but bulky SRAM-based TCAM design, and denser but less reliable or higher-write-power TCAM designs using nonvolatile memory (NVM) devices. Meanwhile, some TCAM designs using dynamic memories have also been proposed. Although dynamic TCAM designs are denser than CMOS SRAM TCAM and more reliable than NVM TCAM, the conventional row-by-row refresh operations become a bottleneck by interfering with normal TCAM activities. Therefore, this paper proposes a custom low-power dynamic TCAM using nanoelectromechanical (NEM) relay devices, utilizing one-shot refresh to solve the memory refresh problem. By harnessing the unique NEM relay characteristics with a proposed novel cell structure, the proposed TCAM occupies a small footprint of only 3 transistors (with two NEM relays integrated on top through the back-end-of-line process), which significantly outperforms the density of 16-transistor SRAM-based TCAM. In addition, evaluations show that the proposed TCAM improves the write energy efficiency by 2.31×, 131×, and 13.5× over SRAM, RRAM, and FeFET TCAMs, respectively; the search energy-delay-product is improved by 12.7×, 1.30×, and 2.83× over SRAM, RRAM, and FeFET TCAMs, respectively.
@inproceedings{9474177,
  author = {Zhong, Hongtao and Cao, Shengjie and Yang, Huazhong and Li, Xueqing},
  booktitle = {2021 Design, Automation \& Test in Europe Conference \& Exhibition (DATE)},
  title = {Dynamic Ternary Content-Addressable Memory Is Indeed Promising: Design and Benchmarking Using Nanoelectromechanical Relays},
  year = {2021},
  pages = {1100--1103},
  keywords = {Associative memory;Nanoelectromechanical systems;Nonvolatile memory;Random access memory;Benchmark testing;Reliability engineering;Nanoscale devices;Ternary content addressable memory (TCAM);low-power;NEM relay;beyond-CMOS;dynamic memory},
  doi = {10.23919/DATE51398.2021.9474177},
  issn = {1558-1101},
  month = feb
}