Updated on 2026/06/19

AWANO Hiromitsu
 
Organization
Graduate School of Engineering Information and Communication Engineering 2 Professor
Undergraduate School
School of Engineering Electrical Engineering, Electronics, and Information Engineering
Title
Professor
 

Papers 75

  1. Frieren: A Fault-Tolerant Reconfigurable Energy-Efficient Computing Architecture With Enhanced Reliability in Harsh Environments Reviewed

    Cheng Q., Li Q., Dong W., Zhang M., Zhang R., Huang M., Yu H., Shi Y., Awano H., Sato T., Saligane M., Lin L., Hashimoto M.

    IEEE Transactions on Computers   Vol. 75 ( 7 ) page: 2589 - 2603   2026.7

    Publisher:IEEE Transactions on Computers  

    In harsh environments such as space, strong radiation effects often induce single-event effects that threaten the reliability of computing systems. Meanwhile, edge artificial intelligence (AI) processors deployed in these conditions must not only tolerate faults but also operate under stringent resource constraints, while still ensuring efficient task execution. Achieving high-performance and energy-efficient computation with adaptive reliability in such harsh conditions is therefore of great importance. This work presents Frieren, a fault-tolerant and reconfigurable computing architecture for reliable operation in harsh environments. A 22 nm system-on-chip (SoC) prototype is implemented to validate Frieren and evaluate its resilience to soft errors. Frieren operates in three primary modes: (1) a high-throughput computation engine mode, (2) a multi-core mode featuring adaptive dual-core lockstep (DCLS) for fault tolerance and programmable parallel computing, and (3) a JTAG-assisted scan-chain-based fault injection (FI) mode. The first two modes fully share processing elements and memory resources, ensuring zero data movement during mode transitions, while the third mode supports pre-deployment reliability evaluation by emulating transient faults. Both irradiation and hardware-level FI experiments are conducted to verify reliability, confirming the robustness of Frieren. Radiation tests of the SoC indicate that DCLS can correct up to about 83% of RISC-V errors, while customized parallel computing in multi-core mode achieves a 17.77× latency reduction. Moreover, the SoC delivers up to 17.18 TOPS/W in computation engine mode and 1.92 TOPS/W in multi-core mode, demonstrating an energy-efficient and resilient platform for AI deployment under harsh conditions. In real workloads, the SoC achieves peak energy efficiencies of 14.72 TOPS/W on SuperYOLO and 12.33 TOPS/W on DROID-SLAM.

    DOI: 10.1109/TC.2026.3688989

    Scopus

  2. Analog In-Memory Computing from a Memory-Agnostic Perspective: Theory, Nonidealities, and Hardware-Aware Training Reviewed Open Access

    SAKEMI Yusuke, AWANO Hiromitsu, MORIE Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E109.A ( 5 ) page: 840 - 859   2026.5

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Analog in-memory computing (AIMC) executes matrix-vector multiplications (MVMs) inside memory to alleviate the von Neumann bottleneck and improve energy efficiency. This tutorial classifies AIMC circuits in a memory-agnostic way, namely, current-domain, charge-domain, charge-redistribution, capacitive-division, resistive-division, and time-domain IMC. We explain each type of AIMC circuit with simple mathematical models. Furthermore, we review key device and circuit nonidealities (e.g., process variation, IR drop, sneak paths, and I/O quantization/nonlinearity) with practical mitigation strategies in circuitry and peripherals. Finally, we organize hardware-aware training into three complementary families (probabilistic/precise modeling, physical modeling, and hardware-in-the-loop techniques), providing a mathematically grounded bridge between circuits and learning for robust, scalable AIMC accelerators.

    DOI: 10.1587/transfun.2025gci0001

    Web of Science

    Scopus

    CiNii Research

  3. Biologically Constrained DNA Encoding With Triplet Networks for Similarity Image Retrieval Reviewed Open Access

    Koike T., Awano H., Sato T.

    IEEE Transactions on Computational Biology and Bioinformatics   Vol. 23 ( 3 ) page: 1240 - 1252   2026.5

    Publisher:IEEE Transactions on Computational Biology and Bioinformatics  

    As the volume of digital data continues to grow exponentially, DNA has emerged as a promising medium for long-term data storage due to its high density and durability. For enabling data retrieval via DNA's biochemical reactions, the encoding strategy plays a critical role. This paper proposes a training framework for a DNA encoder that improves both accuracy and training efficiency in content-based image retrieval by incorporating deep metric learning. In addition, we introduce loss functions that enforce biological constraints, specifically homopolymer length and GC content, thereby improving the biochemical stability of the generated DNA sequences. To evaluate the effectiveness of the proposed method, we conduct quantitative assessments based on image classification performance. Simulations on the CIFAR-10 and CIFAR-100 datasets demonstrate that our method achieves classification accuracy comparable to CNN-based baselines and a 20-fold speedup over the training time of the existing method. Moreover, the generated DNA sequences enable strict control of homopolymer length and maintain GC content within the optimal 40-60% range, significantly improving biological feasibility compared to baseline methods.

    DOI: 10.1109/TCBBIO.2026.3673740

    Open Access

    Scopus
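
The two biological constraints named in the abstract of entry 3, homopolymer length and GC content, can be checked on a candidate sequence in a few lines. This is only a minimal sketch and not the paper's loss functions: the function names and the `max_run=3` limit are assumptions made for illustration, while the 40-60% GC window is the range quoted in the abstract.

```python
from itertools import groupby


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq):
    """Return the length of the longest run of one repeated base."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)


def satisfies_constraints(seq, max_run=3, gc_low=0.40, gc_high=0.60):
    """Check both constraints; max_run is an assumed limit, not from the paper."""
    return max_homopolymer(seq) <= max_run and gc_low <= gc_content(seq) <= gc_high
```

For example, `satisfies_constraints("ACGTACGT")` holds (no repeated bases, GC content 0.5), whereas a sequence starting with `AAAAA` fails the assumed run-length limit.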

  4. Improving Robustness of Leakage-Based MOSFET Reservoir Computing Using Adaptive Pulse-Width Control Reviewed

    Seki R., Utsunomiya M., Chen Y.G., Awano H., Sato T.

    IEEE International Conference on Microelectronic Test Structures     2026

    Publisher:IEEE International Conference on Microelectronic Test Structures  

    This paper proposes a method to enhance the robustness of Leakage-based MOSFET Echo State Network (LMESN) against environmental variations. LMESN is a hardware reservoir computing architecture that exploits MOSFET subthreshold leakage currents. The proposed method consists of two components: adaptive tuning of the minimum input pulse width based on temperature to compensate for leakage-current change, and the use of Lasso regression for output-weight training to suppress errors arising from temperature-coefficient variations. Simulation results on a time-series classification task confirm that the inference accuracy is maintained across temperatures ranging from 5 to 75°C without requiring retraining over this temperature range.

    DOI: 10.1109/ICMTS69943.2026.11471724

    Scopus

  5. Online Training and Inference System on Edge FPGA Using Delayed Feedback Reservoir Reviewed Open Access

    Ikeda, S; Awano, H; Sato, T

    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS   Vol. 44 ( 9 ) page: 3323 - 3335   2025.9

    Publisher:IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems  

    A delayed feedback reservoir (DFR) is a hardware-friendly reservoir computing system. Implementing DFRs in embedded hardware requires efficient online training. However, two main challenges prevent this: 1) hyperparameter selection, which is typically done by offline grid search, and 2) training of the output linear layer, which is memory-intensive. This article introduces a fast and accurate parameter optimization method for the reservoir layer utilizing backpropagation and gradient descent by adopting a modular DFR model. A truncated backpropagation strategy is proposed to reduce memory consumption associated with the expansion of the recursive structure while maintaining accuracy. The computation time is significantly reduced compared to grid search. In addition, an in-place Ridge regression for the output layer via 1-D Cholesky decomposition is presented, reducing memory usage to 1/4. These methods enable the realization of an online edge training and inference system of DFR on an FPGA, reducing computation time by about 1/13 and power consumption by about 1/27 compared to software implementation on the same board.

    DOI: 10.1109/TCAD.2025.3541565

    Web of Science

    Scopus
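
The abstract of entry 5 mentions training the output layer by Ridge regression through a Cholesky decomposition. The sketch below shows the textbook form of that idea: build the normal equations, factor them as L·Lᵀ, then solve by forward and backward substitution. It is an illustration under assumptions, not the paper's in-place 1-D variant; the names `cholesky` and `ridge_fit` and the plain nested-list storage are hypothetical choices.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y via Cholesky factorization."""
    d = len(X[0])
    # Regularized Gram matrix and right-hand side of the normal equations.
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    L = cholesky(A)
    # Forward substitution: L z = b.
    z = [0.0] * d
    for i in range(d):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Backward substitution: L^T w = z.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (z[i] - sum(L[k][i] * w[k] for k in range(i + 1, d))) / L[i][i]
    return w
```

Because the factor L is triangular, only about half of the matrix entries need to be stored, which is the usual reason a Cholesky-based solver suits memory-limited hardware.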

  6. A 22nm Resource-Frugal Hyper-Heterogeneous Multi-Modal System-on-Chip Towards In-Orbit Computing Reviewed

    Cheng Q., Li Q., Dong W., Zhang M., Zhang R., Huang M., Yu H., Shi Y., Awano H., Sato T., Lin L., Hashimoto M.

    Proceedings of the Custom Integrated Circuits Conference     2025

    Publisher:Proceedings of the Custom Integrated Circuits Conference  

    Integrating artificial intelligence (AI) into in-orbit computing offers significant benefits, but current satellites face challenges in processing large sensor data volumes due to limited communication and computing resources, resulting in high latency [1]. Intelligent Early Discard (IED) [2] addresses this by filtering irrelevant data early, optimizing bandwidth and data usage. However, this demands high-performance onboard computing for efficient data preprocessing and AI acceleration [3], [4]. Additionally, space radiation, including solar energetic particles and cosmic rays, can cause Single Event Upsets (SEUs) in satellite systems [5], risking mission failure and increasing reliability demands [6], [7]. To tackle these challenges, we propose a resource-frugal hyper-heterogeneous System-on-Chip (SoC) architecture for in-orbit computing. The SoC features two modes: (1) a specialized computation engine for AI acceleration, and (2) a multicore mode with dual-core lock-step (DCLS) and vector computing for efficient, fault-tolerant data processing (Fig. 1). This resource-frugal architecture enables full sharing of Processing Elements (PEs) and memories for dynamic workload allocation, enhancing in-orbit performance by processing IED data directly on the satellite and reducing costly data transmission to Earth.

    DOI: 10.1109/CICC63670.2025.10983627

    Scopus

  7. A Radiation-Hardened Neuromorphic Imager with Self-Healing Spiking Pixels and Unified Spiking Neural Network for Space Robotics Reviewed

    Cheng Q., Li Q., Yang Z., Kong Z., Niu G., Liang Y., Li J., Park J.H., Liao W., Awano H., Sato T., Lin L., Hashimoto M.

    Digest of Technical Papers Symposium on VLSI Technology     2025

    Publisher:Digest of Technical Papers Symposium on VLSI Technology  

    A radiation-hardened neuromorphic imager prototype is developed for space exploration, featuring a fully spike-based neuromorphic vision system architecture, in-pixel self-healing against radiation-induced damage, and an integrated unified spiking neural network (USNN) with adaptive neurons and synapses and contrast enhancement at low-contrast conditions. Self-healing reduces dark current by 6.25× at 14 kGy cumulative dose, recovering recognition accuracy by 27.8%. USNN consumes 0.0529 pJ/SOP at 5,000 events/s.

    DOI: 10.23919/VLSITechnologyandCir65189.2025.11075180

    Scopus

  8. Beamforming Feedback-Based Respiration and Heart Rate Estimation Toward Firmware-Agnostic WiFi Sensing Reviewed Open Access

    Kanda, T; Kondo, S; Shimomura, H; Sato, T; Awano, H; Yamamoto, K

    IEEE ACCESS   Vol. 13   page: 146008 - 146019   2025

    Publisher:IEEE Access  

    WiFi-based vital sign monitoring has attracted growing attention for its potential applications in contactless healthcare. However, most existing techniques rely on channel state information (CSI), which typically requires custom firmware and specific chipsets. To address this issue, this study explores firmware-agnostic respiration and heart rate estimation using beamforming feedback (BFF), a compressed representation of CSI. This eliminates the need for custom firmware or chipset support, enabling broader applicability using off-the-shelf devices. However, it is not trivial to apply CSI-based estimation techniques to BFF-based estimation because the information content and data structure of BFF differ from those of CSI. The proposed BFF-based estimation algorithm addresses this issue by adapting the CSI-based estimation techniques to work with BFF. The algorithm consists of four key components: subcarrier selection, data calibration, signal extraction, and respiration and heart rate estimation. The performance of the BFF-based estimation algorithm is experimentally validated in several indoor environments using commodity IEEE 802.11ac devices. Results show that respiration rate and heart rate can be estimated with average errors below 1 breath/min and 10 beats/min, respectively. Furthermore, accuracy comparisons between BFF-based and CSI-based estimations are provided to investigate the impact of lossy compression from CSI to BFF, specifically singular value decomposition (SVD) calculation and quantization. Comparisons reveal that the accuracy degradation of the BFF-based estimation compared to CSI-based estimation is primarily caused by the quantization rather than the SVD calculation.

    DOI: 10.1109/ACCESS.2025.3600278

    Open Access

    Web of Science

    Scopus

  9. Zero-Aware Regularization for Energy-Efficient Inference on Akida Neuromorphic Processor Reviewed

    Habara, T; Sato, T; Awano, H

    2025 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS     2025

    Publisher:Proceedings IEEE International Symposium on Circuits and Systems  

    Spiking Neural Networks (SNNs) and their hardware accelerators have emerged as promising systems for advanced cognitive processing with low power consumption. Although the development of SNN hardware accelerators is particularly active, research on the intelligent use of these accelerators remains limited. This study focuses on the SNN accelerator Akida, a commercially available neuromorphic processor, and presents a novel training method designed to reduce inference energy by leveraging the unique architecture of the hardware. Specifically, we apply sparse constraints on neuron activations and synaptic connection weights, aiming to minimize the number of firing neurons by considering Akida's batch spike processing feature. Our proposed method was applied to a network consisting of three convolutional layers and two fully connected layers. In the MNIST image classification task, the activations became 76.1% sparser, and the weights became 22.1% sparser, resulting in a 13.8% reduction in energy consumption per image.

    DOI: 10.1109/ISCAS56072.2025.11044086

    Web of Science

    Scopus

  10. Window Function-less DFT with Reduced Noise and Latency for Real-Time Music Analysis Reviewed

    Biesinger C., Awano H., Hashimoto M.

    European Signal Processing Conference     page: 431 - 435   2025

    Publisher:European Signal Processing Conference  

    Music analysis applications demand algorithms that can provide both high time and frequency resolution while minimizing noise in an already-noisy signal. Real-time analysis additionally demands low latency and low computational requirements. We propose a DFT-based algorithm that accomplishes all these requirements by extending a method that post-processes DFT output without the use of window functions. Our approach yields greatly reduced sidelobes and noise, and improves time resolution without sacrificing frequency resolution. We use exponentially spaced output bins which directly map to notes in music. The resulting improved performance, compared to existing FFT and DFT-based approaches, creates possibilities for improved real-time visualizations, and contributes to improved analysis quality in other applications such as automatic transcription.

    DOI: 10.23919/EUSIPCO63237.2025.11226525

    Scopus
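
The abstract of entry 10 refers to exponentially spaced output bins that map directly to musical notes. A minimal sketch of such a bin layout follows, assuming 12-tone equal temperament, A4 = 440 Hz, and MIDI note numbering; none of these choices is stated in the abstract, and the function names are hypothetical. It shows only the bin center frequencies, not the paper's DFT post-processing.

```python
def note_frequency(midi_note, a4_hz=440.0):
    """Center frequency in Hz of a MIDI note under 12-tone equal temperament."""
    # Each semitone multiplies the frequency by 2 ** (1 / 12); note 69 is A4.
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


def note_bins(low_note, high_note):
    """Exponentially spaced bin centers, one per semitone, inclusive of both ends."""
    return [note_frequency(m) for m in range(low_note, high_note + 1)]
```

With this layout, adjacent bins differ by a constant frequency ratio instead of a constant offset in Hz, so one octave always spans twelve bins regardless of register.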

  11. Weighted Range-Constrained Ising-Model Decoder for Quantum Error Correction Reviewed

    Guo, XY; Awano, H; Sato, T

    PROCEEDINGS OF THE 62ND ANNUAL ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2025     2025

    Publisher:Proceedings Design Automation Conference  

    Ising model-based Quantum Error Correction decoders reduce topological complexity compared to classical decoders. However, the SOTA Ising decoder has a higher time complexity than union-find (UF) and a lower threshold than minimum-weight perfect-matching (MWPM). We propose the Weighted Range-Constrained Ising Model-Based (WRIM) decoder. WRIM uses a polygonal region to enclose flipped syndromes, ensuring the coverage of all potential error chains while optimizing coupling and external field coefficients. WRIM reduces the variable count by 97.8x, achieves microsecond-level decoding, and has a worst-case time complexity of O(n), outperforming UF. WRIM exhibits threshold behavior up to 10.7-11.0%, surpassing the MWPM's highest reported threshold.

    DOI: 10.1109/DAC63849.2025.11133309

    Web of Science

    Scopus

  12. SOME: Symmetric One-Hot Matching Elector - A Lightweight Microsecond Decoder for Quantum Error Correction Reviewed

    Guo X., Miao G., Nishizawa S., Awano H., Kimura S., Sato T.

    IEEE ACM International Conference on Computer Aided Design Digest of Technical Papers Iccad     2025

    Publisher:IEEE ACM International Conference on Computer Aided Design Digest of Technical Papers Iccad  

    Conventional quantum error correction (QEC) decoders such as Minimum-Weight Perfect Matching (MWPM) and Union-Find (UF) offer high thresholds and fast decoding, respectively, but both suffer from high topological complexity. In contrast, Ising model-based decoders reduce topological complexity but demand considerable decoding time. We propose the Symmetric One-Hot Matching Elector (SOME), a novel decoder that reformulates the QEC decoding task as a Quadratic Unconstrained Binary Optimization (QUBO) problem - termed the One-Hot QUBO (OHQ). Each variable in the QUBO represents whether a given pair of flipped syndromes is matched, while the error probabilities between the pair are encoded as interaction coefficients (weight). Constraints ensure that each flipped syndrome is matched exactly once. Valid solutions of OHQ correspond to self-inverse permutation matrices, characterized by symmetric one-hot encoding. To solve the OHQ efficiently, SOME reformulates the decoding task as the construction of permutation matrices that minimize the total weight. It initializes each candidate matrix from one of the minimum-weight syndrome pairs, then iteratively appends additional pairs in ascending order of weight, and finally selects the permutation matrix with the lowest total energy. SOME achieves up to a 99.9x reduction in variable count and reduces decoding times from milliseconds to microseconds on a single-threaded commodity CPU. OHQ also maintains performance up to a 10.5% physical error rate, surpassing the highest known threshold of MWPM.

    DOI: 10.1109/ICCAD66269.2025.11240965

    Scopus

  13. Random Telegraph Noise Observed on 65-nm Bulk pMOS Transistors at 3.8K Reviewed

    Kawakami, T; Sato, T; Awano, H

    30TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2025     page: 1438 - 1443   2025

    Publisher:Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC  

    This paper presents a detailed study on Random Telegraph Noise (RTN) behavior under cryogenic conditions. The study leverages a device array, BTIarray, to statistically measure RTN in a temperature range from room temperature down to 3.8 K. The measurement results indicate that while RTN's impact decreases in the low-temperature region at about 100 K, it becomes more pronounced at lower temperatures, especially in transistors with shorter channel lengths. This research advances the understanding of RTN in cryogenic environments, offering essential insights for future integrated circuit (IC) design.

    DOI: 10.1145/3658617.3703140

    Web of Science

    Scopus

  14. Lookup Table-based Multiplication-free All-digital DNN Accelerator Featuring Self-Synchronous Pipeline Accumulation Reviewed

    Tagata, H; Sato, T; Awano, H

    PROCEEDINGS OF THE 62ND ANNUAL ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2025     2025

    Publisher:Proceedings Design Automation Conference  

    Deep neural networks (DNNs) have been widely applied in our society, yet reducing power consumption due to large-scale matrix computations remains a critical challenge. MADDNESS is a known approach to improving energy efficiency by substituting matrix multiplication with table lookup operations. Previous research has employed large analog computing circuits to convert inputs into LUT addresses, which presents challenges to area efficiency and computational accuracy. This paper proposes a novel MADDNESS-based all-digital accelerator featuring a self-synchronous pipeline accumulator, resulting in a compact, energy-efficient, and PVT-invariant computation. Post-layout simulation using a commercial 22 nm process showed that 2.5× higher energy efficiency (174 TOPS/W) and 5× higher area efficiency (2.01 TOPS/mm²) can be achieved compared to the conventional accelerator.

    DOI: 10.1109/DAC63849.2025.11132097

    Web of Science

    Scopus

  15. GaitCloud: Leveraging Spatial-temporal Information for LiDAR-based Gait Recognition with A True-3D Gait Representation Reviewed

    Zhang, SX; Awano, H; Sato, T

    2025 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV     page: 2849 - 2858   2025

    Publisher:Proceedings 2025 IEEE Winter Conference on Applications of Computer Vision Wacv 2025  

    Gait recognition using point clouds captured by LiDAR (Light Detection And Ranging) sensors offers better adaptability to variations in walking conditions compared to camera-based methods, due to the precise spatial information captured. However, existing methods typically project the point clouds into a sequence of 2D depth images extended along the time dimension and adopt gait recognition networks optimized for camera-based approaches. This planar projection compromises the integrity of the 3D coordinates (length, width, and depth) and results in severe silhouette deformations with varied observation viewpoints, similar to the camera-based methods. To better utilize the spatial information in gait point clouds, we propose a true 3D gait representation using efficient point cloud voxelization, termed GaitCloud. Additionally, we explore the unique nature of LiDAR-captured point clouds and present two improved modules adapted to our method, called Layer Encoder (LE) and Horizontal Convolutional Pooling (HCP). Evaluation results using the open-access gait dataset SUSTech1K show that our method outperforms the state-of-the-art, achieving recognition accuracies of 93.1 % and 89.2 % in cross-view and variance experiments, respectively. These results demonstrate that 3D gait representation based on point cloud voxelization more effectively utilizes spatial information than depth images, offering new possibilities for high-performance LiDAR-based gait recognition. The source code is available at https://github.com/seagrgz/GaitCloud-master.git.

    DOI: 10.1109/WACV61041.2025.00282

    Web of Science

    Scopus

  16. A Robust and Energy Efficient Hyperdimensional Computing System for Voltage-scaled Circuits Reviewed Open Access

    Liang, DH; Awano, H; Miura, N; Shiomi, J

    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS   Vol. 23 ( 6 )   2024.11

    Publisher:ACM Transactions on Embedded Computing Systems  

    Voltage scaling is one of the most promising approaches for energy efficiency improvement but also brings challenges to fully guaranteeing stable operation in modern VLSI. To tackle such issues, we further extend the DependableHD to the second version DependableHDv2, a HyperDimensional Computing (HDC) system that can tolerate bit-level memory failure in the low voltage region with high robustness. DependableHDv2 introduces the concept of margin enhancement for model retraining and utilizes noise injection to improve the robustness, which is capable of application in most state-of-the-art HDC algorithms. We additionally propose the dimension-swapping technique, which aims at handling the stuck-at errors induced by aggressive voltage scaling in the memory cells. Our experiment shows that under 8% memory stuck-at error, DependableHDv2 exhibits a 2.42% accuracy loss on average, which achieves a 14.1× robustness improvement compared to the baseline HDC solution. The hardware evaluation shows that DependableHDv2 supports the systems to reduce the supply voltage from 430 mV to 340 mV for both item Memory and Associative Memory, which provides a 41.8% energy consumption reduction while maintaining competitive accuracy performance.

    DOI: 10.1145/3620671

    Web of Science

    Scopus

  17. BayesianSpikeFusion: accelerating spiking neural network inference via Bayesian fusion of early prediction Reviewed Open Access

    Habara, T; Sato, T; Awano, H

    FRONTIERS IN NEUROSCIENCE   Vol. 18   page: 1420119   2024.8

    Language:English   Publisher:Frontiers in Neuroscience  

    Spiking neural networks (SNNs) have garnered significant attention due to their notable energy efficiency. However, conventional SNNs rely on spike firing frequency to encode information, necessitating a fixed sampling time and leaving room for further optimization. This study presents a novel approach to reduce sampling time and conserve energy by extracting early prediction results from the intermediate layer of the network and integrating them with the final layer's predictions in a Bayesian fashion. Experimental evaluations conducted on image classification tasks using MNIST, CIFAR-10, and CIFAR-100 datasets demonstrate the efficacy of our proposed method when applied to VGGNets and ResNets models. Results indicate a substantial energy reduction of 38.8% in VGGNets and 48.0% in ResNets, illustrating the potential for achieving significant efficiency gains in spiking neural networks. These findings contribute to the ongoing research in enhancing the performance of SNNs, facilitating their deployment in resource-constrained environments. Our code is available on GitHub: https://github.com/hanebarla/BayesianSpikeFusion.

    DOI: 10.3389/fnins.2024.1420119

    Open Access

    Web of Science

    Scopus

    PubMed

  18. DNA-based Similar Image Retrieval via Triplet Network-driven Encoder Reviewed

    Koike, T; Awano, H; Sato, T

    2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE     2024

  19. Triplet Network-Based DNA Encoding for Enhanced Similarity Image Retrieval Reviewed

    Koike, T; Awano, H; Sato, T

    PROCEEDINGS OF THE 61ST ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2024     2024

    Publisher:Proceedings Design Automation Conference  

    With the exponential growth of digital data, DNA is emerging as an attractive medium for storage and computing. Thus, design methods for encoding, storing, and searching digital data within DNA storage are of utmost importance. This paper introduces image classification as a measurable task for evaluating the performance of DNA encoders in similar image searches. Furthermore, we propose a novel triplet network-based DNA encoder to improve the accuracy and efficiency. The evaluation using the CIFAR-100 dataset demonstrates that the proposed encoder outperforms existing encoders in retrieving similar images, with an accuracy of 0.77, which is equivalent to 94% of the practical upper limit, and 16 times faster training time.

    DOI: 10.1145/3649329.3657320

    Web of Science

    Scopus

  20. StrideHD: A Binary Hyperdimensional Computing System Utilizing Window Striding for Image Classification Reviewed Open Access

    Liang, DH; Shiomi, J; Miura, N; Awano, H

    IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS   Vol. 5   page: 211 - 223   2024

    Publisher:IEEE Open Journal of Circuits and Systems  

    Hyper-Dimensional (HD) computing is a brain-inspired learning approach for efficient and fast learning on today's embedded devices. HDC first encodes all data points to high-dimensional vectors called hypervectors and then efficiently performs the classification task using a well-defined set of operations. Although HDC achieved reasonable performances in several practical tasks, it comes with huge memory requirements since the data point should be stored in a very long vector having thousands of bits. To alleviate this problem, we propose a novel HDC architecture, called StrideHD. By utilizing the window striding in image classification, StrideHD enables HDC system to be trained and tested using binary hypervectors and achieves high accuracy with fast training speed and significantly low hardware resources. StrideHD encodes data points to distributed binary hypervectors and eliminates the expensive Channel item Memory (CiM) and item Memory (iM) in the encoder, which significantly reduces the required hardware cost for inference. Our evaluation also shows that compared with two popular HD algorithms, the single-pass StrideHD model achieves a 27.6× and 8.2× reduction in inference memory cost without hurting the classification accuracy, while the iterative mode further provides 8.7× memory efficiency. Under the same inference memory cost, our single-pass mode StrideHD averagely achieves 13.56% accuracy improvement in comparison with the single-pass baseline HD, which is a similar performance even in comparison with the costly iterative baseline HD models. As an extension, the iterative retraining mode of StrideHD averagely provides 11.33% accuracy improvement to its single-pass mode, which can be accomplished in fewer iterations in comparison with the baseline HD algorithms. The hardware implementation also demonstrates that StrideHD achieves over 9.9× and 28.8× reduction compared with baseline in area and power, respectively.

    DOI: 10.1109/OJCAS.2024.3401028

    Open Access

    Web of Science

    Scopus

  21. S³M: Static Semi-Segmented Multipliers for Energy-efficient DNN Inference Accelerators Reviewed Open Access

    Zhang, MT; Cheng, Q; Awano, H; Lin, LY; Hashimoto, M

    2024 IEEE 42ND INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, ICCD     page: 16 - 23   2024

    Publisher:Proceedings IEEE International Conference on Computer Design VLSI in Computers and Processors  

    Approximate multipliers offer an efficient approach to reduce power consumption in compute-intensive applications, such as Deep Neural Networks (DNNs). However, current 8-bit approximate multipliers struggle to maintain high accuracy across various DNN applications. In this paper, we highlight challenges in 8-bit multiplier designs with body approximation strategies and evaluate the effectiveness of input approximation methods. Recognizing that exact multipliers with quantization bit-widths below 8 bits have demonstrated superior performance, we aim to explore whether alternative input approximation methods can provide an even better tradeoff between accuracy and energy consumption. To this end, by exploiting the fact that weight operand values are smaller than activations and prepared offline in DNNs, we simplify a static segmented multiplier (SSM) into a static semi-segmented multiplier (S3M), achieving a 31.58% reduction in power-delay product (PDP) compared to the original SSM, with similar classification accuracy. Additionally, we propose Coded S3M with optimized memory usage and implement various multipliers on a systolic array-based accelerator. Experimental results show that the proposed S3M and Coded S3M outperform existing 8-bit approximate multipliers in DNN applications, effectively bridging the PDP and inference accuracy tradeoff observed across exact commercial IP multipliers of varied bit-widths without requiring time-consuming retraining. Consequently, the proposed multiplier designs provide enhanced computational solutions for energy-efficient DNN inference accelerators.

    DOI: 10.1109/ICCD63220.2024.00014

    Web of Science

    Scopus

  22. S3M: Static Semi-Segmented Multipliers for Energy-efficient DNN Inference Accelerators Reviewed

    Zhang Mingtao, Cheng Quan, Awano Hiromitsu, Lin Longyang, Hashimoto Masanori

    IEEE International Conference on Computer Design: VLSI in Computers and Processors, (ICCD)     page: 16 - 23   2024

    Language:English  

    Approximate multipliers offer an efficient approach to reduce power consumption in compute-intensive applications, such as Deep Neural Networks (DNNs). However, current 8-bit approximate multipliers struggle to maintain high accuracy across various DNN applications. In this paper, we highlight challenges in 8-bit multiplier designs with body approximation strategies and evaluate the effectiveness of input approximation methods. Recognizing that exact multipliers with quantization bit-widths below 8 bits have demonstrated superior performance, we aim to explore whether alternative input approximation methods can provide an even better trade-off between accuracy and energy consumption. To this end, by exploiting the fact that weight operand values are smaller than activations and prepared offline in DNNs, we simplify a static segmented multiplier (SSM) into a static semi-segmented multiplier (S³M), achieving a 31.58% reduction in power-delay product (PDP) compared to the original SSM, with similar classification accuracy. Additionally, we propose Coded S³M with optimized memory usage and implement various multipliers on a systolic array-based accelerator. Experimental results show that the proposed S³M and Coded S³M outperform existing 8-bit approximate multipliers in DNN applications, effectively bridging the PDP and inference accuracy trade-off observed across exact commercial IP multipliers of varied bit-widths without requiring time-consuming retraining. Consequently, the proposed multiplier designs provide enhanced computational solutions for energy-efficient DNN inference accelerators.

    CiNii Research

  23. Fast Parameter Optimization of Delayed Feedback Reservoir with Backpropagation and Gradient Descent Reviewed

    Ikeda, S; Awano, H; Sato, T

    2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE     2024

  24. Double MAC on a Cell: A 22-nm 8T-SRAM-Based Analog In-Memory Accelerator for Binary/Ternary Neural Networks Featuring Split Wordline Reviewed Open Access

    Tagata, H; Sato, T; Awano, H

    IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS   Vol. 5   page: 328 - 340   2024

    Publisher:IEEE Open Journal of Circuits and Systems  

    This paper proposes a novel 8T-SRAM-based computing-in-memory (CIM) accelerator for binary/ternary neural networks. The proposed split dual-port 8T-SRAM cell has two input ports, simultaneously performing two binary multiply-and-accumulate (MAC) operations on left and right bitlines. This approach enables a twofold increase in throughput without significantly increasing area or power consumption, since the area overhead for doubling throughput is only two additional WL wires compared to the conventional 8T-SRAM. In addition, the proposed circuit supports binary and ternary activation input, allowing a flexible trade-off between high energy efficiency and high inference accuracy depending on the application. The proposed SRAM macro consists of a 128 × 128 SRAM array that outputs the MAC operation results of 96 binary/ternary inputs and 96 × 128 binary weights as 1-5 bit digital values. The proposed circuit performance was evaluated by post-layout simulation with the 22-nm process layout of the overall CIM macro. The proposed circuit is capable of high-speed operation at 1 GHz. It achieves a maximum area efficiency of 3320 TOPS/mm², which is 3.4× higher than existing research, with a reasonable energy efficiency of 1471 TOPS/W. The simulated inference accuracies of the proposed circuit are 96.45%/97.67% for the MNIST dataset with a binary/ternary MLP model, and 86.32%/88.56% for the CIFAR-10 dataset with a binary/ternary VGG-like CNN model.

    DOI: 10.1109/OJCAS.2024.3482469

    Open Access

    Web of Science

    Scopus

  25. Exploring Surface Code Decoding via Cryo-CMOS for Fault-Tolerant Quantum Computers Reviewed

    Wang, R.T.; Sato, T.; Awano, H.

    2024 IEEE International Conference on Quantum Computing and Engineering (QCE)     2024

    Authorship:Last author   Language:English  

  26. Square-wave defined pulse generator for high fidelity gate operation of superconducting qubits Reviewed

    Matsuo, R.; Ogawa, K.; Shiomi, H.; Negoro, M.; Ohira, R.; Miyoshi, T.; Shintani, M.; Awano, H.; Sato, T.; Shiomi, J.

    2024 IEEE International Conference on Quantum Computing and Engineering (QCE)     2024

    Language:English  

  27. Modular DFR: Digital Delayed Feedback Reservoir Model for Enhancing Design Flexibility Reviewed

    Ikeda, S; Awano, H; Sato, T

    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS   Vol. 22 ( 5 )   2023.10

    Publisher:ACM Transactions on Embedded Computing Systems  

    A delayed feedback reservoir (DFR) is a type of reservoir computing system well-suited for hardware implementations owing to its simple structure. Most existing DFR implementations use analog circuits that require both digital-to-analog and analog-to-digital converters for interfacing. However, digital DFRs emulate analog nonlinear components in the digital domain, resulting in a lack of design flexibility and higher power consumption. In this paper, we propose a novel modular DFR model that is suitable for fully digital implementations. The proposed model reduces the number of hyperparameters and allows flexibility in the selection of the nonlinear function, which improves the accuracy while reducing the power consumption. We further present two DFR realizations with different nonlinear functions, achieving 10× power reduction and 5.3× throughput improvement while maintaining equal or better accuracy.

    DOI: 10.1145/3609105

    Web of Science

    Scopus

  28. Uncertainty-Aware Haptic Shared Control With Humanoid Robots for Flexible Object Manipulation Reviewed

    Hara, T; Sato, T; Ogata, T; Awano, H

    IEEE ROBOTICS AND AUTOMATION LETTERS   Vol. 8 ( 10 ) page: 6435 - 6442   2023.10

    Publisher:IEEE Robotics and Automation Letters  

    We propose a haptic shared control system that predicts human manipulation intentions using a neural network and adaptively presents haptic guidance to achieve smooth robot control remotely. Although the haptic shared control has garnered increasing attention as a method to improve operability in remote operations, incorrect guidance can worsen operability. In this study, we dynamically switch the strength of haptic guidance presentation depending on the uncertainty of the inference results of the neural network. Thus, we weaken the haptic guidance presentation strength for predictions in which the neural network lacks confidence and strengthen it for those with high confidence, thereby achieving guidance presentation that does not impede human manipulation. As a result of experiments using the Nextage OPEN upper-body humanoid robot, in a task involving folding a flexible object, we succeeded in reducing task execution time by 17.1% compared to that with an existing method that determines the strength of haptic guidance presentation without considering the confidence of the neural network.

    DOI: 10.1109/LRA.2023.3306668

    Web of Science

    Scopus

  29. BayesianPUFNet: Training Sample Efficient Modeling Attack for Physically Unclonable Functions Reviewed Open Access

    AWANO Hiromitsu, IKEDA Makoto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E106.A ( 5 ) page: 840 - 850   2023.5

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper proposes a deep neural network named BayesianPUFNet that can achieve high prediction accuracy even with few challenge-response pairs (CRPs) available for training. Generally, modeling attacks are a vulnerability that could compromise the authenticity of physically unclonable functions (PUFs); thus, various machine learning methods including deep neural networks have been proposed to assess the vulnerability of PUFs. However, conventional modeling attacks have not considered the cost of CRP collection and analyzed attacks based on the assumption that sufficient CRPs were available for training; therefore, previous studies may have underestimated the vulnerability of PUFs. Herein, we show that the application of Bayesian deep neural networks that incorporate Bayesian statistics can provide accurate response prediction even in situations where sufficient CRPs are not available for learning. Numerical experiments show that the proposed model uses only half the CRPs to achieve the same response prediction as that of the conventional methods. Our code is openly available at https://github.com/bayesian-puf-net/bayesian-puf-net.git.

    DOI: 10.1587/transfun.2022eap1061

    Web of Science

    Scopus

    CiNii Research

  30. B2N2: Resource efficient Bayesian neural network accelerator using Bernoulli sampler on FPGA Reviewed Open Access

    Awano H., Hashimoto M.

    Integration   Vol. 89   page: 1 - 8   2023.3

    Publisher:Integration  

    A resource-efficient hardware accelerator for Bayesian neural networks (BNNs) named B²N², a Bernoulli random number based Bayesian neural network accelerator, is proposed. As neural networks expand their application into risk-sensitive domains where mispredictions may cause serious social and economic losses, evaluating the NN's confidence in its prediction has emerged as a critical concern. Among many uncertainty evaluation methods, BNN provides a theoretically grounded way to evaluate the uncertainty of the NN's output by treating network parameters as random variables. By exploiting the central limit theorem, we propose to replace costly Gaussian random number generators (RNGs) with Bernoulli RNGs, which can be efficiently implemented in hardware since the possible outcome from a Bernoulli distribution is binary. We demonstrate that B²N² implemented on a Xilinx ZCU104 FPGA board consumes only 465 DSPs and 81661 LUTs, which corresponds to 50.9% and 14.3% reductions compared to Gaussian-BNN (Hirayama et al., 2020) implemented on the same FPGA board for fair comparison. We further compare B²N² with VIBNN (Cai et al., 2018), which shows that B²N² successfully reduced DSP and LUT usage by 50.9% and 57.9%, respectively. Owing to the reduced hardware resources, B²N² improved energy efficiency by 7.50% and 57.5% compared to Gaussian-BNN (Hirayama et al., 2020) and VIBNN (Cai et al., 2018), respectively.
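
The central-limit-theorem argument in this abstract can be illustrated with a minimal Python sketch. It is not the B2N2 hardware design: the helper names `approx_gaussian_from_bernoulli` and `sample_stats`, and the choice of 64 bits per sample, are assumptions made here for illustration only. The sum of n fair Bernoulli bits has mean n/2 and variance n/4, so centering and scaling that sum gives an approximately standard normal value.

```python
import math
import random


def approx_gaussian_from_bernoulli(rng, n_bits=64):
    """Return an approximately N(0, 1) sample built from n_bits fair coin flips.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Counting the ones in n random bits equals summing n Bernoulli(0.5) draws,
    # which follows Binomial(n, 0.5): mean n/2, variance n/4.
    ones = bin(rng.getrandbits(n_bits)).count("1")
    return (ones - n_bits / 2) / math.sqrt(n_bits / 4)


def sample_stats(num_samples=20000, seed=0):
    """Empirical mean and variance of the Bernoulli-based sampler."""
    rng = random.Random(seed)
    xs = [approx_gaussian_from_bernoulli(rng) for _ in range(num_samples)]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var
```

Calling `sample_stats()` should return a mean close to 0 and a variance close to 1. The appeal for hardware is that the only random primitive needed is a bit source followed by a population count.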

    DOI: 10.1016/j.vlsi.2022.11.005

    Open Access

    Scopus

  31. DependableHD: A Hyperdimensional Learning Framework for Edge-oriented Voltage-scaled Circuits Reviewed

    Liang, D; Awano, H; Miura, N; Shiomi, J

    2023 28TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC     page: 416 - 422   2023

    Publisher:Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC  

    Voltage scaling is one of the most promising approaches for energy efficiency improvement but also makes it challenging to fully guarantee stable operation in modern VLSI. To tackle such issues, we propose DependableHD, a learning framework based on HyperDimensional Computing (HDC), which enables systems to tolerate bit-level memory failures in the low-voltage region with high robustness. For the first time, DependableHD introduces the concept of margin enhancement for model retraining and utilizes noise injection to improve the robustness; both can be applied to most state-of-the-art HDC algorithms. Our experiment shows that under 10% memory error, DependableHD exhibits a 1.22% accuracy loss on average, an 11.2× improvement compared to the baseline HDC solution. The hardware evaluation shows that DependableHD allows the supply voltage to be reduced from 400 mV to 300 mV, which provides a 50.41% energy consumption reduction while maintaining competitive accuracy.

    DOI: 10.1145/3566097.3567886

    Web of Science

    Scopus

  32. Introducing Transfer Learning Framework on Device Modeling by Machine Learning Reviewed

    Niiyama K., Awano H., Sato T.

    IEEE International Conference on Microelectronic Test Structures   Vol. 2023-March   2023

    Publisher:IEEE International Conference on Microelectronic Test Structures  

    In this study, we propose a novel transistor modeling method using machine learning techniques, with a focus on extrapolation performance. Our method leverages knowledge from a base model that is related to the target model, instead of relying solely on device-specific information. The results show that our approach outperforms other transistor modeling methods based on machine learning, particularly in modeling similar but different transistors that belong to the same device family. Our method was able to reduce the root mean squared error (RMSE) by up to 80.0% compared to other methods.

    DOI: 10.1109/ICMTS55420.2023.10094067

    Scopus

  33. Pay Attention via Quantization: Enhancing Explainability of Neural Networks via Quantized Activation Reviewed Open Access

    Tashiro, Y; Awano, H

    IEEE ACCESS   Vol. 11   page: 34431 - 34439   2023

    Publisher:IEEE Access  

    Modern deep learning algorithms comprise highly complex artificial neural networks, making it extremely difficult for humans to track their inference processes. As the social implementation of deep learning progresses, the human and economic losses caused by inference errors are becoming increasingly problematic, making it necessary to develop methods to explain the basis for the decisions of deep learning algorithms. Although an attention mechanism-based method to visualize the regions that contribute to steering angle prediction in an automated driving task has been proposed, its explanatory capability is low. In this paper, we focus on the fact that the importance of each bit in the activation value of a network is biased (i.e., the sign and exponent bits are weighted more heavily than the mantissa bits), which has been overlooked in previous studies. Specifically, this paper quantizes network activations, encouraging important information to be aggregated to the sign bit. Further, we introduce an attention mechanism restricted to the sign bit to improve the explanatory power. Our numerical experiment using the Udacity dataset revealed that the proposed method achieves a 1.14× higher area under curve (AUC) in terms of the deletion metric.

    DOI: 10.1109/ACCESS.2023.3264855

    Open Access

    Web of Science

    Scopus

  34. Hardware-Friendly Delayed-Feedback Reservoir for Multivariate Time-Series Classification Reviewed Open Access

    Ikeda, S; Awano, H; Sato, T

    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS   Vol. 41 ( 11 ) page: 3650 - 3660   2022.11

    Publisher:IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems  

    Reservoir computing (RC) is attracting attention as a machine-learning technique for edge computing. In time-series classification tasks, the number of features obtained using a reservoir depends on the length of the input series. Therefore, the features must be converted to a constant-length intermediate representation (IR), such that they can be processed by an output layer. Existing conversion methods involve computationally expensive matrix inversion that significantly increases the circuit size and required processing power when implemented in hardware. In this article, we propose a simple but effective IR, namely, dot-product-based reservoir representation (DPRR), for RC based on the dot product of data features. Additionally, we propose a hardware-friendly delayed-feedback reservoir (DFR) consisting of a nonlinear element and delayed feedback loop with DPRR. The proposed DFR successfully classified multivariate time-series data, a task that has been considered particularly difficult to implement efficiently in hardware. In contrast to conventional DFR models that require analog circuits, the proposed model can be implemented in a fully digital manner suitable for high-level synthesis. A comparison with existing machine-learning methods via field-programmable gate array implementation using 12 multivariate time-series classification tasks confirmed the superior accuracy and small circuit size of the proposed method.

    DOI: 10.1109/TCAD.2022.3197488

    Web of Science

    Scopus

  35. A Hardware Efficient Reservoir Computing System Using Cellular Automata and Ensemble Bloom Filter Reviewed Open Access

    LIANG Dehua, SHIOMI Jun, MIURA Noriyuki, HASHIMOTO Masanori, AWANO Hiromitsu

    IEICE Transactions on Information and Systems   Vol. E105.D ( 7 ) page: 1273 - 1282   2022.7

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Reservoir computing (RC) is an attractive alternative to machine learning models owing to its computationally inexpensive training process and simplicity. In this work, we propose EnsembleBloomCA, which utilizes cellular automata (CA) and an ensemble Bloom filter to organize an RC system. In contrast to most existing RC systems, EnsembleBloomCA eliminates all floating-point calculation and integer multiplication. EnsembleBloomCA adopts CA as the reservoir in the RC system because it can be implemented using only binary operations and is thus energy efficient. The rich pattern dynamics created by CA can map the original input into a high-dimensional space and provide more features for the classifier. Utilizing an ensemble Bloom filter as the classifier, the features provided by the reservoir can be effectively memorized. Our experiment revealed that applying the ensemble mechanism to the Bloom filter resulted in a significant reduction in memory cost during the inference phase. In comparison with Bloom WiSARD, one of the state-of-the-art reference works, the EnsembleBloomCA model achieves a 43× reduction in memory cost while maintaining the same accuracy. Our hardware implementation also demonstrated that EnsembleBloomCA achieved over 23× and 8.5× reductions in area and power, respectively.

    DOI: 10.1587/transinf.2021edp7203

    Open Access

    Web of Science

    Scopus

    CiNii Research

  36. Temporal Ensemble SSDLite: Exploiting Temporal Correlation in Video for Accurate Object Detection Reviewed

    NAKAMURA Lukas, AWANO Hiromitsu

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E105.A ( 7 ) page: 1082 - 1090   2022.7

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    We propose “Temporal Ensemble SSDLite,” a new method for video object detection that boosts accuracy while maintaining detection speed and energy consumption. Object detection for video is becoming increasingly important as a core part of applications in robotics, autonomous driving and many other promising fields. Many of these applications require high accuracy and speed to be viable, but are used in compute- and energy-restricted environments. Therefore, new methods that increase the overall performance of video object detection, i.e., accuracy and speed, have to be developed. To increase accuracy, we use ensembling, the machine learning method of combining the predictions of multiple different models. The drawback of ensembling is the increased computational cost, which is proportional to the number of models used. We overcome this drawback by deploying our ensemble temporally, meaning we run inference with only a single model at each frame, cycling through our ensemble of models from frame to frame. Then, we combine the predictions for the last N frames, where N is the number of models in our ensemble, through non-max suppression. This is possible because close frames in a video are extremely similar due to temporal correlation. As a result, we increase accuracy through the ensemble while running only a single model at each frame, thereby keeping the detection speed. To evaluate the proposal, we measure the accuracy, detection speed and energy consumption on the Google Edge TPU, a machine learning inference accelerator, with the ImageNet VID dataset. Our results demonstrate an accuracy boost of up to 4.9% while maintaining real-time detection speed and an energy consumption of 181 mJ per image.
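
The round-robin scheme described in this abstract (one model per frame, predictions of the last N frames merged by non-max suppression) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the class `TemporalEnsemble`, the `(box, score)` detection format, the class-agnostic greedy suppression, and the 0.5 IoU threshold are all assumptions introduced here.

```python
from collections import deque


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy, class-agnostic non-max suppression over (box, score) pairs."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, k[0]) < iou_threshold for k in kept):
            kept.append((box, score))
    return kept


class TemporalEnsemble:
    """Runs one model per frame, round-robin, and merges the last N outputs."""

    def __init__(self, models):
        self.models = models
        # Holds the detections of the most recent N frames (N = len(models)).
        self.history = deque(maxlen=len(models))
        self.frame_index = 0

    def detect(self, frame):
        # Only a single model is evaluated for the current frame.
        model = self.models[self.frame_index % len(self.models)]
        self.frame_index += 1
        self.history.append(model(frame))
        merged = [d for dets in self.history for d in dets]
        return nms(merged)
```

Each entry of `models` is assumed to be a callable that maps a frame to a list of `(box, score)` pairs. The per-frame cost stays at one model evaluation plus one suppression pass over the buffered detections.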

    DOI: 10.1587/transfun.2021eap1068

    Web of Science

    Scopus

    CiNii Research

  37. DistriHD: A Memory Efficient Distributed Binary Hyperdimensional Computing Architecture for Image Classification Reviewed

    Liang D., Shiomi J., Miura N., Awano H.

    Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC   Vol. 2022-January   page: 43 - 49   2022

    Publisher:Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC  

    Hyper-Dimensional (HD) computing is a brain-inspired learning approach for efficient and fast learning on today's embedded devices. HD computing first encodes all data points to high-dimensional vectors called hypervectors and then efficiently performs the classification task using a well-defined set of operations. Although HD computing achieved reasonable performances in several practical tasks, it comes with huge memory requirements since the data point should be stored in a very long vector having thousands of bits. To alleviate this problem, we propose a novel HD computing architecture, called DistriHD which enables HD computing to be trained and tested using binary hypervectors and achieves high accuracy in single-pass training mode with significantly low hardware resources. DistriHD encodes data points to distributed binary hypervectors and eliminates the expensive item memory in the encoder, which significantly reduces the required hardware cost for inference. Our evaluation also shows that our model can achieve a 27.6× reduction in memory cost without hurting the classification accuracy. The hardware implementation also demonstrates that DistriHD achieves over 9.9× and 28.8× reduction in area and power, respectively.

    DOI: 10.1109/ASP-DAC52403.2022.9712589

    Scopus

  38. Respiratory Rate Estimation Based on WiFi Frame Capture Reviewed Open Access

    Kanda T., Sato T., Awano H., Kondo S., Yamamoto K.

    Proceedings IEEE Consumer Communications and Networking Conference Ccnc     page: 881 - 884   2022

    Publisher:Proceedings IEEE Consumer Communications and Networking Conference Ccnc  

    This paper presents a method that estimates the respiratory rate based on the frame capturing of wireless local area networks. The method uses beamforming feedback matrices (BFMs) contained in the captured frames, which is a rotation matrix of channel state information (CSI). BFMs are transmitted unencrypted and easily obtained using frame capturing, requiring no specific firmware or WiFi chipsets, unlike the methods that use CSI. Such properties of BFMs allow us to apply frame capturing to various sensing tasks, e.g., vital sensing. In the proposed method, principal component analysis is applied to BFMs to isolate the effect of the chest movement of the subject, and then, discrete Fourier transform is performed to extract respiratory rates in a frequency domain. Experimental evaluation results confirm that the frame-capture-based respiratory rate estimation can achieve estimation error lower than 3.5 breaths/minute.
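
The two processing steps named in this abstract (principal component analysis followed by a discrete Fourier transform) can be sketched in plain Python as below. This is a rough illustration under stated assumptions, not the paper's pipeline: the input is taken to be a list of per-frame feature vectors already extracted from the BFMs, only the first principal component is used, and the 6 to 40 breaths/minute search band, the function names, and the power-iteration PCA are choices made here.

```python
import cmath
import math


def first_principal_component(rows, iterations=100):
    """Project mean-centered rows onto their dominant principal axis.

    Uses power iteration on the covariance matrix as a small PCA stand-in.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector; must not be orthogonal to the axis
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def dominant_rate_per_minute(signal, sample_rate_hz, lo_bpm=6, hi_bpm=40):
    """Return the strongest DFT frequency in [lo_bpm, hi_bpm] as breaths/min."""
    n = len(signal)
    best_bpm, best_power = None, -1.0
    for k in range(1, n // 2):
        bpm = k * sample_rate_hz / n * 60
        if not lo_bpm <= bpm <= hi_bpm:
            continue
        # Direct evaluation of the k-th DFT coefficient.
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_power:
            best_bpm, best_power = bpm, abs(coeff)
    return best_bpm
```

For a feature series sampled at `sample_rate_hz`, the estimate would be `dominant_rate_per_minute(first_principal_component(rows), sample_rate_hz)`. The frequency resolution is `60 * sample_rate_hz / len(rows)` breaths/minute, so longer capture windows give finer estimates.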

    DOI: 10.1109/CCNC49033.2022.9700721

    Scopus

  39. Pay Attention via Binarization: Enhancing Explainability of Neural Networks via Binarization of Activation Reviewed

    Tashiro, Y; Awano, H

    2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22)   Vol. 2022-May   page: 3160 - 3164   2022

    Publisher:Proceedings IEEE International Symposium on Circuits and Systems  

    Modern deep learning algorithms consist of highly complex artificial neural networks, making it extremely difficult for humans to track the inference process. While the social implementation of deep learning is progressing, the human and economic losses caused by inference errors are becoming more and more problematic, and there is a need for methods to explain the basis for the decisions of deep learning algorithms. Although, in an automated driving task, a method to visualize the regions that contribute to steering angle prediction using an attention mechanism has been proposed, its explanatory capability is still low. In this paper, we focus on the difference in the importance of each bit in the activation (i.e., the LSBs have the lowest weight while the MSBs have the highest weight), and propose a method to add attention only to the sign bits to further enhance the explanation. Our numerical experiment using the Udacity dataset revealed that the proposed method achieves 33% higher area under curve (AUC) in terms of the deletion metric.

    DOI: 10.1109/ISCAS48785.2022.9937289

    Web of Science

    Scopus

  40. Psyche Navigation System Concept

    Kumagai Seiji, Miura Noriyuki, Awano Hiromitsu, Ueda Yoshiyuki

    Journal of the Japanese Society for Artificial Intelligence   Vol. 36 ( 6 ) page: 684 - 694   2021.11

    Language:Japanese   Publisher:The Japanese Society for Artificial Intelligence  

    DOI: 10.11517/jjsai.36.6_684

    CiNii Research

  41. Visualization of a chorus structure in multiple frog species by a sound discrimination device Reviewed

    Awano, H; Shirasaka, M; Mizumoto, T; Okuno, HG; Aihara, I

    JOURNAL OF COMPARATIVE PHYSIOLOGY A-NEUROETHOLOGY SENSORY NEURAL AND BEHAVIORAL PHYSIOLOGY   Vol. 207 ( 1 ) page: 87 - 98   2021.1

    Language:English   Publisher:Journal of Comparative Physiology A Neuroethology Sensory Neural and Behavioral Physiology  

    We developed a sound discrimination device to identify and localize the species of nocturnal animals in their natural habitat. The sound discrimination device is equipped with a microphone, a light-emitting diode, and a band-pass filter. By tuning the center frequency of the filter to include a dominant frequency of the calls of a focal species, we enable the device to be illuminated only when detecting the calls of the focal species. In experiments in a laboratory room, we tuned the sound discrimination devices to detect the calls of Hyla japonica or Rhacophorus schlegelii and broadcast the frog calls from loudspeakers. By analyzing the illumination pattern of the devices, we successfully identified and localized the two kinds of sound sources. Next, we placed the sound discrimination devices in a field site where actual male frogs (H. japonica and R. schlegelii) produced sounds. The analysis of the illumination pattern demonstrates the efficacy of the developed devices in a natural environment and also enables us to extract pairs of male frogs that significantly overlapped or alternated their calls.

    DOI: 10.1007/s00359-021-01463-9

    Web of Science

    Scopus

    PubMed

  42. Binary Neural Network in Robotic Manipulation: Flexible Object Manipulation for Humanoid Robot Using Partially Binarized Auto-Encoder on FPGA Reviewed

    Ohara, S; Ogata, T; Awano, H

    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)     page: 6010 - 6015   2021

    Publisher:IEEE International Conference on Intelligent Robots and Systems  

    A neural-network-based flexible object manipulation system for a humanoid robot on FPGA is proposed. Although the manipulation of flexible objects using robots attracts ever-increasing attention since these tasks are basic and essential activities in our daily life, it has been put into practice only recently with the help of deep neural networks. However, such systems have relied on GPU accelerators, which cannot be fitted into the space-limited robotic body. Although field programmable gate arrays (FPGAs) are known to be energy efficient and suitable for embedded systems, the model size should be drastically reduced since FPGAs have limited on-chip memory. To this end, we propose a partially binarized deep convolutional auto-encoder technique, where only the encoder part is binarized to compress model size without degrading the inference accuracy. The model implemented on Xilinx ZCU102 achieves 41.1 frames per second with a power consumption of 3.1 W, which corresponds to 10× and 3.7× improvements from the systems implemented on Core i7 6700K and RTX 2080 Ti, respectively.

    DOI: 10.1109/IROS51168.2021.9636825

    Web of Science

    Scopus

  43. BloomCA: A Memory Efficient Reservoir Computing Hardware Implementation Using Cellular Automata and Ensemble Bloom Filter Reviewed

    Liang, DH; Hashimoto, M; Awano, H

    PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021)     page: 587 - 590   2021

  44. Ising-PUF: A machine learning attack resistant PUF featuring lattice like arrangement of Arbiter-PUFs Reviewed

    Awano H., Sato T.

    Proceedings of the 2018 Design Automation and Test in Europe Conference and Exhibition Date 2018   Vol. 2018-January   page: 1447 - 1452   2018.4

    Publisher:Proceedings of the 2018 Design Automation and Test in Europe Conference and Exhibition Date 2018  

    The concept of Ising-PUF, a novel PUF structure that utilizes the chaotic behavior of mutually interacting small PUFs, is proposed. Ising-PUF consists of a lattice-like arrangement of small PUFs, each of which contains a spin register that stores the response of the small PUF, which also serves as a challenge to its neighbors. The spin patterns that develop over time determine the 1-bit response of the Ising-PUF. Utilizing the state-memorizing nature of the spin registers, Ising-PUF attains a challenge hysteresis, i.e., it allows a sequence of challenge inputs that continuously stimulate its chaotic behavior, which provides a drastically large challenge-to-response space. Experimental results demonstrate nearly ideal metrics: an inter-chip Hamming distance (HD) of 50.1% and an inter-environment HD of 2.26%. Further, Ising-PUF is remarkably tolerant to machine learning attacks, demonstrating that, even with a deep neural network using 50k training CRPs, the prediction accuracy remains 50%, which is comparable to a random guess.

    DOI: 10.23919/DATE.2018.8342239

    Scopus

  45. Visualizing Phonotactic Behavior of Female Frogs in Darkness Reviewed Open Access

    Aihara, I; Bishop, PJ; Ohmer, MEB; Awano, H; Mizumoto, T; Okuno, HG; Narins, PM; Hero, JM

    SCIENTIFIC REPORTS   Vol. 7 ( 1 ) page: 10539   2017.9

    Language:English   Publisher:Scientific Reports  

    Many animals use sounds produced by conspecifics for mate identification. Female insects and anuran amphibians, for instance, use acoustic cues to localize, orient toward and approach conspecific males prior to mating. Here we present a novel technique that utilizes multiple, distributed sound-indication devices and a miniature LED backpack to visualize and record the nocturnal phonotactic approach of females of the Australian orange-eyed tree frog (Litoria chloris) both in a laboratory arena and in the animal's natural habitat. Continuous high-definition digital recording of the LED coordinates provides automatic tracking of the female's position, and the illumination patterns of the sound-indication devices allow us to discriminate multiple sound sources including loudspeakers broadcasting calls as well as calls emitted by individual male frogs. This innovative methodology is widely applicable for the study of phonotaxis and spatial structures of acoustically communicating nocturnal animals.

    DOI: 10.1038/s41598-017-11150-y

    Open Access

    Web of Science

    Scopus

    PubMed

  46. RTN in Scaled Transistors for On-Chip Random Seed Generation Reviewed

    Mohanty, A; Sutaria, KB; Awano, H; Sato, T; Cao, Y

    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS   Vol. 25 ( 8 ) page: 2248 - 2257   2017.8

    Publisher:IEEE Transactions on Very Large Scale Integration VLSI Systems  

    Random numbers play a vital role in cryptography, where they are used to generate keys, nonces, one-time pads, and initialization vectors for symmetric encryption. The quality of a random number generator (RNG) has significant implications for the vulnerability and performance of these algorithms. A pseudo-RNG uses a deterministic algorithm to produce numbers with a distribution very similar to uniform. True RNGs (TRNGs), on the other hand, use some natural phenomenon/process to generate random bits. They are nondeterministic, because the next number to be generated cannot be determined in advance. In this paper, a novel on-chip noise source, random telegraph noise (RTN), is exploited for a simple and reliable TRNG. RTN, a microscopic process of stochastic trapping/detrapping of charges, is usually considered noise and mitigated in design. Through physical modeling and silicon measurement, we demonstrate that RTN is appropriate for TRNG, especially in highly scaled MOSFETs. Due to the slow speed of RTN, we propose the system for on-chip seed generation for random numbers. Our contributions are: 1) physical model calibration of RTN with comprehensive 65- and 180-nm transistor measurements; 2) the scaling trend of RTN, validated with silicon data down to 28 nm; 3) design principles to achieve 50% signal probability by using intrinsic RTN physical properties, such that the generated sequence passes the National Institute of Standards and Technology (NIST) tests without traditional postprocessing algorithms; and 4) solutions to manage realistic issues in practice, including multilevel RTN signals, robustness to voltage and temperature fluctuations, and the operation speed.

    DOI: 10.1109/TVLSI.2017.2687762

    Web of Science

    Scopus
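
    The link between trap time constants and signal probability mentioned in the abstract of entry 46 can be illustrated with a toy simulation. This is a minimal sketch, not the paper's design: it assumes a single trap modeled as a two-state process with exponentially distributed dwell times, and the names `tau_c`, `tau_e`, and `sample_rtn_bits` are hypothetical. Under that assumption the long-run fraction of ones is tau_e / (tau_c + tau_e), so equal time constants give roughly 50%.

```python
import random


def sample_rtn_bits(tau_c, tau_e, sample_period, n_samples, seed=0):
    """Sample a two-state random telegraph signal at a fixed period.

    State 0: trap empty, dwell time ~ Exponential(mean tau_c) until capture.
    State 1: trap occupied, dwell time ~ Exponential(mean tau_e) until emission.
    """
    rng = random.Random(seed)
    state = 0
    remaining = rng.expovariate(1.0 / tau_c)  # time left in the current state
    bits = []
    for _ in range(n_samples):
        advance = sample_period
        # Consume as many state switches as fit into one sampling period.
        while advance >= remaining:
            advance -= remaining
            state ^= 1
            mean_dwell = tau_e if state == 1 else tau_c
            remaining = rng.expovariate(1.0 / mean_dwell)
        remaining -= advance
        bits.append(state)
    return bits


def ones_fraction(bits):
    """Fraction of samples in which the trap was occupied."""
    return sum(bits) / len(bits)


balanced = ones_fraction(sample_rtn_bits(1.0, 1.0, 5.0, 20000))
skewed = ones_fraction(sample_rtn_bits(1.0, 3.0, 5.0, 20000))
print(f"tau_c == tau_e:    fraction of ones = {balanced:.3f}")
print(f"tau_e == 3 tau_c:  fraction of ones = {skewed:.3f}")
```

    The sampling period is chosen several times longer than the dwell times so that consecutive bits are nearly independent, which mirrors the abstract's remark that RTN is slow and therefore better suited to seed generation than to high-rate output.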

  47. Scalable Device Array for Statistical Characterization of BTI-Related Parameters Reviewed

    Awano, H; Morita, S; Sato, T

    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS   Vol. 25 ( 4 ) page: 1455 - 1466   2017.4

    Publisher:IEEE Transactions on Very Large Scale Integration VLSI Systems  

    A device array circuit, scalable in terms of the number of transistors used, is proposed. The proposed array facilitates accurate and simultaneous bias voltage application to a large number of devices, making it suitable for the measurement-based statistical characterization of device degradation, known as bias temperature instability. Using the proposed array, the degradation measurement of thousands of transistors is made possible in a practical amount of time. The experimental results show that the defect-centric model can approximate the statistical variation in magnitudes of threshold voltage shifts ($\Delta V_{\mathrm{TH}}$) and that the variance of $\Delta V_{\mathrm{TH}}$ bears an inverse relationship to the channel areas of transistors. The degradation variability under ac stress conditions is also presented for the first time.

    DOI: 10.1109/TVLSI.2016.2638021

    Web of Science

    Scopus

  48. Efficient circuit failure probability calculation along product lifetime considering device aging Reviewed

    Awano H., Hiromoto M., Sato T.

    Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC     page: 93 - 98   2017.2

    Publisher:Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC  

    A device-aging simulation that efficiently estimates the temporal degradation of the failure probability of a circuit is proposed. As the size of transistors shrinks, consideration of device aging in addition to manufacturing variability has become an urgent issue for maintaining the reliability of LSIs. In contrast to existing techniques that handle manufacturing variability and device aging separately, we propose a simultaneous evaluation approach using augmented reliability and subset simulation. By eliminating the repetitive failure-probability calculations at each device age, the proposed method reduces the number of required circuit simulations to about 1/6 of that of the conventional method without compromising accuracy.

    DOI: 10.1109/ASPDAC.2017.7858302

    Scopus

  49. Swarm of sound-to-light conversion devices to monitor acoustic communication among small nocturnal animals Reviewed Open Access

    Mizumoto T., Aihara I., Otsuka T., Awano H., Okuno H.G.

    Journal of Robotics and Mechatronics   Vol. 29 ( 1 ) page: 255 - 267   2017.2

    Publisher:Journal of Robotics and Mechatronics  

    While many robots have been developed to monitor environments, most studies are dedicated to navigation and locomotion and use off-the-shelf sensors. We focus on a novel acoustic device and its processing software, which is designed for a swarm of environmental monitoring robots equipped with the device. This paper demonstrates that a swarm of monitoring devices is useful for biological field studies, i.e., understanding the spatio-temporal structure of acoustic communication among animals in their natural habitat. The following processes are required in monitoring acoustic communication to analyze the natural behavior in the field: (1) working in their habitat, (2) automatically detecting multiple and simultaneous calls, (3) minimizing the effect on the animals and their habitat, and (4) working with various distributions of animals. We present a sound-imaging system using sound-to-light conversion devices called “Fireflies” and their data analysis method that satisfies the requirements. We can easily collect data by placing a swarm (dozens) of Fireflies and recording their light intensities using an off-the-shelf video camera. Because each Firefly converts sound in its vicinity into light, we can easily obtain when, how long, and where animals call using temporal analysis of the Firefly light intensities. The device is evaluated in terms of three aspects: volume-to-light-intensity characteristics, battery life through indoor experiments, and water resistance via field experiments. We also present the visualization of a chorus of Japanese tree frogs (Hyla japonica) recorded in their habitat, that is, paddy fields.

    DOI: 10.20965/jrm.2017.p0255

    Open Access

    Scopus

  50. Efficient Aging-Aware Failure Probability Estimation Using Augmented Reliability and Subset Simulation Reviewed Open Access

    AWANO Hiromitsu, SATO Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E100.A ( 12 ) page: 2807 - 2815   2017

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    A circuit-aging simulation that efficiently calculates the temporal change of rare circuit-failure probability is proposed. While conventional methods require a long computational time because failure probability must be calculated separately at each device age, the proposed Monte Carlo based method requires only a single set of simulations. By applying the augmented reliability and subset simulation framework, the change of failure probability along the lifetime of the device can be evaluated through the analysis of the Monte Carlo samples. Combined with the two-step sample generation technique, the proposed method reduces the computational time to about 1/6 of that of the conventional method while maintaining sufficient estimation accuracy.

    DOI: 10.1587/transfun.e100.a.2807

    Web of Science

    Scopus

    CiNii Research

  51. Identification and Application of Invariant Critical Paths under NBTI Degradation Reviewed Open Access

    BIAN Song, MORITA Shumpei, SHINTANI Michihiro, AWANO Hiromitsu, HIROMOTO Masayuki, SATO Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E100.A ( 12 ) page: 2797 - 2806   2017

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    As technology further scales semiconductor devices, aging-induced device degradation has become one of the major threats to device reliability. In addition, aging mechanisms like the negative bias temperature instability (NBTI) are known to be sensitive to workload (i.e., signal probability), which is hard to assume at the design phase. In this work, we analyze the workload dependence of NBTI degradation using a processor, and propose a novel technique to estimate the worst-case paths. In our approach, we exploit the fact that the deterministic nature of circuit structure limits the amount of NBTI degradation on different paths, and propose a two-stage path extraction algorithm to identify the invariant critical paths (ICPs) in the processor. Utilizing these paths, we also propose an optimization technique for the replacement of internal node control logic that mitigates the NBTI degradation in the design. Through numerical experiments on two processor designs, we achieved nearly 300x reduction in the sheer number of paths on both designs. Utilizing the extracted ICPs, we achieved 96x-197x speedup without loss in mitigation gain.

    DOI: 10.1587/transfun.e100.a.2797

    Web of Science

    Scopus

    CiNii Research

  52. Physically unclonable function using RTN-induced delay fluctuation in ring oscillators Reviewed

    Yoshinaga M., Awano H., Hiromoto M., Sato T.

    Proceedings IEEE International Symposium on Circuits and Systems   Vol. 2016-July   page: 2619 - 2622   2016.7

    Publisher:Proceedings IEEE International Symposium on Circuits and Systems  

    This paper proposes RTN-PUF, a novel PUF that utilizes random telegraph noise (RTN) of transistors as the physical uniqueness of individual devices. Our proposed RTN-PUF generates a response from a pair of ring oscillators (ROs) by comparing the numbers of frequency changes, which depend on the time constants of RTN. Due to the log-uniform distribution of the time constants, our RTN-PUF provides more stable responses than the existing manufacturing-variation-based PUFs. The numerical experiments show that the RTN-PUF reduces false negative errors by about 60 times compared to the conventional RO-based PUF. This facilitates the use of PUFs for security purposes.

    DOI: 10.1109/ISCAS.2016.7539130

    Scopus

  53. Call Alternation between Specific Pairs of Male Frogs Revealed by a Sound-Imaging Method in Their Natural Habitat Reviewed

    Aihara, I; Mizumoto, T; Awano, H; Okuno, HG

    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5   Vol. 08-12-September-2016   page: 2597 - 2601   2016

    Publisher:Proceedings of the Annual Conference of the International Speech Communication Association Interspeech  

    Male frogs vocalize calls to attract conspecific females as well as to announce their own territories to other male frogs. In the choruses, acoustic interaction allows the male frogs to alternate their calls with each other. Such call alternation is reported in various species of frogs including Japanese tree frogs (Hyla japonica). During call alternation, both male and female frogs are likely to discriminate calls of the male frogs because of the small amount of call overlap. Here, we show that call alternation is observed in natural choruses of male Japanese tree frogs, especially between neighboring pairs. First, we demonstrate that caller positions and call timings can be estimated by a sound-imaging method. Second, the occurrence of call alternation is detected on the basis of statistical tests on phase differences of calls between respective pairs. Although our previous study revealed a global synchronization pattern in natural choruses of the male frogs, local chorus structures were not examined well. Through the observation of call alternation between specific pairs, this study suggests the existence of selective attention in the frog choruses.

    DOI: 10.21437/Interspeech.2016-336

    Web of Science

    Scopus

  54. Workload-Aware Worst Path Analysis of Processor-Scale NBTI Degradation Reviewed

    Bian, S; Shintani, M; Morita, S; Awano, H; Hiromoto, M; Sato, T

    2016 INTERNATIONAL GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI)   Vol. 18-20-May-2016   page: 203 - 208   2016

    Publisher:Proceedings of the ACM Great Lakes Symposium on VLSI Glsvlsi  

    As technology further scales semiconductor devices, aging-induced device degradation has become one of the major threats to device reliability. In addition, aging mechanisms like the negative bias temperature instability (NBTI) are known to be sensitive to workload (i.e., signal probability), which is hard to assume at the design phase. In this work, we analyze the workload dependence of NBTI degradation using a processor, and propose a novel technique to estimate the worst-case paths. In our approach, with careful examination, we exploit the fact that the deterministic nature of circuit structure limits the amount of NBTI degradation on different paths, and propose a two-stage path extraction algorithm to identify the invariable critical paths in the processor. Through numerical experiments on a MIPS32 processor, we performed a detailed signal probability analysis, and successfully extracted 85 invariable critical paths out of the 24,978 path candidates, achieving nearly 300x reduction in the sheer number of paths.

    DOI: 10.1145/2902961.2903013

    Web of Science

    Scopus

  55. Efficient Aging-Aware SRAM Failure Probability Calculation via Particle Filter-Based Importance Sampling Reviewed Open Access

    AWANO Hiromitsu, HIROMOTO Masayuki, SATO Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E99.A ( 7 ) page: 1390 - 1399   2016

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    An efficient Monte Carlo (MC) method for the calculation of failure probability degradation of an SRAM cell due to negative bias temperature instability (NBTI) is proposed. In the proposed method, a particle filter is utilized to incrementally track temporal performance changes in an SRAM cell. The number of simulations required to obtain a stable particle distribution is greatly reduced by reusing the final distribution of the particles in the last time step as the initial distribution. Combined with a binary classifier that quickly judges whether an MC sample causes a malfunction of the cell, the total number of simulations needed to capture the temporal change of failure probability is significantly reduced. The proposed method achieves 13.4× speed-up over the state-of-the-art method.

    DOI: 10.1587/transfun.e99.a.1390

    Web of Science

    Scopus

    CiNii Research

  56. Efficient Transistor-level Timing Yield Estimation via Line Sampling Reviewed

    Awano, H.; Sato, T.

    2016 ACM/EDAC/IEEE Design Automation Conference (DAC)     2016

    Authorship:Lead author   Language:English  

  57. ECRIPSE: An efficient method for calculating RTN-induced failure probability of an SRAM cell Reviewed

    Awano H., Hiromoto M., Sato T.

    Proceedings Design Automation and Test in Europe Date   Vol. 2015-April   page: 549 - 554   2015.4

    Publisher:Proceedings Design Automation and Test in Europe Date  

    Failure rate degradation of an SRAM cell due to random telegraph noise (RTN) is calculated for the first time. ECRIPSE, an efficient method for calculating the RTN-induced failure probability of an SRAM cell, has been developed to exhaustively cover a large number of possible bias-voltage combinations on which RTN statistics strongly depend. In order to shorten computational time, the Monte Carlo calculation of a single gate-bias condition is accelerated by incorporating two techniques: 1) construction of an optimal importance sampling using particles that move about the 'important' regions in a variability space, and 2) a classifier that quickly judges whether the random samples are in failure regions or not. We show that the proposed method achieves at least 15.6× speed-up over the state-of-the-art method. We then integrate an RTN model to modulate failure probability. In our experiment, RTN makes the failure probability six times worse than that calculated without the effect of RTN.

    DOI: 10.7873/date.2015.0731

    Scopus
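
    The role of importance sampling in rare-failure estimation, as mentioned in the abstract of entry 57, can be sketched on a one-dimensional toy problem. This is only an illustration under stated assumptions, not ECRIPSE itself: the "variability space" is a single standard normal variable, "failure" is simply exceeding a threshold, and a fixed mean-shifted Gaussian replaces the particle-based proposal described in the abstract. The function names are hypothetical.

```python
import math
import random


def tail_probability_is(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from N(threshold, 1), which places about half of
    them in the failure region, and are reweighted by the likelihood
    ratio N(0, 1) / N(threshold, 1).
    """
    rng = random.Random(seed)
    shift = threshold
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)
        if x > threshold:  # toy failure criterion
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def tail_probability_exact(threshold):
    """Closed-form Gaussian tail probability, used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


estimate = tail_probability_is(4.0, 20000)
exact = tail_probability_exact(4.0)
print(f"importance sampling: {estimate:.3e}")
print(f"exact reference:     {exact:.3e}")
```

    A plain Monte Carlo run with the same 20,000 samples would be expected to see fewer than one failure at this threshold, which is why shifting the sampling distribution toward the failure region matters.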

  58. A-3-5 On Stochastic modeling of NBTI induced threshold voltage variation

    Sato Masahiro, Izuka Syoichi, Awano Hiromitsu, Hashimoto Masanori, Onoye Takao

    Proceedings of the IEICE General Conference   Vol. 2015   page: 84   2015.2

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Research

  59. Recognition of in-field frog chorusing using Bayesian nonparametric microphone array processing Reviewed

    Bandog Y., Otsuka T., Aihara I., Awano H., Itoyama K., Yoshii K., Okuno H.G.

    Aaai Workshop Technical Report   Vol. WS-15-06   page: 2 - 6   2015

    Publisher:Aaai Workshop Technical Report  

    In this paper, we exploit Bayesian nonparametric microphone array processing (BNP-MAP) for analyzing the spatio-temporal patterns of the frog chorus. Such analysis in real environments is made more difficult by unpredictable sound sources, including calls of various species of animals. Applying conventional signal processing algorithms has been difficult because these algorithms usually require the number of sound sources in advance. BNP-MAP is developed to cope with auditory uncertainties such as reverberation or an unknown number of sounds by using a unified model based on Bayesian nonparametrics. We exploit BNP-MAP for analyzing 20 minutes of sound data captured by a 7-channel microphone array in a paddy rice field in Oki Island, Japan, and reveal that two individuals of Schlegel's green tree frog (Rhacophorus schlegelii) called alternately in anti-phase. This result is compared with the video data captured by a video camera with 18 units of sound-imaging devices called Firefly deployed along the bank of the rice field. The auditory result provides more detailed patterns of the frog chorus at higher temporal resolution. This higher resolution enables analysis of fine temporal structures of the frog calls. For example, BNP-MAP reveals the trill-like calling pattern of R. schlegelii.

    Scopus

  60. Variability in device degradations: Statistical observation of NBTI for 3996 transistors Reviewed

    Awano H., Hiromoto M., Sato T.

    European Solid State Device Research Conference     page: 218 - 221   2014.11

    Publisher:European Solid State Device Research Conference  

    Degradations of thousands of transistors have been observed in a practical time. A novel device array circuit suitable for measurement-based statistical characterization has been devised to facilitate parallel stress bias application to capture negative bias temperature instability (NBTI). The experimental results show that log-normal distributions approximate the distribution of power-law exponents very well and that the variation in magnitude of threshold voltage shifts bears an inverse relation to the channel areas of transistors. The variability in degradations under an AC-stress condition is also presented for the first time.

    DOI: 10.1109/ESSDERC.2014.6948799

    Scopus

  61. A-7-1 A Study of Chip Identification Using Random Telegraph Noise

    Yoshinaga Motoki, Awano Hiromitsu, Hiromoto Masayuki, Sato Takashi

    Proceedings of the Society Conference of IEICE   Vol. 2014   page: 95   2014.9

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Research

  62. BTIarray: A Time-Overlapping Transistor Array for Efficient Statistical Characterization of Bias Temperature Instability Reviewed

    Awano, H; Hiromoto, M; Sato, T

    IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY   Vol. 14 ( 3 ) page: 833 - 843   2014.9

    Publisher:IEEE Transactions on Device and Materials Reliability  

    A transistor array has been developed that is capable of efficiently collecting parametric data for a statistical model of bias-temperature instability (BTI) degradation. This BTIarray uses a time-overlapping technique, in which all transistors in the array undergo BTI stress or recovery bias in parallel, which greatly reduces the measurement time for a large number of transistors. An implementation using 65-nm technology validated the time-overlapping concept. The use of this array reduces the time to measure the statistical threshold voltage shifts of 128 transistors from a month to within a day while retaining precision as high as 50 μV (rms). Experiments showed that the statistical distribution of the time exponent for the degradation model of the pMOS transistor was log-normal.

    DOI: 10.1109/TDMR.2014.2327164

    Web of Science

    Scopus

  63. Spatio-Temporal Dynamics in Collective Frog Choruses Examined by Mathematical Modeling and Field Observations Reviewed Open Access

    Aihara, I; Mizumoto, T; Otsuka, T; Awano, H; Nagira, K; Okuno, HG; Aihara, K

    SCIENTIFIC REPORTS   Vol. 4   page: 3891   2014.1

    Language:English   Publisher:Scientific Reports  

    This paper reports theoretical and experimental studies on spatio-temporal dynamics in the choruses of male Japanese tree frogs. First, we theoretically model their calling times and positions as a system of coupled mobile oscillators. Numerical simulation of the model as well as calculation of the order parameters show that the spatio-temporal dynamics exhibits bistability between two-cluster antisynchronization and wavy antisynchronization, by assuming that the frogs are attracted to the edge of a simple circular breeding site. Second, we change the shape of the breeding site from the circle to rectangles including a straight line, and evaluate the stability of two-cluster and wavy antisynchronization. Numerical simulation shows that two-cluster antisynchronization is more frequently observed than wavy antisynchronization. Finally, we recorded frog choruses at an actual paddy field using our sound-imaging method. Analysis of the video demonstrated a consistent result with the aforementioned simulation: namely, two-cluster antisynchronization was more frequently realized.

    DOI: 10.1038/srep03891

    Open Access

    Web of Science

    Scopus

    PubMed
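
    The antisynchronization described in the abstract of entry 63 can be illustrated with the simplest possible case: two phase oscillators with mutually repulsive coupling. This is a minimal sketch under stated assumptions, not the paper's model: the callers are stationary (the paper uses mobile oscillators), the coupling is a plain sine term, and `phase_difference_after` and its parameters are hypothetical names. For this pair the phase difference phi obeys d(phi)/dt = 2 K sin(phi), so in-phase calling is unstable and anti-phase calling (phi = pi) is stable.

```python
import math


def phase_difference_after(phi0, coupling, dt=0.01, steps=2000,
                           omega=2.0 * math.pi):
    """Integrate two repulsively coupled phase oscillators with Euler steps.

    Returns the final phase difference theta2 - theta1, wrapped to [0, 2*pi).
    """
    theta1, theta2 = 0.0, phi0
    for _ in range(steps):
        # Each caller is pushed away from the other's phase.
        d1 = omega - coupling * math.sin(theta2 - theta1)
        d2 = omega - coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    return (theta2 - theta1) % (2.0 * math.pi)


from_small_lag = phase_difference_after(0.3, coupling=1.0)
from_large_lag = phase_difference_after(6.0, coupling=1.0)
print(f"start 0.3 rad -> {from_small_lag:.4f} rad")
print(f"start 6.0 rad -> {from_large_lag:.4f} rad")
```

    Both initial conditions settle at a phase difference of pi, i.e., the two callers alternate. The cluster-level patterns reported in the abstract involve many callers and spatial dynamics that this two-oscillator sketch does not capture.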

  64. A scalable device array for statistical device-aging characterization Reviewed

    Sato T., Awano H., Hiromoto M.

    Proceedings 2014 IEEE 12th International Conference on Solid State and Integrated Circuit Technology Icsict 2014     2014.1

    Publisher:Proceedings 2014 IEEE 12th International Conference on Solid State and Integrated Circuit Technology Icsict 2014  

    A device array circuit that is suitable for efficiently characterizing device parameter degradation due to bias temperature instability (BTI) is reviewed. The device array facilitates parallel application of stress and recovery bias voltages to multiple devices, reducing total measurement time significantly. The device count in the array is easily scalable to meet the necessary statistical confidence level. Measurement examples of the two implementations containing 128 and 3,996 devices are also presented.

    DOI: 10.1109/ICSICT.2014.7021224

    Scopus

  65. Automation of Model Parameter Estimation for Random Telegraph Noise Reviewed Open Access

    SHIMIZU Hirofumi, AWANO Hiromitsu, HIROMOTO Masayuki, SATO Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E97.A ( 12 ) page: 2383 - 2392   2014

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    The modeling of random telegraph noise (RTN) of MOS transistors is becoming increasingly important. In this paper, a novel method is proposed for realizing automated estimation of two important RTN-model parameters: the number of interface states and the corresponding threshold voltage shift. The proposed method utilizes a Gaussian mixture model (GMM) to represent the voltage distributions, and estimates their parameters using the expectation-maximization (EM) algorithm. Using information criteria, the optimal estimation is automatically obtained while avoiding overfitting. In addition, we use a shared variance for all the Gaussian components in the GMM to deal with the noise in RTN signals. The proposed method improves estimation accuracy when large measurement noise is observed.

    DOI: 10.1587/transfun.e97.a.2383

    Web of Science

    Scopus

    CiNii Research
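
    The combination of a shared-variance GMM, EM, and an information criterion described in the abstract of entry 65 can be sketched in a few dozen lines. This is a hedged illustration, not the authors' implementation: it assumes one-dimensional synthetic data with two voltage levels, uses BIC as the information criterion (the abstract does not say which criteria are used), and all function names and the parameter count `2 * k` are choices made for this sketch.

```python
import math
import random


def component_logs(x, means, weights, var):
    """Log of weight * Gaussian density for each component at x."""
    norm = 0.5 * math.log(2.0 * math.pi * var)
    return [math.log(max(w, 1e-300)) - norm - (x - m) ** 2 / (2.0 * var)
            for m, w in zip(means, weights)]


def log_likelihood(data, means, weights, var):
    total = 0.0
    for x in data:
        logs = component_logs(x, means, weights, var)
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total


def fit_shared_variance_gmm(data, k, n_iter=100):
    """EM for a 1-D Gaussian mixture whose components share one variance."""
    n = len(data)
    lo, hi = min(data), max(data)
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    weights = [1.0 / k] * k
    var = max(((hi - lo) / (2.0 * k)) ** 2, 1e-12)
    for _ in range(n_iter):
        # E-step: responsibilities, computed stably in the log domain.
        resp = []
        for x in data:
            logs = component_logs(x, means, weights, var)
            top = max(logs)
            terms = [math.exp(v - top) for v in logs]
            total = sum(terms)
            resp.append([t / total for t in terms])
        # M-step: per-component weights and means, one pooled variance.
        for j in range(k):
            nk = sum(r[j] for r in resp)
            weights[j] = nk / n
            if nk > 1e-12:
                means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nk
        pooled = sum(r[j] * (x - means[j]) ** 2
                     for r, x in zip(resp, data) for j in range(k))
        var = max(pooled / n, 1e-12)
    return means, weights, var


def bic(data, k):
    """BIC with k means, k - 1 free weights, and 1 shared variance."""
    means, weights, var = fit_shared_variance_gmm(data, k)
    n_params = 2 * k
    score = -2.0 * log_likelihood(data, means, weights, var)
    return score + n_params * math.log(len(data)), means


rng = random.Random(1)
data = ([rng.gauss(0.0, 0.1) for _ in range(300)]
        + [rng.gauss(1.0, 0.1) for _ in range(300)])
scores = {k: bic(data, k) for k in range(1, 5)}
best_k = min(scores, key=lambda k: scores[k][0])
print("selected number of levels:", best_k)
print("estimated levels:", sorted(scores[best_k][1]))
```

    Sharing one variance across components reflects the idea that measurement noise is the same regardless of which discrete level the signal occupies, and it keeps the parameter count low as the number of candidate levels grows.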

  66. Compact Modeling of Statistical BTI under Trapping/Detrapping Reviewed

    Velamala, JB; Sutaria, KB; Shimizu, H; Awano, H; Sato, T; Wirth, G; Cao, Y

    IEEE TRANSACTIONS ON ELECTRON DEVICES   Vol. 60 ( 11 ) page: 3645 - 3654   2013.11

    Publisher:IEEE Transactions on Electron Devices  

    The aging process due to negative bias temperature instability (NBTI) is a key limiting factor of circuit lifetimes in CMOS design. Recent NBTI data exhibits an excessive amount of randomness and fast recovery, which are difficult to handle with the conventional power-law model (t^n). Such discrepancies further pose a challenge for long-term reliability prediction under statistical variations and dynamic voltage scaling (DVS) in real circuit operation. To overcome these barriers, this paper: 1) practically explains the aging statistics due to randomness in the number of traps with the log(t) model, accurately predicting the mean and variance shift; 2) proposes a cycle-to-cycle model (from the first principles of trapping) to handle aging under multiple supply voltages, predicting the nonmonotonic behavior under DVS; 3) presents a long-term model to estimate a tight upper bound of dynamic aging over multiple cycles; and 4) comprehensively validates the new set of aging models with 65-nm statistical silicon data. Compared with previous models, the new set of aging models captures the aging variability and the essential role of the recovery phase under DVS, reducing unnecessary guard banding during the design stage. © 1963-2012 IEEE.

    DOI: 10.1109/TED.2013.2281986

    Web of Science

    Scopus
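
    Why the choice between a log(t) form and a power-law form matters for long-term prediction, as discussed in the abstract of entry 66, can be shown with a small fitting exercise. This is a sketch under stated assumptions, not the paper's model: the generic form a + b ln(t) stands in for the log(t) model (whose exact expression is not given in the abstract), the data are synthetic and noise-free, and all names and numbers are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic stress data: 1 s to 1e4 s, threshold shift in mV (assumed values).
times = [10 ** (i / 4) for i in range(17)]
shifts = [5.0 + 2.0 * math.log(t) for t in times]
log_times = [math.log(t) for t in times]

# Log model: shift = a + b * ln(t), linear in ln(t).
a, b = linear_fit(log_times, shifts)

# Power law: shift = A * t**n, linear in log-log space.
ln_amp, exponent = linear_fit(log_times, [math.log(v) for v in shifts])

# Extrapolate both fits far beyond the measured window.
t_long = 1e8
log_prediction = a + b * math.log(t_long)
power_prediction = math.exp(ln_amp + exponent * math.log(t_long))
print(f"log model at 1e8 s:  {log_prediction:.1f} mV")
print(f"power law at 1e8 s:  {power_prediction:.1f} mV")
```

    When the underlying trend is logarithmic, the power-law fit curves upward relative to the data outside the measured range and therefore predicts a larger long-term shift. In this toy setting that is the kind of excess margin the abstract refers to as unnecessary guard banding.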

  67. Logarithmic modeling of BTI under dynamic circuit operation: Static, dynamic and long-term prediction Reviewed

    Velamala J.B., Sutaria K.B., Shimizu H., Awano H., Sato T., Wirth G., Cao Y.

    IEEE International Reliability Physics Symposium Proceedings     2013.8

    Publisher:IEEE International Reliability Physics Symposium Proceedings  

    Bias temperature instability (BTI) is the dominant source of aging in nanoscale transistors. Recent works show the role of charge trapping/de-trapping (T-D) in BTI through discrete V_th shifts, with the degradation exhibiting an excessive amount of randomness. Furthermore, modern circuits employ dynamic voltage scaling (DVS) where V_dd is tuned, complicating the aging effect. It becomes challenging to predict long-term aging in an actual circuit under statistical variation and DVS. To accurately predict the degradation in these circumstances, this work (1) examines the principles of T-D, thereby proposing static and cycle-to-cycle (dynamic) models under voltage tuning in DVS; (2) presents a long-term model, estimating a tight upper bound of dynamic aging; (3) comprehensively validates the new set of models with 65nm silicon data. The proposed aging models accurately capture the recovery behavior in dynamic operations, reducing the unnecessary margin and enhancing the simulation efficiency for aging estimation during the design stage. © 2013 IEEE.

    DOI: 10.1109/IRPS.2013.6532063

    Scopus

  68. Multi-trap RTN parameter extraction based on Bayesian inference Reviewed

    Awano H., Tsutsui H., Ochi H., Sato T.

    Proceedings International Symposium on Quality Electronic Design Isqed     page: 597 - 602   2013.7

    Publisher:Proceedings International Symposium on Quality Electronic Design Isqed  

    This paper presents a new analysis method for estimating the statistical parameters of random telegraph noise (RTN). RTN is characterized by the time constants of carrier capture and emission, and the associated changes in threshold voltage. Because trap activities are projected onto the threshold voltage, the separation of time constants and amplitude for each trap is an ill-posed problem. The proposed method solves this problem with a statistical method that can reflect the physical generation process of RTN. By using the Gibbs sampling algorithm developed in the statistical machine learning community, we decompose the measured threshold voltage sequence into the time constants and amplitude of each trap. We also demonstrate that the proposed method estimates time constants about 2.1 times more accurately than the existing work that uses a hidden Markov model, which contributes to enhancing the accuracy of reliability-aware circuit simulation. © 2013 IEEE.

    DOI: 10.1109/ISQED.2013.6523672

    Scopus

  69. Statistical aging under dynamic voltage scaling: A logarithmic model approach Reviewed

    Velamala J.B., Sutaria K., Shimizu H., Awano H., Sato T., Cao Y.

    Proceedings of the Custom Integrated Circuits Conference     2012.11

    Publisher:Proceedings of the Custom Integrated Circuits Conference  

    Aging mechanisms, such as Negative Bias Temperature Instability (NBTI), limit the lifetime of CMOS design. Recent NBTI data exhibits an excessive amount of randomness and fast recovery, which are difficult to handle with the conventional power-law model (t^n). Such discrepancies further pose a challenge for long-term reliability prediction in real circuit operation. To overcome these barriers, this work (1) proposes a logarithmic model (log(t)) that is derived from the trapping/de-trapping assumptions; (2) practically explains the aging statistics and the non-monotonic behavior under dynamic voltage scaling (DVS); and (3) comprehensively validates the new model with 65nm silicon data. Compared to previous models, the new result captures the essential role of the recovery phase under DVS, reducing unnecessary guard-banding in reliability protection. © 2012 IEEE.

    DOI: 10.1109/CICC.2012.6330572

    Scopus

  70. Statistical observations of NBTI-induced threshold voltage shifts on small channel-area devices Reviewed

    Sato T., Awano H., Shimizu H., Tsutsui H., Ochi H.

    Proceedings International Symposium on Quality Electronic Design Isqed     page: 306 - 311   2012.7

    Publisher:Proceedings International Symposium on Quality Electronic Design Isqed  

    Performance variability of miniaturized devices has become a major obstacle for designing electronic systems. Temporal degradation of threshold voltages and its variation are becoming additional concerns in ensuring their reliability. In this paper, based on measurement results on a large number of devices, we present statistical properties of device degradation and recovery. The measurement data is obtained by using a device-array circuit suitable for efficiently collecting statistical data on degradations and recoveries of very small channel-area devices. The stair-like change of threshold voltages found in our measurement suggests that charge trapping and emission may play a key role in the device degradation process. © 2012 IEEE.

    DOI: 10.1109/ISQED.2012.6187510

    Scopus

  71. Bayesian Estimation of Multi-Trap RTN Parameters Using Markov Chain Monte Carlo Method Reviewed Open Access

    AWANO Hiromitsu, TSUTSUI Hiroshi, OCHI Hiroyuki, SATO Takashi

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol. E95.A ( 12 ) page: 2272 - 2283   2012

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Random telegraph noise (RTN) is a phenomenon that is considered to limit the reliability and performance of circuits using advanced devices. The time constants of carrier capture and emission and the associated change in the threshold voltage are important parameters commonly included in various models, but their extraction from time-domain observations has been a difficult task. In this study, we propose a statistical method for simultaneously estimating interrelated parameters: the time constants and magnitude of the threshold voltage shift. Our method is based on a graphical network representation, and the parameters are estimated using the Markov chain Monte Carlo method. Experimental application of the proposed method to synthetic and measured time-domain RTN signals was successful. The proposed method can handle interrelated parameters of multiple traps and thereby contributes to the construction of more accurate RTN models.

    DOI: 10.1587/transfun.e95.a.2272

    Web of Science

    Scopus

    CiNii Research

  72. Use of a sparse structure to improve learning performance of recurrent neural networks Reviewed

    Awano H., Nishide S., Arie H., Tani J., Takahashi T., Okuno H.G., Ogata T.

    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics   Vol. 7064 LNCS ( PART 3 ) page: 323 - 331   2011.11

    Publisher:Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics  

    The objective of our study is to find out how a sparse structure affects the performance of a recurrent neural network (RNN). Only a few existing studies have dealt with the sparse structure of RNNs trained with learning methods such as Back Propagation Through Time (BPTT). In this paper, we propose an RNN with sparse connection and BPTT called Multiple time scale RNN (MTRNN). We then investigate how sparse connection affects generalization performance and noise robustness. In the experiments using data composed of alphabetic sequences, the MTRNN showed the best generalization performance when the connection rate was 40%. We also measured sparseness of neural activity and found that sparseness of neural activity corresponds to generalization performance. These results mean that sparse connection improved learning performance and that sparseness of neural activity could be used as a metric of generalization performance. © 2011 Springer-Verlag.

    DOI: 10.1007/978-3-642-24965-5_36

    Scopus

  73. A study on parameter estimation for modeling of random-telegraph noise

    AWANO Hiromitsu, SHIMIZU Hirofumi, TSUTSUI Hiroshi, OCHI Hiroyuki, SATO Takashi

      Vol. 111 ( 324 ) page: 85 - 90   2011.11

    Language:Japanese  

    CiNii Research

  74. Human-robot cooperation in arrangement of objects using confidence measure of neuro-dynamical system Reviewed

    Awano H., Ogata T., Nishide S., Takahashi T., Komatani K., Okuno H.

    Conference Proceedings IEEE International Conference on Systems Man and Cybernetics     page: 2533 - 2538   2010.12

    Publisher:Conference Proceedings IEEE International Conference on Systems Man and Cybernetics  

    The objective of our study was to develop dynamic collaboration between a human and a robot. Most conventional studies have created pre-designed rule-based collaboration systems to determine the timing and behavior of robots to participate in tasks. Our aim is to introduce the confidence of the task as a criterion for robots to determine their timing and behavior. In this paper, we report the effectiveness of applying reproduction accuracy as a measure for quantitatively evaluating confidence in an object arrangement task. Our method comprises three phases. First, we obtain human-robot interaction data through the Wizard of Oz method. Second, a neuro-dynamical system, namely the Multiple Time-scales Recurrent Neural Network (MTRNN), is trained on the obtained data. Finally, the prediction error in MTRNN is applied as a confidence measure to determine the robot's behavior. The robot participated in the task when its confidence was high, while it just observed when its confidence was low. Training data were acquired using an actual robot platform, Hiro. The method was evaluated using a robot simulator. The results revealed that motion trajectories could be precisely reproduced with a high degree of confidence, demonstrating the effectiveness of the method. © 2010 IEEE.

    DOI: 10.1109/ICSMC.2010.5641924

    Scopus

  75. Human and Robot Cooperation for Arrangement of Objects by Prediction using Recurrent Neural Network

    AWANO Hiromitsu, OGATA Tetsuya, KOMATANI Kazunori, TAKAHASHI Toru, OKUNO Hiroshi G.

      Vol. 72 ( 0 ) page: 395 - 396   2010.3

    Language:Japanese  

    CiNii Research

Books 1

  1. On-chip characterization of statistical device degradation

    Sato T., Awano H.

    Circuit Design for Reliability  2015.1  ( ISBN:9781461440772, 9781461440789 )

    Bias temperature instability (BTI) is one of the most critical degradation mechanisms that occur in modern semiconductor devices. The degradation due to BTI is transient and is known to be greatly influenced by bias voltages and temperature, making it very difficult to detect possible BTI-related failures during manufacturing test. Characterization and modeling of BTI are hence extremely important to protect a chip from BTI-related failures. In this chapter, an array structure that accelerates the statistical characterization of BTI is described. By overlapping the stress-application period for each device, measurements on hundreds or thousands of devices can be conducted concurrently. Test chip measurement results that provide statistical insight into the parameters of the BTI-related degradation process are also presented.

    DOI: 10.1007/978-1-4614-4078-9_5

    Scopus