Papers - TODA Tomoki
-
Audio difference learning for audio captioning Reviewed
T. Komatsu, Y. Fujita, K. Takeda, T. Toda
Proc. IEEE ICASSP page: 1456 - 1460 2024.4
-
ConvNeXt-TTS and ConvNeXt-VC: ConvNeXt-based fast end-to-end sequence-to-sequence text-to-speech and voice conversion Reviewed
T. Okamoto, Y. Ohtani, T. Toda, H. Kawai
Proc. IEEE ICASSP page: 12456 - 12460 2024.4
-
MF-AED-AEC: speech emotion recognition by leveraging multimodal fusion, ASR error detection, and ASR error correction Reviewed International coauthorship
J. He, X. Shi, X. Li, T. Toda
Proc. IEEE ICASSP page: 11066 - 11070 2024.4
-
Electrolaryngeal speech intelligibility enhancement through robust linguistic encoders Reviewed
L.P. Violeta, W.-C. Huang, D. Ma, R. Yamamoto, K. Kobayashi, T. Toda
Proc. IEEE ICASSP page: 10961 - 10965 2024.4
-
FIRNET: fundamental frequency controllable fast neural vocoder with trainable finite impulse response filter Reviewed
Y. Ohtani, T. Okamoto, T. Toda, H. Kawai
Proc. IEEE ICASSP page: 10871 - 10875 2024.4
-
Dual-channel target speaker extraction based on conditional variational autoencoder and directional information Reviewed
R. Wang, L. Li, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 32 page: 12 pages 2024.3
-
Fast neural speech waveform generative models with fully-connected layer-based upsampling Reviewed
H. Yamashita, T. Okamoto, R. Takashima, Y. Ohtani, T. Takiguchi, T. Toda, H. Kawai
IEEE Access Vol. 12 page: 31409 - 31421 2024.2
-
Preservation of preoperative voice in laryngectomees using a voice recording app: the Save the Voice project Reviewed
西尾 直樹, 戸田 智基, 小林 和弘, 三谷 壮平, 飴矢 美里, 向山 宣昭, 木村 宏之, 徳倉 達也, 坪井 崇, 藤本 保志, 曾根 三千彦
喉頭 Vol. 35 ( 2 ) page: 142 - 147 2023.12
-
The Singing Voice Conversion Challenge 2023 Reviewed International coauthorship
W.-C. Huang, L.P. Violeta, S. Liu, J. Shi, T. Toda
Proc. IEEE ASRU page: 8 pages 2023.12
-
ED-CEC: improving rare word recognition using ASR post-processing based on error detection and context-aware error correction Reviewed
J. He, Z. Yang, T. Toda
Proc. IEEE ASRU page: 6 pages 2023.12
-
Improving severity preservation of healthy-to-pathological voice conversion with global style tokens Reviewed International coauthorship
B. Halpern, W.-C. Huang, L.P. Violeta, R. van Son, T. Toda
Proc. IEEE ASRU page: 7 pages 2023.12
-
A comparative study of voice conversion models with large-scale speech and singing data: the T13 systems for the Singing Voice Conversion Challenge 2023 Reviewed
R. Yamamoto, R. Yoneyama, L.P. Violeta, W.-C. Huang, T. Toda
Proc. IEEE ASRU page: 6 pages 2023.12
-
The VoiceMOS Challenge 2023: zero-shot subjective speech quality prediction for multiple domains Reviewed International coauthorship
E. Cooper, W.-C. Huang, Y. Tsao, H.-M. Wang, T. Toda, J. Yamagishi
Proc. IEEE ASRU page: 7 pages 2023.12
-
WaveNeXt: ConvNeXt-based fast neural vocoder without iSTFT layer Reviewed
T. Okamoto, H. Yamashita, Y. Ohtani, T. Toda, H. Kawai
Proc. IEEE ASRU page: 8 pages 2023.12
-
Sequence-to-sequence network training methods for automatic guitar transcription with tokenized outputs Reviewed
S. Kim, K. Takeda, T. Toda
Proc. ISMIR page: 524 - 531 2023.11
-
Evaluating methods for ground-truth-free foreign accent conversion Reviewed
W.-C. Huang, T. Toda
Proc. APSIPA ASC page: 1136 - 1141 2023.11
-
An analysis of personalized speech recognition system development for the deaf and hard-of-hearing Reviewed
L.P. Violeta, T. Toda
Proc. APSIPA ASC page: 1851 - 1856 2023.11
-
Semi-supervised multimodal emotion recognition with consensus decision-making and label correction Reviewed International coauthorship
J. Tian, D. Hu, X. Shi, J. He, X. Li, Y. Gao, T. Toda, X. Xu, X. Hu
Proc. MRAC page: 67 - 73 2023.10
-
Differentiable representation of warping based on Lie group theory Reviewed
A. Miyashita, T. Toda
Proc. IEEE WASPAA page: 5 pages 2023.10
-
Directional target speaker extraction under noisy underdetermined conditions through conditional variational autoencoder with global style tokens Reviewed
R. Wang, T. Toda
Proc. IEEE WASPAA page: 5 pages 2023.10
-
Sound field interpolation with unsupervised calibration for freely spaced circular microphone array in rotation-robust beamforming Reviewed
S. Luan, Y. Wakabayashi, T. Toda
Proc. EUSIPCO page: 21 - 25 2023.9
-
Noisy-to-noisy voice conversion under variations of noisy condition Reviewed
C. Xie, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 31 page: 3871 - 3882 2023.9
-
High-fidelity and pitch-controllable neural vocoder based on unified source-filter networks Reviewed
R. Yoneyama, Y.-C. Wu, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 31 page: 3717 - 3729 2023.9
-
Preference-based training framework for automatic speech quality assessment using deep neural network Reviewed
C.-H. Hu, Y. Yasuda, T. Toda
Proc. INTERSPEECH page: 546 - 550 2023.8
-
Analysis of mean opinion scores in subjective evaluation of synthetic speech based on tail probabilities Reviewed
Y. Yasuda, T. Toda
Proc. INTERSPEECH page: 5491 - 5495 2023.8
-
Reverberation-controllable voice conversion using reverberation time estimator Reviewed
Y. Choi, C. Xie, T. Toda
Proc. INTERSPEECH page: 2103 - 2107 2023.8
-
E2E-S2S-VC: end-to-end sequence-to-sequence voice conversion Reviewed
T. Okamoto, H. Yamashita, T. Toda, H. Kawai
Proc. INTERSPEECH page: 2043 - 2047 2023.8
-
Emotion awareness in multi-utterance turn for improving emotion prediction in multi-speaker conversation Reviewed International coauthorship
X. Shi, X. Li, T. Toda
Proc. INTERSPEECH page: 765 - 769 2023.8
-
Representation of vocal tract length transformation based on group theory Reviewed
A. Miyashita, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Analysis of Noisy-target Training for DNN-based speech enhancement Reviewed
T. Fujimura, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Intermediate fine-tuning using imperfect synthetic speech for improving electrolaryngeal speech recognition Reviewed
L.P. Violeta, D. Ma, W.-C. Huang, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Source-Filter HiFiGAN: fast and pitch controllable high-fidelity neural vocoder Reviewed International coauthorship
R. Yoneyama, Y.-C. Wu, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
NNSVS: a neural network based singing voice synthesis toolkit Reviewed
R. Yamamoto, R. Yoneyama, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Low-latency electrolaryngeal speech enhancement based on FastSpeech2-based voice conversion and self-supervised speech representation Reviewed
K. Kobayashi, T. Hayashi, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder Reviewed
Y. Yasuda, T. Toda
Proc. IEEE ICASSP page: 5 pages 2023.6
-
Harmonic-Net: fundamental frequency and speech rate controllable fast neural vocoder Reviewed
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, H. Kawai
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 31 page: 1902 - 1915 2023.5
-
Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion Reviewed
D. Ma, L.P. Violeta, K. Kobayashi, T. Toda
Proc. IEEE SLT page: 949 - 954 2023.1
-
Music similarity calculation of individual instrumental sounds using metric learning Reviewed
Y. Hashizume, L. Li, T. Toda
Proc. APSIPA ASC page: 33 - 38 2022.11
-
Sequence-wise optimization for quasi-harmonic speech waveform modeling Reviewed
S. Chen, T. Toda
Proc. APSIPA ASC page: 1658 - 1663 2022.11
-
Direction-aware target speaker extraction with a dual-channel system based on conditional variational autoencoders under underdetermined conditions Reviewed
R. Wang, L. Li, T. Toda
Proc. APSIPA ASC page: 347 - 353 2022.11
-
Interpretable control for emotional text-to-speech system toward development of sympathetic educational-support robots Reviewed
J. Feng, T. Yoshikawa, T. Toda
Proc. APSIPA ASC page: 342 - 346 2022.11
-
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Reviewed
Y. Yasuda, T. Toda
IEEE Journal of Selected Topics in Signal Processing Vol. 16 ( 6 ) page: 1319 - 1328 2022.10
-
A comparative study of self-supervised speech representation based voice conversion Reviewed International coauthorship
W.-C. Huang, S.-W. Yang, T. Hayashi, T. Toda
IEEE Journal of Selected Topics in Signal Processing Vol. 16 ( 6 ) page: 1308 - 1318 2022.10
-
Noisy-to-noisy voice conversion with pre-training strategy Invited Reviewed
C. Xie, T. Toda
Proc. ICA page: 5 pages 2022.9
-
A cyclical approach to synthetic and natural speech mismatch refinement of neural post-filter for low-cost text-to-speech system Reviewed
Y.-C. Wu, P.L. Tobing, K. Yasuhara, N. Matsunaga, Y. Ohtani, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 11 ( e30 ) page: 1 - 32 2022.9
-
Investigating self-supervised pretraining frameworks for pathological speech recognition Reviewed
L.P. Violeta, W.-C. Huang, T. Toda
Proc. INTERSPEECH page: 41 - 45 2022.9
-
Unified source-filter GAN with harmonic-plus-noise source excitation generation Reviewed
R. Yoneyama, Y.-C. Wu, T. Toda
Proc. INTERSPEECH page: 848 - 852 2022.9
-
The VoiceMOS Challenge 2022 Reviewed International coauthorship
W.-C. Huang, E. Cooper, Y. Tsao, H.-M. Wang, T. Toda, J. Yamagishi
Proc. INTERSPEECH page: 4536 - 4540 2022.9
-
Spoken-text-style transfer with conditional variational autoencoder and content word storage Reviewed
D. Yoshioka, Y. Yasuda, N. Matsunaga, Y. Ohtani, T. Toda
Proc. INTERSPEECH page: 4576 - 4580 2022.9
-
An evaluation of three-stage voice conversion framework for noisy and reverberant conditions Reviewed
Y. Choi, C. Xie, T. Toda
Proc. INTERSPEECH page: 4910 - 4914 2022.9
-
Improvement of anomalous sound detection method considering the distribution of embedding Invited Reviewed
I. Kuroyanagi, T. Hayashi, K. Takeda, T. Toda
Proc. ICA page: 5 pages 2022.9
-
Modified sound field interpolation method for rotation-robust beamforming with unequally spaced circular microphone array Reviewed
S. Luan, Y. Wakabayashi, T. Toda
Proc. EUSIPCO page: 344 - 348 2022.8
-
Note-level automatic guitar transcription using attention mechanism Reviewed
S. Kim, T. Hayashi, T. Toda
Proc. EUSIPCO page: 229 - 233 2022.8
-
Improvement of serial approach to anomalous sound detection by incorporating two binary cross-entropies for outlier exposure Reviewed
I. Kuroyanagi, T. Hayashi, K. Takeda, T. Toda
Proc. EUSIPCO page: 294 - 298 2022.8
-
Generalization ability of MOS prediction networks Reviewed
E. Cooper, W.-C. Huang, T. Toda, J. Yamagishi
Proc. IEEE ICASSP page: 8442 - 8446 2022.5
-
LDNet: unified listener dependent modeling in MOS prediction for synthetic speech Reviewed
W.-C. Huang, E. Cooper, J. Yamagishi, T. Toda
Proc. IEEE ICASSP page: 896 - 900 2022.5
-
S3PRL-VC: open-source voice conversion framework with self-supervised speech representations Reviewed International coauthorship
W.-C. Huang, S.-W. Yang, T. Hayashi, H.-Y. Lee, S. Watanabe, T. Toda
Proc. IEEE ICASSP page: 6552 - 6556 2022.5
-
Towards identity preserving normal to dysarthric voice conversion Reviewed International coauthorship
W.-C. Huang, B.M. Halpern, L.P. Violeta, O. Scharenborg, T. Toda
Proc. IEEE ICASSP page: 6672 - 6676 2022.5
-
Direct noisy speech modeling for noisy-to-noisy voice conversion Reviewed
C. Xie, Y.-C. Wu, P.L. Tobing, W.-C. Huang, T. Toda
Proc. IEEE ICASSP page: 6787 - 6791 2022.5
-
An investigation of streaming non-autoregressive sequence-to-sequence voice conversion Reviewed
T. Hayashi, K. Kobayashi, T. Toda
Proc. IEEE ICASSP page: 6802 - 6806 2022.5
-
Comparison of real-time multi-speaker neural vocoders on CPUs Reviewed
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, H. Kawai
Acoustical Science and Technology, Acoustical Letter Vol. 43 ( 2 ) page: 121 - 124 2022.3
-
Neural speech-rate conversion with multispeaker WaveNet vocoder Reviewed
T. Okamoto, K. Matsubara, T. Toda, Y. Shiga, H. Kawai
Speech Communication Vol. 138 page: 1 - 12 2022.3
-
S3PRL-VC: open-source voice conversion framework with self-supervised speech representations Reviewed International coauthorship
W.-C. Huang, S.-W. Yang, T. Hayashi, H.-Y. Lee, S. Watanabe, T. Toda
Proc. AAAI-22 Workshop, W35: Self-Supervised Learning for Audio and Speech Processing page: 5 pages 2022.2
-
Time alignment using lip images for frame-based electrolaryngeal voice conversion Reviewed International coauthorship
Y.-S. Liou, W.-C. Huang, M.-C. Yen, S.-W. Tsai, Y.-H. Peng, T. Toda, Y. Tsao, H.-M. Wang
Proc. APSIPA ASC page: 1234 - 1238 2021.12
-
Multi-stream HiFi-GAN with data-driven waveform decomposition Reviewed
T. Okamoto, T. Toda, H. Kawai
Proc. IEEE ASRU page: 610 - 617 2021.12
-
On prosody modeling for ASR+TTS based voice conversion Reviewed International coauthorship
W.-C. Huang, T. Hayashi, X. Li, S. Watanabe, T. Toda
Proc. IEEE ASRU page: 642 - 649 2021.12
-
Mandarin electrolaryngeal speech voice conversion with sequence-to-sequence modeling Reviewed International coauthorship
M.-C. Yen, W.-C. Huang, K. Kobayashi, Y.-H. Peng, S.-W. Tsai, Y. Tsao, T. Toda, J.-S. R. Jang, H.-M. Wang
Proc. IEEE ASRU page: 650 - 657 2021.12
-
HASA-Net: a non-intrusive hearing-aid speech assessment network Reviewed International coauthorship
H.-T. Chiang, Y.-C. Wu, C. Yu, T. Toda, H.-M. Wang, Y.-C. Hu, Y. Tsao
Proc. IEEE ASRU page: 907 - 913 2021.12
-
Mandarin electro-laryngeal speech enhancement based on statistical voice conversion and manual tone control Reviewed International coauthorship
Z. Qian, H. Niu, L. Wang, K. Kobayashi, S. Zhang, T. Toda
Proc. APSIPA ASC page: 546 - 552 2021.12
-
Noisy-to-noisy voice conversion framework with denoising model Reviewed
C. Xie, Y.-C. Wu, P.L. Tobing, W.-C. Huang, T. Toda
Proc. APSIPA ASC page: 814 - 820 2021.12
-
Investigation of text-to-speech-based synthetic parallel data for sequence-to-sequence non-parallel voice conversion Reviewed
D. Ma, W.-C. Huang, T. Toda
Proc. APSIPA ASC page: 870 - 877 2021.12
-
An ensemble approach to anomalous sound detection based on conformer-based autoencoder and binary classifier incorporated with metric learning Reviewed
I. Kuroyanagi, T. Hayashi, Y. Adachi, T. Yoshimura, K. Takeda, T. Toda
Proc. DCASE 2021 Workshop page: 110 - 114 2021.11
-
Singing fundamental frequency contour generation using generalized command response model and score-conditional variational autoencoder Reviewed
S. Seki, H. Taga, T. Toda
Proc. IEEE MLSP page: 1 - 6 2021.10
-
Anomalous sound detection using a binary classification model and class centroids Reviewed
I. Kuroyanagi, T. Hayashi, K. Takeda, T. Toda
Proc. EUSIPCO page: 1995 - 1999 2021.8
-
Operation of learning support services and their extension to online classes centered on on-demand delivery: a case study at Nagoya University
戸田 智基, 大平 茂輝, 後藤 明史, 出口 大輔, 森 健策
電子情報通信学会誌 Vol. 104 ( 8 ) page: 862 - 866 2021.8
-
Relational data selection for data augmentation of speaker-dependent multi-band MelGAN vocoder Reviewed International coauthorship
Y.-C. Wu, C.-H. Hu, H.-S. Lee, Y.-H. Peng, W.-C. Huang, Y. Tsao, H.-M. Wang, T. Toda
Proc. INTERSPEECH page: 3630 - 3634 2021.8
-
High-fidelity and low-latency universal neural vocoder based on multiband WaveRNN with data-driven linear prediction for discrete waveform modeling Reviewed
P.L. Tobing, T. Toda
Proc. INTERSPEECH page: 2217 - 2221 2021.8
-
Unified source-filter GAN: unified source-filter network based on factorization of quasi-periodic parallel WaveGAN Reviewed
R. Yoneyama, Y.-C. Wu, T. Toda
Proc. INTERSPEECH page: 2187 - 2191 2021.8
-
A preliminary study of a two-stage paradigm for preserving speaker identity in dysarthric voice conversion Reviewed International coauthorship
W.-C. Huang, K. Kobayashi, Y.-H. Peng, C.-F. Liu, Y. Tsao, H.-M. Wang, T. Toda
Proc. INTERSPEECH page: 1329 - 1333 2021.8
-
Low-latency real-time non-parallel voice conversion based on cyclic variational autoencoder and multiband WaveRNN with data-driven linear prediction Reviewed
P.L. Tobing, T. Toda
Proc. 11th ISCA Speech Synthesis Workshop (SSW11) page: 142 - 147 2021.8
-
Full-band LPCNet: a real-time neural vocoder for 48 kHz audio with a CPU Reviewed
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga, H. Kawai
IEEE Access Vol. 9 page: 94923 - 94933 2021.7
-
Crank: an open-source software for nonparallel voice conversion based on vector-quantized variational autoencoder Reviewed
K. Kobayashi, W.-C. Huang, Y.-C. Wu, P.L. Tobing, T. Hayashi, T. Toda
Proc. IEEE ICASSP page: 5934 - 5938 2021.6
-
Any-to-one sequence-to-sequence voice conversion using self-supervised discrete speech representations Reviewed
W.-C. Huang, Y.-C. Wu, T. Hayashi, T. Toda
Proc. IEEE ICASSP page: 5944 - 5948 2021.6
-
Speech recognition by simply fine-tuning BERT Reviewed International coauthorship
W.-C. Huang, C.-H. Wu, S.-B. Luo, K.-Y. Chen, H.-M. Wang, T. Toda
Proc. IEEE ICASSP page: 7343 - 7347 2021.6
-
Non-autoregressive sequence-to-sequence voice conversion Reviewed
T. Hayashi, W.-C. Huang, K. Kobayashi, T. Toda
Proc. IEEE ICASSP page: 7068 - 7072 2021.6
-
High-intelligibility speech synthesis for dysarthric speakers with LPCNet-based TTS and CycleVAE-based VC Reviewed
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 7058 - 7062 2021.6
-
Speech emotion recognition based on listener adaptive models Reviewed
A. Ando, R. Masumura, H. Sato, T. Moriya, T. Ashihara, Y. Ijima, T. Toda
Proc. IEEE ICASSP page: 6274 - 6278 2021.6
-
Noise level limited sub-modeling for diffusion probabilistic vocoders Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 6029 - 6033 2021.6
-
Speech emotion recognition based on listener-dependent emotion perception models Reviewed
A. Ando, T. Mori, S. Kobashikawa, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 10 ( e6 ) page: 1 - 11 2021.4
-
Quasi-periodic WaveNet: an autoregressive raw waveform generative model with pitch-dependent dilated convolution neural network Reviewed
Y.-C. Wu, T. Hayashi, P.L. Tobing, K. Kobayashi, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 29 page: 1134 - 1148 2021.3
-
Pretraining techniques for sequence-to-sequence voice conversion Reviewed
W.-C. Huang, T. Hayashi, Y.-C. Wu, H. Kameoka, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 29 page: 745 - 755 2021.2
-
Quasi-periodic parallel WaveGAN: a non-autoregressive raw waveform generative model with pitch-dependent dilated convolution neural network Reviewed
Y.-C. Wu, T. Hayashi, T. Okamoto, H. Kawai, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 29 page: 792 - 806 2021.2
-
Investigation of training data size for real-time neural vocoders on CPUs Reviewed
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga, H. Kawai
Acoustical Science and Technology, Acoustical Letter Vol. 42 ( 1 ) page: 65 - 68 2021.1
-
Many-to-many voice transformer network Reviewed
H. Kameoka, W.-C. Huang, K. Tanaka, T. Kaneko, N. Hojo, T. Toda
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 29 page: 656 - 670 2021.1
-
Cross-lingual voice conversion using cyclic variational auto-encoder and a WaveNet vocoder Reviewed
H. Nakatani, P.L. Tobing, K. Takeda, T. Toda
Proc. APSIPA ASC page: 520 - 526 2020.12
-
Phoneme embeddings on predicting fundamental frequency pattern for electrolaryngeal speech Reviewed
M. Eshghi, K. Kobayashi, K. Tanaka, H. Kameoka, T. Toda
Proc. APSIPA ASC page: 572 - 577 2020.12
-
ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech Reviewed International coauthorship
X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.-H. Peng, H.-T. Hwang, Y. Tsao, H.-M. Wang, S. Le Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.-J. Liu, Y.-C. Wu, W.-C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J.-F. Bonastre, A. Govender, S. Ronanki, J.-X. Zhang, Z.-H. Ling
Computer Speech and Language Vol. 64 ( Article 101114 ) page: 1 - 27 2020.11
-
Conformer-based sound event detection with semi-supervised learning and data augmentation Reviewed International coauthorship
K. Miyazaki, T. Komatsu, T. Hayashi, S. Watanabe, T. Toda, K. Takeda
Proc. DCASE 2020 Workshop page: 100 - 104 2020.11
-
An evaluation of voice conversion with neural network spectral mapping models and WaveNet vocoder Reviewed
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 9 ( e26 ) page: 1 - 14 2020.11
-
Quasi-periodic parallel WaveGAN vocoder: a non-autoregressive pitch-dependent dilated convolution model for parametric speech generation Reviewed
Y.-C. Wu, T. Hayashi, T. Okamoto, H. Kawai, T. Toda
Proc. INTERSPEECH page: 3535 - 3539 2020.10
-
The NU voice conversion system for the Voice Conversion Challenge 2020: on the effectiveness of sequence-to-sequence models and autoregressive neural vocoders Reviewed
W.-C. Huang, P.L. Tobing, Y.-C. Wu, K. Kobayashi, T. Toda
Proc. Joint workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 page: 165 - 169 2020.10
-
The sequence-to-sequence baseline for the Voice Conversion Challenge 2020: cascading ASR and TTS Reviewed International coauthorship
W.-C. Huang, T. Hayashi, S. Watanabe, T. Toda
Proc. Joint workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 page: 160 - 164 2020.10
-
Baseline system of Voice Conversion Challenge 2020 with cyclic variational autoencoder and parallel WaveGAN Reviewed
P.L. Tobing, Y.-C. Wu, T. Toda
Proc. Joint workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 page: 155 - 159 2020.10
-
Predictions of subjective ratings and spoofing assessments of Voice Conversion Challenge 2020 submissions Reviewed International coauthorship
R.K. Das, T. Kinnunen, W.-C. Huang, Z. Ling, J. Yamagishi, Z. Yi, X. Tian, T. Toda
Proc. Joint workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 page: 99 - 120 2020.10
-
Voice Conversion Challenge 2020 -- intra-lingual semi-parallel and cross-lingual voice conversion -- Reviewed International coauthorship
Z. Yi, W.-C. Huang, X. Tian, J. Yamagishi, R.K. Das, T. Kinnunen, Z. Ling, T. Toda
Proc. Joint workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 page: 80 - 98 2020.10
-
Cyclic spectral modeling for unsupervised unit discovery into voice conversion with excitation and waveform modeling Reviewed
P.L. Tobing, T. Hayashi, Y.-C. Wu, K. Kobayashi, T. Toda
Proc. INTERSPEECH page: 4861 - 4865 2020.10
-
Voice transformer network: sequence-to-sequence voice conversion using transformer with text-to-speech pretraining Reviewed
W.-C. Huang, T. Hayashi, Y.-C. Wu, H. Kameoka, T. Toda
Proc. INTERSPEECH page: 4676 - 4680 2020.10
-
Intelligibility enhancement based on speech waveform modification using hearing impairment simulator Reviewed
S. Hikosaka, S. Seki, T. Hayashi, K. Kobayashi, K. Takeda, H. Banno, T. Toda
Proc. INTERSPEECH page: 4059 - 4063 2020.10
-
Semi-supervised self-produced speech enhancement and suppression based on joint source modeling of air- and body-conducted signals using variational autoencoder Reviewed
S. Seki, M. Takada, T. Toda
Proc. INTERSPEECH page: 4039 - 4043 2020.10
-
A cyclical post-filtering approach to mismatch refinement of neural vocoder for text-to-speech systems Reviewed
Y.-C. Wu, P.L. Tobing, K. Yasuhara, N. Matsunaga, Y. Ohtani, T. Toda
Proc. INTERSPEECH page: 3540 - 3544 2020.10
-
Implementation of low-latency electrolaryngeal speech enhancement based on multi-task CLDNN Reviewed
K. Kobayashi, T. Toda
Proc. EUSIPCO page: 396 - 400 2020.8
-
Semi-supervised enhancement and suppression of self-produced speech using correspondence between air- and body-conducted signals Reviewed
M. Takada, S. Seki, P.L. Tobing, T. Toda
Proc. EUSIPCO page: 456 - 460 2020.8
-
Weakly-supervised sound event detection with self-attention Reviewed International coauthorship
K. Miyazaki, T. Komatsu, T. Hayashi, S. Watanabe, T. Toda, K. Takeda
Proc. IEEE ICASSP page: 66 - 70 2020.5
-
ESPNET-TTS: unified, reproducible, and integratable open source end-to-end text-to-speech toolkit Reviewed International coauthorship
T. Hayashi, R. Yamamoto, K. Inoue, T. Yoshimura, S. Watanabe, T. Toda, K. Takeda, Y. Zhang, X. Tan
Proc. IEEE ICASSP page: 7654 - 7658 2020.5
-
Efficient shallow WaveNet vocoder using multiple samples output based on Laplacian distribution and linear prediction Reviewed
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda
Proc. IEEE ICASSP page: 7204 - 7208 2020.5
-
Transformer-based text-to-speech with weighted forced attention Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 6729 - 6733 2020.5
-
Non-parallel voice conversion system with WaveNet vocoder and collapsed speech suppression Reviewed
Y.-C. Wu, P.L. Tobing, T. Hayashi, K. Kobayashi, T. Toda
IEEE Access Vol. 8 ( 1 ) page: 62094 - 62106 2020.4
-
Development and evaluation of "かみレポ", a web service that returns handwritten reports via an LMS Reviewed
大平 茂輝, 清谷 峻也, 伊藤 瑠哉, 岡本 康佑, 谷川 右京, 出口 大輔, 戸田 智基
情報処理学会論文誌:教育とコンピュータ Vol. 6 ( 1 ) page: 52 - 68 2020.2
-
Customer satisfaction estimation in contact center calls based on a hierarchical multi-task model Reviewed
A. Ando, R. Masumura, H. Kamiyama, S. Kobashikawa, Y. Aono, T. Toda
IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28 ( 1 ) page: 715 - 728 2020.1
-
Investigation of shallow WaveNet vocoder with Laplacian distribution output Reviewed
P.L. Tobing, T. Hayashi, T. Toda
Proc. IEEE ASRU page: 176 - 183 2019.12
-
Tacotron-based acoustic model using phoneme alignment for practical neural text-to-speech synthesis Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ASRU page: 214 - 221 2019.12
-
Underdetermined source separation based on generalized multichannel variational autoencoder Reviewed
S. Seki, H. Kameoka, L. Li, T. Toda, K. Takeda
IEEE Access Vol. 7 ( 1 ) page: 168104 - 168115 2019.12
-
Voice conversion with CycleRNN-based spectral mapping and finely-tuned WaveNet vocoder Reviewed
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda
IEEE Access Vol. 7 ( 1 ) page: 171114 - 171125 2019.12
-
Machine learning and speech generation: advances in speech waveform modeling
戸田 智基
計測と制御 Vol. 58 ( 12 ) page: 951 - 954 2019.12
-
Improving singing aid system for laryngectomees with statistical voice conversion and VAE-SPACE Reviewed
L. Li, T. Toda, K. Morikawa, K. Kobayashi, S. Makino
Proc. ISMIR page: 784 - 790 2019.11
-
Development of a real-time bionic voice generation system based on statistical excitation prediction Reviewed International coauthorship
F. Ahmadi, K. Kobayashi, T. Toda
Proc. ACM ASSETS page: 655 - 657 2019.10
-
Sound event detection based on statistical methods
林 知樹, 戸田 智基
日本音響学会誌 Vol. 75 ( 9 ) page: 532 - 537 2019.9
-
An investigation of features for fundamental frequency pattern prediction in electrolaryngeal speech enhancement Reviewed
M. Eshghi, K. Tanaka, K. Kobayashi, H. Kameoka, T. Toda
Proc. 10th ISCA Speech Synthesis Workshop (SSW10) page: 251 - 256 2019.9
-
Statistical voice conversion with quasi-periodic WaveNet vocoder Reviewed
Y.-C. Wu, T. Hayashi, P.L. Tobing, K. Kobayashi, T. Toda
Proc. 10th ISCA Speech Synthesis Workshop (SSW10) page: 63 - 68 2019.9
-
Generalization of spectrum differential based direct waveform modification for voice conversion Reviewed International coauthorship
W.-C. Huang, Y.-C. Wu, K. Kobayashi, Y.-H. Peng, H.-T. Hwang, P.L. Tobing, Y. Tsao, H.-M. Wang, T. Toda
Proc. 10th ISCA Speech Synthesis Workshop (SSW10) page: 57 - 62 2019.9
-
Pre-trained text embeddings for enhanced text-to-speech synthesis Reviewed International coauthorship
T. Hayashi, S. Watanabe, T. Toda, K. Takeda, S. Toshniwal, K. Livescu
Proc. INTERSPEECH page: 4430 - 4434 2019.9
-
Real-time neural text-to-speech with sequence-to-sequence acoustic model and WaveGlow or single Gaussian WaveRNN vocoders Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. INTERSPEECH page: 1308 - 1312 2019.9
-
Investigation of F0 conditioning and fully convolutional networks in variational autoencoder based voice conversion Reviewed International coauthorship
W.-C. Huang, Y.-C. Wu, C.-C. Lo, P.L. Tobing, T. Hayashi, K. Kobayashi, T. Toda, Y. Tsao, H.-M. Wang
Proc. INTERSPEECH page: 709 - 713 2019.9
-
Robustness of statistical voice conversion based on direct waveform modification against background sounds Reviewed
Y. Kurita, K. Kobayashi, K. Takeda, T. Toda
Proc. INTERSPEECH page: 684 - 688 2019.9
-
Non-parallel voice conversion with cyclic variational autoencoder Reviewed
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda
Proc. INTERSPEECH page: 674 - 678 2019.9
-
Quasi-periodic WaveNet vocoder: a pitch dependent dilated convolution model for parametric speech generation Reviewed
Y.-C. Wu, T. Hayashi, P.L. Tobing, K. Kobayashi, T. Toda
Proc. INTERSPEECH page: 196 - 200 2019.9
-
Refined WaveNet vocoder for variational autoencoder based voice conversion Reviewed International coauthorship
W.-C. Huang, Y.-C. Wu, H.-T. Hwang, P.L. Tobing, T. Hayashi, K. Kobayashi, T. Toda, Y. Tsao, H.-M. Wang
Proc. EUSIPCO page: 5 pages 2019.9
-
Generalized multichannel variational autoencoder for underdetermined source separation Reviewed
S. Seki, H. Kameoka, L. Li, T. Toda, K. Takeda
Proc. EUSIPCO page: 5 pages 2019.9
-
Investigations of real-time Gaussian FFTNet and parallel WaveNet neural vocoders with simple acoustic features Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 7020 - 7024 2019.5
-
Voice conversion with cyclic recurrent neural network and fine-tuned WaveNet vocoder Reviewed
P.L. Tobing, Y. Wu, T. Hayashi, K. Kobayashi, T. Toda
Proc. IEEE ICASSP page: 6815 - 6819 2019.5
-
Scene-dependent anomalous acoustic-event detection based on conditional WaveNet and i-Vector Reviewed
T. Komatsu, T. Hayashi, R. Kondo, T. Toda, K. Takeda
Proc. IEEE ICASSP page: 870 - 874 2019.5
-
Environmental sound processing and its applications Invited Reviewed
K. Miyazaki, T. Toda, T. Hayashi, K. Takeda
IEEJ Transactions on Electronics, Information and Systems Vol. 14 ( 3 ) page: 340 - 351 2019.3
-
Speech-to-singing voice conversion: the challenges and strategies for improving vocal conversion processes Reviewed International coauthorship
K. Vijayan, H. Li, T. Toda
IEEE Signal Processing Magazine Vol. 36 ( 1 ) page: 95 - 102 2019.1
-
An end-to-end model for cross-lingual transformation of paralinguistic information Reviewed
T. Kano, S. Takamichi, S. Sakti, G. Neubig, T. Toda, S. Nakamura
Machine Translation Vol. 32 ( 4 ) page: 353 - 368 2018.12
-
Back-translation-style data augmentation for end-to-end ASR Reviewed International coauthorship
T. Hayashi, S. Watanabe, Y. Zhang, T. Toda, T. Hori, R. Astudillo, K. Takeda
Proc. IEEE SLT page: 426 - 433 2018.12
-
Improving FFTNet vocoder with noise shaping and subband approaches Reviewed
T. Okamoto, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE SLT page: 304 - 311 2018.12
-
An evaluation of deep spectral mappings and WaveNet vocoder for voice conversion Reviewed
P.L. Tobing, T. Hayashi, Y. Wu, K. Kobayashi, T. Toda
Proc. IEEE SLT page: 297 - 303 2018.12
-
Daily activity recognition based on recurrent neural network using multi-modal signals Reviewed
A. Tamamori, T. Hayashi, T. Toda, K. Takeda
APSIPA Transactions on Signal and Information Processing Vol. 7 ( e21 ) page: 1 - 11 2018.12
-
Self-produced speech enhancement and suppression method using air- and body-conductive microphones Reviewed
M. Takada, S. Seki, T. Toda
Proc. APSIPA ASC page: 1240 - 1245 2018.11
-
Connectionist temporal classification-based sound event encoder for converting sound events into onomatopoeia representations Reviewed
K. Miyazaki, T. Hayashi, T. Toda, K. Takeda
Proc. EUSIPCO page: 857 - 861 2018.9
-
Use of voice conversion in speech translation systems
S. Takamichi, T. Toda
Journal of the Acoustical Society of Japan Vol. 74 ( 9 ) page: 535 - 538 2018.9
-
Designing a pneumatic bionic voice prosthesis - statistical approach for source excitation generation Reviewed International coauthorship
F. Ahmadi, T. Toda
Proc. INTERSPEECH page: 3142 - 3146 2018.9
-
Audio-visual voice conversion using deep canonical correlation analysis for deep bottleneck features Reviewed
S. Tamura, K. Horio, H. Endo, S. Hayamizu, T. Toda
Proc. INTERSPEECH page: 2469 - 2473 2018.9
-
Frequency domain variants of velvet noise and their application to speech processing and synthesis Reviewed
H. Kawahara, K. Sakakibara, M. Morise, H. Banno, T. Toda, T. Irino
Proc. INTERSPEECH page: 2027 - 2031 2018.9
-
Collapsed segment detection and reduction for WaveNet vocoder Reviewed
Y. Wu, K. Kobayashi, T. Hayashi, P.L. Tobing, T. Toda
Proc. INTERSPEECH page: 1988 - 1992 2018.9
-
Multi-Head Decoder for end-to-end speech recognition Reviewed International coauthorship
T. Hayashi, S. Watanabe, T. Toda, K. Takeda
Proc. INTERSPEECH page: 801 - 805 2018.9
-
Anomalous sound event detection based on WaveNet Reviewed
T. Hayashi, T. Komatsu, R. Kondo, T. Toda, K. Takeda
Proc. EUSIPCO page: 2508 - 2512 2018.9
-
Electrolaryngeal speech enhancement with statistical voice conversion based on CLDNN Reviewed
K. Kobayashi, T. Toda
Proc. EUSIPCO page: 2129 - 2133 2018.9
-
Stereophonic music separation based on non-negative tensor factorization with cepstral distance regularization Reviewed
S. Seki, T. Toda, K. Takeda
IEICE Transactions on Fundamentals Vol. E101-A ( 7 ) page: 1057 - 1064 2018.7
-
A spoofing benchmark for the 2018 voice conversion challenge: leveraging from spoofing countermeasures for speech artifact assessment Reviewed International coauthorship
T. Kinnunen, J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, Z. Ling
Proc. Odyssey 2018 page: 187 - 194 2018.6
-
NU voice conversion system for the voice conversion challenge 2018 Reviewed
P.L. Tobing, Y. Wu, T. Hayashi, K. Kobayashi, T. Toda
Proc. Odyssey 2018 page: 219 - 226 2018.6
-
The NU non-parallel voice conversion system for the voice conversion challenge 2018 Reviewed
Y. Wu, P.L. Tobing, T. Hayashi, K. Kobayashi, T. Toda
Proc. Odyssey 2018 page: 211 - 218 2018.6
-
sprocket: open-source voice conversion software Reviewed
K. Kobayashi, T. Toda
Proc. Odyssey 2018 page: 203 - 210 2018.6
-
The voice conversion challenge 2018: promoting development of parallel and nonparallel methods Reviewed International coauthorship
J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, Z. Ling
Proc. Odyssey 2018 page: 195 - 202 2018.6
-
Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential Reviewed
K. Kobayashi, T. Toda, S. Nakamura
Speech Communication Vol. 99 page: 211 - 220 2018.5
-
An investigation of subband WaveNet vocoder covering entire audible frequency range with limited acoustic features Reviewed
T. Okamoto, K. Tachibana, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 5654 - 5658 2018.4
-
Development of "KamiRepo" system with automatic student identification to handle handwritten assignments on LMS Reviewed
S. Seiya, R. Ito, K. Okamoto, U. Tanikawa, S. Ohira, D. Deguchi, T. Toda
Proc. IEEE EDUCON page: 841 - 848 2018.4
-
An investigation of noise shaping with perceptual weighting for WaveNet-based speech generation Reviewed
K. Tachibana, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ICASSP page: 5664 - 5668 2018.4
-
Deep neural network-based power spectrum reconstruction to improve quality of vocoded speech with limited acoustic parameters Reviewed
T. Okamoto, K. Tachibana, T. Toda, Y. Shiga, H. Kawai
Acoustical Science and Technology, Acoustical Letter Vol. 39 ( 2 ) page: 163 - 166 2018.3
-
Introduction to statistical voice conversion software Invited Reviewed
T. Toda, K. Kobayashi
Systems, Control and Information Vol. 62 ( 2 ) page: 69 - 75 2018.2
-
Daily activity recognition with large-scaled real-life recording datasets based on deep neural network using multi-modal signals Reviewed
T. Hayashi, M. Nishida, N. Kitaoka, T. Toda, K. Takeda
IEICE Transactions on Fundamentals Vol. E101-A ( 1 ) page: 199 - 210 2018.1
-
Electrolaryngeal speech modification towards singing aid system for laryngectomees Reviewed
K. Morikawa, T. Toda
Proc. APSIPA ASC page: 1 - 4 2017.12
-
Articulatory controllable speech modification based on statistical inversion and production mappings Reviewed
P.L. Tobing, K. Kobayashi, T. Toda
IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 25 ( 12 ) page: 2337 - 2350 2017.12
-
An investigation of multi-speaker training for WaveNet vocoder Reviewed
T. Hayashi, A. Tamamori, K. Kobayashi, K. Takeda, T. Toda
Proc. IEEE ASRU page: 712 - 718 2017.12
-
Subband WaveNet with overlapped single-sideband filterbanks Reviewed
T. Okamoto, K. Tachibana, T. Toda, Y. Shiga, H. Kawai
Proc. IEEE ASRU page: 698 - 704 2017.12
-
Accurate estimation of fo and aperiodicity based on periodicity detector residuals and deviations of phase derivatives Reviewed
H. Kawahara, K. Sakakibara, M. Morise, H. Banno, T. Toda
Proc. APSIPA ASC page: 1 - 9 2017.12
-
An investigation of how to design control parameters for statistical voice timbre control Reviewed
K. Kubo, K. Kobayashi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. APSIPA ASC page: 1 - 4 2017.12
-
Investigation of effectiveness on recurrent neural network for daily activity recognition using multi-modal signals Invited Reviewed
A. Tamamori, T. Hayashi, T. Toda, K. Takeda
Proc. APSIPA ASC page: 1 - 7 2017.12
-
Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling Reviewed
P.L. Tobing, H. Kameoka, T. Toda
Proc. APSIPA ASC page: 1 - 4 2017.12
-
Duration-controlled LSTM for polyphonic sound event detection Reviewed International coauthorship
T. Hayashi, S. Watanabe, T. Toda, T. Hori, J. Le Roux, K. Takeda
IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 25 ( 11 ) page: 2059 - 2070 2017.11
-
Missing component restoration for masked speech signals based on time-domain spectrogram factorization Reviewed
S. Seki, H. Kameoka, T. Toda, K. Takeda
Proc. IEEE MLSP page: 6 pages 2017.9
-
A vibration control method of an electrolarynx based on statistical F0 pattern prediction Reviewed
K. Tanaka, T. Toda, S. Nakamura
IEICE Transactions on Information and Systems Vol. E100-D ( 9 ) page: 2165 - 2173 2017.9
-
A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation Reviewed
H. Kawahara, K. Sakakibara, M. Morise, H. Banno, T. Toda
Proc. INTERSPEECH page: 424 - 428 2017.8
-
Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization Reviewed
S. Seki, T. Toda, K. Takeda
Proc. EUSIPCO page: 1011 - 1015 2017.8
-
Speech enhancement using non-negative spectrogram models with mel-generalized cepstral regularization Reviewed
L. Li, H. Kameoka, T. Toda, S. Makino
Proc. INTERSPEECH page: 1998 - 2002 2017.8
-
A new cosine series antialiasing function and its application to aliasing-free glottal source models for speech and singing synthesis Reviewed
H. Kawahara, K. Sakakibara, H. Banno, M. Morise, T. Toda, T. Irino
Proc. INTERSPEECH page: 1358 - 1362 2017.8
-
Statistical voice conversion with WaveNet-based waveform generation Reviewed
K. Kobayashi, T. Hayashi, A. Tamamori, T. Toda
Proc. INTERSPEECH page: 1138 - 1142 2017.8
-
Speaker-dependent WaveNet vocoder Reviewed
A. Tamamori, T. Hayashi, K. Kobayashi, K. Takeda, T. Toda
Proc. INTERSPEECH page: 1118 - 1122 2017.8
-
Physically constrained statistical F0 prediction for electrolaryngeal speech enhancement Reviewed
K. Tanaka, H. Kameoka, T. Toda, S. Nakamura
Proc. INTERSPEECH page: 1069 - 1073 2017.8
-
A noise suppression method for body-conducted soft speech based on non-negative tensor factorization of air- and body-conducted signals Reviewed
Y. Tajiri, H. Kameoka, T. Toda
Proc. IEEE ICASSP page: 4960 - 4964 2017.3
-
Preserving word-level emphasis in speech-to-speech translation Reviewed
Q. Truong Do, T. Toda, G. Neubig, S. Sakti, S. Nakamura
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 25 ( 3 ) page: 544 - 556 2017.3
-
BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic sound event detection Reviewed International coauthorship
T. Hayashi, S. Watanabe, T. Toda, T. Hori, J. Le Roux, K. Takeda
Proc. IEEE ICASSP page: 766 - 770 2017.3
-
A pivot translation method that remembers intermediate-language information Reviewed
A. Miura, G. Neubig, S. Sakti, T. Toda, S. Nakamura
Journal of Natural Language Processing Vol. 23 ( 5 ) page: 499 - 528 2016.12
-
Non-native text-to-speech preserving speaker individuality based on partial correction of prosodic and phonetic characteristics Reviewed
Y. Oshima, S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
IEICE Transactions on Information and Systems Vol. E99-D ( 12 ) page: 3132 - 3139 2016.12
-
F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential Reviewed
K. Kobayashi, T. Toda, S. Nakamura
Proc. IEEE SLT page: 693 - 700 2016.12
-
Learning cooperative persuasive dialogue policies using framing Reviewed
T. Hiraoka, G. Neubig, S. Sakti, T. Toda, S. Nakamura
Speech Communication Vol. 84 page: 83 - 96 2016.11
-
Improvements of voice timbre control based on perceived age in singing voice conversion Reviewed
K. Kobayashi, T. Toda, T. Nakano, M. Goto, S. Nakamura
IEICE Transactions on Information and Systems Vol. E99-D ( 11 ) page: 2767 - 2777 2016.11
-
A statistical sample-based approach to GMM-based voice conversion using tied-covariance acoustic models Reviewed
S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
IEICE Transactions on Information and Systems Vol. E99-D ( 10 ) page: 2490 - 2498 2016.10
-
Investigation on recurrent neural network architectures for daily activity recognition Reviewed
A. Tamamori, T. Hayashi, T. Toda, K. Takeda
Proc. UV2016 page: 1 - 4 2016.10
-
Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring Reviewed
Y. Tajiri, T. Toda
Proc. 9th ISCA Speech Synthesis Workshop (SSW9) page: 54 - 60 2016.9
-
Acoustic-to-articulatory inversion mapping based on latent trajectory Gaussian mixture model Reviewed
P.L. Tobing, T. Toda, H. Kameoka, S. Nakamura
Proc. INTERSPEECH page: 953 - 957 2016.9
-
The Voice Conversion Challenge 2016 Reviewed International coauthorship
T. Toda, L.-H. Chen, D. Saito, F. Villavicencio, M. Wester, Z. Wu, J. Yamagishi
Proc. INTERSPEECH page: 1632 - 1636 2016.9
-
The NU-NAIST voice conversion system for the Voice Conversion Challenge 2016 Reviewed
K. Kobayashi, S. Takamichi, S. Nakamura, T. Toda
Proc. INTERSPEECH page: 1667 - 1671 2016.9
-
Model integration for HMM- and DNN-based speech synthesis using Product-of-Experts framework Reviewed
K. Tachibana, T. Toda, Y. Shiga, H. Kawai
Proc. INTERSPEECH page: 2288 - 2292 2016.9
-
A hybrid system for continuous word-level emphasis modeling based on HMM state clustering and adaptive training Reviewed
Q. Truong Do, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. INTERSPEECH page: 3196 - 3200 2016.9
-
Removing noise from event-related potentials using a probabilistic generative model with grouped covariance matrices Reviewed
H. Maki, T. Toda, S. Sakti, G. Neubig, S. Nakamura
Proc. IEEE EMBC page: 1 - 4 2016.8
-
Teaching social communication skills through human-agent interaction Reviewed
H. Tanaka, S. Sakti, G. Neubig, T. Toda, H. Negoro, H. Iwasaka, S. Nakamura
ACM Transactions on Interactive Intelligent Systems Vol. 6 ( 2 ) page: 1 - 23 2016.8
-
Bidirectional LSTM-HMM hybrid system for polyphonic sound event detection Reviewed International coauthorship
T. Hayashi, S. Watanabe, T. Toda, T. Hori, J. Le Roux, K. Takeda
Proc. DCASE2016 workshop page: 1 - 5 2016.8
-
Real-time vibration control of an electrolarynx based on statistical F0 contour prediction Reviewed
K. Tanaka, T. Toda, G. Neubig, S. Nakamura
Proc. EUSIPCO page: 1333 - 1337 2016.8
-
Enhancing event-related potentials based on maximum a posteriori estimation with a spatial correlation prior Reviewed
H. Maki, T. Toda, S. Sakti, G. Neubig, S. Nakamura
IEICE Transactions on Information and Systems Vol. E99-D ( 6 ) page: 1410 - 1419 2016.6
-
A first introduction to voice conversion
T. Toda
Journal of the Acoustical Society of Japan Vol. 72 ( 6 ) page: 324 - 331 2016.6
-
Anti-spoofing for text-independent speaker verification: an initial database, comparison of countermeasures, and human performance Reviewed International coauthorship
Z. Wu, P. De Leon, C. Demiroglu, A. Khodabakhsh, S. King, Z.-H. Ling, D. Saito, B. Stewart, T. Toda, M. Wester, J. Yamagishi
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 24 ( 4 ) page: 768 - 783 2016.4
-
Post-filters to modify the modulation spectrum for statistical parametric speech synthesis Reviewed International coauthorship
S. Takamichi, T. Toda, A.W. Black, G. Neubig, S. Sakti, S. Nakamura
IEEE/ACM Transactions on Audio, Speech and Language Processing Vol. 24 ( 4 ) page: 755 - 767 2016.4
-
Implementation of F0 transformation for statistical singing voice conversion based on direct waveform modification Reviewed
K. Kobayashi, T. Toda, S. Nakamura
Proc. IEEE ICASSP page: 5670 - 5674 2016.3
-
An estimation method of voice timbre evaluation values using feature extraction with Gaussian mixture model based on reference singer Reviewed
S. Yamane, K. Kobayashi, T. Toda, T. Nakano, M. Goto, S. Nakamura
Proc. IEEE ICASSP page: 5265 - 5269 2016.3
-
Statistical F0 prediction for electrolaryngeal speech enhancement considering generative process of F0 contours within product of experts framework Reviewed
K. Tanaka, H. Kameoka, T. Toda, S. Nakamura
Proc. IEEE ICASSP page: 5665 - 5669 2016.3
-
Noise suppression method for body-conducted soft speech enhancement based on external noise monitoring Reviewed
Y. Tajiri, T. Toda, S. Nakamura
Proc. IEEE ICASSP page: 5935 - 5939 2016.3
-
Example based dialogue system based on satisfaction prediction Reviewed
Vol. 31 ( 1 ) page: 1 - 12 2016.1
-
Active learning for example-based dialog systems Reviewed
T. Hiraoka, G. Neubig, K. Yoshino, T. Toda, S. Nakamura
Proc. IWSDS page: 1 - 11 2016.1
-
A dialog system to detect deception Reviewed
Y. Tsunomori, G. Neubig, T. Hiraoka, M. Mizukami, S. Sakti, T. Toda, S. Nakamura
Proc. IWSDS page: 1 - 6 2016.1
-
An error-location selection method for error analysis of machine translation systems Reviewed
K. Akabe, G. Neubig, S. Sakti, T. Toda, S. Nakamura
Journal of Natural Language Processing Vol. 23 ( 1 ) page: 88 - 117 2016.1
-
Improving translation of emphasis with pause prediction in speech-to-speech translation systems Reviewed
Q. Truong Do, S. Sakti, G. Neubig, T. Toda, S. Nakamura
Proc. IWSLT page: 204 - 208 2015.12
-
Semantic parsing of ambiguous input through paraphrasing and verification Reviewed
P. Arthur, G. Neubig, S. Sakti, T. Toda, S. Nakamura
Transactions of the Association for Computational Linguistics Vol. 3 page: 571 - 584 2015.12
-
Adaptive selection from multiple response candidates in example-based dialogue Reviewed
M. Mizukami, H. Kizuki, T. Nomura, G. Neubig, K. Yoshino, S. Sakti, T. Toda, S. Nakamura
Proc. IEEE ASRU page: 784 - 790 2015.12
-
A study of social-affective communication: automatic prediction of emotion triggers and responses in television talk shows Reviewed
N. Lubis, S. Sakti, G. Neubig, K. Yoshino, T. Toda, S. Nakamura
Proc. IEEE ASRU page: 777 - 783 2015.12
-
The NAIST ASR system for the 2015 Multi-Genre Broadcast Challenge: on combination of deep learning systems using a rank-score function Reviewed
Q. Truong Do, M. Heck, S. Sakti, G. Neubig, T. Toda, S. Nakamura
Proc. IEEE ASRU page: 654 - 659 2015.12
-
Incremental sentence compression using LSTM recurrent networks Reviewed International coauthorship
S. Sakti, F. Ilham, G. Neubig, T. Toda, Purwarianti, S. Nakamura
Proc. IEEE ASRU page: 252 - 258 2015.12
-
Aliasing-free implementation of discrete-time glottal source models and their applications to speech synthesis and F0 extractor evaluation Reviewed
H. Kawahara, K. Sakakibara, H. Banno, M. Morise, T. Toda, T. Irino
Proc. APSIPA ASC page: 520 - 529 2015.12
-
Learning to generate pseudo-code from source code using statistical machine translation Reviewed
Y. Oda, H. Fudaba, G. Neubig, H. Hata, S. Sakti, T. Toda, S. Nakamura
Proc. ASE page: 1 - 11 2015.11
-
Pseudogen: a tool to automatically generate pseudo-code from source code Reviewed
H. Fudaba, Y. Oda, K. Akabe, G. Neubig, H. Hata, S. Sakti, T. Toda, S. Nakamura
Proc. ASE page: 1 - 6 2015.11
-
An enhanced electrolarynx with automatic fundamental frequency control based on statistical prediction Reviewed
K. Tanaka, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. ASSETS page: 435 - 436 2015.10
-
Construction and analysis of social-affective interaction corpus in English and Indonesian Reviewed
N. Lubis, S. Sakti, G. Neubig, T. Toda, S. Nakamura
Proc. O-COCOSDA page: 202 - 206 2015.10
-
An investigation of machine translation evaluation metrics in cross-lingual question answering Reviewed
K. Sugiyama, M. Mizukami, G. Neubig, K. Yoshino, S. Sakti, T. Toda, S. Nakamura
Proc. 10th Workshop on Statistical Machine Translation page: 442 - 449 2015.9
-
Preserving word-level emphasis in speech-to-speech translation using linear regression HSMMs Reviewed
D.Q. Truong, S. Takamichi, S. Sakti, G. Neubig, T. Toda, S. Nakamura
Proc. INTERSPEECH page: 3665 - 3669 2015.9
-
Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential Reviewed
P.L. Tobing, K. Kobayashi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. INTERSPEECH page: 3350 - 3354 2015.9
-
Non-audible murmur enhancement based on statistical conversion using air- and body-conductive microphones in noisy environments Reviewed
Y. Tajiri, K. Tanaka, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. INTERSPEECH page: 2769 - 2773 2015.9
-
Statistical singing voice conversion based on direct waveform modification with global variance Reviewed
K. Kobayashi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. INTERSPEECH page: 2754 - 2758 2015.9
-
A latent variable model for joint pause prediction and dependency parsing Reviewed
T.T. Nguyen, G. Neubig, H. Shindo, S. Sakti, T. Toda, S. Nakamura
Proc. INTERSPEECH page: 2719 - 2723 2015.9
-
Speed or accuracy? a study in evaluation of simultaneous speech translation Reviewed
T. Mieno, G. Neubig, S. Sakti, T. Toda, S. Nakamura
Proc. INTERSPEECH page: 2267 - 2271 2015.9
-
Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis Reviewed International coauthorship
S. Takamichi, T. Toda, A.W. Black, S. Nakamura
Proc. INTERSPEECH page: 1206 - 1210 2015.9
-
Non-native speech synthesis preserving speaker individuality based on partial correction of prosodic and phonetic characteristics Reviewed
Y. Oshima, S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. INTERSPEECH page: 299 - 303 2015.9
-
The NAIST text-to-speech system for the Blizzard Challenge 2015 Reviewed
S. Takamichi, K. Kobayashi, K. Tanaka, T. Toda, S. Nakamura
Proc. Blizzard Challenge Workshop page: 1 - 4 2015.9
-
Prosody-controllable HMM-based speech synthesis using speech input Reviewed
Y. Nishigaki, S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
Proc. MLSLP page: 1 - 5 2015.9