Papers - TODA Tomoki
-
Speaker-aware multi-task learning for speech emotion recognition Reviewed International coauthorship Open Access
X. Shi, X. Li, T. Toda
Proc. INTERSPEECH page: 4333 - 4337 2025.8
-
Advancing emotion recognition via ensemble learning: integrating speech, context, and text representations Reviewed International coauthorship Open Access
X. Shi, J. Mi, X. Li, T. Toda
Proc. INTERSPEECH page: 4693 - 4697 2025.8
-
Comparative analysis of fast and high-fidelity neural vocoders for low-latency streaming synthesis in resource-constrained environments Reviewed Open Access
R. Yoneyama, M. Kawamura, R. Terashima, R. Yamamoto, T. Toda
Proc. INTERSPEECH page: 4888 - 4892 2025.8
-
Who, When, and What: leveraging the "Three Ws" concept for emotion recognition in conversation Reviewed International coauthorship Open Access
X. Shi, X. Li, T. Toda
Proc. INTERSPEECH page: 1763 - 1767 2025.8
-
GST-BERT-TTS: prosody prediction without accentual labels for multi-speaker TTS using BERT with global style tokens Reviewed Open Access
T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai
Proc. INTERSPEECH page: 444 - 448 2025.8
-
Improving electrolaryngeal speech enhancement via a representation learning method based on integrated text and speech representations Reviewed International coauthorship
D. Ma, J. Mi, F. Li, L.P. Violeta, K. Kobayashi, T. Toda
Proc. IEEE EMBC page: 6 pages 2025.7
-
Phoneme-level duration controllable neural text-to-speech with phoneme embedding skip connection and modified Gaussian duration modeling Reviewed Open Access
T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai
IEEE Access Vol. 13 page: 118369 - 118380 2025.7
-
Learning separated representations for instrument-based music similarity Reviewed Open Access
Y. Hashizume, L. Li, A. Miyashita, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 14 ( 1, e16 ) page: 1 - 32 2025.7
-
Pretraining and fine-tuning techniques for electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion Reviewed Open Access
D. Ma, L.P. Violeta, K. Kobayashi, T. Toda
IEEE Transactions on Audio, Speech and Language Processing Vol. 33 page: 3189 - 3201 2025.7
-
Noise and reverberation-controllable voice conversion Reviewed Open Access
Y. Choi, C. Xie, T. Toda
IEEE Transactions on Audio, Speech and Language Processing Vol. 33 page: 2430 - 2443 2025.6
-
PMF-CEC: phoneme-augmented multimodal fusion for context-aware ASR error correction with error-specific selective decoding Reviewed Open Access
J. He, T. Toda
IEEE Transactions on Audio, Speech and Language Processing Vol. 33 page: 2402 - 2417 2025.6
-
Improving anomalous sound detection through pseudo-anomalous set selection and pseudo-label utilization under unlabeled conditions Reviewed Open Access
I. Kuroyanagi, T. Fujimura, K. Takeda, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 14 ( 1, e13 ) page: 1 - 28 2025.6
-
Analysis and extension of noisy-target training for unsupervised target signal enhancement Reviewed Open Access
T. Fujimura, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 14 ( 1, e12 ) page: 1 - 27 2025.6
-
An investigation of noisy-to-noisy voice conversion performance in various noisy conditions Reviewed Open Access
C. Xie, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 14 ( 1, e10 ) page: 1 - 30 2025.6
-
Resolving domain mismatches in electrolaryngeal speech enhancement with linguistic intermediates Reviewed
L.P. Violeta, W.-C. Huang, D. Ma, R. Yamamoto, K. Kobayashi, T. Toda
IEEE Journal of Selected Topics in Signal Processing Vol. 19 ( 5 ) page: 827 - 839 2025.6
-
Sequence-to-sequence voice conversion-based techniques for electrolaryngeal speech enhancement in noisy and reverberant conditions Reviewed International coauthorship Open Access
D. Ma, Y. Choi, T. Fujimura, F. Li, C. Xie, K. Kobayashi, T. Toda
APSIPA Transactions on Signal and Information Processing Vol. 14 ( 1, e8 ) page: 1 - 40 2025.5
-
Fast neural vocoder with fundamental frequency control using finite impulse response filters Reviewed Open Access
Y. Ohtani, T. Okamoto, T. Toda, H. Kawai
IEEE Transactions on Audio, Speech and Language Processing Vol. 33 page: 1893 - 1906 2025.4
-
Predicting fundamental frequency patterns in electrolaryngeal speech using automated phoneme extraction Reviewed Open Access
M. Eshghi, T. Toda
IEEE Access Vol. 13 page: 73831 - 73847 2025.4
-
Generalized sound field interpolation for freely spaced microphone arrays in rotation-robust beamforming Reviewed Open Access
S. Luan, Y. Wakabayashi, T. Toda
Applied Acoustics Vol. 236 ( Article 110706 ) page: 1 - 15 2025.4
-
Mora-level prosody prediction for text-to-speech using Japanese BERT without accentual labels Reviewed
T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai
Proc. IEEE ICASSP page: 1 - 5 2025.4