Biography

I received B.Eng. and M.Sc. degrees from Harbin Engineering University in 2012 and 2015, respectively, and a Ph.D. degree in Electronic and Information Engineering from The Hong Kong Polytechnic University in 2022. I am now a postdoctoral fellow at The Hong Kong Polytechnic University. My research interests include speaker recognition and speech deepfake detection.

Education

  • Doctor of Philosophy in Electronic and Information Engineering (Speaker Recognition)
    The Hong Kong Polytechnic University, Hong Kong SAR, Sep. 2018–Apr. 2022
    Thesis title: Deep Speaker Embedding for Robust Speaker Verification
  • Master of Engineering in Underwater Acoustic Engineering (Acoustic Signal Processing)
    Harbin Engineering University, China, Aug. 2012–Mar. 2015
  • Bachelor of Engineering in Electronic and Information Engineering
    Harbin Engineering University, China, Aug. 2008–Jun. 2012

Work Experience

  • Postdoctoral Fellow, The Hong Kong Polytechnic University, Dec. 2022–Present
  • Research Associate, The Hong Kong Polytechnic University, Nov. 2021–Feb. 2022
  • Research Assistant, The Hong Kong Polytechnic University, Oct. 2017–Aug. 2018

Publications

Journal

  • Youzhi Tu, Man-Wai Mak, Kong-Aik Lee, and Weiwei Lin, “ConFusionformer: Locality-enhanced Conformer Through Multi-resolution Attention Fusion for Speaker Verification,” Neurocomputing, vol. 644, 2025.
  • Zezhong Jin, Youzhi Tu, ChongXin Gan, Man-Wai Mak, and Kong-Aik Lee, “Adversarially Adaptive Temperatures for Decoupled Knowledge Distillation With Applications to Speaker Verification,” Neurocomputing, vol. 624, 2025.
  • Youzhi Tu, Man-Wai Mak, and Jen-Tzung Chien, “Contrastive Self-Supervised Speaker Embedding With Sequential Disentanglement,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 2704–2715, 2024.
  • Youzhi Tu, Weiwei Lin, and Man-Wai Mak, “A Survey on Text-Dependent and Text-Independent Speaker Verification,” IEEE Access, vol. 10, pp. 99038–99049, 2022.
  • Youzhi Tu and Man-Wai Mak, “Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 944–957, 2022.
  • Youzhi Tu, Man-Wai Mak, and Jen-Tzung Chien, “Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2013–2024, 2020.

Conference

  • ChongXin Gan, Youzhi Tu, Zezhong Jin, Man-Wai Mak, and Kong-Aik Lee, “Grouped Knowledge Distillation with Adaptive Logit Softening for Speaker Recognition,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025.
  • Zezhong Jin, Youzhi Tu, Zhe Li, Zilong Huang, ChongXin Gan, and Man-Wai Mak, “Denoising Student Features with Diffusion Models for Knowledge Distillation in Speaker Verification,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025.
  • Zezhong Jin, Youzhi Tu, and Man-Wai Mak, “W-GVKT: Within-Global-View Knowledge Transfer for Speaker Verification,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2024, pp. 3779–3783.
  • Zezhong Jin, Youzhi Tu, and Man-Wai Mak, “Self-Supervised Learning with Multi-Head Multi-Mode Knowledge Distillation for Speaker Verification,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2024, pp. 4723–4727.
  • Youzhi Tu, Man-Wai Mak, and Jen-Tzung Chien, “Contrastive Speaker Embedding with Sequential Disentanglement,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024, pp. 10891–10895.
  • Lishi Zuo, Man-Wai Mak, and Youzhi Tu, “Promoting Independence of Depression and Speaker Features for Speaker Disentanglement in Speech-Based Depression Detection,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024, pp. 10191–10195.
  • Weiwei Lin, ChenHang He, Man-Wai Mak, and Youzhi Tu, “Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations,” in Proc. International Conference on Machine Learning (ICML), 2023, pp. 21065–21077.
  • Youzhi Tu and Man-Wai Mak, “Mutual Information Enhanced Training for Speaker Embedding,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2021, pp. 91–95.
  • Youzhi Tu and Man-Wai Mak, “Short-time Spectral Aggregation for Speaker Embedding,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021, pp. 6708–6712.
  • Youzhi Tu, Man-Wai Mak, and Jen-Tzung Chien, “Information Maximized Variational Domain Adversarial Learning for Speaker Verification,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020, pp. 6449–6453.
  • Youzhi Tu, Man-Wai Mak, and Jen-Tzung Chien, “Variational Domain Adversarial Learning for Speaker Verification,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 4315–4319.