I am currently a second-year PhD student at National Taiwan University, supervised by Prof. Hung-yi Lee, in the Speech Processing and Machine Learning Lab. My research interests include speech foundation models, spoken language models, model compression, neuron analysis, and model merging. I am eager to explore new research areas and am looking for research internships for 2026. If there are any possibilities for research collaboration, please feel free to contact me.

🔥 News

  • 2024.09:  🎉🎉 Two papers accepted at SLT 2024 main conference track. See you in Macao 🇲🇴!
  • 2024.06:  🎉🎉 Two papers accepted at Interspeech 2024. See you in Greece 🇬🇷!
  • 2023.09:  🎉🎉 One paper accepted at ASRU 2023.

📝 Publications

Under Review

Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers

Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang

Project

  • This work proposes evaluating model compression methods with three different metrics: MACs, number of parameters, and real-time factor. We find that different compression methods excel under different metrics.
Under Review

Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization

Tzu-Quan Lin, Wei-Ping Huang, Hao Tang, Hung-yi Lee

  • Speech-FT is a two-stage fine-tuning framework designed for speech representation learning. It improves performance on specific tasks while maintaining cross-task generalization ability.
  • Speech-FT improves HuBERT’s performance on SUPERB by reducing phone error rate from 5.17% to 3.94%, lowering word error rate from 6.38% to 5.75%, and boosting speaker ID accuracy from 81.86% to 84.11%.
Under Review

An Exploration of Mamba for Speech Self-Supervised Models

Tzu-Quan Lin, Heng-Cheng Kuo, Tzu-Chieh Wei, Hsi-Chun Cheng, Chun-Wei Chen, Hsien-Fu Hsiao, Yu Tsao, Hung-yi Lee

  • This work explores Mamba-based HuBERT as a speech SSL model, showing its advantages in long-context and streaming ASR, improved speech unit quality, and competitive performance on probing tasks compared to Transformer-based models.
Under Review

Identifying Speaker Information in Feed-Forward Layers of Self-Supervised Speech Transformers

Tzu-Quan Lin, Hsi-Chun Cheng, Hung-yi Lee, Hao Tang

  • This work identifies speaker-relevant neurons in self-supervised speech Transformers and shows that preserving them during pruning helps maintain performance on speaker-related tasks.
SLT 2024

Property Neurons in Self-Supervised Speech Transformers

Tzu-Quan Lin, Guan-Ting Lin, Hung-yi Lee, Hao Tang

  • In this work, we identify a set of property neurons in the feedforward layers of Transformers to study how speech-related properties, such as phones, gender, and pitch, are stored.
Interspeech 2024

DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

Tzu-Quan Lin, Hung-yi Lee, Hao Tang

  • This work introduces a novel early exit method for speech self-supervised models that enhances the speed of HuBERT with minimal performance loss.
ASRU 2023

MelHuBERT: A Simplified HuBERT on Mel Spectrograms

Tzu-Quan Lin, Hung-yi Lee, Hao Tang

Project

  • MelHuBERT simplifies the model architecture and loss function of HuBERT, achieving comparable performance while saving 33.5% of MACs per second of speech.

📖 Education

  • 2024.07 - now, PhD in Electrical, Electronics, and Communications Engineering (EE), Data Science and Smart Networking, National Taiwan University
  • 2022.07 - 2024.06, Master in Computer Science and Information Engineering (CSIE), Networking and Multimedia, National Taiwan University
  • 2018.09 - 2022.06, Bachelor in Computer Science and Information Engineering (CSIE), National Taiwan University

🏆 Honors and Awards

  • Interspeech 2024 Travel Grant

💻 Internships

  • 2021.07 - 2021.09, aetherAI, Taipei, Taiwan.