About
Welcome to my homepage! I’m Huy Nguyen, a third-year Ph.D. candidate in the Department of Statistics and Data Sciences at The University of Texas at Austin, where I am fortunate to be advised by Professor Nhat Ho and Professor Alessandro Rinaldo. Before that, I graduated from Ho Chi Minh City University of Science with a Bachelor’s degree in Mathematics (Summa Cum Laude). In Summer 2024, I worked as a research intern at Microsoft AI.
I co-organize the Statistical Machine Learning seminar at UT Austin (StatML@UT).
Email: huynm@utexas.edu
Research Interests
My current research focuses on theoretical foundations of Mixture-of-Experts models. In particular, I investigate the effects of various gating functions (namely the softmax gate, the Top-K sparse softmax gate, the dense-to-sparse gate, the sigmoid gate, etc.) on the convergence of expert estimation in Mixture-of-Experts models. Building on these results, I aim to design novel gating functions and characterize expert networks that improve the efficiency and scalability of Mixture-of-Experts applications, including Large Language Models, Multi-modal Learning, and Parameter-efficient Fine-Tuning. I am also interested in Optimal Transport theory.
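For readers less familiar with this setup, here is a minimal sketch of the model class underlying much of the work below, written in generic notation (the symbols $k$, $\beta_{0i}$, $\beta_{1i}$, $a_i$, $b_i$, $\sigma_i$ are illustrative placeholders rather than the exact parameterization used in any particular paper): a softmax-gated Gaussian mixture of experts with $k$ linear experts models the conditional density of a response $y \in \mathbb{R}$ given an input $x \in \mathbb{R}^d$ as

$$
p(y \mid x) \;=\; \sum_{i=1}^{k} \frac{\exp(\beta_{1i}^{\top} x + \beta_{0i})}{\sum_{j=1}^{k} \exp(\beta_{1j}^{\top} x + \beta_{0j})} \, \mathcal{N}\!\bigl(y \mid a_i^{\top} x + b_i,\; \sigma_i^{2}\bigr),
$$

where the softmax weights act as the gate and each Gaussian component plays the role of an expert. Alternative gates (e.g., the Top-K sparse softmax or sigmoid gate) replace these mixing weights, and the convergence rates of the resulting expert estimates are the central object of study.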
(*) denotes equal contribution; (**) denotes equal advising.
Selected Publications on the Theory of Mixture of Experts
[T.1] On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating. Under review
Huy Nguyen, Thong T. Doan, Quang Pham, Nghi Q. D. Bui, Nhat Ho**, Alessandro Rinaldo**
[T.2] Convergence Rates for Softmax Gating Mixture of Experts. Under review (a preliminary version appeared in the Proceedings of the ICML, 2024)
Huy Nguyen, Nhat Ho**, Alessandro Rinaldo**
[T.3] Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts. Advances in NeurIPS, 2024
Huy Nguyen, Nhat Ho**, Alessandro Rinaldo**
[T.4] Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective. Under review
Huy Nguyen*, Fanqi Yan*, Pedram Akbarian, Nhat Ho**, Alessandro Rinaldo**
[T.5] Demystifying Softmax Gating Function in Gaussian Mixture of Experts. Advances in NeurIPS, 2023 (Spotlight)
Huy Nguyen, TrungTin Nguyen, Nhat Ho
[T.6] Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? Proceedings of the ICML, 2024
Huy Nguyen, Pedram Akbarian, Nhat Ho
[T.7] Statistical Advantages of Perturbing Cosine Router in Mixture of Experts. Proceedings of the ICLR, 2025
Huy Nguyen, Pedram Akbarian*, Trang Pham*, Trang Nguyen*, Shujian Zhang, Nhat Ho
[T.8] Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts. Proceedings of the ICLR, 2024
Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
[T.9] On Expert Estimation in Hierarchical Mixture of Experts: Beyond Softmax Gating Functions. Under review
Huy Nguyen*, Xing Han*, Carl William Harris, Suchi Saria**, Nhat Ho**
[T.10] Quadratic Gating Functions in Mixture of Experts: A Statistical Insight. Under review
Pedram Akbarian*, Huy Nguyen*, Xing Han*, Nhat Ho
[T.11] A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts. Proceedings of the ICML, 2024
Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
[T.12] Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts. In AISTATS, 2024
Huy Nguyen*, TrungTin Nguyen*, Khai Nguyen, Nhat Ho
[T.13] On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts. Under review
Fanqi Yan*, Huy Nguyen*, Dung Le*, Pedram Akbarian, Nhat Ho**, Alessandro Rinaldo**
[T.14] Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts. In AISTATS, 2025
Fanqi Yan*, Huy Nguyen*, Dung Le*, Pedram Akbarian, Nhat Ho
[T.15] On Parameter Estimation in Deviated Gaussian Mixture of Experts. In AISTATS, 2024
Huy Nguyen, Khai Nguyen, Nhat Ho
Selected Publications on the Applications of Mixture of Experts
[A.1] FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion. Advances in NeurIPS, 2024
Xing Han, Huy Nguyen*, Carl Harris*, Nhat Ho, Suchi Saria
[A.2] Mixture of Experts Meets Prompt-Based Continual Learning. Advances in NeurIPS, 2024
Minh Le, An Nguyen*, Huy Nguyen*, Trang Nguyen*, Trang Pham*, Linh Van Ngo, Nhat Ho
[A.3] Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts. Proceedings of the ICLR, 2025
Minh Le*, Chau Nguyen*, Huy Nguyen*, Quyen Tran, Trung Le, Nhat Ho
[A.4] Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning. Under review
Minh Le*, Anh Nguyen*, Huy Nguyen, Chau Nguyen, Nhat Ho
[A.5] RepLoRA: Reparameterizing Low-rank Adaptation via the Perspective of Mixture of Experts. Proceedings of the ICML, 2025
Tuan Truong*, Chau Nguyen*, Huy Nguyen*, Minh Le, Trung Le, Nhat Ho
[A.6] On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation. Proceedings of the ICML, 2025
Nghiem T. Diep*, Huy Nguyen*, Chau Nguyen*, Minh Le, Duy M. H. Nguyen, Daniel Sonntag, Mathias Niepert, Nhat Ho
[A.7] CompeteSMoE – Statistically Guaranteed Mixture of Experts Training via Competition. Under review
Nam V. Nguyen, Huy Nguyen, Quang Pham, Van Nguyen, Savitha Ramasamy, Nhat Ho
Selected Publications on Optimal Transport
[O.1] Entropic Gromov-Wasserstein between Gaussian Distributions. Proceedings of the ICML, 2022
Huy Nguyen*, Khang Le*, Dung Le*, Dat Do, Tung Pham, Nhat Ho
[O.2] On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity. In AISTATS, 2022
Huy Nguyen*, Khang Le*, Khai Nguyen, Tung Pham, Nhat Ho
[O.3] On Robust Optimal Transport: Computational Complexity and Barycenter Computation. Advances in NeurIPS, 2021
Huy Nguyen*, Khang Le*, Quang Minh Nguyen, Tung Pham, Hung Bui, Nhat Ho
[O.4] Fast Approximation of the Generalized Sliced-Wasserstein Distance. IEEE ICASSP, 2024
Huy Nguyen*, Dung Le*, Khai Nguyen*, Trang Nguyen*, Nhat Ho
Recent News
- [May 2025] Two papers on applications of MoE in parameter-efficient fine-tuning are accepted to ICML 2025 ([1], [2]).
- [Jan 2025] Three papers on Mixture of Experts are accepted to ICLR 2025 ([1], [2]) and AISTATS 2025 ([3]).
- [Dec 2024] I was recognized as a top reviewer at NeurIPS 2024. I also advanced to Ph.D. candidacy at UT Austin.
- [Oct 2024] Four new papers on Mixture of Experts are out, [1], [2], [3] and [4].
- [Sep 2024] Three papers on Mixture of Experts, [1], [2] and [3], are accepted to NeurIPS 2024. See you in Vancouver, Canada this December!
- [May 2024] I am starting my research internship at Microsoft AI, where I will work on applications of Mixture of Experts in Large Language Models.
- [May 2024] Three new papers on Mixture of Experts [1], [2] and [3] are out!
- [May 2024] Three papers on Mixture of Experts, [1], [2] and [3], are accepted to ICML 2024.
- [Apr 2024] I was offered the AISTATS 2024 registration grant. See you in Valencia, Spain this May!
- [Mar 2024] I received the ICLR 2024 Travel Award. See you in Vienna, Austria this May!
- [Feb 2024] Two new papers on the applications of Mixture of Experts in Medical Images [1] and Large Language Models [2] are out!
- [Feb 2024] Two new papers on the theory of Mixture of Experts, [1] and [2], are out!
- [Jan 2024] Two papers on Mixture of Experts, [1] and [2], are accepted to AISTATS 2024.
- [Jan 2024] Our paper “Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts” is accepted to ICLR 2024.
- [Dec 2023] Our paper “Fast Approximation of the Generalized Sliced-Wasserstein Distance” is accepted to ICASSP 2024.
- [Oct 2023] I received the NeurIPS 2023 Scholar Award. See you in New Orleans this December!
- [Oct 2023] Our new paper “A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts” is out.
- [Sep 2023] Our new paper “Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts” is out.
- [Sep 2023] We have two papers accepted to NeurIPS 2023, [1] as spotlight and [2] as poster.
- [Jul 2023] We will present the paper “Fast Approximation of the Generalized Sliced-Wasserstein Distance” at the Frontier4LCD workshop, ICML 2023.
- [May 2023] Three new papers on the Mixture of Experts theory are out! See more at [1], [2] and [3].
- [Feb 2023] Our new paper on Mixture Models theory “Minimax Optimal Rate for Parameter Estimation in Multivariate Deviated Models” is out.
Professional Services
- Conference Reviewer: ICML (2022-2025), NeurIPS (2022-2025), AISTATS (2022-2025), ICLR (2024-2025), and AAAI (2025).
- Journal Reviewer: Journal of Machine Learning Research, Electronic Journal of Statistics, Transactions on Machine Learning Research.