Publications
You can also find my articles on my Google Scholar profile.
Research Outlines
My main research interests lie in: i) the optimization, generalization, and personalization of deep learning models, especially under distributed/federated/collaborative setups, informed by deep learning phenomena and mechanistic interpretability; and ii) trustworthy model manipulation for foundation models: understanding and improving foundation models (e.g., large language models, vision transformers, and diffusion transformers) from the model-parameter perspective, i.e., model fusion, editing, pruning, stitching, growth, unlearning, and generation.
- Optimization, Generalization, and Personalization of Deep Learning under Collaboration
- Federated deep learning.
- Generalization and optimization: improving generalization and gaining insights into training dynamics [9, 15].
- Personalization: general personalization [8], personalization under clustered heterogeneity [4], personalization under feature skew [11].
- Robustness: joint problem of noisy labels and non-IID data [5].
- Federated large language models [12, 15].
- Edge-cloud collaborative & domain-transferred machine learning.
- Edge-cloud collaboration in recommender systems [6].
- Transfer learning of regression models [2].
- Domain adaptation [7].
- Train-once-for-all personalization [17].
- Socio-technical issues brought by big data and machine learning.
- Data collaboration and open science [3].
- Big-data-driven medical reform studies [1].
- Model Manipulation for Foundation Models
- Model fusion and linear mode connectivity [10].
- Model editing of large language models [16].
- Model generation: text-to-model generative diffusion transformer [17].
- Model tailor: mitigating catastrophic forgetting in multi-modal large language models [14].
- Neural-collapse-inspired representation learning: i) federated learning [8]; ii) prompt tuning for CLIP [13].
Publications
* indicates equal contributions, # indicates corresponding author.
2024
- [17] Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li*, Lingzhi Gao*, Chao Wu#
preprint. [arxiv]
- [16] WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Peng Wang*, Zexi Li*, Ningyu Zhang#, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen#
NeurIPS 2024. [arxiv]
- [15] Improving Group Connectivity for Generalization of Federated Deep Learning
Zexi Li*, Jie Lin*, Zhiqi Li*, Didi Zhu, Tao Lin#, Chao Wu#
preprint. [arxiv]
- [14] Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Didi Zhu, Zhongyi Sun, Zexi Li, Tao Shen, Ke Yan, Shouhong Ding, Kun Kuang#, Chao Wu#
International Conference on Machine Learning (ICML) 2024. (CCF A, Top Conference in Machine Learning) [arxiv]
- [13] Neural Collapse Anchored Prompt Tuning for Generalizable Vision-Language Models
Didi Zhu, Zexi Li, Min Zhang, Junkun Yuan, Yunfeng Shao, Yinchuan Li, Jiashuo Liu, Kun Kuang, Chao Wu#
ACM KDD 2024. (CCF A, Top Conference in Data Mining) [arxiv]
- [12] OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen
ACM KDD 2024. (CCF A, Top Conference in Data Mining) [arxiv]
2023
- [11] FediOS: Decoupling Orthogonal Subspaces for Personalization in Feature-skew Federated Learning
Lingzhi Gao*, Zexi Li*, Yang Lu, and Chao Wu#
preprint. [arxiv]
- [10] Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion
Zexi Li, Zhiqi Li, Jie Lin, Tao Shen, Tao Lin#, and Chao Wu#
preprint. [arxiv]
- [9] Revisiting Weighted Aggregation in Federated Learning with Neural Networks
Zexi Li, Tao Lin#, Xinyi Shang, and Chao Wu#
International Conference on Machine Learning (ICML) 2023. (CCF A, Top Conference in Machine Learning) [paper][github][arxiv]
- [8] No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier
Zexi Li, Xinyi Shang, Rui He, Tao Lin#, and Chao Wu#
International Conference on Computer Vision (ICCV) 2023. (CCF A, Top Conference in Computer Vision) [paper][arxiv]
- [7] Universal Domain Adaptation via Compressive Attention Matching
Didi Zhu*, Yinchuan Li*, Junkun Yuan, Zexi Li, Kun Kuang#, Chao Wu#
International Conference on Computer Vision (ICCV) 2023. (CCF A, Top Conference in Computer Vision) [paper]
- [6] Edge-cloud Collaborative Learning with Federated and Centralized Features
Zexi Li*, Qunwei Li*, Yi Zhou, Wenliang Zhong#, Guannan Zhang, and Chao Wu#
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2023. (CCF A, Top Conference in Data Mining and Information Retrieval) [paper][arxiv]
- [5] Learning Cautiously in Federated Learning with Noisy and Heterogeneous Clients
Chenrui Wu*, Zexi Li*, Fangxin Wang#, and Chao Wu#
(Oral) IEEE International Conference on Multimedia and Expo (ICME) 2023. (CCF B, Top Conference in Multimedia) [paper][arxiv]
2022
- [4] Towards Effective Clustered Federated Learning: A Peer-to-peer Framework with Adaptive Neighbor Matching
Zexi Li, Jiaxun Lu, Shuang Luo, Didi Zhu, Yunfeng Shao, Yinchuan Li, Zhimeng Zhang, Yongheng Wang#, and Chao Wu#
- [3] Can We Share Models If Sharing Data Is Not an Option?
Zexi Li, Feng Mao#, and Chao Wu#
Patterns, Cell Press. (JCR Q1, IF: 6.5) [paper]
2021
- [2] Boosting the generalization ability of Vis-NIR-spectroscopy-based regression models through dimension reduction and transfer learning
Xiaoli Li*, Zexi Li*, Xufeng Yang, and Yong He#
Computers and Electronics in Agriculture. (JCR Q1, CAS Top, IF: 8.3) [paper]
- [1] An early assessment of the County Medical Community reform in China: a case study of Zhejiang province
Chao Wu, Yixin Tu, Zexi Li, Jianxing Yu#
Journal of Chinese Governance. (SSCI, JCR Q1, IF: 3.0) [paper]