About Me
I’m a third-year PhD student at the State Key Laboratory of Pattern Recognition, the University of Chinese Academy of Sciences, advised by Prof. Tieniu Tan. I have also spent time at Microsoft, advised by Prof. Jingdong Wang, and at Alibaba DAMO Academy, working with Prof. Rong Jin. I strongly believe in the power of interdisciplinary collaboration and the potential it holds for driving impactful research outcomes. If you are interested in partnering on research projects or offering internship opportunities or exchange programs, I would be thrilled to connect with you.
I am actively seeking research positions in both industry and academia. My primary research focuses on the training and evaluation of multimodal large-scale models, with particular interest in developing efficient alignment strategies and comprehensive evaluation frameworks for vision-language systems.
Research Highlights
Multimodal Model Training: My work focuses on training advanced multimodal models, with particular emphasis on high-resolution models such as SliME and omni-MLLMs such as the VITA series (Vita, Vita 1.5, Long Vita).
Model Evaluation: I lead and participate in several research projects aimed at improving the evaluation of large-scale models, including MME-RealWorld (ICLR 2025), ErrorRadar (ICLR 2025 Workshop) and MME-Survey. My work focuses on evaluating model performance in real-world environments and on designing evaluation frameworks such as VLMEvalKit [2k+ stars].
Post-Training & Alignment for Multimodal Models: My research includes the development of post-training techniques for multimodal models, such as MM-RLHF and DAMO, as well as surveys of MLLM alignment, to ensure the alignment and ethical usage of large-scale models. These projects examine how to optimize model behavior after initial training to adapt to specific tasks and ensure responsible AI deployment.
The full publication list can be found on Google Scholar.
🔥 News
We have presented a comprehensive survey on the evaluation of large multi-modality models, jointly with the OpenCompass Team and LMMs-Lab 🔥🔥🔥
We have presented Aligning Multimodal LLM with Human Preference: A Survey, which covers RLHF for large multi-modality models. [Reading Notes] 🔥🔥🔥
- 2025.02 🎉🎉 We present MM-RLHF, a comprehensive dataset of 120K fully human-annotated preference samples, along with a robust reward model and training algorithm, designed to enhance MLLM alignment and significantly improve performance across 27 benchmark tasks.
- 2025.01 🎉🎉 MME-RealWorld [Project Page] has been accepted by ICLR 2025.
- 2024.06 🎉🎉 Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models [Code] has been released.
- 2024.03 🎉🎉 Two of our papers, on in-context learning and symbolic reasoning, have been accepted by NAACL 2024!
- 2023.12 🎉🎉 I will be serving as a reviewer for IEEE TIP, International Journal of Computer Vision (IJCV), CVPR 2024, and NAACL 2024.
- 2023.10 🎉🎉 OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling has been accepted by NeurIPS 2023. [Code][Reading Notes]
- 2023.05 🎉🎉 Domain-Specific Risk Minimization for Out-of-Distribution Generalization has been accepted by SIGKDD 2023. [Code][Reading Notes]
- 2023.05 🎉🎉 AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation has been accepted by ICML 2023. [Code] [Reading Notes]
- 2023.01 🎉🎉 Free Lunch for Domain Adversarial Training: Environment Label Smoothing has been accepted by ICLR 2023. [Code] [Reading Notes]
- 2023.01 🎉🎉 Learning Domain Invariant Representations for Generalizable Person Re-Identification has been accepted by IEEE Transactions on Image Processing (T-IP).
- 2022.11 🎉🎉 Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation has been accepted by the NeurIPS ML Safety Workshop. [Code]
- 2022.04 🎉🎉 Towards Principled Disentanglement for Domain Generalization has been selected for a CVPR oral presentation. [Reading Notes] [Code]
Publications
A Study on the Calibration of In-context Learning | NAACL, 2024 |
Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade
OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling | NeurIPS, 2023 | [Code] |
Yi-Fan Zhang, Qingsong Wen, Xue Wang, Weiqi Chen, Liang Sun, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
Domain-Specific Risk Minimization for Out-of-Distribution Generalization | SIGKDD, 2023 | [Code] |
Yi-Fan Zhang, Jindong Wang, Jian Liang, Zhang Zhang, Baosheng Yu, Liang Wang, Dacheng Tao, Xing Xie
AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation | ICML, 2023 | [Code] | [Reading Notes] |
Yi-Fan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
Free Lunch for Domain Adversarial Training: Environment Label Smoothing | ICLR, 2023 | [Code] | [Reading Notes] |
Yi-Fan Zhang, Xue Wang, Jian Liang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
Towards Principled Disentanglement for Domain Generalization | CVPR, 2022 (Oral) | [Code] |
Hanlin Zhang*, Yi-Fan Zhang*, Weiyang Liu, Adrian Weller, Bernhard Schölkopf, Eric P. Xing
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation | NeurIPS ML Safety Workshop, 2022 | [Code] |
Yi-Fan Zhang*, Hanlin Zhang*, Zachary Lipton, Li Erran Li, Eric Xing
Learning Domain Invariant Representations for Generalizable Person Re-Identification | IEEE T-IP, 2023 |
Yi-Fan Zhang, Zhang Zhang, Da Li, Zhen Jia, Liang Wang, Tieniu Tan
Focal and Efficient IOU Loss for Accurate Bounding Box Regression | Neurocomputing, 2022 |
Yi-Fan Zhang, Weiqiang Ren, Zhang Zhang, Zhen Jia, Liang Wang, Tieniu Tan
(* denotes equal contribution.)
Service
💻 Conference: PC Member/Reviewer for:
- ICML (2022,2023,2024,2025), NeurIPS (2022,2023,2024), ICLR (2023,2024,2025), AISTATS (2025)
- ECCV (2022,2024), CVPR (2022,2023,2024), ICCV (2023,2025)
- AAAI (2023,2024), EMNLP (2023,2024), NAACL (2024), ACL (2025)
💻 Journal: PC Member/Reviewer
- IEEE Transactions on Image Processing (TIP)
- International Journal of Computer Vision (IJCV)
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
- Transactions on Machine Learning Research (TMLR)
- IEEE Transactions on Information Forensics & Security (T-IFS)
💻 Workshops: PC Member for MILETS@PAKDD’23, DMLR@ICML’23
🎖 Selected Awards
- AAAI Innovative Applications Award, 2025.
- Top cited paper in Neurocomputing, 2023
- Outstanding Merit Student (三好学生标兵) and National Scholarship, University of Chinese Academy of Sciences, 2023
- Top Ten Model Students of South China University of Technology (Summa Cum Laude), 2020
- Jingtang He Technology Innovation Scholarship (Top 1‰, 5 out of 10,000+ students university-wide), 2020
- Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM), National First Prize (top 1% globally), 2019
📖 Work experience
- Dec 2024 - Now: Research Assistant at Kuaishou.
- Mar 2024 - Dec 2024: Research Assistant at Squirrel AI.
- May 2022 - Dec 2023: Research Assistant at Alibaba DAMO.
- March 2021 - July: Research Assistant at Microsoft Research Asia.