About Me

I’m a third-year PhD student at the State Key Laboratory of Pattern Recognition, University of Chinese Academy of Sciences, advised by Prof. Tieniu Tan. I have also spent time at Microsoft, advised by Prof. Jingdong Wang, and at Alibaba DAMO Academy, working with Prof. Rong Jin. I strongly believe in the power of interdisciplinary collaboration and its potential to drive impactful research. If you are interested in partnering on research projects or in offering internship opportunities or exchange programs, I would be thrilled to connect with you.

I am actively seeking research positions in both industry and academia. My primary research focuses on the training and evaluation of multimodal large-scale models, with particular interest in developing efficient alignment strategies and comprehensive evaluation frameworks for vision-language systems.

Research Highlights

  1. Multimodal Model Training: My work focuses on training advanced multimodal models, with particular emphasis on high-resolution models such as SliME and omni-MLLMs such as the VITA series (VITA, VITA-1.5, and Long-VITA).

  2. Model Evaluation: I lead and participate in several research projects aimed at improving the evaluation of large-scale models, including MME-RealWorld (ICLR 25), ErrorRadar (ICLR 25 Workshop), and MME-Survey. My work focuses on evaluating model performance in real-world environments and on designing evaluation frameworks such as VLMEvalKit [2k+ stars].

  3. Post-Training & Alignment for Multimodal Models: My research includes developing post-training techniques for multimodal models, such as MM-RLHF and DAMO, as well as surveys of MLLM alignment, with the goal of ensuring the alignment and ethical use of large-scale models. These projects examine how to optimize model behavior after initial training to adapt to specific tasks and support responsible AI deployment.

The full publication list can be found on Google Scholar.

🔥 News

We have presented a comprehensive survey on the evaluation of large multi-modality models, jointly with the OpenCompass team and LMMs-Lab 🔥🔥🔥

We have presented Aligning Multimodal LLM with Human Preference: A Survey, a survey on RLHF for large multi-modality models | [Reading Notes] 🔥🔥🔥

Publications

A Study on the Calibration of In-context Learning | NAACL, 2024 |

Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade

OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling | NeurIPS, 2023 | [Code] |

Yi-Fan Zhang, Qingsong Wen, Xue Wang, Weiqi Chen, Liang Sun, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Domain-Specific Risk Minimization for Out-of-Distribution Generalization | SIGKDD, 2023 | [Code] |

Yi-Fan Zhang, Jindong Wang, Jian Liang, Zhang Zhang, Baosheng Yu, Liang Wang, Dacheng Tao, Xing Xie

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation | ICML, 2023 | [Code] | [Reading Notes] |

Yi-Fan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Free Lunch for Domain Adversarial Training: Environment Label Smoothing | ICLR, 2023 | [Code] | [Reading Notes] |

Yi-Fan Zhang, Xue Wang, Jian Liang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Towards Principled Disentanglement for Domain Generalization | CVPR, 2022 (Oral) | [Code] |

Hanlin Zhang*, Yi-Fan Zhang*, Weiyang Liu, Adrian Weller, Bernhard Schölkopf, Eric P. Xing

Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation | NeurIPS ML Safety Workshop, 2022 | [Code] |

Yi-Fan Zhang*, Hanlin Zhang*, Zachary Lipton, Li Erran Li, Eric Xing

Learning Domain Invariant Representations for Generalizable Person Re-Identification | IEEE T-IP, 2023 |

Yi-Fan Zhang, Zhang Zhang, Da Li, Zhen Jia, Liang Wang, Tieniu Tan

Focal and Efficient IOU Loss for Accurate Bounding Box Regression | Neurocomputing, 2022 |

Yi-Fan Zhang, Weiqiang Ren, Zhang Zhang, Zhen Jia, Liang Wang, Tieniu Tan

(* denotes equal contribution.)

Service

💻 Conference: PC Member/Reviewer for:

  • ICML (2022,2023,2024,2025), NeurIPS (2022,2023,2024), ICLR (2023,2024,2025), AISTATS (2025)
  • ECCV (2022,2024), CVPR (2022,2023,2024), ICCV (2023,2025)
  • AAAI (2023,2024), EMNLP (2023,2024), NAACL (2024), ACL (2025)

💻 Journal: PC Member/Reviewer

  • IEEE Transactions on Image Processing (TIP)
  • International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • Transactions on Machine Learning Research (TMLR)
  • IEEE Transactions on Information Forensics & Security (T-IFS)

💻 Workshops: PC Member for MILETS@PAKDD’23, DMLR@ICML’23

🎖 Selected Awards

  • AAAI Innovative Applications Award, 2025
  • Top cited paper in Neurocomputing, 2023
  • Model Merit Student and National Scholarship, University of Chinese Academy of Sciences, 2023
  • Top Ten Model Students of South China University of Technology (Summa Cum Laude), 2020
  • Jingtang He Technology Innovation Scholarship (Top 1‰, 5 out of 10000+ in university), 2020
  • Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM), National First Prize (Top 1% globally), 2019

📖 Work experience

  • Dec 2024 - Present: Research Assistant at Kuaishou.
  • Mar 2024 - Dec 2024: Research Assistant at Squirrel AI.
  • May 2022 - Dec 2023: Research Assistant at Alibaba DAMO Academy.
  • Mar 2021 - July: Research Assistant at Microsoft Research Asia.
