Kun Zhang (张坤)

Google Scholar  /  GitHub  /  Blog  /  ResearchGate  /  MIRACLE Center

 

Postdoctoral Researcher, University of Science and Technology of China (USTC)
School Email: kkzhang (at) ustc.edu.cn
Motto: Curiosity, Thirst for knowledge, Understanding, Creativity

About Me

I am currently a Postdoctoral Researcher at the Suzhou Institute for Advanced Research, University of Science and Technology of China (USTC), advised by Prof. S. Kevin Zhou (IEEE Fellow) and Prof. Houqiang Li (IEEE Fellow). Before that, I obtained my Ph.D. degree from the Department of Electronic Engineering and Information Science at USTC in 2024, advised by Prof. Yongdong Zhang (IEEE Fellow) and Prof. Zhendong Mao. From 2018 to 2020, I studied in the Department of Automation at USTC, advised by Prof. Shuang Cong. I obtained my B.Eng. degree from the School of Internet of Things Engineering at Jiangnan University in 2018.

My research interests lie broadly in Multimodal Artificial Intelligence and Deep Learning (e.g., vision-language alignment, cross-modal retrieval, report generation, retrieval-augmented generation, and hallucination evaluation). Recently, I have been interested in multimodal large language models in medical scenarios, including explainable disease diagnosis, LLM-based clinical decision making, medical text processing, pathology, and MRI image processing.

We are currently completing a review, Composed Multi-modal Retrieval: A Survey of Approaches and Applications. For more details, please see the CMR page. The repo records and tracks recent Composed Multi-modal Retrieval (CMR) works, including Composed Image Retrieval (CIR), Composed Video Retrieval (CVR), and Composed Person Retrieval (CPR). The survey can be found here.

Explainable artificial intelligence is an important research direction, and the Concept Bottleneck Model (CBM) is a promising current research paradigm. CBMs typically place a layer before the final fully connected classifier in which each neuron corresponds to a human-interpretable concept. CBMs can also improve accuracy through human intervention at test time. We maintain a GitHub repository, the CBM page, to keep pace with this rapidly evolving field.
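As a rough illustration of the CBM idea described above, the following minimal sketch shows a concept layer feeding a linear classifier, plus a test-time intervention that overwrites one concept with an expert-provided value. It is a toy example, not the method of any paper listed here: the weights, the function names (`predict_concepts`, `classify`, `intervene`), and the input features are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_concepts(features, concept_weights):
    """Bottleneck layer: one sigmoid unit per human-readable concept."""
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)))
        for row in concept_weights
    ]


def classify(concepts, class_weights):
    """Final linear classifier that sees only the concept activations."""
    scores = [sum(w * c for w, c in zip(row, concepts)) for row in class_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def intervene(concepts, corrections):
    """Test-time intervention: overwrite selected concepts with expert values."""
    fixed = list(concepts)  # copy, so the original predictions stay untouched
    for index, value in corrections.items():
        fixed[index] = value
    return fixed


# Hypothetical toy setup: 3 input features, 2 concepts, 2 classes.
CONCEPT_WEIGHTS = [[2.0, -1.0, 0.0], [0.0, 1.0, -2.0]]
CLASS_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]

features = [0.1, 0.5, 0.2]
concepts = predict_concepts(features, CONCEPT_WEIGHTS)
before = classify(concepts, CLASS_WEIGHTS)
# An expert asserts that concept 0 is definitely present.
after = classify(intervene(concepts, {0: 1.0}), CLASS_WEIGHTS)
print(before, after)
```

Because the classifier reads only the concept activations, correcting a single concept is enough to change the final prediction in this toy setup, which is the mechanism behind test-time intervention.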

Selected Publications [ Full List ]

2025

DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning
Kun Zhang, Jingyu Li, Zhe Li, S Kevin Zhou.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[ PDF ]
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
Kun Zhang#, Jingyu Li#, Zhe Li#, Jingjing Zhang#, Fan Li, Yandong Liu, Rui Yan, Zihang Jiang, Nan Chen, Lei Zhang, Yongdong Zhang, Zhendong Mao, S Kevin Zhou.
Preprint, 2025
[ PDF ] [ Code ]
MVP-CBM: Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification
Chunjiang Wang, Kun Zhang#, Yandong Liu, Zhiyang He, Xiaodong Tao, S Kevin Zhou#.
International Joint Conference on Artificial Intelligence (IJCAI), 2025
[ PDF ]
MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM
Wenliang Li, Rui Yan, Xu Zhang, Li Chen, Hongji Zhu, Jing Zhao, Junjun Li, Mengru Li, Wei Cao, Zihang Jiang, Wei Wei#, Kun Zhang#, S Kevin Zhou#.
Preprint, 2025
[ PDF ]
A General Knowledge Injection Framework for ICD Coding
Xu Zhang, Kun Zhang#, Wenxin Ma, Rongsheng Wang, Chenxu Wu, Yingtai Li, S Kevin Zhou#.
Association for Computational Linguistics (ACL Findings), 2025
[ PDF ]
KANTrust: A Multi-Omics Framework for Uncertainty-Aware Disease Subtyping
Chunjiang Wang, Rui Yan#, Kun Zhang#, Zihang Jiang, Zhiyang He, Xiaodong Tao, S Kevin Zhou#.
IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2025
Rethinking Pseudo Word Learning in Zero-Shot Composed Image Retrieval: From an Object-Aware Perspective
Zhe Li, Lei Zhang, Kun Zhang, Weidong Chen, Yongdong Zhang and Zhendong Mao.
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
[ PDF ]
Hierarchy-Aware Pseudo Word Learning with Text Adaptation for Zero-Shot Composed Image Retrieval
Zhe Li, Lei Zhang, Zheren Fu, Kun Zhang, Zhendong Mao.
International Conference on Computer Vision (ICCV), 2025
FundusAdapter: few-shot adaptation of fundus image foundation model for fundus image diagnosis
Yifan Chang, Zihang Jiang, Kun Zhang#, S Kevin Zhou#.
Medical Imaging with Deep Learning (MIDL Short), 2025
[ PDF ]
DiffRGenNet: Difference-aware Medical Report Generation
Minghao Bian, Kun Zhang, Dexin Zhao, S Kevin Zhou.
Medical Imaging with Deep Learning (MIDL), 2025
[ PDF ]
SDD-4DGS: Static-Dynamic Aware Decoupling in Gaussian Splatting for 4D Scene Reconstruction
Dai Sun, Huhao Guan, Kun Zhang, Xike Xie, S Kevin Zhou.
Preprint (arXiv), 2025
[ PDF ]

2024

Enhanced Semantic Similarity Learning Framework for Image-Text Matching
Kun Zhang, Bo Hu, Huatian Zhang, Zhe Li, Zhendong Mao.
IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), 2024
[ PDF ] [ Code ]
Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching
Huatian Zhang, Lei Zhang, Kun Zhang, Zhendong Mao.
AAAI Conference on Artificial Intelligence (AAAI), 2024
[ PDF ]
Cascade Semantic Prompt Alignment Network for Image Captioning
Jingyu Li, Lei Zhang, Kun Zhang, Bo Hu, Hongtao Xie, Zhendong Mao.
IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), 2024
[ PDF ] [ Code ]
Improving Image-Text Matching with Bidirectional Consistency of Cross-Modal Alignment
Zhe Li, Lei Zhang, Kun Zhang, Yongdong Zhang, Zhendong Mao.
IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), 2024
[ PDF ]
Fast, Accurate, and Lightweight Memory-Enhanced Embedding Learning Framework for Image-Text Retrieval
Zhe Li, Lei Zhang, Kun Zhang, Yongdong Zhang, Zhendong Mao.
IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), 2024
[ PDF ]
Visual-Linguistic Dependency Encoding for Image-Text Retrieval
Wenxin Guo, Lei Zhang, Kun Zhang, Yi Liu and Zhendong Mao.
Joint International Conference on Computational Linguistics, Language Resources and Evaluation Technology (COLING), 2024
[ PDF ]

2023

Unlocking the Power of Cross-Dimensional Semantic Dependency for Image-Text Matching
Kun Zhang, Lei Zhang, Bo Hu, Mengxiao Zhu, Zhendong Mao.
ACM International Conference on Multimedia (ACM MM), 2023
[ PDF ] [ Code ] [ Blog ]
Unified Adaptive Relevance Distinguishable Attention Network for Image-Text Matching
Kun Zhang, Zhendong Mao, Anan Liu, Yongdong Zhang.
IEEE Transactions on Multimedia (IEEE-TMM), 2023 (ESI Highly Cited Paper)
[ PDF ] [ Blog ] [ Code ]

2022

Negative-Aware Attention Framework for Image-Text Matching
Kun Zhang, Zhendong Mao, Quan Wang, Yongdong Zhang.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[ PDF ] [ Code ] [ Blog ]
Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching
Huatian Zhang, Zhendong Mao, Kun Zhang, Yongdong Zhang.
AAAI Conference on Artificial Intelligence (AAAI), 2022
[ PDF ] [ Blog ] [ Code ]

Before 2022

An Efficient Online Estimation Algorithm with Measurement Noise for Time-varying Quantum States
Kun Zhang, Shuang Cong, Kezhi Li.
Signal Processing (SIGPRO), 2021
[ PDF ]
An Online Optimization Algorithm for Real-time Quantum State Tomography
Kun Zhang, Shuang Cong, Kezhi Li, Tao Wang.
Quantum Information Processing (QIP), 2020
[ PDF ]
An Efficient Online Estimation Algorithm for Evolving Quantum States
Kun Zhang, Shuang Cong, Yaru Tang, Nikolaos M. Freris.
IEEE European Signal Processing Conference (EUSIPCO), 2020
[ PDF ]
Efficient and fast optimization algorithms for quantum state filtering and estimation
Kun Zhang, Shuang Cong, Jiao Ding, Jiaojiao Zhang, Kezhi Li.
International Conference on Intelligent Control and Information Processing (IEEE ICICIP), 2019
[ PDF ]

Awards

  • Jiangsu Funding Program for Excellent Postdoctoral Talent (A) (江苏省卓越博士后A类), 2025
  • President Award of the Chinese Academy of Sciences (中国科学院院长奖), 2024
  • National Scholarship for Doctoral Students (博士生国家奖学金), 2023
  • USTC-SZSE Doctoral Scholarship (中国科大-深交所博士奖学金), 2022
  • National Scholarship for Undergraduate Students (本科生国家奖学金), 2015
  • First-class Academic Scholarship of USTC, 2018/19/21/23
  • 1st Place of Wuxi Internet of Things Maker Competition, 2018
  • 3rd Place of China Artificial Intelligence Society Bo Er Cup Competition, 2018

Projects

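  • Young Scientists Fund, Category C (国家自然科学基金青年项目C类), 2026-2028, Principal Investigator (National Natural Science Foundation of China), 2025
  • Jiangsu Provincial Youth Fund Project (江苏省青年基金项目), 2026-2028, Principal Investigator (Jiangsu Provincial Department of Science and Technology), 2025
  • China Postdoctoral Science Foundation General Program (中国博士后科学基金面上项目), 2024-2026, Principal Investigator (China Postdoctoral Science Foundation), 2024
  • National Key R&D Program of China: Development of a Large-Model-Based Software System for Primary-Care Assisted Diagnosis Robots (国家重点研发计划:基于大模型的基层辅助诊断机器人软件系统开发), 2025-2028, Participant (Ministry of Science and Technology of China), 2025
  • National Key R&D Program of China: Theoretical Knowledge System and Computational Models of Mainstream Values (国家重点研发计划:主流价值观理论知识体系与计算模型), 2021-2024, Participant (Ministry of Science and Technology of China), 2021
  • National Key R&D Program of China: Cross-Media Content Analysis and Dynamic Compositional Production for All-Participant Media (国家重点研发计划:面向全员媒体的内容跨媒体解析与动态组合生产), 2020-2023, Participant (Ministry of Science and Technology of China), 2020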

Academic Activities

Reviewer

  • Served as a reviewer for conferences and journals including CVPR, ICCV, ECCV, ACL, NeurIPS, AAAI, ACM MM, IJCAI, IEEE-TMM, IEEE-TCSVT, KBS, and Pattern Recognition.

    Last Updated in July, 2024
