I am currently a Professor at Harbin Institute of Technology (Shenzhen). My research focuses on the intersection of Data and AI, with emphases on data science, vector search and indexing, massive data algorithms, and LLM-based data applications.
Prior to joining HIT(SZ), I was a Principal Researcher at Huawei Hong Kong Research Center, conducting research and implementation in the Data+AI domain. I received my Ph.D. from The Chinese University of Hong Kong. Previously, I obtained my Bachelor's degree from the ACM Honored Class at Shanghai Jiao Tong University.
I am always open to collaborations. Please do not hesitate to contact me if you are interested!
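Our group has ongoing openings for Ph.D. students, master's students, postdocs, research assistants (RAs), and interns. Students and researchers interested in Data+AI, vector search, Agent Memory, LLM applications, big data algorithms, and related directions are welcome to contact me!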
My research focuses on algorithms and systems for Big Data Management and Data Science. Current interests include Data+AI analytics algorithms and systems, covering topics such as vector search & indexing, retrieval-augmented generation (RAG), LLM-based applications, graph analytics, and scalable data processing systems.
- Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework.
Yanchen Wu, Tenghui Lin, Yingli Zhou, Fangyuan Zhang, Qintian Guo, Xun Zhou, Sibo Wang, Xilin Liu, Yuchi Ma, Yixiang Fang.
Proceedings of the VLDB Endowment (VLDB), Under review, 2027.
- EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora.
Fangyuan Zhang*, Zhengjun Huang*, Yingli Zhou*, Qintian Guo, et al., Xiaofang Zhou.
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Under review, 2026.
- LiCoMemory: Lightweight and Cognitive Agentic Memory for Efficient Long-Term Reasoning.
Zhengjun Huang, Zhoujin Tian, Qintian Guo, Fangyuan Zhang, et al., Xiaofang Zhou.
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Findings, 2026.
- Breaking the Static Graph: Context-Aware Traversal for Robust Retrieval-Augmented Generation.
Lau Kwun Hang, Fangyuan Zhang, Yingli Zhou, Ruiyuan Zhang, Qintian Guo, Xiaofang Zhou.
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Findings, 2026.
- Concurrent Copy-on-Write to Tree Structures.
Guanhao Hou, Dechuang Chen, Qintian Guo, Fangyuan Zhang, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 2026.
- LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search.
Yujia Chen, Yingli Zhou, Fangyuan Zhang, Cuiyun Gao.
Proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE), 2026. Distinguished Solution Paper.
- BOTBIN: Accelerated Indexing for Structural Graph Clustering on Dynamic Graphs.
Fangyuan Zhang, Qintian Guo, Junhao Gan, Sibo Wang.
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2026.
- Streaming View: An Efficient Data Processing Engine for Modern Real-time Data Warehouse of Alibaba Cloud.
Fangyuan Zhang*, Mengqi Wu*, Chunlei Xu, Yunong Bao, Jiyu Qiao, Yingli Zhou, Hua Fan, Caihua Yin, Wenchao Zhou, Feifei Li.
Proceedings of the VLDB Endowment (VLDB), 18(12): 5153-5165, 2025.
- AnalyticDB-PG: A Cloud-native Real-time Intelligent Data Warehouse.
Fangyuan Zhang*, CaiHua Yin*, Hua Fan, Fenghua Fang, Yineng Chen, Xuqi Wang, Mengqi Wu, Bing Chen, Tianbo Jin, Sibo Wang, Wenchao Zhou, Feifei Li.
Proceedings of the VLDB Endowment (VLDB), 18(12): 5139-5152, 2025.
- Scalable Graph-based Retrieval-Augmented Generation via Locality-Sensitive Hashing.
Fangyuan Zhang, Zhengjun Huang, Yingli Zhou, Qintian Guo, Wensheng Luo, Xiaofang Zhou.
Proceedings of the VLDB Endowment (VLDB), Workshop, 2025.
- Efficient Dynamic Indexing for Range Filtered Approximate Nearest Neighbor Search.
Fangyuan Zhang, Mengxu Jiang, Guanhao Hou, Jieming Shi, Hua Fan, Wenchao Zhou, Feifei Li, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 152:1-152:26, 2025.
- DIGRA: A Dynamic Graph Indexing for Approximate Nearest Neighbor Search with Range Filter.
Mengxu Jiang, Zhi Yang, Fangyuan Zhang, Guanhao Hou, Jieming Shi, Wenchao Zhou, Feifei Li, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 148:1-148:26, 2025.
- Efficient Concurrent Algorithms for Updates to Persistent Binary Search Trees.
Guanhao Hou, Jinchao Huang, Fangyuan Zhang, Sibo Wang.
Proceedings of the VLDB Endowment (VLDB), 18(5): 1481-1494, 2025.
- Efficient Approximation Algorithms for Minimum Cost Seed Selection with Probabilistic Coverage Guarantee.
Chen Feng, Xingguang Chen, Qintian Guo, Fangyuan Zhang, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 2(4), 197:1-197:26, 2025.
- FICOM: An Effective and Scalable Active Learning Framework for GNNs on Semi-supervised Node Classification.
Xingyi Zhang, Jinchao Huang, Fangyuan Zhang, Sibo Wang.
International Journal on Very Large Data Bases (VLDBJ), 33(5): 1723-1742, 2024.
- Scalable Approximate Butterfly and Bi-triangle Counting for Large Bipartite Networks.
Fangyuan Zhang, Dechuang Chen, Sibo Wang, Yin Yang, Junhao Gan.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 259:1-259:26, 2024.
- Efficient Approximation Framework for Attribute Recommendation.
Xingguang Chen, Fangyuan Zhang, Jinchao Huang, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 239:1-239:26, 2024.
- Efficient Algorithm for Budgeted Adaptive Influence Maximization: An Incremental RR-set Update Approach.
Qintian Guo, Chen Feng, Fangyuan Zhang, Sibo Wang.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 207:1-207:26, 2024.
- Efficient Dynamic Weighted Set Sampling and Its Extension.
Fangyuan Zhang, Mengxu Jiang, Sibo Wang.
Proceedings of the VLDB Endowment (VLDB), 17(1): 15-27, 2023.
- Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme.
Guanhao Hou, Qintian Guo, Fangyuan Zhang, Sibo Wang, Zhewei Wei.
Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 1(1): 25:1-25:26, 2023.
- Efficient Approximate Algorithms for Empirical Variance with Hashed Block Sampling.
Xingguang Chen, Fangyuan Zhang, Sibo Wang.
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 157-167, 2022.
- Effective Indexing for Dynamic Structural Graph Clustering.
Fangyuan Zhang, Sibo Wang.
Proceedings of the VLDB Endowment (VLDB), 15(11): 2908-2920, 2022.
- Professor, Harbin Institute of Technology (Shenzhen)
Shenzhen, China, Mar. 2026 – Present.
Leading research on Data+AI, vector search and indexing, massive data algorithms, and LLM-based applications.
- Principal Researcher, Huawei Hong Kong Research Center (HKRC)
Hong Kong SAR, Mar. 2025 – Feb. 2026.
Led research and implementation in the Data+AI domain for GaussDB products. Managed domestic and international research collaborations.
- Research Intern, Alibaba Cloud
Hangzhou, China, Aug. 2024 – Mar. 2025.
Developed hybrid query processing algorithms for vector databases. Contributed to the core engine of AnalyticDB for PostgreSQL.
- Research Intern, Microsoft Research Asia (MSRA)
Beijing, China, Jul. 2020 – Dec. 2020.
Conducted research on privacy-preserving machine learning models. Collaborated with the Microsoft Forms team.
- Security Research Intern, Tencent
Shanghai, China, Jul. 2019 – Sep. 2019.
Competed in Defcon CTF with the A*0*e team. Engineered a RISC-V disassembler plugin for IDA Pro.
- Session Chair: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD).
- Invited Reviewer: SIGKDD, WWW, ICDE, WSDM, DASFAA, PAKDD, TKDE.
- ICSE 2026 Distinguished Solution Paper
- KDD 2025 Outstanding Reviewer
- KDD 2024 Commendable Reviewer
- VLDB Travel Award
- CUHK Postgraduate Scholarship
- Zhiyuan Honorary Scholarship, SJTU, 2017–2021
- Shanghai Jiao Tong University Scholarship (Top 15%), 2017–2021
- Silver Medal, National Olympiad in Informatics (NOI), 33rd
- First Prize, National Olympiad in Informatics in Provinces (NOIP), 21st
- TA of FTEC 4005 (Financial Informatics), 2021 Fall @ CUHK
- TA of FTEC 4003 (Data Mining for FinTech) @ CUHK
- TA of CSCI 2100 (Data Structures), 2022 Spring @ CUHK
- TA of CS221 (Data Structures), 2018 Fall @ SJTU
- Ph.D. in Systems Engineering and Engineering Management
The Chinese University of Hong Kong (CUHK), Hong Kong SAR, Aug. 2021 – Mar. 2025.
Member of the Database Research Group.
- B.E. in Computer Science (ACM Class)
Shanghai Jiao Tong University (SJTU), Shanghai, China, Sep. 2017 – Jun. 2021.
Member of the prestigious ACM Honored Class (Top 5% of CS students).