👨‍🎓 About Me
I am Xiaoteng Ma, a postdoctoral researcher in the Department of Automation at Tsinghua University. I currently work closely with the MIG lab led by Prof. Chongjie Zhang (now at the McKelvey School of Engineering, Washington University in St. Louis). Before this role, I obtained my Ph.D. in June 2023. During my doctoral studies, I was a member of CFINS, supervised by Prof. Qianchuan Zhao and Prof. Li Xia (now at the Business School of Sun Yat-Sen University). I received my Bachelor of Engineering degree from the Department of Automation at Xi’an Jiaotong University in 2017.
My primary research interest is data-driven decision making, particularly reinforcement learning, multi-agent systems, and large language model-based agents. I am also passionate about applying data-driven decision-making algorithms to real-world problems in robotics, finance, and power systems. Please feel free to contact me for further information or collaboration opportunities.
📝 Publications
Conference Paper
- Single-Trajectory Distributionally Robust Reinforcement Learning. Zhipeng Liang*, Xiaoteng Ma*, Jose Blanchet, Jiheng Zhang, Zhengyuan Zhou. International Conference on Machine Learning (ICML), 2024.
- Efficient Multi-agent Reinforcement Learning by Planning. Qihan Liu*, Jianing Ye*, Xiaoteng Ma*, Jun Yang, Bin Liang, Chongjie Zhang. International Conference on Learning Representations (ICLR), 2024.
- SEABO: A Simple Search-Based Method for Offline Imitation Learning. Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu. International Conference on Learning Representations (ICLR), 2024.
- Learning Diverse Risk Preferences in Population-based Self-play. Yuhua Jiang*, Qihan Liu*, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao. AAAI Conference on Artificial Intelligence (AAAI), 2024. (Oral)
- Cross-Domain Policy Adaptation via Value-Guided Data Filtering. Kang Xu, Chenjia Bai, Xiaoteng Ma, Dong Wang, Bin Zhao, Zhen Wang, Xuelong Li, Wei Li. Advances in Neural Information Processing Systems (NeurIPS), 2023.
- Uncertainty-driven Trajectory Truncation for Model-based Offline Reinforcement Learning. Junjie Zhang*, Jiafei Lyu*, Xiaoteng Ma, Jiangpeng Yan, Jun Yang, Le Wan, Xiu Li. European Conference on Artificial Intelligence (ECAI), 2023.
- What Is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang. International Conference on Machine Learning (ICML), 2023.
- Mildly Conservative Q-Learning for Offline Reinforcement Learning. Jiafei Lyu*, Xiaoteng Ma*, Xiu Li, Zongqing Lu. Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight)
- RORL: Robust Offline Reinforcement Learning via Conservative Smoothing. Rui Yang*, Chenjia Bai*, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han. Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight)
- Exploiting Reward Shifting in Value-Based Deep RL. Hao Sun, Lei Han, Rui Yang, Xiaoteng Ma, Jian Guo, Bolei Zhou. Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Offline Reinforcement Learning with Value-based Episodic Memory. Xiaoteng Ma*, Yiqin Yang*, Hao Hu*, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang. International Conference on Learning Representations (ICLR), 2022.
- Efficient Continuous Control with Double Actors and Regularized Critics. Jiafei Lyu*, Xiaoteng Ma*, Jiangpeng Yan, Xiu Li. AAAI Conference on Artificial Intelligence (AAAI), 2022. (Oral)
- Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning. Yiqin Yang*, Xiaoteng Ma*, Chenghao Li, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang and Qianchuan Zhao. Advances in Neural Information Processing Systems (NeurIPS), 2021. (Spotlight)
- Average-Reward Reinforcement Learning with Trust Region Methods. Xiaoteng Ma, Xiaohang Tang, Jun Yang, Li Xia, Qianchuan Zhao. International Joint Conference on Artificial Intelligence (IJCAI), 2021.
- Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning. Xiaoteng Ma*, Yiqin Yang*, Chenghao Li*, Yiwen Lu, Qianchuan Zhao and Jun Yang. International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2021.
- Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration. Ming Zhang, Yawei Wang, Xiaoteng Ma, Li Xia, Jun Yang, Zhiheng Li and Xiu Li. IEEE Data Driven Control and Learning Systems Conference (DDCLS), 2020.
- Fairness Control of Traffic Light via Deep Reinforcement Learning. Chenghao Li, Xiaoteng Ma, Li Xia, Qianchuan Zhao and Jun Yang. IEEE International Conference on Automation Science and Engineering (CASE), 2020.
- Bi-level Proximal Policy Optimization for Stochastic Coordination of EV Charging Load with Uncertain Wind Power. Teng Long, Xiaoteng Ma, Qing-Shan Jia. IEEE Conference on Control Technology and Applications (CCTA), 2019.
- Attendance and security system based on building video surveillance. Kailai Sun, Qianchuan Zhao, Jianhong Zou, Xiaoteng Ma. International Conference on Smart City and Intelligent Building (ICSCIB), 2018.
Journal Paper
- CVaR-Constrained Policy Optimization for Safe Reinforcement Learning. Qiyuan Zhang, Shu Leng, Xiaoteng Ma, Qihan Liu, Xueqian Wang, Bin Liang, Yu Liu, Jun Yang. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024.
- A unified algorithm framework for mean-variance optimization in discounted Markov decision processes. Shuai Ma, Xiaoteng Ma, Li Xia. European Journal of Operational Research (EJOR), 2023.
- Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning. Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao. Journal of Artificial Intelligence Research (JAIR), 2022.
- MPSN: Motion-aware Pseudo-Siamese Network for Indoor Video Head Detection in Buildings. Kailai Sun*, Xiaoteng Ma*, Peng Liu*, Qianchuan Zhao. Building and Environment (BAE), 2022.
- An optimistic value iteration for mean–variance optimization in discounted Markov decision processes. Shuai Ma, Xiaoteng Ma, Li Xia. Results in Control and Optimization (RICO), 2021.
- Learning to Discover Task-Relevant Features for Interpretable Reinforcement Learning. Qiyuan Zhang, Xiaoteng Ma, Yiqin Yang, Chenghao Li, Jun Yang, Yu Liu and Bin Liang. IEEE Robotics and Automation Letters (RA-L), 2021.
- Reinforcement learning for fluctuation reduction of wind power with energy storage. Zhen Yang, Xiaoteng Ma, Li Xia, Qianchuan Zhao and Xiaohong Guan. Results in Control and Optimization (RICO), 2021.
Preprint
- DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning. Xiaoteng Ma, Li Xia, Zhengyuan Zhou, Jun Yang and Qianchuan Zhao.
- Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation. Xiaoteng Ma*, Zhipeng Liang*, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou.
👨🏽‍🤝‍👨🏼 Collaborators
- Qianchuan Zhao - Professor, Department of Automation, Tsinghua University
- Li Xia - Professor, Business School, Sun Yat-Sen University
- Zhengyuan Zhou - Assistant Professor, Stern School of Business, New York University
- Gao Huang - Assistant Professor, Department of Automation, Tsinghua University
- Chongjie Zhang - Assistant Professor, Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University
- Zhipeng Liang - Ph.D. Student, Department of Industrial Engineering and Decision Analytics (IEDA), Hong Kong University of Science and Technology
- Rui Yang - Ph.D. Student, Department of Computer Science and Engineering (CSE), Hong Kong University of Science and Technology
- Jiafei Lyu - Ph.D. Student, Tsinghua Shenzhen International Graduate School, Tsinghua University
🥇 Honors and Awards
- 2022.10 Tsinghua Comprehensive Scholarship
- 2021.10 Tsinghua Comprehensive Scholarship
- 2015.10 Xi’an Jiaotong University Outstanding Student (Undergraduate) (Top 10)
- 2015.10 National Scholarship (Undergraduate) (Top 1%)
- 2014.10 National Scholarship (Undergraduate) (Top 1%)
📖 Education
- 2017.09 - 2023.06, Ph.D., Department of Automation, Tsinghua University.
- 2013.09 - 2017.06, Bachelor, Department of Automation, Xi’an Jiaotong University.
💻 Internships
- 2018.09 - 2018.11, SenseTime, Beijing.
- 2017.07 - 2017.08, Institute of Automation, CAS, Beijing.
📞 Contact
Xiaoteng Ma
Department of Automation
Tsinghua University
FIT Building 1-109
Beijing, China, 100084
E-mail: pony[DOT]xtma[AT]gmail[DOT]com