Paper publications led, supervised or (co-)authored

Scalable RLXF/RL/Agentic Alignment

  • Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang, AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways, ACM Computing Surveys, paper, 202501.
  • Yusen Wu, Li Jiang, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng, Hummer: Towards Limited Competitive Preference Dataset, paper, Conference on Language Modeling (COLM), 2024.
  • T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu, Q Zhang, Z Qiu, P Li, Z Tan, Junwu Xiong and others, Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems, paper, 202401. Submitted to ACM Computing Surveys, 202401.

Multi-modal Reinforcement Learning

  • Junwu Xiong, Xiaoyun Feng, YunZhou Shi, James Zhang, Zhongzhou Zhao, and Wei Zhou. Digital human interactive recommendation decision-making based on reinforcement learning. NeurIPS 2022 Workshop on Human in the Loop Learning, paper, 2022.

Multi-agent Reinforcement Learning

  • Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, and Le Song. Variational Policy Propagation for Multi-agent Reinforcement Learning, paper, 2020.
  • Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, and Junwu Xiong. Value propagation for decentralized networked deep multi-agent reinforcement learning. Advances in Neural Information Processing Systems, 32, paper, 2019.

Game Theory and Reinforcement Learning

  • Romain Lopez, Chenchen Li, Xiang Yan, Junwu Xiong, Michael Jordan, Yuan Qi, and Le Song. Cost-effective incentive allocation via structured counterfactual inference. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 4997–5004, paper, 2020.
  • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, and Junwu Xiong. Latent Dirichlet Allocation for Internet Price War. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 639–646, paper, 2019.

Deep Reinforcement Learning

  • Xiaoyu Tan, Chao Qu, Junwu Xiong, James Zhang, Xihe Qiu, and Yaochu Jin. Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence, paper, 202403.
  • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, and Junwu Xiong. Reinforcement learning for uplift modeling, paper, 2018.

Deep Learning

  • Huiru Xiao, Caigao Jiang, Yangqiu Song, James Zhang, and Junwu Xiong. Unit ball model for embedding hierarchical structures in the complex hyperbolic space, paper, 2021.
  • Tong Yin, Xiaotie Deng, Yuan Qi, Wei Chu, Jing Pan, Xiang Yan, and Junwu Xiong. Personalized behavior prediction with encoder-to-decoder structure. In 2018 IEEE International Conference on Networking, Architecture and Storage (NAS), pages 1–10. IEEE, paper, 2018.

Scalable Wireless Sensor Networks

  • Huan Li, Yanlei Liu, Weifeng Chen, Weijia Jia, Bing Li, and Junwu Xiong. COCA: Constructing optimal clustering architecture to maximize sensor network lifetime. Computer Communications, 36(3):256–268, paper, 2013.
  • Huan Li, Jierui Cao, and Junwu Xiong. Constructing optimal clustering architecture for maximizing lifetime in large scale wireless sensor networks. In 2009 15th International Conference on Parallel and Distributed Systems, pages 182–189. IEEE, paper, 2009.