Runzhe Wan

Senior Applied Scientist

Core AI, Amazon Inc.

Biography

I am currently a Senior Applied Scientist at Core AI, Amazon. I obtained my Ph.D. in Statistics from North Carolina State University in Feb. 2022, advised by Dr. Rui Song. Previously, I received my B.S. in Mathematics from Fudan University, China, in May 2017.

My current research interests center around optimal decision-making under uncertainty. Such a decision may involve counterfactual effects, have long-term impacts, need to be personalized, and can be evaluated/learned either during online interactions or from offline data. I am passionate about developing powerful and robust frameworks to evaluate and optimize our decisions and policies, with reliable statistical guarantees and efficient numerical algorithms. Accordingly, I have broad interests in Reinforcement Learning (inc. Bandits), GenAI/LLM, and Causal Inference (inc. Causal ML).

At Amazon, I work in Core AI, Amazon Stores’ central science org that reports directly to the CEO. I work across organizational boundaries to drive high-stakes initiatives that power Amazon Stores’ most critical decision systems. My work has not only delivered over $1B in annual business impact but also shaped the strategic adoption of causal ML and RL methodologies across Amazon’s core platforms, including inventory management, markdown algorithms, product selection, delivery speed optimization, and experimentation platforms.

Education
  • Ph.D. in Statistics

    North Carolina State University, 2022

  • B.S. in Mathematics

    Fudan University, 2017

Experience
  • Core AI, Amazon Inc.

    Senior Applied Scientist

    April.2024 -- Present

  • Core AI, Amazon Inc.

    Applied Scientist

    April.2022 -- March.2024

  • Core AI, Amazon Inc.

    Applied/Research Scientist Intern

    May.2020 -- Feb.2022 (part-time within semesters)

  • Bell Labs

    Research Intern

    Jun.2019 -- Aug.2019

Interests
  • Reinforcement Learning (inc. Bandits)
  • GenAI/LLM
  • Causal Inference (esp. Causal ML)

Research

Publications

* : Equal Contribution

  1. Zero-Inflated Bandits
    Wei, H.*, Wan, R.*, Shi, L. and Song, R. (2025). ICML 2025
    RL
  2. Know When to Fold: Futility-Aware Early Termination in Online Experiments
    Liu, Y.*, Wan, R.*, Huang, Y., McQueen, J., Hains, D., Gu, J., and Song, R. (2025). WWW 2025
    Statistics/ML
  3. Contextual Deep Reinforcement Learning with Adaptive Value-Based Clustering
    Gao, Y., Wan, R., Giannakakis, I., Gu, J., and Song, R. (2025). ICMLT 2025
    RL
  4. Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards
    Zhu, J., Wan, R., Qi, Z., Luo, S. and Shi, C. (2024). AISTATS 2024
    RL Causal
  5. Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
    Liu, Y.*, Wan, R.*, McQueen, J., Hains, D., Gu, J., and Song, R. (2024). AAAI 2024 (Oral Presentation)
    Statistics/ML
  6. Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring
    Wan, R.*, Liu, Y.*, McQueen, J., Hains, D. and Song, R. (2023). KDD 2023
    RL Statistics
  7. Multiplier Bootstrap-based Exploration
    Wan, R.*, Wei, H.*, Kveton, B. and Song, R. (2023). ICML 2023
    RL
  8. A Review of Reinforcement Learning in Financial Applications
    Bai, Y.*, Gao, Y.*, Wan, R.*, Zhang, S.*, and Song, R. (2024). Annual Review of Statistics and Its Application
    RL
  9. Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
    Wan, R.*, Ge, L.* and Song, R. (2023). AISTATS 2023
    RL
  10. Batch Policy Learning in Average Reward Markov Decision Processes
    Liao, P.*, Qi, Z.*, Wan, R., Klasnja, P. and Murphy, S. (2022). Annals of Statistics (AoS)
    RL Causal
  11. A Multi-Agent Reinforcement Learning Framework for Treatment Effects Evaluation in Two-Sided Markets
    Shi, C., Wan, R., Song, G., Luo, S., Song, R. and Zhu, H. (2022). Annals of Applied Statistics (AOAS)
    RL Causal
  12. Mining the Factor Zoo: Estimation of Latent Factor Models with Sufficient Proxies
    Wan, R., Li, Y., Lu, W. and Song, R. (2022). Journal of Econometrics (JOE)
    (Won the Best Student Paper Award, Business & Econ Section, American Statistical Association. Declined following the one-award-per-year policy)
    Statistics

Under Review / Revision

  1. STEEL: Singularity-aware Reinforcement Learning
    Chen, X., Qi, Z. and Wan, R.
    RL
  2. A Review of Causal Decision Making
    Ge, L.*, Cai, H.*, Wan, R.*, Xu, Y.* and Song, R.
    Causal
  3. Heterogeneous Synthetic Learner for Panel Data
    Shen, Y., Wan, R., Cai, H. and Song, R.
    Causal

Internal Publications (Amazon)

  1. Know When to Fold: Futility-aware Early Termination in Online Experiments.
    Wan, R.*, Liu, Y.*, Huang, Y., McQueen, J., Hains, D., Gu, J. and Song, R. (2023). CSS 2024 (Best Paper Award)
    Statistics/ML
  2. Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring.
    Wan, R.*, Liu, Y.*, McQueen, J., Hains, D. and Song, R. (2023). CSS 2023 (Best Paper Award) and AMLC 2023
    RL Statistics/ML
  3. Data-driven substitution aware promise tuning model and promise extension experiment
    Giannakakis, I., Svoboda, R., Wan, R.*, Gu, J. and Yao, J. (2023). AMLC 2023
    Statistics/ML
  4. Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
    Liu, Y.*, Wan, R.*, McQueen, J., Hains, D., Gu, J., and Song, R. (2023). Econ Summit 2023
    Statistics/ML
  5. Continuous Monitoring of A/B Tests: A Meta Analysis
    Liu, Y.*, Wan, R.*, McQueen, J., Song, R., Hains, D. and Richardson, T. (2023). AMLC 2023
    Statistics/ML
  6. Deep Inventory Control Policy for Perishables
    Wan, R., Giannakakis, I., Sisikoglu, E., Jiang, T., Goyal, V., Song, R. and Gu, J. (2022). CSS 2022 (Long talk) and AMLC 2022
    RL
  7. Reinforcement Learning for Replaceability Index Estimation and Assortment Optimization.
    Wan, R., Giannakakis, I., Gu, J. and Song, R. (2021). CSS 2021 (Spotlight talk, rate 4.8%) and AMLC 2021
    RL

Other Contributions

Awards / Honors

Professional Services

Reviewer:

  • Conferences: NeurIPS, ICML, ICLR, UAI, AISTATS, KDD, IJCAI, AAAI, SDM
  • Journals: Annals of Statistics, Journal of the American Statistical Association (T&M), Journal of the American Statistical Association (book review), Journal of the Royal Statistical Society (Series B), EJS, STPA, Transactions on Machine Learning Research (TMLR)

PC member: RLITS Workshop at IJCAI 2021, Online Marketplaces Workshop at KDD 2022/2023