Runzhe Wan

Senior Applied Scientist

Core AI, Amazon Inc.

Biography

I am currently a Senior Applied Scientist at Core AI, Amazon. I obtained my Ph.D. in Statistics from North Carolina State University, advised by Dr. Rui Song. Previously, I received my B.S. in Mathematics from Fudan University, China, in May 2017.

My current research interests center on optimal decision-making under uncertainty. Such decisions may have long-term impacts, may need to be personalized, and can be evaluated or learned either through online interactions or from offline data. I am passionate about developing powerful and robust frameworks to evaluate and optimize decisions and policies, with reliable statistical guarantees and efficient numerical algorithms. Accordingly, I have broad interests in Causal Inference (including Causal ML), Optimal (Personalized) Decision Rules, and Online/Offline Bandits & Reinforcement Learning.

At Amazon, I collaborate with multiple product teams on inventing and productionizing advanced causal and RL/bandit methodology to improve critical systems with direct impact on Amazon's business. My current projects span inventory management, delivery speed, markdown algorithms, product selection, and experimentation platforms.

Education
  • Ph.D. in Statistics

    North Carolina State University, 2022

  • B.S. in Mathematics

    Fudan University, 2017

Experience
  • Core AI, Amazon Inc.

    Senior Applied Scientist

    Apr 2024 -- Present

  • Core AI, Amazon Inc.

    Applied Scientist

    Apr 2022 -- Mar 2024

  • Core AI, Amazon Inc.

    Applied/Research Scientist Intern

    May 2020 -- Feb 2022 (part-time during semesters)

  • Bell Labs

    Research Intern

    Jun 2019 -- Aug 2019

Interests
  • Causal Inference
  • Bandits
  • Reinforcement Learning

Research

Publications

*: co-first author

  1. A Review of Reinforcement Learning in Financial Applications
    Bai, Y.*, Gao, Y.*, Wan, R.*, Zhang, S.* and Song, R. (2024). Annual Review of Statistics and Its Application
  2. Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards
    Zhu, J., Wan, R., Qi, Z., Luo, S. and Shi, C. (2024). AISTATS 2024.
  3. Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
    Liu, Y.*, Wan, R.*, McQueen, J., Hains, D., Gu, J. and Song, R. (2024). AAAI 2024. (Oral presentation)
  4. Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring
    Wan, R.*, Liu, Y.*, McQueen, J., Hains, D. and Song, R. (2023). KDD 2023
  5. Multiplier Bootstrap-based Exploration
    Wan, R.*, Wei, H.*, Kveton, B. and Song, R. (2023). ICML 2023
  6. Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
    Wan, R.*, Ge, L.* and Song, R. (2023). AISTATS 2023
  7. Batch Policy Learning in Average Reward Markov Decision Processes
    Liao, P.*, Qi, Z.*, Wan, R., Klasnja, P. and Murphy S. (2022). Annals of Statistics (AoS)
  8. A Multi-Agent Reinforcement Learning Framework for Treatment Effects Evaluation in Two-Sided Markets
    Shi, C., Wan, R., Song, G., Luo, S., Song, R. and Zhu, H. (2022). Annals of Applied Statistics (AOAS)
  9. Mining the Factor Zoo: Estimation of Latent Factor Models with Sufficient Proxies
    Wan, R., Li Y., Lu, W. and Song, R. (2022). Journal of Econometrics (JOE)
    (Won the Best Student Paper Award, B&E Section, American Statistical Association; declined per the one-award-per-year policy)
  10. Safe Exploration for Efficient Policy Evaluation and Comparison
    Wan, R., Kveton, B., and Song, R. (2022). ICML 2022
  11. Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models
    Wan, R., Ge, L. and Song, R. (2021). NeurIPS 2021
  12. Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control
    Wan, R., Zhang, X. and Song, R. (2021). KDD 2021
    (Norman Breslow Young Investigator Award, American Statistical Association )
  13. Deeply-Debiased Off-Policy Interval Estimation
    Shi, C.*, Wan, R.*, Chernozhukov, V. and Song, R. (2021). ICML 2021. (Long oral presentation, acceptance rate 3%)
  14. Pattern Transfer Learning for Reinforcement Learning in Order Dispatching
    Wan, R.*, Zhang, S.*, Shi, C., Luo, S. and Song, R. (2021). IJCAI 2021, RLITS Workshop. (Best Paper Award)
  15. Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
    Shi, C., Wan, R., Song, R., Lu, W. and Leng, L. (2020). ICML 2020

Under Review / Revision

  1. Zero-Inflated Bandits
    Wei, H.*, Wan, R.*, Shi, L. and Song, R.
  2. STEEL: Singularity-aware Reinforcement Learning
    Chen, X., Qi, Z. and Wan, R.
  3. Heterogeneous Synthetic Learner for Panel Data
    Shen, Y., Wan, R., Cai, H. and Song, R.

Internal Publications (Amazon)

  1. Know When to Fold: Futility-aware Early Termination in Online Experiments.
    Wan, R.*, Liu, Y.*, Huang, Y., McQueen, J., Hains, D., Gu, J. and Song, R. (2023). CSS 2024 (Best Paper Award)
  2. Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring.
    Wan, R.*, Liu, Y.*, McQueen, J., Hains, D. and Song, R. (2023). CSS 2023 (Best Paper Award) and AMLC 2023.
  3. Data-driven substitution aware promise tuning model and promise extension experiment
    Giannakakis, I., Svoboda, R., Wan, R.*, Gu, J. and Yao, J. (2023). AMLC 2023
  4. Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
    Liu, Y.*, Wan, R.*, McQueen, J., Hains, D., Gu, J. and Song, R. (2023). Econ Summit 2023
  5. Continuous Monitoring of A/B Tests: A Meta Analysis
    Liu, Y.*, Wan, R.*, McQueen, J., Song, R., Hains, D. and Richardson, T. (2023). AMLC 2023
  6. Deep Inventory Control Policy for Perishables
    Wan, R., Giannakakis, I., Sisikoglu, E., Jiang, T., Goyal, V., Song, R. and Gu, J. (2022). CSS 2022 (Long talk) and AMLC 2022.
  7. Reinforcement Learning for Replaceability Index Estimation and Assortment Optimization.
    Wan, R., Giannakakis, I., Gu, J. and Song, R. (2021). CSS 2021 (Spotlight talk, acceptance rate 4.8%) and AMLC 2021.

Professional Services

Reviewer:

  • Conferences: NeurIPS, ICML, ICLR, UAI, AISTATS, KDD, IJCAI, AAAI, SDM
  • Journals: Annals of Statistics, Journal of the American Statistical Association (T&M), Journal of the American Statistical Association (book review), Journal of the Royal Statistical Society (Series B), EJS, STPA, Transactions on Machine Learning Research (TMLR)

PC member: RLITS Workshop at IJCAI 2021, Online Marketplaces Workshop at KDD 2022/2023