Senior Researcher

Microsoft Research

Stanford Data Science Fellow

IBM PhD Fellow

MIT Rising Stars in EECS

I am a senior researcher at Microsoft Research in the AI Interaction and Learning group. Previously, I was a postdoctoral fellow in Computer Science at Stanford University and Stanford Data Science, working with Professor Stefano Ermon and Professor Barbara Engelhardt. I obtained my PhD in Computer Science from Washington State University in 2023, where I was advised by Professor Jana Doppa. I work at the intersection of generative models, decision-making, and scientific discovery. My recent work focuses on developing scalable algorithms for LLM alignment, RLHF, preference learning, and diffusion-based optimization, with scientific applications ranging from antibiotic discovery to high-throughput screening. This includes methods that improve sample efficiency in LLMs, a risk-assessment strategy that balances improvement and uncertainty in RLHF/DPO exploration, and preference-guided diffusion models. My recent work also introduced preference datasets to analyze model behavior and reduce hallucinations. Broadly, my expertise spans Bayesian optimization, uncertainty quantification, and efficient reasoning algorithms for sequential decision-making.

Selected Projects

Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF

- Collecting preference data for LLM alignment is challenging and costly, especially in scientific or safety-critical domains where expert input is required. We propose a method to pre-select high-impact preference pairs using Sharpe Ratio–based gradient analysis, reducing annotation costs while improving win rates.
- [Paper (COLM 2025)][Code][SDS Lightning Talk]

Sample-Efficient Preference Alignment in LLMs via Active Exploration

- We tackle the high cost of human feedback in LLM alignment by framing preference learning as an active contextual dueling bandit problem. Our exploration-based algorithm efficiently selects where to query for preferences, with provable regret bounds and strong empirical gains across several LLMs and real-world datasets, including two new benchmarks we introduce: Jeopardy! and Haikus.
- [Paper (COLM 2025)][Code]

Preference-Guided Diffusion for Multi-Objective Offline Optimization

- We introduce a preference-guided diffusion model for offline multi-objective optimization that generates diverse, Pareto-optimal solutions beyond the training data. By guiding generation with a dominance-based classifier and explicitly promoting Pareto diversity, our method offers a novel generative, surrogate-free solution to inverse design problems.
- [Paper (NeurIPS 2025)][Code]
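For readers new to the terminology, the notion of Pareto dominance that the summary above relies on can be illustrated with a minimal sketch. This is only the textbook definition applied to a toy list of designs, not the paper's dominance-based classifier or its diffusion model. It assumes every objective is minimized, and the function names `dominates` and `pareto_front` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` (assumption: all objectives minimized).

    `a` dominates `b` when it is no worse in every objective
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Toy two-objective designs (hypothetical values).
    designs = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
    print(pareto_front(designs))
```

In this toy example, (3.0, 3.0) is dropped because (2.0, 2.0) is better in both objectives; the remaining three designs trade one objective against the other and form the Pareto front.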

Antibiotic Discovery with Novel Mechanisms of Action Using Deep and Generative Models

- We develop deep learning pipelines that combine graph neural networks and diffusion-based generative models to discover antibiotic candidates with novel mechanisms of action. Our models prioritize potency, low toxicity, and structural diversity, enabling both the screening and generation of promising molecules, validated through real-world wet lab experiments.
- [Paper (coming soon)]

Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes

- We propose the first active learning approach tailored to global sensitivity analysis with derivative-based global sensitivity measures (DGSMs), using Gaussian processes to guide costly black-box evaluations. By targeting DGSM uncertainty and information gain, our method substantially boosts sample efficiency in scientific and engineering tasks with limited evaluation budgets.
- [Paper (NeurIPS 2024)][Code][GSA tool code in Ax]
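As background, one common form of a derivative-based global sensitivity measure from the sensitivity-analysis literature is the expected squared partial derivative. The notation here is an assumption made for illustration: $f$ is the black-box function, $x_i$ its $i$-th input, and $\mu$ the input distribution. The paper may use other variants, such as the raw or absolute gradient.

```latex
\nu_i \;=\; \mathbb{E}_{x \sim \mu}\!\left[ \left( \frac{\partial f(x)}{\partial x_i} \right)^{2} \right]
      \;=\; \int \left( \frac{\partial f(x)}{\partial x_i} \right)^{2} \, d\mu(x)
```

A large $\nu_i$ indicates that $f$ changes strongly, on average, as $x_i$ varies. Because the derivative of a Gaussian process is itself a Gaussian process, a GP surrogate yields a posterior over these gradient-based quantities.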

Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization

- We introduce an output space entropy (OSE) search framework for multi-objective Bayesian optimization, selecting experiments that maximize information gain per resource spent. Our approach generalizes across single- and multi-fidelity, constrained, and continuous-fidelity settings, delivering more accurate Pareto fronts with fewer expensive evaluations.
- Related Papers: [MESMO, Code] [MF-OSEMO, Code] [JAIR, Code]

[full list of publications]