Peihan Liu

CS PhD student, Columbia · peihanliu@cs.columbia.edu · Scholar

I'm a second-year CS PhD student in the Columbia theory group, advised by Rachel Cummings and Roxana Geambasu. I work on trustworthy machine learning — particularly the theoretical and empirical foundations of privacy, fairness, and modern machine learning.

Previously, I received my M.Eng. in CSE from Harvard, where I worked on algorithmic fairness with Cynthia Dwork and Juan Perdomo. Before that, I earned B.S. degrees in Mathematics and Statistics, with high honors and high distinction, from the University of Michigan, where I worked with Martin Strauss, Ranjan Pal, Shizhang Li, and Nuh Aydin on algorithmic fairness, simplicial algebra, and algebraic coding theory.

Beyond research, I walk my dog and (used to) play poker.

Experience

Fall 2025
Student Researcher, Google
hosted by Alex Bie and Lily Tsai
Summer 2025
Student Researcher, Google
hosted by Alex Bie and Lily Tsai
Summer 2023
Student Intern, OpenDP
supervised by Salil Vadhan and Wanrong Zhang

Publications

See Google Scholar for the up-to-date list.

ContinuousBench: Can Differentially Private Synthetic Text Improve Capabilities?
Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva, Gautam Kamath, Rachel Cummings, Roxana Geambasu, Yu Gan, Lillian Tsai, Alex Bie
preprint, 2026. [arXiv] [code] [dataset]
Behavior Cloning is Not All You Need: The Optimality of On-Policy Distillation for Noisy Expert Feedback
Ved Sriraman, Peihan Liu, Daniel Hsu, Adam Block
preprint, 2026. [arXiv] [code]
Adaptive Target-Charging with Privacy Filters and Individual Accounting
Peihan Liu, Alison Caulfield, Mark Chen, Rachel Cummings, Roxana Geambasu, Mathias Lécuyer, Pierre Tholoniat
preprint, 2026. [arXiv] [code]
Privately Fine-Tuned LLMs Preserve Temporal Dynamics in Tabular Data
Lucas Rosenblatt, Peihan Liu, Ryan McKenna, Natalia Ponomareva
ICML, 2026. [arXiv]
Convex Relaxation for Solving Large-Margin Classifiers in Hyperbolic Space
Sheng Yang, Peihan Liu, Cengiz Pehlevan
TMLR, 2024. [arXiv]
Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training
Alyssa Huang, Peihan Liu, Ryumei Nakada, Linjun Zhang, Wanrong Zhang
preprint, 2024. [arXiv]

Blog

2026-05-14
Release of ContinuousBench
Does DP synthetic text actually transfer knowledge? A new benchmark says no, even at ε = 100.