The lab's primary research applies our expertise in statistics (such as semiparametrics, unconventional likelihood methods, feature selection and post-selection inference) and in statistical learning (such as semi-supervised learning, transfer learning, robust methods and non-convex optimization) to analyze massive, structured data arising in a variety of biomedical studies. The lab is interested in traditional randomized controlled trials, as well as their integration with other modern databases.
The lab is also enthusiastic about interdisciplinary research in clinical trials and observational studies, as well as in a wide range of other disciplines. Currently, the lab is interested in patient-reported outcomes, electronic health records, health services research, pain research, cancer, health disparities, orthopaedics and sports medicine, and dental medicine. The lab always welcomes new collaborations on cutting-edge research.
Dr. Zhao serves or has served as the Principal Investigator of the following grants:
NSF/NIGMS/1953526 and 2122074
Title: A Robust and Efficient Statistical Framework for Handling Missing-Not-At-Random Data in Patient Reported Outcomes and Beyond
Duration: 8/1/2020-7/31/2023
Cost: $599,662
Dr. Zhao serves or has served on the editorial boards of the following journals:
Associate Editor for Stat, 2019-present
Associate Editor for Journal of Nonparametric Statistics, 2017-present
Associate Editor for Statistica Sinica (special issue), 2015-2018
Selected Research Projects
Nonignorable Missingness Can Be Ignored
Using Neyman orthogonality from semiparametric theory, we show that, in a regression model with nonignorable missing data, the estimation procedure does not need a correct model for, or an estimate of, the missingness mechanism; instead, it needs only a working model, which may be arbitrarily misspecified. In this sense, the nonignorable missingness mechanism can be "ignored".
Smoothed Surrogate Loss
How should the zero-one loss be approximated in general? In this paper, we show that, in many problems in precision medicine, the common convex surrogate losses do not work; instead, a nonconvex surrogate loss is needed. In particular, we study the smoothed surrogate loss in the high-dimensional regime. The theoretical results are distinctive and exciting!
Weak Signal Detection And Post-Detection Inference
This paper studies how to account for weak signals, detect important signals, and conduct statistical inference with an informative subsample. The primary application of this work is an HIV-1 drug resistance study.

High-d Variable Selection And Inference With Missing Data
This work considers the general problem of performing variable selection and post-selection inference in high dimensions under an arbitrary missingness mechanism.
Knee Surgery: Have We Been Doing It Wrong?
Cleaning up loose cartilage is not always beneficial, according to our study, whose findings could affect athletes and seniors and reduce health care costs. This work was published in the Journal of Bone & Joint Surgery.