AVERAGING ESTIMATORS OF HETEROGENEOUS TREATMENT EFFECTS UNDER ADDITIVE MODELS

Na Li; Yu Fei; Yuhong Yang; Xinyu Zhang

doi:10.1017/S0266466625100200

Published online by Cambridge University Press: 22 October 2025

Na Li: Kunming University of Science and Technology
Yu Fei: Yunnan University of Finance and Economics
Yuhong Yang*: Tsinghua University and Beijing Institute of Mathematical Sciences and Applications
Xinyu Zhang: Chinese Academy of Sciences and University of Science and Technology of China
*Address correspondence to Yuhong Yang, Yau Mathematical Sciences Center, Tsinghua University, Beijing, China and Beijing Institute of Mathematical Sciences and Applications, Beijing, China, e-mail: yyangsc@tsinghua.edu.cn.

Abstract

We consider spline-based additive models for estimation of conditional treatment effects. To handle the uncertainty due to variable selection, we propose a method of model averaging with weights obtained by minimizing a J-fold cross-validation criterion, in which a nearest neighbor matching is used to approximate the unobserved potential outcomes. We show that the proposed method is asymptotically optimal in the sense of achieving the lowest possible squared loss in some settings and assigning all weight to the correctly specified models if such models exist in the candidate set. Moreover, consistency properties of the optimal weights and model averaging estimators are established. A simulation study and an empirical example demonstrate the superiority of the proposed estimator over other methods.
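The weight-choice step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 1-nearest-neighbor matching on Euclidean distance, weights restricted to the unit simplex, and a Frank-Wolfe solver chosen here only for brevity. The function names `matched_effects` and `cv_weights` are hypothetical, and the fitting of the spline-based additive candidate models and the J-fold split are omitted: `preds` is assumed to already hold each candidate model's held-out effect predictions.

```python
# Illustrative sketch only; names, distance, and optimizer are assumptions.

def matched_effects(X, T, Y):
    """Proxy treatment effect per unit via 1-nearest-neighbor matching.

    The unobserved potential outcome of unit i is approximated by the outcome
    of the closest unit (squared Euclidean distance) in the opposite arm.
    """
    proxies = []
    for i in range(len(Y)):
        opposite = [j for j in range(len(Y)) if T[j] != T[i]]
        j = min(opposite,
                key=lambda k: sum((a - b) ** 2 for a, b in zip(X[i], X[k])))
        treated, control = (Y[i], Y[j]) if T[i] == 1 else (Y[j], Y[i])
        proxies.append(treated - control)
    return proxies


def cv_weights(preds, proxies, iters=500):
    """Simplex weights minimizing sum_i (proxy_i - sum_m w_m * preds[i][m])**2.

    preds[i][m]: assumed held-out (J-fold) effect prediction of candidate
    model m for unit i. Solved by Frank-Wolfe with exact line search.
    """
    n, M = len(preds), len(preds[0])
    w = [1.0 / M] * M  # start from equal weights
    for _ in range(iters):
        resid = [sum(w[m] * preds[i][m] for m in range(M)) - proxies[i]
                 for i in range(n)]
        grad = [2.0 * sum(resid[i] * preds[i][m] for i in range(n))
                for m in range(M)]
        s = min(range(M), key=lambda m: grad[m])  # best simplex vertex
        d = [(1.0 if m == s else 0.0) - w[m] for m in range(M)]
        Pd = [sum(d[m] * preds[i][m] for m in range(M)) for i in range(n)]
        curv = 2.0 * sum(v * v for v in Pd)
        slope = sum(grad[m] * d[m] for m in range(M))
        if curv <= 1e-15 or slope >= -1e-12:
            break  # no descent direction left inside the simplex
        step = min(1.0, -slope / curv)  # exact minimizer along d, capped at 1
        w = [w[m] + step * d[m] for m in range(M)]
    return w


# Toy usage: model 0 tracks the matched proxies, model 1 is a constant.
X = [[0.0], [0.1], [1.0], [1.1]]
T = [1, 0, 1, 0]
Y = [3.0, 1.0, 5.0, 2.0]
proxies = matched_effects(X, T, Y)
preds = [[2.0, 1.0], [2.0, 1.0], [3.0, 1.0], [3.0, 1.0]]
weights = cv_weights(preds, proxies)
```

In this toy example the criterion puts essentially all weight on the first candidate, because its held-out predictions coincide with the matched proxies. This mirrors, in a very loose way, the abstract's statement about weight concentrating on correctly specified models.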

Information

Type: Articles
Journal: Econometric Theory, First View, pp. 1-34

DOI: https://doi.org/10.1017/S0266466625100200
Copyright: © The Author(s), 2025. Published by Cambridge University Press

Footnotes

The authors are very grateful to the editor, associate editor, and two anonymous referees for their constructive comments and suggestions. X.Z.’s work is supported by the National Key Research and Development Program of China (Grant No. 2023YFA1008704) and the National Natural Science Foundation of China (Grant Nos. 72525001 and 72495124). Y.F.’s work is supported by the National Natural Science Foundation of China (Grant No. 12561051) and Yunnan Province XingDian Talent Support Program (Grant No. YNWR-YLXZ-2018-020).

Li et al. supplementary material
