AVERAGING ESTIMATORS OF HETEROGENEOUS TREATMENT EFFECTS UNDER ADDITIVE MODELS

Published online by Cambridge University Press: 22 October 2025

Na Li
Affiliation:
Kunming University of Science and Technology
Yu Fei
Affiliation:
Yunnan University of Finance and Economics
Yuhong Yang*
Affiliation:
Tsinghua University and Beijing Institute of Mathematical Sciences and Applications
Xinyu Zhang
Affiliation:
Chinese Academy of Sciences and University of Science and Technology of China
*Address correspondence to Yuhong Yang, Yau Mathematical Sciences Center, Tsinghua University, Beijing, China and Beijing Institute of Mathematical Sciences and Applications, Beijing, China, e-mail: yyangsc@tsinghua.edu.cn.

Abstract

We consider spline-based additive models for estimation of conditional treatment effects. To handle the uncertainty due to variable selection, we propose a method of model averaging with weights obtained by minimizing a J-fold cross-validation criterion, in which a nearest neighbor matching is used to approximate the unobserved potential outcomes. We show that the proposed method is asymptotically optimal in the sense of achieving the lowest possible squared loss in some settings and assigning all weight to the correctly specified models if such models exist in the candidate set. Moreover, consistency properties of the optimal weights and model averaging estimators are established. A simulation study and an empirical example demonstrate the superiority of the proposed estimator over other methods.
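
The weighting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the paper: an additive polynomial basis stands in for the splines, matching uses a single nearest neighbor in Euclidean distance over the full sample, each candidate model is fitted separately in the treated and control arms, and the simplex-constrained least squares problem is solved with a Frank-Wolfe iteration. All function names (`nn_match_effects`, `basis`, `fit_cate`, `cv_weights`) and the synthetic data are hypothetical.

```python
import numpy as np

def nn_match_effects(X, y, t):
    """Proxy effect per unit: own outcome versus its nearest opposite-arm neighbor."""
    tau = np.empty(len(y))
    for i in range(len(y)):
        other = np.flatnonzero(t != t[i])
        j = other[np.argmin(np.sum((X[other] - X[i]) ** 2, axis=1))]
        tau[i] = y[i] - y[j] if t[i] == 1 else y[j] - y[i]
    return tau

def basis(X, cols, degree=3):
    """Additive polynomial basis on the chosen covariates (assumed stand-in for splines)."""
    parts = [np.ones((len(X), 1))]
    parts += [X[:, [c]] ** d for c in cols for d in range(1, degree + 1)]
    return np.hstack(parts)

def fit_cate(X_tr, y_tr, t_tr, X_te, cols):
    """Least squares fit in each arm; the effect estimate is the difference of predictions."""
    B_te = basis(X_te, cols)
    preds = []
    for arm in (1, 0):
        B = basis(X_tr[t_tr == arm], cols)
        coef, *_ = np.linalg.lstsq(B, y_tr[t_tr == arm], rcond=None)
        preds.append(B_te @ coef)
    return preds[0] - preds[1]

def cv_weights(X, y, t, candidates, J=5, iters=2000, seed=0):
    """Weights on the simplex minimizing the J-fold CV squared loss against matched proxies."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % J          # balanced random fold labels
    tau_proxy = nn_match_effects(X, y, t)   # matched stand-ins for unobserved effects
    P = np.empty((n, len(candidates)))      # out-of-fold effect predictions per model
    for j in range(J):
        te = folds == j
        for m, cols in enumerate(candidates):
            P[te, m] = fit_cate(X[~te], y[~te], t[~te], X[te], cols)
    w = np.full(len(candidates), 1.0 / len(candidates))
    for _ in range(iters):                  # Frank-Wolfe with exact line search
        grad = -2.0 * P.T @ (tau_proxy - P @ w)
        s = np.zeros_like(w)
        s[np.argmin(grad)] = 1.0            # best vertex of the simplex
        d = s - w
        gap = -(grad @ d)
        Pd = P @ d
        denom = Pd @ Pd
        if gap <= 1e-10 or denom <= 1e-12:
            break
        w = w + min(1.0, gap / (2.0 * denom)) * d
    return w

# Synthetic illustration: the effect depends on the second covariate only.
rng = np.random.default_rng(1)
n = 200
X = rng.uniform(-1, 1, size=(n, 3))
t = rng.integers(0, 2, size=n)
y = X[:, 0] + t * (1.0 + X[:, 1] ** 2) + 0.1 * rng.standard_normal(n)
candidates = [(0,), (0, 1), (0, 1, 2)]      # nested covariate subsets
w = cv_weights(X, y, t, candidates)
print(dict(zip(candidates, np.round(w, 3))))
```

Because every update is a convex combination of the current weights and a simplex vertex, the weights stay nonnegative and sum to one without a separate projection step. The averaged effect estimate at new points would then be the weighted sum of the candidate models' predictions.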

Information

Type
ARTICLES
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Footnotes

The authors are very grateful to the editor, associate editor, and two anonymous referees for their constructive comments and suggestions. X.Z.’s work is supported by the National Key Research and Development Program of China (Grant No. 2023YFA1008704) and the National Natural Science Foundation of China (Grant Nos. 72525001 and 72495124). Y.F.’s work is supported by the National Natural Science Foundation of China (Grant No. 12561051) and Yunnan Province XingDian Talent Support Program (Grant No. YNWR-YLXZ-2018-020).

Supplementary material: Li et al. supplementary material (File, 337.7 KB)