ALGORITHMIC SUBSAMPLING UNDER MULTIWAY CLUSTERING

Published online by Cambridge University Press:  11 July 2023

Harold D. Chiang*
Affiliation:
University of Wisconsin–Madison
Jiatong Li
Affiliation:
Vanderbilt University
Yuya Sasaki
Affiliation:
Vanderbilt University
* Address correspondence to Harold D. Chiang, Department of Economics, University of Wisconsin–Madison, Madison, WI 53706, USA; e-mail: hdchiang@wisc.edu.

Abstract

This paper proposes a novel method of algorithmic subsampling (data sketching) for multiway cluster-dependent data. We establish a new uniform weak law of large numbers and a new central limit theorem for multiway algorithmic subsample means. We show that algorithmic subsampling allows for robustness against potential degeneracy, and even non-Gaussian degeneracy, of the asymptotic distribution under multiway clustering at the cost of efficiency and power loss due to algorithmic subsampling. Simulation studies support this novel result, and demonstrate that inference with algorithmic subsampling entails more accuracy than that without algorithmic subsampling. We derive the consistency and the asymptotic normality for multiway algorithmic subsampling generalized method of moments estimator and for multiway algorithmic subsampling M-estimator. We illustrate with an application to scanner data for the analysis of differentiated products markets.
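
The idea of taking a subsample mean over a two-way clustered array can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the abstract does not spell out the sampling scheme, so the code assumes each cell of the array is kept by an independent Bernoulli draw with probability `p`, and it uses a toy additive data-generating process. The function names `two_way_clustered_array` and `bernoulli_subsample_mean` are hypothetical and do not come from the paper.

```python
import numpy as np


def two_way_clustered_array(n_rows, n_cols, rng):
    """Simulate a toy array Y[i, j] = a[i] + b[j] + e[i, j].

    The shared row shock a[i] and column shock b[j] induce dependence
    along both clustering dimensions (an assumed, illustrative design).
    """
    a = rng.standard_normal(n_rows)             # row-cluster shocks
    b = rng.standard_normal(n_cols)             # column-cluster shocks
    e = rng.standard_normal((n_rows, n_cols))   # idiosyncratic noise
    return a[:, None] + b[None, :] + e


def bernoulli_subsample_mean(y, p, rng):
    """Average the cells kept by i.i.d. Bernoulli(p) draws.

    Assumption: cells are sampled independently of the data, and the sum
    is normalized by the expected number of kept cells, p * y.size.
    Returns the subsample mean and the boolean mask of kept cells.
    """
    keep = rng.random(y.shape) < p
    return (y * keep).sum() / (p * y.size), keep


rng = np.random.default_rng(0)
y = two_way_clustered_array(200, 300, rng)
sub_mean, keep = bernoulli_subsample_mean(y, p=0.1, rng=rng)
full_mean = y.mean()
print(f"full-sample mean: {full_mean:.4f}, subsample mean: {sub_mean:.4f}")
print(f"cells used: {keep.sum()} of {y.size}")
```

With `p = 1` every cell is kept and the function returns the full-sample mean; smaller values of `p` use fewer cells, which is where the efficiency and power cost mentioned in the abstract would enter.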

Type
ARTICLES
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Footnotes

We benefited from very useful comments by Peter C. B. Phillips (the Editor), Matias Cattaneo (the Co-Editor), Sokbae (Simon) Lee, three anonymous referees, and participants in the 2021 North American Summer Meeting, International Association for Applied Econometrics, 2021 Asian Meeting, 2021 China Meeting of the Econometric Society, 26th International Panel Data Conference, 2021 Australasia Meeting of the Econometric Society, 2021 European Summer Meeting, and New York Camp Econometrics XVI. The usual disclaimer applies. We thank the James M. Kilts Center, University of Chicago Booth School of Business, for allowing us to use scanner data from the Dominick's Finer Foods (DFF) retail chain. H. Chiang is supported by the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin–Madison with funding from the Wisconsin Alumni Research Foundation.
