FORMALIZED DATA SNOOPING BASED ON GENERALIZED ERROR RATES

Published online by Cambridge University Press:  30 November 2007

Joseph P. Romano, Stanford University
Azeem M. Shaikh, University of Chicago
Michael Wolf, University of Zurich

Abstract

It is common in econometric applications that several hypothesis tests are carried out simultaneously. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE), which is the probability of one or more false rejections. But when the number of hypotheses under consideration is large, control of the FWE can become too demanding. As a result, the number of false hypotheses rejected may be small or even zero. This suggests replacing control of the FWE by a more liberal measure. To this end, we review a number of recent proposals from the statistical literature. We briefly discuss how these procedures apply to the general problem of model selection. A simulation study and two empirical applications illustrate the methods.

We thank three anonymous referees for helpful comments that have led to an improved presentation of the paper. The research of the third author has been partially supported by the Spanish Ministry of Science and Technology and FEDER, Grant BMF2003-03324.
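The abstract's remark that FWE control can become too demanding for a large number of hypotheses can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes, for illustration only, that all s hypotheses are true nulls tested independently at the same level, and it uses the standard Bonferroni cutoff alpha/s as one simple way of controlling the FWE. The function names are hypothetical.

```python
def fwe_independent(alpha: float, s: int) -> float:
    """Probability of at least one false rejection when s true null
    hypotheses are each tested at level alpha.
    Assumption (illustrative only): the s tests are independent."""
    return 1.0 - (1.0 - alpha) ** s


def bonferroni_threshold(alpha: float, s: int) -> float:
    """Per-test p-value cutoff that keeps the FWE at or below alpha
    (Bonferroni correction; valid under any dependence structure)."""
    return alpha / s


if __name__ == "__main__":
    alpha = 0.05
    for s in (1, 10, 100, 1000):
        uncorrected = fwe_independent(alpha, s)
        cutoff = bonferroni_threshold(alpha, s)
        corrected = fwe_independent(cutoff, s)
        print(f"s={s:5d}  FWE without correction={uncorrected:.4f}  "
              f"Bonferroni cutoff={cutoff:.6f}  FWE with correction={corrected:.4f}")
```

Under these assumptions the uncorrected FWE approaches one as s grows, while the per-test cutoff needed to hold it at alpha shrinks in proportion to 1/s. That shrinking cutoff is why few or no false hypotheses may be rejected, and it motivates the more liberal error rates the paper reviews.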

Type: Research Article
Copyright: © 2008 Cambridge University Press
