Shedding Light on the Role of Sample Sizes and Splitting Proportions in Out-of-Sample Tests: A Monte Carlo Cross-Validation Approach

Christian Janze


We examine whether the popular 2/3 rule-of-thumb splitting criterion used in out-of-sample evaluation of predictive econometric and machine learning models is justified. We simulate the predictive performance of logistic regression and decision tree algorithms across a range of splitting points and sample sizes. Our non-exhaustive repeated random sub-sampling approach, known as Monte Carlo cross-validation, indicates that while the 2/3 rule-of-thumb works, a spectrum of other splitting proportions yields equally compelling results. Furthermore, our results indicate that the size of the complete sample has little impact on the applicability of the 2/3 rule-of-thumb. However, our analysis reveals that training samples that are very small or very large relative to the complete sample inflate the variation of the predictive accuracy and can therefore produce misleading results. Our findings are especially relevant for IS researchers who plan to use out-of-sample methods to evaluate their predictive models.
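
To make the evaluation scheme concrete, the sketch below runs Monte Carlo cross-validation (non-exhaustive repeated random sub-sampling) over a grid of training proportions for a logistic regression and a decision tree. It is a minimal illustration only: the synthetic dataset, the number of repetitions, and the split grid are assumptions for demonstration, not the authors' experimental setup.

```python
# Minimal sketch of Monte Carlo cross-validation across split proportions.
# Assumptions (not from the paper): synthetic data via make_classification,
# 100 repetitions per proportion, and the split grid below.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)  # assumed sample size

models = {"logit": LogisticRegression(max_iter=1000),
          "tree": DecisionTreeClassifier(random_state=0)}

for train_share in (0.1, 0.33, 0.5, 0.67, 0.9):  # 0.67 ~ the 2/3 rule-of-thumb
    # ShuffleSplit draws a fresh random train/test partition on every repetition,
    # i.e. non-exhaustive repeated random sub-sampling (Monte Carlo CV).
    mccv = ShuffleSplit(n_splits=100, train_size=train_share, random_state=0)
    for name, model in models.items():
        acc = cross_val_score(model, X, y, cv=mccv, scoring="accuracy")
        print(f"{name} train={train_share:.2f}: "
              f"mean acc={acc.mean():.3f}, sd={acc.std():.3f}")
```

Comparing the standard deviation of accuracy across proportions mirrors the paper's core question: very small or very large training shares tend to produce noisier accuracy estimates than splits near 2/3.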




