Update June 2013: A systematic analysis of the topic has been published:
Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47, 609-612. doi:10.1016/j.jrp.2013.05.009
Check also the supplementary website, where you can find the PDF of the paper.
As an update of this post: here’s an improved version of “The evolution of correlations”.
From the original post:
“This is the evolution of a bivariate correlation between two questionnaire scales, “hope of power” and “fear of losing control”. Both scales were administered in an open online study. The video shows how the correlation evolves from r = .69*** (n=20) to r = .26*** (n=271). It does not stabilize until n = 150.
Data has not been rearranged – it is the random order how participants dropped into the study. This had been a rather extreme case of an unstable correlation – other scales in this study were stable right from the beginning. Maybe this video could help as an anecdotal caveat for a careful interpretation of correlations with small n’s (and with ‘small’ I mean n < 100) …”
The right panel now displays the correlation in each step. The horizontal green line is the final correlation that is approached, the curved dotted line shows the marginal correlation that would be significant at that sample size. As the empirical curve always is above this dotted line, it is significantly different from zero in each step.
Here the code that created the movie. It’s not fully self-contained – the function plotReg plots the dual-panel display, dat0, A, and B are parameters passed to this function. You can insert any other function here. The function loops through the rows of a data frame and saves a plot at every step into a subfolder. Finally, the function needs the command line version of ffmpeg, which connects the pictures to a movie.
Here’s a video how the modFit function from the FME package optimizes parameters for an oscillation. A Nelder-Mead-optimizer (R function optim) finds the best fitting parameters for an undampened oscillator. Minimum was found after 72 iterations, true parameter eta was -.05:
More on estimating parameters of differential equations is coming later on this blog!
Things I’ve learned:
Rand Wilcox constantly updates the functions accompanying his books on robust statistics. Recently, they have been updated to version 20. The functions are available in the WRS package for R – for installation simply type
install.packages("WRS", repos="http://R-Forge.R-project.org", type="source")
Update: For installation of the WRS package, please use only the fail-safe installation procedure described here!
In version 0.20, a number of functions dealing with ANCOVA have been added and some others improved. Unfortunately, only very few help files exist for the functions. I would recommend to check out the source code, as most functions have a comment section roughly explaining the parameters. Alternatively, consult Wilcox’ books for descriptions of the functions.
Wilcox, R. R. (2012). Introduction to Robust Estimation and Hypothesis Testing, 3rd Ed. Academic Press.
Wilcox, R. (2012). Modern Statistics for the Social and Behavioral Sciences: A Practical Introduction. CRC Press.
Wilcox, R. R. (2010). Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy. Springer, 2nd Ed.
Wilcox, R. R. (2009). Basics Statistics: Understanding Conventional Methods and Modern Insights. New York: Oxford.