# Comparing all quantiles of two distributions simultaneously

Summary: A new function in the WRS package compares many quantiles of two distributions simultaneously while controlling the overall alpha error.

When comparing data from two groups, approximately 99.6% of all psychological research compares the *central tendency* (that is a subjective estimate).

In some cases, however, it would be sensible to compare different parts of the distributions. For example, in reaction time (RT) experiments two groups may only differ in the fast RTs, but not in the long. Measures of central tendency might obscure or miss this pattern, as following example demonstrates.

Imagine RT distributions for two experimental conditions (“black” and “red”). Participants in the red condition have some very fast RTs:

A naïve (but common) approach would be to compare both distributions with a *t* test:

######################

data: RT1 and RT2

t = -0.3778, df = 168.715, p-value = 0.706

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-22.74478 15.43712

sample estimates:

mean of x mean of y

341.8484 345.5022

Results show that both groups do not differ in their central tendency.

**Now let’s do the same with a new method!**

The function

from the WRS package compares user-defined quantiles of both distributions using a Harrell–Davis estimator in conjunction with a percentile bootstrap. The method seems to improve over other methods: “Currently, when there are tied values, no other method has been found that performs reasonably well. Even with no tied values, method HD can provide a substantial gain in power when q ≤ .25 or q ≥ .75 compared to other techniques that have been proposed”. The method is described in the paper “Comparing two independent groups via the upper and lower quantiles” by Wilcox, Erceg-Hurn, Clark and Carlson (2013).

You can use the function as soon as you install the latest version of the WRS package following this installation instruction.

Let’s compare all percentiles from the 10th to the 90th:

The graphical output shows how groups differ in the requested quantiles, and the confidence intervals for each quantile:

The text output (see below) also shows that groups differ significantly in the 10th, the 50th, and the 60th percentile. The column labeled ‘’.value’’shows the p value for a single quantile bootstrapping test. As we do multiple tests (one for each quantile), the overall Type 1 error (defaulting to .05) is controlled by the Hochberg method. Therefore, for each *p* value a critical *p* value is calculated that must be undercut (see column ‘_crit’. The column ‘signify’ marks all tests which fulfill this condition:

1 0.1 100 100 285.8276 218.4852 67.342399 41.04707 84.67980495 0.005555556 0.001 *

2 0.2 100 100 297.5061 264.7904 32.715724 -16.52601 68.80486452 0.025000000 0.217

3 0.3 100 100 310.8760 320.0196 -9.143593 -33.63576 32.95577465 0.050000000 0.589

4 0.4 100 100 322.5014 344.0439 -21.542475 -40.43463 0.03938696 0.010000000 0.054

5 0.5 100 100 331.4413 360.3548 -28.913498 -44.78068 -9.11259108 0.007142857 0.006 *

6 0.6 100 100 344.8502 374.7056 -29.855369 -46.88886 -9.69559705 0.006250000 0.005 *

7 0.7 100 100 363.6210 388.0228 -24.401872 -47.41493 -4.13498039 0.008333333 0.016

8 0.8 100 100 385.8985 406.3956 -20.497097 -47.09522 2.23935390 0.012500000 0.080

9 0.9 100 100 419.4520 444.7892 -25.337206 -55.84177 11.49107833 0.016666667 0.174

To summarize, we see that we have significant differences between both groups: the red group has significantly more faster RTs, but in their central tendency longer RTs.

Recommendations for comparing groups:

- Always plot the densities of both distributions.
- Make a visual scan: Where do the groups differ? Is the central tendency a reasonable summary of the distributions and of the difference between both distributions?
- If you are interested in the central tendency, think about the test for trimmed means, as in most cases this describes the central tendency better than the arithmetic mean.
- If you are interested in comparing quantiles in the
*tails*of the distribution, use the qcomhd function.

### References

Wilcox, R. R., Erceg-Hurn, D. M, Clark, F., & Carlson, M. (in press). Comparing two independent groups via the lower and upper quantiles. *Journal of Statistical Computation and Simulation.* doi:10.1080/00949655.2012.754026.

# The Evolution of Correlations

This is the evolution of a bivariate correlation between two questionnaire scales, “hope of power” and “fear of losing control”. Both scales were administered in an open online study. The video shows how the correlation evolves from *r* = .69*** (n=20) to *r* = .26*** (*n*=271). It does not stabilize until *n* = 150.

Data has not been rearranged – it is the random order how participants dropped into the study. This had been a rather extreme case of an unstable correlation – other scales in this study were stable right from the beginning. Maybe this video could help as an anecdotal caveat for a careful interpretation of correlations with small n’s (and with ‘small’ I mean *n* < 100) …

The evolution of correlations from Felix Schönbrodt on Vimeo.

# Range restrictions for the correlations of 3 variables

A little follow up to my testosterone comment (written in German):

When three variables are correlated to each other, and two of the three correlations are known, the range for the third correlation is restricted according to this formula (Olkin, 1981):

Now comes the new part: here’s the graphical representation of that range restriction:

As one can see, one, or both of the two given correlations have to be fairly high to imply a positive third correlation.

Olkin, I. (1981). Range restrictions for product-moment correlation matrices. Psychometrika, 46, 469-472. doi:10.1007/BF02293804# Weighted t-Test in R

Although there is a weighted.mean function in R, so far I couldn’t find a implementation of weighted.var and weighted.t.test – here they are (the weighted variance is from Gavin Simpson, found on the R malining list):

var.wt <- function(x, w, na.rm = FALSE) {

if (na.rm) {

w <- w[i <- !is.na(x)]

x <- x[i]

}

sum.w <- sum(w)

return((sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)))

}

weighted.t.test <- function(x, w, mu, conf.level = 0.95, alternative="two.sided", na.rm=TRUE) {

if(!missing(conf.level) &

(length(conf.level) != 1 || !is.finite(conf.level) ||

conf.level < 0 || conf.level > 1))

stop("'conf.level' must be a single number between 0 and 1")

if (na.rm) {

w <- w[i <- !is.na(x)]

x <- x[i]

}

# to achieve consistent behavior in loops, return NA-structure in case of complete missings

if (sum(is.na(x)) == length(x)) return(list(estimate=NA, se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))

# if only one value is present: this is the best estimate, no significance test provided

if (sum(!is.na(x)) == 1) {

warning("Warning weighted.t.test: only one value provided; this value is returned without test of significance!", call.=FALSE)

return(list(estimate=x[which(!is.na(x))], se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))

}

x.w <- weighted.mean(x,w, na.rm=na.rm)

var.w <- var.wt(x,w, na.rm=na.rm)

df <- length(x)-1

t.value <- sqrt(length(x))*((x.w-mu)/sqrt(var.w))

se <- sqrt(var.w)/sqrt(length(x))

if (alternative == "less") {

pval <- pt(t.value, df)

cint <- c(-Inf, x.w + se*qt(conf.level, df) )

}

else if (alternative == "greater") {

pval <- pt(t.value, df, lower.tail = FALSE)

cint <- c(x.w - se * qt(conf.level, df), Inf)

}

else {

pval <- 2 * pt(-abs(t.value), df)

alpha <- 1 - conf.level

cint <- x.w + se*qt(1 - alpha/2, df)*c(-1,1)

}

names(t.value) <- "t"

return(list(estimate=x.w, se=se, conf.int=cint, statistic=t.value, df=df, p.value=pval))

}