June 11, 2013
[Update June 12: The data.table code has been improved (thanks to a comment by Matthew Dowle); for a similar approach see also Tal Galili’s post]
The guys from RStudio now provide CRAN download logs (see also this blog post). Great work!
I have always wondered how many people actually download my packages. Now I can finally get an answer (… with some anxiety about getting frustrated 😉).
Here are the complete, self-contained R scripts to analyze these log data:
Step 1: Download all log files into a subfolder (this step takes a couple of minutes)
## Step 1: Download all log files
# Here's an easy way to get all the URLs in R
start <- as.Date('2012-10-01')
today <- as.Date('2013-06-10')
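all_days <- seq(start, today, by = 'day')
year <- as.POSIXlt(all_days)$year + 1900
# base URL of the RStudio CRAN log server
urls <- paste0('http://cran-logs.rstudio.com/', year, '/', all_days, '.csv.gz')
names(urls) <- as.character(all_days)
# only download the files you don't have:
dir.create("CRANlogs", showWarnings = FALSE)
missing_days <- setdiff(as.character(all_days), tools::file_path_sans_ext(dir("CRANlogs"), TRUE))
for (i in seq_along(missing_days)) {
  print(paste0(i, "/", length(missing_days)))
  # assumed target: one .csv.gz file per day inside CRANlogs/
  download.file(urls[[missing_days[i]]], paste0("CRANlogs/", missing_days[i], ".csv.gz"))
}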
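The "only download what is missing" logic can also be wrapped in a small function, which makes it easy to rerun the script on later days. The following is a minimal sketch, not part of the original script: `missing_log_urls()` and the demo folder are hypothetical names, and the demonstration works offline by creating an empty placeholder file instead of downloading anything.

```r
# Hypothetical helper: URLs of the daily logs that are not yet present in `dir`.
# The default base URL is that of the RStudio CRAN log server.
missing_log_urls <- function(start, end, dir,
                             base_url = 'http://cran-logs.rstudio.com/') {
  days <- as.character(seq(as.Date(start), as.Date(end), by = 'day'))
  have <- sub('\\.csv\\.gz$', '', list.files(dir, pattern = '\\.csv\\.gz$'))
  todo <- setdiff(days, have)
  if (length(todo) == 0) return(character(0))
  # The year folder is the first four characters of the ISO date
  urls <- paste0(base_url, substr(todo, 1, 4), '/', todo, '.csv.gz')
  names(urls) <- todo
  urls
}

# Offline demonstration: pretend one of three days is already downloaded.
demo_dir <- file.path(tempdir(), 'CRANlogs_demo')
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, '2013-06-09.csv.gz'))

todo <- missing_log_urls('2013-06-08', '2013-06-10', demo_dir)
print(todo)
```

Each element of `todo` can then be passed to `download.file()`, with the element name as the basis for the local file name.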
Step 2: Combine all daily files into one big data table (this step also takes a couple of minutes…)
## Step 2: Load single data files into one big data.table
file_list <- list.files("CRANlogs", full.names = TRUE)
logs <- list()
for (file in file_list) {
  logs[[file]] <- read.table(file, header = TRUE, sep = ",", quote = "\"",
                             dec = ".", fill = TRUE, comment.char = "", as.is = TRUE)
}
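# rbind together all files
library(data.table)
dat <- rbindlist(logs)
# add some keys and define variable types
dat[, date := as.Date(date)]
# week label used as a key below (the year-week format is an assumption)
dat[, week := strftime(as.POSIXlt(date), format = "%Y-%W")]
setkey(dat, package, date, week, country)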
# for later analyses: load the saved data.table
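One way to keep the combined table around between sessions is to serialize it once and read it back later. The sketch below uses base R's `saveRDS()` and `readRDS()` on a toy data frame; the file name and its location are hypothetical and not taken from the original script.

```r
# Toy stand-in for `dat`; in the real workflow this is the combined log table.
toy <- data.frame(
  date    = as.Date(c('2013-06-09', '2013-06-10')),
  package = c('pkgA', 'pkgB'),
  stringsAsFactors = FALSE
)

# Hypothetical location; a temp file keeps the sketch free of side effects.
rds_file <- file.path(tempdir(), 'CRANlogs_combined.rds')

saveRDS(toy, rds_file)         # write once, after Step 2 has finished
restored <- readRDS(rds_file)  # read back in a later session

print(restored)
```

Storing the file outside the CRANlogs folder avoids it being picked up by `list.files("CRANlogs")` the next time Step 2 runs.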
Step 3: Analyze it!
## Step 3: Analyze it!
# Overall downloads of packages
# plot 1: Compare downloads of selected packages on a weekly basis
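A minimal sketch of the two analyses named in the comments above, run on a small made-up table so that it is self-contained. The package names `pkgA` to `pkgC` and the toy data are hypothetical, and the plot uses base graphics as one possible choice; with the real logs, replace `toy` by `dat`.

```r
library(data.table)

# Toy stand-in for the combined log table (hypothetical package names).
set.seed(1)
toy <- data.table(
  date    = as.Date('2013-01-07') + sample(0:27, 200, replace = TRUE),
  package = sample(c('pkgA', 'pkgB', 'pkgC'), 200, replace = TRUE)
)
toy[, week := strftime(as.POSIXlt(date), format = '%Y-%W')]

# Overall downloads per package, most downloaded first
overall <- toy[, .N, by = package][order(-N)]
print(overall)

# Weekly downloads of selected packages
selected <- c('pkgA', 'pkgB')
weekly <- toy[package %in% selected, .N, by = .(week, package)]
setorder(weekly, week, package)

# Plot: one line per package, weeks on the x axis
wide <- dcast(weekly, week ~ package, value.var = 'N', fill = 0)
matplot(as.matrix(wide[, ..selected]), type = 'b', pch = 1, lty = 1,
        xaxt = 'n', xlab = 'Week', ylab = 'Downloads')
axis(1, at = seq_len(nrow(wide)), labels = wide$week)
legend('topright', legend = selected, col = seq_along(selected), lty = 1)
```

Counting rows with `.N` works here because each row of the log corresponds to one download.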
Here are my two packages,
. Actually, ~30 downloads per week (from this single mirror) is much more than I expected!
To put things in perspective: package
included in the plot:
Some psychological sidenotes on social comparisons:
- Downward comparisons enhance well-being; extreme upward comparisons are detrimental. Hence, never include
into your graphic!
- Upward comparisons stimulate your achievement motive and give you the drive to get better. Hence, select some packages that are slightly above your own.
- Of course, things are a bit more complicated than that …
All source code in this post is licensed under the FreeBSD license.