Online psychology experiments: calibration

There is no shortage of online experiments on the web. Psychological Research on the Web lists hundreds. As the Top Ten Online Psychology Experiments points out, it’s a little hard to assess the validity of these results because of variations in the speed of the hardware. They note that we also don’t know who is taking these tests, or whether they have understood the instructions properly.

The open source stimulus presentation packages for desktops I’ve programmed with (PsyScript, VisionEgg) advertise impressive temporal accuracy for output. (See, for instance, the Appendix in Bates & D’Oliveiro (2003).) An important question is: how would you verify such assertions for yourself? How would you make sure your experiment software is calibrated properly? When I raised the issue of calibration with my engineer friend Bob Erickson, he suggested that an oscilloscope connected to a light-sensitive diode held up to the screen, similar to what Bates & D’Oliveiro describe, would be the best way to check whether screen displays last as long as you expect them to.
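To make the photodiode idea concrete, here is a minimal sketch of how a recorded trace could be turned into display durations. It assumes the diode voltage has already been digitized into a list of samples at a known rate. The function name, the threshold, and the sample rate are all hypothetical choices for illustration, not anything taken from PsyScript or VisionEgg.

```python
from typing import List, Tuple


def lit_intervals(samples: List[float],
                  sample_rate_hz: float,
                  threshold: float) -> List[Tuple[float, float]]:
    """Return (onset_ms, duration_ms) for each run of samples above threshold.

    Assumes `samples` is a photodiode voltage trace captured at a fixed
    `sample_rate_hz`; `threshold` separates "screen lit" from "screen dark".
    """
    ms_per_sample = 1000.0 / sample_rate_hz
    intervals = []
    start = None  # index of the first sample in the current lit run
    for i, value in enumerate(samples):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            intervals.append((start * ms_per_sample, (i - start) * ms_per_sample))
            start = None
    if start is not None:
        # The trace ended while the stimulus was still on screen.
        intervals.append((start * ms_per_sample,
                          (len(samples) - start) * ms_per_sample))
    return intervals


# Made-up trace: dark for 10 samples, lit for 50, dark for 10, sampled at 1 kHz.
trace = [0.0] * 10 + [1.0] * 50 + [0.0] * 10
print(lit_intervals(trace, sample_rate_hz=1000.0, threshold=0.5))
```

Comparing the measured durations against the durations the experiment script requested would show how far the display drifts from what the software claims.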

In the online space, we can’t expect subjects to employ oscilloscopes. When I discussed the issue of non-standard hardware with Jim McGinley, he showed me the video calibration tests used by Rock Band, in which a metronome-like bar swings back and forth, and the user is invited to strum in time with the metronome. The metronome may not be the right approach, but something like this, where the user attempts to hit the space bar at the same time as a phenomenon on the screen, seems like it would be on the right track.
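As a rough sketch of how such a tap-along test might be scored, the snippet below pairs each key press with the nearest on-screen event and summarizes the offsets. The function names and the timestamps are invented for illustration. Note that the mean offset lumps together the participant's own timing bias and the machine's input and output latency, so on its own it cannot separate the two.

```python
from statistics import mean, stdev
from typing import Dict, List


def tap_offsets(event_times_ms: List[float],
                tap_times_ms: List[float]) -> List[float]:
    """Signed offset (tap minus event) of each tap from its nearest event."""
    offsets = []
    for tap in tap_times_ms:
        nearest = min(event_times_ms, key=lambda event: abs(event - tap))
        offsets.append(tap - nearest)
    return offsets


def summarize(offsets: List[float]) -> Dict[str, float]:
    """Mean offset (systematic lag or lead) and its spread (jitter)."""
    return {"mean_ms": mean(offsets), "sd_ms": stdev(offsets)}


# Made-up session: a visual event every 500 ms, taps arriving slightly late.
events = [0.0, 500.0, 1000.0, 1500.0]
taps = [40.0, 530.0, 1050.0, 1540.0]
print(summarize(tap_offsets(events, taps)))
```

A large standard deviation would suggest the setup is too noisy for experiments that depend on fine timing, whatever the mean turns out to be.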

Jim tells me that Flash is supposed to run at 60 frames a second. This means a temporal resolution of about 16.67 ms (1000 ms divided by 60 frames), at least for output, which is plenty good enough for a lot of psychology experiments. This says nothing about input. The real question, however, is how much temporal variability would be introduced by other applications running on the same machine. For some experiments, having a particular stimulus display for a precise number of milliseconds is crucial.
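The arithmetic here can be sketched in a few lines. The snippet assumes a fixed refresh rate and a list of frame timestamps logged by the experiment. The function names and the 50% tolerance for flagging a late frame are arbitrary choices for illustration, not part of Flash or any stimulus package.

```python
from typing import List, Tuple


def frame_duration_ms(refresh_hz: float) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / refresh_hz


def achievable_duration(requested_ms: float, refresh_hz: float) -> Tuple[int, float]:
    """Nearest whole number of frames (at least one) and the duration it gives."""
    frame = frame_duration_ms(refresh_hz)
    n_frames = max(1, round(requested_ms / frame))
    return n_frames, n_frames * frame


def late_frames(timestamps_ms: List[float], refresh_hz: float,
                tolerance: float = 0.5) -> int:
    """Count inter-frame intervals that exceed the nominal frame by `tolerance`."""
    limit = frame_duration_ms(refresh_hz) * (1.0 + tolerance)
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return sum(1 for dt in intervals if dt > limit)


# A 40 ms stimulus at 60 frames a second can only be shown for 2 or 3 frames.
print(achievable_duration(40.0, 60.0))

# Made-up timestamps with one skipped frame between 33.3 ms and 66.7 ms.
print(late_frames([0.0, 16.7, 33.3, 66.7, 83.3], 60.0))
```

Logging timestamps like these while other applications are running would give a first estimate of how much variability a subject's machine adds.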

Any thoughts on output calibration, especially for online experiments, would be welcome.


Bates, TC & D’Oliveiro, L. (2003). ‘PsyScript: A Macintosh Application for Scripting Experiments.’ Behaviour Research Methods 35: 565-576.

Straw, Andrew D. (2008) ‘Vision Egg: An Open-Source Library for Realtime Visual Stimulus Generation.’ Frontiers in Neuroinformatics. doi: 10.3389/neuro.11.004.2008

Tools for Psychology and Neuroscience

Open source tools make new options available for designing experiments, doing analysis, and writing papers. Already, we can see hardware becoming available for low-cost experimentation. There is an OpenEEG project. There are open source eye tracking tools for webcams. Stimulus packages like VisionEgg can be used to collect reaction times or to send precise timing signals to fMRI scanners. Neurolens is a free functional neuroimage analysis tool.

Cheaper hardware and software make it easier for students to practice techniques in undergraduate labs, and easier for graduate students to try new ideas that might otherwise be cost-prohibitive.

Results can be collected and annotated using personal wiki lab notebook programs like Garrett Lisi’s. Although some people, like Lisi, share their notebooks on the web (a practice known as open notebook science), it is not necessary to share wiki notebooks with anyone to receive substantial benefit from them. Wiki notebooks are an aid to the working researcher because they can be used to record methods, references and stimuli in much more detail than the published paper can afford. Lab notebooks, significantly, can include pointers to all of the raw data, together with each transformation along the chain of data provenance. This inspires trust in the analysis, and makes replication easier. Lab notebooks can also be a place to make a record of the commands that were used to generate tables and graphs in languages like R.
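One way to picture a chain of data provenance is as a list of notebook entries, each recording the command that was run together with a checksum of its input and output. The sketch below is only an illustration under that assumption. The entry fields, the function names, the commands, and the sample data are all hypothetical, and a real notebook might store file paths rather than raw bytes.

```python
import hashlib
from typing import Dict, List


def sha256_of(data: bytes) -> str:
    """Hex digest that identifies one exact version of a data file."""
    return hashlib.sha256(data).hexdigest()


def provenance_entry(step: str, command: str,
                     input_bytes: bytes, output_bytes: bytes) -> Dict[str, str]:
    """One notebook record: what was done, and to which exact data."""
    return {
        "step": step,
        "command": command,
        "input_sha256": sha256_of(input_bytes),
        "output_sha256": sha256_of(output_bytes),
    }


def chain_is_consistent(entries: List[Dict[str, str]]) -> bool:
    """True if each step's input is exactly the previous step's output."""
    return all(prev["output_sha256"] == nxt["input_sha256"]
               for prev, nxt in zip(entries, entries[1:]))


# Made-up data and commands, standing in for real files and real R calls.
raw = b"subject,rt_ms\n1,512\n1,9999\n"
trimmed = b"subject,rt_ms\n1,512\n"
summary = b"subject,mean_rt_ms\n1,512\n"

log = [
    provenance_entry("trim outliers", "hypothetical trim command", raw, trimmed),
    provenance_entry("summarize", "hypothetical summary command", trimmed, summary),
]
print(chain_is_consistent(log))
```

If a file is edited by hand between steps, its checksum no longer matches the log, which is one simple way a notebook can flag a break in the chain.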

R is an open source statistics package. It is scriptable, and can be used in place of SPSS (Revelle (2008), Baron & Li (2007)). It is multi-platform, can be freely shared with collaborators, and can import and export data in a CSV form that is readable by other statistics packages, spreadsheets, and graphing packages.

R code can be embedded directly into a LaTeX or OpenOffice document using a utility called Sweave. Sweave can be used with LaTeX to automatically format documents in APA style (Zahn, 2008). With Sweave, when you see a graph or table in a paper, it’s always up to date, generated on the fly from the original R code when the PDF is generated. Including the LaTeX along with the PDF becomes a form of reproducible research, rooted in Donald Knuth’s idea of literate programming. When you want to know in detail how the analysis was done, you need look no further than the source text of the paper itself.


Baron, J. & Li, Y. (9 Nov 2007). ‘Notes on the use of R for psychology experiments and questionnaires.’

Revelle, W. (25 May 2008). ‘Using R for Psychological Research. A simple guide to an elegant package.’

Zahn, Ista. (2008). ‘Learning to Sweave in APA Style.’ The PracTeX Journal.