Tools for Psychology and Neuroscience

Open source tools make new options available for designing experiments, doing analysis, and writing papers. Already, we can see hardware becoming available for low-cost experimentation. There is an OpenEEG project. There are open source eye tracking tools for webcams. Stimulus packages like VisionEgg can be used to collect reaction times or to send precise timing signals to fMRI scanners. Neurolens is a free functional neuroimage analysis tool.

Cheaper hardware and software make it easier for students to practice techniques in undergraduate labs, and easier for graduate students to try new ideas that might otherwise be cost-prohibitive.

Results can be collected and annotated using personal wiki lab notebook programs like Garrett Lisi’s deferentialgeometry.org. Although some people, like Lisi, share their notebooks on the web (a practice known as open notebook science), it is not necessary to share wiki notebooks with anyone to receive substantial benefit from them. Wiki notebooks are an aid to the working researcher because they can be used to record methods, references and stimuli in much more detail than the published paper can afford. Lab notebooks, significantly, can include pointers to all of the raw data, together with each transformation along the chain of data provenance. This inspires trust in the analysis, and makes replication easier. Lab notebooks can also be a place to make a record of the commands that were used to generate tables and graphs in languages like R.

R is an open source statistics package. It is scriptable, and can be used in place of SPSS (Revelle, 2008; Baron & Li, 2007). It is multi-platform, can be freely shared with collaborators, and can import and export data in a CSV form that is readable by other statistics packages, spreadsheets, and graphing packages.
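As a minimal sketch of that CSV interoperability (the column names and values here are invented for illustration), base R can round-trip a data frame through CSV with no extra packages:

```r
# A small data frame of hypothetical reaction times
rt <- data.frame(subject   = c(1, 1, 2, 2),
                 condition = c("congruent", "incongruent",
                               "congruent", "incongruent"),
                 rt_ms     = c(512, 587, 498, 601))

# Write it out as CSV, readable by SPSS, Excel, or a graphing package
csv_path <- file.path(tempdir(), "rt.csv")
write.csv(rt, csv_path, row.names = FALSE)

# Read it back in
rt2 <- read.csv(csv_path)
```

The same `read.csv`/`write.csv` pair is what makes it easy to move data between R and the other tools in a lab's pipeline.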

R code can be embedded directly into a LaTeX or OpenOffice document using a utility called Sweave. Sweave can be used with LaTeX to automatically format documents in APA style (Zahn, 2008). With Sweave, when you see a graph or table in a paper, it’s always up to date, generated on the fly from the original R code when the PDF is generated. Including the LaTeX along with the PDF becomes a form of reproducible research, rooted in Donald Knuth’s idea of literate programming. When you want to know in detail how the analysis was done, you need look no further than the source text of the paper itself.
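A minimal Sweave fragment shows the idea (the file contents here are an invented sketch): R code sits in noweb-style chunks inside the LaTeX source, and computed values flow into the prose via `\Sexpr{}` when the document is built:

```latex
\documentclass{article}
\begin{document}

<<analysis, echo=FALSE>>=
scores <- c(12, 15, 9, 14)
m <- mean(scores)
@

The mean score was \Sexpr{round(m, 2)}.

<<fig=TRUE, echo=FALSE>>=
hist(scores)
@

\end{document}
```

Running `Sweave("paper.Rnw")` produces a `.tex` file with the computed mean and the figure already in place, ready for `pdflatex`.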

_____

Baron, J., & Li, Y. (2007, November 9). ‘Notes on the use of R for psychology experiments and questionnaires.’ http://www.psych.upenn.edu/~baron/rpsych/rpsych.html

Revelle, W. (2008, May 25). ‘Using R for Psychological Research: A simple guide to an elegant package.’ http://www.personality-project.org/R/

Zahn, I. (2008). ‘Learning to Sweave in APA Style.’ The PracTeX Journal. http://www.tug.org/pracjourn/2008-1/zahn/


18 responses to “Tools for Psychology and Neuroscience”

  1. Nice overview – I look forward to following your thoughts here. Do you routinely use R in your research?

  2. Hi Jean-Claude, thanks. I’m hoping to move to the use of R in my research. I’ve been spending quite a bit of time just playing with my data in R. I’m experiencing a bit of a learning curve, but I like R’s expressive power and the simplicity of its syntax very much.

    There are three things I’m still trying to understand about R before fully making the transition:

    (1) how to add 95% confidence intervals to a line-graph (and ideally to an interaction.plot).
    (I’ve checked the R graphics library and as far as I can tell there are no good examples)

    (2) how to superimpose bar graphs of error data on an existing line plot and

    (3) how to do a three-way ANOVA with the car package using Type III sums of squares, the way one would in SPSS (R uses Type I by default, according to Ista Zahn).

    I’m particularly interested in combining R with Sweave, so that all of the numbers in my LaTeX documents are live.

    I’ll probably do a whole post (or several) about these and related R issues, but if you have any off-the-top thoughts, I’d welcome them.

    • Mike Lawrence

      Mark,

      re: (1)
      Check out the ggplot2 graphics package (http://had.co.nz/ggplot2/). If you are quick, you should be able to grab a pdf of the book (http://had.co.nz/ggplot2/book/) that will be published imminently.
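      To give a flavour (a sketch with invented summary data; here the 95% CI is approximated as mean ± 1.96 SE and drawn with geom_errorbar):

```r
library(ggplot2)

# Invented summary data: mean RT per SOA with standard errors
d <- data.frame(soa     = c(100, 200, 400),
                mean_rt = c(520, 495, 470),
                se      = c(12, 10, 9))

# Line graph with 95% CIs drawn as error bars (normal approximation)
p <- ggplot(d, aes(x = soa, y = mean_rt)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = mean_rt - 1.96 * se,
                    ymax = mean_rt + 1.96 * se),
                width = 10)
print(p)
```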

      re: (2)
      It’s rarely advisable to superimpose 2 DVs with different metrics (e.g. RT & accuracy) in the same plot. Further, bar graphs are a waste of ink where a dot or dot-and-line approach provides equivalent information. The way I display 2 DVs of different metrics is either as 2 different graphs, or as 2 different rows using facet_grid(scales='free_y') in ggplot2.

      re: (3)
      There are good reasons why R computes the sums of squares it does. Much has been written about this on the r-help list, searchable on nabble (http://www.nabble.com/R-help-f13820.html). John Fox has explained the important distinctions many times (ex. http://tr.im/uQml).
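      That said, if you do want SPSS-style Type III tests, car's Anova() takes a type argument; the catch is that Type III results are only meaningful with sum-to-zero contrasts. A sketch on a built-in data set (the model itself is invented for illustration):

```r
library(car)

# Invented example on a built-in data set
d <- mtcars
d$cyl <- factor(d$cyl)
d$am  <- factor(d$am)

# Type III tests require sum-to-zero contrasts, set here explicitly
fit <- lm(mpg ~ cyl * am, data = d,
          contrasts = list(cyl = contr.sum, am = contr.sum))

a3 <- Anova(fit, type = "III")
print(a3)
```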

      • Mike —

        re: (1)
        Ah, perfect – thanks! Looks like page 82 (fig 5.15) is the charm. (The code for this is in: http://had.co.nz/ggplot2/book/toolbox.r)

        re: (2)
        Mike, thank you. This helped me to realize that I was getting stuck on this approach because I was trying to reproduce a graph from a paper I was replicating.

        “… or as 2 different rows using facet_grid(scales=’free_y’) in ggplot2.”

        I’m taking it this would be two graphs tiled one on top of each-other, with the x axis fixed (7.2.3, p. 121)?

        re:(3)
        I just read John Fox’s discussion — thank you for this. I’ll also try the nabble search you suggest.

        Do you have any thoughts on where I could look for R syntax for 3 or 4 way ANOVAs? I’ve looked in a number of R books and manuals. Apparently I’m not the only person with this difficulty: http://tr.im/uT3d

        This thread:
        http://tr.im/uUvu

        suggested to another poster that N-way examples exist, but my R search-fu is failing me.

        (I’m running a 5x2x2 within-subjects design, where I occasionally want to make comparisons between two experiments.)

      • (Hopefully this works; I’m trying to reply to Mark’s reply to my reply… possibly too deeply threaded for the WordPress interface, and I don’t see a reply link next to Mark’s last reply.)

        re: (2)
        Yes, this will plot the DVs as different panels, tiled on top of one another. I suggest setting the y-axis label to “” (null) and then ensuring the factor labels are appropriate for the variable specifying the faceting. Let’s say you have two data frames, a and b. In a is RT data; in b is error rate data (ER, which is preferred to accuracy because ER lets you see speed-accuracy trade-offs without extra mental computation; if ER trends in the opposite direction from RT, there’s a possible SAT problem). Say your x-axis variable is stimulus onset asynchrony, SOA, and your y-axis variable is in a column named y in each data frame. Then you can concatenate the data frames and add a variable distinguishing the DVs:

        ab = rbind(cbind(a, DV = 'Mean RT (ms)'), cbind(b, DV = 'Mean ER (%)'))

        Next, a plot:
        ggplot(data = ab, mapping = aes(x = SOA, y = y)) +
        geom_point() +
        geom_line() +
        facet_grid(DV ~ ., scales = 'free_y') +
        scale_y_continuous(name = '')

        re: (3)
        If you want to get going with simple ANOVAs quickly, check out the “ez” package (written by me):
        install.packages("ez")
        library("ez")
        ?ez

        ez makes specifying an ANOVA very simple; see the examples in ezANOVA. Otherwise, the syntax using aov() is:
        aov(DV ~ IV1*IV2*IV3)
        or, if there are any within-Ss variables, you add an error term that includes a pointer to the Ss column and pointers to all within-Ss effects. Here’s one with 2 between-Ss variables (IV1, IV2) and 2 within-Ss variables (IV3, IV4):
        aov(DV ~ IV1*IV2*IV3*IV4 + Error(SID/(IV3*IV4)))
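        A runnable sketch of that within-Ss form (all names and data invented; just 2 within-Ss factors here, for brevity):

```r
set.seed(1)

# Simulated long-format data: 8 subjects x 2 within-Ss factors
d <- expand.grid(SID = factor(1:8),
                 IV3 = factor(c("a", "b")),
                 IV4 = factor(c("x", "y")))
d$DV <- rnorm(nrow(d), mean = 500, sd = 50)

# Within-Ss ANOVA: the Error term nests the within-Ss effects in subjects
fit <- aov(DV ~ IV3 * IV4 + Error(SID / (IV3 * IV4)), data = d)
summary(fit)
```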

        Note that with many-way repeated measures, the MANOVA approach to repeated measures employed by ez (via car) breaks down if you don’t have enough Ss in the design. ez will produce results in this case, but provides no sphericity tests/corrections*, and with so many repeated measures sphericity is surely violated. Best to confirm such results via mixed effects modelling, which doesn’t require sphericity (indeed, since MEM is more statistically powerful and can handle missing data, it is poised to eventually replace ANOVA… once we psychologists take our heads out of the sand).

        *-> I’m frankly unsure how SPSS manages to compute these when it crunches the same data, as presumably it is doing the same thing as car does; a question for John Fox possibly?

  3. That’s great – look forward to your thoughts about R. I was asking because Andy Lang and I were wondering if it might be worth learning for our solubility statistics.
    BTW – I saw that you are at Carleton – I did my PhD at U of Ottawa.

  4. Grin – small world!

    I’m really attracted by the degree of automation that R would seem to provide. I’d like to get things to the point where I can automatically generate a brace of statistics and graphics each time I run a subject, so I can monitor how an experiment is going in a meaningful way. I’d love to have my experiment email me a summary graph, for instance, each time a participant has completed an experiment.

  5. I think we’re headed in a similar direction. We would like to set up web services to do our statistics.

  6. I’d be interested to hear about any pointers to the use of R with web services you run across.

  7. Sure – if we do anything with R I’ll post it on the UsefulChem blog

  8. Great collection indeed (and hi Mark and Jean-Claude)!

    In one of my studies I used Zelig (http://gking.harvard.edu/zelig/), which is based on R, because it has great tools for handling rare events in logistic regression.

    • Hi Victor! Just based on the description on their page,

      “Zelig comes with detailed, self-contained documentation that minimizes startup costs for Zelig and R, automates graphics and summaries for all models, and, with only three simple commands required, generally makes the power of R accessible for all users.” – http://gking.harvard.edu/zelig

      I find myself thinking: ‘Zelig is to R as Lyx is to LaTeX.’ In your experience, would that be a reasonable analogy to draw?

      • Hi Mark – not quite, I think!

        If I recall correctly, Lyx tries to eliminate the need for manually entering LaTeX code and instead offers a point-and-click interface, right?

        Zelig, on the other hand, still requires you to enter code manually – but I believe the commands are a bit simpler than in standard R, and I think some procedures that require multiple steps in R are rolled into a single command in Zelig.

        Anyhow, it’s been years since I tried either Lyx or Zelig, so things might have changed 🙂

  9. Jean-Claude – great – I’ve added UsefulChem to my feed and I’ll be following it with interest.

    Cameron Neylon showed a nifty slide of Second Life visualizations at the Science 2.0 meeting in Toronto yesterday, which I think he mentioned was by Andy Lang. Same person?

  10. Yes same Andy – Cameron probably showed our solubility data in Second Life.

  11. Pingback: BotchagalupeMarks for July 30th - 09:50 | IT Management and Cloud Blog

  12. It’s great to see more interest in this area, especially in Canada! I have been using Python, R, and LaTeX throughout my master’s degree in behavioral neuroscience (and am now beginning my Ph.D.). I’ve always felt like an outcast in this respect, as our field is predominantly SPSS/MS Office oriented. The closest thing to a community we have is the one surrounding PsychoPy/VisionEgg, but that is Python-specific and primarily psychophysics. I’ll be adding your blog to my feed reader and keeping an ear out for hints of a growing FLOSS community in the cognitive sciences.

  13. It is excellent to find this. I read an article on Phys.org (http://phys.org/news/2012-09-open-source-revolution-science.html) and it got me excited about the direction some scientists are taking. So a quick search and I ended up here.

    I personally work in a behavioral neuroscience laboratory and most of the equipment we use is stupid expensive. I would love to find ways to help lower costs in order to make the field more accessible. The more people we have contributing to the field the faster progress will be made, and at the moment equipment costs are a huge barrier to entry for many investigators and universities (particularly smaller institutions).
