"Troubling Oddities" In A Social Psychology Data Set

A potential case of data manipulation has been uncovered in a psychology paper. The suspect article is ‘Why money meanings matter in decisions to donate time and money’ (2013) from University of Arizona psychologists Promothesh Chatterjee, Randall L. Rose, and Jayati Sinha.

This study fell into the genre of ‘social priming’, specifically ‘money priming’. The authors reported that making people think about cash reduces their willingness to help others, while thinking of credit cards has the opposite effect.

Now a group of researchers led by Hal Pashler alleges “troubling oddities” in the data. Pashler et al.’s paper is followed by three responses, one from each of the original authors (Chatterjee, Rose, and Sinha), and finally by a summing-up from the critics. Pashler et al. also recently published a failure to replicate several money priming effects.

Pashler et al. focus on Chatterjee et al.’s Study #3, the last of the three experiments reported in the paper; they report having some concerns about the other two studies as well, but they don’t go into much detail.

The “odd” data in Study #3 comes from a word completion task. In this paradigm, participants are shown ‘word stems’ and asked to complete them with the first word they think of. For example, the stem might be BR___; being a neuroscientist, I might write BRAIN, while you, feeling hungry, might write BRUNCH.

Pashler et al. say that 20 participants (out of 94 who completed Study #3) gave a strikingly similar pattern of word-stem responses. Specifically, these 20 participants tended to give the same answers to nine ‘filler’ items, which were chosen to not be affected by the money vs. credit card priming. Here are the raw responses:

[Figure: the raw word-stem responses given by the 20 flagged participants]

The sets of words are not identical, but most of them differ in only one or two words from the “consensus” answers within the block. Pashler et al. say that this is extremely unlikely to have happened by chance, and they raise the possibility that these 20 participants were “reduplicated” – essentially, copy-pasted. The few differences may then have been added manually, to make the data look less suspicious.

Couldn’t it be that people just tend to complete these stems the same way? Pashler et al. say that this is prima facie unlikely – is “SPOOK” really the obvious completion for “SPO_”? More importantly, they also show that the degree of overlap is far higher than would be expected by chance: the other 74 participants, whom Pashler et al. seem to accept as genuine, gave much more divergent answers.
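To get a feel for what “far higher than chance” means here, the sketch below shows the kind of crude comparison one could run on the raw responses: the average pairwise overlap on the nine filler items within the flagged group versus within the rest of the sample. The responses below are invented placeholders (the real ones are tabulated in Pashler et al.’s paper), so only the shape of the calculation is meant seriously.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Invented placeholder data, NOT the real responses: each participant completes
# 9 filler stems; completions are coded as integers (0 = the most common
# completion for that stem, 1-4 = some other completion).
def simulate_group(n, p_consensus):
    """n participants who give the 'consensus' completion with probability p_consensus."""
    return [
        [0 if rng.random() < p_consensus else int(rng.integers(1, 5)) for _ in range(9)]
        for _ in range(n)
    ]

flagged = simulate_group(20, p_consensus=0.85)   # a block of highly similar responders
others = simulate_group(74, p_consensus=0.30)    # more idiosyncratic responders

def mean_pairwise_overlap(responses):
    """Average number of identical completions over all pairs of participants."""
    overlaps = [
        sum(a == b for a, b in zip(r1, r2))
        for r1, r2 in itertools.combinations(responses, 2)
    ]
    return float(np.mean(overlaps))

print("flagged 20:", mean_pairwise_overlap(flagged))   # high overlap (out of 9)
print("other 74: ", mean_pairwise_overlap(others))     # much lower overlap
```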

What makes Pashler et al.’s point more remarkable is that these 20 participants were not selected (or cherry-picked) on the basis of their similar responses to the filler items. They were selected for an entirely different reason – because they form two subgroups who showed a dramatic response to the priming manipulation.

Priming response in the word completion task was defined by the number of ‘cost’ and ‘benefit’ word stems completed in a particular way; there were 8 stems of each kind. If we plot the number of ‘cost’ completions against the number of ‘benefit’ completions per participant, we get a scatter plot. Two outlying subgroups are apparent, and both show a strong priming effect in the expected direction (i.e. they are evidence that money priming works).

[Figure: scatter plot of ‘cost’ vs. ‘benefit’ completions per participant, showing two outlying subgroups]

These are the same 20 participants in whom the responses to the filler items are extremely similar. Hmm.
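For illustration, here is a rough sketch of how a plot like the one above might be built, assuming you have per-participant counts of ‘cost’ and ‘benefit’ completions. The counts below are invented placeholders, not the Study #3 data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Invented placeholder counts (each out of 8 stems), not the real data. The bulk
# of the sample sits in the middle; two small subgroups sit at the extremes,
# one high on 'cost' completions, the other high on 'benefit' completions.
bulk_cost = rng.integers(2, 6, size=74)
bulk_benefit = rng.integers(2, 6, size=74)
out_cost = np.concatenate([rng.integers(6, 9, size=10), rng.integers(0, 3, size=10)])
out_benefit = np.concatenate([rng.integers(0, 3, size=10), rng.integers(6, 9, size=10)])

# A little jitter so overlapping integer points remain visible.
def jitter(x):
    return x + rng.normal(0, 0.1, size=x.shape)

plt.scatter(jitter(bulk_cost), jitter(bulk_benefit), label="other 74 participants")
plt.scatter(jitter(out_cost), jitter(out_benefit), marker="x",
            label="20 outlying participants")
plt.xlabel("'cost' stems completed in the primed direction (of 8)")
plt.ylabel("'benefit' stems completed in the primed direction (of 8)")
plt.legend()
plt.show()
```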

So this is the main “troubling oddity”. Pashler et al. also report other strange features, such as a number of participants who gave the same invalid response: for instance, six participants wrote “SURGERY” for the stem “SUPP__”. This, they say, could be evidence that someone manually changed copy-pasted responses while forgetting what the stem was.

In my view, Pashler et al. are right: these data are extremely odd. True, there is no proof of misconduct here, or even of honest error. These data could be real. It seems extremely improbable, however.

That said, it’s hard to say exactly how unlikely these results are. The authors, in their various rebuttals, raise the possibility that people who are highly susceptible to priming (i.e. the 20 “odd” participants) are psychologically similar to one another, and therefore tend to give similar word completions, even to filler words. Pashler et al. dispute this defense, saying that the ‘priming susceptibility’ effect would have to be enormous in order to account for the data, but it’s impossible to rebut completely.
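As a back-of-the-envelope check (mine, not Pashler et al.’s), suppose each “highly susceptible” participant independently produced the consensus completion to each filler stem with some probability p. For most of the 20 to match the consensus on seven or more of the nine fillers, as the raw responses suggest (most differ from the consensus on only one or two items), p would have to be implausibly high:

```python
from scipy.stats import binom

# P(at least 7 of 9 filler stems match the consensus completion) if each stem
# is matched independently with probability p.
for p in (0.3, 0.5, 0.7, 0.85, 0.95):
    prob = binom.sf(6, 9, p)   # P(X >= 7) for X ~ Binomial(n=9, p)
    print(f"p = {p:.2f}: P(match on >= 7 of 9 fillers) = {prob:.3f}")
# Only when p approaches roughly 0.85-0.95 does this become a typical outcome,
# i.e. near-uniform responding on items supposedly unaffected by the priming.
```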

Overall, I think we’re faced with a similar situation as with Jens Förster. Förster is a German psychologist who in 2014 was shown to have published papers containing extremely improbable data. Many of these papers have since been retracted, but Förster denies any wrongdoing, and he has defended himself by saying that some unknown mechanism could have generated the odd statistical patterns.

In this case, none of the authors have confessed to wrongdoing. They have, however, reportedly agreed to retract Study #3, and two of them have now disclaimed any involvement in handling the data for that study. According to Pashler et al. in their summing up:

Shortly after our paper was accepted for publication, we learned that all of the original authors had apparently decided amongst themselves that Study 3 should be “retracted.” As far as we know, they have not explained precisely what that means or exactly why they wish this partial retraction to take place, beyond referring to alleged “coding errors”…

From the authors’ commentaries on our paper, it seemed to us that two of the three authors (Rose and Sinha) wish it to be known that they had no personal involvement in the data analysis. Sinha stated that the first author (Chatterjee) was exclusively responsible for “data merging,” data coding, and data analysis. Rose goes further to say that he had no involvement in either data collection or data analysis.

Pashler, H., Rohrer, D., Abramson, I., Wolfson, T., & Harris, C. (2016). A social priming data set with troubling oddities. Basic and Applied Social Psychology, 38(1), 3-18. DOI: 10.1080/01973533.2015.1124767

  • This looks like a job for cluster analysis or multidimensional scaling… if you define a measure of dissimilarity between completion data for each pair of subjects, then clusters of similar subjects would stand out immediately in the MDS solution.
    Yes, I do a lot of MDS.

    Pashler et al. also point out that the purported effect sizes are preposterous.

    And purportedly, subjects in the “credit-primed” condition gave away 3/4 of their (token) reward for the task to a fake charity offered by the experimenters (compared to the unprimed group, who only gave back half their reward, and the cash-primed group, who gave back 1/4). I’m sorry, but if my employer invited me to give back 73% of my earnings to the employer’s own slush fund, I would not be so generous.
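For what it’s worth, here is a minimal sketch of the MDS idea suggested in the comment above, using a simple mismatch count between two participants’ nine filler completions as the dissimilarity. The data are placeholders, and scikit-learn’s MDS is just one convenient choice, not necessarily what the commenter had in mind.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

rng = np.random.default_rng(2)

# Placeholder completion data: integer-coded answers to 9 filler stems for 94
# participants - 74 idiosyncratic responders plus 20 near-duplicates of a
# 'consensus' response pattern (coded as 0).
idiosyncratic = rng.choice(5, size=(74, 9))
near_duplicates = np.where(rng.random((20, 9)) < 0.85, 0, rng.choice(5, size=(20, 9)))
responses = np.vstack([idiosyncratic, near_duplicates])

# Dissimilarity between two participants = number of stems on which their
# completions differ (a Hamming-style distance).
diss = (responses[:, None, :] != responses[None, :, :]).sum(axis=2)

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(diss)

plt.scatter(coords[:74, 0], coords[:74, 1], label="other participants")
plt.scatter(coords[74:, 0], coords[74:, 1], marker="x", label="near-duplicate block")
plt.legend()
plt.show()
```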
