Finding needle in haystack (Photo credit: Bindaas Madhavi)

A former colleague of mine at an institution I no longer work at has admitted to being a science fraudster.*

I participated in their experiments, I read their papers, I respected their work. I felt a very personal outrage when I heard what they had done with their data. But the revelation went some way to answering questions I ask myself when reading about those who engage in scientific misconduct. What are they like? How would I spot a science fraudster?

Here are the qualities of the fraudster that stick with me.

  • relatively well-dressed.
  • OK (not great, not awful) at presenting their data.
  • doing well (but not spectacularly so) at an early stage of their career.
  • socially awkward but with a somewhat overwhelming projection of self-confidence.

And that’s the problem. I satisfy three of the four criteria above. So do most of my colleagues. If you were to start suspecting every socially awkward academic of fabricating or manipulating their data, that wouldn’t leave you with many people to trust. Conversations with those who worked much more closely with the fraudster reveal more telling signs that something wasn’t right with their approach, but again, the vast majority of the people with similar character flaws don’t fudge their data. It’s only once you formally track every single operation that has been carried out on their original data that you can know for sure whether or not someone has perpetrated scientific misconduct. And that’s exactly how this individual’s misconduct was discovered – an eagle-eyed researcher working with the fraudster noticed some discrepancies in the data after one stage of the workflow. Is it all in the data?

Let’s move beyond the ‘few bad apples’ argument. A more open scientific process (e.g. the inclusion of original data with the journal submission) would have flagged some of the misconduct being perpetrated here, but only after someone had gone to the (considerable) trouble of replicating the analyses in question. Most worryingly, it would also have missed the misconduct that took place at an earlier stage of the workflow. It’s easy to modify original data files, especially if you have coded the script that writes them in the first place. It’s also easy to change ‘Date modified’ and ‘Date created’ timestamps within the data files.
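To make the timestamp point concrete, here is a minimal Python sketch of why a ‘Date modified’ value is weak evidence that a file is untouched. It is an illustration only, not a description of what happened in this case: the file name `scores.csv`, its contents and the helper `rewrite_keeping_mtime` are all hypothetical. It also only deals with the modified time; creation times are handled differently on each platform and are not covered here.

```python
import os
import tempfile


def rewrite_keeping_mtime(path, new_text):
    """Overwrite a file's contents, then restore its original timestamps.

    Hypothetical helper for illustration: it shows that the modified
    time is under the control of whoever has write access to the file.
    """
    before = os.stat(path)  # remember the timestamps before editing
    with open(path, "w") as f:
        f.write(new_text)
    # os.utime sets the access and modification times explicitly.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return before.st_mtime_ns


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.csv")  # hypothetical data file
    with open(path, "w") as f:
        f.write("participant,score\n1,0.42\n")

    original_mtime = os.stat(path).st_mtime_ns
    rewrite_keeping_mtime(path, "participant,score\n1,0.91\n")

    # The contents have changed, but the modified time has not.
    print(os.stat(path).st_mtime_ns == original_mtime)
    with open(path) as f:
        print(f.read())
```

The file system reports the same modified time before and after the edit, so anyone auditing the file by its timestamp alone would have no reason to look twice.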

Failed replication would have helped, but the file drawer problem, combined with the pressure on scientists to publish or perish, typically stops this sort of endeavour (though there are notable exceptions such as the “Replications of Important Results in Cognition” special issue of Frontiers in Cognition). I also worry that the publication process, in its current form, does nothing more constructive than start an unhelpful rumour-mill that never moves beyond gossip and hearsay. The pressure to publish or perish is also cited as motivation for scientists to cook their data. In this fraudster’s case, they weren’t at a stage of their career typically thought of as being under this sort of pressure (though that’s probably a weak argument when applied to anyone without a permanent position). All of which sends us back to trying to spot the fraudster and not the dodgy data. It’s a circular path that’s no more helpful than uncharitable whispers in conference centre corridors.

So how do we identify scientific misconduct? Certainly not with a personality assessment, and only partially with an open science revolution. If someone wants to diddle their data, they will. Like any form of misconduct, if they do it enough, they will probably get caught. Sadly, that’s probably the most reliable way of spotting it. Wait until they become comfortable enough that they get sloppy. It’s just a crying shame it wastes so much of everyone’s time, energy and trust in the meantime.


*I won’t mention their name in this post for two reasons: 1) to minimise the collateral damage this is causing to the fraudster’s former collaborators, former institution and their former (I hope) field; and 2) because this must be a horrible time for them, and whatever their reason for the fraud, it’s not going to help them rehabilitate themselves in ANY career if a Google search on their name returns a tonne of condemnation.