There are more research papers out there than ever. Along with the good news (increasing knowledge) comes some bad news: increasing duplication and plagiarism, which more often than not go undetected. The developers of eTBLAST, a great tool we’ve had a tip on before, have created another tool that uses an eTBLAST search of Medline and other databases to find highly similar citations: Deja Vu.
These similar citations could be legitimate: a review of a previous article, an author reusing the wording of an earlier abstract for new research (the eTBLAST search can only search titles and abstracts), sanctioned duplications, and so on, as the author of the post “Deja Boo” points out. There are also some real instances of duplication (authors attempting to pad their CVs) and plagiarism (stealing words and research). An earlier example (from before Deja Vu), found at Panda’s Thumb, is of a creationist attempting to pad a CV and look more legitimate. Errami and Garner (two of the developers of the tool) published a paper in Nature earlier this year describing many such instances of duplication and plagiarism (and another in Science, reported on here with some interesting discussion).
Still, the database needs to be viewed with caution. Of the 74,792 ‘highly similar and duplicate citations’ found, 92% have not been verified. Of the 8% that have been verified (this has to be done by manual curation), 65% have been found to be probably legitimate (for the reasons given above) and 35% to be duplicates. But even the duplicates aren’t necessarily nefarious. Since full texts are not available, a duplication might often be perfectly understandable (reusing an abstract with some minor changes for new research, etc.). Even so, it is a tool that, with some work, can help tremendously in the search for true duplicates and plagiarism, and perhaps even just the threat of it might lower the instances? :D
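To put those percentages in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above; because those percentages are rounded, the counts it produces are estimates, not numbers reported by Deja Vu. The function and variable names are made up for illustration.

```python
# Rough estimate of entry counts from the rounded percentages quoted above.
# These are approximations, not figures published by the Deja Vu database.

TOTAL_PAIRS = 74_792      # 'highly similar and duplicate citations' found
VERIFIED_SHARE = 0.08     # share that has been manually curated
LEGITIMATE_SHARE = 0.65   # share of verified entries judged probably legitimate
DUPLICATE_SHARE = 0.35    # share of verified entries judged duplicates


def estimate_counts(total=TOTAL_PAIRS):
    """Return approximate counts for each category (hypothetical helper)."""
    verified = total * VERIFIED_SHARE
    return {
        "unverified": round(total - verified),
        "verified": round(verified),
        "probably_legitimate": round(verified * LEGITIMATE_SHARE),
        "duplicates": round(verified * DUPLICATE_SHARE),
    }


counts = estimate_counts()
for label, value in counts.items():
    print(f"{label}: ~{value}")
```

In other words, the confirmed duplicates work out to roughly 2.8% of the whole database (35% of 8%), which is why the unverified bulk deserves the caution.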
So, with that in mind, this week’s tip of the week is a quick view of “Deja Vu.”