Importing published articles when the preprint is in the library

I am trying to cite multiple items that were created with the same title and authors but have different dates. Paperpile isn’t able to add these items to a single citation: it says “oops, it’s already added.” What’s more, when I click the inline Paperpile icon in a search result for one of the items, it brings me to the wrong entry in my database.

I’ve gone and changed the titles of the entries, which makes them technically incorrect but, hopefully, citeable. However, the error remains: I still can’t cite multiple entries, even though they now have the same authors but different dates and different titles.

Hoping there will be a fix for this; I expect it will be common in many fields.

Is this what you were looking for?
(APA format)

(author1last & author2last, 2017, 2018)

REFERENCES
author1last, A., & author2last, A. (2017). title. The Journal, 1(1), 1.
author1last, A., & author2last, A. (2018). title. The Journal, 2(2), 2.

No, it isn’t a citation format issue, it’s something to do with the app.

I cannot imagine what else this might be. Can you describe it more clearly?

The engine cannot differentiate between different database entries that share the same author and title.

Perhaps the staff can understand this question… I sure can’t. It would appear that my example perfectly demonstrates your situation, i.e., 2 “different database entries that share the same author and title”. The title in my example is TITLE. The authors in my example are AUTHOR1LAST and AUTHOR2LAST. Same title, same authors.

Try the following:

  1. Remove the references from the document.
  2. Go to Paperpile --> View all references, in the Google Docs menu and remove the references from that page.
  3. In your library, search for the articles (e.g. by title), then click the Duplicates filter next to the list of references, then mouse over one of them and click the “Not a duplicate” link.
  4. Re-insert the references into the document.

What’s the workaround if I can’t even add the documents in the first place because there’s already one there with the same authors and title (e.g. Paperpile says it’s already in my library and won’t let me click to add it)?

I think I know what you are talking about. I had the same problem (if I understand your issue correctly) and I was not able to find the solution.

In my situation, we published a preprint of a paper (https://peerj.com/preprints/1541/), and later a peer-reviewed paper (https://peerj.com/articles/2057/), both of which had the exact same title, year, and authors, and almost the same journal name (PeerJ Preprints vs PeerJ). Paperpile is unable to distinguish between the two. If this is the issue you are having @cointelbro, then hopefully the links above will help Paperpile staff to troubleshoot the issue. In this particular situation, both articles have DOIs that should uniquely identify the papers, so it seems like a bug in Paperpile to me.
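To make the collision concrete, here is a minimal sketch. I have no idea how Paperpile actually matches entries, so `Entry` and `naive_key` are hypothetical names, and the title, authors, year, and DOIs are placeholders rather than the real PeerJ metadata. It only shows that a key built from title, authors, and year cannot tell the two records apart, while the DOIs can.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical library record; not Paperpile's real data model."""
    title: str
    authors: Tuple[str, ...]
    year: int
    journal: str
    doi: Optional[str] = None


def naive_key(entry: Entry) -> tuple:
    """Assumed duplicate key that ignores the journal and the DOI."""
    title = " ".join(entry.title.lower().split())
    authors = tuple(a.lower() for a in entry.authors)
    return (title, authors, entry.year)


# Placeholder metadata standing in for a preprint and its published version.
preprint = Entry("Example title", ("Author One", "Author Two"), 2000,
                 "PeerJ Preprints", doi="10.0000/placeholder.preprint")
article = Entry("Example title", ("Author One", "Author Two"), 2000,
                "PeerJ", doi="10.0000/placeholder.article")

same_key = naive_key(preprint) == naive_key(article)   # keys collide
same_doi = preprint.doi == article.doi                 # DOIs differ
print(same_key, same_doi)
```

With a key like this, the second import looks like something that is already in the library, even though the two records carry different DOIs.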

Thank you for the details @Andrey. It would help us if we knew more about your situation as well @cointelbro.

This does sound like an uncommon scenario, and the best approach is to add the new entry manually via Add Papers -> Add Manually -> Enter data by hand. Duplicate detection is more art than science, and encountering the same paper on different sources (e.g. on the publisher’s website and in a Google Scholar search) is a much more common occurrence than having to add two copies of nearly the exact same paper. As a result, we err on the side of preventing duplicate imports.

Well, I do not think the situation I described is rare at all. It is becoming extremely common for papers to be published as preprints prior to their submission for review. The preprint and the paper are two distinct articles, and very often each has its own DOI. If you have a mechanism to reliably identify the DOI from a web page or PDF (and I am pretty sure you do), I find it a strange decision to ignore the detected DOI and assign both to the same manuscript entry.
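Roughly what I have in mind, as a sketch only: if both records carry a DOI, let the DOI decide, and fall back to title and authors only when one of them has none. The function names and the dict layout below are my own assumptions for illustration, not anything from Paperpile.

```python
from typing import Optional


def normalize_doi(doi: Optional[str]) -> Optional[str]:
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def normalize_text(text: str) -> str:
    """Lower-case and collapse whitespace for a loose text comparison."""
    return " ".join(text.lower().split())


def is_duplicate(a: dict, b: dict) -> bool:
    """Hypothetical check: DOI first, title and authors as a fallback."""
    doi_a = normalize_doi(a.get("doi"))
    doi_b = normalize_doi(b.get("doi"))
    if doi_a and doi_b:
        # Both DOIs are known, so they alone decide the outcome.
        return doi_a == doi_b
    # At least one DOI is missing: compare title and author list instead.
    same_title = normalize_text(a["title"]) == normalize_text(b["title"])
    same_authors = ([normalize_text(x) for x in a["authors"]]
                    == [normalize_text(x) for x in b["authors"]])
    return same_title and same_authors


# Placeholder records: same title and authors, different (made-up) DOIs.
preprint = {"title": "Example title", "authors": ["Author One"],
            "doi": "10.0000/placeholder.preprint"}
article = {"title": "Example title", "authors": ["Author One"],
           "doi": "https://doi.org/10.0000/placeholder.article"}
scholar_hit = {"title": "Example  Title", "authors": ["author one"]}

print(is_duplicate(preprint, article))      # different DOIs
print(is_duplicate(preprint, scholar_hit))  # no DOI on one side
```

The fallback keeps the current cautious behaviour for search results with thin metadata, while two records with different DOIs would no longer be merged.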

We do not always have much information about the paper when doing duplicate detection. This is particularly true on search results pages, where we first have to crawl the link to get the full metadata. As a result, we do not use all fields when doing duplicate detection for imports.

One workaround for the preprint/published scenario is to create a manual entry for the second paper and only fill in the DOI or PMID. Then select the paper and run an auto-update (Shift + A). We do not run the same duplicate tests when adding papers manually, so the metadata update will not be blocked in this case.

I agree with Andrey that it is a common occurrence that you’ll see with more users as you expand. I see it in computer science (RFCs, preprints), legal/policy stuff (e.g. DoD directives), and then of course the rapidly expanding preprint universe that is almost field-defining in a lot of places. My humble opinion is that, if the system for dealing with similar but not identical objects won’t be made more robust, we should at least have the option to turn it off. Then false duplicate detection could be our cross to bear.

Thank you for the feedback. We will discuss different ways to address this issue internally.

Why would you want to cite both a preprint and the published version, or even have both in your library? As soon as my article is published, I replace the preprint with the publication, just as you should no longer cite someone else’s preprint after the article is published, which is what bioRxiv, for example, points you to do. Any situation that requires citing a preprint after publication is a corner case, and I think for the large majority of cases it is better to treat them as duplicates that should be resolved.

Anything with iterative development. Protocol specs. Revisions to policy. Maybe leave the corner case deliberations to the devs. Thx

In case it’s useful, another example I’m running into is contracts that are repeated multiple times, across time, with the same name and signatories.

I’m having the same issue. An example is this one: https://www.biorxiv.org/content/early/2018/02/07/261214
It’s a bioRxiv paper that’s been published in Neuron. I already had the bioRxiv version in the library; when I tried to add the Neuron version, Paperpile said it was already there, but with the bioRxiv info. Two papers, two DOIs, but only one entry possible in the library…
